Getting scope right
As described in the syntax entry, the basic variable
binding mechanism (besides function abstraction) is a straight-forward
let
form. In a statement like this...
let x = x + 1,
y = 4
x * y
The x + 1
and the 4
are evaluated in the outer scope, creating a
new scope with new bindings for x
and y
, in which x * y
is
evaluated. As a catch, the block mechanism allows the let
's body to
be aligned with its definition, creating an effect that resembles the
free-form introduction (and redefinition) of variables anywhere in a
block:
let x = 4
print x
let x = x * 2
print x
This in fact resembles the way O'Caml works, and is more structured
than, for example, Python. The let
forms, though they don't have to
add indentation, do open new blocks, and the bindings of the variables
are immutable throughout their blocks.
But the fact that let
definitions are evaluated in the 'old'
environment means that they can not be used to define recursive
functions. At least not without something like a
Y-combinator. So we want
another binding construct. Common Lisp's labels
and O'Caml's let
rec
constructs are a bit clumsy — you have to explicitly decide
whether your function is going to be recursive, and group mutually
recursive functions into one chunk of definitions. Scheme's use of
define
forms at the top of a block, is already more convenient. But
I am especially charmed by JavaScript's simplistic rule that all
function definitions within a scope are bound before any of the code
in the block executes. This allows function definitions to live near
their use.
What I am doing is introduce a second binding construct, def
, which
can only define functions, no other types of values. def
forms can
appear anywhere in a block of expressions, with the caveat that there
has to be at least one normal expression per block. During parsing,
they are all moved to the 'top' of the block, or rather the whole
block is wrapped in a construct that indicates it should be evaluated
in an environment which has these functions bound (let rec
style).
For example, this tail-recursive reverse
:
def reverse list ->
def helper
[] acc -> acc
[x:xs] acc -> helper xs x:acc
helper list []
reverse [1; 2; 3]
def
is followed by the name of the function, and then by a set of
pattern -> body
cases, as in fn
.
(Why not allow def
to define non-function variables? The problem
with that is that it would, in every sane approach I can think of,
allow uninitialised variables to be accessed. The current rules make
that strictly impossible, which is a very nice property for a
language.)
One worrying aspect of this convention is that one of the golden rules
of functional languages, 'everything is an expression', no longer
holds. def
forms are more like statements in C-like languages.
Fortunately, they can only occur in one, well-defined place: an
expression-block, which is itself an expression.
So what would a module look like in this language? Here's a possibility.
module RandomFunctionality (pi, e, frob)
let pi = 3.141592653589793
def frob x -> x - pi
let e = 2.7182817
.
A module body is a block of expressions. Note that the dot on the last
line ends this block (though the EOF would also have ended it). I'm
not sure yet whether it'll make sense to have multiple modules in a
file, but it's nice to have the possibility. You can notice two
problems with the let
forms. Firstly, they nest — frob
and e
are
defined inside the block that defines pi
, and as such are not
top-level, but have to be exported from the module anyway. Secondly,
the last let
does not have a body, which is currently a syntax
error.
One solution that comes to mind is to have a 'magic' endmodule
expression, that determines from which scope the variables exported by
the module should be taken. Such an expression would not be allowed to
appear inside a function body. Another option is to implicitly,
invisibly insert such a thing at the end of the body of the innermost
let
expression at the bottom of the module body. The first is rather
ugly, the second a bit too magical for my tastes — not to mention that
it'd mess up the elegance of the parser.
There's a certain discrepancy between a module top-level being a set
of definitions, and a regular block of expressions being a
computation. Some languages, like Haskell or C, treat the two
completely differently. I'm not sure I want to do that — since they
also have some things in common, and regularity is nice. Besides, it
is nice to be able to close over variables in top-level definitions
(and possibly even being able to use regular flow control for
conditional compilation). For now, lacking a perfect solution, a
module is a regular block, culminating in one more more export
expressions, which have type ()
(unit), and determine the symbols
that are exported from the module.
module RandomFunctionality
let pi = 3.141592653589793
def frob x -> x - pi
let e = 2.7182817
export pi, frob, e
This nicely resolves the question of where the exported definitions are to be found. It does mean that the list of exported symbols isn't centralised at the top of the module anymore. This isn't always ideal, since you can't determine the whole interface by glancing at the top of a source file. But for such things one should use automatically generated documentation (more about that later) or development tools anyway. An advantage is that you don't require monstrous, messy lists of symbols for packages with a big interface, but you can just scatter export expressions throughout the package definition, in places where they make sense (below a data-type definition, right after a block of related functions, etcetera).