Closure analysis
Closure analysis is hard.
Or maybe my mental capacities are just too limited to be a serious compiler-implementer. In any case, it's been a long, tough struggle. But, I now have a relatively elegant, solid implementation of closures. Oh joyous day.
As with my code-generator, I started working on this with a way too ambitious mindset. I wanted something elegant, fast, and small, though I had never written anything comparable in the past. This mostly led to endless scribbling in notebooks, convincing myself that I was on the right track, only to realize I had missed two or three major show-stopping problems.
As usual, the solution was to start making small incremental changes to the existing code that moved it into the right direction without breaking it. A lot of these changes were later dumped again because they turned out to actually be very bad ideas, but by trying these dead ends, I got enough insight into the problem to actually end up with a working implementation.
I'll spare you my dead ends here. A major breakthrough was the new approach to polymorphism I described earlier. Giving everything a unique name allowed the closure analyser to throw around variables without worrying too much about context.
In short, one pass (I rolled this into the renamer) collects a set of free variables for every function definition. Then, during compilation, a closure analyser figures out which of these values (and which parts of each value, remember the entry on data-representation) is known at compile time, and doesn't have to be closed over. If any dynamic values remain, they are rolled into a context—for a single value, that is just the value itself, for a set of values, a vector is allocated, and the values are copied into it. This context value is then associated with the function pointer, and the pair of them make up the closure.
So far so good. Things only really start to get interesting when you
consider recursive definitions (all definitions introduced by def
forms in the same block can be mutually recursive). If a bunch of
functions can all close over each other, in which order do you create
the closures? Worse yet, how do you determine the layout of these
closures?
I had several pages of ugly-as-the-night code written to handle this, but actually the solution was quite easy. First, we determine which functions need to close over anything at all. We can divide the variables functions close over into three categories: dynamic values, static values, and sibling functions. Those that have any dynamic values definitely need to close over something, so we mark those as dynamic. Then, the nature of the remaining functions can be determined by simply going over the list of closed-over sibling for each of them, and marking them dynamic if any of those sibling are dynamic.
Now that we have this information, we can easily determine the amount of closed-over values for each function, and see whether we need to allocate a context vector for them. If they need a vector, we can allocate this vector, and their siblings can simply use the pointer to this vector to close over. If they are single-value closures, we have to determine which value will become the closure. If this value is a sibling pointer, we must make sure to first determine the representation of that sibling. To prove that this can not recurse endlessly, consider the situation where two sibling functions close only over each other, and thus both want a single-value context, namely each other. They would not close over anything else, and thus they would both be static, and not require a context at all.
Well, I guess that algorithm is actually not all that interesting, but it nicely demonstrates the kind of recursive mindfuck that keeps coming up in this project.
Finally, the environment in which the closing functions are then compiled is extended, with new bindings for each closed-over variable, associating the variable with the way its value can be found in the context value.
So far, I have punted out on lambda lifting (adding arguments to functions to make them no longer close over stuff) and clever sharing of context values. The second, I expect to be rather easy, and I will get to it soon. But lambda lifting (which was actually the first thing I tried to implement) is being made extremely awkward by the way I'm representing the syntax tree. Type information is stored in every node, and function nodes store their free variables. Any lifting transformation would have to update this info in all the nodes involved, which is a recipe for ugly, mutation-heavy, and fragile code.
Fortunately, my closure representation makes functions closing over a single value exactly as cheap as functions that do not close at all, so the benefit of lambda lifting in this compiler is actually not as big as it would be in others. Still, the fact that it appears that hard to do says suggests my syntax-tree representation is sub-optimal. I should probably go and read through the source code of some well-written compilers.
(As an aside, I'm finding the dynamic typing of Lisp is really
starting to chafe at this point. I've never had serious problems with
it in other projects, but the amount of datatypes involved in a
compiler—there are four different stages of type describing values, for
example—make it really easy to do something wrong, leading to annoying
debugging sessions where the wrong value first travels through several
layers of code and then causes a weird error in an unrelated place. I
should probably start adding check-type
assertions all over the place.)