A Convention For Controlled Side-Effects
(This is a theory-only entry. No code was written for it yet—in fact, I'm having trouble finding the time for any kind of serious Hob coding sessions lately.)
What I want is a system that allows us to separate pure and non-pure code in a disciplined way, without resorting to something as restrictive as monads, or something as burdensomly complicated as uniqueness typing.
I've been going over this in the back of my mind for about three years now (the first draft was done on a lonely, feverish train ride from Skopje to Thessaloniki in October 2007), and I've convinced myself that I have something workable. Of course, this has been the topic of much research, which I'm not on top of, and it might be a bit arrogant of me to believe that I've found a system that solves all the issues. I invite (as usual) any reader to point out flaws, and suggest relevant literature.
My system, in short, is this: function types (->
) get specialised
into two different types, ->
and =>
(I'm still trying to come up
with good English terms for them). The =>
type is reserved for
functions that close over mutable data, for example this one (where
ref
, =
, and !
are placeholder syntax for mutable references):
let x = ref 1
(fn i => x = !x + i)
=>
functions themselves count as mutable data for this purpose—a
function that closes over such a function gets the =>
type itself.
Things that mutate 'external' state, such as a print
function, also
have the =>
type (conceptually, they close over 'the world', which
is mutable).
All other function get the ->
type. This means that ->
does not denote
purity—such functions may modify their arguments at will, and return
mutable data. A pure function is one that has the ->
type and whose
argument and return types are all immutable.
By doing the analysis in terms of what closes over what, we allow
functions to internally create mutable values, even define internal
functions that mutate them, without requiring the whole function to be
impure. You could have a pure sort
function that takes an immutable
array, and returns a sorted version, which internally implements a
shamelessly imperative sort algorithm.
Imperative algorithms are great. They often are easier to write and perform better than they pure cousins. On top of that, they often only do their mutations in a limited context, and can, given a system like what I'm describing here, be encapsulated in a pure function.
What is awkward with this system, is that if you have, for example a
map
function of type fn (fn x -> y) [x] -> [y]
, you can now no
longer map your =>
functions over arrays. Any hard-core functional
programmer will say "of course not! duh!". But, coming from pleasantly
undisciplined languages like Lisp and JavaScript, I'm actually not
satisfied with that answer. There are many cases where you'd like to
do what map
does, but also, on the side, perform some side effect
from your mapping function.
Now, if we specify that a ->
type in a function type means that the
function takes 'any kind of function', and treat the type of function
passed as a kind of type variable, we can use them in the same way
that other polymorphic functions are used. If you give map
a ->
function, it is pure. If you give it a =>
, it isn't. This is analogous
to the fact that if you give id
an immutable argument, it is pure,
whereas if you give it something mutable, it isn't.
Doesn't this make everything a confusing jumble again? Let us say that the benefits of being able to distinguish pure functions are twofold: They allow us to make certain assumptions about our code, and they allow the compiler to do some transformations that wouldn't be safe with impure functions. Both of these are useful at the call site—meaning that it is not all that interesting whether a given function is pure, but rather, we want to know whether a given call is pure. Thus, to depend on the actual resolved types to determine purity seems acceptable.
Maybe it would be better to have yet another specifyer (+>
maybe?)
to explicitly denote polymorphic function arguments. You could then
also specify functions that require their argument to be ->
. I'll
have to see in practice whether this would be worth the additional
mental overhead and symbolic ugliness.
Another possible direction is to make this whole system implicit.
The compiler can determine at compile time into which category each
function falls, so we don't have to burden the programmer with
choosing the right arrow type. This has the advantage that you can
drop in, for example, a log statement somewhere without worrying about
it screwing up your types—the compiler will adjust. However, I expect
this will make it hard for people to keep track of which functions are
pure. If you're calling something from another module, how are you to
know whether it will make your function impure? The logging issue can
be worked around with some hack like Haskell's unsafePerformIO
.
(I must say I'm pleasantly surprised by how easy this was to write down. I half expected to run into an endless list of complications when trying to actually express these ideas. So again, if you spot any complications, write me. My email address is in the page footer.)