I’m reading Chris Okasaki’s Purely Functional Data Structures again while I work on my book on data structures in R. I read it a while back, but it is really eyeopening how much I missed when I just read it compared to now that I am actually implementing and writing about these data structures myself.
One thing I didn’t think much about when I read it the first time is how lazy evaluation and amortised analysis really work together—which is a bit surprising considering how much of the book is actually about that. Now, that I have had to think more about it myself in order to explain it in my own book, it finally clicks.
With amortised analysis you can think about saving up some “computation” by having cheap operations on a data structure that can then pay for more expensive operations later on, such that the average cost of operations is below a certain bound. Okasaki makes a distinction between using the “banker’s method” or the “physicist’s method”, but I never really thought much about that and I don’t think too much about it now either. I just think about saving up some “computation” in a data structure that I can use later.
One example of such a saving analysis is the functional queue. With functional lists you cannot really implement queues with pointer manipulation so the structure you have to work with is a single-linked lists where you can prepend elements. This makes it easy to implement stacks but hard to implement queues. What you can do to implement queue is to use two lists, a front queue, F, and a back queue, B. When you enqueue elements, you prepend them to the B list and when you dequeue them you take them from the front of the F list. Of course, at some point you will need to move elements from B to F to make this work, which requires a reversal of B.
No worries, though. If F is empty, you will just reverse B and call the result the new F. Amortised analysis tells you this will be fine and you will get constant time queue operations. The reasoning goes like this: every time you enqueue an element you pay one “computation” for doing that and you save one “computation” in the data structure. The length of B is the same as your total savings. When you reverse B to make the result the new F, the time is proportional to B so you have saved up for that. After you have reversed B your savings are depleted, but no worries. It is not a problem to have no savings; you just can’t go below zero.
This reasoning becomes a problem if you want to use the queue as a persistent data structure. Persistent here means that any earlier state of the data structure is potentially available to continue manipulations on.
If I have saved up enough “computation” for one reversal, and I now have an empty F list and a full B list, and I want to dequeue, then that is fine. The savings pay for the reversal of B. But with a persistent data structure, I could dequeue from the same “empty F, full B” queue several times. I only have savings enough to pay for the first one. The analysis breaks because I am using the same savings several times.
Future users of the queue can all access the savings, but the savings are really gone after the first use.
With lazy evaluation you get a different way to do amortised analysis. You can use debt instead of savings. You don’t think in terms of savings that can be used for expensive operations in the future; you think in terms of expensive operations—that you haven’t evaluated yet—as a debt that is in the data structure. Any future user of the data structure that wants to invoke the expensive operation must first have paid of the debt.
If, instead of reversing B whenever F is empty, we update F to be CAT(F,REV(B)) whenever the size of F equals the size of B, but do so lazily, then, to get to the expensive REV(B) operation, a future user will first have to had dequeued as many elements as it costs to reverse B.
The lazy REV(B) is there in the data structure as an expensive debt, but you cannot access it before you have paid off the debt by dequeuing all the cheap elements in F.
Instead of saving up for expensive operations in the future, we put the responsibility on future users to pay off the debt before they can access the expensive operation.
If we define concatenation as a lazy expression CAT(A,B) = CONS(CAR(A), CAT(CDR(A), B)) then we can concatenate in constant time, so the concatenation operation is cheap. There is not really any way of making the reversal operation cheaper than linear, but it is okay, we pay for it through the dequeueing of cheap operations.
You might object, now, that yes we can construct the queue this way, but nothing prevents us then from doing the concatenation and reversal and then dequeue until just before the reversal and leave the queue in this state. Then, we can restart from that queue and dequeue in many different future timelines.
That is a very good objection, and if we just had lazy evaluation this would be a problem. Of course, all future timelines from the point where we make the reversal will be on a timeline that has already paid off the debt, so in some sense we are okay, but we can’t have shared paying off of a debt and then all living happily on the result. It really only works if we remember the results of evaluating an expression so we don’t have to evaluate it again if we access it again later. If we remember the result of evaluated expressions, then only one of the future time lines will actually pay for it. All other timelines, although they are part of paying of the debt, will not see the expensive operation.
You cannot simply implement this using thunks as lazy evaluation. You do need memorisation as well.
I have implemented this in R that has some version of lazy evaluation, but only as parameter expressions. I could just implement the delayed expressions as thunks, but I have to wrap them in parameters passed to the creator of the thunk. So it is not quite this simple, but if you get it right, it is quite elegant.