Lazy queues

Today I implemented one of the data structures from C. Okasaki, Simple and efficient purely functional queues and deques, J. of Functional Programming, 5(4), 583-592, (1995) in R, and wrote about it in the chapter I am working on for my next book.

This is a queue implementation based on lazy lists that combine list concatenation and reversal, to move elements from a “back” list to a “front” list one step at a time as the queue is modified, through a “rotate” function. It requires that you construct the rotations as lazy operations and in its simplest form, the rotation function could look like this:

rot <- function(front, back, a) {
  if (is_nil(front)) cons(car(back), a)
  else {
    lazy_thunk <- function(lst) function() lst()
    lazy_thunk(cons(car(front), rot(cdr(front), cdr(back), cons(car(back), a))))
  }
}

The function does a combination of concatenation and reversal, that makes sense if you read my two previous posts. The back list is guaranteed to be one element longer than the front list when the function is called, so the base case of the recursion just puts a single element from the back list into an accumulator, and the recursive call puts the first element of the front list at the front of a list that contains the concatenation of the rest of the front list and the reversal of the back list.

Because there is no lazy evaluation of expressions in R, we have to wrap the result in a thunk. Because there is lazy evaluation of expressions, but not the way you would expect, the function doesn’t work. And it was interesting (although also a little frustrating) to figure out why.

The three parameters to the function are passed as promises—unevaluated expressions—and the accumulator is updated by prepending an element to each through a cons call. If this expression is passed along in the recursion as an unevaluated expression, each construction constructs a thunk with another cons call that needs to be evaluated later. Once you get to the end of the recursion and actually do need to access the accumulator, you have to evaluate this expression—so you need to evaluate a list of calls to cons as long as the accumulator is. This will exceed the stack size in R for long lists.

You can get around the problem in different ways. You can explicitly make sure that the cons calls are evaluated immediately

rot <- function(front, back, a) {
  if (is_nil(front)) cons(car(back), a)
  else {
    lazy_thunk <- function(lst) function() lst()
    tail <- cons(car(back), a)
    lazy_thunk(cons(car(front), rot(cdr(front), cdr(back), tail)))
  }
}

or you can just force the accumulator:

rot <- function(front, back, a) {
  force(a)
  if (is_nil(front)) cons(car(back), a)
  else {
    lazy_thunk <- function(lst) function() lst()
    lazy_thunk(cons(car(front), rot(cdr(front), cdr(back), cons(car(back), a))))
  }
}

Usually, it is a good idea to force all parameters if you return them in a closure, so that is the conservative approach and probably what you should aim for whenever you can get away with it.

Anyway, you have to wait a little bit to see how this all fits in with functional queues. I need to implement one more data structure before I’m finished with the current chapter.

Author: Thomas Mailund

My name is Thomas Mailund and I am a research associate professor at the Bioinformatics Research Center, Uni Aarhus. Before this I did a postdoc at the Dept of Statistics, Uni Oxford, and got my PhD from the Dept of Computer Science, Uni Aarhus.

Leave a Reply