Tell me your favourite functional data structures

I’m writing on my next book—Functional Data Structures in R—again now and would love to hear suggestions for data structures to include in the book.

I have, of course, already described linked lists and trees. R doesn’t have built-in linked lists—the lists in R are really more like arrays in how they work, although they are very dynamic—but you can implement linked lists via R lists by just having elements in the list work as pointers. This, naturally, gives you a list implementation of stacks. With a little bit of extra work, and it really doesn’t take much, you can then also get queues with amortised constant time operations.

If you implement lazy evaluation, which you can do with some tricks exploiting that arguments to functions in R are lazy evaluated, you can get worst case constant operations on queues as well.

I’ve described how you can implement skew binary numbers as lists and thus increase and decrease these in constant time, and how this can be used to implement random access lists.

Skew binary numbers can be represented as lists containing the positions of non-zero digits. The digit two, which can only occur at the least significant non-zero position, are represented as a duplicated position cell. By merging or splitting positions, you can increment and decrement in constant time.
Skew binary numbers can be represented as lists containing the positions of non-zero digits. The digit two, which can only occur at the least significant non-zero position, are represented as a duplicated position cell. By merging or splitting positions, you can increment and decrement in constant time.
You can represent lists as fully balanced trees and have lists of these at sizes matching the positions in skew binary numbers. The cons and cdr/tail operations then matches the operations you do to increment and decrement numbers.
You can represent lists as fully balanced trees and have lists of these at sizes matching the positions in skew binary numbers. The cons and cdr/tail operations then matches the operations you do to increment and decrement numbers.
Updating random access lists combines updating lists and trees, and because there are no more than log n trees, and these are fully balanced so have depth at most log n, the operations are logarithmic in time.
Updating random access lists combines updating lists and trees, and because there are no more than log n trees, and these are fully balanced so have depth at most log n, the operations are logarithmic in time.

I have leftist heaps and binomial heaps, and I plan to implement and describe skew binomial heaps and Brodal heaps (skew binomial heaps of heaps) as well.

Binomial heaps have a close correspondence to binary numbers, and updating them behaves much like arithmetic on binary numbers.
Binomial heaps have a close correspondence to binary numbers, and updating them behaves much like arithmetic on binary numbers.

I have described splay heaps and plan to describe splay trees in general for search trees—although here I have to figure out how best to implement a structure that needs to return both values and trees in one operation.

Operations for deleting the smallest element in a splay tree.
Operations for deleting the smallest element in a splay tree.
Transitions needed to compute the partition of splay trees into elements smaller than and larger than a new element to be inserted at the root.
Transitions needed to compute the partition of splay trees into elements smaller than and larger than a new element to be inserted at the root.

For sets and maps, I have an implementation of red-black search trees, although I haven’t implemented the delete operation yet—that operation is a lot more complex than the insertion operation.

Are there any important data structures I am missing and that I should include?

Author: Thomas Mailund

My name is Thomas Mailund and I am a research associate professor at the Bioinformatics Research Center, Uni Aarhus. Before this I did a postdoc at the Dept of Statistics, Uni Oxford, and got my PhD from the Dept of Computer Science, Uni Aarhus.

Leave a Reply