The manuscript for Functional Data Structures in R is coming along nicely. I’ve just finished the description of red-black search trees. Insertion there is simple, but deletion requires a bit more work.

I added some code for visualising trees and some experiments with runtime performance.

Now I want to go back through the previous chapters and update those with more experiments and visualisation, and also include some of the experiment code. I plan to do this before I return to the search tree chapter and write about splay trees.

Rene Thomsen posted this picture on twitter with the text

@ThomasMailund we have a lot of your ‘Beginning Data Science in R’ books available for both current and future data scientists at Scio+

It is great to see that somebody buys them at least. And there’s more there than I ever had myself.

By now, I only have one copy left of the Data Science book and one copy of the Functional Programming book, but still plenty of Metaprogramming and Object-oriented programming.

Yesterday, I finished the draft of the Heaps chapter of Functional Data Structures in R. It got a little shorter than I had planned, but instead the chapter on Immutable and persistent data got a little longer.

I had planned to include Brodal heaps in this chapter, but ended up not doing it. They are pretty cool in having optimal worst-case complexity, but there isn’t much novel in how you would implement them in R compared to binomial heaps. Instead, I described how skew-binomial numbers can be used as a framework for random access lists. That is one of the ideas that are used in Brodal heaps, but by using it for lists I get to show the trick without repeating too much code compared to what I would need to do for Brodal heaps.

Now I just have one more chapter to write before I start editing. The chapter on Sets, maps and search trees. There, I will include red-black trees—these can be handled very elegantly if you never delete elements, but it is possible to delete elements with some work that I still have to implement—and I will include splay trees. I don’t know exactly what to include beyond that.

Anyway, the current plan is that this will be the last chapter, but if I can think about something more to add, I might. If not, I will finish the draft and then let the book rest a few weeks so I can edit it with fresh eyes. While this book is resting, I will find something else to write on.

I’m writing on my next book—Functional Data Structures in R—again now and would love to hear suggestions for data structures to include in the book.

I have, of course, already described linked lists and trees. R doesn’t have built-in linked lists—the lists in R are really more like arrays in how they work, although they are very dynamic—but you can implement linked lists via R lists by just having elements in the list work as pointers. This, naturally, gives you a list implementation of stacks. With a little bit of extra work, and it really doesn’t take much, you can then also get queues with amortised constant time operations.

If you implement lazy evaluation, which you can do with some tricks exploiting that arguments to functions in R are lazy evaluated, you can get worst case constant operations on queues as well.

I’ve described how you can implement skew binary numbers as lists and thus increase and decrease these in constant time, and how this can be used to implement random access lists.

I have leftist heaps and binomial heaps, and I plan to implement and describe skew binomial heaps and Brodal heaps (skew binomial heaps of heaps) as well.

I have described splay heaps and plan to describe splay trees in general for search trees—although here I have to figure out how best to implement a structure that needs to return both values and trees in one operation.

For sets and maps, I have an implementation of red-black search trees, although I haven’t implemented the delete operation yet—that operation is a lot more complex than the insertion operation.

Are there any important data structures I am missing and that I should include?

Both Storm and I have definitely read this at some point and must then have forgotten. It is hard to say if we then reinvented the idea or just suffer from cryptomnesia.