Evaluating queue data structures

Working on my book on functional data structures in R, I have implemented three versions of functional queues, two that are not really functional—they need to update a data structure as side effects—and one that is. As far as I can tell, they should have roughly the same performance, but they don’t.

Asymptotically, they all work in \(O(1)\) time for all operations, this nice figure explains why:

The implementations have two lists and need to move elements from one to the other when you access the front of the queue, so you can imagine all enqueue operations costing twice as many computrones as the other operations and then that can pay for the movement of elements later.

However, the three implementations I have in R have different performances when I measure them. Two of them modify an environment and are not persistent; the third remembers one element more than the others and gives me a persistent data structure. For some reason, the last version has a performance that is about twice as bad as the other two.

I am stumped as to why.

You can reproduce the experiments with the code below—there might be a beer in it for you if you can explain to me why the functional version is so much slower.


setup < - function(n) n
evaluate <- function(empty) function(n, x) 
  elements <- 1:n
  queue <- empty
  for (elm in elements) 
queue <- enqueue(queue, elm)
  for (i in seq_along(elements)) 
queue <- dequeue(queue)

ns <- seq(5000, 10000, by = 1000)
performance \<- rbind(get_performance("explicity environment", ns, setup, evaluate(empty_env_queue())),
 get_performance("closure environment", ns, setup, evaluate(empty_closure_queue())),
 get_performance("functional queue", ns, setup, evaluate(empty_extended_queue())))

ggplot(performance, aes(x = as.factor(n), y = time / n, fill = algo)) +
  geom_boxplot() + 
  scale_fill_grey("Data structure") + 
  xlab(quote(n)) + ylab(expression(Time / n)) + theme_minimal()

Dictating instead of writing today

I have done something to my back that makes it very painful to sit at the computer today. It’s probably because I’ve been sitting too long at the computer the last couple of days. I have had this problem before, but usually only if I’ve been using the mouse too much. That hasn’t been a problem lately, but I do spend a lot of hours at the computer these days compared to some weeks ago. Anyway, I have some lecture notes that I would like to finish this week, so I can’t really give it a complete rest. So this gives me an excuse to try dictating text instead of writing.

I am dictating this blog post on my phone. I am dictating in the Ulysses editor. That is the editor I use for blogging. For some reason, I cannot dictate in Ulysses on my iPad. On the iPad I can dictate to iA Writer, but not Ulysses. On the phone I can dictate in both.

I’m surprised at how well dictation is working. When I tried it some years ago it really didn’t understand my thick Danish accent. Now it seems to understand most of the words I’m saying; I do have to change a few of them after I’ve dictated, but mostly to get stuff right. I still prefer using the keyboard over the microphone, but dictation is a working solution while my back is painful.

Object-oriented programming in R

I just finished the first chapter of my next book: Object-oriented programming in R. This time I am planning to go through the object-oriented systems in R and write what I can about how to write object-oriented programs in R.

I’ve put the book on Leanpub but to my shock they now require payment for publication there. US $99 + VAT, which is about the same as I have received in royalties from my other two books. If I include my Grammarly subscription in expenses involved in writing books, I am loosing money on this now.

I can only hope that I can actually make that much with sales in the long run. I am not even a third way there with Functional programming in R which has the same length as I aim at for this book. With Introduction to data science and statistical programming with R I’ve sold for just a tad more than $100, but that is a longer book and that was in the first two months of sales. I’m not really selling any copies any longer.

Since it is now actually costing me money to write books I have put a minimum price on the books. I won’t do that for the new book — I don’t think it is fair to charge for a book that isn’t even finished yet — but I have put a minimum price on the other two. It is $0.99+VAT for Functional programming in R and $1.99+VAT for Introduction to data science and statistical programming with R. It is about a third of the recommended price, so it is still pretty cheap, and it might make me enough money that I can afford publishing a fourth book.

Filter, Map, and Reduce

Just finished the next chapter in Functional Programming in R. This chapter covers the Filter, Map, and Reduce functions and the family of apply functions and a short introduction to the purrr package.

The chapters seem to be getting a little shorter here towards the end of the book, but there isn’t that much to say about these functions. I will think about it a little and see if I can add a little more, but I don’t feel like padding the chapter just to get its length up to the same size as the previous chapters.

I have thought about how to get the previous chapter a little longer. The continuation-passing-style is not that easy to get around mentally if you haven’t seen it before, so I think I will add another example to that section. I just need a good example to use. Will have to think about that.

Anyway, now that this chapter is drafted I will go back to working on Introduction to Data Science and Statistical Programming in R and get that edited so it is ready when teaching starts in two weeks.

Writing update

My writing is going slow the last couple of days. I actually had planned to write a bit in the weekend but then got side tracked by the units project so I only wrote about a thousand words. It was about the same today, but I did manage to get some proofreading done on the Data Science book at least.

I need to get the Data Science book ready for classes, which are now only two weeks away. At least I need the first eight chapters, the rest can technically speaking wait another seven weeks, but I would prefer to have the entire book edited before I hand it out to my students. So I will mostly be working on that this week and hopefully get it done by he weekend.

I just want to finish the current chapter in Functional Programming in R before I put it aside. It isn’t that hard to write but it is looking to be a little short compared to the previous chapters so I am thinking about what I could possibly add to it. I am drawing a blank right now, but maybe I will have an idea tomorrow.

I have imported the entire book isn’t Ulysses and it is nice to write there, but annoying to have to export every time I need to compile it. I don’t know if that could be automated. That would be another fun project if it is possible.