Working on my book on functional data structures in R, I have implemented three versions of functional queues: two that are not really functional (they need to update a data structure as a side effect) and one that is. As far as I can tell, they should have roughly the same performance, but they don’t.
Asymptotically, they all work in amortized \(O(1)\) time for all operations; this nice figure explains why:
The implementations keep two lists and need to move elements from one to the other when you access the front of the queue. You can imagine every enqueue operation costing twice as many computrons as the other operations; the surplus then pays for moving the elements later.
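To make the two-list idea concrete, here is a minimal sketch of a persistent two-list queue. It is not the code from `ralgo`; the names `cons`, `reverse_list`, `make_queue`, `sketch_enqueue`, and `sketch_dequeue` are made up for illustration, and the dequeue operation returns both the value and the new queue, which is only one of several possible designs.

```r
# Hypothetical helpers for illustration; none of these names come from ralgo.

# A linked-list cell: prepending is O(1) and never copies the tail.
cons <- function(head, tail) list(head = head, tail = tail)

# Reverse a linked list in time linear in its length.
reverse_list <- function(lst) {
    acc <- NULL
    while (!is.null(lst)) {
        acc <- cons(lst$head, acc)
        lst <- lst$tail
    }
    acc
}

# The queue is a pair of linked lists; NULL is the empty list.
make_queue <- function(front = NULL, back = NULL) list(front = front, back = back)

# Enqueue prepends to the back list: O(1).
sketch_enqueue <- function(q, x) make_queue(q$front, cons(x, q$back))

# Dequeue takes from the front list. Only when the front list is empty
# do we reverse the back list into it; each element is moved at most once,
# which is what the extra cost charged to enqueue pays for.
sketch_dequeue <- function(q) {
    if (is.null(q$front)) {
        q <- make_queue(reverse_list(q$back), NULL)
    }
    if (is.null(q$front)) stop("queue is empty")
    list(value = q$front$head, queue = make_queue(q$front$tail, q$back))
}

q <- make_queue()
for (x in 1:3) q <- sketch_enqueue(q, x)
res <- sketch_dequeue(q)
res$value  # the element that was enqueued first
```

Because nothing is modified in place, the old `q` is still a valid queue holding all three elements after the call to `sketch_dequeue`.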
However, the three implementations I have in R perform differently when I measure them. Two of them modify an environment and are not persistent; the third remembers one element more than the others and gives me a persistent data structure. For some reason, the last version is about twice as slow as the other two.
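For readers who have not seen the environment trick, the following is a rough sketch of what "modifying an environment" can look like. It is an assumption about the general technique, not the `ralgo` implementation, and the names `new_env_queue`, `env_enqueue`, `env_front`, and `env_dequeue` are hypothetical.

```r
# Hypothetical sketch; not the ralgo implementation.

# The queue is an environment, so updates happen in place.
new_env_queue <- function() {
    q <- new.env()
    q$front <- NULL
    q$back <- NULL
    q
}

# Prepend to the back list by overwriting the variable in the environment.
env_enqueue <- function(q, x) {
    q$back <- list(head = x, tail = q$back)
    invisible(q)
}

# Looking at the front may have to move elements from back to front.
# With an environment this can be done as a side effect, so the function
# can return just the element rather than an updated queue.
env_front <- function(q) {
    if (is.null(q$front)) {
        while (!is.null(q$back)) {
            q$front <- list(head = q$back$head, tail = q$front)
            q$back <- q$back$tail
        }
    }
    if (is.null(q$front)) stop("queue is empty")
    q$front$head
}

# Remove the front element, again by mutating the environment.
env_dequeue <- function(q) {
    env_front(q)  # make sure the front list is populated
    q$front <- q$front$tail
    invisible(q)
}

q1 <- new_env_queue()
q2 <- env_enqueue(q1, 1L)
identical(q1, q2)  # TRUE: both names refer to the same, modified queue
```

The last line shows why this variant is not persistent: there is no "old" queue left to go back to once an operation has run.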
I am stumped as to why.
You can reproduce the experiments with the code below—there might be a beer in it for you if you can explain to me why the functional version is so much slower.
    devtools::install_github("mailund/ralgo")
    library(ralgo)
    library(microbenchmark)
    library(tibble)
    library(ggplot2)

    # The setup step just passes the problem size through.
    setup <- function(n) n

    # Build a benchmark function that starts from a given empty queue,
    # enqueues n elements, and then dequeues all of them again.
    evaluate <- function(empty) function(n, x) {
        elements <- 1:n
        queue <- empty
        for (elm in elements) {
            queue <- enqueue(queue, elm)
        }
        for (i in seq_along(elements)) {
            queue <- dequeue(queue)
        }
    }

    ns <- seq(5000, 10000, by = 1000)
    performance <- rbind(
        get_performance("explicit environment", ns, setup, evaluate(empty_env_queue())),
        get_performance("closure environment", ns, setup, evaluate(empty_closure_queue())),
        get_performance("functional queue", ns, setup, evaluate(empty_extended_queue()))
    )

    # Plot the time per element for each queue implementation.
    ggplot(performance, aes(x = as.factor(n), y = time / n, fill = algo)) +
        geom_boxplot() +
        scale_fill_grey("Data structure") +
        xlab(quote(n)) +
        ylab(expression(Time / n)) +
        theme_minimal()