You know, people do use neighbour joining!
Thursday, March 27th, 2008

Over the last couple of years, I have done a little work on phylogeny inference, including a few papers on neighbour joining. One thing that consistently happens when you submit a paper on this (and I bring it up because I have just gotten back reviewer reports on such a paper) is that at least one reviewer will tell you that neighbour joining is not interesting and that one should focus on maximum likelihood / Bayesian trees instead.
Sorry to say it, but people do use neighbour joining. I am willing to bet that there are ten times as many people using neighbour joining to infer trees as there are people using the statistical approaches, so algorithmic improvements here do matter!
The statistical approaches are usually more accurate, and they are better at capturing the uncertainty in the inference and such, but they are slow! Not slow as in, “I’ll go get a cup of coffee while the program finishes”, but slow as in “I’ll look at the tree when I am back from my vacation”.
Sure, they are fast enough for tens of leaves, but some people infer trees with thousands of leaves. I recently got an email from a guy who tried with tens of thousands of leaves and ran out of memory using one of my tools: it needed more than 4 GB, so it choked on the problem (but a student in our lab has now come up with a new algorithm that uses less memory, so that should solve that problem).
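To give a sense of why tens of thousands of leaves hurts, here is a rough back-of-the-envelope sketch. It assumes a dense n × n distance matrix of 8-byte floats, which is the textbook input to neighbour joining but not necessarily what any particular tool stores; the function names are made up for illustration.

```python
import math


def dense_matrix_bytes(n_leaves: int, bytes_per_entry: int = 8) -> int:
    """Bytes for a full n x n distance matrix (assumed layout, 8-byte floats)."""
    return n_leaves * n_leaves * bytes_per_entry


def max_leaves(budget_bytes: int, bytes_per_entry: int = 8) -> int:
    """Largest n whose full n x n matrix fits in the given memory budget."""
    return math.isqrt(budget_bytes // bytes_per_entry)


if __name__ == "__main__":
    # Memory grows quadratically with the number of leaves.
    for n in (1_000, 10_000, 30_000):
        gib = dense_matrix_bytes(n) / 2**30
        print(f"{n:>6} leaves: {gib:.2f} GiB")

    # How many leaves fit if the matrix alone may use 4 GiB?
    print("fits in 4 GiB:", max_leaves(4 * 2**30), "leaves")
```

Under those assumptions, the matrix alone passes 4 GiB at a little over 23,000 leaves, before counting anything else the program needs to keep around.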
For large trees, forget about ML or Bayesian approaches. They do not scale (yet).
People do use neighbour joining, so shut up and review the paper for what it is, not what you want it to be. Grrr!