19 Dec

Back up again

Well, I guess I’m back…

When I wrote the post on admixture proportions the other day I got back to this blog after having neglected it for a very long time. The WordPress dashboard was lit up with necessary updates, and half of them required an update of the underlying software, such as MySQL and PHP.

I couldn’t do that myself, so I asked to have it updated on the hosting server. That got done, but in the process the site moved to a different server, so it has been offline for a bit: first to fix some software issue, and then because it takes a little while for the various DNS servers to update their caches.

Anyway, from where I’m sitting now, the site is up and running again.

17 Dec

Estimating admixture proportions

I am not entirely sure about this, but something seems wrong to me in a number of papers I have read recently.

A couple of them I even reviewed before they were published so if I am right in my suspicion I am partly responsible.

Anyway, it has to do with estimating the admixture proportions when one population, let’s call it X, is admixed between two other populations, A and B, say. Rather, two populations A’ and B’, A’ closely related to A and B’ closely related to B, admixed to create the population X’ ancestral to X. X’ was created with a proportion of α from A’ and β=1-α from B’.

We want to estimate α.

In Durand et al. (2011) we get a test for this. It is based on counting ABBA-BABA patterns (essentially the D statistic without normalisation) and comparing these counts for two selected quartets of populations. They call it the f^ estimator, and it is described around equations (7) and (8).

First there is one version where, in terms of the populations I described above, you compare the quartet (A, X, B, O) with (A1, A2, B, O), using two samples A1 and A2 from A. The idea here is, as far as I understand, that A2 must be completely “A”, so we contrast how “A” X is with someone who is completely A.

There is nothing wrong with that, but it isn’t an estimate of the admixture proportions. It doesn’t take into account that “A-ness” has evolved since the admixture time — potentially for a long time if that event is far back in time — so we are seeing both the admixture and that evolution.

The second version takes another sequence related to A but that branched off before the admixture event. If we use that version we can actually get an estimate of the admixture proportions.

I will shortly explain how, but first let me mention the thing that worries me: I see the first version being used to estimate the proportions, (generally) without acknowledging that this isn’t what it is doing. Worse, if you compare two populations to figure out how admixed they are and you ignore this problem, how do you know that it is the admixture proportions you are measuring and not the drift after the admixture event?

Okay, to the estimator.

I find it easier to think in terms of the f4 statistics from Patterson et al. (2012). In general the way of thinking about drift evolving along admixture graphs I find extremely elegant and easy to reason about, at least compared to counts of site patterns.

The f4 statistic, which is essentially the D statistic and so very similar to the Durand ABBA-BABA counts, captures the overlap between the “drift flow” between two pairs of populations. f4(A,B;C,O), for example, is the drift on the overlap of the path from A to B and the path from C to O. That is the overlap between the blue and the green line, or the drift on edge x. f4(A,B;C,O) = f4(C,O;A,B) = x

When there is admixture, the drift from one population to another takes more than one path, so for example the drift from X to B takes two different routes: one over the edge close to A, with probability alpha, and one over the edge close to B, with probability beta. For f4(C,O;X,B) we therefore again have the only overlap on edge x, but we only take that path with probability alpha (the path we take with probability beta doesn’t overlap the path from C to O, so it doesn’t get counted). f4(C,O;X,B) = αx.

Since f4(C,O;A,B) = x and f4(C,O;X,B) = αx we can estimate α as f4(C,O;X,B)/f4(C,O;A,B). This is called the f4 ratio estimator in Patterson et al. and is essentially the same as the second f^ estimator from Durand et al.
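
To make the ratio concrete, here is a minimal Python sketch. It assumes f4 is computed from per-SNP allele frequencies as the average of (a - b)(c - d), which is the usual allele-frequency form of the statistic. The function names and the toy frequencies are made up for illustration, and X is constructed as an exact mixture with no drift and no sampling noise, so this only checks the algebra, not the behaviour on real data.

```python
def f4(p_a, p_b, p_c, p_d):
    """f4(A,B;C,D): average over SNPs of (a - b) * (c - d)."""
    products = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    return sum(products) / len(products)


def f4_ratio(p_c, p_o, p_x, p_a, p_b):
    """Estimate alpha as f4(C,O;X,B) / f4(C,O;A,B)."""
    return f4(p_c, p_o, p_x, p_b) / f4(p_c, p_o, p_a, p_b)


# Toy allele frequencies at four SNPs (invented numbers, not real data).
p_a = [0.10, 0.80, 0.45, 0.30]
p_b = [0.60, 0.20, 0.50, 0.90]
p_c = [0.15, 0.70, 0.40, 0.35]
p_o = [0.50, 0.50, 0.55, 0.80]

# Assumption: X is an exact alpha : (1 - alpha) mixture of A and B.
alpha = 0.3
p_x = [alpha * a + (1 - alpha) * b for a, b in zip(p_a, p_b)]

estimate = f4_ratio(p_c, p_o, p_x, p_a, p_b)
print(estimate)
```

With X built this way, x - b = alpha (a - b) at every SNP, so the ratio recovers alpha up to floating point error.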

When the admixture event — or at least the branching off of the population that will admix — is ancestral to both A and C we have a different topology so the ratio is not equal to alpha. f4(C,O;A,B) = x + y so now we have f4(C,O;X,B)/f4(C,O;A,B) = αx / (x + y).

It is a lower bound for alpha, but how much below alpha you get depends on the length of branch y.
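
To get a feeling for how far below alpha the ratio can fall, here is a small Python sketch that just evaluates αx / (x + y) for a few values of y. The function name and the branch lengths are invented for illustration; they are not estimates from any data set.

```python
def expected_f4_ratio(alpha, x, y):
    """Expected f4 ratio under the second topology: alpha * x / (x + y)."""
    return alpha * x / (x + y)


alpha = 0.3  # true admixture proportion (made-up value)
x = 0.10     # drift on edge x (made-up value)

# The ratio equals alpha when y = 0 and shrinks as branch y gets longer.
for y in (0.0, 0.05, 0.10):
    print(y, expected_f4_ratio(alpha, x, y))
```

When y is as long as x, the ratio is only half of alpha, so the underestimate can be substantial.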

Unless I am misunderstanding the f^ statistics, and it is very different from the f4 ratio estimator, I think I am seeing several papers estimating alpha using the second topology. All those estimates are then too low.

Or am I missing something?

Durand, E.Y. et al., 2011. Testing for ancient admixture between closely related populations. Molecular Biology and Evolution, 28(8), pp.2239–2252.

Patterson, N. et al., 2012. Ancient admixture in human history. Genetics, 192(3), pp.1065–1093.

31 Oct

Workflow for ReadCube and Papers3

I am switching between ReadCube and Papers a lot. I like both tools (although I have had my share of problems with both as well), but they have different strengths and I want a combination.

ReadCube is really great for finding papers. Their enhanced PDFs make it very easy to get to cited papers and they are really good at finding papers that have cited what you are reading. So for reading through the literature it is my favourite. It sucks when you have to cite papers, though. I have tried to use its citation tool but it only works with Word and you are screwed if you want to sort the cited references alphabetically.

Papers is better for citing, especially when you want to use BibTeX. It works well with most editors and it is very easy to export your library to a BibTeX file. It isn’t automated, which is a pity, but it is reasonably good.

So what I really want is to use ReadCube when collecting my papers and then use Papers when citing them.

Here’s a little trick for automatically importing papers to Papers when you import them into ReadCube.

You can tell Papers to watch a folder for PDFs and automatically import them. So if you go to preferences you can tell it to watch the ReadCube files. They should be in your Documents folder, in a folder called something like ReadCube Media.

This won’t import the papers that are already there, though; it just gives you a folder where any PDFs you add will be imported.

To automatically add new papers you go to Automator and make a Folder Action that simply copies new files into this folder.

Now, when you import a file in ReadCube you will also get it added to Papers.
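
If you would rather script the copying step than build it in Automator, something like the following Python sketch does the same job. The function name is made up, and both folder paths in the usage comment are placeholders: I am assuming the ReadCube PDFs sit somewhere under one folder and that Papers watches another, so adjust them to whatever your setup actually uses.

```python
import shutil
from pathlib import Path


def copy_new_pdfs(source_dir, watched_dir):
    """Copy PDFs found under source_dir into watched_dir, skipping files already there."""
    source = Path(source_dir)
    target = Path(watched_dir)
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for pdf in sorted(source.rglob("*.pdf")):
        destination = target / pdf.name
        if not destination.exists():  # only copy files we have not copied before
            shutil.copy2(pdf, destination)
            copied.append(destination)
    return copied


# Hypothetical usage; both paths are placeholders, not the real locations:
# copy_new_pdfs(Path.home() / "Documents" / "ReadCube Media",
#               Path.home() / "Documents" / "Papers Watch Folder")
```

Files are matched by name only, so two different PDFs with the same file name would count as the same paper here.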

03 Mar

Discussing papers during the review process…

For the last two or three years I’ve been signing my reviews. I find that I write better reviews when I’m not anonymous.

It shouldn’t be like that, but it is. I’m less likely to get lazy if I know that people will see who wrote the review.

It creates a dilemma, though: should I discuss manuscripts with authors before they are published? I am likely to run into them at meetings, and it is hard not to talk about the manuscripts there. The dilemma is also there when the authors are people I frequently talk to online.

As a reviewer, you have two tasks: making sure that the science is solid and improving the presentation of the results (the manuscript). The second part is probably easier to do with a back-and-forth discussion, but the thing is, as a reviewer you are not really working for the authors but for the editor.

It is the editor who ultimately has to make a decision on the manuscript, and it is him or her you are assisting. This is why you really shouldn’t write your recommendations for acceptance or rejection in the review but only tell the editor. The editor needs to know what your concerns are and what the authors are doing to address them.

On the other hand, the paper moves forward faster when you don’t have to wait weeks between each point and counter-point.

It gets even weirder when the manuscript is already out there on a preprint server, and you have already discussed it with the authors before you find yourself a reviewer of it, which has happened to me a couple of times recently.

How do you guys deal with these things?

24 Aug

High school

It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to Heaven, we were all going direct the other way – Dickens

Middle school — or junior high, which we called it when I went — should really be classified as a form of child abuse. I recognize that it isn’t as bad for everyone as it was for me, but those two years I spent in 7th and 8th grade were easily the worst and most unhappy times of my life. – Starts with a Bang

My high school (gymnasium) just turned 90. I wasn’t actually aware of this until I visited my home town this weekend, where my uncle gave me the jubilee issue of the local newspaper about it.

I can sort of relate to Ethan’s (from Starts with a Bang) description of high school. Your late teens are definitely a time when you are being judged all the time for whatever you do, fair or not. It wasn’t a bad time for me, though.

Working in science was far from my ideas of my future at the time. I wanted to be a musician and spent my years there playing in bands (earning a small living teaching guitar and playing at bars) and made a lot of friends doing that.

Reading this special issue of the paper about my old school is a lot of fun for me. I see people there who were good friends back then. People I played music with or just partied with. Found out for the first time that one of the professors at computer science went to my old school. Really a lot of fun.

I remember my time there very fondly. It was a time of music, philosophy and a lot of parties. Very different from my life now, focused on science. I think I would have been a very different man now if I hadn’t spent three years immersed in what is essentially humanities.

So happy birthday Herning Gymnasium and I hope you will invite me to write a little piece when you turn 100.