One of the reasons I’m in the UK this week is to work with David Balding on HapCluster, a method developed in his group that I have extended in various directions.

There’s a few extensions we want to add, and maybe (finally) get a paper written about all the extensions. The latter probably won’t happen … even with lots of extensions is is hard getting a paper accepted that just adds features here and there, and it is always necessary to compare with a lot of other tools (when really no single tool can beat all the others unless you find just the right “sweet spot” in parameter space where you are lucky, and that takes months of fiddling with parameters and I just can’t be bothered).

Anyway, one of the extensions we are looking at is mapping of quantitative traits. Ed Waldron already did this in his thesis, but I never got around to implementing it in my tool, ’cause I didn’t fully understand the mathematics. I still don’t, but I think that I get enough to implement the formulas now at least.

The only problem is that it is a lot of matrix expressions that I need to implement, and that is just hell to do in C++. At least if you, as I do, use GSL and CBLAS.

A simple expression such as

$$!\mathbf{V}^* := \left(\mathbf{V}^{-1} + \mathbf{B}^T\mathbf{B}\right)^{-1}$$

becomes

gsl_matrix *
calc_Vstar(const gsl_matrix *B, const gsl_matrix *invV)
{
gsl_matrix *copy = gsl_matrix_alloc(invV->size1, invV->size2);
gsl_matrix_memcpy(copy, invV);
gsl_blas_dgemm(CblasTrans, CblasNoTrans, 1.0, B, B, 1.0, copy);
gsl_matrix *Vstar = invert_matrix(copy);
gsl_matrix_free(copy);
return Vstar;
}

with a pre-computed $$\mathbf{V}^{-1}$$.

Computing

$$!\mathbf{m}^* := \left(\mathbf{V}^{-1} + \mathbf{B}^T\mathbf{B}\right)^{-1}\left(\mathbf{V}^{-1}\mathbf{m}+\mathbf{B}^T\mathbf{Y}\right)$$

we can re-use the computation from before, but it still ends up as this long function:

gsl_matrix *
calc_mstar(const gsl_matrix *Vstar, const gsl_matrix *invV,
const gsl_matrix *B, const gsl_matrix *m,
const gsl_matrix *y)
{
assert(Vstar->size1 == Vstar->size2);
int p = Vstar->size1;
// x := V^-1 m
gsl_matrix *x = gsl_matrix_alloc(p, 1);
gsl_blas_dgemm(CblasNoTrans, CblasNoTrans, 1.0, invV, m, 0.0, x);
// z := B'Y
gsl_matrix *z = gsl_matrix_alloc(p, 1);
gsl_blas_dgemm(CblasTrans, CblasNoTrans, 1.0, B, y, 0.0, z);
// x := V^-1 m + B'Y = x + z
for (int i = 0; i < p; ++i)
gsl_matrix_set(x, i, 0, gsl_matrix_get(x, i, 0) + gsl_matrix_get(z, i, 0));
// m* := (V^-1 + B'B)^-1 (V^-1 m + B'Y) where V* = (V^-1 + B'B)^-1
gsl_matrix *mstar = gsl_matrix_alloc(p, 1);
gsl_blas_dgemm(CblasNoTrans, CblasNoTrans, 1.0, Vstar, x, 0.0, mstar);
gsl_matrix_free(x);
gsl_matrix_free(z);
return mstar;
}

You have to break down the expressions into tiny little chunks, like you are writing assembly programs for arithmetic. It is very frustrating.

It is low-level, of course, because the direct translation of matrix expressions to code is pretty inefficient, with lots of temporary matrices being created and deleted. Still, that is pretty much how my code looks anyway, I am just manually doing the direct translation, ’cause I don’t want to spend too much time trying to optimise the code before I know that it actually works (and is too slow to be worth optimising).

Template expressions, like in Blitz++, should be able to handle this, but I have never managed to learn that library. I don’t really deal with matrix expressions that often, so I haven’t bothered. If I end up having to do much more matrix hacking, I will. Life is just too short to manually coding it up…

—

118-134=-16