Some thoughts on grid computing…
Wednesday, October 8th, 2008
Earlier this week, the LHC Computing Grid went online. A description of the system can be found here, and blog posts about it here, here and here.
This got me thinking about grid computing for small scale scientists like myself.
I’ve had some experience with grid computing (see an old post about it here) but mostly I have found it too much trouble to be worth the effort.
Our typical computer use
For large projects that require years of CPU time, it is well worth the effort to set up the infrastructure to run computations on grids. You really need it to get your computations done, and the overhead is very small in comparison with the actual computation time.
Most of my projects — and most of the projects we do at BiRC — are a bit different.
We do need the computation power, but we are usually tinkering with our programs for most of a project — since we rarely know exactly how to analyse our data until we are mostly done with it — so we cannot just distribute a fixed version of our software and then start distributing the computations.
The typical work flow is that we write a program for our analysis, then we run the analysis and when we look at the results we find some strange results here and there. Then we extend the software to either extract more information from the data, or to fix a bug that caused the weird results.
We then need to run the analysis again, and repeat the process.
The analysis might take a few CPU days to a few CPU months — so it is small scale for grid applications — but between each analysis we spend a week or so modifying and testing our software.
We have a small cluster of Linux computers for this, and it is always in one of two states: completely overloaded or burning idle cycles.
This is the situation grid computing could fix. Theoretically we should be able to get CPU cycles off the grid when we need them, and sell them to the grid when we are not running computations ourselves.
In practice, our work pattern makes this difficult.
The problems with small scale grid computing
If you are changing your software all the time, you need to distribute it together with the data you analyse.
This means you either send compiled binaries with the job submissions, or you compile the software as part of the job.
The former is fine if you have a program you can compile, and you’d better link it statically ’cause there are no guarantees about the libraries you can find on the resources that will run it.
If you have a bunch of scripts, you are not so lucky.
There are no guarantees that the computer that will run the computations has the script interpreter — or if it does that it is a version that can run your script — and even if it does, what about the modules you need?
You don’t want to have to compile BioPython or SciPy on a grid machine just to run your scripts. The overhead in CPU time is going to be several percent of your actual run (at least if you parallelise your computations to a high enough degree to be worth the grid in the first place), and how can you even know that there is a compiler to compile it at the other end? You can’t, and there probably isn’t unless you are very lucky.
It is a major pain to see your jobs aborted after slowly making their way through the job queue, just because the host computer cannot even set up the environment you need for your computations.
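One way to at least fail fast is to start each job with a small preflight check. The sketch below is only an illustration: the function name `missing_requirements` and the example requirements are my own inventions, and it assumes the host has some Python 3 interpreter to run the check in the first place.

```python
import importlib.util
import sys


def missing_requirements(min_python, modules):
    """Return a list of problems; an empty list means the host looks usable.

    min_python is a (major, minor) tuple; modules is a list of top-level
    module names (hypothetical requirements chosen by the job author).
    """
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        problems.append("Python %d.%d or newer is required" % tuple(min_python))
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            problems.append("module %r is not installed" % name)
    return problems
```

A job script could call something like `missing_requirements((3, 8), ["scipy", "Bio"])` first and exit with an error message if the list is not empty. That does not save the time spent waiting in the queue, but it does make the reason for the abort obvious.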
What can we do about it?
If we want to use the grid for even smaller scale computations, at the very least we need an easier way to distribute new versions of our programs.
I have an idea for this.
Some grids, at least, are already dealing with “runtime environments” where you can specify that your job needs to run in a certain runtime environment, and the scheduler will only send your jobs to resources that can provide that environment.
This sounds like just the thing, but the catch is that it is up to the resource administrators to set up these environments and to tell the grid system that they provide them.
For something like LHC, it is probably not a problem to convince administrators to provide the right environment, but for Thomas Mailund it is.
What we need is a way for the grid users to be able to install environments on the resources!
So how about this: we introduce the concept of “runtime environment packages” that we can upload to the grid system. They consist of a setup script (configure ; make) and a test suite, for example.
When a resource is idle, it tests if there are new environments available in queue, downloads these, and tries to build and test them. If it succeeds, it informs the grid system that it can run the new type of environment. The scheduler only sends jobs to resources that have the right environments, so if your environment tests are working properly, you never end up on a resource that cannot run your jobs.
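To make the idea a bit more concrete, here is a rough sketch of the two halves: the resource that builds and tests packages when it is idle, and the scheduler that filters resources by environment. No existing grid middleware works like this as far as I know; `Resource`, `build_and_test`, `poll_when_idle` and `eligible_resources` are all hypothetical names, and the setup and test commands are whatever the package author supplies.

```python
import subprocess


def build_and_test(package_dir, setup_cmd, test_cmd):
    """Run the setup script and then the test suite inside package_dir.

    Both commands are argument lists; the package only counts as
    installed if both exit with status 0.
    """
    for cmd in (setup_cmd, test_cmd):
        if subprocess.run(cmd, cwd=package_dir).returncode != 0:
            return False
    return True


class Resource:
    """A grid resource that remembers which environments it can provide."""

    def __init__(self, name):
        self.name = name
        self.environments = set()

    def poll_when_idle(self, pending, installer):
        """Try every queued environment package we do not yet provide.

        pending maps environment names to package locations; installer
        is a callable that returns True if build and tests succeeded.
        """
        for env_name, package in pending.items():
            if env_name not in self.environments and installer(package):
                self.environments.add(env_name)


def eligible_resources(job_env, resources):
    """Scheduler side: only resources that advertise the environment."""
    return [r for r in resources if job_env in r.environments]
```

The installer is passed in as a callable so that the resource logic can be exercised without actually building anything; in a real setup it would wrap `build_and_test` with the package's own commands.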
We could even add environment requirements on the environment packages, so they don’t have to be self-contained. E.g. to install SciPy, you don’t want to have to install Python itself, and there is no reason for resources without Python to try to install it only to give up.
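The requirement check itself could be very simple. In this sketch, which uses a made-up function name `attemptable` and a made-up package table, a resource only attempts packages whose requirements it already provides.

```python
def attemptable(packages, provided):
    """Names of packages worth trying on this resource.

    packages maps a package name to the list of environments it requires;
    provided is the set of environments the resource already offers.
    """
    provided = set(provided)
    return sorted(
        name
        for name, requires in packages.items()
        if name not in provided and set(requires) <= provided
    )
```

With `packages = {"scipy": ["python"]}`, a resource that provides `python` would try to build `scipy`, while a resource without Python would skip it entirely.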
To prevent resources from filling up with old environments, we can add a timeout period to environments, so they are deleted when they haven’t been used for a couple of days/weeks/months.
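The clean-up could be little more than a timestamp comparison. A minimal sketch, assuming each resource records when an environment was last used (the `last_used` table and the function name `expired` are hypothetical):

```python
import time


def expired(last_used, max_idle_seconds, now=None):
    """Names of environments that have been idle longer than the timeout.

    last_used maps environment names to the time of last use, in seconds
    since the epoch; now can be supplied explicitly, which helps testing.
    """
    if now is None:
        now = time.time()
    return sorted(
        name for name, used in last_used.items()
        if now - used > max_idle_seconds
    )
```

The resource would then delete whatever this returns and tell the grid system that it no longer provides those environments.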
It shouldn’t be that hard to implement. I am sure I could do it, but I don’t have my own grid infrastructure to work with, so I guess I’ll have to intimidate, I mean persuade, someone else to do it…