Posts Tagged ‘grid computing’

The dark cloud

Saturday, September 5th, 2009

Cory Doctorow discusses the dark side of cloud computing in Not Every Cloud has a Silver Lining:

Here’s something you won’t see mentioned, though: the main attraction of the cloud to investors and entrepreneurs is the idea of making money from you, on a recurring, perpetual basis, for something you currently get for a flat rate or for free without having to give up the money or privacy that cloud companies hope to leverage into fortunes.

That is, of course, true, but for computationally or memory-intensive work, the cloud is probably still the way to go.  And of course, it all comes down to pricing as well.  It doesn’t matter so much that I pay per cycle used if I end up paying less than I would for running my own computer…
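To make the pricing point concrete, here is a minimal sketch of the break-even arithmetic.  The function names and all the numbers are made up for illustration; they are not real cloud prices.

```python
def break_even_cpu_hours(cloud_price_per_cpu_hour, own_cost_per_year):
    """CPU hours per year below which renting is cheaper than owning."""
    return own_cost_per_year / cloud_price_per_cpu_hour


def cloud_is_cheaper(cpu_hours_per_year, cloud_price_per_cpu_hour, own_cost_per_year):
    """True if paying per CPU hour costs less than running my own machine."""
    return cpu_hours_per_year * cloud_price_per_cpu_hour < own_cost_per_year


# Hypothetical numbers: 0.25 per CPU hour in the cloud, and 1000 per year
# (hardware, power, administration) for a machine of my own.
print(break_even_cpu_hours(0.25, 1000.0))
print(cloud_is_cheaper(2000, 0.25, 1000.0))
```

With these invented numbers, anyone using fewer CPU hours per year than the break-even figure comes out ahead by renting.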

Anyway, read the piece for a different view on cloud computing.

Virtualisation on the cloud

Saturday, June 13th, 2009

Yesterday I was at an interesting talk at the computer science department: “Virtualization will be the Operating System of the Cloud” by Steffen Grarup, VMware.

Virtualization is a neat approach to cloud computing.  You bundle up your application with the OS and whatever else it needs, pack it in a VMware virtual machine, and then it can run on any machine in the cloud.  Well, any Intel or AMD machine, I guess; I don’t know what else is supported, and of course there are some hardware limits.  The cool thing is that the dependencies on the software on the computers “out there”, which you have very little control over, are virtually gone.

I expect there to be a lot of devils in the details – like what happens if your code depends on e.g. SSE instruction sets that are not available on all processors, or what happens if your code thinks it has two or four cores and optimises its thread usage for that, but the real hardware has fewer – but a neat idea it is.
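One defensive option is to ask the machine what it offers at start-up instead of assuming it.  The sketch below is only that, a sketch: it assumes a Linux guest where CPU features are listed on the "flags" line of /proc/cpuinfo, and the function names are my own.  Note that it only sees what the guest OS reports, so it does not help if the virtual machine claims more cores than the real hardware has.

```python
import os


def parse_cpu_flags(cpuinfo_text):
    """Extract the CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            return set(line.split(":", 1)[1].split())
    return set()  # no flags line found (e.g. not a Linux x86 machine)


def worker_count(wanted=4):
    """Never start more workers than the machine reports cores."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(wanted, available))


# Example with a hand-written flags line; on a real Linux machine you
# would pass in the contents of /proc/cpuinfo instead.
sample = "model name\t: Some CPU\nflags\t\t: fpu sse sse2\n"
use_sse2 = "sse2" in parse_cpu_flags(sample)
print(use_sse2, worker_count(4))
```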

Is less really more?

Wednesday, January 28th, 2009

This leader in The Economist argues that we are now using Moore’s law to get cheaper computers, rather than more powerful computers.

Constant improvements mean that more features can be added to these products each year without increasing the price. A desire to do ever more elaborate things with computers—in particular, to supply and consume growing volumes of information over the internet—kept people and companies upgrading. Each time they bought a new machine, it cost around the same as the previous one, but did a lot more. But now things are changing, partly because the industry is maturing, and partly because of the recession. Suddenly there is much more interest in products that apply the flip side of Moore’s law: instead of providing ever-increasing performance at a particular price, they provide a particular level of performance at an ever-lower price.

I’m not sure that I agree.

Sure, our current computers are “good enough” for what we use them for: office applications, net surfing, watching a movie when on the move, etc.  But we still want more.

We want new features.  Most features in an office package we will never use, but all those that are there are there because someone needed them, and when you want a feature, you want it there.

The features we want are probably more specialised.  The basic features that everyone uses have been around for ages.  So a new feature that you would love to see would probably only benefit a few, but it would be great for those few.

I think that what is changing is that we only want the features we need and not those features that everyone else needs.

We don’t want to pay for an upgrade that adds 100 features where we only need one of them.  We just want the one feature we need.

So our approach to computing changes.  We move online.

We are happy to get features from Internet services that give us what we need, but we don’t want to have all those features we don’t need installed on our local machine, slowing everything down and confusing our user experience.

The reason “net books” are hot is not that they are cheaper as such.  Sure, it helps sales that they are cheap, but they also provide the services we need and are likely to provide more and more services over time.

A net book is just an interface to the Net, and the services there keep getting better.

We are not demanding less; we have just realised that the computations don’t have to run on our desktop.  They can run somewhere else.  In the “cloud”.

It’s grid computing, baby.  Cloud computing.

Your interface to it might be getting cheaper — and why not? — but you still want more and more.

Some thoughts on grid computing…

Wednesday, October 8th, 2008

Earlier this week, the LHC Computing Grid went online.  A description of the system can be found here, and blog posts about it here, here and here.

This got me thinking about grid computing for small scale scientists like myself.

I’ve had some experience with grid computing (see an old post about it here) but mostly I have found it too much trouble to be worth the effort.

Our typical computer use

For large projects that require years of CPU time, it is well worth the effort to set up the infrastructure to run computations on grids.  You really need it to get your computations done, and the overhead is very small in comparison with the actual computation time.

Most of my projects — and most of the projects we do at BiRC — are a bit different.

We do need the computation power, but we are usually tinkering with our programs for most of a project — since we rarely know exactly how to analyse our data until we are mostly done with it — so we cannot just distribute a fixed version of our software and then start distributing the computations.

The typical work flow is that we write a program for our analysis, then we run the analysis and when we look at the results we find some strange results here and there. Then we extend the software to either extract more information from the data, or to fix a bug that caused the weird results.

We then need to run the analysis again, and repeat the process.

The analysis might take a few CPU days to a few CPU months — so it is small scale for grid applications — but between each analysis we spend a week or so modifying and testing our software.

We have a small cluster of Linux computers for this, and it is always in one of two states: completely overloaded or burning idle cycles.

This is the situation grid computing could fix.  Theoretically we should be able to get CPU cycles off the grid when we need them, and sell our own cycles to the grid when we are not running computations ourselves.

In practice, our work pattern makes this difficult.

The problems with small scale grid computing

If you are changing your software all the time, you need to distribute it together with the data you analyse.

This means you either send compiled binaries with the job submissions, or you compile the software as part of the job.

The former is fine if you have a program you can compile, and you’d better link it statically, because there are no guarantees about the libraries you will find on the resources that run it.

If you have a bunch of scripts, you are not so lucky.

There are no guarantees that the computer that will run the computations has the script interpreter — or if it does that it is a version that can run your script — and even if it does, what about the modules you need?

You don’t want to have to compile BioPython or SciPy on a grid machine just to run your scripts.  The overhead in CPU time is going to be several percent of your actual run (at least if you parallelise your computations to a high enough degree to be worth the grid in the first place), and how can you even know that there is a compiler at the other end?  You can’t, and there probably isn’t one unless you are very lucky.

It is a major pain to see your jobs aborted after slowly making their way through the job queue, just because the host computer cannot even set up the environment you need for your computations.
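At the very least, a job can check its environment first and fail fast with a clear message.  Here is a minimal sketch of such a pre-flight check in present-day Python; the function name, the minimum version and the module list are all assumptions you would adapt to your own scripts.

```python
import importlib.util
import sys


def missing_requirements(min_python=(3, 8), modules=("scipy", "Bio")):
    """List what this host lacks; an empty list means the job can start."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append("Python %s is too old" % found)
    for name in modules:
        # find_spec() returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            problems.append("module %s not found" % name)
    return problems


if __name__ == "__main__":
    problems = missing_requirements()
    if problems:
        sys.exit("cannot run here: " + "; ".join(problems))
```

This does not stop the job from landing on the wrong host, but it turns a confusing crash into a one-line explanation.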

What can we do about it?

If we want to use the grid for even smaller scale computations, at the very least we need an easier way to distribute new versions of our programs.

I have an idea for this.

Some grids, at least, are already dealing with “runtime environments” where you can specify that your job needs to run in a certain runtime environment, and the scheduler will only send your jobs to resources that can provide that environment.
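The matching itself is simple.  As a rough sketch (the resource names and environment labels are invented, and no real grid scheduler is this naive), the scheduler just filters on what each resource advertises:

```python
def eligible_resources(required_env, resources):
    """Return the resources that advertise the requested runtime environment."""
    return sorted(name for name, envs in resources.items() if required_env in envs)


# Hypothetical resources and the environments their administrators set up.
resources = {
    "cluster-a": {"python-2.5", "scipy-0.6"},
    "cluster-b": {"python-2.5"},
    "cluster-c": set(),
}
print(eligible_resources("scipy-0.6", resources))
```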

This sounds like just the thing, but the catch is that it is up to the resource administrators to set up these environments and to tell the grid system that they provide them.

For something like LHC, it is probably not a problem to convince administrators to provide the right environment, but for Thomas Mailund it is.

What we need is a way for the grid users to be able to install environments on the resources!

So how about this: we introduce the concept of “runtime environment packages” that we can upload to the grid system.  They consist of a setup script (configure ; make) and a test suite, for example.
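Such a package could be as small as this.  It is a sketch of the idea, not an existing grid API: the class name, the fields and the example commands in the comments are all my own assumptions.

```python
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class EnvironmentPackage:
    name: str
    setup_cmd: List[str]  # e.g. ["sh", "-c", "./configure && make"]
    test_cmd: List[str]   # e.g. ["make", "check"]


def try_install(package, workdir="."):
    """Run the setup script, then the test suite; True only if both succeed."""
    for cmd in (package.setup_cmd, package.test_cmd):
        try:
            result = subprocess.run(cmd, cwd=workdir, capture_output=True)
        except OSError:
            return False  # the command does not even exist on this host
        if result.returncode != 0:
            return False
    return True
```

A resource would only advertise the environment if try_install returns True, so a missing compiler simply means the resource never offers it.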

When a resource is idle, it checks whether there are new environments available in the queue, downloads these, and tries to build and test them.  If it succeeds, it informs the grid system that it can run the new type of environment.  The scheduler only sends jobs to resources that have the right environments, so if your environment tests are working properly, you never end up on a resource that cannot run your jobs.
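In code, the idle-time step could look roughly like this.  It is a sketch under my own assumptions: the queue is just a list of environment names, and build_and_test is a hypothetical callable that downloads, builds and tests one environment and returns True on success.

```python
def process_environment_queue(queue, installed, build_and_test):
    """Try every queued environment this resource does not have yet.

    Returns the names that were newly installed, which the resource
    would then report to the grid system.
    """
    newly_installed = []
    for name in queue:
        if name in installed:
            continue  # already available here, nothing to do
        if build_and_test(name):
            installed.add(name)
            newly_installed.append(name)
    return newly_installed


# Example: pretend that only "scipy" builds successfully on this host.
installed = {"python"}
print(process_environment_queue(
    ["python", "scipy", "biopython"], installed, lambda name: name == "scipy"))
```

Environments that fail to build are never advertised, so the scheduler never sends the matching jobs to this resource.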

We could even add environment requirements on the environment packages, so they don’t have to be self-contained.  E.g. to install SciPy, you don’t want to have to install Python itself, and there is no reason for resources without Python to try to install it only to give up.
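Checking those requirements is cheap.  A minimal sketch, where the package names and the "requires" lists are purely illustrative:

```python
def worth_attempting(packages, installed):
    """Names of packages whose required environments are all installed."""
    return sorted(
        name
        for name, requires in packages.items()
        if name not in installed and set(requires) <= installed
    )


# Hypothetical requirement lists: SciPy and BioPython both need Python.
packages = {"python": [], "scipy": ["python"], "biopython": ["python"]}
print(worth_attempting(packages, set()))
print(worth_attempting(packages, {"python"}))
```

A resource without Python would never waste time trying to build SciPy.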

To prevent resources from filling up with old environments, we can add a timeout period to environments, so they are deleted when they haven’t been used for a couple of days/weeks/months.
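The clean-up could be a simple sweep over last-use timestamps.  Again just a sketch: the function name is made up, and I assume each resource records when an environment was last used, in seconds since the epoch.

```python
import time


def expired_environments(last_used, max_idle_seconds, now=None):
    """Names of environments that have been idle longer than the timeout."""
    if now is None:
        now = time.time()
    return sorted(
        name for name, stamp in last_used.items() if now - stamp > max_idle_seconds
    )


# Example with fixed timestamps and a timeout of 30 days.
thirty_days = 30 * 24 * 60 * 60
last_used = {"scipy": 1000000, "biopython": 1000000 + 2 * thirty_days}
print(expired_environments(last_used, thirty_days, now=1000000 + 2 * thirty_days))
```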

It shouldn’t be that hard to implement.  I am sure I could do it, but I don’t have my own grid infrastructure to work with, so I guess I’ll have to intimidate, or rather persuade, someone else to do it…