Posts Tagged ‘mac’

This is driving me crazy!!!

Monday, September 21st, 2009

I’m trying to profile some changes to CoalHMM today, but for some reason both Instruments and Shark flatly refuse to show me debug information (like the source code) for functions in the Bio++ libraries.

Over the weekend, on my macbook, it worked just fine, but today, on a different machine, there is just no way to get it!  I have rebuilt all the libraries with the Debug target, but to no avail.

This is extremely frustrating, ’cause without the source code I cannot identify the hotspots inside a function.

I’m getting the impression that even though I compile the library with debug information, what gets installed is the Release target, but for the life of me I cannot figure out why.

And why on earth does it work differently on this machine?

Profiling with Instruments

Sunday, September 20th, 2009

Today I tried the same profiling exercise as yesterday, but using Instruments.

You can run it directly from Xcode by picking the “Run with performance tool” entry in the Run menu.  Turns out you can do the same with Shark, but I didn’t notice that yesterday.

There are a lot of things you can profile with Instruments, but the only thing I have figured out how to use so far is CPU performance.

The way Instruments profiles is similar to Shark.  It samples during the execution of the program and thereby gets a picture of where the program spends its time.  The display of this is much nicer than Shark’s, though, with cool performance bars in various tracks.  You can add any number of tracks to profile memory usage, IO, etc. together with CPU performance, but as I said I haven’t yet figured out how to use these.

For CPU performance, it gives you an overview of which functions are taking up the run time, displayed very similarly to Shark:

[Figure: CPU profile]

A nice feature which I didn’t see in Shark – but perhaps it is there – is that you can also pick time slices of the execution and see which functions took up the time in that slice.  Say, if I zoom in on the first few seconds of the run, I can see which functions are used when reading in the data.  Something that will come in handy when I start my real work on figuring out how to improve that part of the program.
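To make the idea of a sampled profile and a time slice concrete, here is a toy model in C++.  It is only a sketch and says nothing about how Instruments works internally: the Sample struct, the profile_slice function and the function names used with it are all made up for illustration, and a real profiler records whole call stacks rather than a single function name.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One sample: when it was taken and which function was executing.
// Both fields are hypothetical; real profilers record full call stacks.
struct Sample {
    double time_s;
    std::string function;
};

// Count the samples per function inside the half-open time interval
// [begin_s, end_s). A function with more samples was seen running more
// often, which is the profiler's estimate of where the time went.
std::map<std::string, int> profile_slice(const std::vector<Sample>& samples,
                                         double begin_s, double end_s) {
    assert(begin_s <= end_s);
    std::map<std::string, int> counts;
    for (const Sample& s : samples) {
        if (s.time_s >= begin_s && s.time_s < end_s) {
            ++counts[s.function];
        }
    }
    return counts;
}
```

Calling profile_slice over the whole run gives the overall profile, while calling it with a narrow interval at the start of the run gives the counts for just that slice, such as the part of the run that reads in the data.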

[Figure: CPU profile in time slice]

As with Shark, you can also get a profile for where in the source code the time is spent.  For the full run, that is (of course) the same hotspot that Shark identified:

[Figure: Hotspot]

Unlike Shark, it doesn’t seem to give any hints on how to improve on the hotspot, but in this case it turns out to be a better choice, ’cause I learned something unpleasant about the code that is probably more valuable than the suggestion to use SSE instructions!

Browsing through the code — you can click on the function calls that work as hyperlinks — I found out that CoalHMM accesses the entries in the transition matrix through a virtual function call:

    /**
     * @brief Get the transition probability between two states.
     *
     * @param i initial state.
     * @param j final state.
     * @return the transition probability between the two states.
     */

    virtual double Pij(unsigned int i, unsigned int j) const = 0;

That is probably much more of a problem than not using SIMD instructions!

Getting the entry in a matrix shouldn’t take more than a few pointer calculations and a memory fetch, but here a function call is needed (forget about inlining virtual function calls, compilers can rarely do that), and on top of that it is a call to an address looked up at run time, which is very likely to stall the processor’s pipeline.

I haven’t tried changing it in CoalHMM so I don’t know how much it is costing us here, but in experiments we did when we implemented the SNPFile library we found that it was about an order of magnitude slower to use a virtual function to access a matrix entry.
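A minimal sketch of the two access paths, to show what the difference looks like in code.  This is not the CoalHMM or Bio++ implementation: the DenseTransitionMatrix class, its at() accessor and the two summing functions are hypothetical names I made up, and only the Pij signature is taken from the declaration above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Abstract interface in the style of the declaration quoted above.
class TransitionMatrix {
public:
    virtual ~TransitionMatrix() {}
    virtual double Pij(unsigned int i, unsigned int j) const = 0;
};

// Hypothetical concrete class: row-major storage in one flat vector.
class DenseTransitionMatrix : public TransitionMatrix {
public:
    DenseTransitionMatrix(unsigned int n, double value)
        : n_(n), data_(static_cast<std::size_t>(n) * n, value) {}

    // Virtual access: dispatched through the vtable when called via the base.
    double Pij(unsigned int i, unsigned int j) const { return at(i, j); }

    // Non-virtual access: an index calculation and a load, defined in the
    // class body so the compiler is free to inline it.
    double at(unsigned int i, unsigned int j) const {
        assert(i < n_ && j < n_);
        return data_[static_cast<std::size_t>(i) * n_ + j];
    }

    unsigned int size() const { return n_; }

private:
    unsigned int n_;
    std::vector<double> data_;
};

// Sums all entries through the abstract interface (one virtual call each).
double sum_virtual(const TransitionMatrix& m, unsigned int n) {
    double total = 0.0;
    for (unsigned int i = 0; i < n; ++i)
        for (unsigned int j = 0; j < n; ++j)
            total += m.Pij(i, j);
    return total;
}

// Sums all entries through the concrete type (no virtual dispatch needed).
double sum_direct(const DenseTransitionMatrix& m) {
    double total = 0.0;
    for (unsigned int i = 0; i < m.size(); ++i)
        for (unsigned int j = 0; j < m.size(); ++j)
            total += m.at(i, j);
    return total;
}
```

Both functions compute the same sum; the only difference is how each entry is reached.  How much the direct version gains in practice depends on the compiler and the surrounding code, so it would have to be measured with the profiler rather than assumed.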

It is perhaps not so surprising that this line is taking up much of the time when running the tool…

Profiling with Shark

Saturday, September 19th, 2009

I have absolutely no experience with profiling on a Mac.  I’ve used gprof and valgrind a lot on Linux, but now that I’ve started developing on Mac I need to learn how to profile here as well.

A bit of googling tells me that there are two nice tools for this, Shark and Instruments.  I have both installed and decided to try out Shark first, since that looked a bit easier to use.  I am also going to try out Instruments later, but my experience with Shark was pretty good.

It is a sampling-based profiler, so to use it you just start your application and then start sampling.  It will sample everything running on your computer, but if your program is doing a significant amount of work it will be easy to find it in the resulting performance profile, and you can then get rid of everything else with some filters.

I actually have something I need to profile having to do with file IO, but the data I need for that is on another machine that is now busy with actual computations, so for my experiments with Shark I just tried out our CoalHMM tool on the example data distributed with the code.

I started the tool, then started the sampling, and 30 seconds later I got this profile:

[Figure: Performance profile]

It is pretty clear from it that there is a hotspot worth looking at (in the Bio++ NumCalc library), and looking at the code Shark nicely shows where it is:

[Figure: Hotspot in the code]

It even gives hints as to what the problem could be and how to fix it.  Neat!

The hotspot doesn’t surprise me much.  The application is a hidden Markov model, and I fully expected that most of the time was spent in the Forward algorithm.  The solution doesn’t surprise me either – and we are already working on an SSE improvement.  Still, with profiling you can never be sure, so it is nice to have it confirmed.

I also tried the simple fix of enabling auto-vectorization (-ftree-vectorize) and compared that solution to the one before (something Shark also makes easy).

[Figure: Profile comparison]

It gives a very modest improvement, but I guess it isn’t that easy for the compiler to automatically insert SIMD instructions in code like this… I expect more from our hand-coded version, where we right now get two- to four-fold improvements, depending on whether we are using float or double precision.
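For illustration, here is the kind of loop a forward-style recursion boils down to, written so that the inner loop walks over contiguous memory, which is generally the shape auto-vectorizers handle best.  This is a sketch under my own assumptions and not the Bio++ NumCalc code: the forward_step function, its parameters and the flat row-major layout of the transition matrix are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One step of a forward-style recursion:
//   next[j] = emission[j] * sum_i prev[i] * trans[i * n + j]
// where trans is an n-by-n matrix stored row by row in a flat vector.
std::vector<double> forward_step(const std::vector<double>& prev,
                                 const std::vector<double>& trans,
                                 const std::vector<double>& emission) {
    const std::size_t n = prev.size();
    assert(trans.size() == n * n && emission.size() == n);

    std::vector<double> next(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        const double p = prev[i];
        const double* row = trans.data() + i * n;
        // Inner loop reads row[j] and updates next[j] at consecutive
        // addresses, with no function calls in the loop body.
        for (std::size_t j = 0; j < n; ++j) {
            next[j] += p * row[j];
        }
    }
    for (std::size_t j = 0; j < n; ++j) {
        next[j] *= emission[j];
    }
    return next;
}
```

Whether the compiler actually vectorizes a loop like this with -ftree-vectorize depends on the compiler version and the other optimisation flags, so the profile comparison is still the only way to know what you gained.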

This limits the usefulness of Xgrid a bit…

Saturday, September 19th, 2009

Ok, I noticed this yesterday but figured it was a configuration issue that I could deal with.  When I run jobs on Xgrid, it runs one job per CPU and not one per core, which for my current use means that I only have half the CPU power compared to manual distribution of jobs.

I read the documentation, and it is supposed to run a job per core, but something is wrong on Snow Leopard and this is apparently a known issue.

I hope this gets fixed before I have a real need for the grid.

Getting Xgrid up and running

Friday, September 18th, 2009

Ok, I don’t really have a lot of Macs, but I’m planning on getting a Mac Pro or two at the office for some of my computations, and I wanted to figure out how Xgrid works so I can use it for those.  As a proof of principle, I tried it out on my macbook and iMac.

Just getting the grid up and running, I ran into a few problems, so I’m going to write down here how I finally managed to get it going, so I can reproduce it later.

Setting up the grid

Step one, I downloaded the Xgrid Admin tool from here. I had also installed it earlier but without getting around to playing with it, but that installation disappeared with my upgrade to Snow Leopard and I had to install it again.

Starting the tool up, it asks for a controller.  I told it to just use my iMac and gave it a password.  All well and good, but so far no Agents to actually run any jobs.

Step two, I enabled Xgrid in the Sharing pane of System Preferences.

[Figure: Sharing]

Under Configure I picked the controller I had set up, and again I gave it a password.  Now comes the first problem I ran into.  I mistyped the password here – I wanted the same as for the controller just to make it easier on myself, but got it wrong.  The agent started up fine, didn’t complain about the password or anything, but it didn’t show up as an agent in the Admin tool.

I tried adding it there, but was told it was unavailable.  I mucked about with this for a while but just couldn’t get it to work at all.

It wasn’t until I tried connecting the macbook instead that I figured it out.  There I got the password right, and it popped up in the Admin tool.  So I made a wild guess about the password being the problem, retyped it in the Sharing dialogue, and now the iMac finally connected as an agent

[Figure: Agents]

and the Admin tool told me I had 5.46 GHz to compute with.

[Figure: Overview]

I’m a bit miffed that there were no authentication steps that could have told me what was going wrong, but I guess the trick is to just pick the same password for the controller and all the agents or something like that, ’cause that at least seems to work for me now.

Running a job

To submit jobs, you have to use the xgrid command.

Just running it gives you this:

$ xgrid
xgrid
usage: xgrid <options> <action> <parameters>
Any number of the following <options> may be specified:
 -h[ostname] <hostname-or-IP-address>
 -auth {Password | Kerberos}
 -p[assword] <password>
 -f[ormat] xml
A single <action> and its <parameters> must be specified:
 -grid list
 -grid rename -gid <grid-identifier> <new-name>
 -grid add <grid-name>
 -grid {delete | attributes} -gid <grid-identifier>
 -job list [-gid <grid-identifier>]
 -job {attributes | specification | log | wait} -id <job-identifier>
 -job submit [-gid <grid-identifier>] [-si <stdin>] [-in <indir>] \
 [-dids jobid[,jobid]*] [-email address] \
 [-art <art-path> -artid <art-identifier] [-artequal <art-value>] \
 [-artmin <art-value>] [-artmax <art-value>] <cmd> <arg1> ...
 -job batch [-gid <grid-identifier>] <xml-batch-submission-file>
 -job results -id <job-identifier> [-tid <task-identifier>] \
 [-so <stdout>] [-se <stderr>] [-out <outdir>]
 -job {stop | suspend | resume | delete | restart} -id <job-identifier>
 -job run [-gid <grid-identifier>] [-si <stdin>] [-in <indir>] \
 [-so <stdout>] [-se <stderr>] [-out <outdir>] [-email address] \
 [-art <art-path> -artid <art-identifier] [-artequal <art-value>] \
 [-artmin <art-value>] [-artmax <art-value>] <cmd> <arg1> ...

xgrid -?, or xgrid with no arguments, will print this usage message.

I don’t really know what the options mean, so I tried firing off a few, and I just kept getting the same output.  A bit disappointing.

I guessed that the hostname option was needed, but -hlocalhost just didn’t work for me.  Eventually I found out that “-h localhost” would.  Well, not exactly work, but at least it complained that I needed to authenticate the command:

$ xgrid -h localhost -grid list
{
 error = "could not connect to localhost (Authentication failed)";
}

Adding a password with “-p password” did the trick.  Again, you do need the space between -p and the password.

$ xgrid -h localhost -p password -grid list
{
 gridList =     (
 0
 );
}

I don’t know what the output means here, but at least I was making progress.

Asking for a job list (I’m guessing here) gave me an empty list:

$ xgrid -h localhost -p password -job list
{
 jobList =     (
 );
}

which I expected since I haven’t submitted any jobs, so I tried sending a simple “ls” command.

$ xgrid -h localhost -p password -job submit ls
{
 jobIdentifier = 0;
}

In the Xgrid Admin tool I saw that the job had failed, so I figured it could be a path thing and gave it the full path of the job

$ xgrid -h localhost -p password -job submit /bin/ls
{
 jobIdentifier = 1;
}

and that seemed to do the trick:

[Figure: Jobs]

Asking for a job list with xgrid shows me two jobs:

$ xgrid -h localhost -p password -job list
{
 jobList =     (
 0,
 1
 );
}

I figured that -job results should give me the states of the jobs, like I could see them in the Admin tool, but I don’t get any output when I run that command, so I don’t know how that is supposed to work.

I can delete a job, though:

$ xgrid -h localhost -p password -job delete -id 0
{
}

[Figure: Jobs (one deleted)]

But I still haven’t figured out how to get the status or output of the job from xgrid.

I guess it is time to stop experimenting and read the manual…

Update: Ok, I did just one more experiment.  If I run a program that is guaranteed to give some output I do get that output when I ask for the result.  I tried just running xgrid and I got the help text.  I guess the ls command I tried before was run in an empty directory and that is why it didn’t produce any output.

I still haven’t found a nice way to get the status, but the -job attributes command at least gives me a lot of info about the job including the jobStatus.

I still have some experimenting and reading to do before I get the grid up and running on some of the computations I am actually interested in, but I am optimistic now at least.
