Bio++ and namespaces
My RSI is acting up these days, so it will be a little while before I continue my tutorial on Bio++, but there are just a few more things I want to share with you from the Utils library. Well, the first is something that I think should be in the Utils library but really isn’t, so I am going to implement it here and lobby to get it in…
Parameter namespaces
In my last post I talked about parameter parsing, but one thing I didn’t cover was namespaces. I knew that there was some support for this, but just couldn’t find it in the library.
I had a talk with Julien today about it, and it turns out that it is hidden in the NumCalc library as part of their “function” concept and that is why I couldn’t find it in Utils.
Anyway, using the namespace implementation in NumCalc is a bit complicated and requires that you implement a Parametrizable object. That is not something you would do for general option parsing, so I have my own take on it here.
But first let me explain how namespaces work in Bio++.
It is just a convention, really, but the idea is that you have options separated by dots. It looks similar to attributes of objects in this way, but it is really just handled as long strings and the dots do not have any special meaning in how the options are parsed.
Anyway, if you have some sub-system of your application, say a substitution model with two parameters, you can call your model “foo” and give it the two parameters, bar and baz, as
foo.bar = barValue
foo.baz = bazValue
in your options file or on the command line.
The prefix is then used to avoid name collisions with other models or sub-components with parameters bar and baz.
To read the options you can use their full name in calls to the ApplicationTools:
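For example, the full program further down reads one such option with ApplicationTools::getStringParameter("foo.bar", parameters, "default"). The sketch below shows the same pattern without needing Bio++ installed. ParameterMap, getStringOr, readFooBar and readFooBaz are hypothetical stand-ins for illustration and are not part of the library.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the options map produced by option parsing.
typedef std::map<std::string, std::string> ParameterMap;

// Hypothetical helper: return the value stored under `name`,
// or `fallback` if the option was not given.
std::string getStringOr(const ParameterMap &params,
                        const std::string &name,
                        const std::string &fallback)
{
    ParameterMap::const_iterator itr = params.find(name);
    return itr == params.end() ? fallback : itr->second;
}

// Each accessor spells out the full, prefixed option name.
std::string readFooBar(const ParameterMap &params)
{
    return getStringOr(params, "foo.bar", "default");
}

std::string readFooBaz(const ParameterMap &params)
{
    return getStringOr(params, "foo.baz", "default");
}
```

Note how the literal "foo." shows up in every accessor; that repetition is what makes a later rename tedious.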
That is not really a problem, but if you need to change the name of your module later, say because it collides with another model using the same prefix, you would need to change the prefix in all the places you access the parameters.
A better solution is to define the namespace once and just refer to the parameters enclosed in it.
The code below does exactly that. It simply defines a function for extracting from the general parameters list all the parameters with a given prefix – “foo.” in this case – and puts them in a separate parameters list.
#include <map>
#include <string>
#include <Utils/ApplicationTools.h>
#include <Utils/AttributesTools.h>
#include <Utils/TextTools.h>

namespace bpp { namespace namespace_tools {

  // just to make it easier on the typing...
  typedef std::map< std::string, std::string > ParameterMap;

  // Copy every parameter named "<nameSpaceName>.<name>" into a new map under "<name>".
  ParameterMap extractNamespace(const std::string &nameSpaceName,
                                const ParameterMap &params)
  {
    ParameterMap newNameSpace;
    // match on the name plus the dot, so "foo" does not also capture "foobar"
    const std::string prefix = nameSpaceName + ".";
    ParameterMap::const_iterator itr;
    for (itr = params.begin(); itr != params.end(); ++itr) {
      if (bpp::TextTools::startsWith(itr->first, prefix)) {
        std::string newName = itr->first.substr(prefix.size());
        newNameSpace[newName] = itr->second;
      }
    }
    return newNameSpace;
  }

}}
int main(int argc, char * argv[])
{
  using namespace bpp;
  using namespace bpp::namespace_tools;
  using namespace std;

  map< string, string > parameters;
  parameters = AttributesTools::parseOptions(argc, argv);

  string bar = ApplicationTools::getStringParameter("bar", parameters, "default");
  ApplicationTools::displayResult("Parameter 'bar'", bar);

  string foobar = ApplicationTools::getStringParameter("foo.bar", parameters, "default");
  ApplicationTools::displayResult("Parameter 'foo.bar'", foobar);

  map< string, string > fooParameters = extractNamespace("foo", parameters);
  string nsBar = ApplicationTools::getStringParameter("bar", fooParameters, "default");

  ApplicationTools::displayMessage("Names in namespace 'foo':");
  map< string, string >::const_iterator itr;
  for (itr = fooParameters.begin(); itr != fooParameters.end(); ++itr) {
    ApplicationTools::displayResult(itr->first, itr->second);
  }

  return 0;
}
I’ve put the function in a (C++) namespace instead of a static class, but I guess that is just a matter of taste.
Running the program – using just command line arguments here instead of an options file – looks like this:
$ ./build/Debug/ParameterNamespaces bar=qux foo.bar=qax foo.qux=quux
Parameter 'bar'........................: qux
Parameter 'foo.bar'....................: qax
Names in namespace 'foo':
bar....................................: qax
qux....................................: quux
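The prefix matching can also be tried out without Bio++ installed. Here is a minimal sketch using only the standard library. The name extractPrefixed is hypothetical, and the sketch assumes the dot convention described above: the dot must follow the namespace name, so an option like foobar is not picked up by namespace foo.

```cpp
#include <map>
#include <string>

typedef std::map<std::string, std::string> ParameterMap;

// Hypothetical std-only variant: copy every "<ns>.<name>" entry to "<name>".
ParameterMap extractPrefixed(const std::string &ns, const ParameterMap &params)
{
    const std::string prefix = ns + ".";
    ParameterMap result;
    for (ParameterMap::const_iterator itr = params.begin();
         itr != params.end(); ++itr) {
        const std::string &key = itr->first;
        // The key must be longer than the prefix (so the stripped name is
        // non-empty) and must begin with the prefix.
        if (key.size() > prefix.size() &&
            key.compare(0, prefix.size(), prefix) == 0) {
            result[key.substr(prefix.size())] = itr->second;
        }
    }
    return result;
}
```

Called with "foo" on the options from the run above, this yields a map holding bar and qux; the top-level bar option is left out because it carries no "foo." prefix.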
“Procedure” parameters
Another nice approach to namespaces – that is implemented in the Utils library – is “procedure” arguments.
They give you a way of both choosing a model and at the same time providing arguments to it.
So say you have the choice between two models for a given sub-module, “foo” and “bar”, that take different parameters. You can choose the model and set its arguments like this:
model = foo(a=1,b=1)
or
model = bar(c=3.14)
and parse it in the code like this:
#include <iostream>
#include <map>
#include <string>
#include <Utils/ApplicationTools.h>
#include <Utils/AttributesTools.h>
#include <Utils/KeyvalTools.h>

int main(int argc, char * argv[])
{
  using namespace bpp;
  using namespace std;

  map< string, string > parameters;
  parameters = AttributesTools::parseOptions(argc, argv);

  // the whole "name(key=value,...)" string is the value of the "model" option
  string model = ApplicationTools::getStringParameter("model", parameters, "foo");

  // split it into the model name and a map with the model's arguments
  string modelName;
  map<string, string> modelArgs;
  KeyvalTools::parseProcedure(model, modelName, modelArgs);
  std::cout << "The model was " << modelName << '.' << std::endl;

  if (modelName == "foo") {
    int a = ApplicationTools::getIntParameter("a", modelArgs, 0);
    int b = ApplicationTools::getIntParameter("b", modelArgs, 0);
    std::cout << "You chose foo(" << a << ',' << b << ')' << std::endl;
  } else if (modelName == "bar") {
    double c = ApplicationTools::getDoubleParameter("c", modelArgs, 0.0);
    std::cout << "You chose bar(" << c << ')' << std::endl;
  } else {
    cerr << "Unknown model: " << model << '.' << std::endl;
  }

  return 0;
}
You then use the program like this:
$ ./build/Debug/ParameterNamespaces "model=foo(a=1,b=2)"
The model was foo.
You chose foo(1,2)
$ ./build/Debug/ParameterNamespaces "model=bar(c=3.14)"
The model was bar.
You chose bar(3.14)
$ ./build/Debug/ParameterNamespaces "model=qux(d=42)"
The model was qux.
Unknown model: qux(d=42).
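To get a feel for what splitting such a string involves, here is a rough standard-library sketch. It is not how KeyvalTools::parseProcedure is implemented. The function name parseCall is hypothetical, and the sketch assumes flat arguments only: no nesting, no quoting, and no whitespace around names or values.

```cpp
#include <map>
#include <string>

// Hypothetical std-only sketch, not Bio++'s KeyvalTools implementation.
// Splits "name(k1=v1,k2=v2)" into the name and a key/value map.
// Returns false if the string does not have that shape.
bool parseCall(const std::string &desc,
               std::string &name,
               std::map<std::string, std::string> &args)
{
    args.clear();
    const std::string::size_type open = desc.find('(');
    if (open == std::string::npos) {       // bare name, no argument list
        name = desc;
        return !name.empty();
    }
    if (open == 0 || desc[desc.size() - 1] != ')') return false;
    name = desc.substr(0, open);

    // Text between the opening parenthesis and the final closing one.
    const std::string body = desc.substr(open + 1, desc.size() - open - 2);
    std::string::size_type start = 0;
    while (start < body.size()) {
        std::string::size_type comma = body.find(',', start);
        if (comma == std::string::npos) comma = body.size();
        const std::string item = body.substr(start, comma - start);
        const std::string::size_type eq = item.find('=');
        if (eq == std::string::npos || eq == 0) return false;  // need key=value
        args[item.substr(0, eq)] = item.substr(eq + 1);
        start = comma + 1;
    }
    return true;
}
```

The values stay strings here; converting them to int or double is left to the caller, just as the program above does with getIntParameter and getDoubleParameter on the arguments map.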
Nice, eh?
–
September 15th, 2009 at 8:28 am
Hej Thomas,
So the thing is that there is a small confusion between ‘options’ and ‘parameters’ there… Parameters are a feature of numcalc, as they belong to the ‘function’ framework:
We have a Parameter object (a name, a value, an optional constraint), a ParameterList object (basically an enhanced vector of Parameter objects), and a general Parametrizable interface (an object that *has* parameters). Functions are Parametrizable objects that output a value for a certain combination of parameters, and can be used in optimization functions and so on. Recently we introduced namespaces for Parametrizable objects, in a simple way, without any convention of any kind: a namespace is a common prefix for all parameters. The default namespace/prefix is empty, but if you set the namespace to “coal.” or “myfunction_” or “__” for instance, then all parameter names are changed to start with this string, and that’s all there is to it! The rationale is that the parameters in a ParameterList object are required to have distinct names, so for complex functions it can become a problem (you can have a substitution model with a parameter named ‘alpha’, which is also the name of the shape parameter of the Gamma distribution used for rate across site models and so on).
BUT this has nothing to do with ApplicationTools and option files. It is true, however, that for obvious reasons we tend to name options in files after the parameter name in the function, so that alpha=0.5 is a natural choice to set the alpha value of the gamma distribution. However, there is no such requirement, and that’s why we did not introduce namespaces for option files. Recently we also came to a distinct convention to initialize parameters in functions, using the keyval syntax:
model=T92(kappa=3.5,theta=0.45)
will initialize a Tamura (1992) substitution model with parameter T92.kappa set to 3.5 and T92.theta set to 0.45 (so namespace “T92.”). The dot here is only our choice; it is not a requirement of the namespace system at all. So again, ApplicationTools is a very general set of functions to parse option files, which is not bound to the Parameter system in numcalc.
Hope this clarifies things a bit :p
Julien.