This just isn't right

So, I'm working on some genome scale analysis and am a bit worried about numerical precision issues.  After all, even relatively stable operations might run into problems after milions of additions and multiplications.

So I figured I could try using interval arithmetic to get a feeling for the uncertainty in the numerical values.

Boost, as so often, is helpful here and has a library for interval arithmetic.

Great, I thought, I'll just chuck in the interval type in my algorithm (that is parameterised with the numerical type anyway so I can use either float, double or long double) and see how it goes.

Well, no luck.  All my intervals ends up being a single value, and that just isn't right.

A simple example like this, that that adds 1000 random floats, results in an empty interval:

#include <iostream>

#include <boost/numeric/interval.hpp>
using namespace boost::numeric;
using namespace interval_lib;

#include <ctime>
#include <boost/random.hpp>

int main (int argc, char * const argv[])
{
	typedef interval<double> I;

	static boost::mt19937 rng(std::time(0));
	static boost::uniform_real<> uni_dist(0,1);
	static boost::variate_generator<boost::mt19937&, boost::uniform_real<> > random_01(rng, uni_dist);

	I x = 0.0;
	for (int i = 0; i < 1000; ++i) {
		x += random_01();
	}
	std::cout << '[' << x.lower() << ',' << x.upper()
                  << "](" << (x.upper() - x.lower()) << ')'
                  << std::endl;

    return 0;
}

There is no way there is no precision lost in adding 1000 values!

Clearly I am doing something wrong, but what?

--

100-122=-22

Tags: , ,

12 Responses to “This just isn't right”

  1. Mikael Says:

    That could be a useful library - especially if it worked :-)

    Have you disabled SSE2 instructions ("-mfpmath=387" or something)?

  2. Thomas Mailund Says:

    I haven't explicitly enabled or disabled SSE2, but yes I noticed in the documentation that the library doesn't work with SSE2. I will try to see if I can explicitly disable it and see how it works...

    Not today or tomorrow, though, I left the code on my iMac and I'm out of town until Sunday :-S

  3. Mikael Says:

    I tried your program on VC C++, and found that that I got single value intervals just like you.

    However changing the loop to:
    x += random_01()/3.0;

    or just:
    x += 1/3.0;

    leads to the expected result: the interval size increases nicely as i increases.

    I think the lowest bits in your random numbers are zeroes. (Actually letting i<10000000 does produce error intervals).

    Btw, I got the same results both with and without SSE, so I do not think it matters (even though their docs state it does).

  4. Thomas Mailund Says:

    Hmm, why should it matter if the lowest bits are zeros?

    They are supposed to be random numbers between 0 and 1, but at this exact moment it strikes me that I never actually tested if they were :-S

    Anyway, I don't think that should matter for the numerical accuracy of the final number. The possible values should still get wider and wider...

  5. Thomas Mailund Says:

    Hmm, just tried with x += 1.0/3.0; and I still get single point intervals :-(

  6. Mikael Says:

    The interval addition as I understand it, works something like this:
    [a,b]+c = [round_down(a+c), round_up(b+c)]

    If the lowest bits in the mantissa are zeroes (and the exponent the same), there is no loss (and no rounding) in the addition, and the interval will not be expanded, right?

    This is the reason, I choose x+=1/3.0 - which adds rounding errors. If I run the program with x+=1/2.0 there is no accumulated error (because 1/2 can be represented exact as a binary number).

  7. Thomas Mailund Says:

    Hmm, I see your point.

    Using 1/3 still doesn't have any effect for me, though, but I'll take a look at that tomorrow.

  8. Thomas Mailund Says:

    On the other hand, even if both numbers have a fine representation, and within the accuracy of the floating point number used, I would still expect the interval to increase when adding a small number to a large number, since this addition is throwing away most of the digits of the small number.

    Anyway, since even

    	I y = 1.0;
    	I z = y / 10.0;
    	std::cout < < width(z) << std::endl;
    

    gives me a single point interval, it is clear that it isn't working for me. One tenth is not representable in binary for any finite size float.

  9. Mikael Says:

    Yes, exactly - if the exponent of the floats differs, you will loose precision when shifting the mantissa. And you do get rounding errors in your original example after running 10 million iterations (when x has grown large relative to the random number).

    Btw, your example (1/10.0) gives an interval width of 1.38778e-17 on my PC. (Those Macs...)

  10. Thomas Mailund Says:

    Hey, setting the flag -mfpmath=387 does work. Now I am getting intervals as well on my Mac.

    (I know I said I would try it later and all, but I had some time right now before heading off for the big family Easter lunch... The actual project I am working on will have to wait, though).

    Thanks for the hint about the flag, Mikael!

  11. Mikael Says:

    Oh - guess their docs were right after all :-)

    Btw, I think you RNG has exactly 32 bit resolution: a double has a mantissa of 53 bits, thus the 21 lowest bits would then be blank. After 10 millions iterations x becomes ~ 2^21, explaining why numerical errors are first introduced at this point.

  12. Thomas Mailund Says:

    You are probably right.

    Anyway, for the application I actually have in mind, the random numbers are not so important; those were just for testing.

    I'll keep you posted on the real problem :)

Leave a Reply