Data mining and identity

Google has figured out that you can identify people through their social networks: given two different social networks, they can tell when a node in one and a node in the other are the same person.

I had similar thoughts when I learned about the Icelandic Cancer Project. The data they collect is kept anonymous, as is the massive data set at deCODE. But the pedigrees of Icelanders are known, so it should be possible to correlate the anonymous data with the pedigrees and learn the identities of the individuals.

It is probably not something I should work on, since I collaborate with deCODE and want access to the data, but I feel that it should be quite possible.
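
To make the pedigree idea concrete, here is a toy sketch in Python. It is not how Google, deCODE, or anyone else does this, and every name and data value in it is made up for illustration. It assumes the anonymous data set carries parent links under opaque IDs, that the public pedigree has the same shape, and that both are complete and error-free, which real data would not be. Each person gets a structural "signature" built from the shape of their ancestor and descendant trees, and anyone whose signature is unique in both data sets can be matched by shape alone.

```python
from collections import Counter, defaultdict


def signatures(parents):
    """Map each person to a (ancestor shape, descendant shape) signature.

    `parents` maps child -> (parent, parent). Founders do not appear as keys.
    """
    children = defaultdict(list)
    people = set(parents)
    for child, pair in parents.items():
        for parent in pair:
            children[parent].append(child)
            people.add(parent)

    def down(person):
        # Canonical shape of the descendant tree: sorted child shapes.
        return tuple(sorted(down(c) for c in children[person]))

    def up(person):
        # Canonical shape of the ancestor tree: sorted parent shapes.
        return tuple(sorted(up(p) for p in parents.get(person, ())))

    return {person: (up(person), down(person)) for person in people}


def match(public, anonymous):
    """Return {anonymous id: public name} for structurally unique people."""
    pub = signatures(public)
    anon = signatures(anonymous)
    pub_counts = Counter(pub.values())
    anon_counts = Counter(anon.values())
    # Only signatures that occur exactly once in the public pedigree are usable.
    by_signature = {s: name for name, s in pub.items() if pub_counts[s] == 1}
    return {
        anon_id: by_signature[s]
        for anon_id, s in anon.items()
        if anon_counts[s] == 1 and s in by_signature
    }


# Hypothetical public pedigree: child -> (father, mother).
PUBLIC = {
    "Anna": ("Jon", "Gudrun"),
    "Bjorn": ("Jon", "Gudrun"),
    "Sigga": ("Bjorn", "Helga"),
}

# The same family as it might look in an "anonymous" data set.
ANONYMOUS = {
    "p3": ("p1", "p2"),
    "p4": ("p1", "p2"),
    "p6": ("p4", "p5"),
}

if __name__ == "__main__":
    for anon_id, name in sorted(match(PUBLIC, ANONYMOUS).items()):
        print(anon_id, "->", name)
```

In this toy family, four of the six people are pinned down by shape alone. The two founders stay ambiguous because they are structurally interchangeable, and extra attributes such as sex or birth year would presumably separate them. With noisy or incomplete data you would need approximate matching rather than exact signatures.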

