In light of the revelation that the NSA has been collecting metadata on Americans, this post by Kieran Healy has been making the rounds. It demonstrates how, with “only” metadata, the British (if only they’d known then what we know now) could have squelched the American Revolution by identifying the troublemakers, for example the guy through whom messages were passing:
People are linked through the groups they belong to. Groups are linked through the people they share. This is the “duality of persons and groups” in the title of Mr Breiger’s article.
Rather than relying on tables, we can make a picture of the relationship between the groups, using the number of shared members as an index of the strength of the link between the seditious groups. Here’s what that looks like.
The analytical engine has arranged everyone neatly, picking out clusters of individuals and also showing both peripheral individuals and—more intriguingly—people who seem to bridge various groups in ways that might perhaps be relevant to national security. Look at that person right in the middle there….He seems to bridge several groups in an unusual (though perhaps not unique) way. His name is Paul Revere….
…I know nothing of Mr Revere, or his conversations, or his habits or beliefs, his writings (if he has any) or his personal life. All I know is this bit of metadata, based on membership in some organizations. And yet my analytical engine, on the basis of absolutely the most elementary of operations in Social Networke Analysis, seems to have picked him out of our 254 names as being of unusual interest.
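To make the “duality of persons and groups” concrete, here is a minimal sketch in Python. It is not Healy's code or data: the people (`ann`, `bob`, `cat`, `dan`) and clubs are hypothetical placeholders, and the function names are my own. The point is only that one membership table yields both a person-to-person network (weighted by shared groups) and a group-to-group network (weighted by shared members).

```python
from itertools import combinations

# Hypothetical membership table (illustrative only, not historical data):
# each person maps to the set of groups they belong to.
membership = {
    "ann": {"club_1", "club_2"},
    "bob": {"club_2", "club_3"},
    "cat": {"club_1", "club_2", "club_3"},
    "dan": {"club_3"},
}


def shared_groups(table):
    """Link every pair of keys by the number of values they have in common."""
    links = {}
    for p, q in combinations(sorted(table), 2):
        n = len(table[p] & table[q])
        if n:  # keep only pairs that share at least one item
            links[(p, q)] = n
    return links


def shared_members(table):
    """Flip the table to group -> members, then reuse the same pair count."""
    groups = {}
    for person, gs in table.items():
        for g in gs:
            groups.setdefault(g, set()).add(person)
    return shared_groups(groups)


person_links = shared_groups(membership)   # people linked through groups
group_links = shared_members(membership)   # groups linked through people
```

The second function is just the first one applied to the flipped table, which is the duality in a nutshell: the same data, read in the other direction.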
This is, of course, why the government wants to collect the metadata – it’s useful to have this sort of power if you’re tracking enemies of the state.
But this is also, of course, why the public feels that it is a violation of their privacy – and why people are scrambling to insist that it’s not a big deal, because it’s “only” metadata:
One of the common defenses that has been trotted out over the past week to push back against criticism of the National Security Agency’s practice of collecting phone records for later review has been that all the NSA is collecting is so-called “metadata.” Essentially, this argument tells us not to worry because the NSA isn’t actually listening in to our phone calls, it’s just collecting records of who talked to who, when and for how long.
But such powers can also be used to control lawful citizens, especially if they’re combined with the power of the IRS. Consider:
Unlike e-mails written in different languages or with personal touches, metadata about who sent and received a message, when it was sent and from where, always looks the same. Besides cutting down on the absolute amount of traffic to examine, metadata makes it easy to organize information and search for patterns, establishing social networks from individuals.
For some communications, metadata matters more than content. “A call to a suicide hot line, Alcoholics Anonymous, or a gay sex chat room at 2 a.m. are all more sensitive” than the actual message, said Christopher Soghoian, principal technologist at the American Civil Liberties Union. “You can text political donations. The metadata shows your political leanings, the content just shows the amount you gave. Calling a cell tower away from my house in the middle of the night indicates I’m not sleeping at home.”
“The public doesn’t understand,” she told me, speaking about so-called metadata. “It’s much more intrusive than content.” She explained that the government can learn immense amounts of proprietary information by studying “who you call, and who they call. If you can track that, you know exactly what is happening—you don’t need the content.”
As Healy explains, you can do more than just “find Paul Revere”:
…we can do things like calculate centrality scores, or figure out whether there are cliques, or investigate other patterns. For example, we could calculate a betweenness centrality measure for everyone in our matrix, which is roughly the number of “shortest paths” between any two people in our network that pass through the person of interest. It is a way of asking “If I have to get from person a to person z, how likely is it that the quickest way is through person x?”
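For the curious, betweenness centrality of the kind described above can be sketched in plain Python. This is an illustration under stated assumptions, not Healy's analysis: it uses Brandes' algorithm on an unweighted, undirected network, and the toy graph (nodes `a`, `b`, `d`, `e` and a bridging node `x`) is hypothetical.

```python
from collections import deque


def betweenness(graph):
    """Brandes' algorithm; graph maps each node to a set of neighbours."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:              # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, passing credit to predecessors.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical network: two tight pairs (a-b and d-e) joined only through x.
graph = {
    "a": {"b", "x"},
    "b": {"a", "x"},
    "d": {"e", "x"},
    "e": {"d", "x"},
    "x": {"a", "b", "d", "e"},
}
scores = betweenness(graph)
most_central = max(scores, key=scores.get)  # "x", with a score of 4.0
```

Here `x` scores 4.0 because all four cross-cluster pairs (a-d, a-e, b-d, b-e) can only reach each other through it, while everyone else scores 0. No message content is needed to spot the go-between.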
But either way, the argument that it’s “only” metadata should be recognized as a less-than-honest form of spin: there’s no “only” about it.
…From a table of membership in different groups we have gotten a picture of a kind of social network between individuals, a sense of the degree of connection between organizations, and some strong hints of who the key players are in this world. And all this—all of it!—from the merest sliver of metadata about a single modality of relationship between people