Mass Data Collection

Should the government be allowed to collect data on UK citizens to prevent terrorism and criminal activity?

A short article on mass surveillance, written by the President of the Open Source Initiative, Simon Phipps.

Surveillance is not always wrong. Far from it: British democracy has long allowed the police and security services to intrude temporarily on individuals they believe to be a threat to UK citizens. We allow them to do so when they have a warrant, which is only granted if they can prove they have real evidence.

But in terms of automated digital surveillance and blanket collection of evidence? I don't think so.

What is “blanket collection”?

Edward Snowden and other sources make it clear that we are all as much under surveillance in the digital age as leading public figures like Martin Luther King were in a previous age. The surveillance seems to be justified as “just in case”, and includes journalists and the legally-privileged correspondence of lawyers. Government officials frequently deny this despite the evidence. The reason may be that the intelligence services treat the *collection* of data and the *analysis* of that data to find out what it shows as separate activities. They then classify only the analysis as “surveillance”.

They accumulate data from any source that's public, or for which they believe some legal principle makes the data fair game, and store it for long periods in huge “data lakes”. They then use various justifications such as warrants and notification to secret courts to “go fishing” in the data lake. Agencies like GCHQ claim they are scrupulously following the law, although the government declines requests to explain precisely how.

Is it OK for everyone's data to be stored this way?

We can be sure that no computer algorithm yet devised gets things right every time. It's likely that there are many false positives, with data from parties unrelated to the reason for a search being drawn in accidentally. It would be far better to not have one's data mixed up in the lake to start with; the next best is to have that data thoroughly protected through encryption before we send it. But there's more to our internet communications than the message itself; there's also *metadata*. Metadata means information about the message (like the address on an envelope or the postmark on a stamp) but excludes the message itself (the letter in the envelope).

That's where you need to start being most concerned. The metadata remains readable no matter what we do, making it possible to triangulate even on encrypted messages. Triangulation means combining apparently innocent data from other places with the metadata of your message to reveal hidden information that they all share as context. For example, if I know someone's location is at a clinic, that they have recently purchased goods from a high-street chemist and that the web sites they have recently visited are about pregnancy, I don't actually need to *read* the text of the e-mail to their boyfriend to guess what it's probably about.
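To make the idea concrete, here is a minimal sketch of triangulation in Python. Everything in it is invented for illustration: the names, the three record lists, the `build_profiles` and `guess_topic` helpers and the matching pattern are all assumptions, not a description of how any agency actually works. The point is only that the message body is never read.

```python
from collections import defaultdict

# All records below are invented for illustration; no real data source is implied.
location_pings = [("alice", "clinic"), ("bob", "office")]
purchases = [("alice", "chemist"), ("bob", "bookshop")]
browsing = [("alice", "pregnancy"), ("bob", "football")]

# The encrypted e-mail: the metadata is visible, the body is opaque to the observer.
message_metadata = {"sender": "alice", "recipient": "boyfriend", "body": None}


def build_profiles(*sources):
    """Merge records from separate sources into one set of signals per person."""
    profiles = defaultdict(set)
    for source in sources:
        for person, signal in source:
            profiles[person].add(signal)
    return profiles


def guess_topic(metadata, profiles, pattern, topic):
    """Return the topic if the sender's profile contains every signal in the pattern."""
    signals = profiles.get(metadata["sender"], set())
    return topic if pattern <= signals else None


profiles = build_profiles(location_pings, purchases, browsing)
pattern = {"clinic", "chemist", "pregnancy"}  # hypothetical pattern of interest
guess = guess_topic(message_metadata, profiles, pattern, "possible pregnancy")
print(guess)
```

None of the three sources is sensitive on its own. It is the act of joining them on the sender named in the metadata that produces the guess, which is why encrypting the message alone does not help.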

Blanket surveillance

With “blanket surveillance”, people can go fishing in those vast data lakes, using information from all sorts of places – mobile phone locations, shopping records, surveillance cameras and more – to triangulate and determine what should be decrypted and analysed with an appropriate legal justification. By then it's too late to stop any of those sources of data being used, even if it might seem wrong to be doing so.

The powers requested in recent attempts to change the law around this are open-ended and ill-defined. They lack meaningful oversight, transparency or accountability. They appear to be designed to permit the security services free rein in making their own rules and only having to justify their actions when the “fishing trip” is over.

Future risks

The breadth of data gathered – far beyond the pursuit of specific individuals – creates a risk of future abuse, not just by the security services pushing their limits but also by criminals and terrorists when they hack in. What's especially worrying is that today’s legal justifications – where offered – make no accommodation for these risks. It's as if the security services believe they are perfect and impregnable.

So should the government be allowed to collect data on UK citizens to prevent terrorism and criminal activity? Yes, sometimes. But our representatives must ensure that each law which reduces our liberties is ring-fenced. Each needs to be justified objectively, governed with impartial oversight, and of limited duration.

Proposals for open-ended laws which permit blanket data accumulation do not meet these standards and we must challenge them if we are to protect democracy in Britain.

  • Simon Phipps
  • Simon Phipps is an independent consultant providing insight and knowledge on open source and digital rights to businesses and governments worldwide through his company Meshed Insights Ltd. He is also pro bono President of the Open Source Initiative, the non-profit organisation that advocates for open source software, builds bridges between open source communities, and maintains the canonical list of open source licenses. His writing is regularly featured in InfoWorld, ComputerWorld and other publications. He is a pro bono director of the UK’s Open Rights Group and sits on the advisory board of Open Source for America.

The text in this article is available under the Creative Commons License.