By Tom Blass and Duncan Jefferies
Big data could help solve many problems, but only if we trust organisations enough to share our data with them.
Big data is sometimes called the "nervous system" of the planet. Through analysis of huge datasets, governments and organisations can glean new insights into the causes of disease, poverty, energy usage and environmental damage, and spot ways to develop more effective solutions. “From a very basic perspective, smarter use of data simply reduces waste and increases efficiency,” says Francine Bennett, CEO of Mastodon C, which enables companies to minimize their carbon emissions while running big data operations.
These efficiency gains also stretch to highly targeted communications campaigns. By applying predictive analytics to customer data, businesses can send out their messages with unprecedented accuracy. Ambitious companies are using data not only to communicate more wisely, but also to rethink everything they do. Liz Coll, Senior Policy Advocate at the consumer rights group Consumer Focus, describes big data as the new oil, “a commodity which will drive innovation and transform business models.”
But to get the most out of big data, organizations need another valuable commodity: trust. “We are the data” – in the words of an Intel Labs-sponsored website, wethedata.org, which aims to raise people’s awareness of data ownership issues. If large numbers of concerned citizens refuse to share their information with organizations, then datasets will soon start to develop big holes, and the insights gained from them will lose their accuracy.
Until recently, the trust issue hasn’t been a major problem for most organizations. During the early years of Web 2.0, everyone seemed only too happy to share a huge amount of information about themselves through tweets, Facebook updates, geo-location data and other sources. But it seems people are beginning to realize that it might not always be a good idea to turn off their internal privacy settings.
In October 2010, the Wall Street Journal reported that popular Facebook apps like FarmVille had transmitted user identities to internet tracking companies. Six European data protection agencies are currently contemplating legal action against Google due to changes to its privacy policies. And WhatsApp, a popular messaging service, was recently found guilty of breaching international privacy laws: users are forced to grant the app access to their entire address book, including details of non-users of the service.
Perhaps partly as a result of these and other high-profile privacy cases, which highlight how little control internet users really have over their data, Ovum’s latest Consumer Insights Survey showed that only 14 percent of people now trust that personal data placed online will not be exploited. In addition, 68 percent of the internet population in 11 different countries would select a "do not track" (DNT) feature if easily available. The World Wide Web Consortium is working to establish a global DNT standard, but talks have proved inconclusive. As things stand, there is no requirement for sites to honor DNT requests.
Nevertheless, all the major web browsers have added a DNT feature, which signals the user’s preference to any sites they visit. But the option to turn it on is often buried deep in the browser settings, with few details about what it actually does. By March this year, only 11 percent of the 450 million users of Mozilla’s Firefox browser had turned the setting on, two years after it first became available (and it’s one of the more clearly worded examples). Internet Explorer 10, the latest version of Microsoft’s browser, recently took things a step further by turning the feature on by default. Brendon Lynch, Microsoft’s Chief Privacy Officer, wrote in a blog post that the decision represented an important step in the process of “establishing privacy by default, putting consumers in control and building trust online.”
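As a rough illustration of how that signal works: a browser with the setting enabled adds an HTTP request header named DNT with the value 1 to each request, and it is then up to the site to decide what to do with it. The sketch below shows one way a site could read the header. The function name `wants_no_tracking` is hypothetical, and nothing here reflects how any particular site actually behaves.

```python
from typing import Mapping


def wants_no_tracking(headers: Mapping[str, str]) -> bool:
    """Return True if the request headers carry a 'do not track' signal.

    Hypothetical helper for illustration. Header names are
    case-insensitive in HTTP, so they are lower-cased before lookup.
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    # "1" means the user prefers not to be tracked; "0" or a missing
    # header means no such preference was expressed.
    return normalized.get("dnt") == "1"


if __name__ == "__main__":
    request_headers = {"User-Agent": "ExampleBrowser/1.0", "DNT": "1"}
    if wants_no_tracking(request_headers):
        print("Skipping third-party tracking for this visitor")
    else:
        print("No do-not-track preference expressed")
```

Reading the header is the easy part. As noted above, there is no requirement for sites to honor it, so what happens after the check is a policy decision rather than a technical one.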
A new book called Big Data: A Revolution That Will Transform How We Live, Work and Think warns that the data being collected on people could even be used against them in future – perhaps to deny them a life-saving heart operation based on a prediction that they will take up smoking. Exploring the limits of data usage is part of the mandate of the Open Data Institute, a thinktank that seeks to catalyze an open data culture that has economic, environmental and social benefits. “For example, we believe, first and foremost, that personal data is not open data,” says Gavin Starks, CEO of the Open Data Institute.
He feels some of the most interesting projects in the data sector turn the traditional relationship between the data generator (the individual) and the collector (the provider of goods or services) on its head, feeding back to individuals information about their own consumption, preferences or buying habits. Starks was also a founding member, and remains a director, of the UK Department for Business, Innovation and Skills’ Midata project – an initiative which aims to encourage private sector companies to release personal data to consumers, help them access that data safely, and get businesses to develop innovative services and applications that will help consumers find better deals or tell them interesting things about their spending habits.
Parts of the Midata agenda have now become statutory under the UK’s Enterprise and Regulatory Reform Bill, which received Royal Assent in April. Starks is still involved in one aspect of it: in advance of any statutory obligation to do so, a number of energy companies plan to provide consumers with their energy consumption data in a machine-readable format (a similar initiative called Green Button was launched in the U.S. in 2011). This will allow people not only to shop around for better prices, but also to take steps to improve their energy efficiency.
“There is a huge amount of value in this information,” says Starks, who believes access to data of this kind could also spark more interest in micro generation. “The challenge lies in convincing customers that it is in their interest to understand it, and in building the tools to unlock it.”
Energy companies that participate now could reap long-term rewards from confidence-building dialogue with their customers. All organizations could also foster greater trust in the way they use people’s data by creating clear privacy policies. Currently, the dense, legalese-heavy documents people are asked to sign when joining a new site or social network tend, if anything, to obfuscate the company’s data-sharing policies rather than clarify them, building fear, not trust. A site called Terms of Service; Didn’t Read aims to tackle this issue, rating and explaining website terms and privacy policies. But even once you’ve signed up for a site, there’s often no easy way to adjust your privacy settings – a common complaint directed at Facebook.
If organizations don’t take steps to give people greater control over their data, then governments might do so instead. In fact, they already are. The draft European Data Protection Regulation aims to unify data protection within the EU with a single law, enhancing the consumer’s right to know about the data companies have about them, and also bestowing "the right to be forgotten" – that is, to ensure that their personal data is deleted when, for example, they transfer their business to another supplier. The law is expected to be adopted by 2014, though discussions about the content are still ongoing. The US Congress, Federal Trade Commission and Department of Commerce are also considering regulations of their own. The Australian Privacy Act has already been amended so that, from March 2014, organizations need to ensure they manage people’s data in an open and transparent manner.
Lobbyists for big technology companies claim these regulations could stifle innovation, robbing us of some of the valuable insights and services big data provides (several thousand amendments to the EU Data Protection Regulation have already been put forward, angering consumer rights groups). One answer could lie in a new mathematical technique called differential privacy, which allows researchers to draw information from data repositories, but "blurs" any details that could be used to identify an individual. A World Economic Forum report called Unlocking the Value of Personal Data: From Collection to Usage even envisions a future where all collected data is tagged with a software code that states an individual’s preference for how it is used.
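To give a flavor of how the "blurring" in differential privacy can work, the sketch below uses one common approach, the Laplace mechanism, on a simple counting question. It is a minimal illustration, not a production implementation: the function names, the sample meter readings and the choice of epsilon (the privacy parameter, where smaller values mean more noise and stronger privacy) are all assumptions made for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace distribution centred on zero.

    The difference of two independent exponential draws with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Answer 'how many records match?' with noise added to the result.

    Adding or removing one person changes a count by at most 1, so the
    noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical monthly electricity readings (kWh), one per household.
    readings = [120, 340, 90, 410, 275, 60, 500, 330]
    rng = random.Random()
    noisy = private_count(readings, lambda kwh: kwh > 300, epsilon=0.5, rng=rng)
    print(f"Households above 300 kWh (noisy answer): {noisy:.1f}")
```

Any single answer is deliberately a little off, so it reveals very little about whether one particular household is in the data. Averaged over a large population, though, the noise washes out and the overall pattern remains useful to researchers.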
By one estimate, there will be 44 times more data produced in 2020 than there was in 2009. And before long, data from biometric monitors, nanobots and other micro-sensors that document our health and activities in minute detail will be added to the mix – a revolution that will put an even deeper spin on the phrase "we are the data." Hopefully, by then, companies will have learned to be more open about how they use personal information, and individuals will be even more aware of its value.
Tom Blass is a freelance journalist specialising in law and security issues, and Editor of World Export Controls Review. Duncan Jefferies is a freelance journalist specialising in technology and digital innovation, and Assistant Editor at Green Futures.