Big Data Flood, Part 4: The Privacy Costs of Free Technology

Dominik Maj

This article is the fourth in a series on the everyday implications of the big data flood. Follow these links to read parts one, two, and three.

Whoever said it originally, you’ve probably seen this catchy statement at one time or another with regard to consumer technology you can use for free, like search engines:

“If you’re not paying for the product, you are the product.”

There’s an element of truth in it, but it’s still too simplistic and pessimistic.

Simplistic because modern technology has given us an interesting new spin on two-sided markets where it isn’t always obvious who is paying for what.

And pessimistic because you (as in your actual, physical person) aren’t being bought and sold. What is being bought and sold is what you are doing, i.e. using the product. And what people do is bought and sold all the time on a little old thing called the labor market. When it comes to supposedly free technology products, the main difference is that you might not be paid for what you’re doing!

That’s why I think a more accurate – if less catchy – statement would be:

“Regardless of whether you’re paying money to use a product, know that your usage data is a product.”

When your data is a product, there are some things you should know. Then you can make informed decisions, which, by the way, are relatively uncommon among internet users!

Less privacy could mean a restricted worldview.

It is hard to judge anyone here. We are surrounded by technology that can make our life easier at no financial cost. It just needs some information about us to understand the context of our situation.

For example, our phones’ navigation programs optimize our routes based on data from thousands of other travelers going the same way. Similarly, search engines give us results engineered to match our priorities and show us advertisements tailored to our interests.

Handy, right? All you need to do is log in with your email address and let it know your location. Innocent enough?

Maybe. We don’t always understand which personal details can be pieced together from various seemingly meaningless details.

For instance, take the 2015 research conducted by Piotr Sapieżyński, which showed mobile phone users’ precise locations during an entire day can be inferred from the history of their routine Wi-Fi scans. Even if Wi-Fi is turned off.
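
To see why a list of Wi-Fi scans is less innocent than it sounds, here is a minimal Python sketch of the general idea. It is not the method used in the study: the lookup table `KNOWN_ACCESS_POINTS`, the router IDs, the coordinates, and the functions `estimate_location` and `reconstruct_day` are all made up for illustration, and averaging router positions is just one simple way to guess where a phone was.

```python
from statistics import mean

# Hypothetical lookup table: router ID (BSSID) -> known (latitude, longitude).
# All entries are invented for this example.
KNOWN_ACCESS_POINTS = {
    "aa:aa:aa:aa:aa:01": (52.2297, 21.0122),  # "home" router
    "aa:aa:aa:aa:aa:02": (52.2299, 21.0120),  # neighbour's router
    "bb:bb:bb:bb:bb:01": (52.2400, 21.0300),  # "office" router
}


def estimate_location(visible_bssids):
    """Return the average position of the recognised routers, or None."""
    positions = [
        KNOWN_ACCESS_POINTS[bssid]
        for bssid in visible_bssids
        if bssid in KNOWN_ACCESS_POINTS
    ]
    if not positions:
        return None
    return (
        mean(lat for lat, _ in positions),
        mean(lon for _, lon in positions),
    )


def reconstruct_day(scan_history):
    """Turn a list of (time, visible_bssids) scans into (time, location) pairs."""
    return [(time, estimate_location(bssids)) for time, bssids in scan_history]


scans = [
    ("08:00", ["aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02"]),
    ("09:30", ["bb:bb:bb:bb:bb:01", "cc:cc:cc:cc:cc:99"]),  # one unknown router
    ("13:00", ["cc:cc:cc:cc:cc:99"]),                       # nothing recognised
]

for time, location in reconstruct_day(scans):
    print(time, location)
```

The phone never reports a GPS position here. Knowing which routers were in range, and where a handful of those routers are, is enough to place the user for much of the day.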

Another example: Michał Kosiński, a psychologist of Polish descent at Cambridge University, designed a highly accurate psychological profiling model based solely on the pages Facebook users like.

According to Kosiński:
“70 likes are enough to predict someone’s character better than their friend, 150 likes make it better than their parents. With 300 likes the algorithm knows the person better than their partner.”

A similar model was reportedly commercialized by Cambridge Analytica, which used it to tailor marketing campaigns to American and British voters in the emotionally charged presidential election and Brexit referendum, respectively. Campaigners were able not only to identify undecided voters but also to reach them with personalized messages corresponding to their fears and general life situation.

When we give up our privacy in exchange for the convenience of financially free services, we agree to receive information designed just for us. The ultimate cost could be a narrower worldview that closes us off to information that does not fit our personal narrative. We are then vulnerable to manipulation by those who can talk to us in “our” language to achieve their own ends.

And what is probably more disconcerting for many is that it is now possible to talk to others in “our” language as well.

Your real data can be used to create false information.

Once your data is out there, it’s practically impossible for you to control what can or will be done with it.

Take, for example, the results of neural-network-based research published in July 2017 that presented a very interesting – and very frightening – example of a video containing a speech by Barack Obama.

The neural network took an arbitrary text to be put into someone’s mouth. In this case, that someone was Barack Obama giving a pretend speech that looked stunningly real.

At roughly the same time, the combined efforts of scientists from the University of Erlangen-Nuremberg, the Max Planck Institute for Informatics, and Stanford brought the world a system that allows one person’s facial expressions to be visually dubbed onto the video recording of someone else.

Thanks to Canadian startup Lyrebird, there is also a system available to learn anyone’s voice characteristics and patterns with just a one-minute sample of his or her speech.

Combine these audio and video applications and it is conceivable that we will wake up one day in a world where a video recording of you saying something is no longer valid proof that you actually said it.

Even the most comprehensive data protection solutions have weaknesses.

You can follow a number of best practices to help protect your online privacy. For example, you can ensure that your network is encrypted before using passwords, avoid revealing your email on a publicly available webpage, and send emails only to recipients you know. In modern payment systems you can also use a unique, one-time credit card number generated specifically for the transaction to minimize and insure against data loss.

Unfortunately, all the above examples are particular solutions to particular problems and tend to rely on your knowing what you are doing. Automated programs can help you but often involve extra costs to do so!

A fake profile is a simple do-it-yourself alternative to using your real data and knowing how to safeguard it. In fact, social media platforms and the internet in general are full of such fake names and identities. The problem is that such profiles are frowned upon if not banned altogether. With good reason, as they tend to be used for unethical purposes ranging from plain rudeness, ratings manipulation, and like farms to hate speech and even stalking. Thanks to anonymity and a lack of geographical barriers, social media has become a way for people to unite in pursuits both noble and nefarious, including terrorism.

The ideal data protection solution for the global flow of information we know today would be a single system allowing every person to check what is known about him or her, who knows it, and how this data is being processed.

For example, the system might keep a record of who has permission to send you a newsletter via email and who is allowed to call you with a telemarketing offer, as well as who knows your national identification number and who knows your physical address. It would be the equivalent of a central bank, except governing the usage of data as opposed to money.
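
As a rough illustration of the kind of record-keeping described above, here is a minimal Python sketch of such a registry for a single person. Everything in it is hypothetical: the `Grant` and `ConsentRegistry` classes, the organisation names, and the purposes are invented for the example, and a real system would also need authentication, audit trails, and legal backing that are far beyond a toy like this.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Grant:
    """One record: which organisation may use which piece of data, and why."""
    holder: str      # who knows or uses the data (hypothetical organisation)
    data_item: str   # e.g. "email", "phone", "national_id", "address"
    purpose: str     # e.g. "newsletter", "telemarketing"
    granted_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    revoked: bool = False


class ConsentRegistry:
    """Toy in-memory registry of grants for one person."""

    def __init__(self):
        self._grants = []

    def grant(self, holder, data_item, purpose):
        self._grants.append(Grant(holder, data_item, purpose))

    def revoke(self, holder, data_item, purpose):
        # Mark matching grants as revoked instead of deleting them,
        # so the history of who once had access is preserved.
        for g in self._grants:
            if (g.holder, g.data_item, g.purpose) == (holder, data_item, purpose):
                g.revoked = True

    def is_allowed(self, holder, data_item, purpose):
        return any(
            not g.revoked
            and (g.holder, g.data_item, g.purpose) == (holder, data_item, purpose)
            for g in self._grants
        )

    def who_knows(self, data_item):
        """List every holder with an active grant for the given data item."""
        return sorted(
            {g.holder for g in self._grants
             if g.data_item == data_item and not g.revoked}
        )


registry = ConsentRegistry()
registry.grant("BookShop", "email", "newsletter")
registry.grant("TelecomCo", "phone", "telemarketing")
registry.grant("TaxOffice", "national_id", "tax_filing")
registry.revoke("TelecomCo", "phone", "telemarketing")

print(registry.is_allowed("BookShop", "email", "newsletter"))      # True
print(registry.is_allowed("TelecomCo", "phone", "telemarketing"))  # False
print(registry.who_knows("national_id"))                           # ['TaxOffice']
```

The data structure is the easy part. The hard part is getting every organisation that holds your data to report to one registry and honour what it says.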

Designing and implementing such a system is of course not only technically difficult but also hard to understand from the legal and organizational points of view.

Take the Indian Aadhaar system, for example, which is designed to administer citizens’ identification numbers along with biometric data such as iris and fingerprint scans, and bank account numbers, so as to facilitate welfare payments to the poor. And secret surveillance, according to some observers.

Regardless of the system’s true purpose, its implementation resulted in a massive data leak. In May 2017 the most sensitive data of 130 million Indian citizens was exposed on government websites. It’s hard to gauge the real scale of such a leak or to predict its actual consequences.

In Poland, the EU’s General Data Protection Regulation (GDPR, or RODO in Polish) is to come into force in May 2018. The regulation introduces specific conditions for administering sensitive personal data, including an obligation to report any data leaks and a strict system of penalties for insufficient data protection. Currently, Polish organizations that manage data are busy interpreting the new rules as they prepare for May, raising multiple questions as they do so. Only time will tell if this solution proves useful, but it surely constitutes a step in the right direction for citizens.

Bottom line: understand your privacy costs and the benefits of paying them.

The takeaway here is that you should think about financially free technology in terms of benefits – like everyone does – as well as privacy costs.

As an example, let’s go back to your phone’s navigation system. Maybe the benefit of using it outweighs any reasonable downside of the system’s provider knowing your driving patterns and feeding you targeted advertising as a result.

Or maybe the risk of being constantly fed biased news stories isn’t worth having a technology service know all your personal preferences and proclivities. On the other hand, maybe you can fight a shrinking worldview by seeking out alternative viewpoints via other means, online or otherwise.

The point is to have a reasonable understanding of the costs and benefits of financially free technology and to make an informed decision about using it.
