Big Data has been a buzzword for several years, so much so that it is already somewhat in decline (one could almost say it sits in the trough of disillusionment of the famous hype-cycle graph). But the fact is that Big Data is increasingly present in our lives, whether we know it or not.
Big Data is basically the ability to analyze large amounts of data and obtain results that serve a goal. Its uses are very broad: it can be Facebook ordering your feed by what it thinks will interest you most, or a bank deciding how much someone can afford to pay for a new house.
The positive side of Big Data is clear: more effective companies and more satisfied users. But it also has a negative face: privacy, discrimination and marginalization. Not everything is perfect when analyzing large amounts of data.
One of the great challenges of Big Data is privacy. That users are comfortable with certain features offered to us through the analysis of the massive data we provide does not mean that we are willing to give up our privacy.
This debate has been open for more than ten years. When Google launched its e-mail service, Gmail, it was quite controversial that it showed advertising related to the content of the mails. In the end society has accepted that the mails are scanned, but always under Google's commitment that it is done only by an automatic system: nobody at Google can read them, nor can advertisers access them.
We share more and more data with certain companies, among which Google is, by the way, one of the most advanced. With an Android phone it is also easy to share location, photos and browsing history with the Mountain View giant. This allows Google to offer us higher-quality services (such as ads relevant to the area where we are, tips when we are sightseeing, or ads better tailored to what we usually need), but it also shows that we are sharing a great deal of sensitive information.
Recently a former CEO of American Express said in public that, through information on buying habits, the company could determine whether a customer is having an extramarital affair if it wanted to. Indeed, the data is there, ready for someone to analyze it. But the result could be that a finance company decides to blacklist these people and deny them credit. Where did privacy go?
Everyone has experienced how, after performing a few searches for a product on the Internet, related ads keep appearing for a long time, even though interest in the matter has passed or the purchase has already been made. In many cases this is just an anecdote, but in others it can be serious. For example, for someone whose open browser others can see, or whose unconventional interests can be discovered; or it can be traumatic to keep being shown baby-clothes advertising after just having suffered a miscarriage.
On the other hand, there is China, of course. Its government is beginning to create a system that, through Big Data, can determine whether its citizens are loyal to the regime or not. Those who are not will face sanctions that exclude them from many areas of public life. All this without them doing anything illegal, only through actions that certain algorithms determine to be anti-system.
And of course there is the issue of hacking, from which nobody seems to be safe. Companies collecting huge amounts of data about us in order to give us a better service might not be a problem, but if such data is stolen by third parties there is a big problem. Companies like Yahoo or Sony have seen the consequences.
The ethics of Big Data and security are increasingly important issues, all the more so now that ever more devices, both mobile phones and devices in the home, are listening to everything we say in order to offer us solutions.
Algorithmic Discrimination and Marginalization
The other big problem of Big Data, once we assume that the data we share remain private and that the unwanted effects are solved, is possible discrimination and marginalization by algorithms.
Data analysis can make companies focus on the groups that are most profitable. Perhaps a thorough analysis of the data tells a bank that it is better not to give credit to a black person. Or a single person. Or a woman. Or simply someone who does not live in a densely populated city. What if a university decides that the entrance cut-off marks should be lower for women because on average they do better (even with worse previous results)? What if a bar decides not to admit Asian people because they usually consume less? What if Amazon Prime charged more to certain ethnic groups because they usually return more products? What if Google decided to give less storage space to groups of users whose profile is less attractive to advertisers?
Today, if a bank denies credit for reasons of race or sex, the scandal is enormous. But if the decision is made not by a biased clerk but by an algorithm that simply analyzes the data, whom can we accuse of a malicious attitude?
Moreover, this algorithmic marginalization is even more dangerous than the one that exists today through individual prejudices, since we are used to questioning people's criteria but not algorithms. We have a tendency to question what people say; we are accustomed to seeing mistakes and lies on the part of individuals, but we have excessive faith in an algorithm that is supposed to be neutral and free of errors.
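The mechanism behind this kind of discrimination can be made concrete with a toy sketch. The data, groups, postcodes and all the numbers below are entirely made up for illustration; the assumptions are that residential segregation makes postcode correlate with group membership and that past human decisions were biased. A model that never sees the group attribute still reproduces, and even amplifies, the historical bias through the correlated proxy:

```python
import random

random.seed(42)

# Synthetic historical loan data (hypothetical numbers).
# Postcode correlates with group; past approvals were biased against group B.
applicants = []
for _ in range(10000):
    group = random.choice("AB")
    if group == "A":
        postcode = 1 if random.random() < 0.9 else 2
    else:
        postcode = 2 if random.random() < 0.9 else 1
    approved = random.random() < (0.8 if group == "A" else 0.4)
    applicants.append((group, postcode, approved))

# "Neutral" model: ignores group entirely, scores only by the
# historical approval rate of the applicant's postcode.
def postcode_rate(pc):
    rows = [a for a in applicants if a[1] == pc]
    return sum(a[2] for a in rows) / len(rows)

model = {pc: postcode_rate(pc) for pc in (1, 2)}

def predict(postcode):
    # Approve only if the postcode was historically approved often enough.
    return model[postcode] >= 0.5

def rate(group):
    rows = [a for a in applicants if a[0] == group]
    return sum(predict(a[1]) for a in rows) / len(rows)

print(f"group A approval under the model: {rate('A'):.0%}")
print(f"group B approval under the model: {rate('B'):.0%}")
```

Running this, group A is approved roughly nine times out of ten and group B roughly one time out of ten: the model is "colour-blind" on paper, yet ends up harsher than the biased humans whose decisions it learned from.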
And algorithms are not perfect; they fail. In 2008 Google launched an algorithm that predicted flu epidemics based on searches. But in 2013 the algorithm failed. It may be only an anecdote, but if investment decisions on prevention or vaccines had been made based on this algorithm, the consequences would have been disastrous, and those most affected would have been the most vulnerable in society.
We also have the case of Wisconsin (USA), where judges have a controversial "helper" that signals the risk that a defendant will reoffend. This "helper" is nothing more than software running a secret algorithm. And the judges trust it. They may be harshly sentencing certain people based on a secret algorithm that cannot be evaluated publicly and transparently.
In 2010 the city of Chicago created an algorithm to predict which people (with a criminal record) could commit a crime. It was used for a couple of years and then shut down. Officials called it a success, although there are critics, precisely on grounds of discrimination: it is not clear what effect it really had on crime (it is difficult to isolate effects), and in practice it singled out those who had a record, were black, and were in a bad economic situation.
We also have the case of Amazon, which discriminates against black neighborhoods in the US when it comes to same-day shipping. The discrimination is purely algorithmic, of course. These neighborhoods have fewer stores and fewer warehouses nearby, and this also affects Amazon's fast shipping service. Decades of racism mean that a technology that could be an equalizer (giving a neighborhood access to varied and cheap products, as if it were densely packed with stores) is not available in the poorest areas.
Big Data can serve, based on past experience, to predict the future. But algorithms that can be very precise about physical events, such as the weather, may not work for society. In the last 200 years there has been great human progress. Poverty has been reduced, there is less discrimination, women have come to occupy an increasingly important position in society... but if we look only at the past, we can block these developments. If Big Data algorithms had been used in 1800, perhaps no woman would have entered university, and there would be no social mobility, since it was normal for people to do the same as their parents.
This is Big Data's big problem. If too much data is accumulated and thoroughly processed, groups that are currently marginalized may become even more so, to the point of being completely excluded from what society considers normal. Someone who lives in a situation of marginality can manage, with great difficulty, to get out of it. But if companies and institutions completely exclude these people based on massive data analysis, marginality may spread even further. Big Data discrimination is detrimental to society.