Who am I? Where am I? Who did I meet? Who have I arranged to meet? What did we say? Where did I go next? What do I like/not like? What have I failed/succeeded at? What are my political beliefs? What are my sexual proclivities? What are my congenital/current health conditions?
These are all bits of data that are now routinely recorded, stored and increasingly being mined by governmental organisations and private companies all wanting to learn about our past, present and “probable” future behaviours for security, actuarial, social, medical and commercial applications.
What is Big Data?
There is no concrete definition; rather, we can characterise it by its features, and the central one is this: we can do things with a huge quantity of data that we are unable to do with smaller amounts, to extract new insights and create new sources of value. This encompasses storing every aspect of a user’s behaviour and then using advanced computational techniques such as Machine Learning and Artificial Intelligence, which allow the prediction of future behaviours based on historical patterns. Because we have more data (and the tools to process it at a vast and affordable scale), these predictions can become scarily accurate.
In what ways is Big Data transforming the world?
It is on track to touch all aspects of society. We will go from a world that we understand by experiencing it on an individual level, to one where we comprehend it on a more universal level. Today, we rely on small amounts of data that are usually just a simulacrum of the complex reality we are trying to deal with, tailored to our cognitive limitations so that we can make sense of it. Tomorrow, we will use big data to surpass our faith in our individual powers and instead place trust in the data and the algorithms that govern it.
In medicine, doctors make episodic diagnoses based on their judgement. Sounds reasonable? In time, this will probably be considered barbaric. Why not use big data? We could enshrine the experience of all doctors and hundreds of millions of patients and case studies over decades, to identify the best treatments to achieve the best outcomes and even spot unknown adverse drug side effects. After all, the sum of all medical knowledge isn't in the possession of any single physician. If we choose to aggregate vast troves of healthcare information, we may learn what works best, just as Amazon recommends books not based on the inklings of a literary critic but from the correlating of sales data. This will mark a revolution in how society uses information and how information impacts society.
What are the privacy implications of big data?
Privacy is a big problem today and it will be a bigger problem tomorrow. We need to improve the legal framework that governs privacy to move beyond the system of notice-and-consent (that is, companies inform users what data they collect and how it's used, and people give their okay). In reality, it means that people tick a box agreeing to 60 pages of legal jargon with nary a glance. Instead, we need to consider the use and misuse of the data, not just the collection. We need to focus on the area of harm and not just on the inert, potential damage.
Are there any other challenges associated with Big Data?
While privacy is a problem, a newer issue is 'propensity'. This refers to the idea that algorithms may be making predictions about what we are likely to do, and we may find that we're penalised before we've actually committed the infraction - a real-life Minority Report. Big data may assign a 95% likelihood that a particular person will shoplift, or default on a loan, or fail to survive a surgical operation. We'll need to sanctify human agency and free will. At the same time, we'll need a new class of professional, “the algorithmist”, to review big data analyses and techniques and provide society with the same transparency and accountability that we have today - to ensure that big data is not a black box that overrides the public interest.
To improve the way we interact with the world around us, bring people together and use technology to benefit society, GeoSpock™ is building a massively scalable, real-time, geospatial indexing engine.
We’re experts in solving Big Data problems, but we’re also extremely concerned about the privacy of our users and about walking the fine line between providing an amazing, personally tailored user experience and being invasive and irresponsible with our users’ data. The trouble is that legislation and regulations on the use of this technology are outdated, governance of this data is largely at the discretion of each organisation, and not all of them will use this information in a socially responsible way. The question is how we can pressure governments to take this issue seriously and educate the average person about how to protect their data.
Big Data as a tool for solving Big Problems
Let’s start with the negative. Big Data can be used for nefarious purposes if it falls into the wrong hands. There’s no ignoring the issue in our global society; as we become more and more connected through modern technology, more and more data will be recorded about us. As things stand, the only reason the general public isn’t in a mass panic is the way most Big Data companies (and governments) obfuscate their terms and conditions to downplay how they use people’s data. However, with recent revelations and current events, people are becoming more aware of, and duly concerned about, their data.
Technology evolves at such an incredible pace that it can be hard for legislation and regulations to keep up. Normally this would be the point where some fool incorrectly quotes “Moore’s Law” to claim that available computational power doubles every 24 months. However, this is not the limiting factor for today's massively parallel and scalable million-processor systems, such as the custom supercomputer Steve (our founder and CEO) is building as part of his doctoral research, designed to carry out real-time simulations of the human brain.
With the speed of technological improvements in mind, the problem of personal data protection will only get worse, which is something that governments and companies who are serious about protecting their users’ privacy need to get on top of.
There is, however, a right way of doing things and a silver lining. People are generally happy when companies are upfront and honest about the way a user’s data is handled, and they are comfortable with personalised user experiences.
Just think of the times you have been recommended the perfect gift idea while browsing on Amazon, Google providing you with the right search results or Facebook suggesting a mutual friend you might know. Each of these positive user experiences is primarily derived from your previous interactions with the service, but what you might not know is that it is also based on the behaviour of every other person who has used the service. This is a very basic example of how the actions of the many can improve your personal experience. Now, it’s likely that you would agree to contribute this anonymized usage data if you knew how much it enhanced the system for everyone (think of it as the same as a contribution to Wikipedia or a charitable donation) – but the key question is: were you given a choice to disagree?
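To make the idea of "the actions of the many" concrete, here is a minimal sketch of a co-occurrence recommender. It is not how Amazon, Google or Facebook actually work; the user names, items and function names are all invented for illustration. It simply counts which items tend to appear together in other people's histories and suggests the ones you have not seen yet.

```python
from collections import Counter, defaultdict
from itertools import permutations

def build_cooccurrence(baskets):
    """Count how often each ordered pair of items shares a user's history."""
    co = defaultdict(Counter)
    for items in baskets.values():
        for a, b in permutations(set(items), 2):
            co[a][b] += 1
    return co

def recommend(user, baskets, co, k=3):
    """Suggest up to k unseen items that co-occur with the user's own items."""
    seen = set(baskets[user])
    scores = Counter()
    for item in seen:
        for other, count in co[item].items():
            if other not in seen:
                scores[other] += count
    return [item for item, _ in scores.most_common(k)]

# Hypothetical, made-up purchase histories.
baskets = {
    "alice": ["tent", "stove", "lantern"],
    "bob": ["tent", "stove"],
    "carol": ["tent", "lantern", "map"],
    "erin": ["tent", "stove"],
    "dave": ["tent"],
}

co = build_cooccurrence(baskets)
print(recommend("dave", baskets, co))
```

Dave has only ever bought a tent, yet the suggestions he receives come entirely from what Alice, Bob, Carol and Erin did. Nothing in the sketch asks any of them whether they agreed to that.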
When browsing the Internet from a desktop machine the amount of meta-data one can infer from your actions is minimal, which has a way of trivialising the privacy/Big Data problem. However, with the advent of cheap smartphones and mobile devices with GPS and location-based services, the meta-data that can be passively derived is incredible. For example, it would be possible to determine where you live, where you work and where your favourite place to hang out is. This would be dangerous if it fell into the wrong hands.
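How little it takes to work out where someone lives and works can be shown with a rough sketch. The approach below is an assumption chosen for illustration, not a description of any real product: it snaps timestamped location pings to a coarse grid, then treats the most common night-time cell as "home" and the most common weekday office-hours cell as "work". The coordinates, time windows and function names are all made up.

```python
from collections import Counter
from datetime import datetime

def grid_cell(lat, lon, precision=3):
    """Snap a coordinate to a coarse grid (3 decimals is roughly 100 m)."""
    return (round(lat, precision), round(lon, precision))

def infer_places(pings):
    """Guess (home, work) cells from (timestamp, lat, lon) pings."""
    night = Counter()
    office = Counter()
    for ts, lat, lon in pings:
        cell = grid_cell(lat, lon)
        if ts.hour >= 22 or ts.hour < 6:
            night[cell] += 1      # assumed sleeping hours
        elif ts.weekday() < 5 and 9 <= ts.hour < 17:
            office[cell] += 1     # assumed weekday working hours
    home = night.most_common(1)[0][0] if night else None
    work = office.most_common(1)[0][0] if office else None
    return home, work

# Hypothetical pings; the coordinates are invented for the example.
pings = [
    (datetime(2024, 1, 8, 23, 15), 52.2051, 0.1219),
    (datetime(2024, 1, 9, 2, 40), 52.2049, 0.1221),
    (datetime(2024, 1, 9, 10, 5), 52.2354, 0.1532),
    (datetime(2024, 1, 9, 14, 30), 52.2352, 0.1534),
    (datetime(2024, 1, 9, 19, 0), 52.2100, 0.1300),
]

home, work = infer_places(pings)
print("home:", home, "work:", work)
```

Five passively collected pings are enough here to separate a likely home from a likely workplace, which is why location meta-data deserves far more care than ordinary browsing history.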
At GeoSpock we take privacy very seriously and keep our user data under lock and key, and we make it possible for people to opt out of personalised experiences. Most won’t, as they see the value it gives them, but we think it’s better to have the choice and not take it than to have no choice at all…