Jan 11 2019

Predicting Brexit

A new analysis has a fairly narrow focus, but it relates to an interesting topic: it finds that the betting market predicted the outcome of the Brexit vote an hour before the financial markets did.

This says something about the efficiency of these respective markets in processing and reacting to information. The authors also conclude that if the financial markets were optimally efficient they should have predicted the result of the Brexit vote two hours before they did.

OK, this is more than a bit wonky, but what I really want to discuss is the more basic concept of prediction markets as it relates to crowdsourcing and big data. The idea is that a large number of people, taken in the aggregate, may be better at making decisions or reflecting emerging trends than any individual or small group. This gets interesting when you compare such crowdsourcing to individual experts.
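To make the aggregation idea concrete, here is a minimal sketch, not anything taken from the study. It assumes each person's guess is the true value plus independent random noise; the function name, the noise model, and the numbers are all hypothetical. Under that assumption the average of the crowd's guesses lands much closer to the truth than the typical individual does.

```python
import random
import statistics


def simulate_crowd(true_value, n_people, noise_sd, seed=0):
    """Compare the crowd's average guess with the typical individual guess.

    Assumption (hypothetical): each guess is the true value plus independent
    Gaussian noise. Real crowds can share biases, which averaging cannot remove.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_people)]

    # Error of the aggregated (mean) estimate.
    crowd_error = abs(statistics.fmean(guesses) - true_value)

    # Average error of a single person guessing alone.
    individual_errors = [abs(g - true_value) for g in guesses]
    typical_individual_error = statistics.fmean(individual_errors)

    return crowd_error, typical_individual_error


crowd_error, individual_error = simulate_crowd(
    true_value=100.0, n_people=10_000, noise_sd=20.0
)
print(f"crowd error: {crowd_error:.2f}")
print(f"typical individual error: {individual_error:.2f}")
```

The independence assumption is doing most of the work here. If everyone's guesses are pushed in the same direction by a common bias, averaging more people does not help.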

This is one of the basic concepts behind the free market. No committee could reasonably determine what the value of a gallon of milk should be from first principles and available data. However, the price can emerge from millions of individual buying decisions that reveal what people are willing to pay.

Obviously this is a gross oversimplification, and there are many factors at work influencing markets. Individual buying and selling decisions are not free from bias, influence, and distortion. That’s one of the proper roles of regulation, to minimize distortions in the market. Regardless, the basic concept is sound – marketplaces generate information that can be used in the aggregate.

Interestingly, markets also destroy information, meaning that they render it useless for advantage. In this way market information is like currency that gets spent. In the above example, the betting market generated information that could have been used to make millions in that golden hour before the financial markets essentially caught up and destroyed the value of the information.

The basic concepts here apply broadly, well beyond financial markets. If we could somehow capture millions of individual decisions by people, that could generate powerful information. A marketplace with either betting or purchasing is one way to capture such information. But there are other ways.

Using Google to search for information is another way in which individuals make decisions that can be captured in the millions. This was the idea behind Google Flu Trends (GFT), which sought to “nowcast” flu outbreaks by analyzing where and how often people were searching on flu-related terms.

However, GFT failed spectacularly, lasting only from 2008 to 2015. GFT was the go-to example of the power of big data – so why did it fail? That is a question for the experts to sort out, but here is what they are saying.

Part of the problem was that Google did not keep the GFT algorithm up to date. As search behavior morphed over time, its predictive ability waned. Further, there are many complex reasons why people search on various terms, and that produced a lot of noise in the data, which confused the algorithm. In the end, surveys and standard analysis by the CDC proved far superior to GFT.
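Here is a toy sketch of how shifting search behavior can break a model that is never updated. This is not GFT's actual algorithm, and all of the numbers and function names are made up for illustration. It assumes case counts are simply proportional to search counts, and that the number of searches per real case doubles between the early and the later period.

```python
def fit_ratio(searches, cases):
    """Least-squares slope through the origin, so cases ~= ratio * searches."""
    numerator = sum(s * c for s, c in zip(searches, cases))
    denominator = sum(s * s for s in searches)
    return numerator / denominator


def mean_abs_error(ratio, searches, cases):
    """Average absolute gap between predicted and actual case counts."""
    return sum(abs(ratio * s - c) for s, c in zip(searches, cases)) / len(cases)


# Hypothetical early period: about 10 flu-related searches per real case.
early_cases = [100, 150, 200, 250]
early_searches = [10 * c for c in early_cases]

# Hypothetical later period: habits changed, now about 20 searches per case.
late_cases = [120, 180, 240, 300]
late_searches = [20 * c for c in late_cases]

stale_ratio = fit_ratio(early_searches, early_cases)  # fit once, never updated
fresh_ratio = fit_ratio(late_searches, late_cases)    # refit on recent data

stale_error = mean_abs_error(stale_ratio, late_searches, late_cases)
fresh_error = mean_abs_error(fresh_ratio, late_searches, late_cases)

print(f"stale model error: {stale_error:.1f} cases")
print(f"refit model error: {fresh_error:.1f} cases")
```

In this toy setup the stale model predicts roughly twice the real number of cases, because it still assumes the old relationship between searches and illness. The refit model is scored on the same data it was fit to, so its near-zero error only illustrates the point and is not a fair benchmark.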

But Google Trends still exists, and experts are eyeing this treasure trove of Big Data as a source of information. Stock traders, for example, would love to use Google Trends to get any edge over the competition (but again, the stock market is very efficient at quickly destroying the value of information).

I think the lesson here is that Big Data is not a simple panacea, but it does contain enormous potential. Using that potential requires care. There is another technology trend, however, that I suspect will make an incredibly powerful alliance with Big Data: artificial intelligence.

AI is also getting more and more powerful, with deep learning algorithms that are just made for big data. I am extremely interested in what the marriage of Google Trends and adaptive AI will bring. Here are my questions and concerns:

First, what are the real limitations of big data? Specifically, did GFT fail more because of how Google implemented it, or because it was conceptually flawed? Are there inherent limits on the information that can be gleaned from big data because of the nature of chaos? Are those limits surmountable? I think deep learning AI will likely help answer these questions.

Perhaps more importantly – how will AI-powered big data be used? One can easily imagine a number of benign and malevolent uses. Predicting disease outbreaks or terror attacks would be extremely useful applications. Targeting advertising to individuals (already a thing) is in the gray zone – it might be useful, but it feels like an invasion of privacy.

The question is – how much will these algorithms know about us collectively and individually? How explicitly will we consent to the gathering, analysis and use of this information? How easily can it be exploited by bad actors to do harm? Will it, therefore, need to be regulated, and how?

Information has always been power, but in the information age (especially the age of big data and deep learning AI) information may be the ultimate power.

