Daily Archives: May 13, 2012

The Three Things You Really Need to Know About Big Data Right Now!

A recent post over at the World Future Society, The Three Things You Need to Know About Big Data, Right Now, annoyed me because the first thing I saw was that the data experts are organizing and they want a revolution. So what? First of all, we’re few and far between, and, more importantly:

  1. For Business, Big is, as it has always been, meaningless
    Like I said in my recent post on how There’s No Such Thing as BIG Data in Business, we’re not doing protein folding, climate modelling, nuclear simulations, supercollider data interpolation, cosmological computations, or even trying to beat Deep Blue at chess. We’re looking for answers to everyday business problems, which come from analyzing heterogeneous and related data, possibly through federation, and not from throwing everything into a number cruncher to see what comes out. Although you may have 100 million transactions in your ERP, you don’t need to analyze them all at once. Analyzing all of your spend at once is akin to comparing your DVD player to the kitchen sink to an apple. We don’t need a revolution. An evolution will do just fine.
  2. They Don’t Need Your Data
    Yes, every advertiser and his dog is going to want your data to better target you with advertisements you are more likely to look at, but you don’t have to give it. Remember, they only care about statistics anyway, and, depending on the data being collected, a sample as small as 30 can have some statistical relevance. And while economists, population researchers, and medical researchers may have a valid need for certain data items about you, you don’t have to share all your private details and can most likely keep most of your unique identifying details anonymous without impacting the accuracy of their study.
  3. The Correlations will be Uncanny but Many Won’t Matter Anyway
    While some stuff you can predict is amazing, some is not, and while it’s always frustrating to not be able to predict some behaviours and outcomes, we not only accept that as a fact of business and of life, but wouldn’t have it any other way. (There’d be no stock market if we could predict everything.) For example, it’s not unexpected that web-savvy people will be searching for “unemployment” information as soon as they get laid off. But it is unexpected that a subset of Indian politicians and a subset of drug dealers would have much in common (as per this recent Freakonomics Blog Entry). And neither of these facts really helps us with anything.
    Plus, the predictions will never be perfect. Like weather and stock market models, when you try to model large-scale behaviour like consumer activity over the long term, every now and again the model will fall flat on its face.

To summarize, Big Data is like Big Cloud — full of too much hot air.