The only buzzword on par with big data is cloud. According to the converted (or should I say the diverted), better decisions are made with better data, and the more data the merrier. This sounds good in theory, but most algorithms that predict demand, acquisition cost, projected sales prices, etc. are based on trends. And these days the average market life of a CPG product, especially in electronics or fashion, is six months or less, and the reality is that there just isn’t enough data to predict meaningful trends from. Moreover, in categories where the average lifespan is longer, you only need the data since the last supply/demand imbalance, global disruption, or global spike in demand, because the data from before the current trend began is irrelevant … unless you are trying to predict a trend shift, in which case you need the data that falls an interval on each side of the trend shift for the last n trends.
And if the price only changes weekly, you don’t need daily data. And if you are always buying from the same geography, dictated by the same market, you only need that market’s data. And if you are using “market data” but 90% of the market is buying through 6 GPOs, then you only need their data. In short, you only need enough relevant data for accurate prediction, which, in many cases, will be just a few hundred data points, even if you have access to thousands (or tens of thousands, or even hundreds of thousands).
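To make that concrete, here is a minimal sketch (in Python with numpy and scikit-learn; the synthetic price series and the shock at period 500 are my illustration, not data from any real category) of why a short window of relevant data beats a long history that straddles a trend shift:

```python
# A sketch with synthetic data (numpy + scikit-learn): a price series that
# trends down for 500 periods, then reverses after a supply shock.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)
t = np.arange(600)
# Downtrend until t=500, then the shock flips the trend upward.
price = np.where(t < 500, 100 - 0.05 * t, 75 + 0.20 * (t - 500))
price = price + rng.normal(0, 1.0, size=t.shape)  # market noise

full = LinearRegression().fit(t.reshape(-1, 1), price)                # all 600 points
recent = LinearRegression().fit(t[500:].reshape(-1, 1), price[500:])  # last 100 only

t_future = np.array([[629.0]])  # forecast 30 periods out
print("true trend value at t=629:", round(75 + 0.20 * 129, 1))  # 100.8
print("full-history forecast:    ", round(float(full.predict(t_future)[0]), 1))
print("post-shift forecast:      ", round(float(recent.predict(t_future)[0]), 1))
```

The 100-point fit lands on top of the true trend; the 600-point fit is dragged toward a trend that no longer exists.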
In other words, big data does not mean good data, and the reality is that you rarely need big data.
But, you say, AI doesn’t work without big data? Well, there are two fallacies here.
The first fallacy is that (real) AI exists. As I hope was laid bare in our recent two-week series on Applied Indirection, the best that exists in our space is assisted intelligence (which does nothing without YOUR big brain behind it), and the most advanced technology out there is barely borderline augmented intelligence.
The second fallacy is that you need big data to get results from deep neural networks or other statistical or probabilistic machine learning technologies. You don’t … as long as you have selected the appropriate technology, configured it appropriately, and trained it on a statistically relevant sample pool.
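As a quick illustration of what a “statistically relevant sample pool” looks like in practice (a sketch in Python with scikit-learn; the digits data set is just a stand-in for any well-posed classification task), accuracy plateaus after a few hundred well-chosen samples, long before anything resembling big data:

```python
# A sketch (scikit-learn): accuracy as a function of training-set size on a
# standard 10-class task. The curve flattens after a few hundred samples.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n in (50, 100, 200, 400, 800, len(X_train)):
    clf = LogisticRegression(max_iter=2000).fit(X_train[:n], y_train[:n])
    print(f"{n:5d} samples -> test accuracy {clf.score(X_test, y_test):.2f}")
```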
But here’s the kicker. You have to select the right technology, configure it right, and give it the right training set … encoded the right way. Otherwise, it won’t learn anything and won’t do anything when applied. This requires a good understanding of what you’re dealing with, what you’re looking for, and how to process the data to extract, or at least bubble up, the most relevant features for the algorithms to work on. If you don’t know how to do that, then, yes, you might need hundreds of thousands or millions of data elements and an oversized neural network or statistical classifier to identify all the potentially relevant features, analyze them in different ways, find the similarities that lead to the tightest, most differentiable clusters, and adjust all the weights and settings to output that.
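To see how much “encoded the right way” matters, here is a minimal sketch (Python with scikit-learn; the scaling step stands in for whatever domain-appropriate encoding your data actually needs): the same small network, on the same 178-sample data set, with and without sensible input encoding.

```python
# A sketch (scikit-learn): the same small network on the same 178-sample
# wine data set, with and without input scaling. StandardScaler stands in
# for whatever encoding YOUR data actually needs.
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)  # 178 samples: hardly "big data"
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

raw = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
raw.fit(X_train, y_train)
print("raw encoding:   ", raw.score(X_test, y_test))

scaled = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
scaled.fit(X_train, y_train)
print("proper encoding:", scaled.score(X_test, y_test))
```

Same algorithm, same tiny sample pool; the badly encoded version typically struggles to learn at all, while the properly encoded version does fine.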
But then, as MIT recently published (e.g. MIT, Tech Review), and as some of us have known for a long time, many of the nodes in that oversized neural network, calculations in that SVM, etc. are going to be of minimal, near-zero, impact, and up to 90% of the calculations are going to be pretty much unnecessary. [E.g. the doctor saw this when he was experimenting with neural networks in grad school over 20 years ago; but given the processing power available then versus now (as well as the lack of before-and-after data sets to work on), reducing network size was a bit of trial and error.] In fact, as the MIT researchers found, you can remove most of these nodes, make minor adjustments to the remaining nodes and network, retrain the network, and get more or less equivalent results with a fraction of the calculations.
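A crude sketch of the idea (Python with scikit-learn; note that the MIT work goes further and retrains the pruned subnetwork, a step this sketch omits): train a small network, zero out the lowest-magnitude weights, and see how little accuracy the removed calculations were buying.

```python
# A sketch (numpy + scikit-learn): train a small network, then zero out the
# smallest 80% of its weights by magnitude and re-score it. (The MIT result
# goes further: retraining the pruned subnetwork closes most of any gap.)
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
net.fit(X_train, y_train)
print("dense accuracy: ", net.score(X_test, y_test))

# Magnitude pruning: the weights nearest zero contribute almost nothing.
for W in net.coefs_:
    W[np.abs(W) < np.quantile(np.abs(W), 0.80)] = 0.0
print("pruned accuracy:", net.score(X_test, y_test))
```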
And if you can figure out precisely what those nodes are measuring, extract those features from the data beforehand, and create appropriately differentiated metadata fingerprints to feed to a properly designed neural network or other multi-level classifier, not only can you get fantastic results with less calculation, but with less data as well.
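A minimal sketch of the fingerprint idea (Python with scikit-learn; PCA projections stand in here for the hand-extracted, domain-aware features described above): compress each 64-pixel digit image into a 10-number fingerprint and feed it to a network a fraction of the size.

```python
# A sketch (scikit-learn): 10 PCA components stand in for a hand-crafted
# metadata fingerprint. The small model has roughly 2% of the big model's
# weights, yet lands in the same accuracy neighborhood.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

big = MLPClassifier(hidden_layer_sizes=(256,), max_iter=500, random_state=0)
big.fit(X_train, y_train)

pca = PCA(n_components=10).fit(X_train)  # the "fingerprint" extractor
small = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
small.fit(pca.transform(X_train), y_train)

def n_weights(m):  # total weight count across all layers
    return sum(w.size for w in m.coefs_)

print(f"64 raw features, {n_weights(big)} weights:  ",
      big.score(X_test, y_test))
print(f"10-feature fingerprint, {n_weights(small)} weights:",
      small.score(pca.transform(X_test), y_test))
```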
Great results come from great data that is smartly gathered, processed, and analyzed, not from big data thrown into dumb algorithms in the hope that something sticks. So if you’re still pushing for bigger and bigger data to throw into bigger and bigger networks, you’re doing it wrong. And the only way you can call it AI is if you re-label AI to mean Anti-Intelligence.