Big Data: Are You Still Doing it Wrong?

The only buzzword on par with big data is cloud. According to the converted, or should I say the diverted, better decisions are made with better data, and the more data the merrier. This sounds good in theory, but most algorithms that predict demand, acquisition cost, projected sales prices, etc. are based on trends. And these days the average market life of a CPG product, especially in electronics or fashion, is six months or less, and the reality is that there just isn’t enough data to predict meaningful trends on. Moreover, in categories where the average lifespan is longer, you only need the data since the last supply/demand imbalance, global disruption, or global spike in demand, as the data from before that point is irrelevant to the current trend … unless you are trying to predict a trend shift, in which case you need the data that falls within an interval on each side of the trend shift for the last n trends.

And if the price only changes weekly, you don’t need daily data. And if you are always buying from the same geography, dictated by the same market, you only need that market data. And if you are using “market data” but 90% of the market is buying through 6 GPOs, then you only need their data. In other words, you only need enough relevant data for accurate prediction. Which, in many cases, will just be a few hundred data points, even if you have access to thousands (or tens of thousands or even hundreds of thousands).
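
To make the point concrete, here is a minimal sketch of trimming a price history down to what is relevant: drop everything before the last disruption and collapse daily observations into weekly ones. The function name, the dates, and the prices are all made up for illustration, and it assumes the price really does change only weekly.

```python
from datetime import date, timedelta
from statistics import mean


def relevant_weekly_prices(observations, last_disruption):
    """Keep observations on or after last_disruption, averaged per ISO week."""
    weekly = {}
    for day, price in observations:
        if day < last_disruption:
            continue  # data from before the last disruption belongs to an old trend
        iso = day.isocalendar()
        weekly.setdefault((iso[0], iso[1]), []).append(price)
    return [(week, mean(prices)) for week, prices in sorted(weekly.items())]


# Hypothetical daily feed: 28 days from Monday 2024-01-01, price moves once a week.
start = date(2024, 1, 1)
daily = [(start + timedelta(days=i), 100.0 + 5 * (i // 7)) for i in range(28)]

trimmed = relevant_weekly_prices(daily, last_disruption=date(2024, 1, 15))
print(len(daily), "daily points reduced to", len(trimmed), "weekly points")
```

Twenty-eight daily points become two weekly ones, and nothing that matters to the current trend is lost.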

In other words, big data does not mean good data, and the reality is that you rarely need big data.

But you know that AI doesn’t work without big data? Well, there are two fallacies here.

The first fallacy is that (real) AI exists. As I hoped would have been laid bare in our recent two-week series on Applied Indirection, the best that exists in our space is assisted intelligence (which does nothing without YOUR big brain behind it), and the most advanced technology out there is barely borderline augmented intelligence.

The second fallacy is that you need big data to get results from deep neural networks or other AI statistical or probabilistic machine learning technologies. You don’t … as long as you have selected the appropriate technology appropriately configured with a statistically relevant sample pool.

But here’s the kicker. You have to select the right technology, configure it right and give it the right training set … encoded the right way. Otherwise, it won’t learn anything and won’t do anything when applied. This requires a good understanding of what you’re dealing with, what you’re looking for, and how to process the data to extract, or at least bubble up, the most relevant features for the algorithms to work on. But if you don’t know how to do that, then, yes, you might need hundreds of thousands or millions of data elements and an oversized neural network or statistical classifier to identify all the potentially relevant features, analyze them in different ways, find the similarities that lead to the tightest, most differentiable clusters and adjust all the weights and settings to output that.

But then, as MIT recently published (e.g. MIT, Tech Review), and some of us have known for a long time, many of the nodes in those neural networks, calculations in the SVM, etc. are going to be of minimal, near-zero, impact and up to 90% of the calculations are going to be pretty much unnecessary. [E.g. the doctor saw this when he was experimenting with neural networks in grad school over 20 years ago; but due to the lack of processing power (as well as before and after data sets to work on) then versus now, it was a bit of trial and error to reduce network size.] In fact, as the MIT researchers found, you can remove most of these nodes, make minor adjustments to the other nodes and network, retrain the network, and get more or less equivalent results with a fraction of the calculations.
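
As a rough illustration of the removal step, here is a minimal sketch of plain magnitude pruning: zero out the weights closest to zero and keep only the largest ones. This is a generic technique chosen for illustration, not necessarily the exact procedure the MIT researchers used; the weights are invented, and the retraining that would follow is left out.

```python
def prune_smallest(weights, keep_fraction):
    """Zero all but the largest-magnitude weights, keeping about keep_fraction of them.

    Ties at the threshold are all kept, so slightly more weights may survive.
    """
    magnitudes = sorted((abs(w) for row in weights for w in row), reverse=True)
    keep = max(1, round(len(magnitudes) * keep_fraction))
    threshold = magnitudes[keep - 1]
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


# Hypothetical 3x4 layer in which most weights are already near zero.
layer = [
    [0.91, -0.02, 0.01, 0.00],
    [0.03, -0.75, 0.02, -0.01],
    [0.00, 0.01, 0.04, 0.66],
]

pruned = prune_smallest(layer, keep_fraction=0.25)
nonzero = sum(1 for row in pruned for w in row if w != 0.0)
print(nonzero, "of 12 weights survive")
```

In a real network you would retrain after pruning and check that accuracy holds up before pruning any further.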

And if you can figure out precisely what those nodes are measuring, extract those features from the data beforehand, create appropriately differentiated metadata fingerprints, and feed those instead to a properly designed neural network or other multi-level classifier, not only can you get fantastic results with less calculation, but with less data as well.

Great results come from great data that is smartly gathered, processed, and analyzed — not big data thrown into dumb algorithms where you hope for the best. So if you’re still pushing for bigger and bigger data to throw into bigger and bigger networks, you’re doing it wrong. That’s the wrong way to do it. And the only way you can call it AI is if you re-label AI to mean Anti-Intelligence.

Comprehensive Category Management: Are You Still Doing it Wrong?

As we said five years ago (and probably even earlier than that), spot buying individual categories at market lows or even running reverse auctions at opportune times is NOT category management. And for that matter, neither is a strategic sourcing event that throws everything in the category into a strategic negotiation, especially if the category is metals and you are including the kitchen sink.

And you might be thinking that the doctor needs a psychiatrist because how could it not be category management if you are addressing the whole category? Category Management isn’t just about grouping all seemingly related items and running an event. Category management is about grouping items that have related characteristics that allow the items to be sourced effectively under the same strategy.

For example, while it might make theoretical sense to group printers, ink, and paper together because you use them together, from a sourcing point of view ink and paper often go better with office supplies and printers with hardware. (You can probably get the printers thrown in for free with a server purchase.) But that’s just the start.

For example, if you source a lot of metal parts, you should probably start by grouping them by primary metal, since the price of steel, aluminum, etc. will largely dictate the price of those parts. Furthermore, it might even make sense to not only source all of the parts from the same supplier but even buy the metal on behalf of the supplier with your better negotiating power and/or credit rating.

But that’s just the start. Then you have to make sure the parts are (best) produced using similar processes, because giving a part that is only easily produced by laser cutting to a supplier that only has traditional machining / cutting is not going to be a good decision. Even though the volume will lower their cost of metal, the extra work will increase the cost per unit.

So sometimes you will need to split the category into sub-categories by metal and production style, get bids separately and together (from any supplier that can offer both), and do a multi-level analysis to find the best approach. (And this is yet another reason that SI has been telling you since DAY ONE that you need an optimization-backed sourcing platform, as this is the only way you can effectively analyze all the options.)
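
A minimal sketch of that first segmentation step, assuming each part record carries its primary metal and preferred production process. The part names, supplier names, and capabilities are invented, and the optimization that would compare the resulting bid combinations is out of scope here.

```python
from collections import defaultdict


def segment_parts(parts):
    """Group part names into sub-categories keyed by (metal, process)."""
    groups = defaultdict(list)
    for part in parts:
        groups[(part["metal"], part["process"])].append(part["name"])
    return dict(groups)


def eligible_suppliers(segment, suppliers):
    """Return suppliers whose capabilities cover the segment's production process."""
    _metal, process = segment
    return sorted(name for name, caps in suppliers.items() if process in caps)


# Hypothetical parts and supplier capabilities.
parts = [
    {"name": "bracket", "metal": "steel", "process": "laser cutting"},
    {"name": "housing", "metal": "aluminum", "process": "machining"},
    {"name": "panel", "metal": "steel", "process": "laser cutting"},
    {"name": "shaft", "metal": "steel", "process": "machining"},
]
suppliers = {
    "Supplier A": {"machining"},
    "Supplier B": {"machining", "laser cutting"},
}

segments = segment_parts(parts)
for segment, names in segments.items():
    print(segment, names, eligible_suppliers(segment, suppliers))
```

Here the laser-cut steel parts can only go to Supplier B, while the machined parts can be bid by both, which is exactly the kind of separate-and-together bidding described above.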

And sometimes you will have to ignore items with a large demand or core material component because they are cheaper when sourced as part of a different category buy as they can be produced by other suppliers or bundled for a larger volume-based discount.

For example, consider an organization-wide UPS replacement. A UPS is technically a power transformer with a battery, but you wouldn’t source the units from the manufacturer that builds custom transformers for your on-site renewable solar and wind farm. You’d source them from the hardware supplier that supplies the rest of your office electronics, as that supplier would be buying such units in bulk from a manufacturer that produces them in bulk, and could give you a better deal.

Comprehensive category management is looking at a category from a holistic perspective and finding the right segmentation to get the best overall value through the right sourcing method at the right time.

It’s not just a one-time slice-and-dice, it’s a continual analysis of the category from a multi-dimensional and current market perspective to make sure each time an event is run, the right strategy is used across the right sub-category of products and services which are offered to the right prospective supply base.

And it requires up-front market analysis before the event as well as optimization-backed analysis during. So you need a good analytics platform, preferably with some automation that can constantly pull in market data, analyze it against current cost, plot and predict the trends, and provide the necessary market intelligence that can be compared to a best-practice knowledge base that will indicate the event type that has been the most historically successful under current conditions. (And in the spirit of our recent Applied Indirection series, this is not AI, this is RPA with parameterized suggestion look-up.)
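
To show how little "intelligence" a parameterized suggestion look-up needs, here is a minimal sketch. The knowledge-base entries, the trend tolerance, and the function names are placeholders invented for illustration; they are not sourcing recommendations, and a real knowledge base would be built from your own event history.

```python
# Hypothetical knowledge base: (price trend, supplier competition) -> event type.
# The entries are placeholders for illustration only.
PLAYBOOK = {
    ("falling", "high"): "reverse auction",
    ("falling", "low"): "short-term spot buy",
    ("rising", "high"): "multi-round RFQ",
    ("rising", "low"): "negotiated long-term contract",
}


def classify_trend(prices, tolerance=0.02):
    """Label a price series by comparing its last value to its first."""
    change = (prices[-1] - prices[0]) / prices[0]
    if change > tolerance:
        return "rising"
    if change < -tolerance:
        return "falling"
    return "flat"


def suggest_event(prices, competition):
    """Look up the event type on file for the current conditions, if any."""
    key = (classify_trend(prices), competition)
    return PLAYBOOK.get(key, "no suggestion on file")


suggestion = suggest_event([100.0, 97.0, 93.0], competition="high")
print(suggestion)
```

It is a table and a threshold, nothing more, which is the point.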

SIM? Is It Old News or a Shiny New Pair of Shoes? Part III

As per our last two posts, SIM (Supplier Information Management) is a very mature and stable technology with a large number of software vendors not only providing the tools and best practices to manage supplier life-cycles, but to manage risk, compliance, receivables, and even spend repositories for spend management. And now that every suite vendor has built, or acquired, it, the technology is a commodity in the Supply Management Space, and an acquisition of the typical implementation is not likely to get baby that new pair of shoes anytime soon. Especially since most of these platforms use static data models and fixed workflows, and have little support for supply chain visibility beyond tier 1.

More specifically, as per our last post, what is needed is a SIM tool that allows for a truly dynamic data model, adaptable workflow, and a supply chain organization map that could truly bring a new wave of value to a modern Supply Management organization.

And while many of the classic platforms, as well as many of the best-of-breed platforms, do not have this capability, some of the newer, and more innovative, platforms are going down this path.

For example, Ivalua, one of the few suite providers built from the ground up on a single code-base, has spent years building a powerful workflow engine that underlies their entire platform and that can be configured to support just about any supplier on-boarding process you can imagine — as well as integrate just about any data source you want to augment the profiles through its end-user data source integration capability.

Then we have SourceMap, which allows you to map your supply chain down to the source raw material, collect data up and down the chain, and dynamically alter it as raw material providers enter the chain or drop off. And you can visualize it, create risk models that work on propagated data up and down the chain, and even estimate the impact of a delay or disruption.

And, more importantly, we have HICX, the little vendor that could, did, and keeps on trucking. Fully dynamic, adaptable data model that can even be configured into your own workflows and allow you to hang sub-tier supplier information off of supplier nodes. A powerful UI which can be heavily customized, and more innovations coming soon.

In other words, while classic SIM is old-tech and indistinguishable between about two dozen providers, modern SIM is beginning to undergo a resurgence, and when we finally get open networks, centralized, validated data, and community intelligence, we’ll see a new level of value ooze from these solutions.

So choose wisely, and your solution may just grow with you (instead of taking you back to 2009 when we had a feeling things would get better, but didn’t).

Sixteen Hundred and Ninety One Years Ago Today …

Constantine’s Bridge was officially opened in the presence of emperor Constantine the Great (who ruled Rome from 306 AD, when he was acclaimed emperor after his father’s death, until 337 AD). This was a 2,437 m Roman bridge over the Danube, 1,137 m of which spanned the riverbed, that is currently considered the longest ancient river bridge and one of the longest of all time, which is especially impressive considering it was a wooden arch bridge with wooden superstructure (with masonry piers). [The longest pure arch bridges today barely exceed 500 m in length.]

While it only lasted four decades (which is still impressive given its mostly wooden construction), it is still a feat of ancient engineering and an accomplishment in logistics as it allowed for horse and cart delivery of goods (and men) in place of boats.

SIM? Is It Old News or a Shiny New Pair of Shoes? Part II (Updated)

As per our last post, SIM (Supplier Information Management) is a very mature and stable technology with a large number of software vendors not only providing the tools and best practices to manage supplier life-cycles, but to manage risk, compliance, receivables, and even spend repositories for spend management. And now that every suite vendor has built, or acquired, it, the technology is almost a commodity in the Supply Management Space, and an acquisition thereof is not likely to get baby that new pair of shoes anytime soon. Or is it?

As great as they are, most SIM products, whether stand-alone best-of-breed or integrated suite offerings, have at least one weakness, and often two. In particular, the data model and the workflow. Just like early spend analysis solutions were often tied to one, rigid, UNSPSC-based data model, most current SIM solutions are also tied to one, rather rigid, data model. In addition, most of those solutions with some SLM (Supplier Lifecycle Management) also have rigid workflows.

This worked well when business processes were predictable and stable and corresponded to products with long life-spans. But the times they-have-a-changed. These days, product life-spans are measured in quarters, and not years, if we are lucky. Associated processes change to not only accommodate the new product demands but to adapt to new technologies and new business requirements. If the workflow can’t adapt, the capability, and overall usefulness, of the tool is limited.

A SIM product that could not only allow a user to define, and redefine, data models as necessary but define, and redefine, workflows as necessary would offer more value than current SIM platforms. And if that product could also maintain full audit trails, which not only track data changes but model and workflow changes, and ensure that old records and workflows can still be seamlessly accessed when the data model or workflow changes, then that would be even better.
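
What that could look like in miniature: a minimal sketch of a supplier store whose data model can be redefined at will, with every model and data change logged and old records still readable under the new model. The class, its methods, and the field names are hypothetical, not any vendor's design, and workflows are left out for brevity.

```python
class SupplierStore:
    """Toy supplier store with a redefinable data model and an audit trail."""

    def __init__(self, fields):
        self.versions = [tuple(fields)]  # every data model ever in force
        self.records = {}                # supplier_id -> (model version, data)
        self.audit = []                  # append-only log of all changes

    def redefine(self, fields, user):
        """Put a new data model in force without touching stored records."""
        self.versions.append(tuple(fields))
        self.audit.append((user, "model_change", len(self.versions) - 1))

    def save(self, supplier_id, data, user):
        """Store a record under the current model and log the change."""
        version = len(self.versions) - 1
        unknown = set(data) - set(self.versions[version])
        if unknown:
            raise ValueError(f"fields not in current model: {sorted(unknown)}")
        self.records[supplier_id] = (version, dict(data))
        self.audit.append((user, "data_change", supplier_id))

    def load(self, supplier_id):
        """Read a record through the current model; missing fields come back as None."""
        _version, data = self.records[supplier_id]
        return {field: data.get(field) for field in self.versions[-1]}


store = SupplierStore(["name", "country"])
store.save("S-001", {"name": "Acme Metals", "country": "CA"}, user="alice")
store.redefine(["name", "country", "conflict_minerals_cert"], user="bob")
old_record = store.load("S-001")
print(old_record)
print(store.audit)
```

The record saved under the first model is still readable after the model changes, and the audit trail shows who changed the data and who changed the model.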

And if that SIM product went even further and allowed for dynamic organizational, supply base, and user-defined hierarchies, that would be icing on the cake. Supply Chains are not boring because they are not static. They are constantly changing. The supply chain can not only change from product to product, but batch to batch as a primary raw material or part supplier runs out of material, becomes unreachable due to a political or natural disaster, or simply gets greedy and forces the higher tier supplier to find a new source. A good SIM solution will allow the supply chain map to evolve in real-time as the supply chain evolves. Moreover, with acquisitions, mergers, and spin-offs being the normal modus operandi for many businesses, a SIM solution that can easily adapt the organizational data model is also required. Finally, for maximum productivity, a user needs to be able to maintain their own view of the supply chain, back and front, relevant to them. They need to maintain their view of the relevant multi-tier supply base and the relevant hierarchies in their organization that they have to report to and serve.
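
For a feel of what an evolving supply chain map involves, here is a minimal sketch that stores the chain as a buyer-to-suppliers mapping, walks it to list every sub-tier supplier, and swaps out a source when one drops off. All names are invented, and a real tool would also carry batch, product, and per-user view information.

```python
def upstream(chain, node, seen=None):
    """Return every direct and sub-tier supplier that feeds `node`."""
    seen = set() if seen is None else seen
    for supplier in chain.get(node, ()):
        if supplier not in seen:
            seen.add(supplier)
            upstream(chain, supplier, seen)
    return seen


def swap_supplier(chain, buyer, old, new):
    """Replace one of `buyer`'s suppliers, e.g. when a raw material source drops off."""
    chain[buyer] = [new if s == old else s for s in chain.get(buyer, [])]


# Hypothetical three-tier chain: buyer -> list of its direct suppliers.
chain = {
    "OEM": ["Tier1-A", "Tier1-B"],
    "Tier1-A": ["Smelter-X"],
    "Tier1-B": ["Smelter-X", "Mill-Z"],
}

before = upstream(chain, "OEM")
swap_supplier(chain, "Tier1-A", old="Smelter-X", new="Smelter-Y")
after = upstream(chain, "OEM")
print(sorted(after - before), "entered the chain")
```

Note that Smelter-X is still in the map after the swap because Tier1-B continues to buy from it, which is the sort of detail a tier-1-only view never shows you.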

In other words, a SIM tool that allowed for a truly dynamic data model, workflow, and supply chain organization map could bring a new wave of value to a modern Supply Management organization, and the individual with the foresight to acquire such a tool might just get baby a new set of shoes. But is this available? And is it becoming commonplace?