Auto-Classification is NOT the Answer, Part I

Today’s post is co-authored by Eric Strovink of BIQ.

Not a month goes by these days without a new spend classification / consulting play hitting the market. Considering that true spend analysis is one of only two sourcing technologies proven to deliver double-digit percentage savings (averaging 11%), one would think this would be a good thing. But it’s not. Most of these new plays focus on automatic classification, analysis, and reporting — which is not what true spend analysis is. True spend analysis is intelligently guided analysis, and, at least until we have true AI, it can only be done by a human. So what’s wrong with the automatic approach?

1. Automatically Generated Rule Sets are Difficult to Maintain

Almost all of today’s auto-classifiers generate a single-level rule set so large that its size alone makes it unwieldy. This is because auto-classifiers depend on string-matching techniques to identify vendor names or line items. But when a new string-matching rule is added, what is its impact on the other rules? There is no way to know other than to replay the rules every time. This quickly exhausts the patience of anyone trying to maintain such a rule set, and produces errors that are difficult to track down and essentially impossible to fix. Worse, what happens when you delete a rule? The process is intrinsically chaotic and unstable. We get calls all the time from users who have thrown up their hands at this.
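To make the problem concrete, here is a minimal sketch (in Python, and not modelled on any particular vendor’s engine) of a flat, first-match-wins string-matching rule set. Each rule looks harmless on its own, but the order of the list silently decides the outcome, and the only way to see what adding or deleting a rule actually did is to replay everything and diff the results.

```python
# Illustrative sketch only: a flat, single-level rule set where every rule is
# just a substring match against the vendor name, and the first match wins.
FLAT_RULES = [
    ("INTL BUS MACH", "IT Hardware"),
    ("IBM", "IT Services"),      # added later: any name containing "IBM" now stops
                                 # here, regardless of what the rules below intended
    ("STAPLES", "Office Supplies"),
    # ... thousands more auto-generated rules ...
]

def classify(vendor_name: str) -> str:
    """First match wins, so every insertion or deletion can change earlier results."""
    for pattern, commodity in FLAT_RULES:
        if pattern in vendor_name.upper():
            return commodity
    return "Unclassified"

def replay(transactions):
    """The only way to assess a rule change: re-run every rule over every record."""
    return {t["vendor"]: classify(t["vendor"]) for t in transactions}
```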

But with a layered rule set (more on this in part II), where each rule group takes priority over the groups above it, the average organization can achieve a reasonable first-order mapping result with only a few hundred GL mapping rules and a few hundred vendor mapping rules, along with a handful of rules to map vendor + GL code combinations in the situations where a vendor supplies more than one Commodity (and an even smaller number of exception rules where a vendor product or service can map to a different Commodity depending upon spend or use). If finer resolution is required, map more GL codes and more vendors; or map just the GL codes and vendors that are relevant to the sourcing exercise you are contemplating. There’s a reason for the 80-20 rule; it makes sense. Mapping a vendor like Fred’s Diner is irrelevant. Mapping a vendor like IBM correctly and completely, with full manual oversight and control, is critical.
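For contrast, here is a rough sketch of the layered idea; the group names and matching logic are illustrative assumptions, not a description of any particular tool. Because the more specific groups take priority, a mis-mapping is fixed with one targeted rule in the right group rather than a reshuffle of thousands of string matches.

```python
# Illustrative layered rule set: groups are evaluated from broadest to most
# specific, and a match in a later (more specific) group overrides earlier ones.
RULE_GROUPS = [
    ("GL rules",        lambda t: t["gl_code"] == "6200",                           "Office Supplies"),
    ("Vendor rules",    lambda t: t["vendor"] == "IBM",                             "IT Services"),
    ("Vendor+GL rules", lambda t: t["vendor"] == "IBM" and t["gl_code"] == "6450",  "IT Hardware"),
    ("Exception rules", lambda t: t["vendor"] == "IBM" and t["po"] == "PO-123",     "Consulting"),
]

def classify(txn: dict) -> str:
    commodity = "Unclassified"
    for group, matches, mapped in RULE_GROUPS:   # later groups take priority
        if matches(txn):
            commodity = mapped
    return commodity

# A vendor that supplies more than one Commodity maps cleanly without touching
# the broad vendor rule:
print(classify({"vendor": "IBM", "gl_code": "6450", "po": "PO-999"}))  # IT Hardware
```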

2. Finding Errors, Performing Q/A, Avoiding Embarrassment

How can a spend cube be vetted? It’s actually quite easy. Run a “Commodity Summary Report” (originally popularized by The Mitchell Madison Group, circa 1995 – example here). This report provides a multi-page book, one page per Commodity, showing top vendors, top GL codes, and top Cost Centers, ordered top-down by spend. Errors will jump out at you — for example, what is this GL doing associated with this Commodity? Does this Vendor really supply this Commodity? Does this Cost Center really use this Commodity?

Then invert the Commodity Summary Report to book by Vendor, showing top GL codes, top Commodities, and top Cost Centers. Errors are obvious again: why is this Commodity showing up under this Vendor? What’s the story with this GL code being associated with this Vendor? Then invert it once more to book by GL code, showing top Vendors, top Commodities, and top Cost Centers. When you refine the rule set to the point where nothing jumps out at you in any of these three views, then congratulations: you have a consistent spend map that will hold up well to any outside examination. If someone crawls down into the weeds and finds an inaccurate GL mapping, simply add a rule to the appropriate group (probably Vendor), and the problem is solved. If the mapping tool is a real-time tool, as it ought to be, the problem can be solved immediately, in seconds.
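For readers who want to try this on their own data, here is a minimal sketch of the three books, assuming the cube has been flattened into a pandas DataFrame with 'commodity', 'vendor', 'gl_code', 'cost_center', and 'amount' columns (the column names are our assumptions). All three views are the same report with a different outer grouping key.

```python
import pandas as pd

def summary_book(df: pd.DataFrame, outer: str, inner: tuple, top_n: int = 10) -> None:
    """Print one page per value of the outer dimension, with the top spenders
    in each inner dimension ordered top-down by spend."""
    for key, page in df.groupby(outer):
        print(f"\n=== {outer}: {key}  (total ${page['amount'].sum():,.0f}) ===")
        for dim in inner:
            top = (page.groupby(dim)["amount"].sum()
                       .sort_values(ascending=False).head(top_n))
            print(f"\nTop {dim} by spend:\n{top.to_string()}")

# The Commodity Summary Report, then its two inversions:
# summary_book(df, "commodity", ("vendor", "gl_code", "cost_center"))
# summary_book(df, "vendor",    ("gl_code", "commodity", "cost_center"))
# summary_book(df, "gl_code",   ("vendor", "commodity", "cost_center"))
```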

[N.B. We encourage you to run the Commodity Summary Report on the results of your automatically generated rule set. But please do it only if you are sitting down comfortably. We don’t want you to hurt yourself falling off the chair.]

3. Automated Analysis is NOT Analysis

All an automated system can do is repeat a previously identified analysis. Chances are that if the analysis was already done, the savings opportunity was already found and addressed. That means that after the analysis is done the first time, no more savings will be found. The only path to sustained savings is a user manually analyzing the data in new and interesting ways that surface previously unnoticed patterns, or general trends with outliers well outside the norm — as it is those outliers that represent the true savings opportunities. And sometimes the only way to find a novel savings opportunity is to allow the analyst to follow her hunches to uncover unusual spending patterns that could allow significant savings if normalized.
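As a toy illustration of the kind of ad-hoc check an analyst might run on a hunch (the grouping and the threshold below are arbitrary assumptions, not a prescription): flag the cost centers whose spend on a Commodity sits far outside the norm for that Commodity, and then go ask why.

```python
import pandas as pd

def spend_outliers(df: pd.DataFrame, z_cutoff: float = 2.5) -> pd.DataFrame:
    # Total spend by Commodity and cost center
    cc = df.groupby(["commodity", "cost_center"])["amount"].sum().reset_index()
    # How far is each cost center from the norm for that Commodity?
    stats = cc.groupby("commodity")["amount"].agg(["mean", "std"]).reset_index()
    cc = cc.merge(stats, on="commodity")
    cc["z"] = (cc["amount"] - cc["mean"]) / cc["std"]
    # Anything far outside the norm is a question worth asking, not an answer
    return cc[cc["z"] > z_cutoff].sort_values("z", ascending=False)
```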

4. True Analysis Goes Well Beyond AP Data

Last but not least, it must be pointed out that the bulk of the (dozens of) spend analysis cubes that need to be built by the average large company are on PxQ (price x quantity) data, not on A/P data. In the PxQ case, classification is totally irrelevant; yet PxQ analysis is where the real savings and real insights occur. More on that in an upcoming Spend Analysis Series.
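As a preview, here is a hedged sketch of the simplest PxQ check, assuming line-item data with 'item', 'site', 'unit_price', and 'quantity' columns (illustrative names only): compare the unit price each site pays for an item against the best price paid anywhere, and size the gap.

```python
import pandas as pd

def pxq_opportunity(df: pd.DataFrame) -> pd.DataFrame:
    # Best unit price paid anywhere for each item
    best = df.groupby("item")["unit_price"].min().rename("best_price")
    out = df.join(best, on="item")
    # Savings available if every site paid the best observed price
    out["savings_if_normalized"] = (out["unit_price"] - out["best_price"]) * out["quantity"]
    return (out.groupby(["item", "site"])["savings_if_normalized"].sum()
               .sort_values(ascending=False).reset_index())
```

Notice that there is not a vendor name or commodity code anywhere in that calculation; the leverage comes entirely from the price and quantity fields.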

In our next post, we’ll review the final reason that auto-classification is not the answer.
