Spend Analysis III: Common Sense Cleansing

Today I’d like to welcome back Eric Strovink of BIQ who, as I indicated in Part I of this series, is authoring these installments on next-generation spend analysis and why it is more than just basic spend visibility. Much, much more!

Many observers would acknowledge that there’s not a lot of difference between
viewing cleansed spend data with SAP BW or Cognos or Business Objects,
and viewing cleansed spend data with a custom data warehouse from a
spend analysis vendor. They’re all OLAP data warehouses; they all
have competent data viewers; they all provide visibility into
multidimensional data. What has historically differentiated spend
analysis from BI systems is the cleansing process itself, along with
the decoupling of data dimensions from the accounting system, which
the BI view does not provide.

Because it’s hard to distinguish one data warehouse from another, cleansing has
become an important differentiator for many spend analysis
vendors. The vendor has typically developed a viewpoint as to
the relative merits of manual labor/offshore resources, automated
tools, custom databases, and so on, and sells its spend analysis product and
services around that viewpoint. Unfortunately, all the resulting
hype and focus on cleansing services, from both these vendors and the analysts
who follow them, has obscured a simple reality —
namely, that effective data cleansing methods have
been around for years, are well understood, and are easy to implement.

The basic concept, originated and refined by various consultants and
procurement professionals during the early to mid-1990s, is to build
commodity mapping rules for top vendors and top GL codes (top means
ordered top-down by spending) — in other words, to apply common sense 80-20
engineering principles to spend mapping. GL mapping catches the
“tail” of the spend distribution, albeit approximately; vendor
mapping ensures that the most important vendors are mapped correctly;
and a combination of GL and vendor mapping handles the case of
vendors who supply multiple commodities. If more accuracy is
needed, one simply maps more of the top GLs and vendors. Practitioners routinely
report mapping accuracies of 95% and above. More importantly, this
straightforward methodology enables sourcers to achieve good
visibility into a typical spend dataset very quickly, which in
turn allows them to focus their spend management efforts (and
further cleansing) on the most promising commodities.
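
To make the idea concrete, here is a minimal sketch of that kind of rule-based mapping, with vendor rules, GL rules, and vendor+GL rules applied in order of specificity. The rule tables, vendor names, and GL codes are invented for illustration and are not any vendor’s actual schema.

```python
# Illustrative sketch only: rule tables and field values below are hypothetical.

# Rules are built top-down for the highest-spend vendors and GL codes.
VENDOR_GL_RULES = {("ACME STAFFING", "6410"): "Contract Labor - IT"}
VENDOR_RULES    = {"ACME STAFFING": "Contract Labor",
                   "OFFICE DEPOT":  "Office Supplies"}
GL_RULES        = {"6410": "Professional Services",   # approximate "tail" mapping
                   "7200": "Travel"}

def map_commodity(vendor: str, gl_code: str) -> str:
    """Apply the most specific rule first, then fall back to broader rules."""
    if (vendor, gl_code) in VENDOR_GL_RULES:   # vendor that supplies multiple commodities
        return VENDOR_GL_RULES[(vendor, gl_code)]
    if vendor in VENDOR_RULES:                 # top vendors mapped exactly
        return VENDOR_RULES[vendor]
    if gl_code in GL_RULES:                    # GL rules catch the tail, approximately
        return GL_RULES[gl_code]
    return "Unmapped"                          # candidates for the next mapping pass

print(map_commodity("ACME STAFFING", "6410"))      # -> Contract Labor - IT
print(map_commodity("ACME STAFFING", "7200"))      # -> Contract Labor
print(map_commodity("SOME SMALL VENDOR", "7200"))  # -> Travel
```

Getting more accuracy is then just a matter of adding rows to the vendor and GL tables, starting from the top of the spend distribution.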

Is it necessary to map every vendor? Almost never, although third-party vendor mapping
services are readily available if you need them. And, as far
as vendor familying is concerned, grouping together multiple instances
of the same vendor clears up more than 95% of the problem.
Who-owns-whom familying using commercial databases seldom
provides additional insight; besides, inside buyers are usually well
aware of the few relationships that actually matter. For example,
you won’t get any additional leverage with UTC just because you buy
from both Carrier and Otis Elevator. And it would be a mistake to
group individual Hilton hotels under a single corporate parent,
since they are franchisees.
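
A minimal sketch of that kind of name-based familying might look like the following; the vendor names and suffix list are hypothetical, and a production process would of course go further (fuzzy matching, manual review, and so on).

```python
import re
from collections import defaultdict

# Hypothetical illustration of name-based vendor familying: grouping multiple
# spellings of the same vendor by normalizing case, punctuation, and common
# legal suffixes. (Not who-owns-whom familying from a commercial database.)

SUFFIXES = re.compile(r"\b(incorporated|inc|corporation|corp|company|co|llc|ltd|limited)\b")

def family_key(name: str) -> str:
    key = re.sub(r"[^a-z0-9 ]", " ", name.lower())  # drop punctuation
    key = SUFFIXES.sub(" ", key)                    # drop legal suffixes
    return " ".join(key.split())                    # collapse whitespace

vendors = ["IBM Corp.", "IBM CORPORATION", "IBM",
           "Acme Staffing LLC", "ACME STAFFING, INC."]

families = defaultdict(list)
for v in vendors:
    families[family_key(v)].append(v)

for key, members in families.items():
    print(key, "->", members)
# ibm -> ['IBM Corp.', 'IBM CORPORATION', 'IBM']
# acme staffing -> ['Acme Staffing LLC', 'ACME STAFFING, INC.']
```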

[N.B. There are of course cases where insufficient data exist
to use classical mapping techniques. For example, if the dataset
is limited to line item descriptions, then phrase mapping is
required; if the dataset has vendor information only, then
vendor mapping is the only alternative. Commodity maps based
on insufficient data are inaccurate commodity maps, but they
are better than nothing.]
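
For completeness, here is a hedged sketch of what phrase mapping on line-item descriptions might look like; the phrases and commodity names are invented for illustration, and real phrase rules would be built top-down by spend, just like vendor and GL rules.

```python
# Hypothetical sketch of phrase mapping when only line-item descriptions exist.
PHRASE_RULES = [
    (("toner", "cartridge", "copy paper"), "Office Supplies"),
    (("airfare", "hotel", "rental car"), "Travel"),
]

def map_by_phrase(description: str) -> str:
    text = description.lower()
    for keywords, commodity in PHRASE_RULES:
        if any(k in text for k in keywords):
            return commodity
    return "Unmapped"

print(map_by_phrase("HP 26A toner cartridge, black"))  # -> Office Supplies
print(map_by_phrase("Hotel, 2 nights, Chicago"))       # -> Travel
```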

80-20 logic also applies to the overall spend mapping problem.
Consider a financial services firm with an indirect spend base.
Before even starting to look at the data, every veteran sourcer
knows where to start looking first for potential savings:
contract labor, commercial print, PCs and computing, and so
on. Here is a segment of the typical indirect spending
breakdown, originally published by The Mitchell Madison Group:

If you have limited resources, it can be counterproductive to start
mapping commodities that likely won’t produce savings, when good estimates
can often be made as to where the big hits are likely to be. If you can score some
successes now, there will be plenty
of time to extend the reach of the system later. If
there are sufficient resources to attack only a couple of
commodities, it makes sense to focus on those commodities alone, rather than to
attempt to map the entire commodity tree.

The bottom line is that data cleansing needn’t be a complex,
expensive, offline process. If you apply common sense
to the cleansing problem, i.e. attack it incrementally
and intelligently over time, you can develop, refine,
and apply mapping rules as needed. In fact, whether you choose to have an initial
spend dataset created by outside resources, or you decide to create it yourself,
the conclusion is the same:
cleansing should be an online, ongoing process, guided
by feedback and insight gleaned directly (and incestuously)
from the powerful visibility tools of the spend analysis system
itself.
And, as a corollary, cleansing tools must be placed directly into the hands of
purchasing professionals so that they can create and refine
mappings on-the-fly, without any assistance from vendors or internal IT experts.

Next: Defining “Analysis”