Category Archives: Spend Analysis

Aptium Global : An Emerging Spend Powerhouse

Regular readers of this blog will remember that I’ve mentioned Aptium Global a few times, chronicled one of their success stories in Tuesday’s Lean Services post (with another hitting the blogosphere tomorrow), and ran a great guest post on Quantifying Quality in Lean Sourcing Initiatives by founder and principal Lisa Reisman. Aptium Global is a specialized consultancy that works primarily with small and medium-sized manufacturing companies to help them save money on purchases through Lean Sourcing approaches.

Well today, in addition to industry heavyweight Stuart Burns, who runs their European practice, Aptium Global can add FreeMarkets legend Tony Poshek, inventor of The Puddy Principle of strategic sourcing. Tony, who has also put in considerable time at GE (as well as managing events for GM and other Fortune 50 heavyweights), has sourced almost $2B in his sourcing career and saved over $300M (an average of 15%) above and beyond what industry-leading sourcing teams have saved. Lisa interviewed Tony last year, and the interview is archived over on e-Sourcing Forum in Part 1 and Part 2. Check it out!

Add to this Aptium’s forthcoming launch of Metal Miner, an industry-specific offering for companies that source metals, commodities, and components with high metal concentrations, and it’s easy to predict that Aptium Global is poised to become a powerhouse in its corner of the sourcing space.

The Metal Miner sourcing solution is a packaged analysis designed to provide a small or mid-sized company with real-time market conditions and savings strategies for all of its metals and metal services spending in two to three weeks. A proprietary analytical solution built on over half a century of combined global metals sourcing experience, it is designed to give you and your executive team a strategic framework for metals sourcing and the insight you need for critical strategic sourcing decisions. Metal Miner uses state-of-the-art analysis technology, takes into account a high-level assessment of the supply market for each category (including the main price drivers, the degree of fragmentation, domestic and offshore supply bases, and hedging mechanisms), and produces a customized report with specific, implementable savings strategies for each category in which significant savings can be achieved.

So, if you need sourcing help, particularly in metal or metal services categories, I’d contact them now. The secret’s out … and it won’t be long before the lines are jammed and the e-mail boxes overflowing.

Spend Analysis VI: New Horizons (Part 2)

Today I’d like to welcome back Eric Strovink of BIQ (acquired by Opera Solutions, rebranded ElectrifAI) who, as I indicated in Part I of this series, is authoring this series on next-generation spend analysis and why it is more than just basic spend visibility. Much, much more!

Federation

One of the most serious limitations of OLAP analysis is the schema structure itself — typically a “star” schema, where a voluminous “fact” or “transaction” file is surrounded by supporting files, or “dimensions.” In the case of spend analysis, dimensions are Supplier, Cost Center, Commodity, and so on; transactions are typically AP records.

Why is this schema limiting? Because there are only certain ways that a dimension file can be linked to transaction files, and it isn’t always clear which file ought to be the transaction file and which files ought to be dimensions. For example, suppose that the transaction file consists of AP transactions, and a dimension file consists of invoice line items. The problem is that the invoice line item file is “moving faster” than the AP file; i.e. for every invoice number that appears in AP, there are multiple invoice lines that match. Which invoice line item should we link to?

Well, we could invert the problem and build the dataset from the invoice detail file instead, except that we typically won’t have invoice detail for every AP record, so that probably won’t work. Here are a couple of ideas that will work: (1) we could build a separate measure column for invoice line items, and include them as AP record equivalents (coercing the two record types into a common format); (2) we could drop the associated AP record whenever we have invoice line item data, and include the AP information inside those line items, redundantly.
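
As a minimal illustration, here is a sketch of option (2) in pandas, with invented column names and data; the real logic would of course live inside the dataset build, not a script:

    import pandas as pd

    # Hypothetical AP header records: one row per invoice.
    ap = pd.DataFrame({
        "invoice_no": ["A1", "A2", "A3"],
        "supplier":   ["Acme", "Mercer", "IBM"],
        "amount":     [1000.0, 500.0, 750.0],
    })

    # Hypothetical invoice line items: several rows per invoice,
    # but only for a subset of the AP records.
    lines = pd.DataFrame({
        "invoice_no": ["A1", "A1", "A1"],
        "item":       ["widgets", "freight", "tax"],
        "amount":     [800.0, 150.0, 50.0],
    })

    # Wherever line-item detail exists, drop the AP header and carry
    # its fields (here, supplier) down onto the lines, redundantly.
    detailed = lines.merge(ap.drop(columns="amount"), on="invoice_no")
    headers = ap[~ap["invoice_no"].isin(lines["invoice_no"])]

    # One transaction file at a single grain: line items where we
    # have them, bare AP headers where we don't.
    transactions = pd.concat([detailed, headers], ignore_index=True)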

There are other options, too.

But the essential problem is that we have two separate datasets, and we’re trying to join them at the hip. There is an AP dataset, and there is an invoice line item dataset, and never the twain shall meet, except artificially. Even when there is no granularity issue at all, and when one dataset can be normalized or snowflaked such that every matching line item can be joined through from the other, the amount of effort required to set up the index->index->index relationships can be daunting.

Instead, why not create two separate datasets, efficiently and quickly; and then as a final step, federate them together on a common dimension? Suppose the federation logic was “join” — in that case, we’d drill on an element in dataset A; dataset B would drill on the common dimension from A; and then A would drill again on the common dimension from B. What we’d see is the perfect join of all of the records from A and from B that shared a common key in the common dimension; and we’d have the ability to reference all data from any dimension of both A and B.

There are many forms of federation in addition to join — for example, “master-slave,” where we drill on A, and B shows us its common nodes; but does not feed those back to A. That relationship can go the other way, as well, from B to A. In addition, there’s a “disjoint” operation — show me all the nodes in B that don’t share a key in the common dimension with A (and vice versa).
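
To make the semantics concrete, here is a toy sketch (mine, not BIQ’s implementation) of the three federation modes over a shared supplier dimension, using plain pandas and invented data:

    import pandas as pd

    # Two independently built datasets that share a "supplier" dimension.
    a = pd.DataFrame({"supplier": ["Acme", "IBM", "Mercer"],
                      "ap_spend": [10.0, 20.0, 5.0]})
    b = pd.DataFrame({"supplier": ["IBM", "Mercer", "KPMG"],
                      "contract_spend": [18.0, 4.0, 7.0]})

    keys_a, keys_b = set(a["supplier"]), set(b["supplier"])

    # "Join" federation: a drill on A restricts B, and B's restriction
    # feeds back to A, so both sides settle on the common keys.
    joined = a[a["supplier"].isin(keys_b)].merge(b, on="supplier")

    # "Master-slave": a drill on A filters B, but nothing feeds back.
    slave_view = b[b["supplier"].isin(keys_a)]

    # "Disjoint": nodes in B with no counterpart in A, and vice versa.
    b_only = b[~b["supplier"].isin(keys_a)]
    a_only = a[~a["supplier"].isin(keys_b)]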

Federation represents a key productivity enhancer for dataset creation, as well as a simplification to the dataset building process in general. Federation also passes the “usability” litmus test, in that the resulting datasets are much easier to understand than massive levels of index indirection and snowflaking, and have the potential to produce richer results.

The technical challenges for federation are considerable: maintaining multiple connections to multiple datasets; representing multiple data dimensions inside the context of a single data viewer; providing mechanisms for pulling data seamlessly from multiple datasets for reports and analyses; and last but not least, augmenting the OLAP engine to perform federation operations effectively and quickly.

Is federation worth it? I think so, emphatically.

This brings to an end our initial Spend Analysis series. Thanks for the opportunity, Michael; and thanks to everyone who took the time to wade through it.

As Eric said, this ends Sourcing Innovation’s initial series on spend analysis. I’d like to thank Eric for his enlightening posts and hope that you learned something from them.

Spend Analysis V: New Horizons (Part 1)

Today I’d like to welcome back Eric Strovink of BIQ (acquired by Opera Solutions, rebranded ElectrifAI) who, as I indicated in Part I of this series, is authoring this series on next-generation spend analysis and why it is more than just basic spend visibility. Much, much more!

Many of the limitations of spend analysis derive from its underlying technology. As I’ve discussed in previous installments, the extent to which spend analysis can be made more useful to business users is often the extent to which those limitations can be hidden or eliminated. In essence: an analysis tool is useful to business analysts only if business analysts actually use it. Which means that there is a fine line that vendors must walk between delivering technology to business users, and shielding them from it — without going too far and creating an unnecessary vendor dependency.

In this installment and the next, we’ll look at a few advanced features that aren’t necessarily available today, but that should be possible to provide in the future, without crossing that line.

Meta Aggregation

By definition, a spend transaction contains only the “leaf” level of a hierarchy. Consider the following:

    HR Consulting
        Mercer
        Deloitte
    IT Consulting
        IBM
        Accenture
    Management Consulting
        KPMG
        CGI

Low-level transactions typically contain “Mercer” or “CGI,” but not “IT Consulting” or “HR Consulting,” because those intermediate hierarchy positions (“nodes”) represent an artificial organization imposed by the user, and have no reality at the transaction level.

Suppose, though, that I’d like to be able to treat intermediate nodes as though they had reality inside the transaction set itself. Simple example: I’d like to derive a new range dimension based on the top level of the above dimension. I want to know which consolidated groupings are at $0-$100K, which are at $100-$500K, and so on. I don’t care about IBM or KPMG any more; all I care about is aggregating my own groupings.

In mathematical terms, I’m asking for f(g(x)) — the ability to apply dimension derivation to a previous aggregation step; and, inductively and more generally, to do the same to the meta-aggregated dimension itself.

In OLAP implementation terms, I’m asking the engine to treat the intermediate nodes from any dimension, at any hierarchy level, as virtual transaction columns rather than as dimensional nodes. The problem is, intermediate nodes aren’t static; they’re changing all the time. That means a dimension derived on artificial rollup values must be re-derived whenever the hierarchy of the source dimension is altered; and, since hierarchy editing must be a real-time operation (as I have argued in this series and elsewhere), the dimension derivation must also be performed on-the-fly.

Tricky as this might be to implement, the logic is easy to specify from the business user’s perspective. The user simply picks a previously-defined dimension and a hierarchy level on which to base his new dimension, and he’s done.
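
Here is a rough sketch of that user-level logic in pandas, with an invented hierarchy and invented dollar bands; a real engine would do this incrementally inside the OLAP layer rather than by rescanning:

    import pandas as pd

    # Hypothetical transactions, tagged only with leaf-level suppliers.
    tx = pd.DataFrame({
        "supplier": ["Mercer", "Deloitte", "IBM", "Accenture", "KPMG", "CGI"],
        "amount":   [60000.0, 30000.0, 400000.0, 250000.0, 90000.0, 20000.0],
    })

    # g(x): the user-imposed hierarchy, mapping leaves to top-level nodes.
    rollup = {"Mercer": "HR Consulting", "Deloitte": "HR Consulting",
              "IBM": "IT Consulting", "Accenture": "IT Consulting",
              "KPMG": "Management Consulting", "CGI": "Management Consulting"}
    tx["group"] = tx["supplier"].map(rollup)

    # f(g(x)): aggregate by the artificial node, then derive a range
    # dimension from the aggregated totals.
    totals = tx.groupby("group")["amount"].sum()
    bands = pd.cut(totals, bins=[0, 100000, 500000, float("inf")],
                   labels=["$0-$100K", "$100-$500K", "$500K+"])

    # Any edit to 'rollup' invalidates 'bands', which is why the
    # engine must re-derive on the fly when the hierarchy changes.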

Visual Crosstabs

The utility of Shneiderman diagrams (or “treemaps”) to display hierarchical information is well known; the BIQ site has a live example.

The treemap is useful because it is visually intuitive; in this example, the relative sizes of the rectangles represent the relative magnitude of spending. The colors indicate relative change in spending; red is bad, green is good; lighter green is better. Inner rectangles show the breakdown at the next level of the hierarchy.

Clicking inside one of the white-bordered rectangles provides an expanded lower-level view of the hierarchy; clicking the up-arrow button moves back up a level.

Now, suppose that rather than the inner rectangles showing a lower level of the same hierarchy, they instead showed the breakdown of spending within another dimension entirely — i.e., a “visual crosstab.” The visual crosstab would not only show magnitudes, but trends as well.

Unlike with meta aggregation, where the user interface is simple and the implementation complex, here the user interface is complex and the implementation fairly simple. The utility of the visual crosstab will depend strongly on the user interface — for example, how does the user change the resolution of the outer dimension to a different hierarchy level? What might that do to the level of the inner dimension? How might the user invert the view, so that the inner dimension becomes the outer, and the outer becomes the inner? Globally, how can the user be kept aware of what’s being viewed/inverted/clicked, and therefore be able to make sense of the result?

Spend Analysis IV: Defining “Analysis”

Today I’d like to welcome back Eric Strovink of BIQ (acquired by Opera Solutions, rebranded ElectrifAI) who, as I indicated in Part I of this series, is authoring this series on next-generation spend analysis and why it is more than just basic spend visibility. Much, much more!

“No canned report survives first contact with the analyst.”

Analysis = Agility

Reporting on a large transaction dataset is technically challenging. For example, pointing ordinary reporting tools at a large dataset doesn’t work well, because what might seem like a perfectly ordinary and reasonable database query can require minutes to complete, sometimes even hours. That’s why OLAP (“On-Line Analytical Processing”) technology is required in order to return results quickly on large datasets, and that’s why every data warehouse uses some variant of it.

OLAP is not a panacea. OLAP database queries only work within a rigid framework — that is, queries are fast only within the data dimensions and hierarchies that have been pre-defined. To ask a question outside of that rigid framework, and to get an answer to that question in a reasonable amount of time, the underlying dataset structure must be changed — either dimensional hierarchies must be altered, data re-mapped, or entirely new data dimensions created.

Data analysis is an inherently ad hoc process — to paraphrase von Moltke, “no canned report survives first contact with the analyst.” But, in order to be able to perform the OLAP queries that support ad hoc reporting, it is necessary to change the dataset structure to support those queries. And, it had better be possible to do that quickly and easily; otherwise OLAP power cannot be brought to bear on the ad hoc report, which means that the report can’t be generated without great pain.

Analysis therefore equates, in a very real sense, to “agility”; in other words, how quickly and easily one can:

  • generate new dimensions;
  • change existing dimensional hierarchies;
  • map and family new and existing dimensions.
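
As a minimal sketch of what the first two operations amount to (invented data; a real tool would do this through the UI, in real time):

    import pandas as pd

    # Invented transactions.
    tx = pd.DataFrame({
        "supplier": ["Staples", "Dell", "Mercer", "Dell"],
        "amount":   [120.0, 900.0, 400.0, 300.0],
    })

    # (1) Generate a new dimension from a simple mapping rule.
    commodity = {"Staples": "Office Supplies", "Dell": "IT Hardware",
                 "Mercer": "Consulting"}
    tx["commodity"] = tx["supplier"].map(commodity)

    # (2) Change the hierarchy: re-point nodes to new parents and
    # re-aggregate. No schema rebuild, just a remap and a re-query.
    parent = {"Office Supplies": "Indirect", "IT Hardware": "Indirect",
              "Consulting": "Services"}
    tx["category"] = tx["commodity"].map(parent)

    print(tx.groupby(["category", "commodity"])["amount"].sum())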

Agility also applies at a higher level. I argued in Spend Analysis I: The Value Curve that the notion of one dataset for spending data is limiting, because many different analysis views — especially commodity-specific views — can be key to driving additional value. If the spend analysis process involves the creation of multiple datasets over time, and it’s hard or expensive to build or modify datasets, then that process can’t move forward.

Agility also requires that the above operations be performed by business users with limited IT skills, on their own, without assistance from vendor or internal experts. If the system is not agile, then the default decision is not to analyze, as pointed out in Spend Analysis II: The Psychology of Analysis. That is the worst possible outcome for the enterprise, because it perpetuates information starvation in a land of data plenty.

Analysis = Speed

Here’s a heretical statement, coming from a spend analysis vendor: anything that a spend analysis system does for you can be done with ordinary tools. You can use a database system to load a large dataset; you can cleanse your own data by writing database queries; you can write programs to build reports; you can dump data to pivot tables. You can get great answers to your questions. Some old-school sourcing consultants still use manual methods like these, and some home-grown spend analysis systems built around tools like Microsoft Access are still operating today.
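
For instance, here is the sort of thing the manual route looks like in miniature, using pandas in place of Access (data invented):

    import pandas as pd

    # An invented raw AP extract, the kind you'd otherwise dump to Excel.
    tx = pd.DataFrame({
        "supplier": ["Acme", "Acme", "Globex", "Globex"],
        "month":    ["Jan", "Feb", "Jan", "Feb"],
        "amount":   [100.0, 110.0, 250.0, 240.0],
    })

    # One pivot, built by hand. Correct, but every new question means
    # another round of manual slicing.
    pivot = tx.pivot_table(index="supplier", columns="month",
                           values="amount", aggfunc="sum", fill_value=0.0)
    print(pivot)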

However, if you do use a modern spend analysis system, you can produce those same pivot tables and reports with a few mouse-clicks; and, you can alter their properties and constraints with slice-and-dice operations easily and quickly. For every report that the old-school consultant generates, you’ll have had the opportunity to generate hundreds. Does this mean that your insights will be better than those of the consultant? Not necessarily; but it’s hard to argue that they shouldn’t be.

If your spend analysis system isn’t agile, though, you’ll be back in the same boat as the crusty old consultant, and he’ll be laughing at you. You’ll have to extract transactions from the system and hack at them with the same tools that the consultant uses, with the same productivity loss.

Analysis = Power

It’s important to distinguish between ad hoc reporting and reporting in general. Does the spend analysis system have the ability to produce complex and custom reports, guided by you? Or are its reports written in some programming language like Java or C++, the source code for which is inaccessible to you and unmodifiable by anyone but the vendor?

Analysis power is precisely the power that you wield as a business user, independent of canned reports supplied by a vendor. Complex, multi-page reports such as the original MMG Commodity Spending Report (below), variants of which are now commonplace across the e-sourcing space, should be within your reach to create quickly and easily — without any programming, database queries, or other IT magic, and yet with full flexibility to build whatever it is that you need.


< Graphic No Longer Available >

Next: Spend Analysis V: New Horizons (Part 1)

Spend Analysis III: Common Sense Cleansing

Today I’d like to welcome back Eric Strovink of BIQ (acquired by Opera Solutions, rebranded ElectrifAI) who, as I indicated in Part I of this series, is authoring this series on next-generation spend analysis and why it is more than just basic spend visibility. Much, much more!

Many observers would acknowledge that there’s not a lot of difference between viewing cleansed spend data with SAP BW or Cognos or Business Objects, and viewing cleansed spend data with a custom data warehouse from a spend analysis vendor. They’re all OLAP data warehouses; they all have competent data viewers; they all provide visibility into multidimensional data. What has historically differentiated spend analysis from BI systems is the cleansing process itself (along with, in contrast to the BI view, the decoupling of data dimensions from the accounting system).

Because it’s hard to distinguish one data warehouse from another, cleansing has become an important differentiator for many spend analysis vendors. The vendor has typically developed a viewpoint as to the relative merits of manual labor/offshore resources, automated tools, custom databases, and so on, and sells its SA product and services around that viewpoint. Unfortunately, all the resulting hype and focus on cleansing services, from both these vendors and the analysts who follow them, has obscured a simple reality — namely, that effective data cleansing methods have been around for years, are well understood, and are easy to implement.

The basic concept, originated and refined by various consultants and procurement professionals during the early-to-mid-1990s, is to build commodity mapping rules for top vendors and top GL codes (“top” meaning ordered top-down by spending) — in other words, to apply common-sense 80-20 engineering principles to spend mapping. GL mapping catches the “tail” of the spend distribution, albeit approximately; vendor mapping ensures that the most important vendors are mapped correctly; and a combination of GL and vendor mapping handles the case of vendors who supply multiple commodities. If more accuracy is needed, one simply maps more of the top GLs and vendors. Practitioners routinely report mapping accuracies of 95% and above. More importantly, this straightforward methodology enables sourcers to achieve good visibility into a typical spend dataset very quickly, which in turn allows them to focus their spend management efforts (and further cleansing) on the most promising commodities.
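
A minimal sketch of such a rule cascade, with invented vendors, GL codes, and commodity names, might look like this:

    import pandas as pd

    # Invented transactions: vendor, GL code, amount.
    tx = pd.DataFrame({
        "vendor":  ["Dell", "Dell", "Kelly Services", "Joe's Cafe", "Acme"],
        "gl_code": [6100, 6200, 6300, 6400, 6500],
        "amount":  [900.0, 50.0, 400.0, 12.0, 7.0],
    })

    # Rules in priority order: vendor+GL first (multi-commodity vendors),
    # then vendor, then GL; whatever falls through stays in the tail.
    vendor_gl_rules = {("Dell", 6200): "Software"}
    vendor_rules    = {"Dell": "IT Hardware", "Kelly Services": "Contract Labor"}
    gl_rules        = {6400: "Office Supplies"}

    def classify(row):
        return (vendor_gl_rules.get((row["vendor"], row["gl_code"]))
                or vendor_rules.get(row["vendor"])
                or gl_rules.get(row["gl_code"])
                or "Unmapped")

    tx["commodity"] = tx.apply(classify, axis=1)
    print(tx.groupby("commodity")["amount"].sum())

Mapping more of the top vendors and GLs is just adding entries to these tables, which is exactly what makes the 80-20 approach incremental.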

Is it necessary to map every vendor? Almost never, although third-party vendor mapping services are readily available if you need them. And, as far as vendor familying is concerned, grouping together multiple instances of the same vendor clears up more than 95% of the problem. Who-owns-whom familying using commercial databases seldom provides additional insight; besides, inside buyers are usually well aware of the few relationships that actually matter. For example, you won’t get any savings from UTC by buying from both Carrier and Otis Elevator. And it would be a mistake to group Hilton Hotels under their owners, since they are all franchisees.
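
Grouping instances together is mostly mechanical; here is a tiny sketch of name normalization (the suffix list and examples are invented):

    import re
    import pandas as pd

    vendors = pd.Series(["IBM Corp", "I.B.M.", "IBM CORPORATION",
                         "Otis Elevator", "Carrier Corp"])

    def family(name):
        # Strip punctuation and common legal suffixes, then uppercase;
        # enough to collapse most duplicate instances of one vendor.
        n = re.sub(r"[^\w\s]", "", name).upper()
        return re.sub(r"\b(CORP|CORPORATION|INC|LLC|LTD|CO)\b", "", n).strip()

    print(vendors.map(family))
    # All three IBM variants collapse to "IBM"; Otis and Carrier stay
    # separate, which (per the UTC example above) is usually correct.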

[N.B. There are of course cases where insufficient data exist to use classical mapping techniques. For example, if the dataset is limited to line item descriptions, then phrase mapping is required; if the dataset has vendor information only, then vendor mapping is the only alternative. Commodity maps based on insufficient data are inaccurate commodity maps, but they are better than nothing.]

80-20 logic also applies to the overall spend mapping problem. Consider a financial services firm with an indirect spend base. Before even starting to look at the data, every veteran sourcer knows where to start looking first for potential savings: contract labor, commercial print, PCs and computing, and so on. Here is a segment of the typical indirect spending breakdown, originally published by The Mitchell Madison Group:

< Graphic No Longer Available >

If you have limited resources, it can be counterproductive to start mapping commodities that likely won’t produce savings, when good estimates can often be made as to where the big hits are likely to be. If you can score some successes now, there will be plenty of time to extend the reach of the system later. If there are sufficient resources to attack only a couple of commodities, it makes sense to focus on those commodities alone, rather than to attempt to map the entire commodity tree.

The bottom line is that data cleansing needn’t be a complex, expensive, offline process. By applying common sense to the cleansing problem, i.e. by attacking it incrementally and intelligently over time, mapping rules can be developed, refined, and applied when needed. In fact, whether you choose to have an initial spend dataset created by outside resources, or you decide to create it yourself, the conclusion is the same: cleansing should be an online, ongoing process, guided by feedback and insight gleaned directly (and incestuously) from the powerful visibility tools of the spend analysis system itself. And, as a corollary, cleansing tools must be placed directly into the hands of purchasing professionals so that they can create and refine mappings on-the-fly, without any assistance from vendors or internal IT experts.

Next: Defining “Analysis”