Category Archives: Guest Author

Spend Analysis V: User-Defined Measures, Part 2

Today’s post is from Eric Strovink of BIQ.

Sometimes you want and need control over how (and when) measures are calculated. Such measures are termed “user-defined” measures.

As we saw previously, there are two kinds of user-defined measures:
(1) post-facto computations that are performed after transactional roll-up (the usual definition), which we’ll consider here, and
(2) those that are performed during transactional roll-up, which were covered in Part 1.

In the above example the gray “Ref” columns are filtered on commodity count/spend in Q1 2003, and the normal column is filtered on commodity count/spend in Q1 2004. If we then additionally filter on four business units:

we can now see the quarter-on-quarter comparison for just those business units:

You can see that two filter operations occur every time the dataset is filtered: one for the regular filter above, and one for the reference filter, modified by any non-conflicting filter operations. This “dynamic” reference filtering can be quite powerful, since the relationship between the two periods is now available at any filter position in the dataset.
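To make the mechanics concrete, here is a minimal Python sketch of the idea (my own illustration with made-up field names, not BIQ’s implementation): each filter action is applied twice, once against the regular filter and once against the reference filter, and the reference filter’s own conditions win any conflict.

from dataclasses import dataclass, field

@dataclass
class FilterSet:
    conditions: dict = field(default_factory=dict)  # e.g. {"period": "2003-Q1"}

    def merged_with(self, user_filter: dict) -> dict:
        # Non-conflicting user conditions are added; conflicting ones
        # (same key) are overridden by this filter set's own conditions.
        return {**user_filter, **self.conditions}

def total_spend(transactions, conditions):
    return sum(t["amount"] for t in transactions
               if all(t.get(k) == v for k, v in conditions.items()))

transactions = [
    {"period": "2003-Q1", "business_unit": "BU-1", "amount": 120.0},
    {"period": "2004-Q1", "business_unit": "BU-1", "amount": 150.0},
    {"period": "2004-Q1", "business_unit": "BU-2", "amount": 90.0},
]

current = FilterSet({"period": "2004-Q1"})    # the normal columns
reference = FilterSet({"period": "2003-Q1"})  # the gray "Ref" columns
user_drill = {"business_unit": "BU-1"}        # the additional filter

print(total_spend(transactions, current.merged_with(user_drill)))    # 150.0
print(total_spend(transactions, reference.merged_with(user_drill)))  # 120.0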

Now, let’s add a post-rollup (“nodal”) computed measure that calculates the %difference between these columns. The code reads like this:

$%Diff$ = ($Amount$-$RefFilter4.Amount$)/$RefFilter4.Amount$ * 100;

Now, if we sort top-down by %Diff, we can see quite clearly the quarter-on-quarter difference, sorted from worst to best, considering just the four business units above:

This percentage difference is available to all analysis modules, because it is calculated at every node, not just at the nodes that are currently being displayed.
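As a rough illustration of what “calculated at every node” means, here is a hypothetical Python fragment (node names and field names are mine, not BIQ’s) that evaluates the %Diff formula above at every node and then sorts the nodes from worst to best:

nodes = {
    "BU-1": {"amount": 150.0, "ref_amount": 120.0},
    "BU-2": {"amount": 90.0,  "ref_amount": 140.0},
    "BU-3": {"amount": 60.0,  "ref_amount": 60.0},
}

# Evaluate the measure at every node, not just the ones on screen.
for node in nodes.values():
    ref = node["ref_amount"]
    node["pct_diff"] = (node["amount"] - ref) / ref * 100 if ref else None

# Quarter-on-quarter change, sorted from worst to best.
for name, node in sorted(nodes.items(), key=lambda kv: kv[1]["pct_diff"]):
    print(f"{name}: {node['pct_diff']:+.1f}%")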

Next installment: Meta Dimensions; but I’ll take a few weeks off before diving back in!

Previous Installment: User-Defined Measures, Part 1


Spend Analysis IV: User-Defined Measures, Part 1

Today’s post is from Eric Strovink of BIQ.

A “measure” is a quantity that’s computed for you in an analysis dataset — for example, spend, count of transactions, and so on. There could be many measures in a dataset, such as multiple currencies, or entirely different quantities such as number of units.

Measures are derived from the data supplied, and rolled up (typically summed) to hierarchical totals. Sometimes, however, you want and need control over how (and when) the measure is calculated. Such measures are termed “user-defined” measures.

Let’s first dispense with the usual definition of user-defined measures — namely, eye candy that has no reality outside of the current display window. You can identify eye candy by looking for the little asterisk in the User Manual that says “certain operations” aren’t possible on a user-defined measure. That’s the tip-off that the tool isn’t really creating the measure at all — it’s just computing it on the fly, as needed, for the current display. In order to be truly useful, user-defined measures must have reality at all drillpoints (all “nodes”) in the dataset, at all times, so they can be used freely in analysis functions, just like “regular” measures. It’s no wonder that many “analysis” products avoid performing the millions of computations required to do this properly, preferring instead to do the handful of computations required to pass casual inspection during the sales process. You’ll discover once you dive into the product that its “user-defined” measures are useless; but by then it’s too late.

There are two kinds of user-defined measures:
(1) post-facto computations that are performed after transactional roll-up (the usual definition), and
(2) those that are performed during transactional roll-up, which we’ll consider here.


[Screenshot: dataset with the Plan1 and Plan2 savings measures]

In the above example there are two savings scenarios, “Plan1” and “Plan2”. Plan 1 is a 10% savings scenario, and Plan 2 is a 20% savings scenario. However, this is a realistic savings plan, so it is complex: it applies only to spend with certain vendors (24 of 30,000), and only in certain commodity categories. Thus, as you can see from the numbers, the savings are never a flat 10% or 20% of the total (and sometimes there is no reduction at all), because only the qualifying transactions are discounted.

So how was this done? In order to compute accurate Plan1 and Plan2 amounts at every drillpoint (i.e. every line item in every dimension), the filter criteria must be applied to each transaction as it is being considered for roll-up. And, since the percentage is likely a dynamic parameter (able to be changed by the user in real time), and since the filter is likely also to be dynamic (“I would like to add (subtract) this vendor or commodity to (from) the filter”), the cube can’t be “pre-computed” as many OLAP systems do. In fact, the roll-up has to occur in real time, from scratch; and it has to involve decision-making at every transaction. Here is the fragment of decision-making code that computes the Plan1 measure:

if (FastCompare(User.NewFamily.Filterset))
    addto($Plan1$, $TransMeasure.Amount$*(100-User.VendorSpendReduction1.Plan1SavingsPercent)/100);
else
    addto($Plan1$, $TransMeasure.Amount$);

Note that this fragment resembles a real program (because it is), and it could be arbitrarily complex (because it might need to be). However, it was built by a user (with aid from an integrated program development environment), and it is compiled (on the fly, in real time) by the system into custom p-code1 that executes extremely quickly2. The result is two additional measures that are calculated without noticeable delay.

Although it might be too much to expect a non-technical user of a spend analysis system to produce a code fragment such as the above, the cube can nevertheless be delivered to that user with the Plan1 and Plan2 measures in place, allowing him or her to alter both the filter parameters (“User.NewFamily.Filterset”) and the savings percentages (“User.VendorSpendReduction1.Plan1SavingsPercent”) without having to understand or modify the code fragment in any way.
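For readers who prefer a general-purpose language, here is a rough Python sketch of the same transaction-level decision-making (my own illustration; the vendor names, field names, and filter structure are hypothetical, and the real engine compiles to p-code rather than interpreting Python):

def matches(txn, filterset):
    # Stand-in for FastCompare(): does this transaction hit the scenario filter?
    return txn["vendor"] in filterset["vendors"] and txn["commodity"] in filterset["commodities"]

def roll_up(transactions, filterset, plan1_savings_pct):
    totals = {"Amount": 0.0, "Plan1": 0.0}
    for txn in transactions:
        totals["Amount"] += txn["amount"]
        if matches(txn, filterset):
            # Qualifying spend is discounted by the user-settable percentage.
            totals["Plan1"] += txn["amount"] * (100 - plan1_savings_pct) / 100
        else:
            totals["Plan1"] += txn["amount"]
    return totals

filterset = {"vendors": {"Acme Corp"}, "commodities": {"Office Supplies"}}
transactions = [
    {"vendor": "Acme Corp", "commodity": "Office Supplies", "amount": 1000.0},
    {"vendor": "Globex",    "commodity": "IT Hardware",     "amount": 2000.0},
]
print(roll_up(transactions, filterset, plan1_savings_pct=10))
# {'Amount': 3000.0, 'Plan1': 2900.0}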

Next installment: User-Defined Measures, Part 2, in which I show how the “simple” case of post-facto user-defined measures can yield surprising and interesting results when combined with another critical concept, dynamic reference filters.

Previous Installment: Crosstabs Aren’t “Analysis”

1 The p-code instructions in this case are designed to maximize performance while minimizing instruction count.
2 BIQ executes 50M p-code instructions per second on an ordinary PC.


Spend Analysis III: Crosstabs Aren’t “Analysis”

Today’s post is from Eric Strovink of BIQ.

Pivot tables are great, no question about it. They’re also a pain in the neck to build, so any tool that builds a crosstab automatically is helpful. But crosstabs are where many “spend analysis” systems declare victory and stop.

Pivot tables are most useful when built in large clusters. Hundreds of them, for example, booked automatically down some dimension of interest (like Cost Center by Vendor, booked by Commodity). They’re also best when they’re created dynamically, inserted into existing Excel models, with the raw data readily available for secondary and tertiary analyses.
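As a sketch of what a “cluster” of pivot tables might look like in practice, here is a hedged pandas example (the column names and output file are my own assumptions, not a prescription): one Cost Center by Vendor crosstab per Commodity, each written to its own sheet of a workbook that can then feed an existing Excel model.

import pandas as pd

df = pd.DataFrame({
    "commodity":   ["Office Supplies", "Office Supplies", "IT Hardware"],
    "cost_center": ["CC-100", "CC-200", "CC-100"],
    "vendor":      ["Acme Corp", "Acme Corp", "Globex"],
    "amount":      [1200.0, 800.0, 5000.0],
})

# Writing .xlsx requires an Excel engine such as openpyxl to be installed.
with pd.ExcelWriter("pivot_cluster.xlsx") as writer:
    for commodity, group in df.groupby("commodity"):
        pivot = pd.pivot_table(group, values="amount",
                               index="cost_center", columns="vendor",
                               aggfunc="sum", fill_value=0)
        # Excel caps sheet names at 31 characters.
        pivot.to_excel(writer, sheet_name=str(commodity)[:31])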

It’s also useful to see a breakdown of all dimensions by a single dimension — i.e., hundreds of thousands, even millions, of pivot table cells calculated automatically on every drill. For example, here’s MWBE spend, broken down against each of 30,000 nodes, re-calculated on every drill.


[Screenshot: MWBE spend broken down against each of 30,000 nodes]

Not to belabor the point, but there’s a big difference between (1) dumping a single crosstab to an HTML page, and (2) inserting hundreds of pivot tables into an existing Excel model, or calculating 120,000 crosstab cells automatically on every drill. The former is interesting. The latter supports serious analysis.

Are pivot tables the most useful way to present multidimensional data? Often, they aren’t. The Mitchell Madison Group’s Commodity Spend Report books top GL accounts, top Cost Centers, and top Vendors by Commodity, with a monthly split of spend, showing share of total category spend at each display line. Is a static report like this one “analysis?” No, of course not. But in this case the multi-page “report” isn’t static at all. It was built with a simple extract from the dataset, inserted directly into a user-defined Excel model. The output is trivially alterable by the user, putting analysis power directly into his or her hands. For example, with a simple tweak the report could just as easily be booked by Vendor, showing Commodity, Cost Center, and so on — or adapted to an entirely different purpose.

What about matching externally-derived benchmarks to internal data? Is it useful to force-fit generic commodity benchmark data into an A/P dataset, as some spend analysis vendors try to do, and pretend that actionable information will result? Or is it more productive to load relevant and specific benchmark data into a flexible Excel model that you control, and into which you insert actuals from the dataset? The former approach impresses pie-in-the-sky analysts and bloggers. The latter approach produces concrete multi-page analyses, like this, that demonstrate how “best price” charged for an SKU, per contract, might not be “best price” after all (who ever heard of a PC whose price was flat for 12 months?).1


[Screenshot: SKU price versus contract “best price” over 12 months]

Next installment: User-Defined Measures

Previous Installment: Why Data Analysis Is Avoided

1 This example is based on disguised but directionally accurate data. A similar analysis on actual data identified hundreds of thousands of dollars in recoverable overcharges.


Spend Analysis II: Why Data Analysis Is Avoided

Today’s post is from Eric Strovink of BIQ.

If I have learned one thing during my career as a software developer and software company executive, it’s this: contrary to what I believed when I was a know-it-all 20-something, there are a lot of clever people in the world. And clever people make smart decisions (for example, reading this blog, which thousands do every day).

One of those decisions is the decision NOT to perform low-probability ad hoc data analysis. It’s a sensible decision. Sometimes it’s based on empirical study and hard-won experience, and sometimes it’s a gut feel; but either way, the decision has a strong rational basis. It’s just not worthwhile.

A picture is helpful:


[Chart: expected value of an ad hoc analysis versus days required and probability of success]

The above shows the expected value of an ad hoc analysis of a $100K savings opportunity. On the X axis is the number of days required to prepare and analyze the data; on the Y axis, the probability that the analysis will be fruitful. I chose a $700 opportunity cost per analyst-day; choose your own number, it doesn’t really matter.

Note that the graph is mostly “underwater”; that is, the expected value of the analysis is largely negative. Unless the probability of success is quite high, or the time taken to perform the analysis is quite low, it’s simply not a good plan to undertake it.
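If you want to reproduce the arithmetic behind the chart, a back-of-the-envelope Python version (my reading of the graph, using the same $100K opportunity and $700 per analyst-day figures) looks like this:

def expected_value(prob_success, days, opportunity=100_000, cost_per_day=700):
    # Expected payoff of the hunch minus the analyst-days spent chasing it.
    return prob_success * opportunity - days * cost_per_day

for days in (5, 20, 60):
    for p in (0.05, 0.20, 0.50):
        print(f"{days:>3} days, p={p:.2f}: ${expected_value(p, days):>10,.0f}")
# e.g. 60 days at a 20% chance of success comes out to -$22,000: underwater.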

We are all faced with the decision, from time to time, whether or not to explore a hunch. However, an analyst can only work about 220 days per year, and sending a key analyst off on a wild goose chase could be a serious setback. So it’s a risky decision, we don’t make it, and our hunches remain hunches forever.

But what if it wasn’t risky at all?

Nothing can be done about the “probability that the analysis will be fruitful”; that’s fixed. But plenty can be done about the “number of days required to prepare and analyze data.” Suppose a dataset could be built in 5 minutes and analyzed in under an hour. That turns the expected value of speculative analysis sharply positive, and suddenly it is a very good idea indeed to perform ad hoc analysis of all kinds.

And that’s good news. Because there is a ton of useful data floating around the average company that nobody ever looks at. Top of the list? Invoice-level detail, from which all kinds of interesting conclusions can be drawn. Try this experiment: acquire invoice-level (PxQ) data from a supplier with whom you have a contract. Dump it into an analysis dataset, and chart price point by SKU over time. Chances are, like most companies, you’ll find something very wrong, such as prices all over the map for the same SKU (ironically, sometimes this happens even if you have an e-procurement system that’s supposed to prevent it). If you have a contract, only one of those prices is correct; the rest are not, and represent money on the table that you can recover trivially.
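A minimal pandas sketch of that experiment might look like the following (the column names are assumptions; adapt them to whatever the supplier’s invoice extract actually contains). Rather than charting price over time, it simply ranks SKUs by price spread, which surfaces the same problem:

import pandas as pd

invoices = pd.DataFrame({
    "sku":        ["PC-1000", "PC-1000", "PC-1000", "TONER-42"],
    "invoice_dt": pd.to_datetime(["2009-01-15", "2009-04-02", "2009-07-19", "2009-03-01"]),
    "unit_price": [899.0, 949.0, 874.0, 61.5],
    "quantity":   [10, 4, 7, 25],
})

# Price spread per SKU: a wide spread on a contracted item is recoverable money.
price_spread = (invoices.groupby("sku")["unit_price"]
                        .agg(["min", "max", "nunique"])
                        .assign(spread=lambda d: d["max"] - d["min"])
                        .sort_values("spread", ascending=False))
print(price_spread)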

Of course, please don’t spend weeks or months on the exercise, because then it won’t pay off. Instead, find yourself a data analysis tool with which you can move quickly and efficiently — or a services provider who can use the tool efficiently for you (and thus make a contingency-based analysis worthwhile for both of you). Bottom line: if you can’t build a dataset by yourself, in minutes, you’ll end up underwater, just like the graph.

Next installment: Crosstabs Aren’t “Analysis”

Previous Installment: It’s the Analysis, Stupid


Spend Analysis I: “It’s The Analysis, Stupid.”

Today’s post is from Eric Strovink of BIQ.

James Carville is not my favorite person, but he’s a funny man. And the above bastardization of his (in)famous Clinton campaign quote seems quite apropos, given the current frenetic level of marketing activity around “spend analysis” (I’m always amused by vendors using this term, because… excuse me for asking… where’s the “analysis”?).

So why is there so much spend analysis marketing activity, all of a sudden? I suspect it’s “Oracle Terror”. For the last nine years, I’ve watched spend analysis vendors promote their “product” — typically a service masquerading as a product — using the same tired strategy: “We classify data better than [those other guys].” Problem is, when you spend so much time and effort dumbing down spend analysis to a simple-minded premise, you open the door for almost anyone, even a sleepy ERP vendor, to steal your lunch. And that’s exactly what has happened. Oracle has neatly synthesized all of the “classification” messages together, packaging them up with some Silicon Valley marketing magic, and the legacy spend analysis vendors are in a panic. You’re absolutely right, folks, Oracle’s messaging is better than yours. Smarter, more sophisticated, priced innovatively — it’s both ironic and funny. The only surprise is that this didn’t happen years ago.

But here’s the point: real spend analysis is so much more than classification that the whole classification discussion is absurd. It has always been absurd. Classification-centrism is the Titanic of spend analysis, aiming squarely at a snowball on the top of the iceberg while completely ignoring the massive value beneath. Nevertheless, relentless classification-oriented marketing over many years has warped end-user perceptions, and carried analysts right along with it. Current analyst firm surveys devote over 90% of their attention to classification questions, Pandit’s hopelessly off-target book (previously dissected and dismissed by Sourcing Innovation) is garnering new attention, and so on.

My iconoclastic point of view has been outlined in these (and other) pages before, but put very simply, it’s this: Classification is easy. Armed with appropriate tools, any intelligent person (your admin, for example) can be trained to do it effectively, in about an hour; and the rules they generate can be applied automatically to new transactions, forever after. When you stop to consider that sourcing consultants have been performing effective spend analysis for years, using nothing more than pencil and paper, it’s obvious that the classification Emperor really doesn’t have any clothes.1

In fact, true value lies in the analysis that you perform. Value is about results, and results come from analysis, not from a data classification process that is just a baby step toward value realization, and one that may not even be relevant. For example, consider that spend classification is really only useful for A/P data. There are many higher-value sources of data lying around, and many datasets can be built from them. In most of those datasets, classification has no place at all. By the way, how many spend datasets do you plan on building? One? Just on A/P data? Then you are missing out on value, by a wide margin.

In this series, I’ll discuss the requirements for ad hoc data analysis, and the very real value that results from it. Spend analysis, at the end of the day, is just data analysis; so it’s critical that your data analysis tools provide the necessary power and flexibility to make you successful.

Next installment: Why Data Analysis Is Avoided

1 Ironically, based on the datasets we’ve seen from customers who have walked away from their classification-centric vendors, talking a great game on classification doesn’t necessarily mean delivering great classification.
