Today’s post is from Eric Strovink of BIQ.
A “measure” is a quantity that’s computed for you in an analysis dataset — spend, or count of transactions, for example. A dataset can contain many measures: the same spend expressed in multiple currencies, say, or entirely different quantities such as number of units.
Measures are derived from the supplied data and rolled up (typically summed) to hierarchical totals. Sometimes, however, you need control over how (and when) a measure is calculated. Such measures are termed “user-defined” measures.
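To make “rolled up to hierarchical totals” concrete, here is a minimal sketch (the transactions and category paths are invented for illustration): each transaction credits its spend to every ancestor node on its category path, so every drillpoint carries its own total.

```python
from collections import defaultdict

# Invented sample transactions, each tagged with a category path.
transactions = [
    {"category": ["Office", "Paper"], "spend": 120.0},
    {"category": ["Office", "Toner"], "spend": 80.0},
    {"category": ["IT", "Laptops"], "spend": 3000.0},
]

totals = defaultdict(float)
for txn in transactions:
    # Credit the transaction to every node on its path, from the top
    # of the hierarchy down to the leaf.
    path = txn["category"]
    for depth in range(1, len(path) + 1):
        totals[tuple(path[:depth])] += txn["spend"]

print(totals[("Office",)])          # 200.0
print(totals[("Office", "Paper")])  # 120.0
```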
Let’s first dispense with the usual definition of user-defined measures — namely, eye candy that has no reality outside of the current display window. You can identify eye candy by looking for the little asterisk in the User Manual that says “certain operations” aren’t possible on a user-defined measure. That’s the tip-off that the tool isn’t really creating the measure at all — it’s just computing it on the fly, as needed, for the current display. In order to be truly useful, user-defined measures must have reality at all drillpoints (all “nodes”) in the dataset, at all times, so they can be used freely in analysis functions, just like “regular” measures. It’s no wonder that many “analysis” products avoid performing the millions of computations required to do this properly, preferring instead to do the handful of computations required to pass casual inspection during the sales process. You’ll discover once you dive into the product that its “user-defined” measures are useless; but by then it’s too late.
There are two kinds of user-defined measures:
(1) post-facto computations that are performed after transactional roll-up (the usual definition), and
(2) those that are performed during transactional roll-up, which we’ll consider here.
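A post-facto measure (kind 1) can be computed entirely from a node’s already-rolled-up totals. As a hedged sketch — the field names are invented — an average unit price derived from the spend and unit-count measures mentioned earlier would look like this:

```python
# Node totals that have already been rolled up from transactions.
node_totals = {"spend": 500.0, "units": 200}

def avg_unit_price(node):
    # Computed after roll-up, from the node's totals alone -- no need
    # to revisit individual transactions.
    return node["spend"] / node["units"] if node["units"] else 0.0

print(avg_unit_price(node_totals))  # 2.5
```

Measures of kind 2 don’t have this luxury: as the example below shows, they must make a decision at every transaction, before the totals exist.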
[Screenshot: dataset drilldown showing the Plan1 and Plan2 savings measures alongside total spend]
In the above example, two savings scenarios are defined: “Plan1”, a 10% savings scenario, and “Plan2”, a 20% savings scenario. These plans are complex because they model a real savings plan: the reduction applies only to spend with certain vendors (24 of 30,000), and only in certain commodity categories. Thus, as the numbers show, the savings figures are never simply 10% or 20% of a node’s total (and at some drillpoints the total isn’t reduced at all), because only the transactions that match the filter are discounted.
So how was this done? In order to compute accurate Plan1 and Plan2 amounts at every drillpoint (i.e. every line item in every dimension), the filter criteria must be applied to each transaction as it is being considered for roll-up. And, since the percentage is likely a dynamic parameter (able to be changed by the user in real time), and since the filter is likely also to be dynamic (“I would like to add (subtract) this vendor or commodity to (from) the filter”), the cube can’t be “pre-computed” as many OLAP systems do. In fact, the roll-up has to occur in real time, from scratch; and it has to involve decision-making at every transaction. Here is the fragment of decision-making code that computes the Plan1 measure:
Note that this fragment resembles a real program (because it is), and it could be arbitrarily complex (because it might need to be). However, it was built by a user (with aid from an integrated program development environment), and it is compiled (on the fly, in real time) by the system into custom p-code[1] that executes extremely quickly[2]. The result is two additional measures that are calculated without noticeable delay.
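Since the fragment itself appeared as an image in the original post, here is a hypothetical Python stand-in for the kind of per-transaction decision-making described — not BIQ’s actual measure language. The parameter names echo the Filterset and Plan1SavingsPercent parameters discussed below; the vendors, commodities, and transactions are invented.

```python
# Invented filter parameters (in BIQ these would live in the cube as
# user-adjustable parameters, e.g. User.NewFamily.Filterset).
filterset = {
    "vendors": {"Acme Corp", "Globex"},       # stand-ins for the 24 vendors
    "commodities": {"Office Supplies"},
}
plan1_savings_percent = 0.10  # dynamic, user-adjustable

def plan1_amount(txn):
    # Called for every transaction during roll-up: spend matching the
    # filter is reduced by the savings percentage; everything else
    # passes through unchanged.
    if (txn["vendor"] in filterset["vendors"]
            and txn["commodity"] in filterset["commodities"]):
        return txn["spend"] * (1.0 - plan1_savings_percent)
    return txn["spend"]

txns = [
    {"vendor": "Acme Corp", "commodity": "Office Supplies", "spend": 100.0},
    {"vendor": "Initech",   "commodity": "Office Supplies", "spend": 100.0},
]
print(sum(plan1_amount(t) for t in txns))  # 190.0, not 10% off the total
```

Because only the first transaction matches the filter, the Plan1 total is 190.0 rather than 180.0 — exactly the behavior described above, where savings are never a flat percentage of the node total.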
Although it might be too much to expect a non-technical user of a spend analysis system to produce a code fragment such as the above, the cube can nevertheless be delivered to that user with the Plan1 and Plan2 measures already in place, allowing the user to alter both the filter parameters (“User.NewFamily.Filterset”) and the savings percentages (“User.VendorSpendingReduction1.Plan1SavingsPercent”) without having to understand or modify the code fragment in any way.
Next installment: User-Defined Measures, Part 2, in which I show how the “simple” case of post-facto user-defined measures can yield surprising and interesting results when combined with another critical concept, dynamic reference filters.
Previous Installment: Crosstabs Aren’t “Analysis”
[1] The p-code instructions in this case are designed to maximize performance while minimizing instruction count.
[2] BIQ executes 50 million p-code instructions per second on an ordinary PC.