Spendata: A True Enterprise Analytics Solution

As we indicated in our last article, while Spendata is the absolute best at spend analysis, it’s not just a spend analysis platform. It’s a general-purpose data analytics platform that can be used for much more than spend analysis.

The current end-state vision for business data analytics is a “data lake” database with a BI front end. The Big X consultancies (aided and abetted by your IT department, which is only too eager to implement another big system) will try to convince you of the data paradise you’ll have if you dump all of your business data into a data lake. Unfortunately, reality doesn’t support the vision, because organizational data is created only to the extent necessary, never verified, riddled with errors from day one, and left to decay over time as it’s never updated. The data lake is ultimately a data cesspool.

Pointing a BI tool at the (dirty) lake will spice up the data with bars, pies, waves, scatters, multi-coloured geometric shapes, and so on, but you won’t find much insight other than the realization that your data is, in fact, dirty. Worse, a published BI dashboard is like a spreadsheet you can’t modify. Try mapping new dimensions, creating new measures, adding new data, or performing even the simplest modification of an existing dimension or hierarchy, and you’ll understand why this author likes to point out that BI should actually stand for Bullsh!t Images, not Business Intelligence.

So how does a spend analysis platform like Spendata end up being a general-purpose data analytics tool? The answer is that the mechanisms and procedures associated with spend analysis and spend analysis databases, specifically data mapping and dimension derivation, can be taken to the next level — extended, generalized, and moved into real time. Once those key architectural steps are taken, the system can be further extended with view-based measures, shared cubes where custom modifications are retained across refreshes, and spreadsheet-like dependencies and recalculation at database scale.

The result is an analysis system that can be adapted not only to any of the common spend analysis problems, such as AP/PO analysis or commodity-specific cubes with item level price X quantity data, but also to savings tracking and sourcing and implementation plans. Extending the system to domains beyond spend analysis is simple: just load different data.
The bottom line is that to do real data analysis, no matter what the domain, you need:

  • the ability to extend the schema at any time
  • the ability to add new derived dimensions at any time
  • the ability to change mappings at any time
  • the ability to build derivations, data views, and mappings that are dependent on other derivations, mappings, views, inputs, linked datasets, and so on, with real-time “recalc”
  • the ability to create new views and reports relevant to the question you have … without dumping the data to Excel
  • … and preserve all of the above on cube data refreshes
  • … in your own copy of the cube so you don’t have to wait for anyone to agree
  • … and get an answer today, not on the next refresh next month when you’ve forgotten why you even had the question in the first place

You don’t get any of that from a spend analysis solution, or a BI solution, or a database pointing at a data lake. You only get that in a modern data analysis solution — which supports all of the above, and more, for any kind of data. A data analysis system works equally well across all types of numeric or set-valued data, including, but not limited to sales data, service data, warranty data, process data, and so on.

As Spendata is a real data analysis solution, it supports all of these analyses with a solution that’s easier and friendlier to use than the spreadsheet you use every day. Let’s walk through some examples so you can understand what a data analysis solution really can do.


Spending data consists of numerical amounts that represent the price, tax, duty, shipping, etc. paid for items purchased. Sales data is numerical amounts that represent the price, tax, duty, shipping, etc. paid for items sold.

They are basically the inverse of each other. For every purchase, there is a sale. For every sale, there is a purchase. So, there’s absolutely no reason that you shouldn’t be able to apply the exact the same analysis (possibly in reverse) to sales data as you apply to spend data. That is, IF you have a proper data analysis tool. The latter part is the big IF because if you’re using a custom tool that needs to map all data to a schema with fixed semantics, it won’t understand the data and you’re SOL.

However, since Spendata is a general-purpose data analysis tool that builds and maintains its schema on the fly, it doesn’t care if the dataset is spend data or sales data; it’s still transactional data and it’s happy to analyze away. If you need the handholding of a workflow-oriented UI, that can also be configured out of the box using Spendata‘s new “app” capability.

Here are three types of sales analysis that Spendata supports better than CRM/Sales Forecasting systems, and that can’t be done at all with a data lake and a BI tool.

Sales Discount Variation Analysis Over Time By Salesperson … and Client Type

You run a sales team. Are your different salespeople giving the same mix of discounts by product type to the same types of customers by customer size and average sales size?

Sounds easy right? Can’t you simply plot the product/price ratio by month by salesperson in a bubble chart (where volume size correlates to bubble size) against the average trend line and calculate which salespeople are off the most (in the wrong direction)? Sure, but how do you handle client type? You could add a “color” dimension, but when the bubbles overlap and the bubbles blur, can you see it visually? Not likely. And how do you remember a low sales volume customer which is a strategic partner, so has a special deal? Theoretically you could add another column to the table “Salesperson, Product/Price Ratio, Client Type, Over/Under Average”, and that would work as long as you could pre-compute the average discount by Product/Price Ratio and Client Type.

And then you realize that unless you group by category, you have entirely different products in the same product/price ratio and your multi-stage analysis is worthless, so you have to go back and start again, only to find out that the bubble chart is only pseudo-useful (as you can’t really figure it out visually because what is that shade of pink (from the multiple red and white bubbles overlapping) — Fuchsia, Bright, or Barbie — and what does it mean) and you will have to focus on the fixed table to extract any value at all from the analysis.

But then you’ll realize that you still need to see monthly variations in the chart, meaning you want the ability to drag a slider or change the month and have the bubble chart update. Uh-oh, you forgot to individually compute all the amounts by month or select the slider graph! Back to square one, doing it all over again by month. Then you notice some customers have long-term, fixed prices on some products, which messes up the average discount on these products as the prices for these customers are not changing over time. You redo the work for the third (or is it the fourth? time), and then you realize that your definitions of client type “large, medium, and small” are slightly off as a client that should be in large is in medium and two that should be in small were made medium. Aaarrrggghhh!!!

But with Spendata, you simply create or modify dimensions to the cube to segment the data (customer type, product groups, etc.) You leverage a dynamic view-based measure by customer type to set the average prices per time period (used to calculate the discount). You then use filters to define the time range of interest, another view with filters to click through the months over time, a derived view to see the performance by quarter, another by year. If you change the definition of client type (which customers belong to which client type), which products for customers are fixed prices, which SKU’s that are the same type, time range of interest, etc. you simply map them and the entire analysis auto-updates.

This flexibility and power (with no wasted effort) gives you a very deep analysis capability NOT available in any other data analysis platform. For example, you can find out with a few clicks that your “best” salesperson in terms of giving the lowest average discount is actually costing you the most. Turns out, he’s not serving any large customers (who get good discounts) and has several fixed price contracts (which mess up the average discounts). So, the discounts he’s giving the small clients, while less than what large customers get, are significantly more than what other salespeople provide to other small customers. This is something you’d never know if you didn’t have the power of Spendata as your data consultant would give up on the variance analysis at the global level because the salesman’s overall ratio looked good.

Post-Merger White-Space Analysis

White space sales analysis is looking for spaces in the market where you should be selling but are not. For example, if you sell to restaurants, you could look at your sales by geography, normalized by the number of establishments by type or the sales of the restaurants by type in that geography. In a merger, you could measure your penetration at each customer for each of the original companies. You can find white space by looking at each customer (or customer segment) and measuring revenue per customer employee across the two companies. Where is one more effective than the other?

You might think this is no big deal because this was theoretically done during the due diligence and the opportunity for overlap was deemed to be there, as well as the opportunity for whitespace, and whatever was done was good enough. The reality couldn’t be further from the truth.

If the whitespace analysis was done with a standard analytics tool, it has all the following problems:

  • matching vendors were missed due to different name entries and missing ids
  • vendors were not familied by parent (within industry, geography, etc.)
  • the improperly merged vendors were only compared against a target file built by the consultants and misses vendors
  • i.e. it’s poor, but no worse than you’d do with a traditional analytics tool

But with Spendata, these problems would be at least minimized, if not eliminated because:

  • Spendata comes with auto-matching capability
  • … that can be used to enrich the suppliers with NAICS categorization (for example)
  • Spendata comes with auto-familying capability so parent-child relationships aren’t missed
  • Spendata can load all of the companies from a firmographic database with their NAICS codes in a separate cube …
  • … and then federation can be used to match the suppliers in use with the suppliers in the appropriate NAICS category for the white space analysis

It’s thus trivial to

  1. load up a cube with organization A’s sales by supplier (which can be the output from a view on a transaction database), and run it through a view that embeds a normalization routine so that all records that actually correspond to the same supplier (or parent-child where only the parent is relevant) are grouped into one line
  2. load up a cube with organization B’s sales by supplier and do the same … and now you know you have exact matches between supplier names
  3. load up the NAICS code database – which is a list of possible customers
  4. build a view that pulls in, for each supplier in the NAICS category of interest, Org A spend, Org B Spend, and Total Spend
  5. create a filter to only show zero spend suppliers — and there’s the whitespace … 100% complete. Now send your sales teams after these.
  6. Create a filter to show where your sales are less than expected (eg. from comparable other customers or Org A or Org B). This is additional whitespace where upselling or further customer penetration is appropriate.

Bill Rate Analysis

A smart company doesn’t just analyze their (total) spend by service provider, they analyze by service role and against the service role average when different divisions/locations are contracting for the same service that should be fulfilled by a professional with roughly the same skills and same experience level. Why? Because if you’re paying, on average, 150/hr for an intermediate DBA across 80% of locations and 250/hr across the remaining 20%, you’re paying as much as 66% too much at those remaining locations, with the exception being San Francisco or New York where your service provider has to pay their locals a cost-of-living top-up just so they can afford to live there.

By the same token, a smart service company is analyzing what they are getting by role, location, and customer and trying to identify the customers that are (the most) profitable and those that are the least (or unprofitable when you take contract size or support requirements into account), so they can focus on those customers that are profitable, and, hopefully, keep them happy with their better talent (and not just the newest turkey on the rafter).

However, just like sales discount variation analysis over time by client type, this is tough as it’s essentially a variation of that analysis, except you are looking at services instead of products, roles instead of client types, and customer instead of sales rep … and then, for your problem clients, looking at which service reps are responsible … so after you do the base analysis (using dynamic view based measures), you’re creating new views with new measures and filters to group by service rep and filter to those too far beyond a threshold. In any other tool, it would be nigh impossible for even an expert analyst. In Spendata, it’s a matter of minutes. Literally.

And this is just the tip of the iceberg in terms of what Spendata can do. In a future article, we’ll dive into a few more areas of analysis that require very specialized tools in different domains, but which can be done with ease in Spendata. Stay tuned!