Category Archives: Spend Analysis

Supply Management Technical Difficulty … Part IV.2

A lot of vendors will tell you that much of what they do is so hard that it took thousands of hours of development, and that no one else could do it as well, as fast, or as flexibly, when the reality is that much of what they do is easy, mostly available in open source, and can be replicated in modern Business Process Management (BPM) configuration toolkits in a matter of weeks. In this series we are tackling the suite of supply management applications and pointing out what is truly challenging and what is almost as easy as cut-and-paste.

In our first three posts we discussed basic sourcing, basic procurement, and supplier management, where few true technical challenges exist. Then, yesterday, we started a discussion of spend analysis, where there are deep technical challenges (which we discuss today), but also technical by-gones that vendors still perpetuate as challenges, and stumpers that are not true challenges but remain so for the many vendors who didn’t bother to spend the time it takes to hire a development team that could figure them out. Yesterday we discussed the by-gones and the first stumper. Today, we discuss the second big stumper and the true challenges.


Technical Stumper: Multi-Cube

The majority of applications support one, and only one, cube. As SI has indicated again and again (and again and again), the power of spend analysis resides in the ability to quickly create a cube on a hunch, on any schema of interest, analyze the potential opportunity, throw it away, and continue until a new value opportunity is found. This also needs to be quick and easy, or the best opportunities will never be found.

But even today, many applications support ONE cube. And it makes absolutely no sense. Especially when all one has to do to create a new cube is just create a copy of the data in a set of temporary tables designed just for that and update the indexes. In modern databases, it’s easy to dynamically create a table, bulk copy data from an existing table to the new table, and then update the necessary index fields. The cube can be made semi-persistent by storing the definition in a set of meta-tables and associating it with the user (which is exactly how databases track tables anyway).
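
As a concrete illustration, here is a minimal sketch of that approach using Python and SQLite, with hypothetical table and column names (transactions, vendor, gl_code); the point is only that a throwaway cube is a filtered copy plus an index and a meta-table entry, not that this is how any particular vendor implements it:

```python
import sqlite3

def create_cube(conn, cube_name, user, where_clause):
    """Spin up a semi-persistent cube: copy the matching transactions into a
    dedicated table, index it, and record the definition against the user.
    All table and column names here are illustrative, not a vendor's schema."""
    cur = conn.cursor()
    # dynamically create the cube table as a copy of the filtered data
    cur.execute(f"CREATE TABLE {cube_name} AS SELECT * FROM transactions WHERE {where_clause}")
    # update the indexes the analysis will slice and drill on
    cur.execute(f"CREATE INDEX idx_{cube_name}_vendor ON {cube_name}(vendor)")
    cur.execute(f"CREATE INDEX idx_{cube_name}_gl ON {cube_name}(gl_code)")
    # semi-persistence: store the definition in a meta-table tied to the user
    cur.execute("CREATE TABLE IF NOT EXISTS cube_meta (cube_name TEXT, owner TEXT, definition TEXT)")
    cur.execute("INSERT INTO cube_meta VALUES (?, ?, ?)", (cube_name, user, where_clause))
    conn.commit()

# e.g. a throwaway cube built on a hunch, dropped once the hunch is exhausted
# create_cube(conn, "cube_it_spend", "jdoe", "gl_code LIKE 'IT%'")
```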

Apparently vendors are stumped by how easy this is; otherwise, the doctor is stumped as to why most vendors do not support such a basic capability.


Technical Challenge: Real-time (Collaborative) Reclassification

This is a challenge. Considering that reclassifying a data set against a new hierarchy or schema could require processing every record, that many data sets contain millions of transaction records, and that modern processors can only execute so many instructions per second, this will likely always be a challenge as big data gets bigger and bigger. As a result, even the best algorithms can generally only handle a few million records in real time on a high-end PC or laptop. And while you can always add more cores to a rack, there’s still a limit to how many cores can be connected to an integrated memory bank through a high-speed bus … and as this is the key to high-speed data processing, even the best implementations will only be able to process so many transactions a second.

Of course, this doesn’t explain why some applications can re-process a million transactions in real time while others crash before you load 100,000. That is just bad coding. This might be a challenge, but it’s still one that should be handled as valiantly as possible.
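
To make the scale concrete, here is a minimal sketch, on a purely illustrative data set, of what a reclassification pass actually is: a vectorized remap over every record followed by a recalculation of the derived metrics. Even done efficiently, the cost grows linearly with the number of transactions, which is exactly the limit discussed above:

```python
import numpy as np
import pandas as pd

# hypothetical data set: a couple of million transactions
n = 2_000_000
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "amount": rng.uniform(10, 10_000, n),
    "old_cat": rng.choice(["C100", "C200", "C300", "C400"], n),
})

# mapping from the old schema to the new hierarchy
remap = {"C100": "IT.Hardware", "C200": "IT.Software",
         "C300": "MRO.Parts", "C400": "MRO.Services"}

# vectorized reclassification: fast per row, but still touches every row
df["new_cat"] = df["old_cat"].map(remap)

# and every derived metric has to be recalculated on the new hierarchy
summary = df.groupby("new_cat")["amount"].agg(["sum", "mean", "count"])
print(summary)
```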

Technical Challenge: Exploratory 3-D Visualization

There’s a reason that Nintendo, Xbox, and PlayStation keep releasing new hardware: they need to support faster rendering, as the generation of realistic 3-D graphics in real time requires very powerful processing. And while there is no need to render realistic graphics in spend analysis, creating 3-D images that can be rotated in real time, blown up, shrunk down, and drilled into to create a new 3-D image, which can again be rotated, blown up, shrunk, and drilled into, all in real time, is just as challenging. This is because you’re not just rendering a complex image (such as a solar system or a 3-D heated terrain map) but also annotating it with derived metrics that require real-time calculation, storing the associated transactions for tabular pop-ups, and so on — and we already discussed how hard it is to reclassify (and re-calculate derived metrics on) millions of transactions in real time.
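
For a sense of the gap, a static 3-D spend chart is easy; the sketch below (illustrative data, matplotlib) renders one in a few lines. The hard part is everything described above that this sketch does not do: real-time rotation, drill-down into a new cube, and re-derivation of the annotated metrics at interactive speeds:

```python
import matplotlib.pyplot as plt
import numpy as np

# hypothetical spend (in $K) by category (rows) and quarter (columns)
spend = np.array([[120, 150, 170, 160],
                  [ 80,  90,  85, 110],
                  [ 60,  65,  70,  75]])
cats, qtrs = spend.shape
x, y = np.meshgrid(np.arange(qtrs), np.arange(cats))

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
# one bar per (category, quarter) cell; height = spend
ax.bar3d(x.ravel(), y.ravel(), np.zeros(spend.size),
         dx=0.6, dy=0.6, dz=spend.ravel())
ax.set_xlabel("Quarter"); ax.set_ylabel("Category"); ax.set_zlabel("Spend ($K)")
plt.show()
```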

Technical Challenge: Real-time Hybrid “AI”

First of all, while there is no such thing as “AI”, because machines are not intelligent, there is such a thing as “automated reasoning”, as machines are great at executing programmed instructions using whatever logic system you give them. And while there is no such thing as “machine learning”, as it requires true intelligence to learn, there is such a thing as an “adaptive algorithm”, and the last few years have seen the development of some really good adaptive algorithms, employing the best automated reasoning techniques, that can, with training, improve over time to the point where classification accuracy (quickly) reaches 95% or better. And the best can be pre-configured with domain models that jump-start the classification process and often reach 80% accuracy on reasonably clean data with no training at all.

But the way these algorithms typically work is that data is fed into a neural network or cluster machine, the outputs are compared to a domain model, and where the statistics-based technique fails to generate the right classification, the resulting score is analyzed, the statistical weights or cluster boundaries are modified, and the network or cluster machine is re-run until classification accuracy reaches a maximum. In reality, what needs to happen is that as users correct classifications in real time while doing ad-hoc analysis in derived spend cubes, the domain models need to be modified and the techniques updated in real time, and the override mapping remembered until the classifier automatically classifies all similar future transactions correctly. This requires the implementation of leading-edge “AI” (which should be called “AR”) that is seamlessly integrated with leading-edge analytics.
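
A minimal sketch of that feedback loop, using scikit-learn’s incremental SGDClassifier and a hashing vectorizer as stand-ins for whatever statistical engine a given vendor actually uses; the essential behaviour is that a user override is remembered immediately and also fed back into the adaptive model:

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

CATEGORIES = ["IT.Hardware", "IT.Software", "MRO.Parts"]

vec = HashingVectorizer(n_features=2**16)
clf = SGDClassifier(random_state=0)  # supports incremental partial_fit
overrides = {}  # remembered user corrections: description -> category

# jump-start with a (tiny, purely illustrative) domain model of labelled examples
seed_text = ["dell laptop", "oracle license", "bearing assembly"]
seed_cat = ["IT.Hardware", "IT.Software", "MRO.Parts"]
clf.partial_fit(vec.transform(seed_text), seed_cat, classes=CATEGORIES)

def classify(description):
    # an explicit override always wins until the model catches up
    if description in overrides:
        return overrides[description]
    return clf.predict(vec.transform([description]))[0]

def correct(description, right_category):
    # remember the override AND feed it back into the adaptive model
    overrides[description] = right_category
    clf.partial_fit(vec.transform([description]), [right_category])

print(classify("hp laptop"))          # initial guess from the seeded model
correct("hp laptop", "IT.Hardware")   # user fixes it once, in real time
print(classify("hp laptop"))          # the override now guarantees the answer
```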

In other words, while building any analytics application may have been a significant challenge last decade, when the by-gones were still challenges and the stumpers required a significant amount of brain-power and coding to deal with, that’s not the case anymore. These days, the only real challenges are real-time reclassification, visualization, and reasoning on very large data sets … even with parallel processing, these are challenges when a large number of records have to be reprocessed, re-indexed, and their derived dimensions recalculated.

But, of course, the challenges, and the lack thereof, don’t end with analytics. Stay tuned!

Supply Management Technical Difficulty … Part IV.1

A lot of vendors will tell you that much of what they do is so hard that it took thousands of hours of development, and that no one else could do it as well, as fast, or as flexibly, when the reality is that much of what they do is easy, mostly available in open source, and can be replicated in modern Business Process Management (BPM) configuration toolkits in a matter of weeks.

So, to help you understand what’s truly hard and what is, in the spend master’s words, so easy a high school student with an Access database could do it, the doctor is going to bust out his technical chops, which include a PhD in computer science (with deep expertise in algorithms, data structures, databases, big data, computational geometry, and optimization), experience in research / architect / technology officer industry roles, and cross-platform experience across pretty much all of the major OSs and implementation languages of choice. We’ll take it area by area in this series. In our first three posts we tackled basic Sourcing, basic Procurement, and Supplier Management, and in this post we’re deep-diving into Spend Analytics.

In our first three posts we focused just on technical challenges, but in this post, in addition to technical challenges, we’re also going to look at technical stumpers (which shouldn’t be challenges, but for many organizations are) and technical by-gones (which were challenges in days gone by, but are NOT anymore).


Technical By-Gone: Formula-Based Derived Dimensions

In the early days, there weren’t many mathematical libraries, and building a large library, making it efficient, and integrating it with an analytics tool to support derived dimensions and real-time reporting was quite a challenge that typically required a lot of work and a lot of code optimization, which in turn often required a lot of experimentation. But now there are lots of libraries, lots of optimized algorithms, and integration is pretty straightforward.
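
Today, a formula-based derived dimension is little more than a vectorized expression over existing columns; a minimal sketch with pandas and numpy, using illustrative column names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "amount":     [1200.0, 450.0, 900.0],
    "quantity":   [10, 3, 9],
    "list_price": [150.0, 160.0, 110.0],
})

# derived dimensions are just formulas over existing facts,
# recalculated on the fly by an optimized vector library
df["unit_price"] = df["amount"] / df["quantity"]
df["discount_pct"] = 100 * (1 - df["unit_price"] / df["list_price"])
df["price_band"] = np.where(df["unit_price"] > 125, "premium", "standard")
print(df)
```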

Technical By-Gone: Report Builder

This is just a matter of exposing the schema, selecting the dimensions and facts of interest, and feeding them into a report object — which can be built using dozens (and dozens) of standard libraries. And if that’s too hard, there are dozens of applications that can be licensed and integrated that already do all the heavy lifting. In fact, many of the big-name S2P suites now offering “analytics” are doing just this.
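
A minimal sketch of the idea, assuming a hypothetical flat spend table: the “schema” exposed to the user is just the column list, and the report object is a pivot over whatever dimensions and facts the user selects:

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["IT", "IT", "MRO", "MRO"],
    "region":   ["NA", "EU", "NA", "EU"],
    "spend":    [120_000, 80_000, 45_000, 30_000],
})

def build_report(data, dimensions, facts, aggfunc="sum"):
    """Generic report builder: expose the schema, let the user pick
    dimensions and facts, and feed them into a pivot."""
    return data.pivot_table(index=dimensions, values=facts, aggfunc=aggfunc)

print(list(df.columns))  # the "schema" presented to the user
print(build_report(df, dimensions=["category", "region"], facts=["spend"]))
```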

Technical Stumper: Multi-Schema Support

When you get right down to it, a schema is just a different indexing of data, which is organized into records. This means that all you need to do to support a schema is add an index to a record, and all you need to do to support multiple schemas is add multiple indexes to a record. By normalizing a database schema into entity tables, relationship tables, and other discrete entities, it’s actually easy to support multiple categorizations for spend analysis, including UNSPSC, H(T)S codes, a modified best-practice service provider schema for successful spend analysis, and any other schema needed for organizational reporting.

In other words, all you need to support another schema is a set of schema tables that define the schema and a set of relationship tables that relate entities, such as transactions, to their appropriate place in the schema. One can even use general-purpose tables that support hierarchies. The point is that there are lots of options, and it is NOT hard! It may be a lot of code (and code optimization), but it is NOT hard.
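
A minimal sketch of such a normalized layout in SQLite, with illustrative table names: one entity table, a schema-definition table, a hierarchy (node) table, and a relationship table that can place the same transaction in as many schemas as needed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# one entity table, any number of schemas and mappings
cur.executescript("""
CREATE TABLE transaction_rec (id INTEGER PRIMARY KEY, vendor TEXT, amount REAL);
CREATE TABLE schema_def     (schema_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE schema_node    (node_id INTEGER PRIMARY KEY, schema_id INTEGER,
                             parent_id INTEGER, label TEXT);
-- relationship table: the same transaction can sit in every schema at once
CREATE TABLE txn_schema_map (txn_id INTEGER, schema_id INTEGER, node_id INTEGER);
CREATE INDEX idx_map ON txn_schema_map (schema_id, node_id);
""")
# map one transaction into both UNSPSC and an internal reporting schema
cur.execute("INSERT INTO transaction_rec VALUES (1, 'Acme', 999.0)")
cur.execute("INSERT INTO schema_def VALUES (1, 'UNSPSC'), (2, 'InternalReporting')")
cur.execute("INSERT INTO txn_schema_map VALUES (1, 1, 431721), (1, 2, 12)")
conn.commit()
```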


Technical Stumper: Predictive Analytics

Predictive Analytics sounds challenging, and creating good analytics algorithms takes time, but a number of good algorithms have already been developed across the areas of analytics where they work well, and the only thing they require is good data. Since the strength of a good analytics application resides in its ability to collect, cleanse, enhance, and classify data, it shouldn’t be hard to just feed that data into a predictive analytics library. But apparently it is, as few vendors offer even basic trend analysis, inventory analysis, etc. Why they don’t implement the best public domain / textbook algorithms, or integrate third-party libraries and solutions with more powerful, adaptive algorithms that work better with more data, for all of the common areas that prediction has been applied to for at least five years, is beyond the doctor. While it’s a challenge to come up with newer, better algorithms, it’s not hard to use what’s out there, and there is already a lot to start with.
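
For instance, the most basic public-domain technique, a least-squares trend line, is a few lines of code once the data is clean; the sketch below uses purely illustrative monthly figures:

```python
import numpy as np

# hypothetical monthly spend for one category ($K), 12 months of history
months = np.arange(12)
spend = np.array([100, 104, 103, 108, 112, 111, 118, 121, 125, 124, 130, 134])

# a least-squares trend line over the historical spend
slope, intercept = np.polyfit(months, spend, deg=1)

# forecast the next quarter
future = np.arange(12, 15)
forecast = slope * future + intercept
print(f"trend: {slope:.1f} $K/month, next 3 months: {np.round(forecast, 1)}")
```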

Come back tomorrow as we continue our in-depth discussion of analytics.

Analytics Gotchas To Watch Out For!

Companies win or lose in the modern marketplace based upon the actionable insights they can derive from their data. We’re in the age of information warfare (which is used against us every day, especially in political elections, but that’s a post for a different blog), and companies are competing with, or attacking, each other based on the quality of their data. This is why big data science is taking off and why many companies are engaging third-party “experts” to help them get started. But not all of these experts are truly experts. Here are some questions to ask and gotchas to watch out for when considering the pitches from supposed experts.


What’s in the 10% to 20% that wasn’t mapped?

While it’s true that you can get a lot of insight when 80% of the spend is mapped, often enough to get an idea of where to dig in, there are two things to watch out for when the data “experts” come back and say they’ve mapped 80%. First of all, is it 80% of spend, 80% of transactions, or 80% of the supply base? Be very careful to understand which 80% was mapped. If it was spend, chances are it’s the big-value transactions, and the tail spend is unmapped. If the tail spend contains small but critical components to production (such as control chips for expensive electro-mechanical systems), this could be problematic if that spend is increasing year over year, or everything could be okay. If it’s 80% of transactions, this could leave the largest-value transactions unmapped, which could completely skew the opportunity analysis. If it’s 80% of the supply base, the riskiest suppliers could go unmapped, and the risk analysis could be skewed.
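
A quick illustration (with made-up numbers) of how different those three answers can be on the same data set: here “mapped” means the transaction has a category, and the coverage works out to roughly 99% of spend but only 50% of transactions and 40% of suppliers:

```python
import pandas as pd

df = pd.DataFrame({
    "supplier": ["A", "A", "B", "C", "D", "E"],
    "amount":   [500_000, 200_000, 90_000, 4_000, 3_000, 2_000],
    "category": ["IT", "IT", "MRO", None, None, None],  # None = unmapped
})

mapped = df["category"].notna()
print("by spend:       ", round(100 * df.loc[mapped, "amount"].sum() / df["amount"].sum()), "%")
print("by transactions:", round(100 * mapped.mean()), "%")
print("by suppliers:   ", round(100 * df.loc[mapped, "supplier"].nunique() / df["supplier"].nunique()), "%")
```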


How many of the recommendations are backed up with your data and not just industry benchmarks?

If the “expert” says that their benchmarks indicate huge opportunities in specific categories, make sure the benchmarks are based on your data, and not data gathered from your competitors (and used in lieu of doing a detailed analysis on your data). Make sure the “experts” are not taking shortcuts (because your data was dirtier than they expected and they didn’t want to make the effort to clean it).

Remember what Sir Arthur Conan Doyle said: “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” And, specifically, before one has one’s own data, not just any data!


Never forget there are lies, damn lies, and statistics!

In the best case, statistics are used to support arguments rather than to lead, or illuminate, them. In the worst case, they are used to plug holes in the preliminary analysis that the “expert” would rather not tell you about. (Experts rarely want to admit your data is so dirty and incomplete that they couldn’t do their preliminary analysis to the promised level of accuracy in the time given. They’d rather discover that during the project, after they have it, and, if necessary, put in a change order to clean it for you and take more of your money.)

Remember that statistics can be used to skew arguments just about any way you want to, especially if you are willing to use +/- with 90% confidence …


Not everything that can be counted counts!

Remember what Einstein said, because in this age of data overload it’s never been more true. Detailed analysis of certain types of trend data, social media reviews, and segmented consumer purchase patterns doesn’t always yield any insight, especially when the goal is spend reduction or demand optimization. It all comes down to what Deming said: if you do not know how to ask the right question, which will help you focus on the right data, then you discover nothing.


There’s no such thing as an alternative fact!

While most consultants in our space won’t try to sell you alternative facts, they may try to sell you alternative interpretations to the ones the data suggests. This is almost as bad. Always remember that Aldous Huxley once said that facts do not cease to exist because they are ignored. If only we just had to deal with experts and consultants ignoring facts. Those consultants, especially in the political arena, who try to sell alternative facts or alternative interpretations are selling what has to be the biggest crock of bullsh!t they have come up with yet.

Finally, it’s not only opportunities that multiply as they are seized (Sun Tzu); it is also misfortunes, which come from making bad decisions based on bad or incomplete analyses.

And yes, someone has to be the gnashnab!

When Selecting Your Prescriptive, and Future Permissive, Analytics System …

Please remember what Aaron Levenstein, Business Professor at Baruch College, said about statistics:

Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.

Why? Because a large number of predictive / forecasting / trending algorithms are statistics-based. While good statistics, on good, sufficiently sizeable data sets, can reach a very high, calculable, probability of accuracy a statistically high percentage of the time, if a result is only 95% likely 95% of the time, then the right answer is only obtained 95% of the time (or 19 times out of twenty), and the answer is only “right” to within 95%. This means that one time out of twenty the answer is completely wrong, and may not even be close. It’s not the case that one time out of twenty the prediction is off by more than 5%; it’s the case that one time out of twenty the prediction is completely wrong.
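
A small simulation of that point (the miss magnitudes are purely illustrative): roughly one forecast in twenty falls outside its stated band, and nothing in the statistics says how far outside it lands:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
# each forecast is "right to within 5%" with probability 0.95;
# the other 5% of the time the miss can be arbitrarily large (here up to 60%)
inside = rng.random(n) < 0.95
errors = np.where(inside,
                  rng.uniform(-0.05, 0.05, n),   # within the stated band
                  rng.uniform(-0.60, 0.60, n))   # a genuine miss
print("share outside the band:", round((~inside).mean(), 3))  # ~0.05 = 1 in 20
print("worst miss:", f"{abs(errors).max():.0%}")              # far beyond 5%
```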

And if these algorithms are being used to automatically conduct sourcing events and make large-scale purchases on behalf of the organization, do you really want something going wrong one time in twenty, especially if an error that one time could end up costing the organization more than it saved the other nineteen times? Say the system was primarily sourcing categories that were increasing with inflation or decreasing according to standard burn rates as demand dropped on outdated product offerings, but one such category was misidentified. If, instead of identifying the category as about to be in high demand, and about to sky-rocket in cost due to its reliance on scarce rare earth metals (about to get scarcer as the result of a mine closure), the system identified it as low-demand with continually dropping costs over the next year and chose a monthly spot-buy auction, then costs could increase 10% month over month, and a 12M category could, over the course of a year, actually cost 21.4M (1M + 1.1M + 1.21M …), almost double! If the savings on the other 19, similarly valued, categories was only 3%, the roughly 6.8M the permissive analytics system saved would be dwarfed by the 9.4M loss! Dwarfed!
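
The arithmetic behind that example, for anyone who wants to check it (all figures in millions):

```python
# the worked numbers behind the example above
baseline_monthly = 1.0           # a 12M/year category bought monthly
growth = 1.10                    # 10% month-over-month cost increase
cost = sum(baseline_monthly * growth ** m for m in range(12))
loss = cost - 12.0               # vs. the 12M it should have cost
savings = 19 * 12.0 * 0.03       # 3% savings on 19 similarly valued categories
print(f"category cost: {cost:.1f}M, loss: {loss:.1f}M, savings elsewhere: {savings:.2f}M")
# => category cost: 21.4M, loss: 9.4M, savings elsewhere: 6.84M
```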

That’s why it’s very important to select a system that keeps a record not only of every recommendation and action, but also of its reasoning, a record that can be reviewed, evaluated, and overruled by a wise and experienced Sourcing professional. And, hopefully, a system capable of allowing the wise and experienced Sourcing professional to indicate why a recommendation was overruled and of expanding its knowledge model, so that one in twenty eventually becomes one in fifty, on the road to one in one hundred, and so that, over time, more and more non-critical buying and automation tasks can be put on the system, leaving the buyer to focus on high-value categories, which will always require true brain power, and not whatever vendors try to pass off as non-existent “artificial intelligence” (as there is no such thing, just very advanced machine-learning-based automated reasoning).

Are We About to Enter the Age of Permissive Analytics?

Right now most of the leading analytics vendors are rolling out, or considering the roll-out of, prescriptive analytics, which goes one step beyond predictive analytics and assigns meaning to those predictions in the form of actionable insights: steps the organization could take in order to take advantage of the likely situation suggested by the predictive analytics.

But this won’t be the end. Once a few vendors have decent prescriptive analytics solutions, one vendor is going to try to get an edge and start rolling out the next generation of analytics, and, in particular, permissive analytics. What are permissive analytics, you ask? Before we define them, let’s take a step back.

In the beginning, there were descriptive analytics. Solutions analyzed your spend and / or metrics and gave you clear insight into your performance.

Then there were predictive analytics. Solutions analyzed your spend and / or metrics and used time-series, statistical, or other algorithms to predict likely future spend and / or metrics based on current and historical spend / metrics, and presented the likely outcomes to you in order to help you make better decisions.

Predictive analytics was great as long as you knew how to interpret the data, what the available actions were, and which actions were most likely to achieve the best business outcomes given the likely future trend in the spend and / or metrics. But if you didn’t know how to interpret the data, what your options were, or how to choose the option most in line with the business objectives, you were stuck.

The answer was, of course, prescriptive analytics, which combined the predictive analytics with expert knowledge and not only prescribed a course of action but indicated why that course of action was prescribed. For example, if the system detected rising demand within the organization and predicted rising cost due to increasing market demand, the recommendation would be to negotiate for, and lock in, supply as soon as possible using either an (optimization-backed) RFX, an auction, or a negotiation with incumbents, depending upon which option was best suited to the current situation.

But what if the system detected that organizational demand was falling, but market demand was falling faster, there would be a surplus of supply, and the best course of action was an immediate auction with pre-approved suppliers (which were more than sufficient to create competition and satisfy demand)? And what if the auction could be automatically configured, suppliers automatically invited, ceilings automatically set, and the auction automatically launched? What if nothing needed to be done except approve, sit back, watch, and auto-award to the lowest bidder? Why would the buyer need to do anything at all? Why shouldn’t the system just go?

If the system was set up with rules that defined the behaviours the buyer allowed the system to take automatically, then the system could auto-source on behalf of the buyer and the buying organization. Permissive analytics would not only allow the system to automate non-strategic sourcing and procurement activities, but do so using leading prescriptive analytics combined with rules defined by the buying organization and the buyer. And if the prescriptive analytics included a machine learning engine at the core, the system could learn buyer preferences for automated vs. manual vs. semi-automated execution and even suggest permissive rules (that could, for example, allow the category to be re-sourced annually as long as the right conditions held).
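
A minimal sketch of what such buyer-defined permissive rules might look like; the rule structure, thresholds, and action names are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class PermissiveRule:
    """A buyer-defined rule describing what the system may do on its own."""
    category: str
    max_annual_spend: float      # only auto-source below this threshold
    allowed_action: str          # e.g. "auto_auction" or "recommend_only"
    requires_approval: bool      # pause for a human before awarding?

rules = [
    PermissiveRule("office supplies", 250_000, "auto_auction", requires_approval=False),
    PermissiveRule("MRO parts", 1_000_000, "auto_auction", requires_approval=True),
    PermissiveRule("rare earth components", 0, "recommend_only", requires_approval=True),
]

def permitted_action(category, projected_spend):
    for rule in rules:
        if rule.category == category and projected_spend <= rule.max_annual_spend:
            return rule.allowed_action, rule.requires_approval
    return "recommend_only", True   # default: the buyer stays in the loop

print(permitted_action("office supplies", 180_000))   # ('auto_auction', False)
```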

In other words, the next generation of analytics vendors is going to add machine learning, flexible and dynamic rule definition, and automation to their prescriptive analytics and integrated sourcing platforms, and take automated buying and supply chain management to the next level.

But will it be the right level? Hard to say. The odds are they’ll make significantly fewer bad choices than the average sourcing professional (as the odds will increase to 98% over time), but, unlike experienced and wise sourcing professionals, they won’t detect when an event happens in left field that totally changes the dynamics and makes a former best-practice sourcing strategy moot. They’ll detect and navigate individual black swan attacks, but will have no hope of detecting a coordinated black swan volley. However, if the organization also employs risk management solutions with real-time event monitoring and alerts, ties the risk management system to the automation, and forces user review of higher-spend / higher-risk categories put through automation, it might just work.

Time will tell.