GEN-AI IS NOT EMERGENT … AND CLAIMS THAT IT WILL “EVOLVE” TO SOLVE YOUR PROBLEMS ARE ALL FALSE!

A recent article in the CACM (Communications of the ACM) referenced a paper by Dan Carter last year that demonstrated that the claims of Wei et al. in their 2022 “Emergent Abilities of Large Language Models” were unsubstantiated, and merely wrong interpretations of visual artifacts produced by plotting graphs on an inappropriate semi-log scale.

Now, I realize the vast majority of you without advanced degrees in mathematics and theoretical computer science won’t understand the majority of the technical details, but that’s okay, because the doctor, who has advanced degrees in both, does; he can verify the mathematical accuracy of Dan’s paper, and he can confirm the conclusion:

LLMs — Large Language Models — the “backbone” of Gen-AI DO NOT have any emergent properties. As a result, they are no better than traditional deep learning neural networks and are, at the present time, ACTUALLY WORSE, since the lack of deep research means we don’t yet understand these models well enough to properly “train” them for repeatable behaviour or to accurately “measure” their outputs with confidence.

And while our understanding of this new technology, like any new technology, will likely improve over time, the realities are thus:

  • no amount of computing power has ever hastened the development of AI technology since research began in the late 60s / early 70s (depending on what you accept as the first paper / first program); it has always taken improvements in algorithms and the underlying science to make slow, steady progress (with most technologies taking one to two DECADES to mature to the point they are ready for widespread industrial use)
  • the technology currently takes 10 times the computing power (or more) to compute “results” that can be readily computed by existing, narrower techniques (often with more confidence in the results)
  • the technology is NOT well suited to the majority of problems that the majority of enterprise software companies (blindly jumping on the bandwagon, with no steering wheel and no brakes, for fear of missing out on a hype cycle that could cause a tech market crash unequaled by any except the dot-com bust of the early 2000s) are trying to use it for (and yes, the doctor did use the word “majority” and not “all” because, while he despises it, it does have valid uses … in creative (writing, audio, and video) applications [not business or science applications], where it has almost unequalled potential compared to traditional ML designed for math and science based applications)

And the market realities that no one wants to tell you about are thus:

  • former AI evangelists and some of the original INVENTORS of AI are turning against the technology (out of a realization that it will never do what they hoped it would, that its energy requirements could destroy the planet if we keep trying, and/or that maybe there are some things we should just not be meddling with at our current stage of societal and technological evolution), including Weizenbaum and Hinton
  • Brands are now turning against AI … and even Rolling Stone is writing about it
  • big tech and companies that depend on big tech (like Pharma) are starting to turn against AI … and CIOs are starting to drop OpenAI and Microsoft Copilot because, even when the cost is as low as $30 a user, the value isn’t there (see this recent article in Business Insider)

Now, the doctor knows there are still hundreds of marketers and sales people in our space who will consistently claim that the doctor is just a naysayer and against progress and innovation and AI and modern tech and blah blah blah because they, like their companies, have gone all in on the hype cycle and don’t want their bubble burst, but the reality is that

the doctor is NOT against “AI” or modern tech. the doctor, whose complete archives are available on Sourcing Innovation back to June 2006 when he started writing about Procurement Tech, has been a major proponent of optimization, analytics, machine learning, and “AI” since the beginning — his PhD is in advanced theoretical computer science, which followed a math degree — and, after actually studying machine learning, expert systems, and AI, he used to build optimization, analytics, and “AI” systems (including the first commercial semantic social search application on the internet)

what the doctor IS against is Gen-AI and all the false claims being made by the providers about its applicability in the enterprise back office (where it has very limited uses)

because the vast majority of the population does not have the math and computer science background to understand

  1. what is real and what is not
  2. what technologies (algorithms) will, and will not, work for a certain type of problem
  3. whether the provider’s implementation will work for their problem (variation)
  4. whether they have enough data to make it work

and, furthermore, this includes the vast majority of the consultants at the Big X and mid-sized consultancies, who graduate from Business Schools with very basic statistics and data analytics training and a crash course in “prompt engineering”, who can barely use the tech, couldn’t build the tech, and definitely couldn’t evaluate the efficacy and accuracy of the underlying algorithms.

The reality is that it takes years and years of study to truly understand this tech, and years more of day-in and day-out research to make true advancement.

For those of you who keep saying “but look at how well it works” and produce 20 examples to prove it, the reality is that it’s only random chance that it works.

With just a bit of simplification, we can describe these LLMs as essentially just super sophisticated deep neural networks: layers and layers of nodes linked together in new and novel configurations, with more feedback learning, and structured in a manner that lets them “produce” responses as a collection of “sub-responses” from elements in their data archive instead of just returning a fixed response. As a result, they can GENerate a reply instead of just selecting from a fixed pool. (And that’s why their natural language abilities seem far superior to traditional neural network approaches, which need a huge archive of responses to have a natural sounding conversation: LLMs can use “context” to compute, with high probability, the right parts of speech to string together to create a response that will sound human.)
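At its core, the “GENeration” just described is repeated conditional sampling: compute a probability distribution over the next token from the current context, sample one, append it, and repeat. A minimal toy sketch (the tokens and probabilities below are entirely made up for illustration; a real LLM derives the distribution from billions of parameters, not a lookup table):

```python
import random

# Toy "model": a hand-written table of next-token probabilities,
# keyed on the last two tokens of context. Purely illustrative.
NEXT_TOKEN_PROBS = {
    ("the", "supplier"): {"delivered": 0.5, "invoiced": 0.3, "failed": 0.2},
    ("supplier", "delivered"): {"late": 0.6, "early": 0.4},
}

def generate(context, steps, rng):
    """Append tokens one at a time by sampling the conditional distribution."""
    tokens = list(context)
    for _ in range(steps):
        dist = NEXT_TOKEN_PROBS.get(tuple(tokens[-2:]))
        if dist is None:
            break  # our toy table has no entry; a real model always has one
        words = list(dist)
        weights = [dist[w] for w in words]
        tokens.append(rng.choices(words, weights=weights)[0])
    return tokens

print(generate(["the", "supplier"], 2, random.Random(0)))
```

Note that two runs with different seeds can produce different replies from the identical prompt, which is exactly the probabilistic behaviour discussed below.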

Moreover, since these models, which are more distributed in nature, can use an order of magnitude more (computational) cores, they can process an order of magnitude more data. Thus, if there is ten to one hundred times the amount of data (and it’s good data), of course they are going to work reasonably well for expected queries at least 95% of the time (whereas a last generation NN without significant training and tweaking might only hit 90% out of the box). If you then incorporate dynamic feedback on user validation, that may even get to 99% for a class of problems, which means that it will appear to be working, and learning, 99 times out of 100 instead of 19 out of 20. But it’s NOT! It’s all probabilities. It’s all random. You’re essentially rolling the bones on every request, and doing it with less certainty on what a good, or bad, result should look like. And even if the dice come “loaded” so that they should always win the come-out roll, there are so many variables that there is never any guarantee you won’t crap out.

And for those of you saying “those odds sound good”, let me make it clear: they’re NOT.

  • those odds are only for typical, expected queries, for which the LLM has been repeatedly (and repeatedly) trained on
  • the odds for unexpected, atypical queries could be as low as 9 in 10 … which is very, very, bad when you consider how often these systems are supposed to be used
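Those per-request odds also compound quickly across real-world usage. A quick back-of-envelope calculation (assuming, for simplicity, that requests fail independently):

```python
# If each request independently succeeds with probability p, the chance
# that ALL of n requests succeed is p**n -- and it shrinks fast.
def chance_of_at_least_one_failure(p, n):
    return 1 - p ** n

for p in (0.90, 0.95, 0.99):
    print(f"per-request success {p:.0%}: "
          f"odds of at least one bad answer in 100 requests = "
          f"{chance_of_at_least_one_failure(p, 100):.0%}")
```

Even at the optimistic 99% per-request success rate, a team making 100 requests is more likely than not (roughly 63%) to receive at least one bad answer.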

But the odds aren’t the problem. The problem is what happens when the LLM fails. Because you don’t know!

With traditional AI, you either got no response, an invalid response with low confidence, or a rare (compared to Gen-AI) invalid response with high confidence, where the responses were always from a fixed pool (if non-numeric) or fixed range (if numeric). You knew what the worst case scenario would be if something went wrong, how bad that would be, how likely that was to happen, and could even use this information to set bounds and tweak the confidence calculation on a result to minimize the chance of this ever happening in a real world scenario.
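To make the contrast concrete, here is a minimal sketch of the bounding just described; the labels, scores, and threshold below are invented for illustration only:

```python
# Sketch of "worst case is always known" in a fixed-pool model.
# A traditional classifier returns (label, confidence) where the label is
# drawn from a FIXED pool, so the worst possible answer is enumerable.
FIXED_POOL = {"approve", "reject", "escalate"}
CONFIDENCE_THRESHOLD = 0.85  # tuned offline to cap the rate of bad answers

def bounded_answer(label, confidence):
    if label not in FIXED_POOL:
        # Impossible by construction in a fixed-pool model; an LLM has
        # no equivalent guard because its output space is unbounded.
        raise ValueError("label outside the fixed pool")
    if confidence < CONFIDENCE_THRESHOLD:
        return None  # explicit "no response" -- the caller knows to fall back
    return label

print(bounded_answer("approve", 0.92))
print(bounded_answer("reject", 0.40))
```

The point is not the three lines of logic; it is that both the output space and the refusal behaviour are known in advance, which is precisely what you lose with an LLM.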

But with LLMs, you have no idea what it will return, how far off the mark the result will be, or how devastating it will be for your business when that (eventually) happens (which, as per Murphy’s law, will be after the vendor convinces you to have confidence in it and you stop watching it closely; and then, out of the blue, it decides you need 1,000 custom configurations of a high end MacBook Pro in inventory [because 10 new sales support professionals need to produce better graphics] in a potentially recoverable case, or it decides to change your currency hedge on a new contract to that of a troubled economy (like Greece, Brazil, etc.) because of a one day run on the trading markets in a market heading for hyperinflation and a crash [and then you will need a wheelbarrow full of money to buy a loaf of bread — and for those who think it can’t happen, STUDY YOUR HISTORY: Weimar Germany in 1923, Zimbabwe in 2007, Venezuela in 2018, etc.]). You just don’t know! Because that’s what happens when you employ technology that randomly makes stuff up based on random inputs from you don’t know who or what (and the situation gets worse when developers [who likely don’t know the first thing about AI] decide the best way to train a new AI is to use the unreliable output of the old AI).

So, if you want to progress, like the monks, leave that Genizah Artificial Idiocy where it belongs — in the genizah (the repository for discarded, damaged, or defective books and papers), and go find real technology built on real optimization, analytics, machine learning, and AI that has been properly researched, developed, tested, and verified for industrial use.

Analytics Is NOT Reporting!

We’ve covered analytics, and spend analysis, a lot on this blog, but seeing the surge in articles on analytics as of late, and the large number that are missing the point, it seems we have to remind you again that Analytics is NOT Reporting. (Which, of course, would be clear if anyone bothered to pick up a dictionary anymore.)

As defined by the Oxford dictionary, analytics is the systematic computational analysis of data or statistics and a report is a written account of something that has been observed, heard, done, or investigated. In simple terms, analysis is what is done to identify useful information and reporting is the process of displaying that information in a fancy-shmancy graph. One is useful, one is, quite frankly, useless.

A key requirement of analysis is the ability to do arbitrary systematic computational analysis of data as needed, to find the information that you need when you need it. Not just a small set of canned analyses on discrete data subsets that become completely and utterly useless once they are run the first time and you get the initial result — which will NEVER change if the analysis can’t change.

Nor is analysis a random AI application that applies a random statistical algorithm to bubble up, filter out, or generate a random “insight” that may or may not be useful from a Procurement viewpoint. Sometimes an outlier is indicative of fraud or a data error, and sometimes an outlier is just an outlier. Maybe the average weekly bill from the services firm is 15,000, which makes a 3,000 transaction an outlier; but it’s not fraud if the company only needed a cyber-security expert for one day to test a key system. In fact, the “insight” is useless.

As per our recent post on a true enterprise analytics solution, real analysis requires the ability to explore a hunch and find the answer to any question that pops up when it pops up. To build whatever cube is needed, on whatever dimensions are required, that rolls up data using whatever metrics are required to produce whatever insights are needed to determine if an opportunity is there and if it is worth being pursued. Quickly and cost-effectively in real-time. If you have to wait for a refresh, or spend days doing offline computation in Excel to answer a question that might only save you 20K, you’re not going to do it. (Three days and 6K of your time from a company perspective is not worth a 20K saving if that time spent preparing for a negotiation on a 10M category can save an extra 0.5%, which would equate to 50K. But if you can dynamically build a cube and get an answer in 30 minutes, that 30 minutes is definitely worth it if your hunch is right and you save 20K.)
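The trade-off in that parenthetical can be made explicit. A quick sketch reproducing the paragraph’s numbers (the per-day analyst cost is an assumption derived from the 6K / three-day figure; everything else comes from the text):

```python
# Back-of-envelope: is the hunch worth chasing offline vs in a dynamic cube?
analyst_cost_per_day = 2_000          # assumption: 6K / 3 days
hunch_savings = 20_000                # what the hunch might save, per the text
category_value = 10_000_000           # the 10M category
extra_negotiation_rate = 0.005        # the extra 0.5% those days could win

# Option A: three days of offline Excel work
offline_cost = 3 * analyst_cost_per_day                    # 6,000
forgone = category_value * extra_negotiation_rate          # 50,000 not won
net_offline = hunch_savings - offline_cost - forgone       # deeply negative

# Option B: a 30-minute dynamic cube (half an hour of an 8-hour day)
online_cost = 0.5 * (analyst_cost_per_day / 8)             # 125
net_online = hunch_savings - online_cost                   # nearly the full 20K

print(net_offline, net_online)
```

Under these assumptions the offline route destroys 36K of value while the dynamic cube nets almost the entire 20K, which is the whole argument for real-time analysis.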

Analysis is the ability to ask “what if” and pursue the answer. Now! Not tomorrow, next week, or next month on the cube refresh, or when the provider’s personnel can build that new report for you. Now! At any time you should be able to ask: What if we reclassify the categories so that the primary classification is based on primary material (“steel”) and not usage (“electrical equipment”)? What if the savings analysis is done by sourcing strategy (RFX, auction, re-negotiation, etc.) instead of contract value? What if the risk analysis is done by trade lane instead of supplier or category? Analysis is the process of asking a question, any question, and working the data to get the answer using whatever computations are required. It’s not a canned report.
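In code, that kind of on-demand re-cubing is nothing more exotic than rolling the same records up on a different dimension. A toy sketch (the spend records are invented for illustration; a real cube has millions of rows and many more dimensions):

```python
from collections import defaultdict

# Tiny illustrative spend records.
spend = [
    {"item": "copper wire",   "usage": "electrical equipment", "material": "copper", "amount": 120_000},
    {"item": "steel conduit", "usage": "electrical equipment", "material": "steel",  "amount":  80_000},
    {"item": "steel shelving","usage": "warehouse fittings",   "material": "steel",  "amount":  40_000},
]

def roll_up(records, dimension):
    """Re-cube the same data on any dimension, on demand -- no canned report."""
    totals = defaultdict(int)
    for r in records:
        totals[r[dimension]] += r["amount"]
    return dict(totals)

print(roll_up(spend, "usage"))     # the existing classification
print(roll_up(spend, "material"))  # the "what if we classify by material?" view
```

The point is that the second view is answered the moment the question is asked, from the same data, with no refresh and no new report request.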

Analytics is doing, not viewing. And the basics haven’t changed since SI started writing about it, or publishing guest posts by the Old Greybeard himself. (Analytics I, II, III, IV, V, and VI.)

Advice For Dealing With The PROCUREMENT STINK from Leading Consultants!

Last week, the doctor asked fellow niche/independent consultants how we can help to dispel the PROCUREMENT STINK which is permeating the space as a result of poor choices, bad information, and sometimes bad actors, including the reasons we described in that article as well as many more.

Why? Because it’s going to take a collective effort among analysts, consultants, and vendors to dispel the stink permeating the Procurement space, and no one on his or her own will have all the solutions. As expected, some of the greats chimed in with their thoughts and ideas and these thoughts and ideas need to be given center stage, so this is what we’re going to do today!

James Meads

Clarity and transparency on your business model is key, especially if you have revenue streams from solution providers.

As Patrick Van Osta echoed in the comments, the uphill path to recovery, I feel, is for consultants to reclaim the position of sole trusted advisor, and there’s no way we’re ever going to be trusted advisors if we are not clear and transparent in our operations and goals. If we’re hiding our intentions, or upsides, how will the client know whether or not our goals actually align with theirs?

Joël Collin-Demers

I’m 100% on-board with the need for transparency and taking decisions based on what’s best for the client long term. Your job is to make yourself redundant as soon as possible!

In Procurement, there’s always another project. ALWAYS. You don’t have to milk one for life; with your help and guidance, you can open the client’s eyes not only to how much there is to do, but to how much they can do better, for a great ROI. Just like there are well over 25,000 (or 35,000) species of fish in the sea, there are tens of thousands of unique aspects to Procurement in a modern enterprise. And just like you have to know where to fish, what hook to use, and what bait to use to catch a type of fish, you need to know the equivalents for each category, methodology, and process.

Jon W. Hansen

Practitioners need to stop looking at technology as the “silver bullet” solution and instead focus on doing the real, hard work; solution providers need to stop selling shiny paper and “falling in love” with their own technology; … and we consultants have to help the practitioners do the work, understand what they need, and steer clear of the vendor with the shiny new tech (that doesn’t actually do anything [more than cheaper, proven tech]).

Paul Martyn

Consultancies (and their clients) need to provide performance-based compensation with uniqueness. For example, provide specialist consultants with compensation that includes equity. In short, align compensation to customer value (revenue growth and retention). Because, right now, most of the good consultants who can generate the ROI a client should expect are not incentivized to do well on point-based projects (like an Affordable RFP), but are instead incentivized to work on, and sell, long-term “solution” oriented consulting that lines the firm’s (and not the clients’) pocketbook (i.e. keep doing the fishing vs. teaching the client). As a result, most of the good consultants move out of the roles they are needed in to the roles they are incentivized to take.

Vinnie Mirchandani

The web lulled a number of procurement (and IT) folks into expecting vendor, negotiation, and other intelligence for cheap, if not free. Vendors are not afraid to spend on sales and marketing. Procurement needs to adopt a similar mindset to even the game.

The best things in life may be free, but the best things in business are not. (As the Arrogant Worms pointed out over three decades ago, you get NOTHING FOR NOTHING!) And if you don’t have the right tools that enable the right processes powered by the right intelligence, you’re not going to win the game. Remember that all of the best sports teams use high-tech sports tech backed by science and data analytics to help their athletes reach peak condition. Raw talent only gets you in the game. You need the right training to win, or, at least, the right guidance and tech to enable you as you learn.

There’s a lot of STINK out there now, but if you follow this advice, you’ll go a long way to removing it. After all, you can’t solve everything with a pressure washer.

Proper Project Planning is Key to Procurement Project Prosperity! Part 2

In Part 1 we noted that we wrote about the importance of Project Assurance, and how it was a methodology for keeping your Supply Management Project on Track, ten years ago, and that this typically ignored area of project management is becoming more important than ever. Given that the procurement technology failure rate, as well as the technology failure rate as a whole, hasn’t improved in the last decade, and is still as high as 80% (or more) depending on the study you select, that’s a problem. Especially when, for many companies, these projects typically start in the million dollar range. (Even if the annual license is only 100K, by the time you multiply that by 3, the minimum term any vendor will give you, multiply the annual maintenance fee by 3, and then add the implementation, integration, training, and ongoing integration maintenance costs and ongoing training costs, it’s well over 1M.)
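To see how that parenthetical plays out, here is a rough three-year build-up; the 100K license figure comes from the paragraph, while every other line item is an illustrative assumption (your quotes will differ):

```python
# Rough three-year cost build-up behind the "million dollar range" claim.
# Only the license figure comes from the text; the rest are assumptions.
years = 3
annual_license = 100_000
annual_maintenance = 20_000               # assumption: ~20% of license
annual_integration_maintenance = 30_000   # assumption
annual_training = 20_000                  # assumption
implementation = 300_000                  # assumption (one-time)
integration = 200_000                     # assumption (one-time)
initial_training = 50_000                 # assumption (one-time)

total = (years * (annual_license + annual_maintenance
                  + annual_integration_maintenance + annual_training)
         + implementation + integration + initial_training)
print(f"three-year cost: {total:,}")
```

Even with these fairly modest assumptions the three-year total clears the 1M mark, and real implementations routinely run higher.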

But we also noted that, whereas there might have been a time when this was enough to tip the odds of success in your favour, it’s not quite enough anymore. Modern procurement is more complex (it hasn’t had this many complex problems to deal with simultaneously in over two decades), modern technology is more complex (now AI enabled, AI backed, AI powered, AI enhanced, and/or AI driven, even when it isn’t), and most organizational users are still struggling with basic technology (not enabled, backed, powered, enhanced, or driven by [fake] AI bullcr@p).

We told you we were going to dig into the project steps and help you understand what you need to do to get it as right as you can and greatly increase your odds of success. But first, there is one critical action you need to take that is common to all steps and critical for your Procurement Project Prosperity, and that is:

  • Engage an independent expert to guide you through the entire process and help where needed, including assurance.

As noted, this individual

  • cannot be an internal resource, even from a different department, as they are still subject to the internal pressures from the C-Suite (fast, cheap, etc.) that might be counter-productive to project success (that is critical for eventually obtaining the ROI you purchased the platform for in the first place)
  • cannot be a vendor representative as their only goal is to get you to buy more, or at least keep your subscription at the initial purchase level (which likely contained seats you never used, SKUs you don’t use enough to justify, and third party feeds/integrations you aren’t taking advantage of)
  • cannot be an implementation team representative, even if they are a third party consultancy, as the odds are that consultancy has a preferred partnership with the vendor and will be biased towards keeping the vendor and doing whatever is easiest (and thus most profitable) for the vendor, to keep getting their implementation referrals

Now, what’s the difference between helping and pure assurance? In addition to making sure each step is accomplished effectively, this person is also guiding you through the creation of the necessary artifacts of each step to ensure success. This person is helping you define the goals, not just ensuring the goals are met. The person is simultaneously a project guide and a project evaluator, bringing the Procurement Best Practices and Technology Knowledge that your organization doesn’t have, and helping you identify the right intersection to take you forward on your journey.

And this goes well beyond just helping you write an RFP (although this is a key step, which is why the doctor has been telling you to get expert RFP help for your Procurement technology RFP for close to two decades, because a bad RFP is one of the leading causes of project failure).

This is because, as we noted ten years ago in our original Project Assurance Series (Part I, Part II, Part III, Part IV, and Part V), project success depends on more than just getting the technical specifications right. Project success also depends on getting the talent right, as it is the people who will have to use the new system. It also depends on getting the transition right: if the changeover is not smooth, significant disruptions to daily operations can occur. And, equally important, it also depends on an often overlooked fourth “T”: tracery. Organizational success depends on selecting a superior strategy and seeing it through until the desired results are achieved (or the organization changes the strategy). (And since you don’t know what you don’t know, the small cost of engaging an expert, relative to the overall project cost, will generate a return far, far greater than the technology ever will.)

Tracery, which stems from late Middle English, can be defined as a “delicate, interlacing, work of lines as in an embroidery” or, more modernly, as a “network”. Implementing a strategy requires effectively implementing all of the intersecting “threads” that are required to execute the strategy to success. If any one aspect is overlooked, the project can fail. And if you can’t even see all the threads, it should be easy to understand how most projects essentially fail as soon as they begin and why you need a master weaver if you want to beat the odds and actually succeed.

Come back for our next installment where we will dig into the six traditional project steps outlined in our original series and dive into what your independent, third party, Procurement technology project guide (who will be independent from you, your vendor, and the vendor’s third party implementation team) needs to do.