GEN-AI IS NOT EMERGENT … AND CLAIMS THAT IT WILL “EVOLVE” TO SOLVE YOUR PROBLEMS ARE ALL FALSE!

A recent article in the CACM (Communications of the ACM) referenced a paper by Dan Carter last year that demonstrated that the claims of Wei et al. in their 2022 “Emergent Abilities of Large Language Models” were unsubstantiated and merely wrong interpretations of visual artifacts produced by plotting graphs on an inappropriate semi-log scale.

Now, I realize the vast majority of you without advanced degrees in mathematics and theoretical computer science won’t understand the majority of the technical details, but that’s okay because the doctor, who has advanced degrees in both, does, and can verify the mathematical accuracy of Dan’s paper and its conclusion:

LLMs — Large Language Models — the “backbone” of Gen-AI DO NOT have any emergent properties. As a result, they are no better than traditional deep learning neural networks, and are, at the present time, ACTUALLY WORSE since our lack of deep research and understanding means that we don’t have the same level of understanding of these models, and, thus, the ability to properly “train” them for repeatable behaviour or the ability to accurately “measure” the outputs with confidence.
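
To make the "visual artifact" point concrete, here is a minimal sketch of how a smooth improvement can look like a sudden jump when model size is stepped on a log axis and the score is all-or-nothing. Every number in it is invented for illustration (the function names, the 0.50 to 0.95 accuracy range, and the 10-token answer length are assumptions, and it is not a reproduction of either paper's analysis).

```python
import math

def per_token_accuracy(params: float) -> float:
    """Hypothetical smooth curve: accuracy rises linearly in log10(params)."""
    # Assumption: 0.50 at 1e6 parameters, rising to 0.95 at 1e11 parameters.
    return 0.50 + 0.09 * (math.log10(params) - 6)

def exact_match(params: float, answer_len: int = 10) -> float:
    """All-or-nothing score: every token must be right (independence assumed)."""
    return per_token_accuracy(params) ** answer_len

# Model sizes spaced evenly on a log axis, as on a semi-log plot.
sizes = [10 ** k for k in range(6, 12)]
for n in sizes:
    print(f"{n:>16,d}  per-token={per_token_accuracy(n):.2f}  "
          f"exact-match={exact_match(n):.3f}")
```

Under these assumptions the per-token column climbs by the same 0.09 at every step, while the exact-match column stays below 0.1 for the first four sizes and then climbs to roughly 0.6, which is the kind of curve that is easy to misread as an ability "emerging".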

And while our understanding of this new technology, like any new technology, will likely improve over time, the realities are thus:

  • no amount of computing power has ever hastened the development of AI technology since research began in the late 60s / early 70s (depending on what you accept as the first paper / first program), it’s always taken improvements in algorithms and the underlying science to make slow, steady progress (with most technologies taking one to two DECADES to mature to the point they are ready for wide-spread industrial use)
  • the technology currently takes 10 times the computing power (or more) to compute “results” that can be readily computed by existing, more narrow, techniques (often with more confidence in the results)
  • the technology is NOT well suited to the majority of problems that the majority of enterprise software companies (blindly jumping on the bandwagon with no steering wheel and no brakes for fear of missing out on the hype cycle that could cause a tech market crash unequalled by any except the dot-com bust of the early 2000s) are trying to use it for (and yes, the doctor did use the word “majority” and not “all” because, while he despises it, it does have valid uses … in creative (writing, audio, and video) applications [not business or science applications] where it has almost unequalled potential compared to traditional ML designed for math and science based applications)

And the market realities that no one wants to tell you about are thus:

  • former AI evangelists and some of the original INVENTORS of AI are turning against the technology (out of a realization that it will never do what they hoped it would, that its energy requirements could destroy the planet if we keep trying, and/or that maybe there are some things we should just not be meddling with at our current stage of societal and technological evolution), including Weizenbaum and Hinton
  • Brands are now turning against AI … and even the Rolling Stone is writing about it
  • big tech and companies that depend on big tech (like Pharma) are starting to turn against AI … and CIOs are starting to drop OpenAI and Microsoft Copilot because, even when the cost is as low as $30 a user, the value isn’t there (see this recent article in Business Insider)

Now, the doctor knows there are still hundreds of marketers and sales people in our space who will consistently claim that the doctor is just a naysayer and against progress and innovation and AI and modern tech and blah blah blah because they, like their companies, have gone all in on the hype cycle and don’t want their bubble burst, but the reality is that

the doctor is NOT against “AI” or modern tech. the doctor, whose complete archives are available on Sourcing Innovation back to June 2006 when he started writing about Procurement Tech, has been a major proponent of optimization, analytics, machine learning, and “AI” since the beginning — his PhD is in advanced theoretical computer science, which followed a math degree — and, after actually studying machine learning, expert systems, and AI, he used to build optimization, analytics, and “AI” systems (including the first commercial semantic social search application on the internet)

what the doctor IS against is Gen-AI and all the false claims being made by the providers about its applicability in the enterprise back office (where it has very limited uses)

because the vast majority of the population does not have the math and computer science background to understand

  1. what is real and what is not
  2. what technologies (algorithms) will work for a certain type of problem and will not
  3. whether the provider’s implementation will work for their problem (variation)
  4. whether they have enough data to make it work

and, furthermore, this includes the vast majority of the consultants at the Big X and mid-sized consultancies who graduate from Business Schools with very basic statistics and data analytics training and a crash course in “prompt engineering” who can barely use the tech, couldn’t build the tech, and definitely couldn’t evaluate the efficacy and accuracy of the underlying algorithms.

The reality is that it takes years and years of study to truly understand this tech, and years more of day-in and day-out research to make true advancement.

For those of you who keep saying “but look at how well it works” and produce 20 examples to prove it, the reality is that it’s only random chance that it works.

With just a bit of simplification, we can describe these LLMs as essentially just super sophisticated deep neural networks with layers and layers of nodes that are linked together in new and novel configurations, with more feedback learning, and structured in a manner that gives them an ability to “produce” responses as a collection of “sub-responses” from elements in its data archive vs just returning a fixed response. As a result they can GENerate a reply vs just selecting from a fixed one. (And that’s why their natural language abilities seem far superior to traditional neural network approaches, which need a huge archive of responses to have a natural sounding conversation, because they can use “context” to compute, with high probability, the right parts of speech to string together to create a response that will sound human.)

Moreover, since these models, which are more distributed in nature, can use an order of magnitude more (computational) cores, they can process an order of magnitude more data. Thus, if there is ten to one hundred times the amount of data (and it’s good data), of course they are going to work reasonably well for expected queries at least 95% of the time (whereas a last generation NN without significant training and tweaking might only be 90% out of the box). If you then incorporate dynamic feedback on user validation, that may even get to 99% for a class of problems, which means that it will appear to be working, and learning, 99 times out of 100 instead of 19 out of 20. But it’s NOT! It’s all probabilities. It’s all random. You’re essentially rolling the bones on every request, and doing it with less certainty on what a good, or bad, result should look like. And even if the dice come “loaded” so that they should always roll a come out roll, there are so many variables that there is never any guarantee you won’t get craps.

And for those of you saying “those odds sound good”, let me make it clear. They’re NOT.

  • those odds are only for typical, expected queries, on which the LLM has been repeatedly (and repeatedly) trained
  • the odds for unexpected, atypical queries could be as low as 9 in 10 … which is very, very, bad when you consider how often these systems are supposed to be used
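
To put those odds in perspective, here is a quick sketch of the arithmetic. It assumes, as a simplification, that every request succeeds or fails independently with the same probability; the function name and the request counts are illustrative choices, not figures from any vendor.

```python
def prob_at_least_one_failure(success_rate: float, requests: int) -> float:
    """Chance that at least one of `requests` independent calls fails."""
    return 1.0 - success_rate ** requests

# Success rates quoted in the text: 99%, 95% (19 in 20), and 90% (9 in 10).
for rate in (0.99, 0.95, 0.90):
    for n in (10, 100, 1000):
        p = prob_at_least_one_failure(rate, n)
        print(f"success rate {rate:.0%}, {n:>4} requests -> "
              f"P(at least one failure) = {p:.1%}")
```

Under that independence assumption, even the 99% case is more likely than not to produce at least one failure within 100 requests.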

But the odds aren’t the problem. The problem is what happens when the LLM fails. Because you don’t know!

With traditional AI, you either got no response, an invalid response with low confidence, or a rare (compared to Gen-AI) invalid response with high confidence, where the responses were always from a fixed pool (if non-numeric) or fixed range (if numeric). You knew what the worst case scenario would be if something went wrong, how bad that would be, how likely that was to happen, and could even use this information to set bounds and tweak the confidence calculation on a result to minimize the chance of this ever happening in a real world scenario.

But with an LLM, you have no idea what it will return, how far off the mark the result will be, or how devastating it will be for your business when that (eventually) happens (which, as per Murphy’s law, will be after the vendor convinces you to have confidence in it and you stop watching it closely, and then, out of the blue, it decides you need 1,000 custom configurations of a high end MacBook Pro in inventory [because 10 new sales support professionals need to produce better graphics] in a potentially recoverable case, or it decides to change your currency hedge on a new contract to that of a troubled economy (like Greece, Brazil, etc.) because of a one day run on the trading markets in a market heading for hyperinflation and a crash [and then you will need a wheelbarrow full of money to buy a loaf of bread; for those who think it can’t happen, STUDY YOUR HISTORY: Germany in 1923, Zimbabwe in 2007, and Venezuela in 2018, etc.]). You just don’t know! Because that’s what happens when you employ technology that randomly makes stuff up based on random inputs from you don’t know who or what (and the situation gets worse when developers [who likely don’t know the first thing about AI] decide the best way to train a new AI is to use the unreliable output of the old AI).

So, if you want to progress, like the monks, leave that Genizah Artificial Idiocy where it belongs — in the genizah (the repository for discarded, damaged, or defective books and papers), and go find real technology built on real optimization, analytics, machine learning, and AI that has been properly researched, developed, tested, and verified for industrial use.

The Gen-AI Crash Can’t Come Soon Enough!

Author’s note: this first appeared as a LinkedIn post, elaborated upon in the comments of THE REVELATOR‘s post it referenced.

$1 trillion rout hits Nasdaq 100 over AI jitters in worst day since 2022!

This is a headline from the Economic Times this week. And a foreshadowing of things to come.

As far as the doctor is concerned, the impending Gen-AI Cr@p market collapse can’t happen fast enough! Too many people don’t remember the 80s and how all the AI “promises, promises, that were made were the promises, promises they betrayed” …

because processing power, new languages/models/constructions, and expert mimicry is not enough!

The reality is that, until we have a fundamentally better understanding of human intelligence, or can at least assemble and properly support as many cores as humans have neurons [which, FYI, we shouldn’t conceive of as we couldn’t produce the energy requirements with current technology globally to power it] (and not the equivalent of a pond snail at best … look where that got us, its golden nugget of insight is we should eat one rock a day), there is zero chance of a new AI “breakthrough” actually approximating anything close to intelligence.

All of the true advancements in our lifetime are going to come from human intelligence (HI!) (that creates better algorithms, models, processes, etc. and then properly, manually, embeds those enhancements in next generation tech).

Remember, they’ve been promising us true AI since the [19]70s … and they are no closer now than the great minds who created, and in their wiser years abandoned, AI (which has materialized as Artificial Idiocy) because some pursuits are still beyond the grasp of mice and men (and others shouldn’t be attempted)!

Every 3 to 5 years they promise us that the brand new shiny tech is the Staples Big Red Easy Button, and every 3 to 5 years this brand new shiny tech fails to deliver. Gen-AI is just the latest in a long line of over-hyped, under-performing tech whose “hype” cycle is almost over, and the next tech that is going to bring us a great market crash (which, given the ridiculous amount of money dumped into this technology which will never be appropriate for the Enterprise, could bring about a crash that might rival the great dot.com crash of 2000; if you don’t remember that, you really should look it up, because a lot of software providers, especially those whose solutions provided limited actual value relative to the investment made [or money wasted on “marketing” and “brand”], bit the dust).

The even sadder reality of the situation is that we don’t need the tech. In almost every business domain, there has been software which, with a bit of manpower and human intelligence, has solved the majority of our current business problems, even the most complicated global supply chain/trade problems. All we had to do was stop using the monolith technology from two-plus decades ago and take a small chance on newer, better, more powerful players who started to solve real problems with software the average Jane could use.

In Procurement, the vast majority of companies aren’t using the tech we had ?????? ???? years ago! (When the doctor built the leading strategic sourcing optimization solution (the first with multi line item support) and THE REVELATOR had a leading ML (machine-learning) based application for ????????? ????? ???????? ?????????????? and realization [for everyone else, think guided sourcing strategy, like Levadata does for electronics, based on market and organizational data, with execution support]).

If the average organization even had this V 1.0 technology, they’d do SO MUCH better across the board (and now we are at V 3.0 in most Procurement applications). In optimization, what I did at Iasta [acquired by Selectica, rebranded Determine, who sunset what they didn’t understand] and consulted on (at other players) afterwards was V2, and Trade Extensions (acquired by Coupa) gave us V3 with full supply chain support and modelling capability beyond your dreams (and now maybe beyond Coupa’s understanding with founders Arne and Fredrik gone … but there are those of us who still understand the phenomenal vision, and realization thereof, of the great Arne Andersson).

Also, the reality is that if anyone understood what Coupa Supply Chain Optimization [Llamasoft] or Logility [Starboard] could do in the right hands … THE REVELATOR‘s parts management dreams and scenario-based Procurement guidance from the late 90s and early 00s would come true.

(And we don’t need no fake-take to make it happen! Proper catalog-enhanced true SaaS solutions have been built with integrated intake for the last decade. You just have to look beyond the same old, same old 10 to 20 vendors that Gartner and Forrester tell you about every year (pretending that the other 656 don’t exist). Vroozi [see our 2-part summary: Part I and Part II] has had this capability since day one, and once all these Gen-AI and fake-take plays come crashing down because they don’t actually enable true Procurement or have any real Procurement capability under the hood, you’re going to see a new generation of true Strategic Procurement providers rise up and offer something that every enterprise, and mid-size enterprises in particular, needs and can benefit from. And when this reckoning comes, it will humble any organization still on one of these powerless platforms. So the time to find a real platform is now!)

If You Still Don’t Believe That Gen-AI is Bad for Procurement …

Then maybe you should do the math.

It’s very expensive for what it doesn’t do. You can pay 10K a month or more just for a conversational interface to search your data or push data into your applications. For 10K a month, you can get a decent core P2P application or source-to-contract application that, well, actually does something.

It’s even more expensive to train these systems on your policies, connect them to your applications, test that basic requests generate reasonable responses, train it to guide your users to get to an eventual answer, and so on. This could easily be more than a year or three of license fees.

But the true costs are in the utilization. Every time a user asks a question, or responds to a question posed by the Gen-AI to try and elicit the user’s intent, it takes compute time. LOTS of compute time. At least 10X the compute time of a standard search engine or keyword based retrieval system. In some cases, 30X. (The wattage required is easily 10 to 30 times traditional Google search.) So if you’re a mid-sized organization with more than 1,000 employees, a portion of your cloud computing costs, which average between 2.4 Million and 6 Million a year (according to CloudZero), is going to increase 10X to 30X. Let’s say 5% of that was basic search and inquiry, 120K to 300K. Almost inconsequential. But multiply it by 10 to 30, and you’ve just added another 1 Million to 9 Million to your bill. Think about that.

That “low-cost” Gen-AI “chatbot” that makes enterprise search and application interface “easy” (but not as easy as a well designed workflow, FYI), and that you think costs 10K a month, could, after implementation, training, and, most importantly, cloud computing costs, actually be costing you 100K a month (or even 500K). For what? A fancier Google?
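
For anyone who wants to check the figures above, here is a rough sketch of the calculation. It uses only the numbers quoted in this post (2.4 to 6 Million in annual cloud spend, a 5% search-and-inquiry share, a 10X to 30X compute multiplier, and a 10K monthly license); the function and variable names are illustrative, and implementation and training costs are deliberately left out.

```python
def added_annual_cost(cloud_spend: float, search_share: float,
                      multiplier: float) -> float:
    """Extra yearly spend if the search/inquiry slice grows by `multiplier`."""
    baseline = cloud_spend * search_share
    return baseline * multiplier - baseline

LICENSE_PER_MONTH = 10_000  # the 10K/month license figure from the text

low = added_annual_cost(2_400_000, 0.05, 10)   # low end of every range
high = added_annual_cost(6_000_000, 0.05, 30)  # high end of every range

for label, extra in (("low", low), ("high", high)):
    monthly = LICENSE_PER_MONTH + extra / 12
    print(f"{label}: +{extra:,.0f} per year, "
          f"about {monthly:,.0f} per month with the license")
```

With these inputs the low end adds about 1.08 Million a year (roughly 100K a month with the license) and the high end adds about 8.7 Million a year (roughly 735K a month), before any implementation or training costs.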

As Procurement professionals, you can, and should, do the math. So even if you don’t believe the doctor when he says Gen-AI is a fallacy, then believe the math.

The math says Gen-AI is just NOT worth it.

Have all the Big X fallen for Gen-AI? Or is this their new insidious plan to hook you for life?

Note the Sourcing Innovation Editorial Disclaimers and note this is a very opinionated rant!  Your mileage will vary!  (And not about any firm in particular.)

Almost every single Big X and Mid-Sized Consulting firm is putting “Gen-AI” adoption in their top 10 (or top 5) strategic imperatives for Procurement, and its future, and claiming that it’s essential for analytics (gasp) and automation (WTF?!?).

It’s absolutely insane. First of all, there are almost no valid uses for Gen-AI in business (unless, of course, your corporation is owned by Dr. Evil), and even fewer valid uses for Gen-AI in Procurement.

Secondly, the “Gen” in “Gen-AI” stands for “Generative” which literally means MAKE STUFF UP. It DOES NOT analyze anything. Furthermore, automation is about predictability and consistency; Gen-AI gives you neither! How the heck could you automate anything? You CAN NOT! Automation requires a completely different AI technology built on classical (and predictable) machine learning (where you can accurately calculate confidences and break/stop when the confidence falls below a threshold).
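
The break/stop idea can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation: the `decide` function, the 0.90 threshold, and the invoice probabilities are all hypothetical, and the class probabilities are assumed to come from a classical classifier that outputs a confidence for each label in a fixed pool.

```python
from typing import Optional

CONFIDENCE_THRESHOLD = 0.90  # hypothetical cut-off; tune per use case

def decide(class_probs: dict[str, float],
           threshold: float = CONFIDENCE_THRESHOLD) -> Optional[str]:
    """Return the most probable label from a fixed pool, or None to stop."""
    label, confidence = max(class_probs.items(), key=lambda kv: kv[1])
    if confidence < threshold:
        return None  # break/stop: hand the case to a human instead of acting
    return label

# Hypothetical classifier outputs for two invoices.
print(decide({"approve": 0.97, "reject": 0.03}))  # approve
print(decide({"approve": 0.55, "reject": 0.45}))  # None
```

The point of the design is that the worst case is known in advance: the answer is either one of the fixed labels or an explicit refusal to act.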

Which raises the question: have their marketers fallen for the Gen-AI marketing bullcr@p hook, line, and sinker? Or is this their new insidious plan to get you on a never-ending work order? After all, when it inevitably fails a few days after implementation, they have their excuses ready to go (the same excuses being given by the companies spending tens of millions on marketing, and the same excuses that have been given to us since Neural Nets were invented): “it just needs more content for training”, “it just needs better prompting”, “it just needs more integration with your internal data sources”, rinse, lather, and repeat … ad infinitum. And, every year it will get a few percentage points better, but if it gets only 2% better per year, and the best Gen-AI instance now is scoring (slightly) less than 34% on the SOTA scale, it will be (at least) 9 (NINE) years before you reach 40% accuracy. In comparison, if you had an intern who only performed a task acceptably 40% of the time, how long would he last? Maybe 3 weeks. But these Big X know that once you sink seven (7) figures on a license, implementation, integration, and custom training, you’re hooked and you will keep pumping in six to seven figures a year even though you should have dropped the smelly rotten Gen-AI hot potato the minute you saw the demo (and asked them for a more traditional enterprise application they can deliver with guaranteed value).
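
The nine-year figure can be checked with a short calculation. This sketch assumes the 2% is a relative improvement that compounds yearly from a starting score of 34 (the post's figure is slightly below 34, which only lengthens the wait); the function name is illustrative.

```python
import math

def years_to_reach(start: float, target: float, yearly_gain: float) -> int:
    """Whole years of compounding relative improvement needed to hit target."""
    return math.ceil(math.log(target / start) / math.log(1.0 + yearly_gain))

print(years_to_reach(34.0, 40.0, 0.02))  # 9
```

After eight years the score would still be just under 40 (about 39.8), so the ninth year is the first in which the target is met.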

So, maybe they aren’t misled when it comes to Gen-AI. Maybe they are just shrewd financial managers because it’s their biggest opportunity to hook you for life since they convinced you that you should outsource for “labour arbitrage” and “currency exchange” (and not materials / products you can’t get / make at home) and other bullsh!t arguments that no society in the history of the world EVER outsourced for. (EVER!) Because if you install this bullcr@p and get to the point of “sunk cost”, you will continue to sink money into it. And they know it.   Or do they?

In our view, the sad reality is that while one or two financial managers may have gone deep enough down the Gen-AI rabbit hole to figure this out, most of them likely just don’t see the downside for them or their clients.  Given all the hype the creators of these Gen-AI* models are pushing, with prolific examples only of success cases and upside, with very little education on the realities (because few of us are highlighting all of the risks of Gen-AI and failures when misapplied), maybe all they are seeing are promises that are just too good to ignore.

So, please, ignore the Gen-AI until you’ve validated a use case and instead remember When You Should Use Big X. Every solution and services provider has strengths and weaknesses. Please use them for their strengths, be successful, and increase the project success rate. (Post-Edit: As of 2024, technology project failure is at an all-time high. We don’t want to see any more of it!)

*Remember that AI, and Gen-AI in particular, is a fallacy.

The Gen AI Fallacy

For going on 7 (seven) decades, AI cult members have been telling us if they just had more computing power, they’d solve the problem of AI. For going on 7 (seven) decades, they haven’t.

They won’t as long as we don’t fundamentally understand intelligence, the brain, or what is needed to make a computer brain.

Computing will continue to get exponentially more powerful, but it’s not just a matter of more powerful computing. The first AI program had a single core to run on. Today’s AI programs have 10,000-core super clusters. The first AI programmer had only his salary and elbow grease to code and train the model. Today’s AI companies have hundreds of employees and Billions in funding and have spent 200M to train a single model … which told us we should all eat one rock per day upon release to the public. (Which shouldn’t be unexpected as the number of cores we have today powering a single model is still less than the number of neurons in a pond snail.)

Similarly, the “models” will get “better”, relatively speaking (just like deep neural nets got better over time), but if they are not 100% reliable, they can never be used in critical applications, especially when you can’t even reliably predict confidence. (Or, even worse, you can’t even have confidence the result won’t be 100% fabrication.)

When the focus was on narrow, focussed machine learning applications and accepting the limitations we had, progress was slow, but it was there, it was steady, and the capabilities and solutions improved yearly.

Now the average “enterprise” solution is decreasing in quality and application, which is going to erase decades of building trust in the cloud and reliable AI.

And that’s the fallacy. Adding more cores and more data just accelerates the capacity for error, not improvement.

Even a smart Google Engineer said so. (Source)