Monthly Archives: February 2026
Claims of Complete Gen-AI Auditability Are Complete BullCr@p
Proponents of Gen-AI will argue that you should go all in on their next-gen LLMs because, unlike current systems and many humans (who are lousy keepers of record), their decisions, like their actions, are 100% auditable. And, again, that’s complete and utter bullcr@p.
You can ask the LLMs to output their reasoning, and you can ask them to log everything they do from the minute you start the interaction. But because the reasoning is all based on probabilistic math at a scale NO human can understand (and for which we have NO measurements yet), you have no idea WHY the LLM reasoned the way it did or IF the Gen-AI will reason the same way on the same request, even if that request is repeated only 5 minutes later!
You can simply search the internet for hundreds of examples of people giving the exact same prompt to the exact same LLM five minutes later and getting a slightly to completely different response.
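To make the "same prompt, different answer" point concrete, here is a toy sketch of weighted random sampling, which is one common way text generators pick their next word. It is not a real LLM: the candidate answers in `TOKENS`, the scores in `LOGITS`, and the function names are all invented for illustration.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(tokens, logits, rng, temperature=1.0):
    """Pick one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical candidates and scores for a single "decision" (assumed values).
TOKENS = ["approve", "reject", "escalate"]
LOGITS = [2.0, 1.5, 1.0]

def ask(rng, times):
    """Send the identical 'prompt' several times and collect the answers."""
    return [sample_next_token(TOKENS, LOGITS, rng) for _ in range(times)]

# Unseeded generator: the same request, repeated, will usually not give
# the same sequence of answers from one run to the next.
print(ask(random.Random(), 10))
```

Fixing the seed makes this toy repeatable, but whether a given hosted model exposes such a control, or honors it, is something you would have to verify for yourself.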
Gen-AI LLMs don’t understand. They don’t actually reason. And they definitely don’t think! That’s why they are NOT auditable. And that’s also why they should NEVER make a decision. (However, since they can analyze more data, and for some tasks have, more often than not, achieved a competence beyond an average human happy to regress in IQ to the late neolithic era, they should definitely suggest decisions with their reasoning. But the LLMs should never, ever, execute on those decisions without human approval.)
The reality is that the only useful thing here is an LLM’s ability to log what was done in an immutable, blockchain-style format, which compares well to an employee who knowingly did something wrong for a bad reason. Since the AI is not intelligent, and doesn’t have ethics, it has no reason NOT to log its reasoning and why an action was taken. But, as per above, the LLM is still NOT auditable.
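For readers wondering what an "immutable blockchain format" log might look like in practice, here is a minimal sketch of a hash-chained log, where each entry commits to the one before it. This is an assumption about the kind of structure meant, not a description of any vendor's product, and all names (`append`, `verify`, `GENESIS`) are hypothetical. Note that it is tamper-evident rather than truly immutable, and that it records WHAT was done, not WHY the model did it.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash, record):
    """Hash the previous entry's hash together with this record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def append(log, record):
    """Add a record, chaining it to the last entry in the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "record": record,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, record),
    })

def verify(log):
    """Return True only if no entry was altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        expected = entry_hash(prev_hash, entry["record"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True

log = []
append(log, {"action": "suggest_award", "supplier": "S-17"})
append(log, {"action": "human_approved", "by": "buyer-42"})
print(verify(log))  # chain is intact at this point
```

Editing any earlier record changes its hash and breaks every link after it, which is what makes after-the-fact tampering detectable.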
If You Have Two “AI” “Agents” Talking to Each Other …
… then, as Stephen Klein of Curioser.AI points out, you have a puppet show, “except instead of sock puppets, we’re using large language models and API loops”!
Just because it happens autonomously, looks social, appears to have an identity, and fakes a dialogue, it doesn’t mean there is anything more to it than the modern equivalent of a puppet show.
Gen-AI is the ultimate show and if P.T. Barnum were alive today, it would be his ultimate circus. But unlike the scarecrow, it doesn’t have a brain. It may have the ability to harness more compute power and data than any algorithm we have developed to date, but it is still dumber than a pond snail.
It has very few valid uses. I’ve discussed some of them before, but let’s make it perfectly clear what little it can actually do:
- natural language processing — and, properly trained, it can not only equal, but even exceed the best last generation tech in semantic and sentiment processing
- large corpus search — while it will never be 100% accurate, it can find just a few potentially relevant documents among millions with few false positives and negatives
- large corpus summarization — again, while it will never be 100% accurate, and most good summaries won’t be top tier, it can summarize large amounts of data, and usually extract just the relevant data in response to your query
- idea retrieval — not generation, retrieval of ideas based on a review and summarization of petabytes of data; very relevant for users dependent on LLMs who are suffering minor to severe cognitive atrophy; with proper prompting this can take the form of
  - strategy / workflow suggestion
  - devil’s advocate
- usage and workflow prediction during application development
- rapid PROTOTYPE generation for usability and efficacy analysis (not enterprise application development)
The reality is that Gen-AI
- cannot reason,
- is not deterministic, and
- is essentially nothing more than a meta-prediction engine;
- is providing ideas based on meta-pattern identification,
- is predicting based on a layered statistical model beyond ANY human understanding, and
- generates code riddled with security issues and possibly even boundary errors;
- and let’s not ignore the fact that hallucinations are a core function that CANNOT be trained out.
This means that often the only way to succeed with Gen-AI is to more-or-less abandon Gen-AI LLMs in production applications except as Natural Language Parsers (as they are easier to train to accuracy levels beyond last generation semantic parsers, which could take months to train to high effectiveness, and I know this from personal experience), and revert to the AI tech that was just reaching maturity and industrialization readiness that I was writing about in the late 2010s. The reality is that, if you are willing to use some elbow grease and put the hours in, you can create spectacular applications with last-generation tech, and then use Gen-AI as a natural language interface layer to simplify utilization, integration, and complex workflows. If you are willing to create the right guardrails, where the Gen-AI LLM can only trigger specific application services with specific data in specific contexts, with HUMAN approval, then you can use it responsibly. Otherwise, it’s a crapshoot as to the results you’ll get.
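As a rough illustration of what "specific application services with specific data in specific contexts, with HUMAN approval" could look like, here is a minimal guardrail sketch. Everything in it is hypothetical: the `ServiceRule` structure, the `create_requisition` service, and the shape of the request the LLM layer is assumed to produce. The `approve` callback stands in for whatever human sign-off step you actually use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

class GuardrailError(Exception):
    """Raised whenever a request falls outside what is explicitly allowed."""

@dataclass(frozen=True)
class ServiceRule:
    handler: Callable[[dict], str]     # the real application service
    allowed_fields: FrozenSet[str]     # the only data it may receive
    allowed_contexts: FrozenSet[str]   # where it may be triggered

def execute(request: dict, rules: Dict[str, ServiceRule],
            approve: Callable[[dict], bool]) -> str:
    """Run an LLM-proposed request only if every check passes."""
    rule = rules.get(request.get("service"))
    if rule is None:
        raise GuardrailError("service is not on the allowlist")
    if request.get("context") not in rule.allowed_contexts:
        raise GuardrailError("service is not permitted in this context")
    data = request.get("data", {})
    extra = set(data) - rule.allowed_fields
    if extra:
        raise GuardrailError(f"unexpected fields: {sorted(extra)}")
    if not approve(request):  # a human says yes or no; nothing runs without it
        raise GuardrailError("rejected by human approver")
    return rule.handler(data)

def create_requisition(data: dict) -> str:
    # Hypothetical last-generation application service.
    return f"requisition for {data['quantity']} x {data['sku']}"

RULES = {
    "create_requisition": ServiceRule(
        handler=create_requisition,
        allowed_fields=frozenset({"sku", "quantity"}),
        allowed_contexts=frozenset({"purchasing"}),
    ),
}
```

The design choice is deny-by-default: the LLM only proposes a request, and anything not explicitly listed, in the wrong context, carrying extra data, or lacking approval never reaches the application.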
For example, you should never use it for negotiation, which can be as much about reading the other person as anything else. This is a very risky application, as the number of soft data points you need for a decent prediction typically far outnumbers what you have available … even for public figures where you believe you have lots and lots of data available to judge their reactions. But hey, if you want to lose your lunch money, and possibly your entire bank account, go ahead and let it act as your buyer (but if it can lose hundreds powering a vending machine, imagine how much it can lose on a seven to nine figure category).
Even though plenty of vendors will provide some very convincing demos that seem to indicate Gen-AI LLMs can do otherwise, don’t fall for the tricks. During the demo, The Wizard of Oz is hiding behind the curtain. The not-so-great thing about LLMs is that, for a very specific set of tasks/situations, they can be overtrained on a very specific corpus to over-perform against those tasks and greatly increase the chances that any demo they deliver to you works fantastically well.
However, what this also means is that you definitely do not want to use the Gen-AI LLM for tasks that are significantly different from the tasks/situations it was over-trained for, as it is going to perform quite poorly at best, and possibly quite disastrously at worst. The reality is that once the puppeteer is no longer pulling the strings, all bets as to efficiency and effectiveness are off.
The Gen-AI ringmasters are employing the same philosophy and same techniques that made some of the early spend auto-classification providers “leaders” with unheard of success rates compared to when the average organization employed similar auto-classification tech and got dismal results. (Because they just didn’t know what “AI” actually stood for!)
Don’t be fooled by the ringmasters. If you want results, lie its AI and buy solutions that work.
It’s Not Outcomes. It’s Capability.
And that’s why outcomes is a dirty word! (Part I and Part II)
More specifically, it’s about capability, knowledge, the ability to be self-sufficient, and continual improvement.
Our rant focussed on the fact that the entire point of “outcome”-based pricing was to not only lure you away from more affordable products and services (especially if you were willing to do just a little bit more yourself), but take away your self-sufficiency, capability, and even knowledge and ensure your entire existence slowly became 100% dependent on the vendor for key processes. That you’d have no choice but to keep using them because you lost the capability to take the function back in-house. That you’d be the next mark in the grift that keeps on taking.
A big problem with “outcomes”, and another reason that it is a dirty word, is that it’s always focussed on “metrics” that have an impact on “the bottom line” today in a manner that the C-Suite can see on the balance sheet. Since the point of a business is to make profit, all of the “outcome”-pricing vendors argue that it’s the right approach.
While you should get “results”, that’s not the only thing you should be measuring, and it should not be the focus of your measurements. Because when you focus only on “results”, the focus is whatever gets you the best results, and, more exactly, what gets you the best results TODAY. That means you will make decisions that will jeopardize the potential for mid-term, and definitely long-term, results in exchange for better results today that will please the client, your boss, the C-Suite, and/or the shareholders.
A great example of the danger of “outcome”-focus is classic sourcing — and the introduction of e-auctions (which are surging again because people forget the long-term impacts of auction over-use) that kicked our space off!
When awards are reduced to lowest price, and the volumes are large enough that a few contracts can sustain a struggling supplier, especially in tough economic times, suppliers will often sacrifice almost all of their margin just to get an award. This results in a great, immediate win for the buyer, who can show a huge savings on the balance sheet, but it’s actually a huge risk. If the supplier sacrifices too much margin and costs rise too quickly, their viability is at risk. If they unexpectedly go out of business, the buyer has to find new supply quickly, and if the market becomes tight, this could skyrocket costs or even result in costly stock-outs or, even worse, production line shutdowns. The savings not only disappear overnight, but costs increase.
And even if the supplier doesn’t go bankrupt, when you go back to market after a few years, if inflation was low, you might save 1% to 2%, but typically the best case scenario is you find someone who can match the price. However, what typically happens is that the price increases, sometimes by a lot! Why? Because the focus was on getting the best price now, versus coming up with a plan to ensure prices, or at least production costs, continued to decrease over time. Instead of looking for a supplier who would continually invest in better technology, renewable materials and energy, process improvement, etc. to keep costs down, you look for a supplier who’ll cut every corner they can to get a good price now.
If you do a strategic engagement and find the first type of supplier, and enter into a long term contract where they know they can continue to invest in improvement, they’ll likely come back with a solution, and a contract, that guarantees a continual cost decrease year-over-year. This would actually benefit you more because not only would you be able to claim an “outcome” every single year, but you know you have a supplier you can count on to deliver!
(And you won’t have to explain the cost increase next time you go to market.)
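If you want to run the numbers on this trade-off yourself, the arithmetic can be sketched as below. Every figure here is invented purely for illustration (a starting unit price of 100, a 15% auction win followed by a 10% increase at re-sourcing, versus a 5% initial reduction followed by 3% a year); none of it comes from real contracts, and the crossover point depends entirely on what you plug in.

```python
def price_path(start_price, yearly_changes):
    """Return the unit price for each year, applying one fractional change per year."""
    prices = []
    price = start_price
    for change in yearly_changes:
        price *= (1 + change)
        prices.append(price)
    return prices

def total_spend(prices, units_per_year):
    """Total spend over the whole period at a constant yearly volume."""
    return sum(p * units_per_year for p in prices)

START_PRICE = 100.0      # assumed current unit price
UNITS_PER_YEAR = 10_000  # assumed constant volume

# Assumed scenario A: big auction win now, price jumps when you re-source in year 4.
auction = price_path(START_PRICE, [-0.15, 0.0, 0.0, 0.10, 0.0, 0.0])

# Assumed scenario B: smaller win now, contracted decrease every year after.
strategic = price_path(START_PRICE, [-0.05, -0.03, -0.03, -0.03, -0.03, -0.03])

print("year 1 unit price :", auction[0], "vs", strategic[0])
print("six-year spend    :", total_spend(auction, UNITS_PER_YEAR),
      "vs", total_spend(strategic, UNITS_PER_YEAR))
```

Under these assumed inputs the auction looks better in year one and worse over six years, and that is before pricing in any risk of supplier failure. Change the inputs and the answer changes, which is the reason to model beyond today's "outcome".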
In order to be a successful business, you don’t have to just profit this year, but you have to profit next year, and the year after that, and the year after that, and so on.
What this really means is that you need to be:
- instituting processes that will allow you to not only be more efficient, but get more efficient (with experience) over time,
- implementing supporting technologies that help you continually increase efficiency, including automation solutions that require less and less exception management,
- increasing your knowledge and capability, so you can always make the best decisions, use the best solutions, and know when a third party can be more efficient or more cost effective (because it’s either a part-time position that’s not worth the hire internally or a function that’s not core to your business and you’d rather it be managed externally until such time as it makes sense to reclaim the function)
- identifying metrics that focus on capturing process improvement, increasing capabilities, capturing knowledge (for future generations of HUMAN employees), and that result in improvement year-over-year
and NOT focussing on destructive one-time outcomes (that will hurt you later, and possibly a lot more than you realize).
Despite what they say, Size Matters! Part II
In Part I, we noted that size really does matter … when you are selecting a ProcureTech or Source to Pay solution, and, in particular, it’s the size of YOUR spend that matters (and not the size of the vendor or even the vendor offering).
We noted that there is no one-size-fits-all, that the three main tiers of organizations (small, mid-sized, large) have three different needs (which are nuanced, especially in the mid-market as going from small-mid to big-mid can require leaps in complexity), and that you should be paying based upon the tier you need.
But with 3 tiers of solutions out there, the reality is that if you select a bigger solution than you need, you’re going to pay a lot more for a much smaller return. And that’s just NOT good Procurement.
So what should you pay?
It’s all based on your size, maturity, and need. Well, we answered this a bit in the past when we did our series on how much should you outlay for source to pay. (Part 1, Part 2, and Part 3.)
In the series we did in 2023, our answer was 120K to 500K+ (a year), and we were mainly focussed on the mid-mid-market upward. The answer today is similar.
Small Enterprise, < 20M in external spend, 12K to 24K a year. All you need is basic e-Procurement and basic process support. Many shareware suites and low-code platforms will allow you to configure a lot of what you need. At 20M, your full savings potential is 2M or less, and you’re likely to only realize a quarter of that in the first year, or 500K. So spending more than 24K on a license (when you’ll also have implementation and support costs) does not guarantee a worthwhile return.
Medium Enterprise, < 500M in external spend, 60K to 240K a year. You need basic sourcing execution and e-Procurement. A baseline solution does enough at the lower end, and an enhanced solution with deep supplier management, deep P2P+, and some compliance and risk capabilities is what you need at the higher end. Here, the potential savings could be as high as 50M at the high end, or as low as 3M on the low end, with a potential opportunity ranging from 1M to 10M in the first year. At the low end, especially considering the personnel costs, you probably don’t want to pay more than 100K to guarantee a return. At the higher end, you could pay a Million for a small suite and get a return, but considering there’d only be a couple of categories where it would deliver any incremental value, why pay a Million when there are solutions for 250K that deliver the same value for 90%+ of activity? (For the few categories where it’s worth it, just hire a consultant with access to specialized tools!)
Large Enterprise, >= 500M in external spend, 480K to 1M+ a year, depending on the particular deep capabilities, specialized modules, and support you need. There’ll be more than enough categories to justify the additional spend, and saving an extra 2% on a 50M category will pay for the increased platform cost!
That’s the rule of thumb. Higher or lower depends upon the expected return. This means that before you spend more, you should work out a realistic ROI. If the return isn’t realistically there, you don’t spend more than you need.
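The rule of thumb and the ROI check can be sketched in a few lines. The tier boundaries and license bands are the ones given above; the 10% savings potential and the one-quarter first-year realization are taken from the 20M small-enterprise example only and are assumptions everywhere else, as is the hypothetical 60K of implementation and support cost in the usage example.

```python
TIERS = [
    # (tier, upper bound on external spend (exclusive), low license, high license)
    ("small", 20_000_000, 12_000, 24_000),
    ("medium", 500_000_000, 60_000, 240_000),
    ("large", float("inf"), 480_000, 1_000_000),  # "1M+": the top figure is open-ended
]

def license_band(external_spend):
    """Return (tier, low, high) yearly license band for a given external spend."""
    for tier, limit, low, high in TIERS:
        if external_spend < limit:
            return tier, low, high
    raise ValueError("external spend must be a finite number")

def first_year_savings(external_spend, potential_rate=0.10, realized_share=0.25):
    """Assumed model: a share of the full savings potential lands in year one."""
    return external_spend * potential_rate * realized_share

def roi(savings, total_cost):
    """Simple return on investment: net gain divided by what you paid."""
    return (savings - total_cost) / total_cost

spend = 15_000_000
tier, low, high = license_band(spend)
total_cost = high + 60_000  # license plus an assumed implementation/support cost
print(tier, low, high, roi(first_year_savings(spend), total_cost))
```

If the ROI that comes out of a calculation like this is not comfortably positive with realistic inputs, that is the signal not to spend more than you need.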
Now I know plenty of vendors will disagree with me, but when solutions exist at all tiers that do everything an appropriately sized buying organization will need at the price points above, why pay more? (Even though the ABC suites will tell you that you should!)
Also, please note, these are license costs with basic support only. If you want or need more support or services, expect to pay more. (And do the ROI on the services before you contract them.)
