Daily Archives: February 19, 2026

If You Have Two “AI” “Agents” Talking to Each Other …

… then, as Stephen Klein of Curioser.AI points out, you have a puppet show, “except instead of sock puppets, we’re using large language models and API loops”!

Just because it happens autonomously, looks social, appears to have an identity, and fakes a dialogue doesn’t mean there is anything more to it than the modern equivalent of a puppet show.

Gen-AI is the ultimate show, and if P.T. Barnum were alive today, it would be his ultimate circus. But like the scarecrow, it doesn’t have a brain. It may harness more compute power and data than any algorithm we have developed to date, but it is still dumber than a pond snail.

It has very few valid uses. I’ve discussed some of them before, but let’s make it perfectly clear what little it can actually do:

  • natural language processing — properly trained, it can not only equal but even exceed the best last-generation tech in semantic and sentiment processing
  • large corpus search — while it will never be 100% accurate, it can find the few potentially relevant documents among millions with few false positives and negatives
  • large corpus summarization — again, while it will never be 100% accurate, and not every summary will be top tier, it can summarize large amounts of data and usually extract just the data relevant to your query
  • idea retrieval — not generation, but retrieval of ideas based on a review and summarization of petabytes of data; very relevant for users dependent on LLMs who are suffering minor to severe cognitive atrophy; with proper prompting this can take the form of
    • strategy / workflow suggestion
    • devil’s advocate
  • usage and workflow prediction during application development
  • rapid PROTOTYPE generation for usability and efficacy analysis
    (not enterprise application development)
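To make the “large corpus search” point concrete, here is a toy relevance ranker using cosine similarity over bag-of-words vectors — a deliberately simplified stand-in for the embedding-based retrieval a real pipeline would use. The corpus, document names, and query are all illustrative:

```python
from collections import Counter
import math

def cosine(doc: Counter, query: Counter) -> float:
    """Cosine similarity between two bag-of-words term vectors."""
    dot = sum(doc[t] * query[t] for t in doc)
    nd = math.sqrt(sum(v * v for v in doc.values()))
    nq = math.sqrt(sum(v * v for v in query.values()))
    return dot / (nd * nq) if nd and nq else 0.0

# Hypothetical three-document corpus; a real system would index millions.
corpus = {
    "doc1": "supplier contract renewal terms and pricing",
    "doc2": "office holiday party planning notes",
    "doc3": "contract pricing escalation clause for supplier",
}
query = Counter("supplier contract pricing".split())

# Rank documents by similarity; the irrelevant doc2 falls to the bottom.
ranked = sorted(
    corpus,
    key=lambda d: cosine(Counter(corpus[d].split()), query),
    reverse=True,
)
```

The point stands even at this toy scale: the ranking surfaces the two contract documents and buries the party notes, which is the “few relevant documents among millions” behavior — always probabilistic, never 100% accurate.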

The reality is that Gen-AI

  • cannot reason,
  • is not deterministic,
  • is essentially nothing more than a meta-prediction engine,
  • provides ideas based on meta-pattern identification,
  • predicts based on a layered statistical model beyond ANY human understanding,
  • generates code riddled with security issues and possibly even boundary errors, and
  • — let’s not ignore it — hallucinates as a core function that CANNOT be trained out.
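The non-determinism point can be seen even in a toy sampler. The sketch below is an illustrative stand-in for LLM temperature sampling, not any real model’s internals: given the exact same “prompt” (here, a fixed probability table over next tokens), the sampled output depends entirely on the random state, and production LLM APIs do not pin that state for you:

```python
import random

def sample_next(probs: dict, rng: random.Random) -> str:
    """Draw one token from a weighted next-token distribution."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Identical input distribution for both runs -- only the random state differs.
probs = {"approve": 0.55, "reject": 0.40, "escalate": 0.05}

rng_a = random.Random(1)
rng_b = random.Random(2)
run_a = [sample_next(probs, rng_a) for _ in range(8)]
run_b = [sample_next(probs, rng_b) for _ in range(8)]
```

Even a 5% tail token like “escalate” will eventually be emitted — which is exactly why sampling-based generation cannot be treated as deterministic business logic.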

This means that the only way to succeed with Gen-AI is often to more-or-less abandon Gen-AI LLMs in production applications except as natural language parsers (they are easier to train to accuracy levels beyond last-generation semantic parsers, which could take months to train to high effectiveness — and I know this from personal experience), and revert to the AI tech that was just reaching maturity and industrialization readiness, which I was writing about in the late 2010s. The reality is that, if you are willing to apply some elbow grease and put the hours in, you can create spectacular applications with last-generation tech, and then use Gen-AI as a natural language interface layer to simplify utilization, integration, and complex workflows. If you are willing to create the right guardrails — where the Gen-AI LLM can only trigger specific application services, with specific data, in specific contexts, with HUMAN approval — then you can use it responsibly. Otherwise, it’s a crapshoot as to the results you’ll get.
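A minimal sketch of such a guardrail layer, assuming hypothetical service names, argument schemas, and a human-in-the-loop approval callback — none of this is a real API, just the shape of the idea:

```python
# Allowlist of services the LLM may *propose*, with expected argument types.
# Service names here are illustrative, not real endpoints.
ALLOWED_ACTIONS = {
    "create_purchase_requisition": {"item_id": str, "quantity": int},
    "lookup_supplier_record": {"supplier_id": str},
}

def validate_proposal(action: str, args: dict):
    """Reject anything outside the allowlist or with the wrong argument shape."""
    schema = ALLOWED_ACTIONS.get(action)
    if schema is None:
        return False, f"action {action!r} is not allowlisted"
    if set(args) != set(schema):
        return False, "argument names do not match the schema"
    for name, expected in schema.items():
        if not isinstance(args[name], expected):
            return False, f"argument {name!r} must be {expected.__name__}"
    return True, "ok"

def execute_with_approval(action: str, args: dict, human_approves) -> str:
    """Only dispatch a validated proposal, and only after a human signs off."""
    ok, reason = validate_proposal(action, args)
    if not ok:
        return f"blocked: {reason}"
    if not human_approves(action, args):
        return "blocked: human rejected the proposal"
    return f"dispatched {action} with {args}"  # the real service call goes here
```

With this shape, the LLM never executes anything directly: it can only emit a proposal, and a free-form hallucinated action like `delete_all_records` is blocked before a human ever sees it.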

For example, you should never use it for negotiation, which can be as much about reading the other person as about the numbers. This is a very risky application because the number of soft data points you need for a decent prediction typically far exceeds what you have available … even for public figures, where you believe you have lots and lots of data on which to judge their reactions. But hey, if you want to lose your lunch money, and possibly your entire bank account, go ahead and let it act as your buyer (though if it can lose hundreds running a vending machine, imagine how much it can lose on a seven- to nine-figure category).

Even though plenty of vendors will provide very convincing demos that seem to indicate Gen-AI LLMs can do otherwise, don’t fall for the tricks. During the demo, the Wizard of Oz is hiding behind the curtain. The not-so-great thing about LLMs is that, for a very specific set of tasks and situations, they can be overtrained on a very specific corpus to over-perform on those tasks, greatly increasing the chances that any demo delivered to you works fantastically well.

However, this also means you definitely do not want to use the Gen-AI LLM for tasks significantly different from those it was over-trained for, as it is going to perform quite poorly at best, and quite disastrously at worst. The reality is that once the puppeteer is no longer pulling the strings, all bets as to efficiency and effectiveness are off.

The Gen-AI ringmasters are employing the same philosophy and the same techniques that made some of the early spend auto-classification providers “leaders” with unheard-of success rates, compared to the dismal results the average organization got when it employed similar auto-classification tech. (Because those organizations just didn’t know what “AI” actually stood for!)

Don’t be fooled by the ringmasters. If you want results, see past the AI hype and buy solutions that work.