Author Archives: thedoctor

Fastest Freeway to Financial Failure? Gen-AI!

Not joking here.

First of all, AI is getting more expensive for coding.

Input-output token pairs, which used to cost pennies per M tokens, are approaching $100/M for high-end models.

An average enterprise app starts at 100,000 lines. It will require 2M output tokens for initial output. It will take at least 5 iterations to get code good enough for the devs to even begin to work with, or 10M tokens. Then you will have to test and debug. Figure another 5 iterations, for a running total of 20M tokens. But this doesn’t include the context history or coding samples required to produce a baseline, integrate a security framework, or account for multiple service-based deployments. This will consume an additional 10X to 30X the token count, and you will require 40M to 80M tokens to produce the app along with an experienced team of senior developers who will have to shore it up, as only 20% of AI-generated code survives unscathed. And then comes the testing, debugging, and QA. This could double the token requirement again.

For coding, which requires about 20 tokens per line, it would, in theory, only require 100,000 tokens to produce 5,000 lines of code, which is the net-new production code you’d expect from a senior developer every year. But given that it will require at least 5 iterations to get something to start with, then all the updates to get it to testing, and then all the testing and debugging, that’s at least 50M tokens as per above. With prices expected to rise (and possibly double) by the time you’re done (at the current rapid rate of token cost increase), that’s $10,000 to $20,000. Not bad in theory, as a senior Dev costs you 10X to 20X that on the low end, but …

As we said before, only 20% of AI code ends up being usable, so you still need a team of devs to review it and fix the major bugs/issues. With 80K lines needing correction, and a top dev only producing 5,000 lines of net new production code a year, you would still need 16 devs. That’s still expensive. You might realize that you only need to fix the critical issues to get your MVP out the door, and cut the team in half because you can stagger the reviews and fixes to issues. And while you think you saved the cost of 12 devs (8 instead of the 20 it would take to write all 100,000 lines by hand) …

As time goes on, you realize there are fundamental flaws in the code. The security framework it chose was an old framework off of an abandoned GitHub code branch that used a lot of methods and procedures that were already marked for deprecation in the next framework release, which hit as soon as you released your code. They all have to be redone. The “multilingual” support is clumsy and requires the manual production of very carefully crafted fixed format text files. The workflow is rigid and not malleable. You wanted it AI friendly, but it doesn’t properly support MCP. And so on.

Then, like so many enterprise app startups are finding, you can’t scale the MVP into enterprise quality, have to scrap it, and rewrite it from scratch. Which means the $10K to $20K in LLM cost and the $800K to $1,600K+ in minimal dev support cost to get the MVP up and running in a production environment was all wasted. Most of your seed money went up in smoke, and you have to start from scratch.
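For anyone who wants to backtrack the math, here is a minimal sketch of the arithmetic in this section. The constants come from the figures quoted above. Two readings are assumptions for illustration, not numbers stated outright: that the $10,000 to $20,000 range means 50M to 100M tokens at a doubled price of $200 per M tokens, and that a senior dev costs $100,000 to $200,000 a year (10X to 20X of $10,000). The function and variable names are hypothetical.

```python
# Figures quoted in the post
TOKENS_PER_LINE = 20         # tokens per line of code
APP_LINES = 100_000          # starting size of an average enterprise app
LINES_PER_DEV_YEAR = 5_000   # net-new production lines per senior dev per year
USABLE_PERCENT = 20          # share of AI-generated code that survives


def llm_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of a token volume at a given price per million tokens."""
    return tokens / 1_000_000 * price_per_million


# Tokens for one full pass over the app
initial_tokens = APP_LINES * TOKENS_PER_LINE

# Assumption: 50M to 100M tokens at a doubled price of $200 per M tokens
low_llm = llm_cost(50_000_000, 200.0)
high_llm = llm_cost(100_000_000, 200.0)

# Lines the devs still have to correct, and the team that implies
lines_to_fix = APP_LINES * (100 - USABLE_PERCENT) // 100
devs_full = lines_to_fix // LINES_PER_DEV_YEAR
devs_mvp = devs_full // 2    # team cut in half for the MVP

# Assumption: $100,000 to $200,000 per senior dev per year
dev_low = devs_mvp * 100_000
dev_high = devs_mvp * 200_000

print(f"Initial output: {initial_tokens:,} tokens")            # 2,000,000
print(f"LLM cost: ${low_llm:,.0f} to ${high_llm:,.0f}")        # $10,000 to $20,000
print(f"Devs needed: {devs_full}, cut to {devs_mvp} for MVP")  # 16, cut to 8
print(f"Dev cost: ${dev_low:,} to ${dev_high:,}")              # $800,000 to $1,600,000
```

Under these assumptions, the LLM bill is a rounding error next to the dev bill, which is why scrapping the MVP hurts so much.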

Second, its performance is much worse when trying to correct/update existing code, where it has to ensure all unit, functional, user journey, workflow, and integration tests still work. This is evidenced by the fact that many companies, like Uber, are now blowing through their annual AI budgets in a quarter. Engineers trying to rely heavily on AI are already spending $2,000 a month! Backtracking the math, it’s easy to see that the amount of project code, documentation, and online (GitHub) samples it has to ingest and compute to create an output, that might not even be 20% acceptable on the first few passes, is astronomical!

Plus, as we’ve explained before, when a dev has to correct up to 80% of the code, you’re losing the efficiency improvement. If a dev is spending 20% of their salary to get you that 20% increase in code lines which, as we’ve also explained before, are still of a worse quality than if that senior dev had written them by hand, that’s not a savings. That’s, at best, net 0.

However, this isn’t taking into account that it will likely have to be refactored or written out in very short order. You won’t get the median 2.5 to 3 year lifespan for a small app or 5 to 7 years for an enterprise framework, you’ll get 0.5 to 1 year — which means you’ll write and re-write each line of code three times as often with the use of AI. Or, in other words, you’ll inadvertently spend three times as much on that code! And your customers won’t pay 3 times as much for an app just because you spent three times what you need to, so bankruptcy will be just around the corner!

Third, it is getting infinitely more expensive for any document processing with a legal ramification.

Judges are now fed up with AI hallucinations and slop. Include AI hallucinations, and you’re getting fined at a minimum, and probably sanctioned.

Even worse, if it takes out a risk mitigation clause or creates an unforeseen risk you didn’t catch, a failure could cost you (hundreds of) millions of dollars that you would have otherwise been protected against if an experienced lawyer had written the contract for you.

Fourth, it’s making us physically AND mentally sick.

The cognitive atrophy is becoming well documented. People aren’t remembering what they wrote even an hour later when they use Gen-AI. They are being lulled into a false sense of security and accepting its outputs, even when those outputs are false and dangerous to their health (and effectively tell them to commit suicide). (But go ahead, eat that poisonous mushroom. The one rock a day it told you to eat will protect you, right?) Average decline in mental acuity and performance after regular use is 17%, which effectively equates to a loss of 17 IQ points. (In comparison, it took us almost 120 years since the Victorian age [before we had industrial revolution technology to make our lives easier or media to dumb us into submission] to lose 14 IQ points.) It’s making our society mentally sick!

Moreover, given how much energy a modern data centre consumes annually (100MW for a hyperscale site, or an amount of energy that would power at least 10,000 greedy American homes for a year) as well as how much water it consumes for cooling (100M+ G, assuming it recycles efficiently, or easily 200M+ G if it doesn’t, which would meet all the water needs of at least 5,000 of those homes per year, if not all 10,000), when energy and fresh water are becoming scarce in first world countries, we’re jeopardizing the well being of 10,000 people for every unneeded AI data centre that we build. Given that there are now about 11,500 data centers consuming about 2% of planetary energy and likely between 0.1% and 1% of available fresh/drinking water, that’s a lot of energy and water being wasted to produce cr@p code and poor documents that can often be produced better by interns*. Especially when, in energy or water stressed areas, these data centers take systems to the breaking point and risk our health due to lack of necessary heating, cooling, bathing, and/or drinking water.

But, even worse, since this energy often comes from grids powered by dirty coal and oil, and the water extracted from desalination plants also requires energy from those same grids, they are polluting the environment to a significantly measurable degree, as they account for somewhere between 0.5% and 1.0% of global CO2 emissions. With the global slowdown in shipping thanks to all the conflicts in the Red Sea and the Strait of Hormuz as well as the lack of water (due to less rainfall) in the Panama Canal, and the rampant increase in data center construction, data centers will soon account for more CO2 production than global (unregulated) shipping, which is the dirtiest industry on the planet. That’s NOT good for our health!

* There’s a reason Builder.ai was successful in its efforts to pass off human-written code as AI for over 7 years. Human-produced code actually works! Even hastily written shoddy code works better than AI-generated code by orders of magnitude!

The Mythical AI ROI!

A few companies claimed ROI from AI. (About 6% if you believe McKinsey or 5% if you believe MIT.)

And by few, we mean a few. One in twenty (1 / 20) is not a lot. And that’s just some ROI, not amazing ROI. Not necessarily enough to justify the elimination of even a single human (that you had hoped to replace), as that human is still generating more ROI than the BS AI you were sold (and making decisions at a much higher success rate).

There’s only one way to get true AI ROI.

1. Stop believing in Artificial Intelligence, realize all the vendors claiming it are only offering Artificial Idiocy, and that the best you can get is Augmented Intelligence.

Repeat

2. Identify a major problem that is hurting.

3. Use your Human Intelligence (HI) to map the current, and required, workflows end-to-end.

4. Identify all the manual steps that could be automated with the right data.

5. Do the hard work of identifying where all the data is, implementing a data orchestration platform to collect it all, and making it forward deployed everywhere it is needed for task automation.

6. Automate each step with the appropriate (A)RPA tool.

7. Implement a workflow orchestration platform to connect all of the steps together to the extent possible, which ensures everything that can be automated with the automation and orchestration tools is automated once the intelligent human provides the right inputs and makes the right decisions.

8. Analyze where humans are still involved and where human inputs and/or decisions can be further automated through the integration of additional (external) data feeds and encoding of the (business) logic the human always uses to make the decision.

9. Analyze what’s left and determine where “AI”, even with a poor accuracy rate and hallucinations, could be helpful to an intelligent human making decisions and acquire small, focussed, specialized model licenses only for those steps.

10. Ensure Augmented Intelligence, connected to your forward deployed data, is available everywhere Human Intelligence (HI) requires it to make a decision.

until all major problems are solved.

One by one. Put the effort in once, do it right, and with modern tech, you’ll never have to do it again.

You only win with AI when you’ve first centralized, validated, and forward deployed your data; implemented deterministic (adaptive) robotic process automation everywhere you can; and identified precise use cases where custom solutions actually provide a benefit (and not just a fairy-tale promise).

You CAN NOT Safely Use LLMs for Contracts or Legal Work!

Darlene Newman recently wrote a great article that makes it abundantly clear why you CAN NOT Safely use LLMs for Contracts or any other document with any Legal implications whatsoever!

Not only can you not train out hallucinations, because they are a fundamental function of the technology, but every time the LLM touches the document, it can (and likely will) corrupt something that was already correct (and reviewed) before.

In other words, you collect all your reference documents, ask it to generate a contract that contains all of your mandatory clauses, addresses all the risks, incorporates the schedule, specifies the requirements, etc. etc. etc. and get back a 50 page document where the section, paragraph, and sentence quality ranges from masterpiece to monkey on crack. You then spend hours (to days) fixing everything and ask the LLM to simply correct spelling, grammar, and ensure key requirements are met in the new/changed sections only (giving it the original document for comparison). The LLM spits out a cleaned up copy, you review all the sections you updated, it looks good, and you send it out.

Little do you know that, because you added an article in one section, shortened a sentence in another section, and improved the grammar in a third section, it decided to rewrite half those sections for you, because it decided the specific requirements you called out for the new sections weren’t addressed enough. In the process, other key requirements are dropped, risk mitigations have been written out, and the contract now heavily favours the other side when something goes wrong. Not at all what you intended, but that’s what you got because you didn’t review all 50 pages with care.

Maybe not too bad if nothing goes wrong, and maybe devastating if it does.

But nothing goes wrong in the short term, so your Legal team decides to use it to try and defend a claim against your company. This is where it goes from bad to much, much worse. You upload the brief, you outline your counterpoints, you upload your supporting documents (including the relevant law and cases you know of), you ask it to find more law and cases relevant to your defense, and ask it to create your first response. You let it chug, go to lunch, and come back to a 60 page, 220 point response with a dozen statutes and two dozen cited cases.

You go through all the law, realize that only 8 of the statutes are (somewhat) relevant, and remove the 3 that aren’t and the fake one the LLM found on the internet. Then you go through all the cases, realize only 14 are actually supporting, 7 are not relevant, and 3 were completely hallucinated, and make the corrections. You mark all the paragraphs that are okay, the ones that need updates, and what updates are needed. You get sign-off on what’s good and what needs updates, and push it through again. It comes back with a couple of new potential statutes, another 8 potential cases, and updates to multiple paragraphs, and you review again. You find one of the statutes potentially relevant, 4 of the cases real and usable, and half of the paragraphs look good. You mark all this, make the updated correction lists, get sign-off, and send it back to the LLM. You don’t notice it also changed 5 of the paragraphs you were completely happy with, changed some quotes to non-existent quotes, and replaced an approved reference with a hallucinated one. This goes on for a few more iterations, where key clauses/references are not rechecked, and you still end up with a 70 page document with a dozen hallucinations, 3 non-existent cases, and faulty logic despite review by multiple senior partners, because no one rechecked what they were happy with in the last iteration. They expected the LLM would not change it, since they explicitly told the LLM not to.

Unlike an intern, who is naturally lazy, tired of working 84 to 112 hour weeks for peanuts, and will happily ignore anything you tell him to ignore, and who is also intelligent (when he chooses to be), the dumber-than-a-doornail LLM recomputes the meaning of inputs on every request, has the same chance of messing up on every request, has the same chance of understanding the request but predicting you were being facetious and actually want it to rewrite the paragraphs chock full of hallucinations, and so on. You don’t notice, submit the brief with $1,000/hour senior partner sign-off, and make a mockery of your firm with all the AI slop (as well as securing it a massive fine from a p!ssed off judge tired of AI slop).

And there’s no way to stop it. It doesn’t matter how detailed your instructions are. It doesn’t matter how much effort you go through to lock parts of the document down with automated input and output checks and re-dos when the LLM screws up. Every time the LLM touches the document, something will corrupt. The only unknown is how detrimental the corruption is.
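To make "automated output checks" concrete, here is a minimal sketch of one: fingerprint every approved paragraph before the document goes to the LLM, then flag any approved paragraph that comes back different. The function names and the sample clauses are hypothetical. It assumes paragraphs keep their positions (no insertions or deletions), and it only detects a silent change after the fact. It does nothing to prevent one.

```python
import hashlib


def fingerprint(paragraph: str) -> str:
    """Hash a paragraph after collapsing whitespace, so reflowed text still matches."""
    normalized = " ".join(paragraph.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def changed_locked_paragraphs(before, after, locked):
    """Return the indexes of approved (locked) paragraphs that differ in the new draft.

    Assumes paragraph positions are stable between drafts; a locked paragraph
    that is missing from the new draft is reported as changed.
    """
    changed = []
    for i in sorted(locked):
        if i >= len(after) or fingerprint(before[i]) != fingerprint(after[i]):
            changed.append(i)
    return changed


# Hypothetical example: paragraphs 0 and 1 were approved, paragraph 2 was sent for edits
approved = [
    "Clause 1: Supplier indemnifies Buyer.",
    "Clause 2: Net 60 payment terms.",
    "Clause 3: Draft wording.",
]
returned = [
    "Clause 1: Supplier indemnifies Buyer.",
    "Clause 2: Net 30 payment terms.",   # silently changed by the LLM
    "Clause 3: Final wording.",          # changed as requested
]

print(changed_locked_paragraphs(approved, returned, locked={0, 1}))  # [1]
```

A check like this would have caught the 5 changed paragraphs in the scenario above, but someone still has to review and restore every flagged paragraph, on every single pass.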

As per Darlene’s post,

Microsoft Research tested 19 AI models across 310 professional documents. They gave each model a document editing task, then another, then another … for 20 interactions in total. Frontier models corrupted 25% of document content by the end.

25%! That’s a lot of corruption of good content. And enough to ensure you get AI slop every time!
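As a rough illustration of what that means per edit, here is a minimal sketch that backs out a per-interaction corruption rate from the quoted figures. It assumes, purely for illustration, that each interaction corrupts the same fraction of whatever content is still intact. The quoted study does not say the damage accumulates that way, and the function names are hypothetical.

```python
def per_edit_corruption(total_corrupted: float, interactions: int) -> float:
    """Constant per-interaction corruption rate that compounds to the given total.

    Assumption: every interaction corrupts the same share of the still-intact content.
    """
    surviving = 1.0 - total_corrupted
    return 1.0 - surviving ** (1.0 / interactions)


def surviving_after(rate: float, interactions: int) -> float:
    """Fraction of content still intact after a number of interactions."""
    return (1.0 - rate) ** interactions


rate = per_edit_corruption(0.25, 20)
print(f"{rate:.2%} of intact content corrupted per interaction")  # 1.43%
print(f"{surviving_after(rate, 20):.0%} intact after 20 edits")   # 75%
```

Under that assumption, each pass corrupts under 1.5% of the document, which is exactly the kind of small change nobody rechecks.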

9 Signs You Were FORCED To Negotiate

Tom Mills, author of Procure Bites, recently gave us 9 signs you were born to negotiate. Now, since, as we said before, some of you are still in organizations where Purchasing is treated as an old-school function, and run by old-school die-hards who still think it’s the (19)80’s, you might be wondering where it came from, because that’s not the negotiating behaviour you’re used to seeing in your Procurement team, who act like wild west gunslingers who win or lose the deal at the poker table. (They are The Good, The Bad, and The Wealthy like their sales peers, after all.)

Tom’s profile might be the profile of a Procurement negotiation professional you want to see, but if your Procurement organization is still the Island of Misfit Toys, that’s not the profile you have. This post is for you, and describes the lead buyer in your Purchasing department who was put there because they didn’t belong (or want to be) anywhere else, and, for one reason or another, the organization can’t (or won’t) get rid of them just yet.

Enjoy!

Like Any Tool, AI Won’t Solve Leadership Problems!

Paul Martyn is right to cringe a little every time he hears a solution provider say:

AI and automation won’t replace employees. It will free them up for more strategic work.

Because there are two fundamental problems with this statement.

1. As Paul points out in his recent article, if strategic work is not already happening, that’s not a technology problem. That’s a leadership problem!

2A. You can’t drop tech in and suddenly become more efficient unless you have all the data and processes in place to support it, and it’s a money-back guarantee you don’t have all of the data and processes in place to support it.

2B. Unless AI stands for Augmented Intelligence, AI will actually consume MORE of your time as you deal with the hallucinations and errors it will create on a regular basis. (Remember, only 1 in 20 organizations is seeing a return on its AI investments, and I guarantee those are the ones that either got tricked into, or simply bought, old fashioned RPA (robotic process automation) that actually works.)

Don’t fall for the spin. If you want strategy:

1. Make sure it’s already happening.

Maybe it’s only 10% of categories going through strategic sourcing, but you have to start somewhere. Then you can increase that percentage as you automate more tactical work.

2. Allocate time to (old-school) automation.

One at a time, pick a very time consuming process ripe for automation. Map it end to end. Redesign it for automation. Automate it. As time frees up, you get more time for strategy and for automating more processes.

3. When the automation effort in time-consuming / painful processes that remain exceeds the expected time return over the next 12 months, look for outside help.

Not before. And that’s how you don’t fall for the spin!
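The rule in step 3 can be sketched as a simple comparison. This is a minimal illustration, assuming both the automation effort and the time saved are measured in hours. The function name and the example numbers are hypothetical.

```python
def needs_outside_help(effort_hours: float,
                       hours_saved_per_month: float,
                       horizon_months: int = 12) -> bool:
    """True when automating in-house costs more time than it returns over the horizon."""
    expected_return = hours_saved_per_month * horizon_months
    return effort_hours > expected_return


# Hypothetical process A: 100 hours to automate, saves 20 hours a month (240 over 12 months)
print(needs_outside_help(100, 20))  # False: keep automating it yourself

# Hypothetical process B: 400 hours to automate, saves 20 hours a month (240 over 12 months)
print(needs_outside_help(400, 20))  # True: time to look for outside help
```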