Category Archives: rants

Chief Sustainability Officer: USA Edition

A version of the graphic below has been making the rounds on LinkedIn for a few months (and the doctor wishes he could point to the original source of this [on LinkedIn], but either Google mis-indexed it [as the link goes to a user’s profile page] or it’s gone), and a more recent version can be found in this post.

These are great … if you are based in the EU. However, they are not so great if you are based in the USA, as outlined in our first quarter post on how in the corporate world, sustainability/ESG is NOT a priority. So, the doctor decided to correct it for you if you are based in the USA. Enjoy!

Gen-AI is Bad for Consulting Firms … But Even Worse For You When the Consulting Firms Blindly Use It!

A recent post on LinkedIn noted how there’s a wave of AI products flooding the consultancy and advisory space and how they are, frankly, mediocre, overpriced wrappers on public models with minimal innovation, if any.

This is sad, but true, and it’s not the worst of it. The worst of it is that some of the Big X firms are training tens of thousands of consultants and f6ckw@ds on these tools to generate hundred-page pitch decks and three-hundred-page strategy and implementation guides of standard, generic, meaningless drivel to deliver to you as “highly tailored guidance and expertise from their leading partners with 20 years’ experience delivering high-value projects” and charge you tens of thousands of dollars for the privilege.

This is especially egregious when you can use free/cheap tools (and I’m talking put it on your personal credit card cheap because you won’t notice a fee that is less than your monthly coffee charge from the coffee shop) to build the exact same pitches, strategy, and implementation guides from the thousands of freely available documents on the web in a few hours with a few generic prompts over a Sunday morning coffee. (And then, when the coffee kicks in, realize it’s all a load of cr@p and put it in the bit bucket, but at least you will know what a load of cr@p looks like in pitch deck, strategy guide, and implementation plan form, will recognize it the next time an overpriced Big X tries to sell it to you for a ridiculous price tag, and will have learned something from the exercise.)

Now that there are companies selling overpriced “custom” products to these consultancies, the situation is only getting worse, especially when the “customization” is just a wrapper with some pre-engineered prompts that aren’t well tested, only work at a point in time, don’t really give the consultancies what they need, and sometimes translate mediocre inputs to inputs that are even worse. Moreover, when you consider the price is sometimes a 100X multiple on the products they build on top of, it’s disgusting. Consultancies are paying more for less, and, in return, you are paying even more for even less!

Which makes no sense when the current publicly available LLM tech is being offered cheap (to try and hook you on it, even though, as we’ve repeatedly explained, the tech is not ready for prime time and will never deliver more than a fraction of what they are promising), and new implementations will get a lot cheaper. Just look at how DeepSeek undercuts the cost by a factor of 100 and gets you 90% of ChatGPT (as long as you don’t mind exposing all of your secrets to the CCP). LLMs are nothing more than fancy next-gen “deep learning” neural networks that construct responses vs. serving up canned responses (which is why hallucinations and lies are a core function, not an error that can be trained out). This gets us closer (but no cigar) to decent natural language processing (NLP) for the express purpose of generating desired outputs from inputs, but not there (and now, in addition to all the false positives and false negatives we had to deal with, we get to deal with hallucinations and lies as well). It’s not secret magic, it’s layers and layers of interconnected statistics and probabilities that no human can understand, in rather standard models that any Theoretical CS or Applied Math PhD can build, and implementations that are better and cheaper are going to keep appearing as time goes on.

This means three things to any consultancy thinking about using these custom “AI” solutions:

  • you still have to be even more tech savvy to use them to any degree of effectiveness
  • it’s not “the art of the prompt“, it’s the art of the training (even though they don’t really learn because they are NOT intelligent) because that determines the maximum level of effectiveness you will ever reach with them (and you need to provide them with sufficient correct data, which needs to be in the high gigabytes at a minimum, and, preferably, in the petabytes)
  • you don’t have to worry about when they are right (enough), which will happen between 90% and 95% of the time with proper training and proper prompting, or when they are obviously wrong, which will happen a very low percentage of the time (say 5% to 9%), but when they are oh so wrong but the response is constructed in a way that is oh so convincing that an above average person in intellect and experience wouldn’t know otherwise (that danger zone between obviously wrong and good enough that is likely only 1% to 2% of the time).

Now remember that your consultants aren’t that tech savvy, and you should know right off the bat incorporating and using these is going to be difficult and time consuming. (There’s a reason we are constantly advising you to be very careful about using Big X for tech selection and tech projects, and that’s because, even though they say it is, it’s NOT their forte. They weren’t built on tech, and they don’t have the best talent in tech — that talent goes to the big tech companies who can offer the 500K salaries to leading devs or the wild-west startups that leading devs think are cool.)

You only have so much clean and complete data you can use for training. You can’t just throw in the 1000s of decks you’ve built as you can’t share work you’ve explicitly created and sold to past clients, and the AI won’t anonymize the decks and suggestions (even though you think it will). It won’t know that “Ford” is the name of your client and might think that “Ford Data” is another term for shallow data and copy sections from that custom strategy straight into your pitch deck for General Motors (and chances are your overworked junior consultant won’t catch it when skimming that 200 page deck with only 2 hours to go before the meeting). And we know what happens then … (and it ends with the consultancy not keeping either client).

It will take a lot of analysis to identify those 1% to 2% of cases where it is very, very wrong but so convincingly right that you will miss some. What happens when you do and give your client advice that explodes in their faces? (We’ll let you answer that one.)
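To put rough numbers on that danger zone, here is a minimal Python sketch. It assumes (a simplification for illustration, not a measured fact) that each generated answer lands in the "convincingly wrong" zone independently with probability `p`, using the 1% to 2% estimate above. The function names are hypothetical.

```python
def expected_convincing_errors(p: float, n: int) -> float:
    """Expected number of convincingly wrong answers among n outputs."""
    return p * n


def prob_at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independent outputs is convincingly wrong."""
    if not 0.0 <= p <= 1.0 or n < 0:
        raise ValueError("p must be in [0, 1] and n must be non-negative")
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # 1% and 2% are the estimates from the text; n is the number of outputs reviewed.
    for p in (0.01, 0.02):
        for n in (10, 100, 500):
            print(f"p={p:.2f} n={n:>3}  "
                  f"expected={expected_convincing_errors(p, n):5.1f}  "
                  f"P(at least one)={prob_at_least_one(p, n):.3f}")
```

Under these assumptions, even at the low end of 1%, a review of 100 generated answers has roughly a 63% chance of containing at least one convincing error, which is why "you will miss some" is the realistic expectation.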

And for you as a consumer, if your consultancy is using this Bogus AI tech, it means that:

  • the situation that results from the solution delivered might be even worse than the situation you started with (as should be evidenced not just by the tech project failure rate that is approaching 92% but the fact that 42% of projects are being abandoned during implementation!)

A solution designed by Gen-AI is not a solution. A real solution is a solution designed by human intelligence that uses real, augmented intelligence, to research and validate that solution. Remember that if you are going to hire a consultant!

When Someone Says “Real AI”, Ask For Details!

We shouldn’t have to remind you, but since too many people are falling for, and buying into, the hype and selecting tech that does not, and can not, ever work, we are going to remind you yet again.

Computers do NOT think!

To think is to direct one’s mind … where one is an intelligent being, not a dumb box. Computers thunk … they compute using algorithms (which are hopefully advanced and encapsulate expert guidance and knowledge, but that is far from guaranteed).

Computers do NOT learn.

Appropriately selected and implemented probabilistic / statistical / machine learning algorithms will improve their performance over time as more data becomes available, but they do not learn. Learn is to acquire knowledge (or skill), and by definition, knowledge can only be acquired by an intelligent being.

Computer Programs Can Adapt …

but there’s no guarantee the adaptation is going to improve their performance under your definition, or even maintain their performance. Their performance could actually decrease over time.

What is critically important is that there are two primary types of algorithms that can be used to create an AI application:

Deterministic and Probabilistic

A deterministic algorithm is one that, by definition, given a particular input will, no matter what, always produce the same output, with the underlying machine always passing through the same sequence of states. As long as you don’t screw up the input, or the retrieval of the output, (and, of course, the hardware doesn’t fail), it is 100% reliable.

A probabilistic algorithm, in comparison, is an algorithm that incorporates randomness or unpredictability into its execution, and may or may not produce the same output given successive iterations of the same input. Nor is there even any guarantee that the algorithm will produce a correct, or even an acceptable, output a given percentage of the time. Well designed, these algorithms may allow for consistently faster computation, better identification of edge cases, or even a lower chance of error, on average, for a certain class of inputs (but with the caveat that other classes of inputs may suffer a higher error rate).

Deterministic algorithms can be relied on to execute certain tasks and functions autonomously with no oversight and no worry. Probabilistic cannot. In other words, you cannot assign a probabilistic algorithm a task for autonomous computation unless you can live with the worst possible outcome of the algorithm getting it wrong. And this is what Gen-AI, and most of today’s “AI” tech, is based on.
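The difference is easy to see in a few lines of Python. This is a toy sketch, not how any particular product works: the invoice amounts, the three decision labels, and the 90/8/2 weights are made up for illustration, and the weighted random pick only stands in for the kind of sampling a probabilistic system performs.

```python
import random


def deterministic_total(amounts):
    """Deterministic: the same input always yields the same output."""
    return sum(sorted(amounts))


def probabilistic_pick(options, weights, rng):
    """Probabilistic: samples one option by weight, so repeated calls can differ."""
    return rng.choices(options, weights=weights, k=1)[0]


# Hypothetical invoice amounts, summed 1,000 times.
amounts = [120, 75, 305]
totals = {deterministic_total(amounts) for _ in range(1000)}

# Hypothetical decision labels and weights; the generator is deliberately unseeded.
rng = random.Random()
options = ["approve", "reject", "escalate"]
weights = [0.90, 0.08, 0.02]
picks = [probabilistic_pick(options, weights, rng) for _ in range(1000)]
counts = {option: picks.count(option) for option in options}

print("distinct deterministic results:", totals)
print("probabilistic results by label:", counts)
```

The deterministic function returns one distinct result across all 1,000 runs. The sampler returns a mix that changes from run to run, and the rare label is not guaranteed to appear at all in any given run. Fixing a seed would make the run repeatable, but repeatable is not the same as correct.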

This is the critical problem with today’s AI-tech and AI-Hype. Especially when a probabilistic system can, by definition, use any method it likes to determine a probability (which may or may not be at all appropriate, since a model is only valid if it accurately captures the “population” dynamics) and may, or may not, be accurate. For some of these situations, it will be the case that neither the company nor the provider of the system will have enough historical data (market situation and outcome) to even attempt to make a reasonable prediction, and there definitely won’t be enough data to know the accuracy, because standard measures of model accuracy (like the Brier Score) tend to require a lot of data, especially if you have a situation where you need to accurately identify rare events, as this could require 1,000 or more “data points” (which, in a typical market scenario, would require enough data to identify the market condition and then the unexpected change).

(And this is exacerbated by the reality that, for many of these situations, one could likely employ more traditional “statistical techniques” like trend analysis, clustering, classical machine learning, etc. to solve much of the problem at hand.)
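For readers who have not met it, the Brier Score is just the mean squared difference between the forecast probability and what actually happened (0 or 1), with lower being better. The Python sketch below uses made-up data (2 rare events in 100 observations) and two hypothetical forecasters to show why small samples tell you so little about rare events.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Made-up history: the rare event happened twice in 100 observations.
outcomes = [0] * 98 + [1] * 2

# Forecaster A never predicts the event; forecaster B always predicts the base rate.
never_predicts = [0.0] * 100
base_rate = [0.02] * 100

score_never = brier_score(never_predicts, outcomes)
score_base = brier_score(base_rate, outcomes)

print(f"never predicts the event: {score_never:.4f}")
print(f"always predicts 2%:       {score_base:.4f}")
```

On this data the two scores come out at about 0.0200 and 0.0196. A forecaster that is blind to the rare event looks almost as good as one that is not, and with only two events in the sample there is no way to tell them apart with any confidence. That is the practical meaning of needing far more data points.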

It’s important to remember that Gen-AI LLMs, which power most of the new (fake) agentic tech, are all probabilistic based (and designed in such a way that hallucinations are a core function that CAN NOT be eliminated), and much of it is complete and utter garbage for what it was designed for, and even worse for tasks it wasn’t designed for (like math and complex analyses). (Every day we see a new example of complete and utter failure, often due to hallucinations, of this tech. For example, you can’t even get a list of real books out of it, as per a recent contribution to the Chicago Sun-Times, which published its Summer Reading List of 15 books, only 5 of which actually exist. And then there are numerous examples of lazy lawyers getting raked over the coals by judges for using ChatGPT to do their homework and quoting fake cases!)

While we do need to augment purely deterministic tech with more adaptive tech that uses the best “statistical techniques” to more quickly adapt to situations, we need to spell out the techniques and restrict ourselves to what is now “classic machine learning” where the algorithms have been well researched and stress tested over decades (not modern Gen-AI powered agentic tech that has worse odds than your local casino). At least then we’ll have confidence and can enforce bounds on what the solution can actually do (to limit any potential damage).

Especially now that we finally have the computing power we need to effectively use tried-and-true “classic” ML/AI techniques that require large data stores and huge processing power for highly accurate predictions. The reality is that even though this tech has existed for at least 25 years, the computing power required made it totally impractical for all but the most critical situations. Twenty-five years ago, a large Strategic Sourcing Decision Optimization (SSDO) model would run all weekend. Today you can solve it in a few seconds on a large rack server (with 64 cores, GB of cache, and high-speed access to TB of storage). The fact that we finally have (near) real time capability means that this tech is not only finally usable in all situations, but finally effective.

[And if vendors actually hired real computer scientists, applied mathematicians, and engineers and built more of this tech, instead of script kiddies cobbling together LLMs they don’t understand, we would be a decade ahead of where we are today.]

A Very Brief History of “Safe” American Inventions and Products

More specifically, a brief history of inventions and products developed, or (primarily) adopted, in the USA as perfectly “Safe” for public use when they were anything but! From the late 1800s to the present day.

Asbestos: large scale mining began in the late 1800s when manufacturers and builders decided it was a great thermal and electrical insulator whose adverse effects on human health were not widely recognized and acknowledged until the (late) 1970s; even today exposure is still the #1 cause of work-related deaths in the world (with up to 15K dying annually in the US due to asbestos-related disease)

Aspirin: as per our previous post, invented in 1897, available over the counter in 1915, it was heavily promoted as the cure all in the 1920s through the 1940s and might have cost us over a hundred thousand lives due to overprescription during the Spanish Flu pandemic alone

Cocaine: from the late 1880s through the early 1910s, your physicians were big fans of the Victorian wonder drug (as per this Lloyd Manufacturing Ad archived on the NIH site) as it was effectively the first local anesthetic the western world knew about (which was endorsed by the Surgeon-General of the US Army in 1886), although the real popularity was in the public, with an estimated 200,000 cocaine addicts in the US by 1902; still, it was 1914 before it was restricted to prescription use, 1922 before tight regulations were put in place, and likely the late 1940s before prescription and dispensation finally came to an end; moreover, it was generally viewed as harmless and non-addictive until crack emerged in 1985 (even though the number of cocaine related deaths in the US climbed to 2 per 1,000 in 1981)

DDT: (this is particularly relevant to Gen-Z who are fully on-board the Gen-AI hype train) developed in the 1940s as the first modern synthetic insecticide, Gen Z’s grandparents and great-grandparents used to run through DDT clouds that were sprayed in the streets of your cities and towns in the 1940s through the 1960s, as the first health risks were not reported until roughly 1962 when Rachel Carson published Silent Spring, and it wasn’t until 1972 when the US banned it for adverse effects on human health (as well as the environment); to this day, we’re still not sure how many deaths it has contributed to, although the UN estimates 200K people globally still die from toxic exposure to pesticides, of which DDT was the first and the precursor to many newer derivations (Source)

PFAS, inc. PTFE (Teflon): developed by DuPont in 1938, spun off into Chemours, it found use as a lubricant and non-stick coating for pans, and was produced using PFOA (C8), which we now know (and should have known much sooner, but there was a massive PFAS cover up) is carcinogenic (but only for the last decade or so as it was only classified as such in 2013 even though we should have known by the late 1990s) but they still aren’t banned (even though legislation was proposed last year to phase them out over the next decade); because of the cover ups and lack of studies until recent times, we still don’t know how deadly this was, and is, but estimates are that PFAS likely killed 600K annually between 1999 and 2015 and 120K annually after that in the USA (Source) … WOW!

Tobacco: in the 1950s, cigarettes were advertised as good for you with Doctor (Camel Advertisement) and Dentist (Viceroy Advertisement) recommendations on the ads! Despite the fact that health risks had been known since 1950 (when the first epidemiological study showing an association between smoking and lung cancer was published by Wynder and Graham), minors in the USA could still buy cigarettes until 2009 … even though Tobacco likely killed over 100 Million people globally in the 1900s (Source)

etc.

We could go on, but the point is this: like most cultures, the USA is not good at picking winning technology that is safe for everyday use, or at least safe enough under appropriately designated usage conditions.

There’s a reason that most countries have harsh regulations on the introduction of new consumer products and technologies that US lobbyists and CEOs scream about, and that’s because more mature countries (which have been around longer than a mere 249 years) understand that no matter how safe something seems, every advancement comes at a cost, every invention comes with a risk, and every convenience comes at a price — and until we know what we are paying, when we need to pay it, and how much we are going to pay, we shouldn’t rush in head first with blinders on.

And while we might still get it wrong, the reality is that we’re more likely to get it right if we take our time and properly evaluate a new technology or advancement first, and even if we get it partially wrong, as in the case of Aspirin, at least the gain should outweigh the cost. For example, even though it can be argued Aspirin was rushed to market, when used in proper doses, the side effects for the vast majority of the population are typically much less than the anti-inflammatory benefits as, for decades, there was no substitute. Even if it gave a person stomach irritation or minor ulcers, if it was life-saving, then that was a reasonable cost at the time.

However, in the cases of DDT, PFAS, and Tobacco, there was no excuse for the lack of research, and, in some cases, the prolonged cover up of research that indicated that maybe the products were not safe but, in fact, very deadly, and since they brought no significant life saving benefits (Malaria wasn’t a big concern in the USA; people were cooking with butter, lard, and oils for centuries; and, in small quantities, both alcohol and cannabis were known to not only be safer, but even medicinal in the right quantities), there was no need to rush them to market.

The simple fact of the matter is that no tech — be it chemical/medicinal, (electro-)mechanical, or computational — can be presumed safe without adequate testing over time, and that’s why we need regulations and proper application of the scientific method. A lack of apparent side effects doesn’t mean that there are none. That’s why we have the scientific method and mathematical proofs (for confidence and statistical certainty), which is something today’s generation doesn’t appear to know a thing about (especially if they just did a couple of years of college programming) as they’ve probably never been in a real lab [or played with uranium like their grandparents because it was legal in the USA to sell home chemistry kits with uranium samples to children in the 1950s, and these kits included the Gilbert U-238 Atomic Energy Lab] and more than likely don’t know the rule of thumb that you should generally add the acid to the base (and not vice versa because, otherwise, this could happen) and that you should definitely add the acid to whatever liquid [typically water] you are diluting it with.

Regulations exist for a reason, and that reason is to keep us safe. The Hippocratic Oath should not be restricted to doctors and the Obligation of the Order should not be restricted to engineers. Every individual in every organization bringing a product to market should be bound by the same, and regulations should exist to make sure that all organizations take reasonable care in the development and testing of every product brought to market, real or virtual. (This doesn’t mean that every product needs to be inspected, but that regulations and standards exist for organizations to follow, and those caught not following the regulations should be subject to fines that would ensure that not just the company, but the C-Suite personally, was bankrupted if the company was found to have ignored the regulations.)

While Gen Z might like the Wild Wild West (which the USA never grew out of) as much as Gen X who created the dot com boom, we need to remember that the dot com boom ended in the dot com bust in 2000, and that if this new generation continues to latch on to AI like Boomers would latch on to blankies and teddies, it just means they are doomed to repeat the mistakes of their grandparents (and will bring about a tech market crash that makes the dot com bust look like a blip). You’re supposed to learn from history, NOT repeat it!

Got a Headache? Don’t Take an Aspirin or Query a LLM!

Yesterday we provided you with a brief history of Aspirin, the first turn-of-the-century miracle drug that was both society’s salvation and sorrow, though the latter wouldn’t be known for more than half a century. As we discussed, it was hailed as a miracle and life-saving drug that could be used for everything from the common cold to global pandemics. And it worked, for a price. That price, when it needed to be paid, was usually one of many, many side effects which were often minor and insignificant compared to the perceived benefit the drug was bringing, except when they weren’t and they inflamed ulcers and/or increased gastrointestinal bleeding and created a life threatening situation, caused hyperventilation in a pneumonia patient, or induced a pulmonary edema and killed the patient. While the death rate even at the height of over-prescription was likely only 3%, and less than a 10th of that today, it’s still not good.

The reason for this, as we elaborated in our last post, is that, like many of the breakthrough technologies that came before, it was not only rolled out before the side effects, and more importantly, the long term effects, were well understood, but before even the proper use for the desired primary effects was well understood (as evidenced by the fact that the best physicians were routinely prescribing two to four times the maximum safe dosage during the Spanish Flu Pandemic almost 20 years after first availability). While there were benefits, there were consequences, some of them severe, and others deadly.

Medicine is as much a technology as a new mode of transportation (boat, automobile, airplane, etc.), a new piece of manufacturing equipment, a new computing device, or a new piece of software.

Now you see the point. Every breakthrough tech cycle is the same, whether it is medicine, farm machinery, the airplane, or modern software technology, and this includes AI and definitely includes LLMs like ChatGPT.

As Aspirin proves, even if the first test seems to be successful, there’s always more beneath the surface. Especially when the population numbers in the billions and every individual could react differently. Or, in the case of an LLM, billions of people who have thousands of queries, the large majority of which have never been tested, and all of which could generate unknown results.

Moreover, there have not been significant large-scale independently funded academic studies that we can use to understand the true strengths and weaknesses, truths and hallucinations, and appropriate utilization of the technology. As Mr. Klein has pointed out in a recent LinkedIn post that asked who funded that study, over 80% of AI industry “studies” are funded by undisclosed sources, and most of them, like most industry studies these days (see Mr. Hembitski’s latest post) don’t contain good data on demographics, sample size, test material, or potential bias.

That would be the first step to trying to get a grip on this technology. The next step would be to create reasonable measures that we could use to appropriately define technology categories and domains for which we could identify tests and measures that would give us a level of confidence for a given population of inputs or usage. If you consider a traditional (X)NN (Neural Network), which has a fixed set of outputs and is designed to process inputs from a known population, we have developed methodologies to determine the accuracy of such models with high confidence through testing and random sampling with sufficiently sized data sets using appropriate statistical models. Furthermore, mathematicians have proved the accuracy of those models for a given population and we know that if appropriate tests have demonstrated 90% accuracy for a population with 98% confidence, the model is 90% accurate with 98% confidence when used properly.
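As one concrete illustration of what such a guarantee looks like, the Python sketch below uses Hoeffding’s inequality, a standard (and conservative) bound, to relate test-set size, margin of error, and confidence for a fixed-output classifier. It is our choice of example, not necessarily the methodology a given vendor or researcher would use, and it assumes the test cases are drawn at random and independently from the same population the model will see in use.

```python
import math


def hoeffding_sample_size(margin: float, confidence: float) -> int:
    """Test cases needed so measured accuracy is within +/- margin of the
    true accuracy with at least the given confidence (Hoeffding bound)."""
    if not 0.0 < margin < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("margin and confidence must be strictly between 0 and 1")
    delta = 1.0 - confidence
    return math.ceil(math.log(2.0 / delta) / (2.0 * margin ** 2))


def hoeffding_margin(n: int, confidence: float) -> float:
    """Margin of error achieved by n random test cases at the given confidence."""
    if n <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("n must be positive and confidence strictly between 0 and 1")
    delta = 1.0 - confidence
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


if __name__ == "__main__":
    # 98% confidence is the figure used in the text; the margins are illustrative.
    for margin in (0.05, 0.02, 0.01):
        n = hoeffding_sample_size(margin, 0.98)
        print(f"margin +/-{margin:.2f} at 98% confidence needs {n} random test cases")
```

The bound only holds for a known, fixed input population and a well-defined notion of a correct output. Open-ended prompts sent to an LLM satisfy neither condition, which is exactly the gap the next paragraph describes.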

We have no such guarantees for LLMs, nor any proof that they are reliable. “It worked fine for me” is NOT proof. Vendors quoting nebulous client success stories (without client names or real data) is not proof. Moreover, the fact they raised millions of dollars to bring this technology to market is definitely not proof. (All a raise proves is that the C-Suite sales team is very charismatic and convincing and great at selling a story. Nothing more. In fact, fund raising would be more honest if securities law allowed fund raising via poker and takeover protection via gunfighting, as imagined in the season two episode of Sliders “The Good, the Bad, and the Wealthy“. At least then the shenanigans would be out in the open.)

The closest thing out there to a good industry study on LLMs and LRMs is likely Apple’s newest study, as summarized in The Guardian, where they find that “standard AI models outperformed LRMs in low-complexity tasks while both types of model suffered “complete collapse” with high-complexity tasks“.

The study also found that as LRMs neared performance collapse they began “reducing their reasoning effort” and that, if the problem was complex enough, even when provided with an algorithm that would solve the problem, the models failed.

Still, we have to question this study, or more precisely, the release of this study (especially given the timing). Did Apple do it out of genuine academic interest to get to the bottom of the technology claims, or are they doing it to cast doubt on the competition as rivals are claiming they are behind in the AI race (and thus they are focussing only on the negatives of the technology to show that their competition doesn’t have what it claims to have and that they are thus not behind)?

The point is, we don’t understand this technology, and that fact should scream louder in your head every day. Look at all the bad stuff we’ve discovered so far, and it’s likely we’re not even close to being done yet:

Yes, there is potential in the new technology, as there is with all discovery, but until we fully understand what that is, how to use it safely, and, most importantly, how to prevent harm, we should approach it with extreme caution and we should most definitely not let it tell us how to run our business or our lives, or else, like an Aspirin overdose, it might just kill us. (And remember, Aspirin was studied for 18 years before it was made available without a prescription, and deadly side effects and prescribed overdoses still happened. In comparison, today’s LLMs and LRMs haven’t been formally studied at all, and the providers of this technology want you to run your business, and your life, off of them in next-generation agentic systems. Think about that! And when the migraine comes, remember, don’t take Aspirin!)