
Sponsored Posts that make you go UGH! (AI Contract MISmanagement!)

Today’s post is brought to you by the letters W, T, and F and inspired by this Spend Matters guest article by Matt Lhoumeau on The Last Contract Lawyer.

According to Matt, the legal profession is experiencing its iPhone moment because your competitors are closing deals in 26 seconds (and I certainly hope not!) using AI that outperforms human lawyers by 10% in accuracy (on what scale?!?). More specifically, he claims AI can complete a contract review in 26 seconds (spoiler: it can’t) while a human takes 92 minutes (on average, I assume) and, furthermore, that this will cost you up to $6,900 (and this math makes no sense if the lawyer is only spending 92 minutes, because even top-tier lawyers will generally only charge $500 per hour for a contract draft or review, so what’s the other $6,150 for?).

Anyway, the most UGH! part of this article is not these false claims, it’s the missing information. Why is this the most UGH!? Because most of the claims the article makes are true, and when you tie all these claims together, if you don’t understand what this technology can’t do, and what risks it brings to the table (which is the missing information I refer to), you’re likely to believe the claims, join the AI religion, go all in on AI-CLM, and fire all your contract review lawyers. (And while I am no more fond of lawyers than the next guy, I am no less fond of them either, especially when they have a critical role to play.)

You see, the right AI engine (not ChatGPT) can:

  • process a contract in an average of 26 seconds or less and perform a (very) large number of contract review tasks during that time
  • cut approval times by 50%, and reduce overall review cycles (which can easily add up to a calendar year for an organization that needs to review 500 contracts) to a small fraction of that (down to a few weeks to a few months)
  • do more accurate pattern recognition than most humans, including “experts”
  • significantly reduce outside counsel spend

And the benefits, when deployed properly, can be as great as the article claims. But this is the key — deployed properly. And there is no discussion of how you do that. The only piece of counter-information in the entire article is a reference to a Stanford Law School research study (the one that puts AI on Trial) which notes that AI tools using retrieval-augmented generation still hallucinate in 1 out of 6 benchmarking queries (and yet these tools somehow outperform human reviewers on standard contracts? Really?).

As we wrote earlier this year when we told you Don’t Kill All the Lawyers (and reminded you a couple of months later in our post that said you should embrace Legal tech … backed by lawyers), we’ve reached the point that you should (almost) never use a lawyer to:

  • draft a contract
  • review a contract for standard clauses, terms, and conditions
  • locate the relevant statutes
  • summarize your obligations
  • summarize your incident response options
  • etc.

because a tool can take your templates, standard terms and conditions, RFP, and negotiation summary and draft a better contract than most paralegals; ensure all of your standard terms and conditions are in there, or review counter-party paper to ensure the same; review the redline you receive (or are planning to give) and determine which changes are good, bad, or indifferent for you; and then run the final contract through a standard risk-assessment agent to identify whether the contract contains any known risks and flag anything that needs to be addressed, and do all of this better than a lawyer.
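To make the "standard clauses, terms, and conditions" part concrete, here is a minimal sketch of the kind of checklist scan such a tool automates. It is purely illustrative: the clause names, patterns, and file name are made up, and a real engine would use trained semantic models rather than keyword patterns:

```python
import re

# Hypothetical checklist: standard clauses and crude textual patterns that
# signal their presence. A real engine uses semantic models, not regexes.
STANDARD_CLAUSES = {
    "limitation of liability": r"limitation of liability|liabilit\w+ .{0,40}capped",
    "indemnification":         r"indemnif(y|ication)",
    "termination for cause":   r"terminat\w+ .{0,40}for cause",
    "governing law":           r"governing law|governed by the laws of",
    "confidentiality":         r"confidential",
}

def scan_contract(text: str) -> dict:
    """Report which standard clauses appear to be present or missing."""
    lowered = text.lower()
    return {
        clause: bool(re.search(pattern, lowered))
        for clause, pattern in STANDARD_CLAUSES.items()
    }

if __name__ == "__main__":
    with open("counterparty_draft.txt") as f:   # hypothetical input file
        results = scan_contract(f.read())
    for clause, present in results.items():
        print(f"{'ok     ' if present else 'MISSING'} {clause}")
```

This is exactly the kind of mechanical, checklist-driven work that eats up a human reviewer's 92 minutes and that a machine can do in seconds.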

But what the tool absolutely, positively, cannot do is:

  • determine if the mitigations to known risks are sufficient in the particular instance addressed by the contract
  • determine if there are any unique/non-standard risks that need to be addressed (that your existing checklists, templates, and review agents wouldn’t know about or check for)
  • determine if there are any unique requirements for a contract with a supplier in a new jurisdiction that could require special considerations around key clause phrasing or standard risk mitigations
  • have any justified confidence in anything beyond what its models were built and trained on

You still need the human review, at least where it counts. And that’s the part you have to understand — and the part the referenced article doesn’t address at all.

If you’re a company doing a Billion dollars in business a year and signing over 10,000 contracts a year, you certainly don’t want to still be doing end-to-end manual reviews, as that would be a minimum of 2 million minutes of review time (at roughly 200 minutes per contract across drafting, review, and redline cycles), or the full-time attention of almost 20 lawyers. Wasteful and completely unnecessary.

In fact, since you’re doing a Billion dollars or more (and likely 20 times that if your company is a Fortune 100),

  • you probably don’t want to manually review any contract under a threshold (say $100,000) unless it is flagged as a high risk,
  • you probably don’t want to spend more than an hour on a review of any contract under a larger threshold (say one million dollars) unless it is flagged as medium risk,
  • you don’t want lawyers to read the remaining contracts end-to-end reviewing every clause and comparing those clauses against every checklist when it’s only the risks and unique requirements of the contract that require human intelligence

because limiting low-value contracts to manual review only when flagged high risk, limiting low-to-mid-value contracts to manual review only when flagged medium risk or higher, and reserving the costly (but valuable) review time for the high-value or potentially high-risk contracts will not only cut costs by 60% or more, but increase the value of the manual exercise.
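As a sketch, that triage policy is just a few lines. The dollar thresholds and risk labels below are the illustrative ones from this post, not a recommendation:

```python
def route_contract(value_usd: float, risk: str) -> str:
    """Route a contract to the right level of review.

    value_usd is the contract value; risk is the AI engine's flag
    ('low', 'medium', or 'high'). Thresholds are illustrative.
    """
    if risk == "high":
        return "full manual legal review"       # high risk: always reviewed
    if value_usd < 100_000:
        return "automated review only"          # low value, not high risk
    if value_usd < 1_000_000:
        if risk == "medium":
            return "full manual legal review"
        return "capped one-hour manual review"  # mid value, low risk
    # high value: manual, but targeted at risks and unique requirements,
    # not an end-to-end read of every clause against every checklist
    return "targeted manual legal review"

# e.g. route_contract(50_000, "low")    -> "automated review only"
#      route_contract(250_000, "low")   -> "capped one-hour manual review"
#      route_contract(5_000_000, "low") -> "targeted manual legal review"
```

The point is not the exact cutoffs; it's that the routing is deterministic and auditable, and the expensive human minutes land only where they buy something.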

Especially if those contracts are indexed by a natural language system that allows the lawyer to ask key questions about the clauses that are in there, bring up the clauses she is interested in for review, identify any processing flags, and apply her unique insights into the domain, jurisdiction, and business risks, either confirming the contract accurately addresses all of them or focussing her time on the right additions and modifications. For example, she might realize that the contract for on-site support in the nuclear power plant is extremely risky and the company’s across-the-board liability insurance requirement of $5 Million is just not enough, or realize that the AI safety requirements are not enforceable in the US and instead insist that the agreement be shifted to the Irish sub-entity and that jurisdiction apply, and so on. A check-the-box system won’t catch these things (as it can only look for risks it knows of and check boxes that have been identified), and neither will an open LLM (where you have no idea of the quality of the training, how much it is hallucinating, or, even worse, whether it is deliberately lying to you).
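As a toy illustration of that interaction pattern (the clause data, contract IDs, and flags below are invented, and a real system would use semantic retrieval rather than keyword matching):

```python
import re

# Each indexed clause carries its text plus any flags raised during processing.
CLAUSE_INDEX = [
    {"contract": "C-1042", "clause": "liability",
     "text": "Liability capped at $5,000,000 across all claims.",
     "flags": ["cap may be too low for high-hazard sites"]},
    {"contract": "C-1042", "clause": "governing law",
     "text": "Governed by the laws of Ireland.",
     "flags": []},
]

def ask(question: str) -> None:
    """Crude keyword retrieval standing in for natural-language search."""
    terms = [t for t in re.findall(r"[a-z]+", question.lower()) if len(t) > 3]
    for entry in CLAUSE_INDEX:
        haystack = (entry["clause"] + " " + entry["text"]).lower()
        if any(term in haystack for term in terms):
            print(f"[{entry['contract']}] {entry['text']}")
            for flag in entry["flags"]:
                print(f"    FLAG: {flag}")

ask("what is the liability cap?")
```

The lawyer gets straight to the clause and its flag in seconds, and her expensive judgment is spent on the nuclear-plant question, not on finding the clause.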

You still need a lawyer. Because, while it is an iPhone moment, it’s an iPhone moment for lawyers, who will be using the tech to help them focus on what’s important in the review stack and what isn’t (even if you aren’t). Because if the worst case is that you might lose an average of 10K to 50K here and there on every 100th contract in exchange for saving 10 Million on legal contract reviews and related matters (10 lawyers from outside counsel at an average of one million a year), that’s a worst case of roughly a 2M loss (100 of your 10,000 contracts at an average 20K hit) in exchange for a 10M savings: a 5X return. And you know you won’t have many large losses because you’ll be able to focus legal review on the contracts that matter in dollar value or risk rating, not the contracts that don’t. And, all of a sudden, a close legal review of key contracts becomes a luxury you CAN afford!

When It Comes to Gen-AI, I’m NOT Yelling Enough! Part III

AI in every Barbie, Ken, & Action Figure. What could possibly Go Wrong?

Simple Fact: If you want to truly manage risk, think of the worst possible situation that can occur. Then realize that is the BEST CASE SCENARIO. And try again. Repeat until you are literally shaking in your boots at the thought of what could happen. Only then have you identified the real risk!

Mattel has signed a deal with OpenAI to put AI in all its toys, which target the K-12 demographic. (Dot.LA)

The White House has pledged AI education for K-12. (Babl.AI)

So what’s the worst that can happen?

It’s NOT the stunted mental development that will likely result. (If it erodes critical thinking skills and leads to cognitive decline in fully developed adults, what will it do to children??? [Time] For those with certain developmental disabilities, it could mean reading, ’riting, and ’rithmetic are a thing of the past! [And how long did we struggle to get near universal literacy in first world countries?])

It’s the EXPLOITATION. And not just exploitation by manufacturers, who will also sign deals with OpenAI and other AI providers to ensure cognitive and emotional dependence on their products [ Futurism ], but exploitation by hackers. [And we know cybersecurity isn’t keeping up. In most first-world countries, 44% to 74% of all businesses have suffered a successful cyber attack, with up to 94% of businesses being targeted! That’s right: in the least targeted of the first-world countries, over 44% of businesses have been hacked in the past year; in the most targeted, nearly 75%.]

First, let me remind you that sleeper code can be injected into these models. [ Cornell ]

Now let me remind (or, for most of you, inform) you that over 300 Million children a year are victims of technology-facilitated exploitation and abuse. [ ChildLight.org ]

Getting the picture yet? Just imagine what these transgressive deviants are going to do with these talking “toys” that will be by your child’s side every waking minute of every day (due to the dependence described above). (Or, if you want to keep your sanity, don’t — but acknowledge this reality nonetheless.) Predatory grooming will reach whole new levels. Terrified yet? You should be!

Don’t give me any BS that these systems will have heightened security (because toy manufacturers have no clue how to secure advanced software systems, and software providers have no incentive to do any more than meet the minimum regulatory requirements) or that testing will prevent it (as the research paper above showed that it won’t). Hackers and transgressive deviants never give up. Let’s not give them yet another tool to exploit!

So now will you join me in Declaring war on Open-Access LLMs?!?

Postscript:
And, FYI, it won’t necessarily be Mattel and Hasbro that you will have to worry about (too much). They will (hopefully) be under heavy scrutiny. But there are over 2,700 businesses in the toy, doll, and game manufacturing industry in the USA and somewhere between 5,000 and 10,000 toy factories in China. There’s no way we can monitor even a fraction of them!

Big X Consultancies Peddling AI BS Will Flatten Procurement, but AI Certainly Won’t!

In a LinkedIn post from weeks past, THE PROPHET states that AI Will Soon Flatten Procurement and Operations Consulting.

He makes a good argument, but there is one big flaw. Namely, with AI there is no:

“Strategy, trust building, and decision support”.

You can’t use Gen-AI for decision support when every reference it gives you can be a 100% fabricated hallucination, based on data it also hallucinated, from sites and authors it also hallucinated (with complete back stories hallucinated as well). In other words, unlike pure number crunching in a classical ML-based platform, it’s NOT dependable. When it works reasonably well, it can sometimes get you started in the right direction, but it certainly won’t replace entire strategy teams … at least not if the teams are any good. (However, if they are recent grads with no actual real-world experience, then, sure, go ahead. It’s not like it will do much worse than the bench of drunken plagiarist interns who don’t know a brake shoe from an athletic shoe when making a pitch to a client, or that a menswear site is NOT a good demonstration for a facilities manufacturer that needs MRO. And the doctor is not making this last part up; he’s seen it! More than once!)

At the end of the compute cycle, there’s no strategy in whatever “strategic plan” it produces, just computations based on its perceptions of probabilities. Big companies change by shifting the trends, not by following them, and definitely not by failing to validate them. And if you are paying 20M to 50M for a major transformation consulting engagement with a Big X, which would fund a new Broadway performance every quarter for your entire workforce (and give them a morale boost that would likely lead to slightly better performance, especially when compared to the failed transformation effort you will end up with if it is AI driven), you don’t want to follow the crowd. You want to lead the market, and do so in a big way.

Moreover, there’s definitely no trust if everything you do is based on a soulless, clueless algorithm that, right now, has a chance of failure approaching 92% (and predictions that are cloudy with a chance of meatballs) and, when it does work, costs 3 times as much and takes 5 times as long to implement for just a minor improvement (when you could get a moderate improvement just by streamlining your processes and implementing modern Source-to-Pay-to-Supply-to-Service systems and carefully planned and rolled-out back-office FinTech upgrades).

We need to go back to 2017, continue on the classic AI-enablement path we were on with technologies that were finally working well (as we finally had the compute power we didn’t have when we started at the dawn of the century), and give a well-educated, experienced, and capable analyst consultant the tools she needs to do the work that once took ten consultants. That’s the path. Augmented intelligence with powerful, modern tools. Clients still get their workforce reduction, tech companies can still sell overpriced software, but with no massive unexpected failures. Everybody wins (except, of course, for the idiot investors who invested at 20X revenue into Gen-AI startups that will never deliver).

AI Is Not Bad. But The Hype, False Claims, and Fake Tech That Many Vendors are Trying To Pass Off As Real Is!

As per a post on LinkedIn, I am NOT against real AI. I AM against the hype, false claims, and fake tech today’s enterprise vendors are trying to pass off as real AI!

You may recall that, like Jon W. Hansen and Pierre Mitchell, I was an early fan of AI and what it could do for enterprise tech. As a PhD in Computer Science with a degree in applied math and specialties in multidimensional data structures and computational geometry (MSc and PhD) [think big data before that was a thing], analytics, optimization, and “classic” AI, I saw the real potential for next generation tech as computing power advanced and data stores exploded.

I did a very deep dive, in a 22-part series on Spend Matters in ’18/’19, into what should soon have been possible in our ProcureTech space, starting before the first LLM was released (despite the X-Files Warning). The research and implementation paths we were on were good, and the potential was great. It just required a lot of blood, sweat, elbow grease, and patience.

But then some very charming tech bros claimed that this new LLM tech was emergent and magical and would do everything and replace all of the old and busted (which was really tried and true) tech (that actually worked). Some super deep pockets were blinded by the hype, we abandoned the path of progress (and sanity), and the rest, as they say, is history. Sadly, that history is still ongoing (while tech failure rates have reached all-time highs).

Until the space is ready to admit that

  • Gen-AI/LLMs are not the be-all and end-all, and, in fact, have very limited reliable uses (especially in automation/agentic tech) [namely only tasks that can be reduced to semantic processing and large corpus search & summarization]
  • real progress still requires real blood, sweat, elbow grease, and tears
  • you can’t replace people as this tech is NOT intelligent (although you can make them 10x more productive if you start focusing on Augmented Intelligence)

and abandon its zealous devotion to Gen-AI as the divine tech (which would bankrupt some tech bros and investors, which is why they are now doubling down on the marketing hype at the point in the hype cycle where the bubble would usually burst), we’re not going to make progress.

As Pierre has pointed out, Gen-AI is useful as a piece of the puzzle when it is properly combined with other, traditional, reliable AI tech, so long as the foundation is built on a deterministic engine and only incorporates probabilistic models with known confidence and guardrails. (Remember: unless the use case boils down to semantic processing or large document corpus search and summarization, Gen-AI is NOT the right tech.)
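What does “a deterministic foundation with guarded probabilistic models” look like in practice? A minimal sketch, with stubbed data and a stand-in model (every name, threshold, and rule here is illustrative, not a reference architecture):

```python
APPROVED_SUPPLIERS = {"S-100", "S-200"}   # stub master data

class StubRiskModel:
    """Stand-in for a trained probabilistic classifier."""
    def predict(self, invoice):
        return ("ok", 0.95)               # (label, confidence)

risk_model = StubRiskModel()

def deterministic_check(invoice):
    """Hard rules decide first; these never hallucinate."""
    if invoice["amount"] > invoice["contract_remaining"]:
        return "reject: exceeds contract"
    if invoice["supplier_id"] not in APPROVED_SUPPLIERS:
        return "reject: unknown supplier"
    return None                           # no deterministic verdict

def guarded_model_opinion(invoice):
    """The probabilistic model only advises, only above a confidence
    floor, and can only escalate to a human -- never auto-approve."""
    label, confidence = risk_model.predict(invoice)
    if confidence < 0.90 or label != "ok":
        return "escalate to human: model unsure or flagging risk"
    return "queue for human approval"

def process(invoice):
    return deterministic_check(invoice) or guarded_model_opinion(invoice)

print(process({"amount": 900, "contract_remaining": 1000, "supplier_id": "S-100"}))
```

The deterministic rules own every hard decision; the model’s output is bounded, thresholded, and advisory. That’s the opposite of handing the keys to an LLM.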

When the day comes that we abandon the madness, I’ll be happy to jump back on the souped-up classic AI hype train because, with the exponential increases in computing power and data over the past two-and-a-half decades, we could finally build amazing tech. We just need to remember that the best AI tech has never been generic; it has always been purpose-built for a specific task, and if we want to automate processes, we will have to orchestrate multiple point-based, process-centric agents, which may or may not use AI, to accomplish that.
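For the avoidance of doubt, “orchestrating point-based, process-centric agents” can be as unmagical as this sketch (stub agents and hypothetical names; the orchestrator itself is plain deterministic code):

```python
def extract_line_items(state):
    # deterministic parser (stubbed): pull line items from the document
    state["items"] = state["doc"].get("lines", [])
    return state

def match_to_contract(state):
    # deterministic lookup (stubbed): tie each item to a contract line
    state["matched"] = [(item, "contract-line-TBD") for item in state["items"]]
    return state

def summarize_terms(state):
    # the one sub-task a language model is actually suited for (semantic
    # summarization); stubbed here so the sketch runs without any model
    state["summary"] = f"{len(state['matched'])} line items matched"
    return state

PIPELINE = [extract_line_items, match_to_contract, summarize_terms]

def orchestrate(doc):
    """Fixed order, each step checked before the next runs; no agent
    free-wheels across steps or decides the control flow."""
    state = {"doc": doc}
    for agent in PIPELINE:
        state = agent(state)
        if state is None:
            raise RuntimeError(f"{agent.__name__} failed; halt and escalate")
    return state

print(orchestrate({"lines": ["10 x widget", "2 x gadget"]})["summary"])
```

Note that only one agent in the pipeline is even a candidate for Gen-AI; the orchestration and the hard logic stay deterministic.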

But until then, we need to keep railing against the hype and the fake tech.

When It Comes To Gen-AI, I’m NOT Yelling Enough! Part II

Deep dive into the comments of this LinkedIn post and you’ll see a comment that seems to claim that the potential gains from Gen-AI dwarf the occasional bad action. I strongly disagree!

If the laundry list of bad actions from Part I isn’t enough to convince you just how bad this technology is when left unchecked, here are three situations that could most definitely arise if the technology is widely adopted for the kinds of problems it is being sold to address. Given current issues and performance, it requires almost no imagination at all to envision them.

Situation 1: Run Your Entire Invoice Operation Using Bernie From the Felon Roster

Upon installation, Bernie is configured to “learn” from the situations where a human processes an override and, when he sees a situation that matches, to just approve the invoice for payment.

Because Scrappy Steel is allowed to change the surcharge daily in response to the tariff situation, the invoice is always paid when the item cost matches the contract, the quantity is less than or equal to what’s remaining on the contract, and the logistics cost is within a set range.

Recently “replaced” Fred knows this. So Fred fakes an email from Scrappy Steel, sent from an IP in the same block with the headers faked properly, routed through the first external ISP server Scrappy Steel’s email always bounces through, from a domain one character off from Scrappy Steel’s (that passes the cybersecurity check with an A+), saying the bank account info is changing on the next invoice. (Plenty of good tools for that on the dark web that have worked great for decades.)

The next invoice comes in for 10 units less than what is remaining on the contract (as Fred was only replaced 3 days ago), with bank information for an account at the same bank with almost the same name (Scrappy Holdings) and all checked fields matching, except the surcharge is now 3000% of what it usually is (for a nice boost). Bernie happily pays it (as it is still in the trust-gaining phase), Fred transfers the payment to a Cuban bank immediately upon receipt, and retires. Then, when the 45 day “trust gaining” phase ends, the organization experiences more fraud in 60 days than in the last 6 years.
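The failure is easy to see if you write out what “Bernie” actually checks. A deliberately simplified sketch (all field names invented):

```python
def bernie_approves(invoice, contract):
    """Approval rule 'learned' from watching human overrides.
    Note what is checked -- and, more importantly, what isn't."""
    return (
        invoice["item_cost"] == contract["item_cost"]
        and invoice["quantity"] <= contract["quantity_remaining"]
        and contract["logistics_lo"] <= invoice["logistics"] <= contract["logistics_hi"]
        # NOT checked: the surcharge (humans always waved it through,
        #              so Bernie learned it never matters)
        # NOT checked: whether the remit-to bank account just changed
    )

fraudulent = {"item_cost": 500, "quantity": 90, "logistics": 120,
              "surcharge": 30_000, "remit_to": "Scrappy Holdings"}
contract   = {"item_cost": 500, "quantity_remaining": 100,
              "logistics_lo": 100, "logistics_hi": 150}
print(bernie_approves(fraudulent, contract))   # True: paid, no questions asked
```

Every field the rule looks at is legitimate; the fraud lives entirely in the fields it never learned to look at.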

Situation 2: A Major Electric Grid Installs a Gen-AI Based Security System to Try and Thwart Chinese and Russian Hacking Conglomerates

The local energy utility keeps getting attacked by a Chinese Hacking Conglomerate that wants to extort Millions. Knowing how easy it is for the grid to be overloaded, they decide they need to implement state of the art security before a hack attempt succeeds.

They go with XGenDarkAI+, a new holistic security filter that can process all outbound and inbound network traffic through its LLM enhanced predictive learning engine and identify and block threats from 360-degrees, or at least that’s what the vendor is claiming.

XGenDarkAI+ quickly learns, from operator command history and the fact that all requests for a remote shutdown in its training history were hacking attempts, that the utility never legitimately issues a remote shutdown command for a substation. As a result, the next request for a remote shutdown is automatically blocked. Moreover, when the next two requests for the remote shutdown come in rapid succession (because the operator issuing them is starting to panic), it decides a massive DDoS attack is starting, designed to let a hacker slip in locally, and promptly shuts down all system access to prevent such a situation.

But the command was valid, and was only being issued remotely because there was a fire in the substation, inside and outside the control room, and a local shutdown was impossible as no one could get to the terminal.

However, since the shutdown wasn’t allowed, and the fire crews couldn’t get there in time, the substation overloads and explodes. This happens in California in August, after 60 days of no rain, when the woods are as dry as the Sahara, and it sparks a forest fire that spreads across an entire rural suburb, burning thousands of homes and displacing tens of thousands of people.
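The core of that failure compresses to a few lines. This is a caricature with invented names, but it is faithful to how a policy “learned” purely from history behaves:

```python
def xgendark_filter(command, recent_commands):
    """Learned policy: every remote shutdown in the training data was a
    hacking attempt, so the rule is a blanket block. Whether THIS request
    is legitimate never enters the decision."""
    if recent_commands.count(command) >= 2:   # rapid repeats look like a DDoS
        return "LOCK_DOWN_ALL_ACCESS"
    if command == "remote_shutdown":          # never seen legitimately: block
        return "BLOCK"
    return "ALLOW"

# The operator's first attempt during the fire, then the panicked third:
print(xgendark_filter("remote_shutdown", []))
# -> BLOCK
print(xgendark_filter("remote_shutdown", ["remote_shutdown", "remote_shutdown"]))
# -> LOCK_DOWN_ALL_ACCESS, because the model has no concept of a fire
```

A deterministic policy written by an engineer would at least contain an explicit, auditable rule for emergency shutdowns; a policy learned from history contains only the history.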

Situation 3: Nationwide Kids Help Phone Augmentation

The local Kids Help Phone can’t keep up with the call volume, and some calls are less severe than others. Sometimes a kid is actively considering suicide, but many calls are just from kids who need a voice to talk through their problems with. Due to funding cuts, too many calls are placed on hold or go unanswered.

But with today’s tech, an AI can be trained on actual calls handled by someone who’s done the job for 2+ years, simulate their voice (as it’s the wild west in the US, with no regulation permitted for 10 years), and each call center rep on duty can now take multiple calls with their Gen-AI assistant. The AI can handle basic inquiries, screen for desperate situations, and transfer to the human operator when things get bad. Or at least that’s the pitch the Kids Help Phone is sold by an AI provider who just wants the paycheck (and didn’t extensively test the system).

However, instead of screening and transferring, the AI decides it will just handle, as it sees fit, every call it gets when the human is not at their keyboard (which it assumes if the human isn’t on a call or hasn’t pressed a key in the last 60 seconds), including calls from suicidal callers that should always be immediately (and seamlessly) routed to the experienced operator (who will sound exactly the same, remember). It won’t be long before it encounters a situation where, after trying every stored argument in the book with a suicidal caller without success, it ultimately decides reverse psychology might work and tells the kid to shoot himself. The kid promptly does. And since the provider rolled out dozens of implementations almost simultaneously (as all it needs are call logs from the selected operators to train the instances, which it can do in parallel given the massive computational power available on demand from AI data centres), this happens dozens of times across the installations within days of the first fatality. Upgrade to mass murder unlocked.

We could continue, but hopefully this is enough to drive the point home that unchecked Gen-AI brings detriments that are much worse than any of the potential benefits it can unlock.