Category Archives: MDM

Data is Too Darn Expensive Today … But It Won’t Be For Long

THE PROPHET, who has recently discovered that ranting (on LinkedIn) is his new favourite thing to do, complained that Procurement, Commodity, and Supplier Data is Too Darn Expensive.

And while he’s right that data is often too expensive for what it is, it’s not going to stay that way. Next-generation providers are going to commoditize quality data and lower anonymized community data subscriptions to win (and keep) clients, because they know there’s no value in advanced technology alone (and especially in analytics, optimization, and AI without quality data to feed it). But there are three key points he missed in his rant, where he complained about data prices and advocated the use of LLMs and Gen-AI as a substitute (which they are not, and considering how much they hallucinate, we wouldn’t even trust them to be directionally accurate; just feed the historical data you can get your hands on into Excel and do some basic trend plotting if directionality is enough).

1) As Lisa Reisman noted in the comments, sometimes you need highly granular, accurate data by geography, volume, and production methodology. When you are buying tens or hundreds of millions of dollars’ worth of a material for a global operation, pennies make a difference, and that level of data matters.

2) Most firms are still ignoring their own data, which, when run through something like Covalyze (which THE PROPHET should love, as it was founded and designed by economists), gives very accurate target cost models for any category the firm has enough historical data on. This allows them to pinpoint where they need more data, and why, for cost breakdowns (and should-cost models to refine the target cost models), and which suppliers they actually need those expensive profiles on. Then they can go to pay-by-the-sip providers like Veridion for basic supplier data, or to other emerging commodity and supplier data portals.

3) The amount of data most firms need is much less than they think. In the tail, most of the spend is not significant enough for any market data to provide insight into a savings potential beyond what you will get from analyzing your own historical data and market quotes. When pennies won’t make a difference, you don’t do detailed cost breakdowns by raw material. When the product is a commodity that can be supplied by multiple suppliers at similar price points and equal quality levels, you don’t do deep risk profiles, because you can just go to the next supplier in the queue if the first one fails you. And so on. You only do detailed analysis where there is a statistical likelihood of a real opportunity or a real risk. Otherwise it’s a waste of time, money, and resources, as no organization today even comes close to fully analyzing the significant categories and risks it has in any given year. Thinking you will do more is delusional, and not worth it if you don’t have the basics covered.
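To make point 3 concrete, here is a minimal sketch of that triage in Python. Everything in it (the coverage cut-off, the minimum spend threshold, the sample figures) is an illustrative assumption on our part, not anyone’s actual methodology; the point is simply that a few lines of analysis on your own spend data tell you where detailed market data could actually pay for itself.

# Rank categories by spend and flag only those where deeper analysis
# (cost breakdowns, risk profiles, paid market data) is plausibly worth it.
# Thresholds and figures are illustrative assumptions only.
from collections import defaultdict

def prioritize_categories(spend_lines, coverage=0.80, min_spend=250_000):
    """spend_lines: iterable of (category, amount) pairs.
    Returns the categories that sit in the head of the spend curve
    (up to `coverage` of total) AND individually exceed `min_spend`."""
    totals = defaultdict(float)
    for category, amount in spend_lines:
        totals[category] += amount
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    selected, running = [], 0.0
    for category, amount in ranked:
        if running / grand_total >= coverage:
            break  # everything past here is tail spend: quotes + trend plots suffice
        if amount >= min_spend:
            selected.append((category, amount))
        running += amount
    return selected

# e.g. only the first two categories would justify detailed cost models
lines = [("Steel Fasteners", 1_200_000), ("Bearings", 800_000),
         ("Office Supplies", 90_000), ("Packaging Tape", 40_000)]
print(prioritize_categories(lines))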

By the time firms actually need more data, you can bet a next generation of data providers will have it readily available and cheap by today’s standards.

Cross-Enterprise Part Data Synchronization Nightmare? Don’t Get Reactive, Get Creactives!

For those who have been following along, and who had access, the doctor first covered Creactives in 2021 on Spend Matters Pro (I: Overview and II: Deep Dive) and provided one of the first North American overviews of Creactives’ industry-leading AI Knowledge Engineering Platform for Material (& Service) Classification and Master Data Governance.

We’ll give a brief review of the foundations in this post, but focus on the enhancements which, honestly, have been made throughout the platform and force us to cover most of it again (but that’s a good thing for you).

Founded back in 2000 as a cost reduction consultancy to help global manufacturers in Automotive, Aerospace & Defense, Industrial Equipment & Machinery, High-Tech, Chemicals, Metal Production, Food and Beverage, Pharma, Energy Production, and Oil & Gas get a grip on their data, Creactives was the first to realize that the largest problem multi-national manufacturers had was data synchronization across the enterprise. The major Source-to-Pay players like to gloss over this fact, because then it would become all too clear that there isn’t just one ERP (or PLM) to integrate to, one supplier master, and, more importantly, one part master, but dozens — at least one for, and in, each of the different markets served by the multinational, and typically in the language of that location.

As a result, global consolidation of demand for centralized sourcing of common parts or core materials becomes almost impossible (see the sketch after this list), since every ERP:

  • has its own record structure for a part/material
  • has its own categorization and GL-coding
  • shoves most of the core requirements in the description
  • and every department uses their own shorthand to cram a paragraph into 256 characters or less
  • usually in their own language
  • and they may or may not include the supplier in the description
  • etc.
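To make the problem concrete, here is a deliberately over-simplified sketch of what two ERPs’ records for the same ball bearing might look like, and the kind of canonical record they have to be reconciled into. All of the records, field mappings, and the canonical shape are invented for illustration; real ERP schemas (and real descriptions) are far messier.

# Two invented ERP records that (probably) describe the same ball bearing.
# Field names, languages, codes, and the canonical shape are illustrative only.
erp_de = {"MATNR": "4711-BRG", "MAKTX": "KUGELLAGER 6204 2RS STHL",
          "MATKL": "ERSATZTEILE", "MEINS": "ST"}
erp_us = {"item_no": "BB-6204", "descr": "BRG,BALL,6204,2RS,SEALED",
          "category": "MRO SPARES", "uom": "EA"}

def to_canonical(source_system, rec, field_map):
    """Map a source record into one possible canonical part record,
    keeping provenance so the source data is never lost."""
    return {
        "source_system": source_system,
        "source_id": rec[field_map["id"]],
        "description_raw": rec[field_map["descr"]],
        "source_group": rec[field_map["group"]],
        "uom": {"ST": "EA"}.get(rec[field_map["uom"]], rec[field_map["uom"]]),
        "unspsc": None,   # still to be classified (and deduplicated) downstream
    }

print(to_canonical("ERP-DE", erp_de, {"id": "MATNR", "descr": "MAKTX",
                                      "group": "MATKL", "uom": "MEINS"}))
print(to_canonical("ERP-US", erp_us, {"id": "item_no", "descr": "descr",
                                      "group": "category", "uom": "uom"}))

Even in this toy example, nothing but the free-text descriptions hints that the two records are the same part, which is exactly why the mapping cannot be done with a simple field-by-field join.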

This makes cross-organizational data mapping and harmonization at the part level almost impossible — and also explains why almost no vendors even attempt to tackle the problem, instead labelling it a “data cleansing and harmonization project to be done by consultancies before you implement our system”. Which, as we know, means it usually doesn’t get done, or doesn’t get done well, and then the sourcing/procurement/supply chain solutions these customers buy never deliver on the promises that were made (because all those promises assumed correct, complete, and universally harmonized data across the organization).

If the organization is lucky, it will stumble upon one of the supplier data cleansing, harmonization, and enrichment providers (and by that we mean Tealbook, Veridion, or ScoutBee) and at least get its supplier master in order. That will help it use one of the third-party manufacturing supplier discovery platforms (like Find My Factory or PartFox) to cross-reference its database and at least figure out which suppliers are supplying its parts and materials, and which are capable of supplying more parts and materials across the organization, but that’s about it. It will still have to investigate the part records one by one to determine precisely what each part is, whether it is a duplicate record of another part (somewhere else in the category hierarchy or in another ERP), which suppliers can supply the part, which suppliers can supply a more-or-less equivalent substitute part, and what the differences are.

For a typical multi-national enterprise, which will have thousands of material groups across dozens of ERP instances (30 to 50 is NOT uncommon), this just doesn’t happen. At best, the organization will identify the top product lines, the top parts by cost or product in each product line, and undertake a focussed effort to identify commonality in suppliers and parts for only those parts across the organization, and stop the consulting engagement there. The hope is that this will cover 60% to 80% of the core direct spend, but since the mapping will only be 60% to 80% accurate at best, the organization will miss out on roughly 1/2 to 2/3 of the direct sourcing and cost optimization opportunities that would be available to it if it achieved 95% to 98% mapping across all of the parts across the entire organization (for example, mapping 70% of spend at 70% accuracy reaches only about half of the opportunity, and 60% at 60% barely more than a third).

Creactives was developed over two decades to solve this specific challenge (which is a nightmare in most large multi-national enterprises), and that is precisely what it does. As was written four years ago, at its core, Creactives is a platform designed to properly identify, and classify, procurement items in enterprise master data to support proper taxonomic classification, reporting, and analysis within Procurement and other enterprise systems. It does this by way of custom-designed ML and AI technology, built up over those two decades (and long before the new generation of hallucinatory LLMs, which is why it actually works), which integrates proprietary dictionaries, semantic processing, linguistic identification, clustering, and deep learning technology in a highly specialized and optimized arrangement that takes advantage of a human in the loop to achieve very high accuracy in its classification. Creactives guarantees 95%, but typically achieves 98% (or more) for the majority of its clients.

This is not easy to do when, in an average organization, even something as simple as a ball bearing might appear in a dozen different organizational group categories (spare parts, MRO spare parts, electric motor spare parts, bearings and accessories, mechanical parts, steel parts, appliance repair, etc.), and then in variations of those material groups across dozens of ERP systems. Even when you get more complex, such as motors, where you’d think there would be standardization across the enterprise, you still typically have a few different categories (electric, mechanical, appliance, etc.) and a few (to a few dozen) variations. When today’s appliances contain hundreds of parts, automotive/aerospace vehicles thousands, and electronics systems tens of thousands, a large multi-national enterprise will use hundreds of thousands of parts and have millions (and millions) of records in its databases, as it will have many (and sometimes dozens of) duplicates, a lot of equivalent variations, even more substitutions, and often these will be replicated across the multiple suppliers the organization is doing business with. (When we hinted it’s a nightmare task, we weren’t joking.)

This is what the core of the Creactives Material & Services Master Data Governance product does in its TAM4 offering, and what powers its

  • data cleansing and enrichment service (which underlies not only the initial implementation and integration services that get the organization up and running, but also its ongoing data cleansing & enrichment service)
  • spend analysis
  • data assistants (including its SAP integration, which ensures the user always selects the right part, and its new part creator)

When a buyer first selects Creactives, the platform will first:

  • integrate all of the organization’s ERP systems and bring all the data in (with source tracking)
  • create translations of all of the data into the working languages of the project (while maintaining the source data)
  • organize the records under the existing material groups
  • analyze all the data and assign the records to the lowest-level UNSPSC categorization possible
  • use this to recommend the new material group structure (by identifying duplicate, poorly defined, or unused material groups)
  • identify as many of the (re)mappings as possible
  • create a sample list of mappings for verification (where it believes it has the mappings right) for the human in the loop
  • and create a representative list of mappings that still need to be made (where one mapping will allow it to potentially map dozens of other parts based on that insight) for the human in the loop
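For readers who think better in code, the following is a highly simplified sketch of that onboarding flow; it is emphatically NOT Creactives’ implementation. The translate() and classify_unspsc() functions stand in for whatever translation and classification engines are actually used, and the confidence threshold is an assumption of ours.

# Hypothetical sketch of the onboarding flow described above (not Creactives' code).
def onboard(records, translate, classify_unspsc, confidence_floor=0.90):
    auto_mapped, verified_sample, needs_human = [], [], []
    for rec in records:
        # translate but keep the source description (source data is never discarded)
        rec["description_en"] = translate(rec["description_raw"])
        code, confidence = classify_unspsc(rec["description_en"])
        rec["unspsc"], rec["confidence"] = code, confidence
        if confidence >= confidence_floor:
            auto_mapped.append(rec)
            if len(verified_sample) < 100:
                verified_sample.append(rec)   # spot-check sample for the human in the loop
        else:
            needs_human.append(rec)           # representative records to be mapped manually
    return auto_mapped, verified_sample, needs_human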

At this point, the human will be able to see the initial mapping progress dashboard, which summarizes the number of ERP instances, geographic coverage, languages, locations, original material groups, new material groups, UNSPSC (sub)categories, brands, codes, mappings, and the computed accuracy. From here, the reviewers can drill into the data along any of the previously mentioned dimensions, review the line items, verify or correct them, and, more importantly, dive into the unmapped records or the mappings needing verification, do the mappings and verifications, and kick off the next training cycle. This continues until the desired accuracy is achieved (and accuracy will keep improving over time as new data comes into the system properly mapped and the system is retrained on a regular basis).
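The reason one human verification can unlock dozens of mappings is that the unresolved records are grouped by similarity and only a representative of each group goes to the reviewer. The sketch below uses a crude token-based key purely for illustration; a real system uses statistical and semantic similarity, not anything this naive.

# Group unresolved records by a crude similarity key, send one representative
# per group to the reviewer, then propagate the verified mapping to the group.
# The similarity key is a toy stand-in for real statistical/semantic similarity.
from collections import defaultdict

def build_review_queue(unresolved):
    groups = defaultdict(list)
    for rec in unresolved:
        key = frozenset(rec["description_en"].upper().split()[:4])
        groups[key].append(rec)
    # one representative per group; its verified mapping is then propagated
    return [(members[0], members) for members in groups.values()]

def propagate(verified_representative, members):
    for rec in members:
        rec["unspsc"] = verified_representative["unspsc"]
        rec["confidence"] = 1.0   # human-verified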

Once the initial training cycle is complete, all of this data will be explored, managed, and enriched through their Material Master Governance product — TAM 4. From here the buyer can explore categories, identify and merge/delete duplicates, associate substitutes, bulk upload new parts or supplier-part data, download part records for editing by the shop-floor team (who only know how to use Excel) and then upload those records again when modified, manage suppliers, monitor all of the integrations, and keep track of current workflow processes around training or data enrichment. The platform can also be used by the user to manage all of their associated tasks around (new) part creation/enrichment, approvals, duplicate management, (supplier) relationships, etc. And, of course, they can also dive into the standard spend and product reports.

Part creation and modification through the platform is trivial. To modify a part, the user can simply edit any field of any existing part, as well as all of the default language translations in any working language used by the organization (which could be dozens in a large multi-national), and the updates, once approved, will be pushed to all integrated systems. To create a new part, the user can find the closest part record in the system, copy it, add or remove fields as necessary, and change the field data as necessary. Creating minor variations or changes as a product design changes over subsequent iterations is extremely quick and easy, and since the platform can integrate with the PLM and bring in the associated diagrams and drawings, data can truly be harmonized across enterprise NPD, Sourcing, Manufacturing, and Supply Chain systems.

Since our last deep dive four years ago, the TAM platform has been enhanced significantly. The self-serve Vanessa interface (which consulting partners can use when clients want to continue to work with their preferred consulting firm on sourcing/supply chain/data management projects) has had workflow and usability improvements, and has reduced the amount of data a human-in-the-loop has to manually categorize for maximum mapping efficiency, by way of an increased focus on identifying the most representative records for training and classification purposes (through statistical similarity to other records). (They’ve also optimized their processes and can typically get a multi-national enterprise up and fully operational across its dozens of systems in three months. The data can then be fully maintained from that point on through automated data cleansing and syncs on a weekly basis.)

The core analytics platform has also been enhanced, and the category explorer, which is built on the core platform, allows the user to drill down and filter on any dimension at any time; the user can even do pattern searches on key (description) fields. It has also been enhanced to allow users to identify records with missing fields, and categories which would most benefit from manual review and data record enhancement. They’ve also improved the dashboard interface summaries, which allow a user to quickly get a high-level understanding of a category, including its attributes, material types, plant and country distribution, spend, suppliers, etc.

Duplicate management has been greatly enhanced: for any category, subcategory, or part, a user can see the material groups or parts that partially or fully match the filter, those with active processes, the materials used, and the stock, consumption, and order amounts. Drilling into a stock part, the user can see not only the identified duplicates in the system but also the stock, consumption, and PO units and dollars for each, which gives insight into the savings potential from volume leverage and standardization across a right-sized group of suppliers. From here they can accept the duplicate (and the parts will be merged in the master), assign it to someone else for processing (if it looks like it might be a substitute rather than an exact duplicate, and an expert is needed to classify it), or confirm that one or more parts are distinct and not part of a duplicate group (that should be associated with the stock item in one or more of the organization’s ERPs).
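As a back-of-the-envelope illustration of that volume-leverage insight (the discount bands below are assumptions we made up, not Creactives’ savings model):

# Toy estimate of savings from consolidating a duplicate group's PO spend.
# The discount curve is an illustrative assumption, not a real benchmark.
def consolidation_savings(duplicate_group,
                          discount_curve=((1_000_000, 0.08),
                                          (250_000, 0.05),
                                          (0, 0.02))):
    """duplicate_group: list of dicts with the annual 'po_spend' of each duplicate."""
    total_spend = sum(part["po_spend"] for part in duplicate_group)
    for threshold, discount in discount_curve:   # first band the total qualifies for
        if total_spend >= threshold:
            return total_spend * discount
    return 0.0

group = [{"part": "6204-2RS (plant DE)", "po_spend": 180_000},
         {"part": "BB-6204 (plant US)", "po_spend": 140_000}]
print(f"Estimated leverage savings: {consolidation_savings(group):,.0f}")   # 16,000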

The smart part creation/modification has also been enhanced (and fully embedded in the tool). It allows a user to bring up similar parts by description, select one, add or subtract fields, update standard attributes (the system has been trained on data sheets that specify standard part attributes for tens of thousands of categories and has all of these templates at its disposal), select from standard (enforced) values, carefully control the description field, in every language, that will be fed back to the ERP, and ensure the (new) part is properly categorized from the beginning. And, of course, the user can also upload or pull in all of the documents, diagrams, and models necessary to completely describe the part and store all of the part data in one central location.

As we said before in our summary of Vanessa, it’s the power tool an expert in multi-lingual direct material data classification needs to unify data across dozens of global ERP, MRP, and associated database instances and harmonize tens, or hundreds, of thousands of parts across those dozens of systems to enable more efficient NPD/NPI, sourcing, supply base rationalization, and supply chain design. And one that will likely not be equalled for years*. So if you need better cross-enterprise part and material masters, we cannot stress enough how important it is to include Creactives in your RFP/evaluation.

* As Forestreet found out when it decided eight years ago to build an AI-backed supplier intelligence tool, this is a very significant challenge, and not one you can just throw a DLNN or LLM at and solve (and, if you try, you’ll end up with a situation that’s only worse). (Remember, it took Forestreet 7 years to get the first version of its platform ready for wide market release, and supplier intelligence is a simpler problem. You shouldn’t be surprised that it took Creactives 18 years to get to self-serve and over 20 to get where they are today — which explains why they are currently alone in their category.)

Just like there was no Æther, there’s no data fabric either!

In a recent LinkedIn posting just before the holidays, THE REVELATOR asked a very important question. A question that may have gone overlooked given that many people are busy trying to get their work done before the holidays so they can get a few days off. And a question that must NOT be forgotten.

1. How does the old technology phrase “garbage-in, garbage-out” apply to Gartner’s Data Fabric post?

Data files. Databases. Data stores. Data warehouses. Data lakes. Data lakehouses. And now … the data fabric … which is, when all is said and done, just another bullsh!t organizational data scheme designed to distract you from the fact that your data is dirty, that data storage providers don’t know what to do about it, and that these providers still need to sell you on something new to maintain their revenue streams.

You see, the great thing about today’s SaaS middleware-enabled apps is that they don’t care where the data is, what organizational structure the data is stored in, etc. As long as the data has a descriptor that says “this field, which is in this format, in this db stores X” (where X describes the data) and an access key, the SaaS middleware can suck the data in, convert it into the format it needs, and work with it.
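A minimal sketch of that descriptor idea, assuming a made-up descriptor format, field names, and conversion rules (real middleware descriptors are richer, but the principle is the same):

# The descriptor tells the middleware what each source field means, what format
# it is in, and where to find credentials; everything here is illustrative.
from datetime import datetime

DESCRIPTOR = {
    "source": "legacy_finance_db",
    "access_key_env": "LEGACY_DB_TOKEN",
    "fields": {
        "INVOICE_DT": {"maps_to": "invoice_date", "format": "DDMMYYYY"},
        "AMT":        {"maps_to": "amount",       "format": "cents"},
    },
}

def convert(raw_row, descriptor):
    """Convert one raw source row into the canonical shape the SaaS app works with."""
    out = {}
    for src_field, spec in descriptor["fields"].items():
        value = raw_row[src_field]
        if spec["format"] == "DDMMYYYY":
            value = datetime.strptime(value, "%d%m%Y").date().isoformat()
        elif spec["format"] == "cents":
            value = int(value) / 100.0
        out[spec["maps_to"]] = value
    return out

print(convert({"INVOICE_DT": "05032024", "AMT": "1234599"}, DESCRIPTOR))
# {'invoice_date': '2024-03-05', 'amount': 12345.99}

The catch, of course, is that the descriptor only tells the middleware what a field is supposed to contain; it says nothing about whether the data in it is any good, which is the point of the next paragraph.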

However, now that we are in the age of “AI”, the most important thing has become good, clean data. But just “weaving” your bad data together doesn’t solve anything. In fact, with today’s technology, it just makes things MANY times worse. We are now at garbage in, hazardous waste out!

Unfortunately there’s nothing we can do if the AI zealots are now adding hallucinogenics to their kool-aid, because it sounds like they are trying to bring back the magical medieval Æther … *groan*

THE REVELATOR then went on to ask …

2. Why does Gartner confuse more than inform and enlighten?

At the end of the day, you have a better chance of appearing as an enlightened Guru to someone who is lost and confused than to someone who is clear-headed and confident in their direction!

Like the other big analyst firms, they profit off of being the Gurus the executives turn to when they can’t make sense of the hogwash filled marketing madness they are inundated with every day!

More specifically, their sales people need to say: “Our senior analyst has all of the answers … and they can be yours at the low, low introductory price of only 9,999.99 USD a day*.” So they don’t really care whether or not they are confusing more than enlightening, as long as the sales are coming in. (In fact, they aren’t even looking to see how they are doing, as long as the money keeps rolling in.)

* one day only, after that, full rate of 29,999.99 a day applies …

But the questions didn’t stop there. The next question was:

3. Why are Data Problems Solved Downstream?

The answer to this is not as easy or straightforward, but when you consider that:

  1. it’s hard work to solve the problems at the source, and
  2. most of these analyst firms are staffed with analysts who have little fundamental understanding of technology or the domains they are analyzing the technology for, who don’t want to admit it, and who are happy to take guidance from the vendors cutting them the biggest cheques and spending the most time “educating” them on the paradigm the vendor wants to see …

… what should one expect?

Case in point: did IDC just happen to come up with a “Worldwide SaaS and Cloud-Enabled Spend Orchestration Map” on its own at the same time a whole bunch of these solutions hit the mainstream? Especially when it takes person-years of research and development to design a new map and analyze vendors (at least if you want to try to get it right), and when they don’t have enough senior analyst talent to adequately cover core S2P?

Another case in point: did Gartner merge its P2P into an S2P map because it honestly believes the entire market is heading there (FYI, it’s not; look at the Mega Map), or because it doesn’t have enough analyst talent left to attempt to cover the market segment by segment?

At the end of the day, it takes many years and many degrees to get a fundamental understanding of modern technology (which all runs on math, by the way) and many more years to get expertise in a business domain … so what can you honestly expect of kids straight out of school who make up significant portions of analyst teams???

Which led to the next question.

4. Can innovation co-exist with exclusivity?

Innovation happens, but then big stalwarts in the space scoop it up to try and remain competitive enough to keep their current customers locked in, a vacuum is created, and the cycle starts anew.

Until Trump dismantles them entirely, the US, like most of the pseudo-free first world, has enough anti-monopoly laws to ensure the cycle continues.

So yes, innovation can coexist with exclusivity; it just takes decades to realize what could otherwise happen in less than one decade, as a result of having to start over so many times.

Finally, this led to the final question:

5. Does the VC investment model of “for every ten investments, seven fail, two are mediocre, and one hits pay dirt” have anything to do with the 80%+ technology project failure rate?

It most certainly does! The fact that VCs are happy for seven investments to fail entirely (and then just move the good people to other investments, if those people want to keep working) doesn’t help the project failure rate … especially since so many companies don’t survive long enough to master models that will lead to success, instead of failure, 80%+ of the time, or to take the time to gauge, plan, and do implementations properly (because, if they don’t sell the next deal within a quarter, the investors will drop them faster than a hot potato).

Enterprises have a Data Problem. And they will until they accept they need to do E-MDM, and it will cost them!

This originally published on April 29, 2024. It is being reposted because MDM is becoming more essential by the day, especially since AI doesn’t work without good, clean data.

insideBIGDATA recently published an article on The Impact of Data Analytics Integration Mismatch on Business Technology Advancements, which did a rather good job of highlighting all of the problems with bad integrations (which happen every day [and just result in you contributing to the half a TRILLION dollars that will be wasted on SaaS Spend this year and the one TRILLION that will be wasted on IT Services]), and an okay job of advising you how to prevent them. But the problem is much larger than the article lets on, and we need to discuss that.

But first, let’s summarize the major impacts outlined in the article (which you should click to and read before continuing on in this article):

  • Higher Operational Expenses
  • Poor Business Outcomes
  • Delayed Decision Making
  • Competitive Disadvantages
  • Missed Business Opportunities

And then add the following critical impacts (which is not a complete list by any stretch of the imagination) when your supplier, product, and supply chain data isn’t up to snuff:

  • Fines for failing to comply with filings and appropriate trade restrictions
  • Product seizures when products violate certain regulations (like ROHS, WEEE, etc.)
  • Lost Funds and Liabilities when incomplete/compromised data results in payments to the wrong/fraudulent entities
  • Massive disruption risks when you don’t get notifications of major supply chain incidents when the right locations and suppliers are not being monitored (multiple tiers down in your supply chain)
  • Massive lawsuits when data isn’t properly encrypted and secured and personal data gets compromised in a cyberattack

You need good data. You need secure data. You need actionable data. And you won’t have any of that without the right integration.

The article says to ensure good integration you should:

  • mitigate low-quality data before integration (since cleansing and enrichment might not even be possible)
  • adopt uniformity and standardized data formats and structures across systems
  • phase out outdated technology

which is all fine and dandy, but misses the core of the problem:

Data is bad (often very, very bad) because organizations don’t have an enterprise master data management (E-MDM) strategy. That’s the first step. Furthermore, this E-MDM strategy needs to define:

  1. the master schema with all of the core data objects (records) that need to be shared organization-wide
  2. the common data format (for ids, names, keys, etc.) (that every system will need to map to)
  3. the master data encoding standard

With a properly defined schema, there is less of a need to adopt uniform data formats and structures across the enterprise systems (which will not always be possible if an organization needs to maintain outdated technology, either because a former manager entered into a 10-year agreement just to be rid of the problem or because it would be too expensive to migrate to another system at the present time) or to phase out outdated technology (which, if it’s the ERP or AP system, will likely not be possible), since the organization just needs to ensure that all data exchanges are in the common data format and use the master data encoding standard.
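To make those three definitions a little more tangible, here is a minimal sketch of what they might look like for one shared object (the schema, the ID pattern, and the encoding choice are illustrative assumptions, not a standard):

# 1. a core shared data object, 2. a common id/name format, 3. a master encoding standard.
# Everything here is an illustrative assumption, not a prescription.
from dataclasses import dataclass
from typing import Optional
import re

MASTER_ENCODING = "utf-8"                    # 3. the master data encoding standard
SUPPLIER_ID = re.compile(r"^SUP-\d{8}$")     # 2. the common id format every system maps to

@dataclass
class SupplierMaster:                        # 1. one of the core shared data objects
    supplier_id: str          # canonical id, e.g. "SUP-00001234"
    legal_name: str
    country_iso2: str         # common format: ISO 3166-1 alpha-2
    duns: Optional[str] = None

    def __post_init__(self):
        if not SUPPLIER_ID.match(self.supplier_id):
            raise ValueError(f"non-conforming supplier id: {self.supplier_id}")

# Every system exchanging supplier data maps to this shape and serializes in MASTER_ENCODING.
record = SupplierMaster("SUP-00001234", "Acme Fasteners GmbH", "DE")
payload = repr(record).encode(MASTER_ENCODING)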

Moreover, once you have the E-MDM strategy, it’s easy to flesh out the HR-MDM, Supplier/SupplyChain-MDM, and Finance-MDM strategies and get them right.

As THE PROPHET has said, data will be your best friend in procurement and supply chain in 2024 if you give it a chance.

Or, you can cover your eyes and ears and sing the same old tune that you’ve been singing since your organization acquired its first computer and built its first “database”:

Well …
I have a little data
I store it on my drive
And when it’s old and flawed
The data I’ll archive

Oh, data, data, data
I store it on my drive
And when it’s old and flawed
The data I’ll archive

It has nonstandard fields
The records short and lank
When I try to read it
The blocks all come back blank

I have a little data
I store it on my drive
And when it’s old and flawed
The data I’ll archive

My data is so ancient
Drive sectors start to rot
I try to read my data
The effort comes to naught

Oh, data, data, data
I store it on my drive
And when it’s old and flawed
The data I’ll archive
