Data Governance is Essential to Good Data Management …

… so why is there still so little of it in most organizations?

Good data is becoming ever more critical to business and Procurement success, especially if you want to use any sort of predictive analytics or AI, yet so many organizations have so little data governance, if they have any at all. With good data, you can get great insight into current operations, opportunities, and ordeals. Without good data, you have no clue what you’re buying or selling, what processes are going on at any point in time, or what problems are festering and about to explode into major issues.

But good data is a rarity in most organizations, and it is getting rarer by the day due to rapidly increasing data volumes (in excess of 400 million terabytes of data being generated daily across the globe), a lack of controls in legacy systems, poor data processes, and a lack of good IT talent with enough history to know what the data is, what it’s used for, and how to qualify it as good or bad.

Why? Because organizations are putting systems in place before understanding what data those systems will need, where it will live, how it will be validated, how it will be maintained, how it will be archived, and how it will eventually be retired.

In most organizations, when they need data for an analytics-based project, the current answer is to get a “data warehouse”, “data lake”, or “data lakehouse”; dump all the organizational data into that warehouse, lake, or lakehouse; possibly run a simple AI-cleansing/enrichment algorithm; and hope for the best. However, this is not governance and, in fact, does more to exacerbate the problem than to solve it. Now there are two copies of bad data and no strategy for pushing cleansed data back to the source system. And if the data is changed in the source system before any eventual synch with the data warehouse, which copy is correct? Chances are neither record is fully accurate, and any synch has to be done at the field level, provided you have enough data to validate which field is correct. (You can’t just use time stamps: if some data was updated by AI and never validated, the newer value may not be the right one.)
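To make the field-level point concrete, here is a minimal Python sketch of one possible reconciliation rule. It assumes every field carries a value, a time stamp, and a flag saying whether it was validated; the `FieldValue` and `reconcile` names, the field names, and the rule itself (validated beats unvalidated, the time stamp only breaks ties, unvalidated conflicts go to manual review) are all hypothetical illustrations, not a description of any particular product.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FieldValue:
    value: str
    updated_at: datetime
    validated: bool  # True only if a person or trusted reference confirmed it


def reconcile(
    source: dict[str, FieldValue], warehouse: dict[str, FieldValue]
) -> tuple[dict[str, FieldValue], list[str]]:
    """Merge two copies of one record field by field (hypothetical rule)."""
    merged: dict[str, FieldValue] = {}
    needs_review: list[str] = []
    for name in sorted(source.keys() | warehouse.keys()):
        a, b = source.get(name), warehouse.get(name)
        if a is None or b is None:
            # Only one copy has the field, so there is nothing to compare.
            merged[name] = a if a is not None else b
            continue
        if a.validated != b.validated:
            # A validated value wins even when it is older.
            chosen = a if a.validated else b
        else:
            # Same validation status: fall back to the newer time stamp.
            chosen = a if a.updated_at >= b.updated_at else b
        merged[name] = chosen
        if a.value != b.value and not chosen.validated:
            # Neither side is trustworthy, so flag the field for a human.
            needs_review.append(name)
    return merged, needs_review
```

For example, a supplier name validated in January would survive an unvalidated AI rewrite made in March, while two conflicting, unvalidated values for the payment terms would be merged by time stamp but also flagged for review.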

Governance is not just maintaining data in systems as you use it, occasionally validating it against third-party databases or by manual review, and occasionally enriching it.

Governance is


  • defining what data the organization needs for its various functions
  • defining what data will be collected
  • defining what systems it will be maintained in, and, if the data is in multiple systems, which system is master
  • defining which data fields are critical and how they will be validated
  • defining when and how critical fields will be revalidated
  • defining the process for any data migration from master systems

And doing it BEFORE


  • collecting the data
  • installing a new system
  • starting an analytics / AI project

NOT AFTER!
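As a rough illustration, the definitions listed above could be written down as a small, machine-checkable policy before any system is installed. The Python sketch below is one assumed way to do that; `FieldPolicy`, `policy_gaps`, `due_for_revalidation`, and the field and system names in the usage note are hypothetical and not drawn from any specific tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Optional


@dataclass(frozen=True)
class FieldPolicy:
    name: str
    master_system: str  # which system is master for this field
    critical: bool
    validator: Optional[Callable[[str], bool]] = None  # how it is validated
    revalidate_every: Optional[timedelta] = None  # when it is revalidated


def policy_gaps(policies: list[FieldPolicy]) -> list[str]:
    """Critical fields that lack a validation rule or a revalidation schedule."""
    return [
        p.name
        for p in policies
        if p.critical and (p.validator is None or p.revalidate_every is None)
    ]


def due_for_revalidation(
    policies: list[FieldPolicy], last_validated: dict[str, date], today: date
) -> list[str]:
    """Critical fields whose revalidation interval has elapsed or never started."""
    due = []
    for p in policies:
        if not p.critical or p.revalidate_every is None:
            continue
        last = last_validated.get(p.name)
        if last is None or today - last >= p.revalidate_every:
            due.append(p.name)
    return due
```

Under these assumptions, a critical `bank_account` field defined with no validator would show up in `policy_gaps` before the project starts, which is exactly the point at which it is cheapest to fix.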

But how many organizations do that? Most don’t even run a proper RFP (having been taken in by the FREE RFP scam), even though the path to good software (which is critical to maintaining good data) is an Affordable RFP.

Moreover, part of the RFP for any software solution should define the data management strategy as it impacts, and is impacted by, the solution.