Master Data

Master Data Management (MDM) is the practice of keeping a single, accurate master copy of your core data and keeping every system that uses it in agreement.

I’ve spoken about MDM at companies, technology groups, business groups, database groups, conferences, college classes, and more.

Here’s the general overview that I typically give (download the PowerPoint file).


“Too much information is driving me insane.” (Too Much Information, The Police, 1981)

“Two men say they’re Jesus. One of them must be wrong.” (Industrial Disease, Dire Straits, 1982)

“A man with one watch always knows what time it is. A man with two watches is never sure.” (San Diego Union, Lee Segall, 9/30/1930)

To master your data, take these steps:

  • look at each record of incoming data
  • standardize the formatting (phone numbers, addresses, etc.)
  • validate that it is allowable in your system (sure, 1/1/1900 is a real birthdate, but is it really the date that you made a sale?)
  • have a human being look through the edge cases (this John Smith might be the same as that other John Smith)
  • integrate the good/new/updated record into your master
  • communicate the update/entry to any systems that are interested
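
Here is a minimal Python sketch of the steps above, just to make the flow concrete. Everything in it is an assumption for illustration: the field names (`name`, `phone`, `sale_date`), the in-memory `master` dictionary keyed by phone number, the ten-digit phone rule, and the year-2000 cutoff are hypothetical, not part of any particular MDM product.

```python
import re
from datetime import date

# Hypothetical in-memory stores; a real system would use a database and a message bus.
master = {}          # standardized phone number -> master record
review_queue = []    # edge cases waiting for a human to look at them
notifications = []   # updates to send to interested systems


def standardize(record):
    """Normalize formatting: collapse spaces, title-case the name, keep only phone digits."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "phone": re.sub(r"\D", "", record["phone"]),
        "sale_date": record["sale_date"],
    }


def is_valid(record, earliest=date(2000, 1, 1)):
    """A real date is not always an allowable one; the cutoff is an assumed business rule."""
    has_full_phone = len(record["phone"]) == 10
    plausible_date = earliest <= record["sale_date"] <= date.today()
    return has_full_phone and plausible_date


def process(record):
    """Run one incoming record through standardize, validate, review, integrate, communicate."""
    clean = standardize(record)
    if not is_valid(clean):
        return "rejected"

    # Same name under a different phone number: might be the same person, so ask a human.
    lookalikes = [m for m in master.values()
                  if m["name"] == clean["name"] and m["phone"] != clean["phone"]]
    if lookalikes:
        review_queue.append((clean, lookalikes))
        return "needs_review"

    master[clean["phone"]] = clean   # integrate the new or updated record
    notifications.append(clean)      # communicate the change
    return "integrated"
```

Calling `process` once per incoming record returns "integrated", "needs_review", or "rejected", so the caller can count how much of the feed is clean and how much needs attention.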


Once these steps are in place, many improvements become possible.

When you know that your data is correct, you can:

  • govern it (lock down who can add/change data, and make it adhere to your own rules)
  • synchronize it (keep all your systems in tune with each other)
  • centralize it (set up a single master ID that keeps pointers to other systems and IDs)
  • log it (see how and when data changed, and which system and person did it)
  • analyze it (for meaningful results)
  • mine it (for unexpected insights)
  • act on it (make faster and better decisions)
  • enrich it (roll in data from outside sources like the Better Business Bureau, US Census Bureau, and others)
  • adapt it (if you acquire a new company, merging their data with yours becomes much more manageable)
  • scale it (if you need to process 100x more information, it becomes a matter of adding new hardware as needed)
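
To make "centralize it" and "log it" from the list above a little more concrete, here is a small Python sketch of a master ID that keeps pointers to each system's local ID and records who changed what. The class name `MasterIndex`, the system names, and the ID formats are all hypothetical; this is one possible shape, not a reference design.

```python
from datetime import datetime, timezone


class MasterIndex:
    """Hypothetical cross-reference: one master ID pointing at each system's local ID."""

    def __init__(self):
        self.links = {}     # master_id -> {system_name: local_id}
        self.reverse = {}   # (system_name, local_id) -> master_id
        self.log = []       # append-only history of every change

    def link(self, master_id, system, local_id, changed_by):
        """Point master_id at a system's local ID and record who made the change."""
        pointers = self.links.setdefault(master_id, {})
        old_local_id = pointers.get(system)
        if old_local_id is not None:
            # Drop the stale reverse pointer so lookups never return an outdated link.
            self.reverse.pop((system, old_local_id), None)
        pointers[system] = local_id
        self.reverse[(system, local_id)] = master_id
        self.log.append({
            "when": datetime.now(timezone.utc),
            "master_id": master_id,
            "system": system,
            "old_local_id": old_local_id,
            "new_local_id": local_id,
            "changed_by": changed_by,
        })

    def resolve(self, system, local_id):
        """Find the master ID for a system's local ID, or None if it is unknown."""
        return self.reverse.get((system, local_id))
```

For example, after `idx.link("M-1", "crm", "C-42", "alice")`, calling `idx.resolve("crm", "C-42")` gives back `"M-1"`, and `idx.log` shows when the link was made and by whom.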


Once it’s all running smoothly, you could also:

  • improve retention by finding existing customers who feel ignored
  • prevent fraud by comparing new data with old, according to patterns that you expect
  • minimize returned mail and packages by ensuring that addresses are accurate
  • identify your best customers (least effort, most profit = cash cows) and keep them happier
  • find mismatched data (do John Sr, John Jr, and John III all live at the same address? keep them as separate records)
  • absorb new sources and new types of data easily
  • stop wasting time with yoyos (customers who come and go and come and go)
  • find the key influencers in your network of customers and leads
  • grab the low-hanging fruit (least effort, big payoff)
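
As one example from the list above, the John Sr / John Jr / John III check can be sketched in a few lines of Python. The field names (`name`, `address`) and the short list of suffixes are assumptions for illustration; real names are messier, so treat the output as candidates for a human to review rather than a final answer.

```python
import re
from collections import defaultdict

# Assumed list of generational suffixes; extend it to suit your own data.
SUFFIX = re.compile(r"\b(sr|jr|ii|iii|iv)\.?$", re.IGNORECASE)


def split_suffix(name):
    """Return (base_name, suffix), both lowercased; suffix is '' when there is none."""
    name = name.strip()
    match = SUFFIX.search(name)
    if not match:
        return name.lower(), ""
    return name[:match.start()].strip(" ,").lower(), match.group(1).lower()


def shared_address_groups(records):
    """Find (address, base name) pairs that appear with more than one distinct suffix."""
    groups = defaultdict(set)
    for record in records:
        base, suffix = split_suffix(record["name"])
        groups[(record["address"].strip().lower(), base)].add(suffix)
    return {key: sorted(suffixes) for key, suffixes in groups.items() if len(suffixes) > 1}
```

Each key in the result is an address and base name that should probably be several master records instead of one.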
