

Most software systems have lists of data that are shared and used by several of the applications that make up the system.

For example: A typical ERP system will have, at a minimum, Customer Master, Item Master and Account Master data lists. This master data is often one of the key assets of a company. In fact, it’s not unusual for a company to be acquired primarily for access to its Customer Master data.

Rudimentary Master Data Definition

One of the most important steps in understanding master data is getting to know the terminology. To start, there are some well-understood and easily identified master data items, like “customer” and “product.” In fact, many define master data simply by reciting a commonly agreed-upon master data item list, such as: Customer, Product, Location, Employee and Asset.

But how you identify the elements of data that should be managed by MDM software is much more complex and defies such rudimentary definitions, which has created a lot of confusion around what master data is and how it’s qualified.

To give a more comprehensive answer to the question of “what is master data?”, we can look at the six types of data typically found in corporations:

  1. Unstructured Data: Data found in email, white papers, magazine articles, corporate intranet portals, product specifications, marketing collateral and PDF files.
  2. Transactional Data: Data about business events (often related to system transactions, such as sales, deliveries, invoices, trouble tickets, claims and other monetary and non-monetary interactions) that have historical significance or are needed for analysis by other systems. Transactional data are unit-level transactions that use master data entities. Unlike master data, transactions are inherently temporal and instantaneous by nature.
  3. Metadata: Data about other data. It may reside in a formal repository or in various other forms, such as XML documents, report definitions, column descriptions in a database, log files, connections and configuration files.
  4. Hierarchical Data: Data that stores the relationships between other data. It may be stored as part of an accounting system or separately as descriptions of real-world relationships, such as company organizational structures or product lines. Hierarchical data is often considered a super MDM domain because it’s critical to understanding and sometimes discovering the relationships between master data.
  5. Reference Data: A special type of master data used to categorize other data or to relate data to information beyond the boundaries of the enterprise. Reference data can be shared across master or transactional data objects (e.g. countries, currencies, time zones, payment terms, etc.).
  6. Master Data: The core data within the enterprise that describes the objects around which business is conducted. It typically changes infrequently and can include reference data that’s necessary to operate the business. Master data isn’t transactional in nature, but it does describe transactions. The critical nouns of a business that master data covers generally fall into four domains, and further categorizations within those domains are called subject areas, sub-domains or entity types.

The four general master data domains are:


Within the customers domain, there are customer, employee and salesperson sub-domains.

Within the products domain, there are product, part, store and asset sub-domains.

Within the locations domain, there are office location and geographic division sub-domains.

Within the other domain, there are things like contract, warranty and license sub-domains.

Some of these sub-domains may be further divided. For example, customer may be further segmented based on incentives and history, since your company may have normal customers as well as premiere and executive customers. Meanwhile, product may be further segmented by sector and industry. This level of granularity is helpful because the requirements, lifecycle and CRUD (create, read, update, delete) cycle for a product in the Consumer Packaged Goods (CPG) sector are likely very different from those for products in the clothing industry. The granularity of domains is essentially determined by the magnitude of the differences between the attributes of the entities within them.


While identifying master data entities is pretty straightforward, not all data that fits the definition of master data should necessarily be managed as such. In general, master data is a small portion of all your data from a volume perspective, but it’s some of the most complex data and the most valuable to maintain and manage.

So, what data should you manage as master data?

We recommend using the following criteria, all of which should be considered together when deciding if a given entity should be treated as master data.


Because master data is used by multiple applications, an error in the data in one place can cause errors in all the applications that use it.

For example:

An incorrect address in the customer master might mean orders, bills and marketing literature are all sent to the wrong address. Similarly, an incorrect price on an item master can be a marketing disaster, and an incorrect account number in an account master can lead to huge fines or even jail time for the CEO, a career-limiting move for the person who made the mistake.

The Benefits of Creating a Common Master Data List

While creating a clean master list can be a daunting challenge, there are many benefits to the bottom line that come from having a common master list, including:

A single, consolidated bill, which saves money and improves customer satisfaction

No concerns about sending the same marketing literature to a customer from multiple customer lists, which wastes money and irritates the customer

A cohesive view of customers across the organization, so that users know before they turn a customer account over to a collection agency whether that customer owes money to other parts of the organization or, more importantly, whether that customer is another division’s biggest source of business

A consolidated view of items to eliminate wasted money and shelf space as well as the risk of artificial shortages that come from stocking the same item under different part numbers

Finally, the movement toward service-oriented architecture (SOA) and software as a service (SaaS) makes MDM a critical issue.

For example:

If you create a single customer service that communicates through well-defined XML messages, you may think you have defined a single view of your customers. But if the same customer is stored in five databases with three different addresses and four different phone numbers, what will your customer service return?

Similarly, if you decide to subscribe to a CRM service provided through SaaS, the service provider will need a list of customers for its database. Which list will you send?

For all of these reasons, maintaining a high-quality, consistent set of master data for your organization is rapidly becoming a necessity. The systems and processes required to maintain this data are known as Master Data Management.


Master Data Management (MDM) is the technology, tools and processes that ensure master data is coordinated across the enterprise. MDM provides a unified master data service that delivers accurate, consistent and complete master data across the enterprise and to business partners.

There are a few things worth noting in this definition:

MDM isn’t just a technological problem. In many cases, fundamental changes to business processes will be required to maintain clean master data, and some of the most difficult MDM issues are more political than technical.

MDM includes both creating and maintaining master data. Investing a lot of time, money and effort in creating a clean, consistent set of master data is a wasted effort unless the solution includes tools and processes to keep the master data clean and consistent as it gets updated and expands over time.

Depending on the technology used, MDM may cover a single domain (customers, products, locations or other) or multiple domains. The benefits of multi-domain MDM include a consistent data stewardship experience, a minimized technology footprint, the ability to share reference data across domains, a lower total cost of ownership and a higher return on investment.

The 6 Disciplines of a Strong MDM Program

Given that MDM isn’t just a technological problem, meaning you can’t just install a piece of technology and have everything sorted out, what does a strong MDM program entail?

Before you start with a master data management program, your MDM strategy should be built around these 6 disciplines:

Governance: Directives that manage the organizational bodies, policies, principles and qualities to promote access to accurate and certified master data. Essentially, this is the process through which a cross-functional team defines the various aspects of the MDM program.

Measurement: How are you doing based on your stated goals? Measurement should look at data quality and continuous improvement.

Organization: Getting the right people in place throughout the MDM program, including master data owners, data stewards and those participating in governance.

Policy: The requirements, policies and standards to which the MDM program should adhere.

Process: Defined processes across the data lifecycle used to manage master data.

Technology: The master data hub and any enabling technology.


Once you secure buy-in for your MDM program, it’s time to get started. While MDM is most effective when applied to all the master data in an organization, in many cases the risk and expense of an enterprise-wide effort are difficult to justify.

PRO TIP: It’s often easier to start with a few key sources of master data and expand the effort once success has been demonstrated and lessons have been learned.

If you do start small, you should include an analysis of all the master data that you might eventually want to include in your program so that you do not make design decisions or tool choices that will force you to start over when you try to incorporate a new data source. For example, if your initial customer master implementation only includes the 10,000 customers your direct sales department deals with, you don’t want to make design decisions that will preclude adding your 10,000,000 web customers later.

Your MDM project plan will be influenced by requirements, priorities, resource availability, time frame and the size of the problem. Most MDM projects include at least these phases:

As you can see, MDM is a complex process that can go on for a long time. Like most things in software, the key to success is to implement MDM incrementally so that the business realizes a series of short-term benefits while the complete project is a long-term process.

Additionally, no MDM project can be successful without the support and participation of the business users. IT professionals don’t have the domain knowledge to create and maintain high-quality master data. Any MDM project that doesn’t include changes to the processes that create, maintain and validate master data is likely to fail.

The rest of this article will cover the details of the technology and processes for creating and maintaining master data.


Whether you buy an MDM tool or decide to build your own, there are two basic steps to creating master data:

Cleaning and standardizing the data

Matching data from all the sources to consolidate duplicates

Cleaning and Standardizing Master Data

Before you can start cleaning and normalizing your data, you must understand the data model for the master data. As part of the modeling process, you should have defined the contents of each attribute and defined a mapping from each source system to the master data model. Now, you can use this information to define the transformations necessary to clean your source data.

Cleaning the data and transforming it into the master data model is very similar to the Extract, Transform and Load (ETL) processes used to populate a data warehouse. If you already have ETL tools and transformations defined, it might be easier just to modify these as needed for the master data instead of learning a new tool. Here are some typical data cleansing functions:

Normalize data formats: Make all the phone numbers look the same, transform addresses and so on to a common format.

Replace missing values: Insert defaults, look up ZIP codes from the address, look up the Dun & Bradstreet Number.

Standardize values: Convert all measurements to metric, convert prices to a common currency, change part numbers to an industry standard.

Map attributes: Parse the first name and last name out of a contact name field, move Part# and partno to the PartNumber field.
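The cleansing functions listed above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular MDM or ETL tool works: the field names (ContactName, Phone, Country), the US-style phone format and the default country are all assumptions made for the example. Only the Part#/partno to PartNumber mapping comes from the text.

```python
import re

# Hypothetical source-to-master attribute mapping (only PartNumber is from the text).
ATTRIBUTE_MAP = {"Part#": "PartNumber", "partno": "PartNumber"}


def normalize_phone(raw):
    """Reduce a US-style phone number to 10 digits, or None if it does not fit."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the leading country code
    return digits if len(digits) == 10 else None


def split_contact_name(contact):
    """Naive split of 'First Last' into (first, last); real names need more care."""
    parts = contact.strip().split()
    if not parts:
        return "", ""
    return parts[0], " ".join(parts[1:])


def cleanse(record, default_country="US"):
    """Return (clean_record, errors) for one source record given as a dict."""
    clean, errors = {}, []
    for key, value in record.items():
        clean[ATTRIBUTE_MAP.get(key, key)] = value  # map attributes
    if "ContactName" in clean:
        first, last = split_contact_name(clean.pop("ContactName"))
        clean["FirstName"], clean["LastName"] = first, last
    if "Phone" in clean:
        phone = normalize_phone(clean["Phone"])
        if phone is None:
            errors.append("Phone")  # leave the value for manual processing
        else:
            clean["Phone"] = phone  # normalize data formats
    clean.setdefault("Country", default_country)  # replace missing values
    return clean, errors
```

Records that come back with a non-empty error list would go to an error table for hand processing rather than into the master or staging tables.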

Most tools will cleanse the data that they can and put the rest into an error table for hand processing. Depending on how the matching tool works, the cleansed data will be put into a master table or a series of staging tables. As each source gets cleansed, you should examine the output to ensure the cleansing process is working correctly.

Matching Data to Eliminate Duplicates

Matching master data records to eliminate duplicates is both the hardest and most important step in creating master data. False matches can actually lose data (two Acme Corporations become one, for example) and missed matches reduce the value of maintaining a common list.

Some matches are pretty trivial to do. If you have Social Security Numbers for all your customers or if all your products use a common numbering scheme, a database JOIN will find most of the matches. This rarely happens in the real world, however, so matching algorithms are normally very complex and sophisticated. Customers can be matched on name, surname, nickname, address, phone number, credit card number and so on, while products are matched on name, description, part number, specifications and price.
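When no shared key exists, one simple way to approximate matching is a weighted similarity score across several fields. The sketch below uses Python's standard `difflib.SequenceMatcher`; the field names, weights and thresholds are assumptions chosen for illustration, and commercial matching engines use far more sophisticated techniques.

```python
from difflib import SequenceMatcher

# Hypothetical field weights; tune these against real data.
DEFAULT_WEIGHTS = {"name": 0.5, "address": 0.3, "phone": 0.2}


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_score(rec_a, rec_b, weights=None):
    """Weighted average of field similarities over fields present in both records."""
    weights = weights or DEFAULT_WEIGHTS
    score = total = 0.0
    for field, weight in weights.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a and b:  # skip fields that are missing on either side
            score += weight * similarity(a, b)
            total += weight
    return score / total if total else 0.0


def classify(rec_a, rec_b, match_at=0.9, review_at=0.7):
    """Label a pair 'match', 'review' (for a data steward) or 'no match'."""
    score = match_score(rec_a, rec_b)
    if score >= match_at:
        return "match"
    if score >= review_at:
        return "review"
    return "no match"
```

The middle "review" band reflects the trade-off described above: setting the match threshold too low risks false matches that lose data, while setting it too high leaves duplicates in the list.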

How Should You Merge Your Data?

Most merge tools merge one set of input into the master list, so the best procedure is to start the list with the data in which you have the most confidence and then merge the other sources in one at a time. If you have a lot of data and a lot of problems with it, this process can take a long time.
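The procedure of seeding the master list with the most trusted source and merging the others one at a time might look like the following sketch. The record layout, the `by_tax_id` matching rule and the "master value wins" policy are all assumptions for the example; a real project would plug in its own matching logic and survivorship rules.

```python
def merge_source(master, source, same_entity):
    """Merge one source list into the master list in place.

    Existing master values win because the master was seeded from the most
    trusted source; an incoming record only fills attributes the master lacks.
    Returns the number of new records added.
    """
    added = 0
    for record in source:
        for existing in master:
            if same_entity(existing, record):
                for key, value in record.items():
                    if existing.get(key) in (None, ""):
                        existing[key] = value
                break
        else:
            master.append(dict(record))  # no match found: a new entity
            added += 1
    return added


def by_tax_id(a, b):
    """Hypothetical matching rule: the same non-empty tax ID means the same entity."""
    return bool(a.get("tax_id")) and a.get("tax_id") == b.get("tax_id")


# Sources ordered from most to least trusted (an assumption for this sketch).
erp = [{"tax_id": "11", "name": "Acme Corp", "phone": ""}]
crm = [
    {"tax_id": "11", "name": "ACME", "phone": "5550100199"},
    {"tax_id": "22", "name": "Globex", "phone": "5550100123"},
]

master = [dict(r) for r in erp]  # seed with the most trusted source
for source in (crm,):            # then merge the rest one at a time
    merge_source(master, source, by_tax_id)
```

The nested loop compares every incoming record with every master record, which is fine for a sketch but is one reason the process can take a long time on large, messy data sets.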

PRO TIP: You might want to start with the data from which you expect to get the most benefit once it’s consolidated and then run a pilot program with that data to ensure your processes work and that you’re seeing the business benefits you expect.

From there, you can start adding other sources as time and resources permit. This approach means your project will take longer and possibly cost more, but the risk is lower. This approach also lets you start with a few organizations and add more as the project demonstrates success instead of trying to get everybody on board from the start.

Another factor to consider when merging your source data into the master list is privacy. When customers become part of the customer master, their information might be visible to any of the applications that have access to the customer master. If the customer data was obtained under a privacy policy that limited its use to a particular application, you might not be able to merge it into the customer master.

Because of the implications around privacy, you might want to add a lawyer to your MDM planning team.

At this point, if your goal was to produce a list of master data, you’re done. Print it out or burn it to an external hard drive and move on. If you want your master data to stay current as data gets added and changed, you’ll have to develop infrastructure and processes to manage the master data over time.

The next section provides some options on how to do just that.


There are many different tools and techniques for managing and using master data. We’ll cover three of the more common scenarios here:

Single copy: In this approach, there’s only one master copy of the master data. All additions and changes are made directly to the master data. All applications that use master data are rewritten to use the new data instead of their current data. This approach guarantees consistency of the master data, but in most cases it’s not practical. That’s because modifying all your applications to use a new data source with a different schema and different data is, at the very least, very expensive. If some of your applications are purchased, it might even be impossible.

Multiple copies, single maintenance: In this approach, master data is added or changed in the single master copy of the data, but changes are sent out to the source systems in which copies are stored locally. Each application can update the parts of the data that aren’t part of the master data, but they can’t change or add master data.

For example:

The inventory system might be able to change quantities and locations of parts, but new parts can’t be added and the attributes that are included in the product master can’t be changed. This reduces the number of application changes that will be required, but the applications will minimally have to disable functions that add or update master data. Users will have to learn new applications to add or modify master data and some of the things they normally do won’t work anymore.

Continuous merge: In this approach, applications are allowed to change their copy of the master data. Changes made to the source data are sent to the master, where they’re merged into the master list. The changes to the master are then sent to the source systems and applied to the local copies. This approach requires few changes to the source systems. If necessary, the change propagation can be handled in the database so no application code is changed. On the surface, this seems like the ideal solution because application changes are minimized and no retraining is required. Everybody keeps doing what they’re doing, but with higher quality, more complete data. However, this approach does have several issues:

Update conflicts are possible and difficult to reconcile: What happens if two of the source systems change a customer’s address to different values? There’s no way for the MDM software to decide which one to keep, so intervention by the data steward is required. In the meantime, the customer has two different addresses. This must be addressed by creating data governance rules and standard operating procedures to ensure that update conflicts are reduced or eliminated.

Additions must be remerged: When a customer is added, there’s a chance that another system has already added the customer. To deal with this situation, all data additions must go through the matching process again to prevent new duplicates in the master.

Maintaining consistent values is more difficult: If the weight of a product is converted from pounds to kilograms and then back to pounds, rounding can change the original weight. This can be disconcerting to a user who enters a value and then sees it change a few seconds later.
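The rounding issue is easy to reproduce. The sketch below assumes, purely for illustration, that both the source system and the master store weights rounded to two decimal places; the conversion factor is the exact definition of the international pound.

```python
from decimal import Decimal, ROUND_HALF_UP

KG_PER_LB = Decimal("0.45359237")  # exact by definition of the international pound
HUNDREDTH = Decimal("0.01")        # assumed storage precision: two decimal places


def lb_to_kg(pounds):
    """Convert pounds to kilograms, rounded to the assumed storage precision."""
    return (pounds * KG_PER_LB).quantize(HUNDREDTH, rounding=ROUND_HALF_UP)


def kg_to_lb(kilograms):
    """Convert kilograms back to pounds, rounded the same way."""
    return (kilograms / KG_PER_LB).quantize(HUNDREDTH, rounding=ROUND_HALF_UP)


entered = Decimal("10.00")   # what the user typed, in pounds
stored = lb_to_kg(entered)   # what the master keeps, in kilograms: 4.54
shown = kg_to_lb(stored)     # what the user sees after propagation: 10.01
```

One possible way to limit this drift is to keep the value and unit exactly as entered alongside the standardized value, and convert only when a different unit is requested.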

In general, all these things can be planned for and dealt with, making the user’s life a little easier at the expense of a more complicated infrastructure to maintain and more work for the data stewards. This might be an acceptable trade-off, but it’s one that should be made consciously.
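As one illustration of planning for update conflicts in a continuous merge, the sketch below applies a change automatically only when every source system that changed a field agrees on the new value, and otherwise queues it for a data steward. The function names, the record layout and the queue are hypothetical; they are not taken from any MDM product.

```python
def find_conflicts(master_record, updates, field):
    """Return the distinct new values that source systems propose for one field.

    `updates` maps a source system name to its changed copy of the record.
    The result maps each proposed value to the systems that proposed it.
    """
    proposed = {}
    for system, record in updates.items():
        value = record.get(field)
        if value is not None and value != master_record.get(field):
            proposed.setdefault(value, []).append(system)
    return proposed


def apply_or_queue(master_record, updates, field, steward_queue):
    """Apply an uncontested change; queue contested ones for manual review."""
    proposed = find_conflicts(master_record, updates, field)
    if len(proposed) == 1:
        master_record[field] = next(iter(proposed))
        return "applied"
    if len(proposed) > 1:
        steward_queue.append((field, proposed))  # the master stays unchanged
        return "queued"
    return "unchanged"
```

For example, if a CRM system and a billing system both change a customer's address to the same value, the change is applied; if they disagree, the master keeps its current address until a steward resolves the queued item.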

Master Data Management Steering Committee

It’s recommended that management-level representatives of the MDM stakeholders form a steering committee to facilitate cross-functional decision-making. An effective steering committee should:

Be sized appropriately: large enough to represent the priority stakeholders, but small enough to quickly analyze key information and make decisions

Be focused on fast decision-making

Become a vehicle for removing organizational barriers and not simply a regular meeting for listening to reports from the Project Team members

Not be a substitute for hands-on sponsorship

Once the stakeholders are identified, the MDM Project Charter should include formation of a steering committee. Based on running hundreds of MDM projects, Profisee recommends that the following roles participate in the committee. Note that there may be more than one team member per role, and some roles may not be applicable to a company’s organizational structure.

