Published: February 27, 2006
Enterprise data integration is a hot topic and covers a wide range of technologies including enterprise application integration (EAI), enterprise information integration (EII), and extract, transform and load (ETL). Master Data Management (MDM) is also a data integration technology and should be added to the list. Just to make life interesting, analyst organizations like Gartner are talking about Customer Data Integration (CDI), which to me is a subset of MDM. Sorting all this out and coming up with a data integration strategy is not easy.
Currently I am conducting a research project on data
integration for The Data Warehousing Institute (TDWI). The
results will be published in a TDWI report and Webcast in
October.
The premise of the research report is that organizations are
moving toward (or need to move toward) an enterprise-wide
data integration strategy (of which data warehousing is a
piece). It will look at different types of data integration
(consolidation, federation, propagation) and data
integration technology (EAI, EII, ETL, MDM). It will discuss
why these technologies complement, rather than compete with,
each other and suggest which technology should be used when.
It will also examine requirements for data integration
products and the concept of a data integration
center.
I am interested to hear if you agree with the premise of the
paper. The topic of the data integration center is also very
interesting. Several companies I have interviewed either
have, or are planning to have, a data integration center.
Both Ascential (IBM) and Informatica are pushing this
concept. Informatica recently published a book on data
integration centers, and it is worth reading even if you
aren't an Informatica customer. You can find the book on
Amazon.
All comments welcome. Colin.
A Historical Perspective
Before adding my two cents on MDM and CDI, let's put
these two terms into a historical perspective. Over the
years, most organizations have developed, or purchased, a
wide range of different front-office, middle-office and back-office
applications for running day-to-day business operations. As
the number of operational applications has increased, so too
has the number of data stores holding important reference
or master data about key business entities such as
customers, products, employees and finances. In most
organizations, this operational master data is dispersed
across many IT systems.
In the past, to keep operational master data consistent, IT organizations developed custom MDM solutions. As companies have moved toward the use of packaged applications for operational processing, they have looked to their applications provider to help solve the MDM problem. Vendors such as i2, Oracle, SAP and Siebel have responded accordingly, but typically only for master data managed by their products. The big thrust this year by both applications and third-party software vendors is to increase the scope of MDM products.
One important business area addressed by applications vendors is customer relationship management (CRM). When CRM was first introduced, there was considerable confusion about what it meant and how to deploy it. Much of the misunderstanding arose because there were three different types of CRM processing (operational CRM, analytical CRM and collaborative CRM), and customers were confused about how vendor products supported each of them. The same confusion exists about MDM today because it can be used for both operational and analytical processing.
One of the main types of master data managed by CRM applications is customer data. To solve customer master data integration issues and to help create "a single view of the customer," applications vendors, third-party software vendors and IT data warehousing groups have been developing and deploying so-called customer data integration applications. I personally think CDI is a bad term because people get confused about the differences between CDI and MDM. Some analysts even suggest you should do CDI before MDM. To me this is backwards thinking, but more on that later. At this point, suffice it to say that CDI, for all intents and purposes, is the same as customer MDM, or C-MDM. It is important to note, however, that a full enterprise-wide C-MDM solution is somewhat broader in scope than a CDI data hub.
There are several issues associated with building a C-MDM application. One of the main ones is creating a common customer identifier that can be used to connect the disparate accounts a customer may have with an organization. This issue alone has spawned vendor software solutions that support customer identity management (CIM).
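To make the identifier problem concrete, here is a minimal Python sketch of one way to derive a common customer ID. It assumes, purely for illustration, that two accounts belong to the same customer when they share an email address or phone number. The field names (account_id, email, phone), the CUST- prefix and the function names are hypothetical, and commercial CIM products use far more sophisticated matching than this.

```python
def normalize(field, value):
    """Reduce a raw value to a comparable form (assumed rules, not a standard)."""
    if field == "phone":
        # Keep digits only, so "555-0100" and "(555) 0100" compare equal.
        return "".join(ch for ch in value if ch.isdigit())
    return value.strip().lower()


def assign_customer_ids(accounts):
    """Map each account id to a common customer id.

    Accounts that share a normalized email or phone number, directly or
    through a chain of other accounts, receive the same customer id.
    """
    parent = {acct["account_id"]: acct["account_id"] for acct in accounts}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as representative so results are stable.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # (field, normalized value) -> first account id seen with it
    for acct in accounts:
        for field in ("email", "phone"):
            raw = acct.get(field)
            if not raw:
                continue
            key = (field, normalize(field, raw))
            if key in first_seen:
                union(first_seen[key], acct["account_id"])
            else:
                first_seen[key] = acct["account_id"]

    return {account_id: "CUST-" + find(account_id) for account_id in parent}


accounts = [
    {"account_id": "A1", "email": "Ann.Lee@example.com", "phone": "555-0100"},
    {"account_id": "B7", "email": "ann.lee@example.com", "phone": None},
    {"account_id": "C3", "email": None, "phone": "(555) 0100"},
    {"account_id": "D9", "email": "bob@example.com", "phone": "555-0199"},
]
common_ids = assign_customer_ids(accounts)
```

In this made-up data, accounts A1, B7 and C3 end up under one customer ID and D9 under another. Exact matching on two fields is the simplest possible rule. Real customer data usually needs fuzzy matching on names and addresses as well.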
Once a common ID has been developed for a customer, the challenge then is to integrate the customer data in the organization. This can be done using data consolidation, data federation, data propagation or a combination of all three. The method chosen will depend on the type of application required (operational or analytic) and whether the various customer data stores have to be synchronized or not.
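The difference between the three integration styles can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any product: the two source systems are plain in-memory dictionaries keyed by a common customer ID, and the names crm, billing, consolidate, federated_view and propagate are all hypothetical.

```python
# Two hypothetical operational systems, each holding part of the customer record.
crm = {"CUST-1": {"name": "Ann Lee", "email": "ann@example.com"}}
billing = {"CUST-1": {"name": "Ann Lee", "balance": 120.0}}


def consolidate(sources):
    """Consolidation: copy data from every source into one physical store."""
    hub = {}
    for source in sources:
        for cust_id, record in source.items():
            hub.setdefault(cust_id, {}).update(record)
    return hub


def federated_view(sources, cust_id):
    """Federation: assemble a merged view at query time; nothing is stored."""
    view = {}
    for source in sources:
        view.update(source.get(cust_id, {}))
    return view


def propagate(cust_id, field, value, origin, targets):
    """Propagation: apply a change at the origin and push it to other stores."""
    origin.setdefault(cust_id, {})[field] = value
    for target in targets:
        if cust_id in target:
            target[cust_id][field] = value


hub = consolidate([crm, billing])  # a snapshot taken now
propagate("CUST-1", "name", "Ann Lee-Smith", origin=crm, targets=[billing])
current = federated_view([crm, billing], "CUST-1")  # reads the live sources
```

After the name change is propagated, the federated view reflects it immediately, while the consolidated hub still holds the old name until it is refreshed. That trade-off between a stable copy and a live view is one reason the choice depends on the type of application and on whether the stores must stay synchronized.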
The key things to note at this point in the discussion are:
The third item above is worthy of further discussion. C-MDM applications are often built by using data warehousing tools to extract and capture customer data into a single operational data store (ODS). Many of these applications are used for both operational processing and operational business intelligence. The result is that many data warehousing experts consider C-MDM a business intelligence (BI) application. This is not true. C-MDM can be used for both operational transaction processing and BI processing. Given, however, that the dividing line between the two types of processing is blurring, it is easy to see why the confusion arises. Although the demarcation between the two types of processing is academic (business users don't care about it), it does have organizational and political implications. This is why some companies are merging their operational and business intelligence IT groups, especially in the area of data integration.
Another major issue for C-MDM, and other types of MDM, is metadata and metamodel management. I have already discussed how customer identity management is a problem, but another difficulty is that customer reference data and relationships vary over time. This issue has important implications for business intelligence applications that may analyze customer data across various time periods, comparing revenue this month to the same month last year, for example. If, during the last 12 months, the customer hierarchies have changed or the sales organization has been restructured, then this will affect the validity of the comparison. This means that metadata and metamodel changes may have to be tracked and recorded in MDM applications. It also means that the development of MDM applications requires significant business domain expertise.
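One common way to track such changes is to store each relationship with the dates between which it was valid, so that an analysis can ask what the hierarchy looked like at a given point in time. The Python sketch below illustrates the idea only. The table layout, the region names and the function region_as_of are assumptions made for this example, not features of any particular MDM product.

```python
from datetime import date

# Hypothetical effective-dated assignments of a customer to a sales region.
# valid_from is inclusive, valid_to is exclusive, and None means "still current".
ASSIGNMENTS = [
    {"customer": "CUST-1", "region": "East",
     "valid_from": date(2005, 1, 1), "valid_to": date(2005, 9, 1)},
    {"customer": "CUST-1", "region": "Central",
     "valid_from": date(2005, 9, 1), "valid_to": None},
]


def region_as_of(assignments, customer, as_of):
    """Return the region the customer belonged to on as_of, or None if unknown."""
    for row in assignments:
        if row["customer"] != customer:
            continue
        ends = row["valid_to"]
        if row["valid_from"] <= as_of and (ends is None or as_of < ends):
            return row["region"]
    return None


last_year = region_as_of(ASSIGNMENTS, "CUST-1", date(2005, 2, 27))
this_year = region_as_of(ASSIGNMENTS, "CUST-1", date(2006, 2, 27))
```

Here the customer sits in a different region in each period, so a year-over-year revenue comparison by region has to decide which version of the hierarchy to report against. Without the recorded history, that decision cannot even be made.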
Developing an MDM Solution
Developing an enterprise-wide MDM solution is a massive
multiyear project. Also, MDM vendor solutions that address
the issues I have outlined in this article are still
immature. This is why many IT organizations are deploying
tactical MDM projects such as customer and product data hubs
to solve near-term problems. It is also why some analysts
think CDI comes before MDM. It is important, however, to
think top down and build bottom up. It may be necessary to
develop tactical MDM applications, but it is equally
important that MDM and C-MDM solutions employ an
enterprise-wide approach to data integration, data quality
and metadata management.
Given the importance of MDM, Claudia Imhoff and I have teamed up with the Business Intelligence Network to write a research report on MDM. The focus of the report will be customer usage and case studies. The report and associated Web event will be available by the middle of this year. Watch this space for further details.