Long Live MDM
Editor’s Note: Today’s post was written by Jeff Schaffzin. Jeff is an independent consultant with over 15 years of experience in high tech. He’s worked with a number of leading software vendors in roles such as product marketing, professional services and information technology. Specializing in data management, Jeff has spent the last three years focusing on Customer Data Integration and Master Data Management and has worked with a number of high profile companies in the United States and abroad.
DISCLAIMER: While the facts that I’ve included here are true, I’m speculating on the reasons why they’re taking place. I have no affiliation with any company mentioned here, nor should my opinions be construed as knowledge of their actions.
If you, like me, have followed MDM for the past year or two, you knew that what has been happening recently was going to happen sooner or later. Whether it was due to choice or necessity, MDM has been in the IT press a lot lately. Oracle acquired Silver Creek to enrich its product information management offering. Talend has announced and started to promote its open source MDM application. Data integration provider Informatica acquired Siperian recently in order to enter the MDM space and IBM recently acquired Initiate Systems as well.
Each of these events leads to one key question – how will this impact MDM in the short term and in the future? Given my understanding of the space, I think three scenarios are likely:
Scenario 1
It is hard to ignore the movements that IBM and Oracle have been making in the past year or so. In a market economy, the goal is to have as much market share as possible. In order to do this, you either build new products or acquire existing companies that have the technologies that you want.
While each company has done a combination of both building and buying solutions, their strategic plans are hardly a secret. IBM has proposed a vision of an end-to-end data management platform, which includes their MDM offering as well as reporting tools like Cognos and analytics/statistics from SPSS. Now that IBM has acquired Initiate Systems to complement their MDM stack, the question is why. Do they want to be known as a serious player in the health care industry? There could be other reasons too – they may consider MDM just a small piece of their data management toolkit and this could solidify that position, moving MDM from one of the hottest ‘technologies’ out there to just a “means to an end” to increase market share for their software business unit. Regardless of the reason, it means one less major MDM player in the market.
Then we have Oracle. For as long as I can remember, Oracle has been promoting its Fusion strategy. For those of you who are not familiar with it, Fusion is Oracle’s attempt to provide one code base that would pull together the applications it has built and purchased. This momentous undertaking was finally demonstrated at last year’s Oracle Open World (while Oracle continued to acquire other companies such as Silver Creek Systems).
However, as with IBM, one can only speculate on where MDM fits in this Fusion strategy. Oracle has always promoted its database first and sold its applications second. Even with the numerous special purpose hubs they’ve been developing lately, could MDM finally be the technology that enables Oracle to move from being a database vendor to being a true platform player? Only time will tell with this one.
Scenario 2
There’s always the possibility that MDM has been considered the “secret sauce” – the so-called missing link – that rounds out the product lines for data integration/migration vendors.
Talend’s acquisition of French software company Amalto provided them a way to enter the MDM space. The open source vendor has been a darling of the analysts for a number of years and even won an award from Gartner, one of the first (if not the first) that Gartner has given to such a company. However, I don’t have contacts within Talend, so it’s not clear to me what their next step will be, although they seem to be focusing their energies mostly on MDM after hiring two people to drive that effort within the past six months or so.
As the de facto leader in data integration, Informatica needed to extend its reach beyond that space. If you look at their job listings, they are looking for someone to market their CEP (Complex Event Processing) efforts. Relatively recently, they were looking to hire someone who had experience with ERP or MDM, but it is unclear which path they decided to take with that. Regardless, there were always looming rumors of them wanting to add MDM to their portfolio, with the press suggesting that they would acquire Initiate Systems. However, instead of buying Initiate, they purchased Siperian – a company half Initiate’s size in terms of customer base and revenue.
In either of these cases, MDM may not be their flagship product, but at least it shows that it is a viable technology and shows that it is something that won’t be going away any time soon.
Scenario 3
People like me who have been in the data management space are always interested in improving something. I believe in the statement, “even if something isn’t broken, there’s always a reason to make it better.” This was clear when Customer Data Integration (CDI) first came about and many companies hopped on that bandwagon, knowing that they wanted a way to track their customers more efficiently.
At the same time, other companies explored Product Information Management (PIM), a way to have a single source of product information which was sourced from PLM, inventory and supply chain systems. Following that was the concept of MDM – a beautiful vision – having a single source of truth that can be used by an entire company.
Now we have a new concept that has been promoted – Multi-domain MDM. Siperian and other companies have begun to promote this to show the world that they are truly the most advanced players out there. While this has been going on, there have been rumblings about Enterprise Information Management (EIM). What I’m still not clear on is – what’s the difference between multi-domain MDM and EIM? Are they the same? If not, what are the differences between the two concepts?
In any case, there’s a lot to think about. I don’t know where you stand, but one thing is certain – MDM is not going away, at least for the foreseeable future.
Data Profiling For All The Right Reasons, Part 5
The Hub Designs Blog welcomes the final installment of this great series by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.
Part 5: The Profiling Payoff
This is the final part of a five-part series, describing how data profiling benefits both IT projects and business operations. In Part One, we discussed profiling perspectives. In Parts Two, Three and Four, we introduced the value of system, entity, and attribute-level metrics. This part discusses the archival and beneficial uses of profile results.
If you have defined your corporate data profiling strategy similar to the methods discussed in the preceding parts of this series, you’ll have amassed a robust collection of metadata spanning relevant systems across your business. Although systems may be of different types and locations, the structured approach and common metrics you collected create a centralized repository of information that can be examined holistically. Ideally, this information will exist in an open-source database repository with reports made available across the enterprise. System and Entity information help planners and developers organize information strategies. Attribute-level domains, constraints, and business rules help data architects understand existing systems. Relationships and value patterns are readily available to support validation of information-related hypotheses as needed.
If you plan to design your own repository, consider adding timestamps and indicators to help you manage and present the information. To keep your repository relevant to business needs, design collection rules to be configurable. This allows you to easily ignore superfluous information or enable tests only at certain critical times. Allow initial system profiling efforts to gather a large set of metrics and store them as your baseline. As you learn about the information, you will see which tests or which data objects add no value. We geeky DBA types who understand system-level catalogs have our own scripts to do much of what was described in Parts Two, Three and Four. Those less inclined may prefer to use a third-party tool for profiling. Either way works as long as the business needs are satisfied and the entire enterprise standardizes on one approach (and thus one integrated repository).
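To make the repository idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and rule names (profile_rule, profile_result, row_count and so on) are my own assumptions for illustration, not a prescribed design. It shows one way to carry a collection timestamp, a baseline indicator, and an on/off switch for each collection rule.

```python
import sqlite3
from datetime import datetime, timezone

# In-memory database for illustration; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE profile_rule (
    rule_name TEXT PRIMARY KEY,
    enabled   INTEGER NOT NULL DEFAULT 1   -- switch a test off without deleting it
);
CREATE TABLE profile_result (
    system_name  TEXT NOT NULL,
    entity_name  TEXT NOT NULL,
    rule_name    TEXT NOT NULL REFERENCES profile_rule(rule_name),
    metric_value REAL,
    is_baseline  INTEGER NOT NULL DEFAULT 0,   -- marks the initial, wide collection
    collected_at TEXT NOT NULL                 -- ISO 8601 timestamp in UTC
);
""")

# Configurable collection rules: pattern_mask is disabled for now.
conn.executemany(
    "INSERT INTO profile_rule VALUES (?, ?)",
    [("row_count", 1), ("null_pct", 1), ("pattern_mask", 0)],
)

def enabled_rules(conn):
    """Return the names of the rules that are currently switched on."""
    cursor = conn.execute(
        "SELECT rule_name FROM profile_rule WHERE enabled = 1 ORDER BY rule_name"
    )
    return [row[0] for row in cursor]

def record_result(conn, system, entity, rule, value, baseline=False):
    """Store one metric value together with its collection timestamp."""
    conn.execute(
        "INSERT INTO profile_result VALUES (?, ?, ?, ?, ?, ?)",
        (system, entity, rule, value, int(baseline),
         datetime.now(timezone.utc).isoformat()),
    )

# Placeholder metric values; a real collector would compute them.
for rule in enabled_rules(conn):
    record_result(conn, "sales_db", "orders", rule, 0.0, baseline=True)

print(enabled_rules(conn))
```

Keeping the rules in a table rather than in code means a test can be switched off once it proves to add no value, without losing the history it already produced.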
You will find that collecting and maintaining this level of detail has a definite cost. Even if the collection is automated, interrogating large data sets places an overhead on production systems that may not be practical. Record and monitor profile execution metrics to identify bottlenecks or tuning opportunities. Realize that the extent of data profiling is contingent on the project phase, specific data elements, and most of all, business value. Review profiling goals on a regular basis and eliminate unnecessary and redundant checks.
How much profile history to maintain is another consideration. Even though disk is “relatively” cheap, maintaining all historical entries in a live repository may not be necessary. Consider business needs and value for historical profile information. Even consider archiving at a summarized (or less frequent) level and keep only a limited time window of statistics online.
This discussion on data profiling was intended to broaden perceptions of what it means to a business and the value it can bring if done in a sustainable way. The blog format is not conducive to in-depth discussions, but hopefully the topics covered here spur some thoughts into how you can add value to your business by implementing some of these concepts. Use your imagination, but remember that no matter how cool it might be to collect and store some profile output, if it does not add business value to somebody, it might not be worth the overhead to continue recording it.
Go back to Part 4.
Initiate Systems Acquired By IBM
Today, IBM announced that it is acquiring Initiate Systems.
This was widely rumored last week, but the announcement of Informatica’s acquisition of Siperian took my mind off this temporarily.
On the face of it, it makes all the sense in the world. IBM knows a good product when it sees it, and Initiate has been doing well in the MDM world lately, particularly in the healthcare vertical, where it grew up, and in the public sector vertical. IBM’s press release explicitly mentions Initiate as a leader in “data integrity software for information sharing” among healthcare and government organizations. I thought it was interesting that the IBM release didn’t mention the terms “master data management” or “MDM” even once.
I was a little surprised that IBM’s release made no mention of the financial terms, since IBM is a public company, but I’m sure it will only be a matter of time before those details become available to those who know where to look or whom to ask.
It wasn’t a surprise to see the IBM release mention the stimulus funding being invested around the globe. When I first saw the rumors last week, I immediately thought – IBM is buying Initiate to be better prepared for the various e-Healthcare initiatives that are coming down the pike.
Where things may get a bit tricky is explaining the multiple MDM platforms from IBM to potential customers, and managing several different development roadmaps and product portfolios. There’s the IBM InfoSphere MDM Server (the former DWL product) and there’s also IBM InfoSphere MDM Server for Product Information Management (the former Trigo product). And now there’s the Initiate product too.
While the acquisition does make sense, there is an “embarrassment of riches” factor. IBM will, of course, develop a sales playbook explaining what situations at what type of customer are a good fit for each product.
I think the lingering feeling I have with Initiate Systems is that it may be headed for a “golden ghetto” at IBM – never to reach its full potential as a solution across many different industries, and eventually to handle many different domains of master data. IBM may (and rightly so, in its mind) pigeonhole it into the healthcare and government verticals.
But Initiate’s had some good success outside those two industries. In the Financial Services vertical, they’ve got customers like Capital One Financial, Countrywide Financial (now Bank of America), eSure Insurance, and Wells Fargo. In the Hospitality industry, they’ve got Choice Hotels. In manufacturing, they’ve got Mitsubishi Motors Australia. In the Logistics vertical, they’ve got Federal Express. In the retail sector, Barnes & Noble, CVS, Longs Drug Stores and SuperValu are all customers. And in the high tech space, they’ve got Dell, Ingenix, Intuit, LocatePLUS, Microsoft and National Instruments.
Unfortunately, they didn’t achieve enough critical mass in any of these other verticals to compete with the strong momentum they’d developed in healthcare and government.
As I said last week, these are interesting times in the MDM world. The recent M&A activity, the healthy demand from large and medium sized corporations, the large number of consultants from other areas claiming to now have experience in MDM – these are all signals to me of a large and fast-growing market. So the New Year, for those of us in the MDM space, is off to a good start.
Siperian Acquired By Informatica
Siperian, one of the last best-of-breed providers of master data management (MDM) technology, is being acquired by Informatica.
The two firms were already working together closely, having an alliance and OEM relationship through Informatica’s acquisitions in 2008 of Identity Systems (for entity resolution and matching) and in 2009 of Address Doctor (for customer address cleansing).
This will strengthen the Siperian product further by bringing Informatica’s technology even more tightly into the Siperian MDM Hub.
At the same time, it eliminates the “company viability” question mark that sometimes gets raised in large IT shops’ minds when evaluating enterprise software vendors. When a Fortune 500 company is evaluating a smaller company, they sometimes wonder how long a company like Siperian can last against companies like IBM, Oracle and SAP. I’ve never been a big fan of that argument, since some of the best software gets created at small and medium-sized companies, but there’s no doubt it’s an obstacle to be overcome with the larger enterprises. Now, it shouldn’t be an issue.
As a Siperian partner, Hub Designs is excited about this acquisition. Based on the information we’ve got at this point, it seems like a good thing for Siperian’s customers, products, shareholders, partners and people. In today’s economic climate, dreams of a big IPO (for any venture-backed technology company) are unlikely, so an acquisition by a well-run larger company is a good outcome.
I know a lot of the people at Siperian personally, and have worked closely with them over the last few years. I hope the people at Informatica realize what a strong team they are getting in this acquisition, and do everything they can to hang onto them all.
I do suggest they stop using the term “MDM Infrastructure” though (which appeared 5 times in Informatica’s press release announcing the acquisition). First, it’s not accurate – MDM projects typically need to be driven by the business to be successful, so they can’t and shouldn’t be thought of as “IT Infrastructure” projects. Secondly, from a marketing perspective, “infrastructure” is about as exciting as mud – it’s hard to get senior management excited about spending money on something with the word “infrastructure” in the name.
As for the acquisition’s impact on the rest of the MDM market, it’s still growing pretty quickly, but the number of players is shrinking. So I think we’ll see it become even more competitive, and with Informatica now becoming a strong player in the MDM hub market, that’s got to cool its relationship with Oracle, who selected Informatica as an OEM component of its Oracle Fusion MDM hub.
IBM is rumored to be acquiring Initiate Systems, which is an interesting play in its own right, especially given the expected growth in spending in the e-healthcare space over the next few years.
And SAP continues to improve its products slowly but steadily, while D&B/Purisma is doing some interesting things with web services access to the D&B central database of information on businesses.
As for the remaining independent MDM vendors, like Orchestra Networks and Kalido, or Product Information Management (PIM) solutions like Stibo and Riversand, they should see this as further validation of the strength of the MDM market. Kalido feels that it’s the only independent MDM provider who can manage every master data domain. That may be true. I plan on learning more about Kalido over the next few months.
So like the Chinese curse, “may you live in interesting times”, the beginning of 2010 promises to be interesting for all of us in the MDM business!
If you’d like to continue the discussion on the “Impact of Informatica’s Acquisition of Siperian”, click http://ning.it/aJ1Xj5.
Data Profiling For All The Right Reasons, Part 4
The Hub Designs Blog welcomes Part 4 of this series by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.
Part 4: Profiling Relationships and Patterns
This is part four of a five-part series describing how data profiling assists in all aspects of system development, from design through deployment.
Part One introduced different perspectives on data profiling. Part Two identified valuable system and entity metrics to track. Part Three discussed attributes. In this segment, we dive deeper into attribute relationships and pattern recognition. Also, we expand on primary key identification discussion and discuss hidden relationships.
Pattern grouping provides a mask of distinct format patterns within an attribute data set and a count of the number of occurrences. Patterns give insight into the type of values found in an attribute. For example, a numeric pattern analysis may show values such as 999.99999, 99, or -.9999.
Observing distinct patterns gives insight into the maximum digits and precision, and also into domains such as integer or real. Pattern analysis of a native database date or date-time type returns unremarkably similar patterns for all dates. Because the database management system typically enforces the domain, pattern analysis of native dates provides no value and can be ignored. If dates are stored in character format, however, patterns quickly show variations in date formatting. Character patterns only have significance to a limited number of positions. It makes no sense to pattern a description field of 200 or 2000 characters. Smaller code attributes of fewer than 10 characters, though, do provide value. Ignore pattern profiling for character strings over 20 characters at first, then refine to shorter character strings if the results do not add value.
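As a rough illustration of pattern grouping, here is a small Python sketch. Replacing each digit with 9 follows the numeric examples above. Replacing each letter with A, and the function names, are my own assumptions. Strings longer than 20 characters are skipped, in line with the starting point suggested above.

```python
from collections import Counter

def pattern_mask(value: str) -> str:
    """Replace each digit with 9 and each letter with A; keep other characters."""
    mask = []
    for ch in value:
        if ch.isdigit():
            mask.append("9")
        elif ch.isalpha():
            mask.append("A")
        else:
            mask.append(ch)
    return "".join(mask)

def pattern_counts(values, max_length=20):
    """Count occurrences of each mask, skipping NULLs and overly long strings."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        text = str(value)
        if len(text) > max_length:
            continue
        counts[pattern_mask(text)] += 1
    return counts

# Hypothetical dates stored in a character column.
sample = ["2010-01-15", "01/15/2010", "2010-02-01", "15-JAN-10", None]
for mask, occurrences in pattern_counts(sample).most_common():
    print(mask, occurrences)
```

Run against a character column that holds dates, a result like this shows at a glance how many competing date formats are in use.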
In pure database theory, referential integrity (RI) is your friend. In practice, designers and software vendors often forgo RI to improve system performance on data inserts. These designers place the data quality burden on the application and do not endorse external data manipulation outside the application interfaces. In the real world, though, data corruption occurs and without RI or routine data quality checks, corruptions may not be found for a long time or not at all. Personally, I have identified over $50,000 of recent orphaned sales in a retail client resulting from deliberately disabled RI. These unreported sales were not added to the ledger and were allowed to occur for performance reasons until I found them through simple profiling. Enforcement of RI is a topic for another discussion but is mentioned here because it does identify a valid reason for data profiling.
Even in presumably good relational designs, some parent-child relationships are not enforced, for various reasons. First, interrogate the RI listed in the system catalogs to identify all enforced relationships. Reverse-engineering a system with a good modeling tool is probably the best way to do this. A harder and more valuable analysis is to identify unenforced relationships and determine the probability of a relationship when not all values are an exact match. Do this by counting all the candidate child attribute values that exist within a known parent attribute table. If all match and there is a non-trivial number of matches, there is a good probability of a non-identified relationship. A small number of mismatches could identify data quality issues.
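The counting step can be sketched in a few lines of Python. The function name and the sample data are hypothetical, and a real check would normally run as a query against the database rather than over in-memory lists.

```python
def relationship_match(child_values, parent_values):
    """Return (matched, total, ratio) for non-NULL child values found in the parent."""
    parent_set = {v for v in parent_values if v is not None}
    candidates = [v for v in child_values if v is not None]
    matched = sum(1 for v in candidates if v in parent_set)
    total = len(candidates)
    ratio = matched / total if total else 0.0
    return matched, total, ratio

# Hypothetical data: order rows pointing at customer ids, one of them orphaned.
customer_ids = [1, 2, 3, 4, 5]
order_customer_ids = [1, 1, 2, 3, 3, 3, 5, 9, None]

matched, total, ratio = relationship_match(order_customer_ids, customer_ids)
known = set(customer_ids)
orphans = sorted({v for v in order_customer_ids if v is not None and v not in known})
print(f"{matched} of {total} child values match ({ratio:.1%}); orphans: {orphans}")
```

A ratio of 1.0 over a non-trivial number of rows suggests an unenforced relationship. A ratio just under 1.0, as in this sample, points at orphaned rows worth investigating.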
In Part 5, we tie all the techniques discussed in the first four parts together to show the value of a repeatable data profiling process.
Data Profiling For All The Right Reasons, Part 3
The Hub Designs Blog welcomes Part 3 of this series by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.
Part 3: Attribute-Level Analyses
This is part three of a five-part series on data profiling.
In Part One, we took a light-hearted view of where profiling benefits an organization and in Part Two, we discussed the fundamentals of a profiling strategy. The remaining three parts discuss attributes, relationships, patterns, and how to use the combined data profiling information you collect. In this section, we introduce attributes, the lowest-level components of a profiling effort.
An attribute is simply an individual data element. Alone, an attribute has no context. The simple descriptor “Cost” tells us very little about an attribute’s true purpose and immediately drives a need for additional information, such as units (hours, Dollars, Euros…), type (weighted, unit, gross…), and use (invoice, sum, average…). Attributes therefore must be analyzed within the context of their business purpose to have meaning.
Some characteristics require business knowledge to define and others can be determined through interrogation of existing values and underlying rules of the storage medium. It takes both analyses to get a complete picture of information within a system. While assembling this puzzle, though, keep in mind that until you validate the enforcement of business rules, only assumptions can result from physical profiling or business context.
Analyses of values, domains, and constraints allow insight into use (or abuse) of an attribute. The larger the sample size, the more confidence you gain in the results. Without explicit proof of business rule enforcement, though, you must assume that a value can exist even if it does not presently exist. Business rules are defined by business experts and enforced through database constraints, data type/precision, and application code. Knowing the methods of enforcement allows you to narrow a domain but not totally understand it. Profiling of actual values provides additional refinement in terms of percentage of NULL values, percentage of distinct values, minimum, maximum, and average values, top x and bottom x recurring values along with their counts, and minimum, maximum, and average data lengths.
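The value-level metrics listed above can be collected in one pass over a sample. The sketch below is a minimal Python version. It assumes a numeric attribute, the function name and dictionary keys are my own, and it computes the distinct percentage over all rows, including NULLs.

```python
from collections import Counter

def profile_attribute(values, top_n=3):
    """Summarize one numeric attribute: NULLs, distinct values, ranges, lengths."""
    total = len(values)
    present = [v for v in values if v is not None]
    counts = Counter(present)
    ranked = counts.most_common()              # most frequent values first
    lengths = [len(str(v)) for v in present]   # length of the text representation
    return {
        "null_pct": 100.0 * (total - len(present)) / total if total else 0.0,
        "distinct_pct": 100.0 * len(counts) / total if total else 0.0,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "avg": sum(present) / len(present) if present else None,
        "top": ranked[:top_n],
        "bottom": ranked[-top_n:],
        "min_len": min(lengths) if lengths else None,
        "max_len": max(lengths) if lengths else None,
        "avg_len": sum(lengths) / len(lengths) if lengths else None,
    }

# Hypothetical sample of a "unit cost" attribute with one NULL.
unit_cost = [10, 20, 20, None, 30, 20]
for metric, value in profile_attribute(unit_cost, top_n=1).items():
    print(metric, value)
```

Remember the caveat above: these numbers describe the values that exist today, not the values the business rules allow.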
Some attributes within a data set serve valuable purposes that are important to identify. Attributes that individually or in conjunction with others define uniqueness of the data set also may support relationships between entities. Uniqueness can be further classified as being either members of a system-enforced primary key or of a business key (outside of the defined primary key). System-enforced primary keys are relatively easy to define within a database system through interrogation of the system catalog. Business keys that exist in tables in addition to a primary key may be more difficult to identify, especially if more than one attribute is needed to define uniqueness.
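One brute-force way to hunt for business keys is to test column combinations for uniqueness, as in the Python sketch below. The function name, the two-column limit, and the sample rows are assumptions of mine. On wide tables the number of combinations grows quickly, so restrict the candidate columns first.

```python
from itertools import combinations

def candidate_keys(rows, columns, max_width=2):
    """Return column combinations whose values are unique and non-NULL in every row."""
    found = []
    for width in range(1, max_width + 1):
        for combo in combinations(columns, width):
            # A superset of a key already found is unique by definition; skip it.
            if any(set(key) <= set(combo) for key in found):
                continue
            seen = set()
            unique = bool(rows)
            for row in rows:
                key = tuple(row[col] for col in combo)
                if None in key or key in seen:
                    unique = False
                    break
                seen.add(key)
            if unique:
                found.append(combo)
    return found

# Hypothetical sample rows; "id" plays the role of the system-enforced primary key.
rows = [
    {"id": 1, "store": "A", "sku": "X1", "region": "East"},
    {"id": 2, "store": "A", "sku": "X2", "region": "East"},
    {"id": 3, "store": "B", "sku": "X1", "region": "West"},
]
print(candidate_keys(rows, ["id", "store", "sku", "region"]))
```

On this tiny sample both (store, sku) and (sku, region) come back as candidates, which is a useful reminder that uniqueness in a sample is only a hypothesis until a business expert confirms it.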
Attribute-level information of interest includes: data type (size and precision), the number and percent of NULL values, column descriptions, number and percent of distinct values, and the minimum-maximum-average values and lengths. The system catalog provides some of this information, but the rest must be collected by sampling the data.
Other types of attributes that may help in identifying relevancy are those that provide system-level auditing or change control. Knowing which attributes fill these roles may either allow you to (a) ignore them for profiling purposes or (b) use them to help explain versions or data anomalies.
Part 4 expands on attribute profiling with the introduction of relationships and patterns.
Data Profiling For All The Right Reasons, Part 2
The Hub Designs Blog welcomes Part 2 of this series by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.
This discussion is the second of a five-part series on data profiling. In Part 1, we discussed the project roles that benefit from data profiling and how better understanding information results in more reliable information systems. Important goals of any profiling strategy include automation of metric collection and socializing results to support the differing objectives of a data-centric project.
Early in a system development life cycle, profiling helps define sources, data storage requirements, and data transformations. As a system goes into production (or if profiling is added to an existing system for quality control purposes), routine profiling is useful to audit system quality and business rule enforcement. The frequency of collection and amount of effort you expend to automate your profiling methods should be based on the ability of the organization to benefit from the profile results.
This section discusses the beginnings of a profiling effort. Information assembled here forms the foundation of other profiling activities. For this discussion, consider a Profile Group as a set of information sharing a common purpose and data management methods. Examples of profile groups include tables within a single database schema or a group of spreadsheets with the same format but each spreadsheet representing a different time slice of data.
The underlying System managing a set of information within the profile group may be a named relational database, a file system directory, or even a web site being accessed through web services. The reason we abstract information into Systems is to group the information into distinct governance methods common to the underlying information. Relevant metadata and governance methods we track at the system-level include: technical contacts, backup schedules, system descriptors, connection strings, business unit owners, and host operating systems. System-level metadata common to a profile group helps us understand and troubleshoot future analyses. This level of information also provides developers with an understanding of inherent restrictions (or freedoms) they may encounter when trying to use or integrate the information.
Entities within a profile group belong to the same system, may have a common unique identifier, and, for database entities, have the same schema owner. Typically, entities are database tables, but may also be similar files or spreadsheet tabs containing like attribute lists. For entities, we track characteristics common to all the attributes they contain. These include: row counts, entity-level descriptors, growth characteristics (size and frequency), last analyzed date, and various customized indicators such as active/inactive, existence of change data management attributes such as insert/update timestamps, and existence of audit traceability indicators such as insert/update username.
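As one possible starting point for entity-level collection, the Python sketch below reads SQLite's catalog (sqlite_master and PRAGMA table_info) to gather row counts and detect change-tracking columns. Other database systems expose similar catalogs under different names. The audit column names and the demo tables are assumed conventions, not a standard.

```python
import sqlite3

# Hypothetical naming convention for change-tracking and audit columns.
AUDIT_COLUMNS = {"inserted_at", "updated_at", "inserted_by", "updated_by"}

def entity_metrics(conn):
    """Collect a row count and the audit columns present for every table."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    metrics = []
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 holds the column name.
        columns = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
        row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        metrics.append({
            "entity": table,
            "row_count": row_count,
            "audit_columns": sorted(columns & AUDIT_COLUMNS),
        })
    return metrics

# Small demo schema so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT);
CREATE TABLE codes (code TEXT);
INSERT INTO customers (name, updated_at) VALUES ('Ann', '2010-01-01'), ('Bob', '2010-01-02');
""")
for entry in entity_metrics(conn):
    print(entry)
```

Storing these rows with a collection date on each run also gives you the growth characteristics (size and frequency) over time.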
The combination of system and entity-level profiling supplies the foundation for the attribute-level profiling, which is where physical information in a system resides. It also provides valuable metadata to classify information and allows for future correlation of like information across systems. Assembly and publication of entity and system-level information benefits the various consumers of the information by providing a centralized “master” source of contact and context information.
In Part 3, we will dive into the attribute level analyses around data profiling.
Data Profiling For All The Right Reasons, Part 1
The Hub Designs Blog welcomes another guest post by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.
Part 1: The Psychology of Data Profiling
Swiss psychologist Carl Gustav Jung founded the Analytical School of Psychology. His word association theories form the basis of the Myers-Briggs Type Indicator Assessment test to identify career aptitude in today’s high school students. Dr. Jung’s approach assigned personality profiles based on how an individual’s thoughts associated with various phrases. By analyzing responses, he could understand how individuals viewed the world around them and perceived themselves. Typically, subjects are asked to speak the first thought entering their minds after hearing a trigger phrase. For the following example, remember, there are no wrong answers. If I say the words “Data Profiling”, what is the first thing you think of?
If you thought of food, cats, country music, CSI NY, or residential plumbing, you are either not in IT or are an IT Manager.
If your first thought was “Quality Assurance”, you align yourself with data quality professionals having anti-social thoughts of failing test cases and sadistically reporting lazy developers for buggy code. You gleefully scour test cases looking for any evidence of truncation, missing values, non-matching codes, numeric precision errors, and inconsistent abbreviation, text, and date formatting.
If “Integration” comes first in your mind, past legacy integration projects have scarred you with a disdain for source system data quality levels. You view production apps with contempt and loathe the time it takes to track down data issues caused by system integrations. You investigate upstream sources to create detailed mappings and transformation rules. Typical debugging sessions consist of validating relationships to identify orphaned data, identifying attributes that contain overloaded columns (attributes containing more than one distinct data element), or fixing format errors from implied decimals.
Some of you responded with “Value Domains” or “Data Types”, indicating you are obsessive compulsive data architects compelled to organize the world into strict and orderly fashion with some degree of normalization, though you are not considered “normal” by your peers. Your concerns lie in understanding and regulating naming conventions, relationships, existence of NULL or default values, and understanding the meaning of each data element to accurately identify business rules and when two or more objects are related or redundant.
Lastly, if “Debugging” is the first item in your thought queue, you are a coder justifying why presumably good code is not working. Extreme paranoia has taught you to assume nothing about data quality, so you add tests to identify duplicates, validate relationships, enforce business rules, track change data capture, and provide substitute values. Your phobia of early morning phone calls causes you to add auditing to your code to inform a DBA of data issues rather than waking you up in the middle of the night.
It is truly amazing how much we can conclude from the response to one simple phrase.
As stated before, there are no wrong answers. Aside from the innocent jab at Managers and non-IT resources, we all realize the benefits of information quality and absolutely need business involvement to understand context and domains of business information. The meaning and actions of Data Profiling change both by role and by project phase. Through profiling, we are able to identify best sources of information, learn proper ways to categorize and store it, reactively identify quality issues, and proactively define business rules to prevent future issues.
Identifying what is important to profile, when and how profiling is done, and how to share our findings across business and project resources is key. Done properly, profile results integrate to a master metadata repository and are periodically refreshed through an automated process.
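To make a few of the checks mentioned above concrete (missing values, duplicated values, orphaned data), here is a minimal, tool-agnostic sketch in Python. The function names, column names and sample rows are all invented for illustration; a real profiling effort would run against actual source tables and write its results to a metadata repository.

```python
from collections import Counter

def profile_column(rows, column):
    """Return basic profile statistics for one column of a list of dict rows."""
    values = [row.get(column) for row in rows]
    # Treat None and blank strings as missing values.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "duplicated_values": sorted((v for v, n in counts.items() if n > 1), key=str),
        "max_length": max((len(str(v)) for v in present), default=0),
    }

def orphaned_keys(child_rows, fk_column, parent_rows, pk_column):
    """Return foreign-key values in child_rows that have no matching parent row."""
    parent_keys = {row[pk_column] for row in parent_rows}
    return sorted({
        row[fk_column] for row in child_rows
        if row[fk_column] is not None and row[fk_column] not in parent_keys
    })

# Hypothetical sample data, invented for this example.
customers = [
    {"customer_id": 1, "name": "Acme Corp", "state": "MA"},
    {"customer_id": 2, "name": "Globex", "state": ""},
    {"customer_id": 3, "name": "Acme Corp", "state": None},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 4},  # no such customer: an orphaned row
]

name_profile = profile_column(customers, "name")
state_profile = profile_column(customers, "state")
orphans = orphaned_keys(orders, "customer_id", customers, "customer_id")

print(name_profile)
print(state_profile)
print(orphans)
```

Which statistics matter will vary by role and project phase, as described above; this sketch only shows that the basic checks are simple to automate and refresh.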
This five-part series provides a tool-agnostic approach to comprehensive data profiling, focusing on information meaning and use. The next part of the series discusses system and table-level profiling: in particular, what information is important to collect at the system and table level, and how that information can be leveraged by the Enterprise to help assure quality. The third part dives into attribute-level profiling and the fourth discusses attribute patterns and relationships. The final part discusses the benefits and utility of gathering profiled information into a single repository.
Continue with Part 2.
2009 Year in Review
As we’re about to enter 2010, it’s a good time to reflect on what happened in 2009 and what it all means.
“It was the best of times; it was the worst of times…” So Dickens begins “A Tale of Two Cities”, but it’s also a good description of the past year.
The first half of the year was one of the most challenging I’ve faced in my twenty-three year career in business and technology. The second half of 2009 was better – not without its speed bumps but every month was a little better than the one before it.
The macro-economic climate has been tumultuous at best. But the second half of the year showed enough improvement that Hub Designs’ revenue for the year was up 33%. Not bad for a two and a half year old company during the worst economic conditions in 80 years …
Marketing and Thought Leadership
We launched a new web site in January, and it’s been well received. Total visits to www.hubdesigns.com were up 14% over 2008.
A little later in the year, we updated the “look and feel” of the Hub Designs Blog, branding it as the “world’s fastest growing blog covering master data management and data governance”. We’ve gotten more than 43,000 hits since we started writing in July 2007, and our readership more than doubled in 2009, to about 27,000 hits per year.
We published six issues of our “Best Practices in Master Data Management” newsletter this year, which goes out to roughly 3,300 subscribers.
I wrote six articles for Information Management magazine, including some popular ones on “Product Information Management Challenges”, how to build a business case for master data management, and how to select the right MDM vendor for your organization. I also wrote for Identity Resolution Daily, on “The Growing Role of Identity Resolution in MDM” and “Matching – MDM’s Secret Sauce”.
With our partner Siperian, we wrote a white paper in August called “When Data Governance Turns Bureaucratic: How Data Governance Police Can Constrain the Value of Your MDM Initiative” that has generated quite a bit of discussion. You can download a copy of it here.
A second white paper, called “Best Practices for Leveraging D&B in Oracle E-Business Suite”, was written in partnership with Dun & Bradstreet. It describes using D&B information to drive better supply chain performance for companies using Oracle E-Business Suite. You can download it here.
I volunteer for the Education Committee of the Oracle Applications Users Group (OAUG). A big part of that effort lies in programming the MDM track for the annual conference. This year, it was in Orlando in May, and I really enjoyed speaking there and seeing people from the Oracle community that I don’t see very often. Here’s a link to my OAUG presentation.
We participated in conference calls with Oracle Development during the year, and ultimately attended the Oracle Fusion “Hands-On Validation & Testing” session for Customer MDM at Oracle headquarters in August. It was a great chance to get some early insights into Oracle’s next major product release and to see the progress Oracle has made in building out its Fusion MDM vision, which is striking in its powerful hub technology and its elegant & productive user interface.
In 2008, we attended the Gartner MDM Summit to decide whether to exhibit there in 2009. We were impressed enough with the conference that we did exhibit in 2009, in October in Los Angeles. We had a positive experience, so we’ll be a Silver level sponsor in April 2010 in Las Vegas. Since we specialize in MDM and data governance, we find the association with Gartner’s MDM event a powerful one.
I didn’t attend Oracle OpenWorld for the past couple of years, but this year I was glad I did. It was like “old home week”, seeing people from Oracle itself and from the broader Oracle community that I’ve met over the past 15 years. David Butler, Senior Director of MDM Marketing at Oracle, posted my presentation on Oracle’s web site, and said “you were our cleanup hitter and you hit a home run with the bases loaded”.
We also did webinars with our partners Siperian and Initiate Systems. The Siperian webinar covered the differences between MDM platforms like Siperian and ERP platforms like SAP from a master data perspective. The Initiate webinar, with Initiate’s CTO Marty Moseley, discussed developing a strong MDM business case, deploying core MDM technologies and lessons learned on the “build vs. buy” question.
Social Networking
After experimenting with social networking in 2008, this year we had a coordinated strategy to use the Hub Designs Blog, Facebook, LinkedIn and Twitter to communicate & collaborate with our clients, potential clients, team members, partners, suppliers, etc.
It’s a pretty simple strategy. Short updates (140 characters or less) go out on Twitter, and are re-published on both LinkedIn and Facebook. Longer updates (i.e. blog articles) are published on the Hub Designs Blog. We encourage responses and feedback using @replies on Twitter and comments on LinkedIn and Facebook, as well as longer-form comments on the blog. And we get them – almost every blog article gets at least one comment, sometimes as many as a dozen.
When a new blog article comes out, we notify everyone via a single update on Twitter. What’s amazing is that during 2009, social networking drove about 15% of the Hub Designs Blog’s total traffic. And one of our clients gave us some good feedback that our social networking activities help her stay current on what we’re up to, and help her feel connected to us as a company.
Another social networking experiment that developed further in 2009 was the MDM Community. We started this using Ning (a “social network in a box”) in November 2008, out of frustration with LinkedIn’s “Group” functionality. It now has more than 210 members, from 23 different countries. It’s still a work in progress, but if you’re interested in master data management or data governance, you should check it out at http://mdmcommunity.ning.com. It’s becoming an international “who’s who” of the MDM world.
Summary of Client Projects
In case you think the Hub Designs team has been doing nothing but marketing, writing white papers and magazine articles, speaking at conferences, and volunteering for user groups, here’s a summary of our 2009 client projects:
- Technology provider for vehicle dealers: integration of Oracle E-Business Suite with D&B data
- Payroll services company: integration of Oracle E-Business Suite with external credit information
- Information services company: technical support for customers using Oracle E-Business Suite
- Legal information services company: readiness assessment and product MDM strategy & design
- Simulation and engineering software company: advisor to data governance board
- Manufacturer of oil and gas equipment: integration of Oracle E-Business Suite R12 with D&B
- Software company: built connector between Oracle AR and D&B’s DNBi risk management solution
- Technology company: customer MDM strategy workshop
Out With The Old, In With The New
This past year has been a lot of fun, but it has also been somewhat exhausting. So I’m looking forward to a bit more deliberate pace in 2010.
We’re very excited about the coming year at Hub Designs. We’ve got some great projects underway and in the pipeline, and we’ll be continuing to grow and expand to meet our clients’ needs and market demands.
In closing, I’d like to say how grateful I am to my family, for their patience with my traveling so much and for their unconditional love.
Hidden Costs of Duplicate Customer Data
A client asked me last week about what rate of duplicate data was “normal” in customer master data.
My initial answer was that, among companies that don’t have any formal master data management, data governance or data quality initiatives in place, duplication rates of 10%-30% (or more) are not uncommon.
When I was at D&B, we used to routinely see that level of duplication in clients’ customer files.
In a study in the healthcare field, Children’s Medical Center Dallas engaged an outside firm to help clean up their duplicate data:
“Solving both the current and future problems around duplicate records helped Children’s improve the quality of patient care and increase physician acceptance of the new EHR. The duplicate record rate was initially reduced from 22.0% to 0.2% and five years later it remains an exceptionally low 0.14%. The 5 FTEs initially tasked with resolving duplicate records have been reduced to less than 1 FTE.”
“For the Children’s Medical Center, the results were heartening, not only from a care delivery standpoint but also because of the significant cost-savings that can be realized. A study conducted on Children’s data showed that on average, a duplicate medical record costs the organization more than $96.”
So it is possible to get the duplication rate down to really low levels through careful analysis and the application of the right tools, as part of an ongoing data governance program. Even the hospital above (and hospitals are usually not mentioned as practitioners of best practices) was able to maintain a duplication rate of only 0.14% after 5 years.
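For readers wondering how a duplication rate like the ones above might be measured, here is a minimal sketch in Python. It assumes a very crude match key (lowercased name with punctuation removed, plus postal code); that key, the function names and the sample records are my own inventions for illustration, and real matching engines use far more sophisticated techniques.

```python
import re

def match_key(name, postal_code):
    """Build a crude match key: lowercase alphanumerics of the name, plus postal code."""
    cleaned = re.sub(r"[^a-z0-9]", "", name.lower())
    return (cleaned, postal_code.strip())

def duplication_rate(records):
    """Share of records that are redundant copies of another record."""
    if not records:
        return 0.0
    keys = {match_key(r["name"], r["postal_code"]) for r in records}
    return (len(records) - len(keys)) / len(records)

# Hypothetical customer records, invented for this example.
records = [
    {"name": "Acme Corp.", "postal_code": "02043"},
    {"name": "ACME CORP", "postal_code": "02043"},
    {"name": "Globex", "postal_code": "10001"},
    {"name": "Initech", "postal_code": "73301"},
    {"name": "Initech ", "postal_code": "73301 "},
]

rate = duplication_rate(records)
print(f"Duplication rate: {rate:.0%}")
```

Here five records collapse to three distinct customers, so two of the five are redundant. Note that definitions vary: some studies count every record involved in a duplicate set rather than only the surplus copies, so it is worth checking which definition a quoted rate uses.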
And there are very real costs to not de-duplicating your customer data. Depending on the functional area (marketing, sales, finance, customer service, etc.) and the business activities you undertake, high levels of duplicate customer data can:
- annoy customers or undermine their confidence in your company,
- increase mailing costs,
- cause hundreds of hours of manual reconciliation of data,
- increase resistance to implementation of new systems,
- result in multiple sales people, sales teams or collectors calling on the same customer,
- etc.
The best studies I’ve seen of the cost of duplicate data have been in the healthcare industry. One study I saw said:
“According to Just Associates, the direct cost of leaving duplicates in an Master Patient Index database is anywhere from $20 per duplicate to several hundred dollars. The lower cost reflects the organization’s labor and supply costs to identify and fix the record while the higher expense reflects the costs of repeated diagnostic tests done on a patient whose previous medical records could not be located.
The American Health Information Management Association (AHIMA) estimates that it costs between $10 and $20 per pair of duplicates to reconcile the records. If the records aren’t reconciled, however, the costs are even higher.”
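To put those per-record figures in perspective, here is a rough back-of-the-envelope calculation in Python. The file size of 100,000 records is purely hypothetical. The rates (22.0% and 0.14%) and the per-duplicate costs ($20 at the low end, $96 as the Children’s average) come from the studies quoted above, and I am assuming the cost scales linearly with the number of duplicates, which is a simplification.

```python
def duplicate_cost(total_records, duplication_rate, cost_per_duplicate):
    """Estimate the number of duplicates and their total cost (simple linear model)."""
    duplicates = round(total_records * duplication_rate)
    return duplicates, duplicates * cost_per_duplicate

TOTAL_RECORDS = 100_000  # hypothetical file size, not taken from any study

# Rates and per-duplicate costs as quoted in the studies above.
before_low = duplicate_cost(TOTAL_RECORDS, 0.22, 20)
before_avg = duplicate_cost(TOTAL_RECORDS, 0.22, 96)
after_avg = duplicate_cost(TOTAL_RECORDS, 0.0014, 96)

for label, (count, cost) in [
    ("22.0% rate at $20 each", before_low),
    ("22.0% rate at $96 each", before_avg),
    ("0.14% rate at $96 each", after_avg),
]:
    print(f"{label}: {count:,} duplicates, about ${cost:,}")
```

Under these assumptions, the same file goes from roughly 22,000 duplicates to about 140, which is where the cost savings come from.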
Here are three more case studies backing up the range I quoted of 10%-30%:
- Once the analysis was complete, Sentara discovered they had a significant duplication rate, over 18%. They had attempted to address the duplication rate in the past through a remediation process, but due to either technology issues or because the cost of merging and cleaning up the duplicates across their many different systems was too high, they had not yet successfully reduced their duplication rate. Source: Initiate Systems success story
- Emerson Process Management faced a tremendous challenge four years ago in getting its CRM data in order: There were potentially 400 different master records for each customer, based on different locations or different functions associated with the client. “You have to begin to think about a customer as an organization you do business with that has a set of addresses tied to it,” says Nancy Rybeck, the data warehouse architect at Emerson who took charge of the cleanup. Working with Group 1, Rybeck analyzed the customer records for similarities and connections using everything from postal standards to D&B data, and managed to eliminate the 75 percent site-duplication rate the company suffered in its data. “That’s going to ripple through everything,” she says. Source: DestinationCRM.com
- Problem: Number of duplicate records: 20.9% of Utah Statewide Immunization Information System records. Impact of Problem: Difficult to find patients in system—key barrier to provider participation, risk of over-immunization—unable to find reliable patient record, cost of unnecessary immunizations, risk of adverse effects on patients. Source: health.utah.gov.
And here’s a good quote from a white paper titled “Data Quality and the Bottom Line” by The Data Warehousing Institute:
“Peter Harvey, CEO of Intellidyn, a marketing analytics firm, says that when his firm audits recently ‘cleaned’ customer files from clients, it finds that 5 percent of the file contains duplicate records. The duplication rate for untouched customer files can be 20 percent or more.”
Every organization will need its own metrics, but left unchecked, the duplication problem is a hidden cost that drags at your company, slowing down your processes and making your analyses less reliable.
If your sales analysis reports can’t be sure that there’s one and only one record for each of your largest customers, then the sales figures for those customers are probably not right. So the entire report becomes suspect at that point.
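Here is a small illustration in Python of how that happens. The invoices, customer names and the simple name normalization are all made up for the example; the point is only that duplicate records split one customer’s sales across several rows, which can change who appears to be your largest customer.

```python
import re
from collections import defaultdict

LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co"}  # illustrative list only

def normalize(name):
    """Crude normalization: lowercase, drop punctuation and common legal suffixes."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(w for w in cleaned.split() if w not in LEGAL_SUFFIXES)

def total_by(invoices, key_func):
    """Sum invoice amounts per customer, using key_func to identify the customer."""
    totals = defaultdict(int)
    for invoice in invoices:
        totals[key_func(invoice["customer"])] += invoice["amount"]
    return dict(totals)

# Hypothetical invoices: the same customer appears under three spellings.
invoices = [
    {"customer": "Acme Corp.", "amount": 50_000},
    {"customer": "ACME Corporation", "amount": 30_000},
    {"customer": "Acme", "amount": 20_000},
    {"customer": "Globex Inc", "amount": 60_000},
]

raw_totals = total_by(invoices, lambda name: name)   # report built on duplicate records
clean_totals = total_by(invoices, normalize)         # report after consolidating duplicates

top_raw = max(raw_totals, key=raw_totals.get)
top_clean = max(clean_totals, key=clean_totals.get)

print("Largest customer (raw):", top_raw, raw_totals[top_raw])
print("Largest customer (consolidated):", top_clean, clean_totals[top_clean])
```

In the raw report, the second customer looks like the largest account, while the consolidated view shows the first customer is actually larger once its three records are combined.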
I’d like to end with a great quote on data quality by Ken Orr from the Cutter Consortium in “The Good, The Bad, and The Data Quality”:
“Ultimately, poor data quality is like dirt on the windshield. You may be able to drive for a long time with slowly degrading vision, but at some point, you either have to stop and clear the windshield or risk everything.”
Please let us know what you think by commenting here. We’re interested in hearing your thoughts on data quality and the issue of customer data duplication.
First Look at Oracle Fusion MDM Hub
“All NDAs are lifted” were the magic words uttered by Steve Miranda from Oracle at the Fusion Inner Circle Event at Oracle OpenWorld on October 15th.
Just to make sure, I asked Steve explicitly during the Q&A section of the program if it was okay under the non-disclosure agreement we had all signed to write about Fusion on my blog, and he said “Yes.”
Hub Designs was invited back in February to help Oracle’s Fusion MDM team with some design review, validation, and testing activities. In return for our assistance, we’ve gotten to see Fusion MDM inside and out, and we can proudly say that we are one of the very few trusted partners who helped Oracle to design and develop the application.
We participated in a lot of conference calls with Haidong Song, Oracle’s Product Strategy Director for Customer MDM, and other members of his team. And we attended a week-long “hands-on validation” event at Oracle headquarters in August, looking specifically at the customer data management aspects of the Fusion MDM hub.
My first impressions of Fusion MDM during that hands-on session were very favorable. I remember thinking to myself, “Oracle could almost start selling this into the MDM hub market right now!”
Of course, Fusion isn’t scheduled to ship until sometime in 2010, and there’s still plenty of work to be done between now and then. But the core functionality needed for master data management was there, and the Oracle Fusion MDM team had a room full of customers and partners banging on it for a week without any significant crashes or issues.
There was plenty to like in Fusion that didn’t relate specifically to master data management – the new and improved user interface, the embedded analytics, the modern, standards-based architecture, the usability research that Oracle has done, the improved business processes, the built-in collaboration capabilities …
But the fundamentals of MDM were strong as well. Haidong and his team demonstrated how to import and consolidate customer data from outside sources, and we did our first hands-on lab session bringing in a small customer data load from a desktop file, such as a list of trade show leads.
We also tested a larger volume of customer data being brought into Fusion MDM through the Bulk Import process.
We did another exercise simulating how a typical customer data steward would identify potential duplicate customers, and then resolve those duplicates by merging the duplicate parties.
We also got a good look at the Informatica components that Oracle is bundling into Fusion on an OEM basis: the former Identity Systems matching engine and the former Address Doctor address cleansing tool. Previous Oracle MDM products like Customer Data Hub have had loose integration with Trillium and Firstlogic for address cleansing, but it’s refreshing to see Oracle investing in deep integration with industry leading solutions.
I think a lot of Oracle customers will move to Fusion MDM as the first wave of their overall migration to Fusion. They will see Fusion MDM as a good way to get some early experience with the Fusion applications family, before committing their mission critical Enterprise Resource Planning (ERP) applications to the Fusion platform.
And in 2010 and beyond, I think there will be a lot of potential customers who evaluate Fusion MDM positively on its own merits against competitive MDM hubs. Oracle brings a robust data model, open architecture, and a next-generation approach to master data management, with state-of-the-art matching, data quality, middleware, and business process management.
Please let me know by commenting here what your thoughts and expectations are for Oracle’s Fusion MDM hub.
Oracle OpenWorld Presentation
I had a great time at the Oracle OpenWorld conference this year.
Oracle did a great job organizing the MDM track. There were a lot of great presentations, and a good balance of speakers between Oracle people, outside consultants and experts, and end users with success stories to share.
David Butler, Senior Director of MDM Marketing at Oracle, was kind enough to convert my presentation titled “Best Practices in Master Data Management and Data Governance” to PDF format and to post it on the Oracle.com MDM web page.
You can find it in the ‘Partners’ portlet on the right hand side of the page, or just click here.
Siperian Momentum
At the Gartner MDM Summit conference three weeks ago in Los Angeles, I sat down with Anurag Wadehra and Ravi Shankar from Siperian. I usually go to Siperian’s user conference, which was held last week in Princeton, NJ. I couldn’t make it this time but had a great time at their Spring 2009 event.
So instead, I thought I’d do a blog article on Siperian’s momentum in the last year or so, based on the briefing that Anurag and Ravi were kind enough to give me in Los Angeles.
Siperian’s ambition is to be a leader in multi-domain master data management and since their product is not tied to a specific data model, that’s a realistic goal. Many of their customers find the business problem they’re initially trying to solve does in fact involve multiple domains (or areas) of master data.
Siperian’s most recent fiscal year ended May 31st, and they wrapped up the new year’s first quarter on August 31st. Impressively, their license sales more than doubled over the last 4 quarters, and overall revenue almost doubled.
The reduced dependence on services revenue, and the corresponding increase in license revenue, indicate a positive trend: Siperian continues to shift its implementations to its alliance partners.
One of the reasons Siperian wanted to sit down with me and others in the MDM space was to dispel some rumors that have been floating around about the company. The economic downturn that began in the fall of 2008 has been widely felt, to be sure, and Siperian had significant exposure at that time to the financial services industry, which was one of the hardest hit industry sectors.
But Siperian has done a good job diversifying its customer base into other verticals, more than a dozen total to date, and is continuing to close deals with new customers, extending its footprint at existing customers, and building significant relationships with global systems integrators.
With customers like Johnson & Johnson, Merrill Lynch, and Cephalon speaking on behalf of Siperian at events like the Gartner MDM Summit and Siperian’s own user conference, there definitely seems to be a pattern emerging of large organizations with challenging MDM requirements turning to Siperian.
Another trend worth mentioning is that a large portion of Siperian’s revenue is repeat business – customers who have done a successful project with the company and are expanding their MDM footprint into another domain, geography, etc. This speaks volumes about the success of Siperian customers’ current implementations.
Siperian’s “Business Data Director” (BDD) product, launched at the spring user conference, has already signed up more than a dozen customers, with 2-3 already “live” and more going live in the next few months. I was there for the launch of BDD and remain impressed with it.
To a large degree, Siperian’s strategy of scaling through alliances is paying off. Ninety percent of its revenue in the last 4 quarters was partner influenced, with its top four partners accounting for 60% of that business.
I’ve followed the company closely for the past couple of years, and I think their company strategy and product roadmap are solid. Siperian helps keep the “Big Three” of MDM (Oracle, IBM and SAP) on their toes, and has generated a lot of innovation in this space.
I’m sorry to have missed their user conference last week, and I continue to expect great things from Siperian. Please share your thoughts on the company and their products here using the Comments feature.
MDM Track at the OAUG Conference
The Oracle Applications Users Group conference, COLLABORATE 10, is being held April 18-22, 2010 in Las Vegas, Nevada.
But the Master Data Management (MDM) track of COLLABORATE 10 needs YOUR help!
This is your final invitation to share your MDM and Data Governance success story, knowledge and expertise by presenting at the conference.
The MDM Track’s call for papers has been extended to 11:59 pm EDT on Monday, October 26; this deadline will not be extended further.
More than 5,000 users, technology leaders, Oracle executives and solution innovators will gather for the event April 18-22, 2010, at the Mandalay Bay Convention Center.
We hope we’ll see you there — as a speaker!
If you’re interested in presenting, all you need at this point is a title, a short abstract of 520 characters summarizing your idea, and up to five “bullet point” objectives.
If you’d like to submit a paper, just send an e-mail to info (at) hubdesigns (dot) com, giving me a brief sketch of your idea. I’ll respond with the URL you’ll need to submit it.
Aaron Zornes Data Governance Session at Oracle OpenWorld
I’ve always enjoyed the depth and quality of Aaron Zornes’ analysis on master data management. I’ve been attending the MDM Summit conferences that he organizes in the U.S. with SourceMedia since 2006, and I’ve spoken at quite a few of his events.
Today I had the pleasure of hearing him speak on enterprise data governance. Here are some of his major points:
- Don’t settle for “passive” / downstream data governance; instead demand “active” / upstream data governance (please see my white paper with Siperian on this at http://forms.siperian.com/content/PowerGovernancePR).
- Don’t expect data governance maturity assessments to solve all your problems and provide a roadmap out of data governance anarchy.
- Today’s “data stewardship consoles” are substantially less than true enterprise data governance.
- Vendor viability does matter.
- Be prepared to spend $250k-$500k for an initial data governance solution.
Aaron styles himself as the “godfather of MDM” and today was a good reminder of why he deserves that title.
First Day at Oracle OpenWorld
Having a dedicated MDM track at Oracle OpenWorld this year makes a big difference, in terms of being able to find the sessions more easily and in the focus and energy in the sessions.
First up today was a panel discussion on Hyperion Data Relationship Management (DRM). It was moderated by my friend Rahul Kamath from Oracle, and included Dongyan Wang from NetApp, Anand Raaj from Halliburton, and Nimish Mehta from Lumendata. It was very well done, and gave some good insights into the role that DRM can play as a hierarchy management tool in an MDM environment.
Next was Pascal Laik, VP of MDM Product Strategy at Oracle, who co-presented with Cisco’s Kin-Ching Wu. Pascal talked about the reality of complex, heterogeneous environments, and the difference between “push mode” and “pull mode”. He discussed the business drivers of growth, efficiency, IT agility and compliance, and the hard work Oracle has been doing over the past couple of years to help its customers to create their business cases and document the ROI that MDM has been realizing for them. Pascal laid out Oracle’s end-to-end data quality, pre-built integration and data governance strategies, and announced the new Data Governance Manager as a way to Define, Operate, Monitor and Fix data in the hub. Interestingly, 95% of the applications that Oracle customers integrate with are non-Oracle applications.
KC Wu from Cisco discussed their Customer Registry program, which draws data from 40 source systems and publishes it to about 80 downstream systems. She described a fascinating 10-year journey up the MDM maturity model.
The highlight of the next session for me was Bill Miller, a senior IT person at Oracle whom I’ve known for several years, who recently successfully implemented Oracle Customer Hub 8.0 at Oracle. It was very interesting to hear him describe how Oracle has put in place a lot of customer MDM and data governance best practices.
The last session of the day was Vanessa Hsu from Oracle, along with Kelle O’Neal from First San Francisco Partners and Angie Couron from Symantec. They did a great session on enterprise data governance, and gave a “first look” at Data Governance Manager.
Oracle OpenWorld 2009
I just arrived in lovely San Francisco for the latest edition of Oracle OpenWorld.
I’m particularly interested in the Master Data Management (MDM) track this year, as it looks as if the Oracle team has done a great job putting together a roster of Oracle employees, customers and partners to speak on its MDM products for managing master data on customers, products, financials, sites and suppliers.
I ran into several Oracle people like Pascal Laik, David Butler, and Rahul Kamath at last week’s Gartner MDM Summit conference in Los Angeles (more on that later), and as always, it was great to see them.
I’m really looking forward to this week’s sessions on the state of the art in MDM and data governance, and will be speaking myself on Thursday, Oct. 15th at 3:00 pm PDT. So if you’re interested in MDM and you’re attending OpenWorld this week, please stop by and say hello.
For that matter, if you’re in San Francisco and want to get together, send me an e-mail at powerd (at) hubdesigns (dot) com, or call my office number (781-749-8910) – it’s forwarded to my cell phone.
Hope to run into you in the City by the Bay!
Two Passes to Gartner MDM Summit
Hub Designs has two additional passes to the Gartner MDM Summit conference next week in Los Angeles. You will be responsible for all of your own travel, lodging and meals, but your conference registration would be covered.
Please contact me via the “Contact Us” page on our web site. To be fair, the first two people (using the time stamp on your request) will get the passes.
Please provide the following:
- Your first and last name (as you’d like it to appear on your badge)
- Company Name
- Title
- Phone Number
- E-Mail Address
- Mailing Address (please put in Message field on web site)
Only complete entries will be considered.
For more information on the conference, please see http://www.gartner.com/us/mdm.
Webinar with Initiate Systems
“Master Data Management: The Sliding Scale Between Build and Buy”
Replay of the webinar with Dan Power and Marty Moseley
Please join industry experts Dan Power, Founder and President, Hub Solutions, and Marty Moseley, CTO, Initiate Systems, for this webinar where we’ll outline the best practices that have evolved to support organizations in making the critical “build vs. buy” decision.
Master data management (MDM) transforms data integration and business processes. Many organizations are exploring an MDM solution and will eventually have to answer the build vs. buy question. The combination of build and buy for MDM depends on the individual organization’s circumstances, goals and objectives. As MDM has evolved, so have the best practices for considering how much should be built and how much should be bought.
Some key considerations include:
- What are your current data volumes? How will they change in the near and distant future?
- Are customer relationships one-dimensional? Are you concerned with multiple domains of data and managing the corresponding hierarchies?
- Will you implement Web services? How will they be used?
- Do you augment your internal data with information from external vendors?
- What are the time, budget and resource limitations?
- Is MDM intended to eventually provide an enterprise data platform?
Please click here for the on-demand replay.
Looking for MDM Papers for OAUG COLLABORATE 2010
As I’ve written in the past, Hub Designs is a corporate member of the Oracle Applications Users Group (OAUG), and your trusty author, Dan Power, is the OAUG Education Committee’s track manager for Master Data Management.
Believe it or not, we’ve already started planning the April 2010 conference. So we’re looking for good papers on Oracle’s current MDM products: Oracle Customer Hub, Oracle Product Hub, Oracle Site Hub, and Hyperion Data Relationship Management.
This will be the second year where we will be combining the Customer MDM and Product MDM threads in a single Master Data Management track. Feedback on this was very good at last year’s conference.
Here’s the scoop from the OAUG on the Call for Papers:
Share Your Knowledge at COLLABORATE 10!
Proposals are due by Tuesday, October 20.
You are invited to submit a presentation proposal and share your approach to Oracle Applications in an education session at the premier annual conference for Oracle customers — COLLABORATE 10: Technology and Applications Forum for the Oracle Community, presented by IOUG, OAUG and Quest. More than 5,000 users, technology leaders, Oracle executives and solution innovators will gather for the premier user-driven education and networking event April 18-22, 2010 at the Mandalay Bay Convention Center in Las Vegas, Nevada.
If you are an Oracle Applications professional with an interest in Oracle Fusion, Oracle E-Business Suite, PeopleSoft, Agile, Hyperion, Oracle Communications and Siebel product families, as well as applications technology, please submit through the Oracle Applications Users Group (OAUG). Proposals are now being accepted. The deadline is Tuesday, October 20, 2009 at 11:59 p.m. EDT.
As a selected presenter, you’ll have the chance to:
- Share your best practices and tested solutions for Oracle technologies and applications
- Enhance your own knowledge through new conversations with your peers
- Attend a full week of education sessions to learn from other Oracle users, experts and leaders
Get more information about presenting at COLLABORATE 10, including tracks, specific industry- or product-related areas of emphasis, presenter requirements and the presentation submission and selection processes.
Submit your proposal through the OAUG
Note These Important Presentation Submission Dates and Deadlines
- October 20, 2009, 11:59 p.m. EDT: Presentation abstracts due.
- December 2009: Accepted presenters notified by the OAUG.
- January 21, 2010: Acceptance of the compliance agreement due.
- March 9, 2010: All presentation materials including white paper and presentation slides are due.
- April 18 – 22, 2010: We look forward to seeing you in Las Vegas!
IOUG, OAUG and Quest strive to provide top-quality content at COLLABORATE, emphasizing user-driven education sessions that truly benefit attendees and their organizations. We will monitor sessions and feedback from attendees to ensure education sessions are not focused on a sales-centered topic. Presenters in violation will be noted and may be prevented from speaking at future COLLABORATE conferences. This does not apply to any purchased vendor activities, which are clearly communicated to attendees as sponsored events.
Attention Team Oracle! All Oracle employees interested in speaking at COLLABORATE 10 are to contact Lisa Stuart at lisa.stuart@oracle.com prior to submitting papers through the official COLLABORATE 10 call for papers engine.
Connect with the COLLABORATE 10 OAUG Forum on Twitter for conference news, reminders and networking. Use hashtag #C10.
AMB Webinar on “MDM on a Shoestring”
Regular readers of this blog are, of course, aware of the benefits of Master Data Management (MDM) as a means of equipping the organization’s customer- and supplier-facing employees with accurate, complete, timely and consistent information. Many companies find themselves in the predicament of having multiple versions of the truth. Employee productivity is often reduced when staff rely on inaccurate data while servicing customers, patients, vendors, investors, etc.
The cost and disruption introduced by adopting an MDM solution shouldn’t be minimized. While the benefits of these environments are well documented, the road to implementing MDM can be challenging. Hub Solution Designs’ partner, AMB, a leading vendor in the data governance industry, is presenting a simpler path, where the benefits can be reaped without the sometimes difficult implementation and the costs normally associated with MDM. AMB calls its approach “MDM on a Shoestring”.
AMB is presenting a short webinar on this topic. The approach uses service-based profiling tools that can reach data wherever it resides, plus a virtual MDM registry built on an open repository and easy-to-use query tools, to give a much simpler roadmap to the benefits of MDM.
AMB has partnered with industry veteran Mark Albala to offer an informative 30-minute webinar on this topic. Attendees will learn about a new, affordable approach to MDM, including Mark’s seven key attributes to using the right tools and how information profiling forms the basis for MDM success.
This 30-minute program is designed to be brief and informative, and is available at 12:30 pm EDT on either August 27th or September 15th. To register, just visit the AMB website at http://www.ambpdm.com/mdm_on_a_shoestring.html.
New Data Governance White Paper
A new white paper by Dan Power of Hub Designs is available on Siperian’s web site.
The white paper underscores the importance of a proactive data governance approach, and is designed to help organizations develop a sound and sustainable data governance initiative.
Data governance is a vital component of any master data management effort, since it defines who owns the data, who establishes policies, and who the decision-making authority is when it comes to an organization’s critical data assets. However, many companies tend to take a limited and reactive approach to data governance.
In this new report, titled “When Data Governance Turns Bureaucratic: How The Data Governance Police Can Constrain the Value of Your Master Data Management Initiative”, we outline the limitations of a reactive data governance strategy and urge organizations to adopt a proactive data governance approach, whereby master data is corrected and validated right at the source and often by the business user. This removes potential data stewardship “bottlenecks” and eliminates critical time lags that may occur between the initial entry of a new master record, its certification/publishing, and its ultimate availability to the rest of the enterprise.
To access the full report, visit http://forms.siperian.com/content/PowerGovernancePR.
Before You Take the MDM Journey
Editor’s Note: Today’s post was written by Jeff Schaffzin. Jeff is an independent consultant with over 15 years of experience in high tech. He’s worked with a number of leading software vendors in roles such as product marketing, professional services and information technology. Specializing in data management, Jeff has spent the last three years focusing on Customer Data Integration and Master Data Management and has worked with a number of high profile companies in the United States and abroad.
Since I’m a consultant, I have the chance to meet with a wide variety of people at different companies in various industries. About a month ago, I talked with someone I worked with a number of years ago who wanted to know more about Master Data Management. Since he’s worked more as a “functional” person for most of his career (as opposed to a “technical” one), he asked me exactly what an MDM solution would provide his company.
MDM, I told him, is not simply a software application that you ‘buy’ from a software vendor like you might with a CRM or ERP solution. You can’t just decide one day that you want to buy a “customer hub” or a “product information manager” because you heard from your IT Director (or even CIO) that it will save your company millions and cure world hunger. It’s vital to understand why your company might need an MDM solution.
You need to look at your company and do some good old-fashioned detective work. Before you take that journey, take the time to understand how your company works and more importantly, why it isn’t as efficient as it could be. Perhaps management wants to know more about your customers, but can’t do it because customer data is stored in three different applications, and even then it takes two or three months to get an out-of-date report. Maybe your company is paying too much in commissions with multiple reps getting paid for the same deal. Has your company grown so fast that you have multiple purchasing and inventory management systems and hundreds of Excel spreadsheets that have all the answers – if only you could piece them together?
Perhaps you have a more urgent need to understand your customers. If you’re a pharmaceutical company, you need to follow strict spend management guidelines related to marketing to your customers. If you’re a financial services provider, you need to comply with capital management standards like Basel II and to understand your clients as mandated by federal Anti-Money Laundering legislation. Perhaps you’re a publicly held company and need to ensure that you comply with Sarbanes-Oxley. In any case, failure to comply with such legislation can lead to fines, damaged reputations or even imprisonment of top executives.
These are all common reasons for pursuing an MDM solution. Take a moment – what reasons do you have for exploring MDM? If your company is like most that I talk to, you’ve got the problems that master data management can help solve.
New Guest Post for Identity Resolution Daily
I’m sure regular readers of this blog have noticed the reduced frequency of new articles in the past few weeks. It doesn’t mean that I don’t care about you, the reader – honestly!
But it does mean that it’s gotten much harder for me to write for this blog, because I’m typically at a client site Monday – Thursday, and as a result, life seems as if it’s on “fast forward” lately, as I try to squeeze everything else into weekday evenings and Friday – Sunday.
I did find time to write a guest post for Identity Resolution Daily, a great blog maintained by Infoglide Software.
Here’s a brief excerpt:
There definitely seems to be a trend lately with small companies in the master data management (MDM) and data quality space being purchased (as in the asset acquisition of Exeros by IBM) or partnering with larger firms (such as Silver Creek Systems’ OEM relationship with Oracle).
I think this is a good thing. Using the classic “build, buy or ally” strategy, it isn’t surprising that sometimes companies will conclude that it’s faster and/or cheaper to buy a technology, or partner with another company that has that technology, rather than build it themselves internally.
For the complete article, please click “The Growing Role of Identity Resolution in MDM”.
Thanks for being patient with me as I re-adapt to life as a road warrior!
Gryphon Networks
Editor’s note: from time to time, the Hub Designs Blog profiles companies and solutions you may not have heard of yet that are relevant to master data management (MDM).
Company & location: Gryphon Networks, headquartered in Norwood, Massachusetts, provides “on demand contact governance” solutions.
Value proposition: Gryphon’s approach combines compliance and preference management, converting consumer contact preferences, compliance policies, and corporate governance into a consumer contact database, tracking the legal methods for contact. This gives you a “safe” list that expands your marketable base.
What point in MDM lifecycle: This is particularly useful when you’re using an MDM hub to support marketing activities and you’re concerned about maintaining a “single source of truth” on “Do Not Call” status, as well as opt-in / opt-out status for fax, email, and direct mail campaigns.
Relevance to MDM: Today’s hubs are evolving into “policy hubs”, where the enterprise can go beyond basic customer name & address data to tracking advanced attributes like contact preferences and managing compliance with a growing list of privacy regulations. But for a lot of industries, the current generation of MDM hubs doesn’t go far enough. That’s where Gryphon Networks comes in – it provides a real-time, on-demand contact governance capability that your MDM hub can interact with via Web Services.
If you’re in an industry like financial services, hotels, healthcare, telecommunications, insurance, etc. where there’s a need for a lot of outbound marketing activities and at the same time, strict privacy regulations around “Do Not Call” and opt-out status for e-mail, fax and direct mail marketing, your MDM strategy should probably include integration with Gryphon Networks’ platform.
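To make the “policy hub” idea a little more concrete, here is a minimal Python sketch of the kind of check a marketing process might run against contact preferences stored alongside master customer records. It is not Gryphon Networks’ API or any MDM vendor’s interface: the `ContactPreferences` class, the `safe_list` function, the channel names and the sample records are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical channel names; a real contact governance service defines its own.
CHANNELS = ("phone", "email", "fax", "direct_mail")


@dataclass
class ContactPreferences:
    """Opt-out flags for one master customer record (illustrative only)."""
    customer_id: str
    opted_out: set = field(default_factory=set)  # channels the customer has opted out of


def safe_list(preferences, channel):
    """Return the IDs of customers who have not opted out of the given channel."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    return [p.customer_id for p in preferences if channel not in p.opted_out]


# Made-up sample records, standing in for data held in an MDM hub.
records = [
    ContactPreferences("C001", {"phone"}),
    ContactPreferences("C002"),
    ContactPreferences("C003", {"email", "fax"}),
]

phone_list = safe_list(records, "phone")
email_list = safe_list(records, "email")
```

In a real deployment the preference data would come from the hub or an external governance service rather than an in-memory list, and the rules would likely cover more than simple opt-outs, but the principle of filtering every campaign through a single source of truth is the same.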
For more information, contact Bob Hadden at rhadden@gryphonnetworks.com.
OAUG COLLABORATE 09
The Oracle Applications Users Group (OAUG) COLLABORATE 09 conference has wrapped up, and this year was a good one.
Attendance was down overall, from about 7,500 people last year to roughly 4,500 this year (caution, these are unofficial “word of mouth” numbers). But given the gloomy economic picture over the last 6-8 months, I was just happy the conference wasn’t canceled altogether. And I noticed that the people who were there were more engaged. These are the folks who had to fight to attend, so once they got there, they were more focused on getting the most out of it.
On the Master Data Management front, we had a great roster of presentations this year.
I particularly enjoyed Bob Barnett on “Design Guidelines for Oracle PIM MDM Processes”, Shyam Kadigari on “Oracle Customers Online Implementation”, Mani Kumar Manda on “Golden Rules to Tame the MDM Beast” and William McKnight on “Top 10 Mistakes Companies make in forming Enterprise Data Governance”.
I thought Pascal Laik, VP of MDM Product Strategy at Oracle, did a great job on “Rapid ROI with Oracle Master Data Management”. He did a demo of the ROI Analysis tool that Oracle has created, which looked very comprehensive and should save MDM teams a lot of time. Oracle customers can get access to this through their Oracle sales team.
There were a couple of presentations I was looking forward to but had to miss, including Bill Swanton from AMR Research on “Master Data Management for ERP Suites – It’s Different” and Brent Zionic from Sun Microsystems on “The Lunatic, the Lover & the Poet – Beyond Imagining Data Management”. Word of mouth feedback on these presentations was very good.
The OAUG is planning to offer a number of eLearning webinars over the rest of 2009, so we’re inviting all of the presenters (and anyone else interested in doing an eLearning session) to submit their ideas at http://secure.meetingexpectations.com/oaug/elearning/elSubmission.aspx.
I’ve been a member of the OAUG’s Education Committee for several years, and with the aid of the OAUG Special Interest Group (SIG) coordinators for Customer Data Management and Product Lifecycle Management / PIM, I’ve been planning the MDM track at each year’s conference. So if you’re interested in presenting at a future OAUG COLLABORATE conference, please sign up for Hub Designs’ newsletter, so I can keep you posted on the next Call for Papers.
Heading to OAUG
I’m heading to Orlando, FL this Sunday to attend and speak at the annual Oracle Applications Users Group (OAUG) conference.
I’m a volunteer member of the OAUG Education Committee, managing the Master Data Management track. As such, I get to work closely with the Special Interest Group coordinators, and have a lot of fun planning the MDM part of the conference.
This year, I’m very interested in hearing what all of our great MDM track speakers will have to say, and catching some of the Oracle executive presentations on their progress towards the Fusion applications suite.
As you might expect, I’m particularly interested in the Fusion MDM Hub, and Pascal Laik from Oracle will be doing a session on that.
I’ll try to write a few “dispatches from the front lines” here during the conference to share my thoughts on the various sessions.
Hope to see you in Orlando!
New Columns in Information Management
Usually, when I’ve written a magazine article, I’ll post a brief excerpt here, with a link to the full article. When I moved from the online edition of DM Review (now known as Information Management) to writing a monthly column in the print edition, somehow I forgot to keep doing that.
So here are brief excerpts and links to the full articles for the past few months, in case you haven’t already seen them.
Feb. 2009: For years I’ve been recommending that companies investigating or implementing MDM should include business process management in their plans. BPM allows an organization to model, deploy and manage mission-critical processes that span multiple applications, departments and business partners – behind the firewall and over the Internet.
Click on “Business Process Management and MDM” to continue reading.
Mar. 2009: I recently came across a great quote on data quality by Ken Orr in “The Good, the Bad and the Data Quality” from the Cutter Consortium: “Ultimately, poor data quality is like dirt on the windshield. You may be able to drive for a long time with slowly degrading vision, but at some point, you either have to stop and clear the windshield or risk everything.”
Click on “Data Quality and Master Data Management” to continue reading.
Apr. 2009: Third party content is an area that’s close to my heart. I started working intensively with customer and product information more than 20 years ago and was one of the first consultants to integrate Dun & Bradstreet data with Oracle’s applications suite (about seven years ago).
Click on “Filling in the Gaps” to continue reading.
As always, please let me know what you think by commenting here.
Interview in Data Quality Pro
Data Quality Pro™ is a free, independent community resource dedicated to helping data quality professionals take their career or business to the next level. Founded and managed by data quality professionals, its mission is to create the most beneficial data quality resource that is freely available to members around the world.
Dylan Jones, founder & editor, interviewed me recently, and the interview appears on the Data Quality Pro site today.
Please click here to read the full interview.







In 
among other applications. In addition to externalizing business rules locked in proprietary applications (for example, ERP or CRM), we also use design patterns defined here to communicate between different data formats. Instead of writing translators between each and every format (with potential for a combinatorial explosion), use this in combination with the
In
in which the organization operates. Expressed in business terms, this model represents a “foundation principle” or theme we can pivot around to understand each facet in the proper context. This is not easy to pull off, but will provide a fighting chance to resolve semantic differences in a way that helps focus the business on the real matters at hand. This is especially important when developing the Canonical model introduced in the next step.






