
Posts from the ‘Data Quality’ Category

20
Apr
Managing Complexity by Michael Heiss

Getting Started with Data Governance, Part 2

This is the third article in an ongoing series on Data Governance sponsored by SAP. Here are Part One and Part Two of the series. Read more »

19
Apr
Data Governance

Getting Started with Data Governance, Part 1

This is the second article in an ongoing series on Data Governance sponsored by SAP. You can find the first article in the series here. Read more »

18
Apr
Platinum and Gold

Golden Relations and Platinum Relations, by Henrik Liliendahl Sørensen

“Golden copy” is a term widely used in master data management (MDM), as we often see the master data hub as a golden copy of the data in the company’s operational databases. Read more »

16
Apr
Oracle MDM

Oracle 2011 MDM Strategy and Roadmap

This session at COLLABORATE 2011 was presented by Manoj Tahiliani, Senior Director of MDM Product Management & Strategy at Oracle. Read more »

13
Apr

The Strategic Nature of MDM According to Oracle

Oracle Logo

This week, I attended David Butler’s presentation at the Oracle Applications Users Group COLLABORATE 11 conference in Orlando, FL.  Read more »

27
Mar

It’s Good To Be On The First Page of Google

Google Search Results Heat Map


I was doing some research recently on the search terms that bring people to the Hub Designs Blog. I took a few minutes and found that, for the 40 most frequently used search terms over the past year, the Hub Designs Blog was in the top 10 search results on Google for every single one, with an average position of third on the results page. That, to me, is amazing. Read more »

20
Mar

Why Govern Master Data?

This is the first article in an ongoing series on Data Governance sponsored by SAP.

Data Governance

The most important thing about data governance is to “start from where you are”. Most companies are just getting started on their data governance journeys. It can be hard to admit that your company is at data governance maturity level 0 or 1. But the most critical step is the first one – getting started. Read more »

17
Mar

The MDM Track at OAUG COLLABORATE 11

Official COLLABORATE 11 Speaker Web Banner

I’ve been a volunteer member of the Education Committee of the Oracle Applications Users Group (OAUG) for several years now, serving as the “track manager” for Master Data Management (MDM). Read more »

15
Mar

Things That Make You Go Hmmm (Part 2)

The “Chicken Little Syndrome” is a concept in Cognitive Psychology that describes how the human mind fills in gaps of understanding and jumps to conclusions.

For example, Deborah sits at home wondering why Dave hasn’t called yet. Dave must have a reason for not calling. The reason must be that he is angry. What did she say to anger him? It must have been their last phone conversation. She remembers making a comment about Steve, his good friend who’s also a friend of her brother. He must have told Steve, and now Steve is trying to drive a wedge between them. The nerve of Steve. The next time she sees Steve, she plans to… and then … Dave knocks at the door to surprise her with flowers and take her to dinner. Deborah jumped to conclusions with no factual basis at all. And each jump brought her farther from the truth than before. After telling Dave of her rationalizations, Dave calls her neurotic, runs out the door, and calls Steve on his way to the sports bar to tell him never to set him up on a blind date again. Hey, not all stories have a happy ending!

Sharing a secret (photo © 2007 Mack Male, via Wylio)

As kids, we all played the group game “I’ve got a secret”. It starts in a big circle where one person tells the one on his left a few unrelated gossip items. That person tells the person on their left what they heard, and so on. When it gets to the end, the last person proclaims the gossip they heard and it is compared to the original gossip. I used to think that childhood attention deficit was the explanation for how the truth gets distorted until I saw the same results in adults at a party. I then realized the obvious explanation: those kids must have been drinking too.

So, how does this all relate to Master Data Management (MDM)?

Businesses cannot afford to jump to conclusions and need to make their decisions based on reliable information. The better the quality, integration, and standardization of the information, the more precise analyses can be. Speculation should only exist in running “what if” scenarios which should model several possible outcomes, not just the single worst case like Deborah did.

And as information flows through an organization, it changes just as in the party game. Without strict governance rules and controls, information can easily change its meaning or become completely corrupt and unusable. Or worse, information could appear usable and be taken as fact when it truly is incorrect or not correct for the applied purpose.

Consider your MDM strategy and ensure the word “Master” really applies. Consumers of MDM information should rely on the Master information, not a version of it that has been passed through multiple “gossip-like” systems. Oh, and tell your kids not to drink.

11
Mar
Infoglide Matching

Matching (That Is, Entity Resolution) Revisited

I got a couple of e-mails from a friend over the past few days, and he asked some great questions about matching (or entity resolution, as Infoglide Software prefers to call it).

His first question was about AbiliTec IDs for individuals. He wondered if this was an Acxiom product or an industry standard identifier. He was looking for a way to uniquely identify customers from the web, retail point-of-sale (POS), and other marketing channels for a client who doesn’t have any useful way of identifying the same customer across channels.

I’m familiar with AbiliTec IDs (Acxiom calls them AbiliTec links). It’s a persistent identification scheme for consumers and addresses, issued and controlled by Acxiom, similar to how D&B issues and controls the D-U-N-S Number for businesses. There’s a good brochure available on Acxiom’s UK website.

To answer his specific questions, it’s definitely an Acxiom product, not an industry or open standard identifier. So one can only get AbiliTec links by working with Acxiom in some way, shape or form.

As far as uniquely identifying individuals across channels like the web, retail POS, and other marketing outlets, that’s pretty much exactly what AbiliTec is used for – think of it as a high speed matching engine that will return AbiliTec links for every consumer record you feed it.  The downside is you’ve got to pay Acxiom for the privilege. But like D&B, they’re pretty good at helping you create the business case to justify the expense.

Then my friend came back with another question: are there any competing products for the AbiliTec link?

My answer was: none that I know of that are as widely accepted. The three consumer credit bureaus (Equifax, Experian and TransUnion) all sell their own persistent consumer IDs too, but in my opinion, Acxiom’s is very good at that particular task.

Products like DataFlux, Trillium, Informatica, IBM’s QualityStage and SAP BusinessObjects can also generate a key with the same value given varying input, so that different representations of the same person, such as

  • John O’Connor, 30 Palomino Lane, Westwood MA 02090
  • J. J. Oconnor, 303 Palomino Lane, Westwood MI 02093
  • John Joseph Connor, 30 Polo Pony Court, Westawooda RI 02021

will all generate the same key value.  So you can use that generated key to identify the person across three different channels.

Of course, it all depends on how well you write and tune your business rules, too. If you write them too tight, then the three records above won’t generate the same key after all. Write them too loose, and a bunch of other records will also generate that same key value and you’ll end up with a bunch of false positives (non-Johns).
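To make the tight-versus-loose tradeoff concrete, here is a minimal sketch in Python. It is not how DataFlux, Trillium, Informatica or any other product actually builds its keys; the function names, the "strip a leading O from the surname" rule and the use of the first three ZIP digits are all assumptions made up for illustration. The "Jane Connor" record is also made up, to show a false positive.

```python
import re

# The three representations from the list above.
RECORDS = [
    "John O'Connor, 30 Palomino Lane, Westwood MA 02090",
    "J. J. Oconnor, 303 Palomino Lane, Westwood MI 02093",
    "John Joseph Connor, 30 Polo Pony Court, Westawooda RI 02021",
]


def _letters(text):
    """Uppercase the text and keep only the letters A-Z."""
    return re.sub(r"[^A-Z]", "", text.upper())


def _parse(record):
    """Return (first name, surname, 5-digit ZIP) from a 'name, street, city' string."""
    name_tokens = record.split(",")[0].split()
    zip_code = re.search(r"(\d{5})\s*$", record).group(1)
    return _letters(name_tokens[0]), _letters(name_tokens[-1]), zip_code


def tight_key(record):
    """Hypothetical tight rule: full first name, full surname, full ZIP."""
    first, last, zip_code = _parse(record)
    return f"{first}|{last}|{zip_code}"


def loose_key(record):
    """Hypothetical loose rule: first initial, surname without a leading O, ZIP prefix."""
    first, last, zip_code = _parse(record)
    # Crude assumption: treat OCONNOR and CONNOR as the same surname.
    if last.startswith("O") and len(last) > 4:
        last = last[1:]
    return f"{first[:1]}|{last}|{zip_code[:3]}"


if __name__ == "__main__":
    for record in RECORDS:
        print(tight_key(record), loose_key(record), sep="   ")
    # A made-up record for a different person that the loose rule also matches.
    print(loose_key("Jane Connor, 12 Elm Street, Dedham MA 02026"))
```

Under these assumed rules, the tight key gives the three records three different values, while the loose key gives all three the same value and also matches the made-up Jane Connor record. Real matching engines sit somewhere in between, and get there by tuning rules against reference data rather than by hand-written shortcuts like these.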

The beauty of Acxiom’s approach (and D&B’s for that matter when you’re interested in businesses) is that they’ve both got anywhere from hundreds of millions to billions of records of reference data to work with – both to refine their match engine and business rules, and to match against.

So when you’re comparing records, you’re not just comparing these three representations against one another, you’re comparing them to all of the historical addresses this person has lived at over the last 25 years, and all of the other versions of their name on file from marriages, divorces and name changes. Don’t underestimate the power of the database!

You can retweet this by cutting and pasting < New on the Hub Designs Blog: “Matching (That Is, Entity Resolution) Revisited” at http://wp.me/p5Tdn-yS >

Image courtesy of Infoglide Corporation

14
Jan

Writing for the Hub Designs Blog

In writing (photo © 2008 Toshiyuki IMAI, via Wylio)

Now that 2010 is over, we can share some readership statistics with you.  Through Dec. 31, 2010, our total views since inception (in July 2007) were 101,078 (including RSS traffic).

We were excited to see that overall readership was up 14% in 2010 vs. 2009 (and 2009 was up 96% vs. 2008).

This offers some great opportunities for exposure for guest authors. Some of our most popular articles on the Hub Designs Blog have been written by guest authors such as James Parnitzke, Rob DuMoulin, Maureen Butler and Joan Lawson.

So, as we start off 2011, we’re looking for new writers – new perspectives, new talent, new ideas. We’re looking for people like you, our readers, who are (like we are) out on the front lines of master data management and data governance.

Whether you’re working through the education process with your team, or developing your strategic roadmap for MDM and governance for the next few years, or creating a business case to convince your senior management to fund your MDM initiative, or conducting a vendor evaluation, or in the middle of implementing an MDM hub – we want to hear about it!

By all means, sanitize your stories, and change the names of the guilty to protect the innocent, disguising the identities of the companies and individuals where needed. But share your lessons learned, your best practices, and your “things not to do” with the rest of the MDM Community.

If you’re interested in writing for the Hub Designs Blog, please check out the Author Guidelines first, and then get in touch using the Contact Us page on our web site.

We’re looking both for people willing to write on a regular basis, and for people looking to contribute a single or occasional guest article. Please include a brief abstract of your article in your message, and we’ll reply to everyone we hear from.

This is a “triple win” – you gain from the exposure your article will get on the Hub Designs Blog, we gain from working with new writers with interesting perspectives, and of course, our readers gain from the great content that we’ll be able to bring together for them.

So get your “thinking cap” on and start writing! Then reach out to us using our Contact Us page.

6
Dec
Kalido Data Governance Framework

First Look at Kalido Data Governance Director

I attended an analyst briefing today with Kalido on their new product, Kalido Data Governance Director.

The Kalido presenters included Bill Hewitt, President and CEO; Winston Chen, VP of Strategy and Business Development; Lovan Chetty, Senior Manager of Product Management; Mike Wheeler, Director of Data Governance Solutions; and Lorita Vannah, Director of Marketing Communications. Lorita is the person who first turned me on to Kalido, about two years ago now. We first met at the 2008 Gartner MDM Summit in Chicago, and she impressed me then with her passion for MDM, data governance and her company.

Bill started off by talking about how the data governance market has been exploding along with the volume of corporate data, which is certainly true, and observed that Kalido noticed a disconnect between data and business processes. To address this issue, Kalido developed a new product from the ground up, because the company felt that data was better managed through policies. For example, it may be okay to store customer data in multiple places, as long as the relevant policy allows that.

As part of its research into data governance, Kalido developed its own data governance maturity assessment. Winston described the evolution of data governance, from application-centric to today’s “enterprise repository centric” approach. The next phase, according to Kalido, is policy centric, followed by fully governed. Winston also discussed the need to manage data policies in context: you’ve got data, but you’ve also got business processes, systems and organizational scope.

That allows you to fully describe the context in which a particular policy is being defined.

The way to operationalize governance processes is: to define the policy, to implement the policy, and then to enforce the policy, which Kalido modeled on how laws are created by the legislative branch, implemented by the executive branch, and then enforced by law enforcement and the judicial branch of government.
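As a rough illustration of "policy in context" plus the define / implement / enforce life cycle, here is a small Python sketch. It is not Kalido's data model or API; every class, field and value below is a hypothetical name chosen for the example, and the sample policy is based on the customer data example mentioned earlier.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """The three life cycle steps, in order (names are assumptions)."""
    DEFINED = "defined"
    IMPLEMENTED = "implemented"
    ENFORCED = "enforced"


@dataclass
class PolicyContext:
    """The four dimensions of context mentioned in the briefing."""
    data: str
    business_process: str
    systems: list[str]
    org_scope: str


@dataclass
class DataPolicy:
    name: str
    rule: str
    context: PolicyContext
    stage: Stage = Stage.DEFINED

    def advance(self) -> Stage:
        """Move the policy to the next life cycle stage."""
        order = list(Stage)
        position = order.index(self.stage)
        if position + 1 >= len(order):
            raise ValueError("policy is already being enforced")
        self.stage = order[position + 1]
        return self.stage


# Hypothetical policy, loosely based on the customer data example above.
policy = DataPolicy(
    name="Customer data storage",
    rule="Customer data may be stored in more than one system.",
    context=PolicyContext(
        data="customer",
        business_process="order to cash",
        systems=["CRM", "ERP"],
        org_scope="North America",
    ),
)
```

Calling `policy.advance()` twice takes the policy from defined to implemented to enforced; a third call raises an error, since there is no stage after enforcement in this sketch.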

Kalido has been working with data quality vendors such as DataFlux and Trillium to build integration with their products into Kalido Data Governance Director, so metrics can be automatically gathered into it from those data quality tools.

If a data quality problem grows beyond a single “issue” or a small number of them, it can be remediated as an “initiative”. It is then entered into Data Governance Director and tracked as a separate initiative, with all of the visibility and accountability that goes with that, and the full life cycle of governance (definition, implementation, and metrics / enforcement) can be used to make sure the data quality problem is resolved.

Lovan Chetty did a brief demonstration of the product, showing a web-based user interface to author new initiatives and policies, manage scope and organizational parameters, and create a unified business model, including a data model, process model and systems model.

Mike Wheeler talked about Kalido’s lighthouse customer program for Data Governance Director, which consisted of cultivating about 16 companies and 3 consulting firms, including some large financial services providers and manufacturing companies, at different levels of data governance maturity, to provide input and feedback on their policies and data governance programs and practices.

A number of them will be speaking at tomorrow’s Kalido Connect virtual user conference.

One very large company had a “light going on” moment when using the product, when they realized that pulling the knowledge out of everyone’s head is the hardest part, and that lots of “tribal knowledge” is often never incorporated in the actual policies.

One of the largest banks in Mexico, Scotiabank, has already bought the product prior to its general availability, in order to streamline its data governance operations. And a Top-5 pharmaceutical company has also signed up as a customer.

After a short Q&A session, Kalido promised to let everyone get a closer look at the new product in their virtual user conference tomorrow. For more information, or to register, please go to http://bit.ly/kalido-register.

The screen shot below shows the product measuring and reporting data policy compliance status based on results captured from 3rd party monitoring tools.

Kalido Data Governance Director Screen Shot

23
Nov

New Article in Information Management Magazine

InfoMgtNovDec2010

The latest issue of Information Management magazine is out, and my column in this edition is titled “Data Governance: A Chicken and Egg Problem”.

Here’s a brief introduction to the article:

Data governance suffers a bit from the “chicken or the egg” syndrome. People at your company aren’t going to understand what data governance is and what it can do for them until they actually see the results. However, getting the initiative funded and launched will only happen if you can convince your company of the tangible benefits of data governance. That can be difficult when there’s no actual program in place.

You can read the rest of the article at: http://digital.info-mgmt.com/info-mgmt/20101112#pg33.

Please let us know what you think of the article by using the “Leave a Comment” link here.

12
Nov
Kalido Logo

Kalido Data Governance Maturity Survey Results

This morning, Kalido, a Hub Designs partner, released an initial analysis based on the almost 100 responses it received to its Data Governance Maturity Assessment Survey.

The results were not surprising, but I found them very interesting nonetheless. Keep in mind that this was a self-selecting group; that is, people who were interested enough in data governance to have taken the survey. That suggests that the general population would be even less mature.

The biggest finding was that only 10% of organizations have been able to move their data governance programs beyond the first two levels of data governance maturity. That matches well with our experience at Hub Designs – most companies are just getting started with data governance.

Despite the commonly expressed belief that data should be owned by the business, traditional IT organizations are accountable for data in nearly two-thirds (63 percent) of organizations. At Hub Designs, we believe that the business should be accountable for the data – but sometimes, that’s a “bridge too far”. You’ve got to start where you are, and evolve over time to higher levels of maturity. If the center of gravity right now is in the IT organization, then that’s where you start. But over time, have a strategy for moving data governance into the business.

Nearly half (45 percent) of organizations taking the survey said they have a formal data governance council in place, but only 27 percent have established a data governance council with business representation and formal data stewardship. That tells me that even in places where they’re doing some type of (immature) data governance, there are still lots of opportunities for improvement, by increasing the level of business involvement, stewardship and data quality.

This finding I found stunning: more than half (57 percent) of organizations do not measure the performance of data management activities at all. That leads me to believe that those organizations won’t be doing data management for much longer: without measurement there are no documented results, and a perceived lack of results tends to lead to a lack of funding.

Clearly, we have a long way to go in the corporate world in becoming more mature from a data governance perspective. I really liked Kalido’s survey, and you can find Winston Chen, Kalido’s VP of strategy and business development, discussing it on his blog at http://bit.ly/cbckxD.

Speaking of Kalido, Hub Designs is sponsoring their upcoming Virtual User Conference on December 7, 2010. The Kalido Connect virtual conference provides attendees with a cutting-edge platform for networking, exhibition, collaboration and learning. Attendees can watch a presentation in a packed auditorium, network with peers in the Kalido Connect Lounge, or visit fully interactive sponsor booths on the exhibit floor. From group chats to one-on-one discussions, the virtual platform allows for a live conference and exhibition floor with real-time user interaction. To register, just click http://bit.ly/kalido-register.

Kalido Connect offers:

  • Real-world examples of Kalido’s business value as told by their customers
  • Keynote sessions on BI, MDM and data governance trends and how to keep ahead of the curve
  • Technical breakout sessions to maximize your investment in the Kalido Information Engine™ and expand your skill set
  • Exhibit hall showcasing complementary products and services from Kalido partners and sponsors
  • Opportunities to network with colleagues, industry leaders and executives

Last year, more than 300 people attended the Kalido Connect Virtual User Conference, and Kalido expects to double that this year.

11
Nov
Informatica Logo

Informatica MDM Tweet Jam

This is a transcript (lightly edited for brevity) of today’s Informatica MDM Tweet Jam. We hope you enjoyed the actual Tweet Jam and this transcript. If there were questions you didn’t get a chance to ask, please feel free to ask them via our web site’s Contact Us page.

Dan Power: Informatica MDM Tweet Jam like playing “stump Dan” – see if you can perplex, mystify and amaze me!

Dan Power: Actually, just kidding – want to have a good dialogue with everyone – would love to have a good MDM discussion.

Informatica Corp.: Right now! Join the #MDM TweetJam with @dan_power. 9am PT.

Dan Power: OK, the Tweet Jam is officially open!

Jakki Geiger: Dan, what are the most common concerns you hear about MDM?

Dan Power: IT people still seem concerned about how to involve the business and sell it to senior management.

Jakki Geiger: what advice do you give them?

Dan Power: IT seems to know that MDM is needed but sometimes can’t seem to get the business on board, and it can be hard to pitch to the C-Suite.

Dan Power: We advise building a compelling business case – getting outside help if needed – and recruiting internal business champions.

Jakki Geiger: What strategies to get the business on board have you seen work?

Dan Power: I wrote an article about that in a recent Information Management magazine and a blog article on Hub Designs Blog that accompanied it.

Jakki Geiger: We’ve seen IT successfully tie MDM to key strategic imperatives like improving cross-sell and up-sell=getting sales on board.

Ravi Shankar: One thing we have done to help IT is to quantify how much DQ issues can cut costs or increase revenue.

Dan Power: Getting the business on board means STARTING in the business – find out their pain points and recruit them to drive from Day 1.

Jakki Geiger: Others include onboarding channel partners faster, which appeals to sales and channel operations.

Jakki Geiger: A huge driver has been regulatory compliance = appealing to those who gather data across the enterprise and create reports.

Ravi Shankar: I like what Charles Bloodworth of J&J said at Informatica World 2010 – “MDM is not just a project; it’s a discipline – a way of doing bus for us”.

Dan Power: Good points Jakki & Ravi – those are the pain points I’m talking about: increasing revenue / onboarding channel partners faster.

Jakki Geiger: One area I think is really going to take off is improving business processes = improve data to improve the process.

Jakki Geiger: One exec got buy in from exec team with “we need to manage our product supply chain and info supply chain equally efficiently”.

Ravi Shankar: Agreed – bus needs to be involved in MDM. Charles of J&J said bus involvement drove their MDM and data governance success.

Dan Power: That’s right – becomes a way of life – new discipline for the business – to have a golden copy of the data that they can trust.

Jakki Geiger: I agree with u. IT needs to understand what the business pains and strategic imperatives are, then evaluate “can MDM help?”

Dan Power: Product management and supply chain are just as fertile for most companies as customer data – so MDM is just getting started.

Dan Power: I’ve been talking to a lot of companies lately that have already done customer MDM and are now looking at doing product MDM.

Ravi Shankar: Product MDM: I see lot of demand for this from manufacturing companies. Just came from S. Korea – product MDM is hot.

Dan Power: Or even supplier MDM – in order to get global strategic sourcing initiatives off the ground, which can save millions of $.

Ravi Shankar: Customer MDM to product MDM – we’ve seen that with our own early customers – They leveraged the same Informatica platform.

Julie Hunt: How do you see MDM implementations evolving to take advantage of newer tech such as ‘cloud’?

Julie Hunt: And what advantages does the cloud offer to MDM solutions?

Dan Power: Good question, Julie – definitely see a movement towards the cloud – people don’t want to create tomorrow’s “legacy systems”.

Dan Power: So they increasingly are asking their vendors about cloud deployment options, even if they don’t rush to take advantage of them.

Dan Power: They want to know they’re available

Dan Power: To Julie’s Q about cloud, I think eventually we’ll see cloud deployments at lower cost than on-premise (particularly hardware).

Ravi Shankar: Let me outline 2 use cases we’ve seen @ InformaticaCorp.

Ravi Shankar: Use case 1: During peak times like holiday seasons, retailers can burst into cloud for additional capacity.

Ravi Shankar: Use case 2: Mktg mgrs can use self service tools to upload attendee list from event w/o having to bother IT.

Dan Power: The promise of cloud for me, is more flexibility as my business grows and if we have seasonal peaks and valleys of demand.

ocdqblog (Jim Harris): What do you say to companies that expected that from their data warehouse? How is MDM different from conformed dims?

Ravi Shankar: ocdqblog – welcome. Looking forward to a lively MDM discussion.

Dan Power: Good question, Jim. Most companies had unrealistic expectations from data warehouses, which ended up being expensive, read-only,

Dan Power: and updated infrequently. MDM gives them the capability to modify the data, publish to a DW, and manage complex hierarchies.

Dan Power: So to finish answering your question Jim, I think MDM offers more flexibility than the typical DW.

Dan Power: That’s why BI on top of MDM (or more likely, BI on top of a DW that draws data from an MDM) is so popular.

Ravi Shankar: MDM for DW – 90% of Informatica MDM customers use it for analytical use (in addition to operational).

ocdqblog (Jim Harris): Thanks Dan – Follow-up is do you see MDM as complement or replacement for DW?

Dan Power: Definitely a complement – fills void in the middle between trx systems and the DW – does things that neither can do to data.

Jakki Geiger: are you seeing this trend? Evolving beyond single customer view= visibility into 360 customer view w/products and channels, etc.

Dan Power: Yes, Jakki – people want more than a single view – they want multiple views on top of the single view.

Ravi Shankar: Siperian customers – We’re having a lively chat on MDM and data governance. Join in!

Ravi Shankar: Dan, what do you tell DW admins who say the DW provides their single view for the enterprise?

Dan Power: I tell DW admins that most people in the enterprise aren’t completely happy with DW – that’s why there’s pain leading to MDM.

Jakki Geiger: Since the driver of MDM is the business, how are we getting master data into the hands of the business?

Dan Power: Good Q, Jakki – getting MDM data back into hands of the business should be built into the project – and the software platform.

Ravi Shankar: Compliance is driven out of DW – you need MDM for accurate compliance reports – Do you agree?

Dan Power: Yes, Ravi – Garbage in, Garbage out – you need quality data from the MDM system to feed into the data warehouse.

Julie Hunt: So we must advocate value of data governance as well as value of MDM with business, senior management?

Dan Power: I tell people to think of their initiative as a data governance project that happens to involve #MDM technology.

Dan Power: Not an #MDM technology project that requires data governance.

Dan Power: And to start the data governance piece about 6 months before the technology piece, if possible.

Julie Hunt: The importance of data quality = another layer to be advocated to the business and to management – show them the impact on outcomes.

Jakki Geiger: MDM is like a Ferrari. If you don’t use DQ with MDM, it’s like putting regular gas in Ferrari=sub optimal performance.

Dan Power: I’ve seen people try to do MDM without data quality – and it’s a disaster, like trying to run a submarine on dry land!

Dan Power: The fact is that #MDM and data quality are linked, just as #MDM and data governance are linked.

Ravi Shankar: Should data quality be integrated within #MDM?

Dan Power: Good question, Ravi – I’ve seen it both ways – a data quality engine integrated with the MDM platform or separate. Both can work; as long as the data quality tool is robust and the integration is solid, it shouldn’t matter.

Dan Power: Most MDM platform vendors are not equally good at developing data quality tools – Informatica is one of the few that is.

Julie Hunt: How much does corporate culture impact success/failure of projects for #MDM, data governance etc.?

Dan Power: Great Q – corporate culture is a huge impact on success because data governance drives MDM and requires a lot of change mgt. So spend a lot of time on org. change in the data governance side of the #MDM initiative in order to be successful.

Ravi Shankar: Heard a customer say – “Don’t overdo data governance – do just what’s necessary” Do you agree?

Dan Power: I’d agree not to go overboard on data governance – balanced approach that’s right for your co. just enough to get the job done. Too much data governance can be worse than not enough – can be bureaucratic – the “data governance police”.

Ravi Shankar: Data governance applies to all data, but I hear that in MDM context a lot. Do you hear “master data governance” for MDM?

Jakki Geiger: Some argue we shouldn’t call it data governance because of the -ve connotation of “governance”. Thoughts?

Dan Power: I actually like that phrase – master data governance – makes it more clear and precise what we’re talking about

Dan Power: Because otherwise, data governance organization can get drawn into all kinds of weird things not related to master data

Dan Power: We need to recognize that data governance is (a) political, (b) controversial, (c) going to have an enforcement side.

Ravi Shankar: Now, do orgs do data governance first before implementing MDM or after they select an MDM product?

Dan Power: So in some ways, I actually like the term “data government” better – makes it more explicit what we’re talking about.

Dan Power: And it reminds people that we’re talking about governing the enterprise’s core master data – just like we govern other key assets.

Jakki Geiger: I think the challenge is that we’re still in the process of understanding that data is a strategic asset.

Dan Power: It’s ideal if they can start data governance before even selecting a product – so that the data governance org. can help w/ the selection process.

Ravi Shankar: Dan wrote an excellent whitepaper – “When Data Governance Turns Bureaucratic” – you can download it from http://bit.ly/ck2Gw8.

Dan Power: Truly competitive 21st century companies not only understand that data is a strategic asset, it’s how they run their business.

Dan Power: Forward looking businesses like Google, Amazon, Century 21, eBay, etc. realize that the data IS their business!

Jakki Geiger: “Data as strategic asset” is a fairly new concept. Visionaries recognize need 4 scale and intelligence=harnessing & analyzing data.

Dan Power: That was a fun white paper to write – looking forward to doing another one with the great folks at Informatica again soon!

Jakki Geiger: What I liked about Dan’s WP was the discussion around stopping the problem of data quality at the source.

Seth Grimes: Is data governance also (d) useful on balance and (e) capable of delivering ROI?

Dan Power: Yes, of course – or people wouldn’t be doing it. You can’t bring together massive amounts of data in an MDM hub and not have some type of governance framework in place. And if there was no ROI, it wouldn’t be happening.

Dan Power: I’m pretty familiar with Oracle’s data governance program, and for a huge company, it’s not real expensive.

Ravi Shankar: Welcome to #INFATJ – good data governance question.

Ravi Shankar: Successful Informatica MDM customers like J&J, Merrill, and numerous others have had strong global data governance orgs.

Ravi Shankar: Data is a key asset that many firms make a lot of money out of it – Bloomberg for e.g.

Ray Wang: RT @Ravi_Shankar_: Data is a key asset that many firms make a lot of money out of it – Bloomberg for e.g.

Dan Power: Good example with Bloomberg – welcome Ray!

Ravi Shankar: @rwang0 thx for the RT

Jakki Geiger: Can you create a career out of MDM? Many of our customers have extended MDM to address more and more issues in their orgs.

Dan Power: Good Q, Jakki – u can create a career out of it, I have for the last 6 years, but you’ve got to really have this in your blood

Ravi Shankar: Within Informatica customers, we’ve seen careers of several people take off b/c of successful #MDM data governance.

Julie Hunt: Thanks for great tweet jam!

Jakki Geiger: Thank you for participating! Looking forward to next time. Good luck to you all!

Dan Power: Thanks for joining us today – hope you enjoyed it! Check out the Hub Designs Blog at http://blog.hubdesigns.com.

Ravi Shankar: Thx for your insightful discussion and advice on #MDM data governance. Hope you all enjoyed it. Until next time!

Dan Power: This is Dan Power, signing off – have a great day everyone!

25
Oct
Guy Kawasaki as Evangelist

The Need for MDM Evangelism

For a long time now, I’ve admired Guy Kawasaki, one of the early Apple employees responsible for marketing the Macintosh computer in 1984. He’s credited as one of the people who brought the concept of evangelism to the high tech business; in his case, it focused on turning passionate users and developers into advocates for Apple.

I’ve tried to emulate him by being an evangelist for customer and product MDM. From 2001 to 2004, I was a consultant working with the precursor to Oracle’s Customer Data Hub platform. At D&B from 2004 to 2007, I managed its strategic alliance with Oracle while Oracle launched and refined Customer Data Hub. I left D&B to start Hub Designs in 2007 because I wanted to work more directly in developing and executing MDM strategy at corporate clients. All this time, I’ve tried to get people excited about using the evolving technology to solve business problems.

In the past nine years, in all of the different industries and companies I’ve worked with, most have quickly “gotten” MDM:

  • They understand the value of the Single View of the Customer (or Product, as the case may be).
  • They see the revenue increases from being able to up-sell and cross-sell customers by knowing more about them, and by knowing their own products better.
  • They understand the dollar value of having a streamlined, coordinated New Product Introduction process.
  • They see the short payback period and millions in savings from a strategic sourcing program that consolidates vendors and products, and renegotiates agreements.
  • They understand the contribution MDM makes to credit risk management (know your customer, and whether they can pay their bills on time).
  • And they see how MDM (done properly, which includes data quality improvement and a data governance program) makes it much easier and more efficient to have accurate, complete, timely and consistent information available for compliance with governance regulations.

But all of those organizations, where I’ve been the “external champion” or evangelist, have needed a corresponding “internal champion” or evangelist.

Someone to lead the charge internally, to have the hallway conversations, to fight the good fight politically, to scrap for every budget dollar, to convince the powers that be, the type of person who digs in and doesn’t let go. Someone who’s convinced that master data management and data governance are important to his or her company. That they’re so important that it’s worth going out on a bit of a career limb. Or who perhaps was brought in specifically to head up an initiative like this.

My friend Tom Carlock wrote a great article called “So You Want to be a Data Champion?”, where he discusses how to be prepared to be your organization’s “data champion”. Tom knows whereof he speaks, because he’s been in roles like that at The CIT Group and AIG, and is now a leader of product strategy at D&B. He mentions attributes like being able to have a consistent vision that you can “sell” to others, the ability to develop and maintain relationships, being able to listen, ask for input and deal with objections, and being optimistic, hopeful and patient.

To that I would add, being persistent. My father introduced me to a quote by Calvin Coolidge, the 30th U.S. President:

“Nothing in this world can take the place of persistence. Talent will not; nothing is more common than unsuccessful people with talent. Genius will not; unrewarded genius is almost a proverb. Education will not; the world is full of educated derelicts. Persistence and determination alone are omnipotent.”

If you decide to become an MDM evangelist at your company, and you’re persistent in that role, you can help your company manage master data as an enterprise-wide asset – and transform itself in the process. I think our corporations today – ten years into the twenty-first century – desperately need that type of innovation and change.

9
Sep
SAP

Speaking at SAP Virtual Trade Show

Hub Designs is an associate member of SAP’s alliance program, and on September 23rd, Dan Power from Hub Designs will be speaking at an SAP virtual trade show being put on by SearchSAP.com and TechTarget.

This free virtual seminar is focused on best practices for maximizing SAP performance. The day-long virtual event features expert presentations, live panels and expert networking opportunities to help you make the most of your SAP environment, and will cover the hottest topics in SAP right now – including business intelligence, virtualization, master data management and mobile technologies. You’ll learn tips that you can put into practice immediately and you’ll get unbiased advice for long-term strategy development. At this unique online event, go beyond the hype and get insight into the latest technologies and best practices you can use to improve operational efficiency in SAP environments.

Dan Power’s session will be at 1:30 pm EDT, and will cover topics such as:

  • Definitions of master data management, data governance and data quality
  • The five essential elements of MDM
  • Why companies are doing MDM and what this means to you
  • Getting started on an MDM roadmap
  • Is your organization ready?
  • Creating the MDM business case
  • MDM software selection
  • Some important best practices

For more information, please visit http://searchsap.techtarget.com/feature/Getting-the-most-out-of-your-SAP-environment,  and to register, please click here.

8
Sep

Call for Papers for MDM Track at OAUG COLLABORATE 2011

Oracle Applications Users Group

I’ve been involved in the Oracle Applications Users Group (OAUG) since 1995, and have been a member of the OAUG Education Committee for several years now. The Education Committee is starting to plan next April’s COLLABORATE 11 Conference, and I’m managing the “Master Data Management” track.

Together with the Special Interest Group (SIG) coordinators for the Customer Data Management SIG and the Oracle Enterprise Product Lifecycle Management SIG, we invite YOU to submit a paper for the 2011 conference’s MDM track.

Our vision for the MDM track at COLLABORATE 11 is to have:

Here are the important facts from the OAUG Call for Papers:

You’ll have the opportunity to connect with more than 5,000 users, technology leaders, Oracle executives and solution innovators gathering for the user-driven education and networking event April 10 – 14, 2011 at the Orange County Convention Center West in Orlando, Florida. Proposals are now being accepted. The deadline is Friday, October 1, 2010 at 11:59 p.m. EDT. To submit a paper, go to http://collaborate.oaug.org/submit/.  For more information, you can go to http://collaborate.oaug.org/presenterinfo/.

Note to Oracle Employees: All Oracle employees interested in speaking at COLLABORATE 11 are to submit your papers through the Call for Papers submission form. Please contact speakerprograms@oaug.com for assistance with technical difficulties. For all other inquiries, please contact Lisa Stuart at lisa.stuart@oracle.com.

30
Aug
photo by Wonderlane

Our MDM Strategy Offerings

Recently, I put together an overview of Hub Designs’ MDM strategy offerings for a potential client. Here’s a recap.

Education

  • Based on our popular “Best Practices in MDM and Data Governance” speaking engagements, presented at Oracle OpenWorld and the Oracle Applications Users Group COLLABORATE conference.
  • Our workshops get business & IT professionals up to speed quickly
  • You get access to the best MDM experts, and can bring your business people into the process early

Roadmap

  • Based on Hub Designs’ MDM framework
  • Defines where you are now, where you want to be, and over what time period
  • Looks at master data management, data integration, data quality, and data governance over time

Readiness Assessment

  • Looks at issues relating to politics & culture
  • Performs skills assessment on people who may need training
  • Examines process issues, outlining where business processes need improvement or redesign
  • Investigates technology issues, detailing where essential components are not present or not able to support your upcoming MDM initiative
  • Performs data profiling to discover data quality issues

Business Case

  • Captures business requirements
  • Identifies stakeholders and selects metrics
  • Baselines current performance
  • Negotiates expected benefits
  • Converts to financial results
  • Develops total cost of ownership
  • Calculates hard-dollar ROI

Software Selection

  • Develops selection criteria
  • Creates a weighted vendor scoring model
  • Includes functionality, technology, viability, costs, services and vision
  • Develops demo scripts for vendors to follow and sample data sets to give them
  • Manages proof of concept (POC) process
  • Assists in evaluating POC performance and scoring vendors

These engagements range in length from one to twelve months, with teams varying from two to ten people, depending on the size of the company, the number of domains of master data  involved, and the complexity of the politics and legacy systems in the enterprise.

If you’re interested in discussing an MDM strategy engagement like this, please contact Hub Designs at http://www.hubdesigns.com/contact_us.html. Or if you have comments on the above approaches, please let us know by commenting here.

3
Aug

MDM Community on Ning

Today, Hub Designs committed to sponsoring the MDM Community on Ning.

Recently, Ning changed its business model. Instead of providing free social networks, it now charges $19.95 per year (for educational and non-profit use), $199.95 per year (for customized Ning Networks), or $499.95 per year (for high end social networks with integration options and more bandwidth and storage).

When I started the MDM Community back in November 2008, it was mostly a reaction to the awful state of LinkedIn Groups. Lots of spam, tons of irrelevant job postings, and very little community or sharing between MDM practitioners.

At the time, Ning was a free option, so starting the MDM Community on Ning was an easy choice. It grew gradually, and now has 295 members from 28 different countries. A lot of the different players in the MDM world are represented: Oracle (and Silver Creek Systems), IBM (and Initiate Systems), Informatica (and Siperian), D&B, Kalido, Orchestra Networks, Riversand Technologies, TIBCO (and Netrics). And a lot of large systems integration and consulting firms are represented.

Well, Ning is no longer providing its social network as a free service, but the $200 per year is a pretty reasonable investment to give MDM practitioners all over the world a vendor-neutral forum to hang out, ask questions of one another, help each other out, provide assistance, share opinions, write blog articles, update their profiles, do all of those things that people do on social networks.

At the time, there really wasn’t any other place to do all that which was ad-free, spanned all of the different flavors and vendors of MDM and data governance, and gave everyone an equal voice. I moderate the discussion forums but I try to do it with a very light hand. If anything, perhaps I should be more involved in the MDM Community and put more of my energy into growing it – and hopefully, I’ll do that now that Hub Designs has stepped up to keeping it alive on Ning.

If you haven’t already joined, please consider joining by clicking here.  If you’re already a member, log in at http://mdmcommunity.ning.com/ and let us know what’s on your mind.

30
Jul

Data Profiling For All The Right Reasons, Part 5

The Hub Designs Blog welcomes the final installment of this great series by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.

Part 5: The Profiling Payoff

This is the final part of a five-part series, describing how data profiling benefits both IT projects and business operations.  In Part One, we discussed profiling perspectives.  In Parts Two, Three and Four, we introduced the value of system, entity, and attribute-level metrics.  This part discusses the archival and beneficial uses of profile results.

If you have defined your corporate data profiling strategy along the lines of the methods discussed in the preceding parts of this series, you’ll have amassed a robust collection of metadata spanning relevant systems across your business. Although systems may be of different types and locations, the structured approach and common metrics you collected create a centralized repository of information that can be examined holistically. Ideally, this information will exist in an open-source database repository with reports made available across the enterprise. System and Entity information helps planners and developers organize information strategies. Attribute-level domains, constraints, and business rules help data architects understand existing systems. Relationships and value patterns are readily available to support validation of information-related hypotheses as needed.

If you plan to design your own repository, consider adding timestamps and indicators to help you manage and present the information. To keep your repository relevant to business needs, design collection rules to be configurable. This allows you to easily ignore superfluous information or enable tests only at certain critical times. Allow initial system profiling efforts to gather a large set of metrics and store them as your baseline. As you learn about the information, you will see which tests or which data objects add no value. We geeky DBA-types who understand system-level catalogs have our own scripts to do much of what was described in Parts Two, Three and Four. Those less inclined may prefer to use a third-party tool for profiling. Either way works as long as the business needs are satisfied and the entire enterprise standardizes on one approach (and thus one integrated repository).
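
To make the idea of configurable collection rules a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the ProfileRule and ProfileResult records, the CHECKS table, and the metric names are hypothetical, not part of any particular profiling tool. It simply shows each stored result carrying a timestamp and a baseline indicator, and each check being switched on or off through configuration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ProfileRule:
    metric: str           # name of the check to run
    enabled: bool = True  # set to False to skip a check that adds no value

@dataclass
class ProfileResult:
    entity: str
    attribute: str
    metric: str
    value: float
    collected_at: datetime  # timestamp for managing and presenting history
    is_baseline: bool       # marks the initial, wide collection run

# Hypothetical checks, keyed by metric name.
CHECKS = {
    "row_count": lambda values: len(values),
    "null_count": lambda values: sum(1 for v in values if v is None),
}

def run_rules(entity, attribute, values, rules, is_baseline=False):
    """Run only the enabled rules and return timestamped results."""
    collected_at = datetime.now(timezone.utc)
    results = []
    for rule in rules:
        if not rule.enabled:
            continue
        value = CHECKS[rule.metric](values)
        results.append(
            ProfileResult(entity, attribute, rule.metric, value, collected_at, is_baseline)
        )
    return results

rules = [ProfileRule("row_count"), ProfileRule("null_count", enabled=False)]
for result in run_rules("customer", "region_code", ["NE", None, "SW", "NE"], rules, is_baseline=True):
    print(result)
```

Disabling a rule is a configuration change rather than a code change, which is the point of keeping the rules separate from the checks.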

You will find that collecting and maintaining this level of detail has a definite cost. Even if the collection is automated, interrogations of large data sets place an overhead on production systems that may not be practical. Record and monitor profile execution metrics to identify bottlenecks or tuning opportunities. Realize that the extent of data profiling is contingent on the project phase, specific data elements, and most of all, business value. Review profiling goals on a regular basis and eliminate unnecessary and redundant checks.

How much profile history to maintain is another consideration.  Even though disk is “relatively” cheap, maintaining all historical entries in a live repository may not be necessary. Consider business needs and value for historical profile information. Even consider archiving at a summarized (or less frequent) level and keep only a limited time window of statistics online.
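
One way to picture the "limited time window plus summarized archive" idea is the sketch below. It is only an illustration under assumptions of my own: the 90-day window, the monthly averaging, and the archive_history name are hypothetical choices, and your business needs may call for something quite different.

```python
from collections import defaultdict
from datetime import date, timedelta

def archive_history(history, today, keep_days=90):
    """Split (day, value) entries into recent detail and monthly averages of older entries."""
    cutoff = today - timedelta(days=keep_days)
    # Entries inside the window stay online at full detail.
    online = [(d, v) for d, v in history if d >= cutoff]
    # Older entries are grouped by (year, month) and averaged.
    buckets = defaultdict(list)
    for d, v in history:
        if d < cutoff:
            buckets[(d.year, d.month)].append(v)
    summary = {month: sum(vs) / len(vs) for month, vs in sorted(buckets.items())}
    return online, summary

row_counts = [
    (date(2010, 1, 5), 100),
    (date(2010, 1, 20), 110),
    (date(2010, 2, 10), 120),
    (date(2010, 7, 1), 150),
]
online, summary = archive_history(row_counts, today=date(2010, 7, 30))
print(online)
print(summary)
```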

This discussion on data profiling was intended to broaden perceptions of what it means to a business and the value it can bring if done in a sustainable way. The blog format is not conducive to in-depth discussions, but hopefully the topics covered here spur some thoughts into how you can add value to your business by implementing some of these concepts.  Use your imagination, but remember that no matter how cool it might be to collect and store some profile output, if it does not add business value to somebody, it might not be worth the overhead to continue recording it.

29
Jul

Data Profiling For All The Right Reasons, Part 4

The Hub Designs Blog welcomes Part 4 of this series by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.

Part 4: Profiling Relationships and Patterns

This is part four of a five-part series describing how data profiling assists in all aspects of system development, from design through deployment.

Part One introduced different perspectives on data profiling. Part Two identified valuable system and entity metrics to track. Part Three discussed attributes. In this segment, we dive deeper into attribute relationships and pattern recognition. We also expand on the primary key identification discussion and discuss hidden relationships.

Pattern grouping provides a mask of distinct format patterns within an attribute data set and a count of the number of occurrences. Patterns give insight into the type of values found in an attribute. For example, a numeric pattern analysis may show values such as 999.99999, 99, or -.9999.

Observing distinct patterns gives insight into the maximum digits and precision, and also into domains such as integer or real. Patterning a database date or date-time type yields unremarkably similar patterns for all dates. Because the database management system typically enforces the domain, pattern analysis of native date types provides no value and can be ignored. If dates are stored in character format, however, patterns quickly show variations in date formatting. Character patterns only have significance up to a limited number of positions. It makes no sense to pattern a description field of 200 or 2000 characters. Smaller code attributes of fewer than 10 characters, though, do provide value. At first, skip pattern profiling for character strings over 20 characters, then lower that limit if the results do not add value.
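
As a rough illustration of pattern grouping, the Python sketch below masks every digit as 9 and every letter as A, then counts how often each mask occurs. The function names and the masking convention are assumptions made for this example, and the 20-character cutoff is exposed as a parameter so it can be tightened later.

```python
from collections import Counter

def pattern_mask(value):
    """Replace digits with 9 and letters with A; keep all other characters."""
    out = []
    for ch in str(value):
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)

def pattern_counts(values, max_length=20):
    """Count distinct masks, skipping NULLs and strings longer than max_length."""
    counts = Counter()
    for v in values:
        if v is None or len(str(v)) > max_length:
            continue
        counts[pattern_mask(v)] += 1
    return counts

# Dates stored as character strings, in mixed formats.
dates = ["2010-07-29", "07/29/2010", "2010-07-30", None, "29-JUL-10"]
for mask, n in pattern_counts(dates).most_common():
    print(mask, n)
```

Run against a character column that holds dates, a listing like this makes the competing formats visible at a glance.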

In pure database theory, referential integrity (RI) is your friend. In practice, designers and software vendors often forgo RI to improve system performance on data inserts. These designers place the data quality burden on the application and do not endorse external data manipulation outside the application interfaces. In the real world, though, data corruption occurs and without RI or routine data quality checks, corruptions may not be found for a long time or not at all. Personally, I have identified over $50,000 of recent orphaned sales in a retail client resulting from deliberately disabled RI. These unreported sales were not added to the ledger and were allowed to occur for performance reasons until I found them through simple profiling. Enforcement of RI is a topic for another discussion but is mentioned here because it does identify a valid reason for data profiling.

Even in presumably good relational designs, some parent-child relationships are not enforced, for various reasons. First, interrogate the RI listed in the system catalogs to identify all enforced relationships. Reverse-engineering a system with a good modeling tool is probably the best way to do this. A harder and more valuable analysis is to identify unenforced relationships and determine the probability of a relationship when not all values match exactly. Do this by counting all the candidate child attribute values that exist within a known parent attribute table. If all match and there is a non-trivial number of matches, there is a good probability of a non-identified relationship. A small number of mismatches could indicate data quality issues.
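
The counting approach described above can be sketched in a few lines of Python. This is a simplified illustration working on in-memory lists rather than database tables; the function names and the sample customer and order values are hypothetical.

```python
def relationship_match(child_values, parent_values):
    """Return (matched, total, ratio) for non-NULL child values found in the parent."""
    parent = {v for v in parent_values if v is not None}
    child = [v for v in child_values if v is not None]
    matched = sum(1 for v in child if v in parent)
    total = len(child)
    ratio = matched / total if total else 0.0
    return matched, total, ratio

def orphans(child_values, parent_values):
    """Distinct non-NULL child values with no matching parent value."""
    parent = set(parent_values)
    return sorted({v for v in child_values if v is not None and v not in parent})

customer_ids = [1, 2, 3, 4, 5]
order_customer_ids = [1, 1, 2, 5, 5, 9, None]

matched, total, ratio = relationship_match(order_customer_ids, customer_ids)
print(matched, total, round(ratio, 2))
print(orphans(order_customer_ids, customer_ids))
```

A ratio at or near 1.0 over a non-trivial number of rows suggests an unenforced relationship, and the orphan list is where to start looking for data quality issues.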

In Part 5, we tie all the techniques discussed in the first four parts together to show the value of a repeatable data profiling process.

28
Jul

Data Profiling For All The Right Reasons, Part 3

The Hub Designs Blog welcomes Part 3 of this series by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.

Part 3: Attribute-Level Analyses

This is part three of a five-part series on data profiling.

In Part One, we took a light-hearted view of where profiling benefits an organization and in Part Two, we discussed the fundamentals of a profiling strategy.  The remaining three parts discuss attributes, relationships, patterns, and how to use the combined data profiling information you collect.  In this section, we introduce attributes, the lowest-level components of a profiling effort.

An attribute is simply an individual data element. Alone, an attribute has no context. The simple descriptor “Cost” tells us very little about an attribute’s true purpose and immediately drives a need for additional information, such as units (hours, Dollars, Euros…), type (weighted, unit, gross…), and use (invoice, sum, average…). Attributes therefore must be analyzed within the context of their business purpose to have meaning.

Some characteristics require business knowledge to define and others can be determined through interrogation of existing values and underlying rules of the storage medium. It takes both analyses to get a complete picture of information within a system. While assembling this puzzle, though, keep in mind that until you validate the enforcement of business rules, only assumptions can result from physical profiling or business context.

Analyses of values, domains, and constraints allow insight into use (or abuse) of an attribute. The larger the sample size, the more confidence you gain in the results. Without explicit proof of business rule enforcement, though, you must assume that just because a value does not presently exist does not mean it cannot exist. Business rules are defined by business experts and enforced through database constraints, data type/precision, and application code. Knowing the methods of enforcement allows you to narrow a domain but not totally understand it. Profiling of actual values provides additional refinement in terms of percentage of NULL values, percentage of distinct values, minimum, maximum, and average values, top x and bottom x recurring values along with their counts, and minimum, maximum, and average data lengths.

Some attributes within a data set serve valuable purposes that are important to identify. Attributes that individually or in conjunction with others define uniqueness of the data set also may support relationships between entities.  Uniqueness can be further classified as being either members of a system-enforced primary key or of a business key (outside of the defined primary key).  System-enforced primary keys are relatively easy to define within a database system through interrogation of the system catalog.  Business keys that exist in tables in addition to a primary key may be more difficult to identify, especially if more than one attribute is needed to define uniqueness.
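
For business keys that the system catalog does not reveal, one brute-force approach is to test combinations of attributes for uniqueness in the data. The Python sketch below does this for small column sets. The function names, the max_size limit, and the sample retail rows are assumptions for illustration, and uniqueness in a sample only suggests a key; it does not prove the rule is enforced.

```python
from itertools import combinations

def is_unique_key(rows, columns):
    """True if the column combination is non-NULL and unique across all rows."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if any(part is None for part in key) or key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, columns, max_size=3):
    """Find minimal unique column combinations up to max_size attributes."""
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(columns, size):
            # Skip supersets of a smaller key that was already found.
            if any(set(key) <= set(combo) for key in found):
                continue
            if is_unique_key(rows, combo):
                found.append(combo)
    return found

rows = [
    {"order_id": 1, "store": "A", "register": 1, "ticket": 100},
    {"order_id": 2, "store": "A", "register": 2, "ticket": 100},
    {"order_id": 3, "store": "B", "register": 1, "ticket": 100},
    {"order_id": 4, "store": "B", "register": 1, "ticket": 101},
]
print(candidate_keys(rows, ["order_id", "store", "register", "ticket"]))
```

The number of combinations grows quickly with the number of columns, so in practice you would limit the candidate columns and the key size.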

Attribute-level information of interest includes: data type (size and precision), the number and percent of NULL values, column descriptions, number and percent of distinct values, and the minimum-maximum-average values and lengths. The system catalog provides some of this information, but the rest must be collected by sampling the data.
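
The value-based metrics listed above are straightforward to compute once the values are sampled. Here is a minimal Python sketch; the profile_attribute name and the dictionary keys are hypothetical, and expressing the distinct percentage against total rows (rather than non-NULL rows) is an assumption you may want to change.

```python
from collections import Counter

def profile_attribute(values, top_n=3):
    """Compute basic value-based profile metrics for one attribute."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    lengths = [len(str(v)) for v in non_null]
    numeric = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    profile = {
        "rows": total,
        "null_pct": 100.0 * (total - len(non_null)) / total if total else 0.0,
        "distinct_pct": 100.0 * len(counts) / total if total else 0.0,
        "top_values": counts.most_common(top_n),
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
        "avg_length": sum(lengths) / len(lengths) if lengths else None,
    }
    # Min, max, and average values only make sense for all-numeric attributes.
    if numeric and len(numeric) == len(non_null):
        profile["min"] = min(numeric)
        profile["max"] = max(numeric)
        profile["avg"] = sum(numeric) / len(numeric)
    return profile

costs = [10.0, 12.5, None, 10.0, 99.0]
for metric, value in profile_attribute(costs).items():
    print(metric, value)
```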

Other types of attributes that may help in identifying relevancy are those that provide system-level auditing or change control. Knowing which attributes fill these roles may either allow you to (a) ignore them for profiling purposes or (b) use them to help explain versions or data anomalies.

Part 4 expands on attribute profiling with the introduction of relationships and patterns.

27
Jul

Data Profiling For All The Right Reasons, Part 2

The Hub Designs Blog welcomes Part 2 of this series by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.

Part 2: Profiling the Basics

This discussion is the second of a five-part series on data profiling. In Part 1, we discussed the project roles that benefit from data profiling and how better understanding information results in more reliable information systems. Important goals of any profiling strategy include automation of metric collection and socializing results to support the differing objectives of a data-centric project.

Early in a system development life cycle, profiling helps define sources, data storage requirements, and data transformations. As a system goes into production (or if profiling is added to an existing system for quality control purposes), routine profiling is useful to audit system quality and business rule enforcement. The frequency of collection and amount of effort you expend to automate your profiling methods should be based on the ability of the organization to benefit from the profile results.

This section discusses the beginnings of a profiling effort. Information assembled here forms the foundation of other profiling activities. For this discussion, consider a Profile Group as a set of information sharing a common purpose and data management methods. Examples of profile groups include tables within a single database schema or a group of spreadsheets with the same format but each spreadsheet representing a different time slice of data.

The underlying System managing a set of information within the profile group may be a named relational database, a file system directory, or even a web site being accessed through web services. The reason we abstract information into Systems is to group the information into distinct governance methods common to the underlying information. Relevant metadata and governance methods we track at the system-level include: technical contacts, backup schedules, system descriptors, connection strings, business unit owners, and host operating systems. System-level metadata common to a profile group helps us understand and troubleshoot future analyses. This level of information also provides developers with an understanding of inherent restrictions (or freedoms) they may encounter when trying to use or integrate the information.

Entities within a profile group belong to the same system, may have a common unique identifier, and, for database entities, have the same schema owner. Typically, entities are database tables, but may also be similar files or spreadsheet tabs containing like attribute lists. For entities, we track characteristics common to all the attributes they contain. These include: row counts, entity-level descriptors, growth characteristics (size and frequency), last analyzed date, and various customized indicators such as active/inactive, existence of change data management attributes such as insert/update timestamps, and existence of audit traceability indicators such as insert/update username.
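
As a small example of collecting entity-level metrics from a system catalog, the Python sketch below uses an in-memory SQLite database. The catalog queries (sqlite_master and PRAGMA table_info) are specific to SQLite, so other database systems will need their own equivalents. The AUDIT_COLUMNS naming convention, the profile_entities function, and the sample tables are assumptions for illustration.

```python
import sqlite3

# Assumed naming convention for change-tracking and audit attributes.
AUDIT_COLUMNS = {"inserted_at", "updated_at", "inserted_by", "updated_by"}

def profile_entities(conn):
    """Collect row counts and simple indicators for every table in the database."""
    entities = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        quoted = '"' + table.replace('"', '""') + '"'
        row_count = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]
        entities.append({
            "entity": table,
            "row_count": row_count,
            "attribute_count": len(columns),
            "has_audit_columns": bool(AUDIT_COLUMNS & {c.lower() for c in columns}),
        })
    return entities

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)")
conn.execute("CREATE TABLE region (code TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO customer (name, updated_at) VALUES (?, ?)",
    [("Acme", "2010-07-27"), ("Globex", "2010-07-27")],
)
for entity in profile_entities(conn):
    print(entity)
```

Storing each run of this output with a collection date would also give you the growth characteristics mentioned above.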

The combination of system and entity level profiling supplies the foundation for the attribute-level profiling, which is where physical information in a system resides. It also provides valuable metadata to classify information and allows for future correlation of like information across systems. Assembly and publication of entity and system level information benefits the various consumers of the information by providing a centralized “master” source of contact and context information.

In Part 3, we will dive into the attribute level analyses around data profiling.

26
Jul

Data Profiling For All The Right Reasons, Part 1

The Hub Designs Blog welcomes a guest post by Rob DuMoulin, an information architect with more than 26 years of IT experience, specializing in master data management, database administration and design, and business intelligence.

Part 1: The Psychology of Data Profiling

Swiss psychologist Carl Gustav Jung founded the school of analytical psychology. His theory of psychological types forms the basis of the Myers-Briggs Type Indicator assessment, which is used to identify career aptitude in today’s high school students. Dr. Jung also used word association tests, drawing conclusions from how an individual’s thoughts associated to various phrases. By analyzing responses, he could understand how an individual viewed the world around them and perceived themselves. Typically, subjects are asked to speak the first thought entering their minds after hearing a trigger phrase. For the following example, remember, there are no wrong answers. If I say the words “Data Profiling”, what is the first thing you think of?

If you thought of food, cats, country music, CSI NY, or residential plumbing, you are either not in IT or are an IT Manager.

If your first thought was “Quality Assurance”, you align yourself with data quality professionals having anti-social thoughts of failing test cases and sadistically reporting lazy developers for buggy code. You gleefully scour test cases looking for any evidence of truncation, missing values, non-matching codes, numeric precision errors, and inconsistent abbreviation, text, and date formatting.

If “Integration” comes first in your mind, past legacy integration projects have scarred you with a disdain for source system data quality levels. You view production apps with contempt and loathe the time it takes to track down data issues caused by system integrations. You investigate upstream sources to create detailed mappings and transformation rules. Typical debugging sessions consist of validating relationships to identify orphaned data, identifying attributes that contain overloaded columns (attributes containing more than one distinct data element), or fixing format errors from implied decimals.

Some of you responded with “Value Domains” or “Data Types”, indicating you are obsessive compulsive data architects compelled to organize the world into strict and orderly fashion with some degree of normalization, though you are not considered “normal” by your peers. Your concerns lie in understanding and regulating naming conventions, relationships, existence of NULL or default values, and understanding the meaning of each data element to accurately identify business rules and when two or more objects are related or redundant.

Lastly, if “Debugging” is the first item in your thought queue, you are a coder justifying why presumably good code is not working. Extreme paranoia has taught you to assume nothing about data quality, so you add tests to identify duplicates, validate relationships, enforce business rules, track change data capture, and provide substitute values. Your phobia of early morning phone calls causes you to add auditing to your code to inform a DBA of data issues rather than waking you up in the middle of the night.

It is truly amazing how much we can conclude from the response to one simple phrase.

As stated before, there are no wrong answers. Aside from the innocent jab at Managers and non-IT resources, we all realize the benefits of information quality and absolutely need business involvement to understand context and domains of business information. The meaning and actions of Data Profiling change both by role and by project phase. Through profiling, we are able to identify best sources of information, learn proper ways to categorize and store it, reactively identify quality issues, and proactively define business rules to prevent future issues.

Identifying what is important to profile, when and how profiling is done, and how to share our findings across business and project resources is key. Done properly, profile results are integrated into a master metadata repository and are periodically refreshed through an automated process.

This five-part series provides a tool-agnostic approach to comprehensive data profiling, focusing on information meaning and use. The next part of the series discusses system and table-level profiling: in particular, what information is important to collect at the system and table level, and how the enterprise can leverage that information to help assure quality. The third part dives into attribute-level profiling and the fourth discusses attribute patterns and relationships. The final part discusses the benefits and utility of gathering profiled information into a single repository.

15
Jul

Two of Our Most Popular Series

While I’m on vacation for the next two weeks, the Hub Designs Blog will be republishing two of our most popular series, “Modeling the Blueprint for MDM” by James Parnitzke, and “Data Profiling For All The Right Reasons” by Rob DuMoulin.

I met both of these tremendously experienced professionals last year on an MDM strategy engagement in the Midwest, and I enjoyed working with them a great deal. I learned a lot from these guys, and I’m happy to be able to pass on to the readers of this blog a small part of their experience here in these series.

The eleven articles in these two series received more than 6,300 page views in their first run; hopefully there are some readers who missed them the first time around who will enjoy reading them over the next two weeks.

I usually write a recap of my vacation when I get back, trying to tie my vacation experience to MDM somehow. Here’s the one I did in 2008 – it looks pretty funny two years later, with “things I learned on my summer vacation” like “Don’t Be Too Ambitious” and “Stay On Course but Be Flexible”.

I’m not sure if I’ll do that this year; perhaps I’ve already stretched the metaphor between summer vacation and master data management too far. But I hope you’re enjoying your summer, and thank you again for reading this blog and supporting Hub Designs!

28
Jun

Philosophy of MDM

My philosophy of MDM is simple: all things being equal, enter and manage master data in its own repository or hub, and pay the same attention to the organization and business processes for creating, distributing, updating and retiring master data that you do for other types of data within the enterprise.

You’d be amazed how often that simple statement confounds people though. They want to enter master data in their ERP or CRM system, and then synchronize it over to the MDM hub. Or they’d like to somehow do without an organization to manage their master data for the enterprise. Or they’re willing to concede the need for a data governance group, but don’t think that group will need any formal processes or technology to help orchestrate their work or facilitate it and improve their productivity.

Even though the link between data quality tools and master data management is well established, I sometimes still see people try to do MDM projects without using data quality technology. And even though synchronizing the high quality master data available in the hub should be a high priority, people (typically for cost reasons) still try to skimp on integration technology and try to get by with only the most basic ETL tools.

One of the most popular articles we’ve had here on the Hub Designs Blog was the Five Essential Elements of MDM, in which I laid out what I thought were the most important related areas of technology. In it, I included the MDM hub itself, of course, and also data quality, data integration, middleware, third party content and data governance (which of course, is not really technology, but needs to be included because it too is so often forgotten).

So getting back to the focus of this article, my philosophy of MDM is to have all of the essential elements, to have a sound vision and strategy for MDM, a strong business case based on metrics, to create a governance framework and organization to carry it out, to design governance processes, and then (last but not least) to implement technology to facilitate the governance needed to support the enterprise’s master data requirements.

So often today, we see organizations taking a technology-driven approach, or leaving out important parts of the above approach.  Have you thought your MDM initiative all the way through?

10
Jun

Intersection of MDM, CRM and ERP

My earlier article on Why Product Information Management in Information Management magazine prompted Andrew White of Gartner to write a short blog article.

Andrew picked up on my comment “If CRM and ERP platforms were better able to manage master data, perhaps we wouldn’t need MDM solutions.” He goes on to say that “these applications were designed in an era when there was no need to take account of information requirements ACROSS the enterprise.”

The operating assumption for most CRM and ERP platforms, unfortunately, was that you were going to run your ENTIRE business on them.  This rarely, if ever, turns out to be the case, particularly if the business does a lot of acquisitions. One business unit or geography certainly. And the count may grow over time. But there are always going to be areas of the business “outside the pale” – not included in that particular CRM or ERP solution’s purview. This leads to silos of data, which create many problems in the management and analysis of information in the enterprise.

That’s why an MDM hub makes so much sense. It provides a neutral place for customer, product and other master data from all over the enterprise to be created, read, updated and managed. Increasingly, today’s CRM and ERP applications are being used in concert with a robust MDM hub. Even now, CRM and ERP products just aren’t designed to manage master data effectively. They don’t have the built-in data quality and data governance processes that are needed to ensure a single view of accurate, complete, timely and consistent master data across the enterprise.

You can read the article by Andrew White of Gartner Research at http://blogs.gartner.com/andrew_white/2010/06/07/good-summary-of-mdm-of-product-data-and-its-value-to-the-business/.

27
May

Why I Enjoy MDM So Much

I was reading a very good article on a blog called Presentation Zen called The Importance of Starting from Why. The article describes a TED talk by a leadership expert and author named Simon Sinek.  In his talk, which I encourage you to watch yourself, he talks about the importance of understanding the “Why” of something vs. the “How” and the “What”. Since I read that article and watched that video, I’ve been thinking about why I enjoy MDM and data governance so much, and about the central premise of Simon’s talk, which is that there’s a simple pattern, that all great and inspiring leaders and organizations think, act and communicate in the same way, and it’s the exact opposite from everyone else.  He calls it “the golden circle”:

The circle runs Why -> How -> What, and he goes on to say that this little idea explains why some organizations and some leaders are able to inspire, where others aren’t. Every person and organization knows what they do, most know how they do it, but a lot don’t know why. The successful ones start with the why and work “inside out” (in the opposite direction from most people and companies). By nailing down the why first, everything else falls into place.

I don’t want to reproduce his whole talk here in this article, but it got me thinking about my interest in master data management and about Hub Designs and our approach to working with our clients.

I got interested in master data in one of my first consulting projects after graduating from college. I had a client that was a distributor of VHS videotapes. People would call up and order a show on tape, and the customer service people would enter them in as new customers rather than search to see if they might be an existing customer. Their order entry system was written in FoxPro on a PC network, and I had my own consulting business doing FoxPro programming. So I was engaged to help them deduplicate their customer master, based on similarity of customer name and address. I remember at the time thinking it was a great intellectual exercise.
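
For readers curious what matching on “similarity of customer name and address” can look like, here is a small sketch using Python’s standard `difflib` module. It is not the FoxPro logic from that project; the normalization rules, the 0.85 threshold, and the sample customers are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    return " ".join(text.lower().replace(".", "").replace(",", "").split())


def similarity(a, b):
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_duplicates(customers, threshold=0.85):
    """Return id pairs whose name AND address both meet the threshold."""
    pairs = []
    for left, right in combinations(customers, 2):
        score = min(
            similarity(left["name"], right["name"]),
            similarity(left["address"], right["address"]),
        )
        if score >= threshold:
            pairs.append((left["id"], right["id"]))
    return pairs


# Hypothetical customer master with one obvious duplicate.
customer_master = [
    {"id": 1, "name": "John Smith", "address": "12 Main St., Boston"},
    {"id": 2, "name": "john smith", "address": "12 Main St Boston"},
    {"id": 3, "name": "Mary Jones", "address": "99 Oak Ave, Denver"},
]
matches = candidate_duplicates(customer_master)
```

Comparing every pair is fine for a small file but grows quadratically, so larger customer masters usually need some form of blocking (for example, only comparing records in the same postal code) before scoring.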

That was my first exposure, but hardly my last. In 1995, I got recruited into a position as a project manager for an Oracle ERP implementation, and I did many Oracle projects over the years that followed. In ERP implementations, converting master data well is a big contributor to the success of the project, and I found that handling data quality issues properly became second nature to me.

In 2001-2002, I was a program manager on a large Oracle ERP project for a $1 billion software company, and one of the areas I oversaw was Customer Registration. My client and my team were among the first to integrate Oracle’s Trading Community Architecture (TCA) with Dun & Bradstreet’s real-time database (D&B Data Integration Toolkit). That led to my going to work for D&B in 2004, and being part of the Global Alliances team there until 2007. While at D&B, I managed their strategic alliance with Oracle, and worked closely with Oracle on Customer Data Hub and its integration with D&B content.

I mention all this not to bore you with my professional history for the past twenty-three years, but to illustrate how a passion for master data can get into your bones, and shape your career. It’s woven itself into my life, and become part of the “Why” for Hub Designs and how we work with our clients. Anyone who knows me or has worked with Hub Designs professionally knows that we care deeply about our clients and their success.  They become part of our family. We hug them when we see them. We put so much of ourselves into our clients’ projects that we form relationships that last for years.

In the video we produced for the recent Gartner MDM Summit, we used words like ‘passion’, ‘performance’, ‘teamwork’, and ‘integrity’ to describe our “why”. That’s what gets us out of bed in the morning – making a difference for our clients, helping them solve their business problems, moving the needle, making things better in their organizations, and improving things one company at a time.

In the end, I started my own consulting firm again so I could work with clients in my own unique way, so I could develop something of lasting value, and so I could turn my passion for MDM and data governance into a business that would make a difference to our clients.

What’s your why?

25
May

MIKE2.0

MIKE2.0 (Method for an Integrated Knowledge Environment) is an Open Source methodology for Enterprise Information Management.

I first became familiar with it in 2009. MIKE2.0 provides a lot of “thought capital” for practitioners in the areas of enterprise architecture, master data management, and data governance. While it was (at that time, at least) too incomplete to use “as is”, it was very helpful in being able to show to a client as an example of what a data governance program would look like, or what an outline of an enterprise master data management program would look like.

MIKE2.0 evolved from work done by BearingPoint, which has emerged from its 2009 bankruptcy operating in 14 countries throughout Europe with about 3,250 employees. The MIKE2.0 intellectual property is now open source and is controlled by the MIKE2.0 Governance Association, which includes representation from BearingPoint and Deloitte.

Now, MIKE2.0 is firmly in the hands of a non-profit, independent governing body, which makes the entire body of work available as a tool for MDM and data governance practitioners.

There are a lot of great assets embedded in MIKE2.0 – in particular, there’s a customer data integration solution offering, a data integration solution offering, a data investigation and re-engineering solution offering, and an information governance solution offering.

These map fairly well to my “Five Essential Elements of MDM” article, where I said that, to succeed with MDM, you really needed:

  • a Hub of some type
  • some kind of data integration or middleware
  • data quality capabilities
  • external content
  • data governance (which of course, is the most important)

So while I wouldn’t recommend using MIKE2.0 “out of the box” (i.e. without some fairly heavy adaptation), it may very well save you a lot of time in your MDM and data governance initiative. If you’re not already familiar with it, I highly recommend you check it out.

21
May

Recent eLearning Curve Webinar

Hub Designs recently hosted a 30-minute webinar on “Best Practices in MDM and Data Governance with Dan Power”, in concert with our friends at eLearning Curve and Information Management magazine.

To download the replay of the webinar (with audio), please go to http://bit.ly/hub-designs-webinar.  To download just the slides, please go to http://bit.ly/mdm-best-practices and click “Download”.

For the “When Data Governance Turns Bureaucratic” white paper mentioned in the presentation, go to http://bit.ly/data-governance.  Scroll to and click the link at the end of that article.

Thanks for attending the webinar (or the replay). We hope you found it valuable!

2
May

When Data Governance Turns Bureaucratic

How Data Governance Police Can Constrain the Value of Your Multidomain Master Data Management Initiative

(this appeared as a guest post on Informatica’s blog on Friday, April 30 2010)

I published a white paper last year, entitled “When Data Governance Turns Bureaucratic,” that looked at how reactive data governance was preventing organizations from realizing the full value of master data management (MDM). By “reactive”, I mean organizations using a “coexistence” architecture where front office applications (CRM) and back office applications (ERP) are still used to author master data (customer and product data, suppliers, employees, etc.). Because these applications remain the “Systems of Entry” while the MDM hub’s role is limited to being the “System of Record,” some of the biggest promises of MDM remain unfulfilled.

So, what exactly would proactive data governance look like? Essentially, the proactive model places more emphasis on business users being the owners of the master data. Rather than letting data stewards carry the burden of the central issues of accuracy and completeness, the accountability for these goals shifts towards the business users. Since end users are empowered to enter new master data directly into the hub, their trust in the accuracy and completeness of master data goes up, plus there’s less need for data stewards to act as the “data quality police.” Once users are no longer dependent on the CRM and ERP systems to perform initial entry and updating of master data, the data stewards can focus on managing exceptions and measuring data for quality, availability, security and usefulness. In this less-intrusive role, data stewards don’t present a bottleneck to critical business processes such as order management or invoicing.

By getting the master data right at the source, your initial level of quality for new records is much higher. The proactive style of data governance also effectively eliminates any time lags between the initial entry of a new master record, and its certification and publishing via middleware to the rest of the enterprise. As such, marketing campaigns can be done more quickly, with no upfront data remediation needed prior to launching a campaign. Finance benefits as well, since all of the data elements needed for a new customer are captured at once, and the hub-based process for adding a new customer can include pulling third-party content and calculating a credit limit, then passing that information back to the ERP system. Customer service benefits too, because all information is stored in one hub and made accessible through an efficient, user-friendly front end. Customer service reps are able to access all of the data needed for each customer interaction, as well as being able to author new data when necessary.

When is the right time to transition from reactive to proactive data governance? Some situations call for starting out immediately with the proactive approach, such as when you’ve got multiple CRM systems and ERP systems that would require integration with the hub in order to allow them to continue to operate as Systems of Entry, or when your current source systems are very brittle or hard to maintain or modify. In those cases, bite the bullet and plan from the beginning for proactive data governance.

Want to learn more about reactive vs. proactive data governance? You can download the complete white paper “When Data Governance Turns Bureaucratic” here.

29
Apr

Upcoming Hub Designs Webinar

Hub Designs is hosting a complimentary webinar on “Best Practices in MDM and Data Governance with Dan Power” in concert with our friends at eLearning Curve and Information Management magazine.

The webinar will be held on Friday, May 21, 2010, at 12:00 pm EDT (9:00 am PDT).

Topics will include:

  • helping you better understand master data management (MDM) and data governance,
  • presenting ten best practices and four advanced topics,
  • covering what works and what doesn’t,
  • the importance of a holistic approach,
  • how to get the political aspects right.

We’ll also discuss the difference between proactive and reactive data governance, and a potential MDM hub architecture.

To register, go to https://www1.gotomeeting.com/register/733157689. If you have any questions you’d like us to address, you can ask them here before the webinar using the Comment feature.
