Applied Data Governance by Collibra
Collibra provides advanced technology for information governance
The Hub Designs MDM Think Tank met recently with Collibra – Stan Christiaens, co-founder, and Maarten Masschelein, who handles technical pre-sales duties.
Collibra is an interesting company that Hub Designs has been tracking for the past several years.
On its web site, www.collibra.com, the company defines itself briefly as:
Collibra is a data governance software company, bringing business and IT together to govern data as an enterprise asset. As a best of breed vendor, Collibra provides the Data Steward organization with a technology platform that supports their activities. Collibra is a Gartner Cool vendor for Enterprise Information Management.
When Stan and Maarten briefed the Think Tank, they talked about “process-driven data governance”. Collibra has three separate products – Data Stewardship Manager, Business Semantics Glossary, and Reference Data Accelerator, which together make up its flagship offering, Data Governance Center.
The company launched Release 4 of Data Governance Center a few months ago (April 2013).
The first step is to load your operating model for data governance into the platform. This means setting up your data governance organization, including the steering committee, data governance office, stewardship layer and other working groups (using your own terminology, of course). The product supports centralized, decentralized and federated approaches to data governance, and allows you to create a multi-tiered organization that reflects how your company actually operates.
The next step is to set up your people and processes, including the relevant roles, responsibilities and controls, data ownership, workflows and common processes. The product supports subject matter experts (SMEs) and cross-functional collaboration, both of which are critical in a successful data governance effort.
The third step is to load your various asset types, including business and data definitions, taxonomies, reference data, rules, policies and information about business traceability. This aspect of the product seems to provide some very useful metadata management and policy management capabilities.
Collibra says that it can guide customers through this process and get them up and running in about ten days. The company provides an out-of-the-box operating model. If a customer doesn’t have everything thought out in advance (which is very common), the framework that Collibra offers can guide them.
After setting up the product, you’re ready to start executing, for use cases such as:
- Business Glossary
- Data Dictionary
- Reference Data Management
- Policy Management
- Issue Management
The product emphasizes using actionable metrics, making good conduct visible, measuring progress toward higher levels of data governance maturity, and managing data quality issues.
All of these together are what Collibra calls “applied data governance”: turning data governance from a paper exercise, or an effort perceived as more bureaucracy, into a living, breathing, evolving part of your company.
Business Glossary
You can import existing materials as your starting point. Then the team rounds out and finishes the critical business definitions, and has them reviewed and approved through workflows. The glossary can then be published to the whole company. And interestingly, you can highlight a term like “Customer” in any application or report and press a hotkey to show that term’s definition.
Data Dictionary
This part of the product really shows Collibra’s semantic roots. You use a three-step process: (1) Define, (2) Map, (3) Visualize. This allows you to take the terms from the Business Glossary and map them to the actual tables and attributes in your transactional systems, like ERP and CRM, and in your analytical applications, like your data warehouse. You can even describe the transformation logic applied between the transactional and analytical applications.
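To make the Define, Map, Visualize idea concrete, here is a minimal Python sketch. It is not Collibra's API or data model; the `GlossaryTerm` class, the `map_to` and `visualize` helpers, and the system, table and column names are all hypothetical, chosen only to show a business term being linked to physical attributes.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GlossaryTerm:
    """Step 1 (Define): a business term with an agreed definition."""
    name: str
    definition: str
    # Physical locations (system, table, column) that hold this term's data.
    mappings: List[Tuple[str, str, str]] = field(default_factory=list)

    def map_to(self, system: str, table: str, column: str) -> None:
        """Step 2 (Map): link the term to a table attribute in a system."""
        self.mappings.append((system, table, column))


def visualize(term: GlossaryTerm) -> str:
    """Step 3 (Visualize): render the term and its mappings as plain text."""
    lines = [f"{term.name}: {term.definition}"]
    for system, table, column in term.mappings:
        lines.append(f"  -> {system}.{table}.{column}")
    return "\n".join(lines)


# Hypothetical example data, not taken from any real system.
customer = GlossaryTerm("Customer", "A party that has bought at least one product.")
customer.map_to("CRM", "ACCOUNT", "ACCOUNT_ID")
customer.map_to("DW", "DIM_CUSTOMER", "CUSTOMER_KEY")
report = visualize(customer)
print(report)
```

Keeping the mappings on the term itself means one lookup answers both "what does this mean?" and "where does it live?", which is the traceability a data dictionary is meant to provide.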
Reference Data Management
Collibra can govern reference data in one place and then share it across applications. This is usually in the form of lists, provided by agencies like the U.S. government, the Postal Service, the UN, and groups like ISO and the American National Standards Institute. You bring the data into Collibra, update it where needed and map it to your applications, and then you publish / provision the data into those systems.
For Collibra, reference data can come in many forms:
- “flat” reference data (the lists described above), which is very common;
- hierarchical or relational reference data, where a hierarchy or relationship is required between the “flat” reference data (think countries & currencies, or product hierarchies). This can include business rules represented as reference data (e.g., using this state and that product family means a particular discount);
- mappings (i.e. crosswalks): because reference data is widely referenced, and because of legacy systems and packaged applications, you can’t just change it as you see fit. That is why people require mappings. For example, transactions may be “country coded” with “US” in the CRM system, but with “USA” in the ERP system.
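The crosswalk idea in the last item can be sketched in a few lines of Python. This is an illustration only, not how Collibra stores mappings: the `COUNTRIES` and `CROSSWALK` tables and the `to_canonical` and `translate` functions are hypothetical, and the sketch assumes the canonical list uses ISO 3166-1 two-letter codes while the ERP system uses three-letter codes.

```python
# Canonical "flat" reference list: ISO 3166-1 alpha-2 code -> country name.
COUNTRIES = {"US": "United States", "BE": "Belgium"}

# Crosswalk: for each system, its local code -> the canonical code.
CROSSWALK = {
    "CRM": {"US": "US", "BE": "BE"},
    "ERP": {"USA": "US", "BEL": "BE"},
}


def to_canonical(system: str, local_code: str) -> str:
    """Translate a system-specific code into the canonical code."""
    try:
        return CROSSWALK[system][local_code]
    except KeyError:
        raise KeyError(f"no mapping for {local_code!r} in {system!r}") from None


def translate(source: str, target: str, local_code: str) -> str:
    """Translate a code from one system's vocabulary into another's."""
    canonical = to_canonical(source, local_code)
    for target_code, target_canonical in CROSSWALK[target].items():
        if target_canonical == canonical:
            return target_code
    raise KeyError(f"{canonical!r} has no code in {target!r}")


print(translate("CRM", "ERP", "US"))          # US in CRM is USA in ERP
print(COUNTRIES[to_canonical("ERP", "BEL")])  # look up the canonical name
```

Routing every translation through the canonical code means each system needs only one mapping table, rather than one per pair of systems.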
Policy Management
For me, one of the most exciting use cases is Policy Management. First, you define the policy, and then apply it to your existing data. An external data quality tool such as Harte Hanks’ Trillium assists with reporting on the level of compliance your current data has with the policy. Then you can act on those notifications and remediate the data where needed, to make it comply with the policy.
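As a rough illustration of the define, apply and report cycle, the Python sketch below expresses a policy as a rule, checks records against it, and reports a compliance level plus the records needing remediation. It is a toy stand-in for the external data quality tool, not Trillium's or Collibra's interface; the `Policy` class, the `check` function and the sample records are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


@dataclass
class Policy:
    """A named policy with a rule that returns True for compliant records."""
    name: str
    rule: Callable[[Record], bool]


def check(policy: Policy, records: List[Record]) -> Tuple[float, List[Record]]:
    """Return the compliance ratio (0.0 to 1.0) and the violating records."""
    violations = [r for r in records if not policy.rule(r)]
    if not records:
        return 1.0, violations
    return 1 - len(violations) / len(records), violations


# Hypothetical policy: every customer record must carry a country code.
policy = Policy(
    name="Customer records must have a country code",
    rule=lambda r: bool(r.get("country", "").strip()),
)

# Made-up sample data for illustration only.
records = [
    {"id": "1", "country": "US"},
    {"id": "2", "country": ""},
    {"id": "3", "country": "BE"},
    {"id": "4"},
]

compliance, violations = check(policy, records)
print(f"{policy.name}: {compliance:.0%} compliant")
print("Needs remediation:", [r["id"] for r in violations])
```

The list of violating records is what a steward would work through during remediation; rerunning the check afterwards shows whether compliance has improved.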
Collibra is releasing version 4.1 this month and has just announced its alliance with Trillium (a robust data quality tool).
Issue Management
We often remind clients of the need to have a “help desk for data”: a centralized way for people to report data quality issues that can then be resolved by the data governance team. This component of Collibra provides that, with a way to report issues, resolve them through triage and review, escalation, and assignment to the right resources, and then notify the appropriate parties of the proposed solution.
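The issue lifecycle described above behaves like a small state machine. The Python sketch below is one possible reading of it, not Collibra's actual workflow: the status names, the `ALLOWED` transition table and the `Issue` class are hypothetical, and notification is reduced to appending a message to a list.

```python
# Allowed status transitions (an assumed workflow, for illustration only).
ALLOWED = {
    "reported": {"triaged"},
    "triaged": {"assigned", "escalated"},
    "escalated": {"assigned"},
    "assigned": {"resolved"},
    "resolved": set(),
}


class Issue:
    """A data quality issue that moves through a controlled workflow."""

    def __init__(self, summary: str, reporter: str) -> None:
        self.summary = summary
        self.reporter = reporter
        self.status = "reported"
        self.history = ["reported"]
        self.notifications = []

    def move_to(self, new_status: str) -> None:
        """Advance the issue, rejecting transitions the workflow forbids."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"cannot go from {self.status} to {new_status}")
        self.status = new_status
        self.history.append(new_status)
        if new_status == "resolved":
            # Stand-in for notifying the appropriate parties.
            self.notifications.append(
                f"To {self.reporter}: '{self.summary}' has been resolved"
            )


issue = Issue("Duplicate customer records in CRM", reporter="sales-ops")
for step in ("triaged", "escalated", "assigned", "resolved"):
    issue.move_to(step)
print(issue.history)
print(issue.notifications)
```

Making the transitions explicit is what separates a managed process from an e-mail thread: an issue cannot be marked resolved without first being triaged and assigned.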
With the ability to put your data governance organization into the platform, define your critical business terms, establish a data dictionary, centrally manage your reference data, create & enforce data governance policies, and manage & resolve issues as they are brought up, Collibra provides everything you will need to get started with data governance.
The product will deliver you from “Excel hell” – with multiple competing versions of spreadsheets defining business terms in a glossary, listing key metadata, and reconciling reference data between applications. Policies can be defined in a proactive way, rather than just in Word documents stored in SharePoint that you are hoping people are following. And issue management becomes an organized, disciplined process, rather than an endless spaghetti bowl of e-mails, meetings and crossed connections.
It’s been interesting to watch Collibra’s development over the past 2-3 years, and with Data Governance Center Release 4.1, the product deserves serious consideration by every company running or considering a data governance initiative.
Dan Power is the publisher of Hub Designs Magazine, and the Founder & President of Hub Designs, a leading consulting firm specializing in master data management (MDM) and information governance. You can reach him at http://hubdesigns.com/contact_us.html.