What are Data Catalogs?

Apr 5, 2021

How Data Catalogs Help Tame Healthcare

We have so much data now, why is it still so hard to get good information?

There was a time not too long ago when most of the non-billing data in healthcare was gathered manually from paper charts and tick sheets on clipboards. Given the level of effort required to digitize that data, organizations had to be very choosy about what was valuable enough to justify keying it into a database. Quality and regulatory reporting requirements generally, and appropriately, took precedence. Beyond those, there had to be a strong case to justify the investment; otherwise you had to get creative (did somebody say intern?). With the advent of electronic medical records (EMRs) and the progression of business systems in general, it seems that everything is electronic and everything is tracked now. And yet the same complaints are heard constantly about the lack of information to make business and clinical decisions. The complaints are just louder and angrier because the massive investment in EMRs hasn’t produced the promised ROI yet.

Different problem, similar outcome.

Information is no longer held captive in stacks of paper; now it is locked away in massive, inaccessible data silos. By “inaccessible” we don’t mean it is actually impossible to access, although the more overzealous IT departments may prefer that scenario. What we mean is that it is effectively inaccessible due to the sheer volume of data and the difficulty of turning that data into information. For example, if I want to know how many patients received an antibiotic during their visit with a provider, a seemingly simple question, I have to know several things in order to unlock the information held within the data:

  • What does “received” mean? Is it actual administration of the antibiotic or do prescriptions count?
  • What is a “visit”? Are we interested in inpatient, outpatient, and ambulatory visits? What about telehealth visits that resulted in a prescription?
  • What types of “providers” should be included? MDs? NPs? Others on the care team?
  • What systems contain this data? Are there different systems for hospital vs. clinic records? What about pharmacy? Do I need to look at both medical record AND billing data to get the full picture?
  • How do I pull the different data elements from the different source systems? Do the answers to the first three questions differ depending on the system?
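
To make the point concrete, here is a minimal Python sketch of how the answer to that "simple" question shifts with the definitions chosen. Everything in it is an illustrative assumption: the `MetricDefinition` fields, the `patients_receiving_antibiotic` function, and the sample rows are made up, and real data would first have to be pulled and reconciled from the source systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """Hypothetical container for the choices the questions above force."""
    count_prescriptions: bool   # does a prescription count as "received"?
    visit_types: frozenset      # which encounters count as a "visit"
    provider_types: frozenset   # which clinicians count as a "provider"


def patients_receiving_antibiotic(events, definition):
    """Count distinct patients with at least one event matching the definition."""
    allowed_events = {"administered"}
    if definition.count_prescriptions:
        allowed_events.add("prescribed")
    return len({
        e["patient_id"] for e in events
        if e["event"] in allowed_events
        and e["visit_type"] in definition.visit_types
        and e["provider_type"] in definition.provider_types
    })


# Made-up sample rows, as if already merged from several source systems.
events = [
    {"patient_id": 1, "event": "administered", "visit_type": "inpatient", "provider_type": "MD"},
    {"patient_id": 2, "event": "prescribed", "visit_type": "telehealth", "provider_type": "NP"},
    {"patient_id": 3, "event": "prescribed", "visit_type": "outpatient", "provider_type": "MD"},
    {"patient_id": 1, "event": "prescribed", "visit_type": "outpatient", "provider_type": "MD"},
]

# Narrow reading: administration only, in-person visits, MDs only.
narrow = MetricDefinition(False, frozenset({"inpatient", "outpatient"}), frozenset({"MD"}))
# Broad reading: prescriptions count, telehealth counts, NPs count.
broad = MetricDefinition(True, frozenset({"inpatient", "outpatient", "telehealth"}),
                         frozenset({"MD", "NP"}))

print(patients_receiving_antibiotic(events, narrow))
print(patients_receiving_antibiotic(events, broad))
```

The same four rows yield different counts under the two definitions, which is why the definition itself needs to be written down and agreed on before anyone trusts the number.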

Something is still missing.

There has been great progress made in the areas of data integration, visualization, and analysis. State-of-the-art tools help bring data together into comprehensive data warehouses. Sophisticated reporting and dashboarding tools create beautiful visualizations of data and serve it up to decision makers at the click of a button. Highly advanced data analysis tools help answer key questions and make data science more accessible to analysts across the organization. But what these tools can’t do is instill enterprise-wide consistency and broad trust in the data across all the disciplines of data integration, visualization, and analysis. That’s where enterprise data governance comes in.

Enterprise data governance (EDG)

EDG is the essential and overarching discipline that seeks to improve the efficiency of the use of data through organization, coordination, transparency, and education (a.k.a. data literacy). And EDG’s tool of the trade is the data catalog. It’s the missing piece that complements the data management, integration, visualization, and analysis tools you already have.

So what is a data catalog?

A data catalog is a repository of an organization’s information assets. It’s a searchable directory of reports, dashboards, metrics, data warehouse tables and columns, data definitions, business glossary definitions, and so on, along with metadata about those assets. “Metadata” is information about the data asset, or “data about data”. The type of metadata tracked depends on the type of object. For example, a report might have a description, a list of available filters, the name of the report developer, the name of the database the report pulls its data from, etc. A metric might list inclusion and exclusion criteria, the name of the data steward responsible for metric definition, the date the metric was last refreshed…you get the idea. If it relates to data and analytics in any way, it can (and should) be tracked in a data catalog.
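
As a rough illustration of the idea, the sketch below models a catalog as a searchable list of assets, each carrying metadata that varies by asset type. It is a toy in-memory example, not how any particular catalog product works; the class names, field names, and sample entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogAsset:
    """One entry in a hypothetical data catalog."""
    name: str
    asset_type: str                               # e.g. "report", "metric", "table"
    description: str = ""
    metadata: dict = field(default_factory=dict)  # type-specific "data about data"


class DataCatalog:
    """A tiny in-memory catalog with keyword search (illustrative only)."""

    def __init__(self):
        self._assets = []

    def register(self, asset):
        self._assets.append(asset)

    def search(self, keyword):
        """Return assets whose name, description, or metadata values mention the keyword."""
        needle = keyword.lower()
        hits = []
        for asset in self._assets:
            haystack = " ".join(
                [asset.name, asset.description, *map(str, asset.metadata.values())]
            ).lower()
            if needle in haystack:
                hits.append(asset)
        return hits


catalog = DataCatalog()
# A report tracks filters, developer, and source database (all values made up).
catalog.register(CatalogAsset(
    name="Antibiotic Administration Report",
    asset_type="report",
    description="Visits during which a patient received an antibiotic.",
    metadata={"filters": ["visit type", "provider type"],
              "developer": "J. Doe (hypothetical)",
              "source_database": "clinical_dw (hypothetical)"},
))
# A metric tracks inclusion/exclusion criteria and its data steward instead.
catalog.register(CatalogAsset(
    name="Antibiotic Rate",
    asset_type="metric",
    description="Share of visits with an antibiotic administered.",
    metadata={"inclusion": "inpatient and outpatient visits",
              "exclusion": "prescriptions without administration",
              "data_steward": "Pharmacy Informatics (hypothetical)"},
))

for asset in catalog.search("antibiotic"):
    print(asset.asset_type, "-", asset.name)
```

Keeping the metadata as a free-form dictionary reflects the point that the fields tracked depend on the type of object; a real catalog would likely enforce a schema per asset type.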

If you get a data catalog, all your problems will be over, right?

If only it were that easy. Achieving successful EDG is not easy; in fact, it’s quite difficult. But it’s essential, and well worth the effort: improved quality, consistency, and effectiveness of your data lead to increased organizational trust in the data, which leads to better use of the data, which ultimately leads to more efficient and higher-quality patient care. Data catalogs, like all technology solutions, cannot bring about improvement alone, but you also cannot achieve all of these benefits without the coordination, automation, and transparency that data catalogs provide. It takes a two-pronged approach, which is why the Compendium Data Catalog comes standard with EDG templates and recommendations refined over years of healthcare data consulting experience. A world-class data catalog paired with a time-tested approach, all so you can achieve success! Want to learn more about how the Compendium Data Catalog can increase both the trust in and the efficient use of data in your organization? Contact us today!