There are two ways to build an impressive structure.
This is one way:
This is another way:
Which way do you prefer?
Data under construction
In the corporate world, we rarely find ourselves building things with stone, bricks, or even steel and concrete. Today, we build with data.
With data, we construct analyses of what happened and predictions of what will happen. With data, we create satisfying customer experiences. With data, we build the revenue and profit of our company.
But just like any other kind of construction, there are multiple ways to go about it. You could produce a stunning data analysis and spectacular report solely on the basis of manual labor: combing through column after column, table after table. It just might be outdated by the time you actually finish it. And you wouldn’t be able to do anything else in the interim: not exactly an ideal use of human resources.
Modern construction calls for modern tools. And modern data construction calls for modern data tools, one of which is a data catalog platform.
What is a data catalog platform, again?
A data catalog platform organizes all the data assets in a company’s information landscape on the basis of the data’s metadata (i.e. the data about the data). It gives any data asset an intelligent context from which to understand and use it.
Each data asset’s entry in the metadata catalog includes definitions, descriptions, ratings, data owner and steward, and more, making it simple to search for, identify and evaluate the data you need for any given purpose.
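To make that concrete, here is a minimal Python sketch of what one catalog entry might hold. The class name, field names and sample values are all assumptions made up for illustration; a real data catalog platform defines its own schema.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class CatalogEntry:
    """Hypothetical metadata record for a single data asset."""
    name: str
    definition: str
    description: str
    owner: str
    steward: str
    ratings: List[int] = field(default_factory=list)  # assumed 1-5 star scale

    def average_rating(self) -> Optional[float]:
        # Return None when nobody has rated the asset yet.
        return mean(self.ratings) if self.ratings else None


# Illustrative values only; none of this comes from a real catalog.
entry = CatalogEntry(
    name="customer_orders",
    definition="One row per confirmed customer order",
    description="Order data loaded nightly from a (hypothetical) sales system",
    owner="Sales Operations",
    steward="J. Doe",
    ratings=[5, 4, 4],
)
print(entry.owner, entry.average_rating())
```

The point is simply that the metadata travels with the asset: anyone looking at the entry sees who owns it and what others thought of it, without opening the data itself.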
A small business might not need data catalog software; if you have very limited data assets, the cost of organizing and automating your metadata and data catalog management might not be worth it. Some things you can build by hand: contracting with a construction company to install a shelf is usually overkill.
But as soon as you are dealing with data assets in the tens or hundreds of thousands (as are most enterprises today), efficiency dictates either modern automated tools or a team of indentured servants – and the latter is not exactly realistic.
If you find yourself or your team members saying the following things, you’ll find that life and work with a data catalog platform are refreshingly simple and astoundingly productive.
“I spent a long time creating a dataset (or a report)… and after I finished I found out that it existed already.”
Frustrating waste of time and energy #1 = duplicating what already exists. It’s almost inevitable when data assets aren’t organized.
A data catalog platform serves as a single, searchable repository for all data assets. Finding what you need is step one in any data project, so powerful, user-friendly search and discovery functionality is at the heart of any data catalog worth its salt. Ideally, data discovery on a data catalog should be just as intuitive as searching for a product on an online retail marketplace.
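As a rough illustration of the idea, the sketch below runs a keyword search over a tiny in-memory catalog in Python. The entries, the `search` function and its match-every-word rule are assumptions for this example; real platforms use far richer indexing and ranking.

```python
from typing import Dict, List

# A tiny, made-up catalog: each entry is just a name plus a description.
CATALOG: List[Dict[str, str]] = [
    {"name": "customer_orders", "description": "Confirmed customer orders, loaded nightly"},
    {"name": "web_sessions", "description": "Website visits grouped into sessions"},
    {"name": "churn_report_q3", "description": "Quarterly report on customer churn"},
]


def search(catalog: List[Dict[str, str]], query: str) -> List[str]:
    """Return names of entries whose name or description contains every query word."""
    terms = query.lower().split()
    if not terms:
        return []  # an empty query matches nothing
    hits = []
    for entry in catalog:
        text = f"{entry['name']} {entry['description']}".lower()
        if all(term in text for term in terms):
            hits.append(entry["name"])
    return hits


print(search(CATALOG, "customer churn"))
```

Had the analyst in the quote above run a search like this first, the existing churn report would have surfaced before any work was duplicated.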
“I used an available dataset for an important project, only to discover that it had quality issues. I had to redo the whole project.”
Frustrating waste of time and energy #2 = unknowingly using poor-quality materials, rendering the finished product useless. How were you supposed to check the quality? It all looks the same… until it falls apart.
Data catalog features should include usage data in entries, so you can see how often and for what this data asset has been used. They should also include user-generated information like ratings and reviews, enabling you to leverage the combined experience of your organization’s data consumers when choosing what data you’re going to rely on.
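One simple way to picture this: given a few candidate datasets, sort them by what colleagues have said and how much they have been used. The Python sketch below does that with invented data; the field names (`times_used`, `ratings`) and the ranking rule are assumptions, not how any particular catalog scores its assets.

```python
from statistics import mean

# Invented candidates with usage counts and star ratings (assumed 1-5 scale).
candidates = [
    {"name": "sales_2023_raw", "times_used": 3, "ratings": [2, 1]},
    {"name": "sales_2023_clean", "times_used": 41, "ratings": [5, 4, 5]},
    {"name": "sales_2023_draft", "times_used": 0, "ratings": []},
]


def trust_key(asset):
    """Sort key: average rating first, usage count as the tie-breaker."""
    avg = mean(asset["ratings"]) if asset["ratings"] else 0.0  # unrated counts as 0
    return (avg, asset["times_used"])


# Highest-rated, most-used assets come first.
ranked = sorted(candidates, key=trust_key, reverse=True)
for asset in ranked:
    print(asset["name"], asset["times_used"], asset["ratings"])
```

Treating unrated assets as zero is a deliberately cautious choice for this sketch; a team might equally decide to flag them for review instead.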
“Whenever I have a question about a data asset, it takes forever to get in touch with someone who can give me an answer.”
Who is the data owner? Who is the data steward? Is there a subject matter expert in the house – or did she jump ship a month ago without a forwarding address?
A good data catalog platform clearly identifies the important roles for every data asset. A really good data catalog provides communication channels from within the catalog itself for getting answers and clarifications from those responsible for the data asset. A really, really good data catalog records these questions and conversations within the data asset entry, making this valuable tribal knowledge available to every future user who looks at the entry.
“It takes me SO long to find the right data for my project.”
Without a data catalog, getting the data you need for a project is reminiscent of a poor blind dating experience. Based on whatever documentation exists, plus some talking to colleagues, you’re set up with a data asset you think has promise. You psych yourself up, make yourself look all snazzy, go to the appointed place, spend a few hours with the date, and… nah. Not a chance. How did anyone even think this would go?
It would be SO much more effective – not to mention time- and energy-saving – for you to be able to learn more about your date before you meet them. What do they look like? What are their interests? What are their life goals? What do their friends – and former dates! – say about them?
A data catalog removes the blindness from your data matchmaking experience. Like the perfect online dating site (which doesn’t exist, but we can dream), you can search for what you think would make a good match, and then see your potential dates in the illuminating context of objective information about them (e.g. what accomplishments have they had, where do they usually go on dates, how many people have gone out with them) and subjective information (e.g. what prior relationships say about them). You can even ask questions to people who know them!
While this doesn’t guarantee success (you do have to meet them before you can commit to a long-term relationship), your chances of finding Mr./Ms. Right Data go way up and the time required to do so goes way down.
Many hands (or the right tools) make light work
If you were Egyptian Vizier Rekhmire planning the impressive structure that would house your future mummified body, you might have had the luxury of “many hands.” Not so if you’re an enterprise business user, BI self-service user, or BI analyst. All you have are your own two hands and brain – and the brains of any relevant data users or experts you can get your hands on.
But if you also have a data catalog platform, it doesn’t matter. The right power tools trump manual labor any day. Spectacular data construction, here you come!