Webinar Transcript
Amnon Drori: Hi everyone. My name is Amnon Drori. Thank you for joining today. Today we’re going to share with you a little bit of what we have experienced in the past couple of weeks with our clients, specifically around BI. Today, I’m going to be joined by my colleague, Amichai, who’s running Customer Success, and off we go. Amichai, can I control the presentation?
Amichai Fenner: Yes, sure. You go.
Amnon: Great. Thank you. What we saw in the past couple of weeks after working very, very closely with our clients is that the need for data has not changed. As a matter of fact, due to the COVID-19 situation, a lot of organizations have enabled their employees to work from home. Due to that fact, a lot of the communication that was very, very easy between employees, specifically around business intelligence and data scientists and data architects enabling data for their business users, has become more and more challenging.
Not only was dealing with data management a challenge by itself, now it’s becoming even more challenging by the fact that people are not in the same building, nor communicating as easily with each other. What actually happens in life is that probably, this is a better description of how you guys work today. We find ourselves doing this all the time because not only do you have to work from home, you have to take care of your family at the same time. Now it’s becoming even more and more challenging, not only to work and try to drive your business as usual, as it was before, but in parallel, take care of your family as well.
What we have experienced in the past three weeks, after working in the market for more than four years, is that the demand for data and the need to get data on demand are becoming more and more critical. Before, business users could have waited a few days or even a week when they wanted to get additional data in order to be able to make decisions, or when they wanted to understand if they can trust the data that they’re looking at. Now, the need for accurate data has become more and more a critical necessity for any organization in order to be able to make these decisions at a faster pace, due to the changes that are happening around us. What we’ve seen is the fact that while you’re trying to get more data in a faster way, due to business demands, you compromise on its quality.
You compromise on the amount of effort that it takes you to be able to deliver faster data to the business, which increases the risk. When you have more time, you have the capability to look at the data that you actually ship to your business users and verify this through one, two, or even three cycles. Today you don’t have this luxury. What we heard from our clients in the past couple of weeks is, how do you balance this? How do you balance this physics of moving fast with making as few mistakes as possible, so that the business users who are consuming data in a matter of hours can trust the data that they see and be able to make decisions based on that data that you are actually responsible for?
What we’ve seen is that in the past couple of weeks, business intelligence and analytics people have become more and more critical to the organization. As a matter of fact, when we talk to business users, their heroes are the BI guys, the business intelligence team that consists of a variety of different roles that collectively enable the data to be trusted and enable more data to be shipped to the business users, so they can expand their spectrum of knowledge based on additional data, to be able to make proper decisions, very much to align their financials, very much to be able to make decisions that impact their customers and people around them.
If I need to capture the five different bullets that we’ve heard again and again in the past three weeks, the first was the need for the business to get more data. On the BI side, the language was to provide data in a very short, timely fashion, without compromising on its accuracy, which is, in some cases, an anomaly. The faster you go, the more holes you have, so how do you balance this?
They also wanted to get access to everything they need from anywhere at any time, given the fact that we’re not even in the same room or in the same building or in the same state. How do you keep communicating, looking at the same source of truth that you can trust? Add to that the fact that you’re not physically in the office and don’t have easy communication, given the fact that you’re working probably with one or two kids in your lap, and the constant pressure of business users who want data in order to get better insights, where the environment keeps changing all the time.
We captured about 15 use cases, looked at their popularity, and picked three, which are probably going to be familiar to you, but I think they stood out even more in the past three weeks in what we’ve seen our clients using. What we want to do today is to show you the top three use cases that we’ve seen our customers using hundreds of times on a weekly basis and how automation provides this capability, which is so critical these days.
If the picture in front of you looks familiar, it is basically an illustration of a typical business intelligence landscape that we’ve seen customers use. On the left-hand side, you can see hundreds or many dozens of applications whose data is stored in different data sources. A lot of this data needs to be shipped to the data consumers, which could be by the hundreds, could be by the thousands of people consuming tens of thousands of reports at any given point of time.
The magic happens when you ship data from the data sources to the data consumers, through using all of these BI tools that are doing different things to the data. In most cases they are even from different vendors. Spending a lot of time in these tools to try to understand four questions. How do you answer four questions?
These are four questions that we’ve seen clients struggling with every day. Where is the data coming from? Where is the data going to? Where does it exist within those systems? What do the data elements mean? When you say territory, when you say a customer, when you say commission, when you say address, and I can go on all day, what exactly does it mean?
All these answers are encapsulated in understanding these business intelligence systems. This is where you spend, or the business intelligence team spent a lot of time. One of the things we understood being, or coming from the BI environment in the past 12 years is the fact that it’s probably not efficient to understand answering four questions by digging in each one of these systems, rather than finding a solution that will understand those systems for us.
If there’s a way we can capture the data and the information about the data in its different varieties, what you know as metadata, like the business metadata and the technical metadata and the operational metadata, from those systems in a single repository and enable analysis of that information, it can help us a lot, because we’re going to have a tool that analyzes that for us in those systems rather than us continuing to do this manually.
The popular use cases that we’ve seen within our customers in the past three weeks have been answered by the fact that analyzing business intelligence is now being done in an automated fashion. We’re going to go through three use cases, and my colleague Amichai, who runs Customer Success and works very closely with our clients, is going to show you how they’re using automation at Octopai, addressing these questions for those specific use cases. These are not real customer environments, but our demo environments can mimic how you could imagine our customers using Octopai. Let’s dive into use case number one. This is an example of an insurance company, but we’ve seen this use case happening a lot with almost all of our clients. The business request was, “I need to be able to trust the data that I see because I need to make quick decisions in a changing environment.”
This is a true quote from one of our clients. When we asked the BI team, “What exactly do you do when you’re being asked by the business user to enable them to trust the data?”, in the BI language they are saying, “How do I find the exact detailed processes and database tables that collectively are associated with landing the data on a particular report?” When a business user is saying, “I look at this certain data. Can I trust it? Is that a true number?” in the BI language, it means I need to discover how the data landed in that report by understanding the entire BI landscape.
Amichai, can you show an example of how you leverage automation to answer this question?
Amichai: Yes, sure. Let me go ahead and share our demo environment. All right. This here is our demo environment and this is our landing page. You can see here we’ve–
Amnon: Sorry for interrupting. We still see the presentation.
Amichai: Oh. Sorry about that.
Amnon: Maybe you need to– Oh, now we see the presentation. Thank you. Now, we see the demo. Thank you.
Amichai: Excellent. Great. This here is our demo environment. You can see we have analyzed ETLs, database objects, and reports. We have 22 reports analyzed in our demo environment. I would like to show you an example of how customers are using this to find out how data got into a report. One of the use cases that we’ve been getting is, “I have this report that all of a sudden maybe it is being used. It hasn’t been used but now due to this whole situation, users are now starting to use it. It’s giving me bad data or it’s coming up empty.” You want to quickly find out how data is getting into this report and ultimately start to fix this.
I’ll give you an example of a report. In this case, it’s going to be customer product. You can see that by filtering on the word ‘product’ we have filtered down to eight of the reports that have been analyzed that include that name. Now, I’ll go ahead and click on lineage to see exactly how data gets into this report. What I can see right away is that this is the report here that I created lineage from. It is actually using this view, which is based on another view and two more tables that are using three different ETLs to get data into those tables and then, of course, into the report.
Now, each of these ETLs is actually from a different system. This is Informatica over here. This is Informatica again but this here is SSIS, for instance. You can see your multi-vendor landscape and how it all ends up into the report. Is this what you mean, Amnon?
Amnon: Yes, exactly. I think that what’s interesting is that from the BI language, if I want to understand the exact names of the ETLs and the exact names of the database tables and views that collectively are landing certain data on the report, which I need to look at and check that everything is okay, this is a great map to see. The alternative is to try to draw this manually, but the question remains which of the ETL processes and database tables, out of the thousands of possible options, are related to that specific report. Right?
Amichai: Yes, exactly. Obviously, manually that would take a very long time. We’re talking about anywhere from a week or two depending on the complexity of the whole process.
Amnon: Right. Thank you.
Amichai: Sure.
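The reverse lineage lookup shown in this use case can be pictured as a walk over a dependency graph, starting at the report and moving back toward the ETLs. The sketch below is only an illustration under stated assumptions, not Octopai's implementation: the `UPSTREAM` dictionary, every object name in it, and the `reverse_lineage` function are hypothetical, chosen to mirror the shape of the demo (one report, two views, two tables, three ETLs).

```python
from collections import deque

# Hypothetical lineage edges: each key is fed by the objects in its list.
# All names are made up for illustration; they only mirror the demo's shape.
UPSTREAM = {
    "report:Customer Product": ["view:customer_product"],
    "view:customer_product": ["view:product", "table:customers", "table:products"],
    "view:product": ["table:products"],
    "table:customers": ["etl:informatica_load_customers"],
    "table:products": ["etl:informatica_load_products", "etl:ssis_load_products"],
}


def reverse_lineage(target, upstream=UPSTREAM):
    """Return every object that directly or indirectly feeds `target` (breadth-first)."""
    found = []
    visited = {target}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for parent in upstream.get(node, []):
            if parent not in visited:  # skip objects reached by an earlier path
                visited.add(parent)
                found.append(parent)
                queue.append(parent)
    return found


sources = reverse_lineage("report:Customer Product")
etls = [name for name in sources if name.startswith("etl:")]
print(etls)
```

Filtering the result on the `etl:` prefix gives the short list of processes worth checking when a report shows bad or empty data, instead of searching through every ETL in the landscape.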
Amnon: We can move on to maybe use case number two. Use case number two basically illustrates the opposite picture of the reverse lineage that we just did. This is an example that we’ve taken from one of our clients in the healthcare industry, but again it might be relevant to other industries. What we’ve seen is that the demand for additional data in order to be able to make decisions is growing very, very rapidly. The business request to the BI was, “I need more data so I can make more accurate decisions during the situation that we’re currently having.”
The business request was, “Give me more data.” The BI language was, “If I need to update a certain ETL process in order to enrich its capability to ship more data, how do I understand the impact of all the changes that I’m planning on making to a certain ETL process?” One of the things you want to avoid is looking at only that specific ETL for enriching a certain report. You want to understand all the implications of the remediation that you are planning on doing on a certain ETL, because you don’t want to deal with issues that might happen or affect other reports or database tables downstream. The impact analysis part is critical even in the design phase of changes, before you even go live.
Amichai, can you show an example of how do you deal with that?
Amichai: Of course, sure. Let me share our demo environment again. Here we go. This time as Amnon said, I will show you how we will easily search for an ETL that we need to make a change to and then understand the impact of making a change there down the line across your entire landscape. For this example, I want to make a change in load data warehouse.
I go ahead and search for it, click on lineage, and there you go: in a couple of clicks you can see exactly the ETL that we plan on making a change to. We can see the different tables that it is populating. We can then see the different Analysis Services objects, dimensions, and measure groups that it is affecting, and over here a view that’s consuming these tables, or proceed to another view, so basically how it relates to the database and then ultimately the reports that are relying on this data.
What’s cool here is that you can click on any of these reports and we’ll see they’re actually from different tools. This here is an SSRS report. This here is SSRS as well. This here is BO for instance, or this guy here is a Power BI report but still we’re able to show you this in one screen, mapping out exactly how this ETL or making a change to this ETL may impact across your entire environment.
Amnon: Thank you, Amichai. What we’re seeing here is basically a lineage between systems. What you’re demoing actually is that the blue circle is the ETL starting point and then the red is the database, data warehouse, and the green stuff or the green bubbles are the reports from the reporting systems, right?
Amichai: Yes. That’s correct.
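Impact analysis is the same kind of graph walk in the other direction: start at the ETL and collect everything that depends on it. The following is a minimal sketch under assumed data, not a description of how Octopai works internally. The `DOWNSTREAM` edges, the object names, and the `impact_of` helper are all hypothetical.

```python
# Hypothetical downstream edges: each key feeds the objects in its list.
# Names are invented for illustration only.
DOWNSTREAM = {
    "etl:Load Data Warehouse": ["table:dim_product", "table:fact_sales"],
    "table:dim_product": ["cube:product_dimension", "view:v_sales"],
    "table:fact_sales": ["cube:sales_measures", "view:v_sales"],
    "view:v_sales": ["report:ssrs_sales", "report:bo_sales", "report:powerbi_sales"],
    "cube:sales_measures": ["report:ssrs_margin"],
}


def impact_of(change, downstream=DOWNSTREAM):
    """Collect everything reachable from `change`, grouped by object type."""
    impacted = {}
    visited = {change}
    stack = [change]
    while stack:
        node = stack.pop()
        for child in downstream.get(node, []):
            if child not in visited:
                visited.add(child)
                kind = child.split(":", 1)[0]  # "table", "view", "cube", "report"
                impacted.setdefault(kind, set()).add(child)
                stack.append(child)
    return impacted


impact = impact_of("etl:Load Data Warehouse")
for kind in sorted(impact):
    print(kind, sorted(impact[kind]))
```

Grouping by object type corresponds loosely to the colored layers in the demo (ETL, database, reports), so the reports from different tools that depend on one ETL show up together in a single list.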
Amnon: What if I want to see the lineage inside one of these options. Let’s say when I look at the load data warehouse I see three lines coming out of it. It probably represents three different maps. Can I drill into those lineages deeper all the way to the column level?
Amichai: Yes, sure. We can drill straight into this SSIS by double-clicking on it. We’re basically drilling into the package. We can see that it’s got a sequence container over here. We can keep drilling through this package. In the sequence container, we can see we have a bunch of tasks. One is an execute SQL task over here, a few data flow tasks further on. If I want to, I can drill further into any of these data flow tasks and what’s cool here it’ll actually visualize the column-to-column lineage, so how one field actually gets populated all the way from source to target so I click on a target field over here, order quantity, for instance, and I can see exactly all the components that were used in this data flow and ultimately all the way back to its source column. Is this what you mean?
Amnon: Yes. You can show different layers of lineages either between systems, table to table, or even column to column, and that’s with a click of a button.
Amichai: Yes, that’s right.
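Column-to-column lineage can be sketched the same way at a finer grain, where each node is a (component, column) pair inside a single data flow. The example below is an assumption-laden illustration: the component names, the column names other than the order quantity field mentioned in the demo, and the `trace_column` function are all hypothetical.

```python
# Hypothetical column mappings inside one data flow:
# (component, column) -> the (component, column) pairs it is derived from.
COLUMN_MAP = {
    ("target:FactSales", "OrderQuantity"): [("derived_column", "OrderQuantity")],
    ("derived_column", "OrderQuantity"): [("lookup", "OrderQty")],
    ("lookup", "OrderQty"): [("source:SalesOrderDetail", "OrderQty")],
}


def trace_column(component, column, column_map=COLUMN_MAP):
    """Follow one target column back toward its source, returning every hop."""
    start = (component, column)
    path = [start]
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for origin in column_map.get(current, []):
            if origin not in path:  # guard against revisiting a hop
                path.append(origin)
                frontier.append(origin)
    return path


path = trace_column("target:FactSales", "OrderQuantity")
for hop in path:
    print(" <- ".join(hop))
```

The returned path lists each component the field passes through, ending at the source column, which is the information the drill-down view presents visually.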
Amnon: Great. Thank you for showing this. Let’s move on to the third example or third use case which we’ve seen. Just to recap, we’ve seen organizations wanting to be able to trust the data that they’ve seen. For that sake, you would need to use a capability called reverse lineage between systems. We’ve seen another use case that has to do with the fact that business users want to have more data being sent over to them. You need to update or modify or create or recreate business processes, so you would want to know the impact analysis of any change that you are about to make so you will not be in a vicious cycle of inaccurate reports as a result of not knowing what could be the impact of changes.
The third popular use case that we’ve seen is that due to the growing demand for data, in some cases, we see that business users don’t really understand what this data means. We see more and more business users asking their analysts, “When you show me a certain data element, what does it mean? When you say an address, what did you mean? The physical address, the billing address, the headquarters address, the shipping address, what exactly do you mean? When you say state, do you mean a US state or maybe a state within a country like Germany, for example, because they also have states? Or when you say commission, how is the calculation being done?”
There are so many requests from business users to get more data, but also to understand the meaning of the data that they’re asking for, so they can actually look at the right data point and trust it. One of the things that we’ve seen is the growing request for very fast creation of a business glossary: the ability to understand the new data that is being created and enriched within the reporting system, but with a proper description to enable the business users to really accurately understand what this data means. The business request was, “I need to understand what this specific data element means.”
From the BI language, it was, “Where can I find the definition of that specific data element so I can provide this to the business user?” Probably, you are aware of what we call a data catalog, but more specifically, the business glossary. Now, as a result of analyzing the different facets of metadata from all of these BI tools, Octopai enables you to create, on the fly, a business glossary that contains all the metadata elements, the data elements, as well as the description of those data elements in order to enable the business user to understand what certain data elements mean.
Amichai, can you show how do we relate that within Octopai, how do you use it, and maybe you can show this with relevance to maybe combining this with data lineage?
Amichai: Yes, sure. Let me go ahead and show you our demo environment again. Okay, here we go. As you said, that’s a use case that we see a lot. Basically, you’ll have a business user who’s questioning why they see certain data in a certain field, for instance. Maybe they expected to see something else and you want to find out what is supposed to be there and what the meaning of that field is.
For this example, I would like to show you, again, a report. I will show you our sales report in our demo environment. [silence] There we go. This time instead of clicking on lineage, I want to go straight to our column-to-column lineage for this report. I’m going to click on that, and we can see a dataset for this report over here.
For example, our business user has complained about the field Full Name. They expected to see first, middle, and last name, and they’re only seeing first and last name in this field. By looking at the report, you can see that, actually, he’s right. He’s only seeing the first name and the last name. Now, we want to go and see whether this is the design or maybe, this is a mistake and this needs to be corrected. By using the automated business glossary over here which is created from the metadata that you upload to Octopai automatically, we can go ahead and search for that term.
Remember, we’re looking for a field called Full Name on the report, and if I click on it, I will see its description. I can see over here it is a combination, includes first, middle, and last name. It sounds like the business user is right. We also have the owner of this data item over here. It’s Jeff Smith, in this example. We may want to reach out to him and see whether this is correct, and we need to change the report or what the reason is for this discrepancy. That’s, again, a common use case that we have.
Amnon: Where did you get all of these definitions? It’s by the same extraction that you’ve done from the reporting system and just created this on the fly?
Amichai: Exactly. It’s a by-product from the analysis of the metadata that you are already uploading to Octopai and we are analyzing and it’s automatically created.
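The glossary check in this use case, comparing what a field is documented to contain with what the report actually shows, can be sketched as a simple lookup. This is a hedged illustration only: the record layout, the `GlossaryEntry` class, the helper functions, and the second sample row are hypothetical. Only the Full Name description and its owner follow what was said in the demo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryEntry:
    term: str
    description: str
    owner: str


# Hypothetical metadata rows, as if harvested from a reporting tool.
# The "Order Quantity" row is invented purely to give the glossary a second entry.
REPORT_FIELDS = [
    {"field": "Full Name",
     "description": "Combination of first, middle, and last name",
     "owner": "Jeff Smith"},
    {"field": "Order Quantity",
     "description": "Number of units ordered",
     "owner": "Jeff Smith"},
]


def build_glossary(rows):
    """Index glossary entries by case-insensitive term."""
    return {
        row["field"].casefold(): GlossaryEntry(row["field"], row["description"], row["owner"])
        for row in rows
    }


def documented_parts(entry, candidates=("first", "middle", "last")):
    """Return which name parts the description says the field should contain."""
    text = entry.description.casefold()
    return [part for part in candidates if part in text]


glossary = build_glossary(REPORT_FIELDS)
entry = glossary["full name"]
shown_in_report = ["first", "last"]  # what the business user actually sees
missing = [part for part in documented_parts(entry) if part not in shown_in_report]
print(f"Missing from report: {missing}; ask {entry.owner}")
```

Here the documented definition includes the middle name while the report does not, which is the discrepancy that would be raised with the data owner.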
Amnon: Thank you. Let’s go back to the presentation just to create a short summary, and, again, maybe in the next session we can share additional use cases. I think that what we tried to show you is the fact that leveraging automation is something that is really important for organizations, not only because of the coronavirus. The reason the three top use cases that we’ve seen stand out over other use cases is that automating the way you work today, to cope with all of the growing business demands, is one of the ways for you to really provide the ability to answer those four questions that are being asked on a daily basis by business users.
In the BI language, these are: where the data is, what does this data mean, where the data is coming from, and where is it going. At Octopai, we believe that this goldmine called information about the data, in its different types, or what you know as metadata, is really a world of insights that any organization can enjoy, yet today most of that discovery is being done manually.
The combination of extracting metadata in a very, very easy manner, centralizing this in a single repository, providing a very thorough analysis, and projecting that analyzed metadata in the form of different types of visualization, like data discovery, data lineage, business glossary, and version management, all in a single platform, is the way to modernize the way business intelligence teams work, as opposed to how they used to work in past years.
One of the questions that I was asked before we go further is, how do you get all this metadata? How do you make this analysis in the first place? Our product is based on the fact that we analyze metadata of the client, and the client needs to extract the metadata from their business intelligence systems. It can be done in one of two ways: either the end customer extracts the metadata from a variety of different tools themselves, or, what we’ve seen happening with 100% of our clients, they use certain extractors to help them extract the metadata.
We also enable this capability. When we create an account for you at Octopai, you can download relevant extractors that extract metadata from a variety of different tools. You can see a long list of them on our website. Once you extract the metadata, the metadata is shrink-wrapped in an XML file. It’s a readable file that consists only of the metadata, not data. You upload this to the Octopai account that we’ve created for you. That process takes about 30 minutes. That’s it. 30 to 45 minutes.
It depends on how many business intelligence systems you have in your landscape. No preparation, no documentation, no interviews, no custom development, no professional services are needed. Just extract the raw metadata. Once you’ve uploaded the metadata, we need one to two days to analyze this once, and then you get access and start working just like Amichai demoed to you guys.
We believe in a couple of things. First of all, we need to shift from manual to automation. Automation comes into place by the fact that you leverage available technologies like algorithms and machine learning and pattern analysis, and very thorough parsing of different facets of metadata from different tools, to centralize them in a single repository and leverage the power of the cloud.
The second is the fact that at the end of the day, you want to be able to use the insights around your metadata, not to do physical work in order to be able to use those insights. With that said, we’re open for questions. I got some questions here. Some have to do with what the process is to load Octopai with all relevant metadata, to get data mapping. As we just described, to get Octopai up and running, it takes about 30 minutes to download the relevant extractor, extract the metadata, and upload it, and within a day or two, you get access and start working with the product.
If there’s any additional questions, I’ll be happy to answer either if you put them on chat or maybe send them over by email. We’re open for questions. Here’s one question. Will the conference video be available in someplace? The answer is yes. This video is being recorded. Our marketing is going to capture this recording and probably enable this to be viewed or downloaded or accessed through the website.
Melissa from our Marketing team is going to send all of you an email with where exactly you can either download it or just access it to listen to. Another question was, what’s the backend database? Our product runs on Microsoft Azure. We’re using a relational database and some other techniques to capture the metadata, to store the metadata, and to analyze the metadata. All of this analysis is being done seamlessly for the clients. From the customers’ perspective, once you subscribe to start using Octopai, the infrastructure is already included in the price.
The only thing you’re buying is the access to use Octopai, so you can do lineage discovery, use the business glossary, and so on. All the infrastructure for the analysis that is being done in the cloud by Octopai is already included in the price. There’s no additional cost whatsoever for you guys to start using Octopai. Also, it’s worthwhile to say that we believe in democratizing the ability of everybody in the business intelligence group to understand the full data journey.
This is why when we price Octopai, we price it by the number of metadata sources that we analyze. We do not price it by the number of users. Any number of users within your organization that you would allow to get access to Octopai can enjoy the functionality, and it can be leveraged by everybody, which increases the ability to have a decent conversation about the data movement process.
Another question is, is Octopai a data catalog like Alation? To a certain extent, in the business glossary and data catalog domain, the answer is yes. But Octopai is not limited to being just a data catalog. We’re looking at the data catalog as a result of very thoroughly understanding metadata from a huge variety of sources, which is a little bit different from other companies’ approach, where they are focused on a certain area, either data governance or data catalog or data lineage.
We’re looking at these capabilities as derived from a very thorough technology that enables capturing different facets of metadata, parsing this, and analyzing and modeling this very, very properly. If you do a good job in that through technology, then you can create tools like data catalog, business glossary, version management, data discovery, and data lineage. We’re looking at this as a result of a technology, and it seems that our clients feel very, very good about what we have to offer.
More questions. Do you have on-prem and cloud deployment models? We do. Nevertheless, I think that like 99% of our clients are leveraging the cloud version simply because they don’t want to deal with local on-prem infrastructure or have to bother with versions of our product and remote installing and maintaining of the application. As a matter of fact, one of the things that we feel very proud of is the fact that we bet on the cloud four to five years ago, knowing that organizations will move more and more to the cloud.
They want to get a service. They don’t want to invest in local infrastructure to do that. If you look at Octopai’s website, you can see some of the clients that we work with in the banking, in the financials, in healthcare, pharmaceutical, government, manufacturing, automotive, all of them are cross-vertical, cross sizes, cross geographies, and they all use Octopai in the cloud.
If there’s one thing they wanted from an economic point of view, it is to not invest in order to be able to have data lineage, but rather to just use data lineage. They just need to extract the metadata, encrypt it by running this through a vault to their secured account that we created for them on Microsoft Azure or AWS, and just get access and work. This is our belief. Any more questions? Not at this point. There are some others that are more technical.
I’ll be more than happy to answer later. If there are more questions that you want to share, you can either contact myself or contact Amichai, who is the more technical guy, I would say. We’ll be more than happy to show you a further personalized demo related to specific use cases, or even do a trial.
Again, the fact that the product runs on the cloud requires zero effort from your end besides extracting metadata. We’ll be happy to do a trial, analyzing your metadata free of charge, so you can see what your metadata is going to look like after being analyzed at Octopai and compare that with the effort that you’ve invested manually in the past couple of weeks to deal with use cases. What would it look like when you’re using Octopai?
We’ll be more than happy to answer any additional questions that you may have. Thank you so much for your time today. I hope that you’re all healthy and safe. We feel very, very privileged and proud that you’ve allocated the time to listen to us. If you give us a chance, we’ll be more than happy to show you how you can leverage our solution and provide you value as we do with so many others. Amichai, thank you very much for demoing this. Thank you everyone for joining, and have a great rest of the day.