Data Lake Projects: An In-Depth Q&A Compilation by Lingaro

Written by Maciej Trembicki | Mar 22, 2021 9:16:59 AM

In theory, the advantages of a modern Data Lake over a legacy data warehouse are quite obvious. In practice, they are not. Many businesses’ data warehouses have been proven to work with their existing analytics workflows. The promise of improving these workflows with a new strategic approach is met with strong skepticism and detailed questions.

Fortunately, we have answers. We have years of cutting-edge Data Lake experience, including global-scale data solutions based on Microsoft’s Azure cloud platform.

In this blog series, we have compiled our answers to some of the most common questions we have encountered along the way. We hope you find them useful as you explore your own Data Lake opportunities.

 

Question

“Our ultimate goals are to eliminate SAP BW BEx and build a semantic layer. Will a Data Lake help us?”

 

Answer

Yes, with a data catalog and data lineage. While building the Data Lake solution, we should add two services:

  • Data catalogs

  • Data lineage

A data catalog captures both technical and business metadata, which can be used (with proper visualization and security) as a semantic layer across the enterprise. The catalog allows your organization’s users to discover, manage, and understand data from various sources, and because it is centralized, that metadata can be retrieved from one place. The next step is to integrate the data catalog and data lineage with reporting tools to create the semantic layer.
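To make the idea more concrete, here is a minimal Python sketch of what a catalog entry and a lineage lookup might look like. It is an illustration only, not the API of any specific catalog product: the `CatalogEntry` and `DataCatalog` classes, the dataset names, the storage paths, and the owners are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One dataset in a hypothetical data catalog."""
    name: str                 # technical metadata: physical dataset name
    location: str             # technical metadata: where the data lives
    columns: dict[str, str]   # technical metadata: column name -> data type
    business_term: str        # business metadata: the name users search for
    description: str          # business metadata: plain-language meaning
    owner: str                # business metadata: accountable team
    upstream: list[str] = field(default_factory=list)  # lineage: direct sources


class DataCatalog:
    """Central, in-memory registry of catalog entries (illustrative only)."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def find_by_term(self, term: str) -> list[CatalogEntry]:
        """Discover datasets by business term (case-insensitive substring match)."""
        needle = term.lower()
        return [e for e in self._entries.values() if needle in e.business_term.lower()]

    def lineage(self, name: str) -> list[str]:
        """Return every upstream dataset of `name`, nearest sources first."""
        seen: list[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            # Sources that are not registered in the catalog are listed but not expanded.
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


# Hypothetical example data; names and paths are placeholders.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="raw_sales",
    location="lake/raw/sales",
    columns={"order_id": "string", "amount": "decimal"},
    business_term="Sales Orders (raw)",
    description="Order lines as extracted from the source system.",
    owner="Data Engineering",
))
catalog.register(CatalogEntry(
    name="curated_sales",
    location="lake/curated/sales",
    columns={"order_id": "string", "region": "string", "net_amount": "decimal"},
    business_term="Sales Orders",
    description="Cleansed order lines enriched with region.",
    owner="Data Engineering",
    upstream=["raw_sales", "raw_customers"],
))
catalog.register(CatalogEntry(
    name="net_revenue_by_region",
    location="lake/serving/net_revenue_by_region",
    columns={"region": "string", "net_revenue": "decimal"},
    business_term="Net Revenue by Region",
    description="Net revenue aggregated per sales region.",
    owner="Finance BI",
    upstream=["curated_sales"],
))

matches = catalog.find_by_term("revenue")
sources = catalog.lineage("net_revenue_by_region")
```

In this sketch, a reporting tool would resolve a business term such as "Net Revenue by Region" to a physical dataset through `find_by_term`, and use `lineage` to show users where the figures come from. A production catalog would add access control, versioning, and persistent storage, none of which is modeled here.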

When building your end-state BI architecture, you should include a reporting tool that provides semantic layer functionality. This lets you standardize the business meaning of your data and gives end users a uniform experience, especially in corporate standard reporting and self-service scenarios. Eventually, the semantic layer should be integrated with the data catalog metadata, so that the catalog can function as a central knowledge repository and support proper management of your data assets.