In theory, the advantages of a modern Data Lake over a legacy data warehouse are quite obvious. In practice, they are not. Many businesses already have data warehouses that are proven to work with their existing analytics workflows. As a result, the promise of improving these workflows with a new strategic approach is often met with strong skepticism and detailed questions.
Fortunately, we have answers. We have years of cutting-edge Data Lake experience, including global-scale data solutions based on Microsoft’s Azure cloud platform.
In this blog series, we have compiled our answers to some of the most common questions we have encountered along the way. We hope you will find them useful as you explore your own Data Lake opportunities.
“Do you need to have an end-state in mind when building a Data Lake?”
For the architecture, yes. For the data transformation and integration details, no. You should have an end-to-end vision of your target architecture that covers:
The technologies used;
The connections between those technologies;
High-level data flows, split between common/framework parts and business-specific items;
What should be supported on a global level and what should be specified per application, business, region, etc.
The rules for the split between the Data Lake and the Data Hubs should also be clarified there.
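One way to make such a vision concrete is to write it down as a small, checkable structure. The sketch below is only an illustration under our own assumptions: the class names (`DataFlow`, `ArchitectureVision`), the component names (`ingestion`, `lake_storage`, `data_hub`) and the settings are hypothetical placeholders, not a reference to any particular product or to a specific project.

```python
from dataclasses import dataclass, field


@dataclass
class DataFlow:
    """A high-level data flow between two components (hypothetical model)."""
    name: str
    source: str
    target: str
    scope: str  # "common" (framework part) or "business" (business-specific item)


@dataclass
class ArchitectureVision:
    """Illustrative container for the points an end-to-end vision should cover."""
    technologies: set[str]
    connections: set[tuple[str, str]]
    flows: list[DataFlow]
    global_settings: dict[str, str]
    # Settings specified per application, business or region override the global ones.
    regional_overrides: dict[str, dict[str, str]] = field(default_factory=dict)

    def settings_for(self, region: str) -> dict[str, str]:
        """Return the global settings with any overrides for the region applied."""
        return {**self.global_settings, **self.regional_overrides.get(region, {})}

    def undeclared_flows(self) -> list[str]:
        """Return names of flows that use a connection the vision does not declare."""
        return [
            flow.name
            for flow in self.flows
            if (flow.source, flow.target) not in self.connections
        ]


# Placeholder values, chosen only for illustration.
vision = ArchitectureVision(
    technologies={"ingestion", "lake_storage", "data_hub"},
    connections={("ingestion", "lake_storage"), ("lake_storage", "data_hub")},
    flows=[
        DataFlow("raw_landing", "ingestion", "lake_storage", scope="common"),
        DataFlow("sales_reporting", "lake_storage", "data_hub", scope="business"),
    ],
    global_settings={"file_format": "parquet", "retention_days": "365"},
    regional_overrides={"emea": {"retention_days": "90"}},
)

print(vision.settings_for("emea"))
print(vision.undeclared_flows())
```

Keeping global settings and regional overrides in separate places mirrors the split described above: what is supported globally is stated once, and each region only records where it differs.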