We are aware that diving into new concepts isn’t an easy task, so we will try to keep the details clear and easy to follow. Below, we go deeper into the specifics and focus on the new concepts by comparing the features of a Traditional Data Platform with those of Data Mesh.
In the Data Mesh context, you can think of it as a Data Catalog. It allows everybody to create Data Products and to integrate metadata from your ETL tool (when and how a dataset was refreshed), so the dataset-related information in the catalog stays up to date (how many rows it has, which columns are available, etc.). It also contains information on how to access the data.
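To make this more concrete, here is a minimal sketch of the kind of entry such a catalog could hold for a single Data Product. The class and field names (DataProductCatalogEntry, last_refreshed, access_endpoint, and so on) are illustrative assumptions, not a standard schema:

```python
from dataclasses import dataclass, field
from datetime import datetime

# Hypothetical sketch of the metadata a Data Mesh catalog could keep
# for one Data Product; field names are illustrative, not a standard.
@dataclass
class DataProductCatalogEntry:
    name: str                      # e.g. "orders.daily_snapshot"
    owner_domain: str              # the domain team that owns the product
    last_refreshed: datetime       # pushed by the ETL/ingestion tool
    row_count: int                 # dataset statistics kept up to date
    columns: list[str] = field(default_factory=list)
    access_endpoint: str = ""      # how consumers can read the data

entry = DataProductCatalogEntry(
    name="orders.daily_snapshot",
    owner_domain="sales",
    last_refreshed=datetime(2023, 5, 1, 6, 0),
    row_count=1_250_000,
    columns=["order_id", "customer_id", "amount", "order_date"],
    access_endpoint="s3://sales-domain/orders/daily_snapshot/",
)
```

With this kind of metadata pushed by the producing domain, consumers can discover a Data Product, check its freshness, and know where to read it from without asking the owning team.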
But who provides the OLAP databases and the reporting layer (a vital part of a Traditional Data Platform)? In the Data Mesh context, they are not even mentioned.
Data Mesh doesn’t really cover the DWH, OLAP, and reporting parts, because each of them is just another downstream application that consumes Data Products.
In the Traditional Data Platform context, domains are mixed together, possibly separated into different Data Marts at the end. In the Data Mesh context, domains are limited to Data Products, and combining data in OLAP engines is beyond their scope. If a DWH is needed, it should be just another downstream application that uses the domains’ data and serves its own customers (as explained in the previous example).
In the Data Mesh context, a Data Product appears much earlier than in a Traditional Data Platform. Data Products act as independent building blocks for downstream applications. For example, when building a DWH you often have to create snapshots of the fact tables yourself, while a Data Product in the Data Mesh context offers these snapshots (versions) right away.
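As a rough illustration, assume an “orders” Data Product publishes immutable snapshots under a versioned path; the path layout and the pandas-based reader below are assumptions made for the sake of the example:

```python
import pandas as pd

# Assumed layout: the Data Product publishes immutable, versioned snapshots
# under a predictable path, so consumers never have to build their own
# snapshotting logic.
PRODUCT_ROOT = "s3://sales-domain/orders"

def read_snapshot(version: str) -> pd.DataFrame:
    """Read one published snapshot (version) of the 'orders' Data Product."""
    return pd.read_parquet(f"{PRODUCT_ROOT}/version={version}/")

# A downstream DWH job can pick exactly the snapshot it needs,
# e.g. the fact data as it looked on 2023-05-01.
orders_as_of_may = read_snapshot("2023-05-01")
```

The important point is not the tooling but the contract: versions are part of the Data Product itself, rather than something every consumer has to reinvent.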
To sum up, when porting from a Traditional Data Platform to Data Mesh, you have to split the platform into smaller, independent parts. With Data Products available at this “earlier” stage, you are free to design downstream applications in whatever way suits you best:
It can still be a DWH or an OLAP engine, but without any complex ETL logic: at most, you join Data Products into a star schema. In this case you will have to physically move data to the target location (a minimal sketch of this option follows below).
Alternatively, you can use data virtualization tools if you don’t want to move data around.
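For the first option, a downstream DWH job might look as simple as the sketch below: it only joins two ready-made Data Products into a star-schema fact table and lands the result in its own storage. All paths and column names are hypothetical:

```python
import pandas as pd

# A minimal sketch of option 1: a downstream DWH job that joins
# ready-made Data Products and moves the result to its own storage.
orders = pd.read_parquet("s3://sales-domain/orders/version=2023-05-01/")
customers = pd.read_parquet("s3://crm-domain/customers/version=2023-05-01/")

# No complex transformation logic: the heavy lifting (cleansing,
# deduplication, snapshotting) already happened inside each Data Product.
fact_orders = orders.merge(customers, on="customer_id", how="left")

# Physically land the joined table in the DWH's own target location.
fact_orders.to_parquet("s3://analytics-dwh/fact_orders/version=2023-05-01/")
```

The second option would express the same join as a federated view in a virtualization layer, so the data stays wherever the Data Products publish it.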
If you would like to learn more about data warehouse architectures, big data platforms, and data management platforms, we invite you to read this article.