For more than a decade, organisations have embraced data lakes to overcome the technical limitations of data warehouses and evolve into more data-centric entities. While many organisations have used data lakes to explore new data use cases and improve their data-driven approaches, others have found the promised benefits hard to achieve. As a result, the effectiveness and ROI of many data lake initiatives are now under scrutiny.
The tech community’s view of data lakes has evolved as some organisations struggle to manage vast data stores and to avoid “data swamps”: massive repositories where data is dumped indiscriminately and stored but rarely used, leading to problems with discoverability and usability. Centralisation can create bottlenecks that slow access and analysis, and without rigorous governance, data quality can quickly deteriorate. In addition, the one-size-fits-all approach of data lakes often fails to address the specific needs of different business domains. As a result, the potential of data lakes often remains untapped: users struggle to extract value because they lack appropriate tools or because the data itself is too complex.
The discussion in the tech community has shifted to a more nuanced and adaptable data strategy called data mesh. It aims to overcome some limitations of centralised data lakes by promoting a more distributed, human-centric, and context-specific approach to data management. Data mesh assigns responsibility for analytical data to two kinds of domain-specific teams: those that build and run applications and produce transactional data (an e-commerce team, for example), and those that consume data and use it to gain insights.
To realise the full potential of data mesh, it’s essential to capitalise on the deep knowledge of the business context that the producing and consuming teams possess. Doing so means providing extensive training to existing team members and creating additional specialised roles within those teams, such as data product owner and data engineer.
A data mesh platform supports producers and consumers, making their work easier and more efficient. The platform teams do not create data products, nor do they store or process data themselves. Instead, they provide tools and infrastructure, train and advise producers and consumers, and moderate common standards and procedures in a federated approach.
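As a rough illustration of this division of labour, the sketch below shows one way a domain team might describe a data product it owns, alongside a shared check of the kind a platform team could offer as tooling. Everything here is hypothetical: the `DataProduct` fields, the `check_standards` function, the three rules it applies, and the sample values are assumptions made for the example, not part of any data mesh specification or product.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team publishes for a data product it owns."""
    name: str
    domain: str   # owning business domain, e.g. "e-commerce"
    owner: str    # data product owner inside the domain team
    description: str = ""
    schema: dict[str, str] = field(default_factory=dict)  # column name -> type


def check_standards(product: DataProduct) -> list[str]:
    """Return violations of the (assumed) shared standards; an empty list means compliant.

    The function only inspects metadata. It never stores or processes the data,
    which stays with the domain team.
    """
    violations = []
    if not product.owner:
        violations.append("missing owner")
    if not product.description:
        violations.append("missing description")
    if not product.schema:
        violations.append("missing schema")
    return violations


# Placeholder values for illustration only.
orders = DataProduct(
    name="orders_daily",
    domain="e-commerce",
    owner="jane.doe",
    description="Daily order totals per market",
    schema={"order_date": "date", "market": "string", "total": "decimal"},
)
draft = DataProduct(name="clickstream_raw", domain="e-commerce", owner="")

print(check_standards(orders))  # []
print(check_standards(draft))   # ['missing owner', 'missing description', 'missing schema']
```

The point of the design is that the domain team fills in and maintains the descriptor, while the commonly agreed rules live in one small function that every team can run against its own products.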
When done correctly, the data mesh model promotes a proactive approach to maintaining data quality, relevance, and accessibility, as well as tailoring data products to meet the unique needs of different business units. By closely aligning analytical data with its operational context, a data mesh facilitates more effective use and sharing of data across the organisation.