It is interesting how the pace of technological advances opens gaps in best-practice frameworks and architectures. The concept of data mesh made us appreciate the benefits of a decentralized, sociotechnical data architecture, yet that same decentralization can become a challenge when we envision developing enterprise deep learning models.
Let us assume you plan to build an LLM that helps logistics managers navigate exceptions, such as a shipment running late or stuck (say, in the Panama Canal). The data set needed to train the model must span various sub-functions and teams, and this is where the challenges come into play.
While data mesh may work very well for BI, analytics, and advanced analytics, each of its core advantages can create challenges when training deep learning models. Let us explore a few of them:
Domain ownership – One of the four basic principles of data mesh is that domain teams own their operational and analytical data. Compared to a data lake, where operational data files from across the organization can be replicated, dumped, and then processed through a pipeline to build training data, this principle creates additional effort: there is no single pool to draw from.
While other principles, like self-serve data infrastructure (discussed below), address this challenge by building capabilities for a centralized exchange infrastructure, that approach falls short when you are looking for more than analytical or plain operational data. When training an LLM, you need training data in a format that “speaks the story.” Not just numbers. A story. Don’t forget that you are training language models.
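To make “speaks the story” concrete, here is a minimal sketch of turning a structured shipment record into narrative training text. The record fields, values, and wording are my own illustrative assumptions, not a real schema:

```python
# Hypothetical shipment-exception record; field names are assumptions.
shipment_event = {
    "shipment_id": "SH-10482",
    "status": "delayed",
    "location": "Panama Canal",
    "delay_days": 6,
    "resolution": "rerouted via rail; customer notified; SLA credit issued",
}

def to_training_text(event: dict) -> str:
    """Render one operational record as the kind of prose a language model learns from."""
    return (
        f"Shipment {event['shipment_id']} was {event['status']} for "
        f"{event['delay_days']} days at the {event['location']}. "
        f"It was resolved as follows: {event['resolution']}."
    )

print(to_training_text(shipment_event))
```

A domain team that publishes only tables and metrics gives you the numbers; someone still has to produce the story.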
With multiple teams owning their respective data, more coordination and collaboration are required. While collaboration is powerful, we know how multiple stakeholders behave in the real world. Hence, this principle adds legwork when training large models that need comprehensive, enterprise-wide data.
Self-serve data infrastructure – This is where decentralization in data mesh becomes a bit confusing: the principle proposes domain-agnostic functionality, tools, and systems that enable domain teams to consume data when needed. In practice, this gravitates towards building and managing a centralized technical infrastructure that allows data consumption across domains, since the domains themselves own the data.
As noted above, this approach falls short when you need more than analytical or plain operational data: an LLM needs training data that speaks the story. That is why developing these models will be challenging, but once you have such a model, you will have more power in your hands than you deserve (lousy humor). When a ship is stuck in the Panama Canal, the model should respond to a prompt asking for a solution with what worked last time and what did not.
To do that, the training data needs transformations that the existing self-serve capabilities in a data mesh may not be able to provide. For the same reasons, the data-as-a-product principle will also fall short when building custom LLMs.
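As a rough sketch of why, consider stitching records owned by three different domain teams into a single prompt-and-completion pair for fine-tuning. The domain names, fields, and the shipment_id join key are all assumptions for illustration:

```python
# Each dict stands in for a data product owned by a different domain team.
logistics = {"shipment_id": "SH-10482",
             "blocker": "vessel queued at the Panama Canal",
             "delay_days": 6}
customer_service = {"shipment_id": "SH-10482",
                    "customer_impact": "two priority orders at risk"}
operations = {"shipment_id": "SH-10482",
              "past_resolution": "air-freighted the priority items and rerouted the rest"}

def build_example(*records: dict) -> dict:
    """Join per-domain records on shipment_id and render a prompt/completion pair."""
    merged: dict = {}
    for record in records:
        # Naive join: all records must describe the same shipment.
        assert record["shipment_id"] == records[0]["shipment_id"]
        merged.update(record)
    prompt = (
        f"A shipment is stuck: {merged['blocker']} "
        f"({merged['delay_days']} days so far). "
        f"Impact: {merged['customer_impact']}. What has worked before?"
    )
    completion = f"Last time, the team {merged['past_resolution']}."
    return {"prompt": prompt, "completion": completion}

print(build_example(logistics, customer_service, operations))
```

The cross-domain join and the narrative rendering are exactly the transformations a generic self-serve platform, built to publish each domain's data as-is, may not offer out of the box.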
My view on data architectures is that we should rise above these buzzword frameworks and build customized architectures based on requirements. A couple of weeks ago, I suggested modifying data mesh into a better framework and was going to write about it. However, as I thought through the implementation bottlenecks of a possible enterprise LLM use case, I realized that what I envisioned would also fall short.
We gravitate towards frameworks because they often save us the hassle of reinventing the wheel. However, technology is increasingly becoming a space where off-the-shelf frameworks will not be optimal and should not be applied wholesale. More to come on that in a separate article.

