Data Mesh Radio Patreon – get access to interviews well before they are released
Episode list and link to all available episode transcripts (most interviews from #32 on) here
Provided as a free resource by DataStax AstraDB
Ole’s Book (O’Reilly Early Release): https://www.oreilly.com/library/view/the-enterprise-data/9781492098706/
In this episode, Scott interviewed Ole Olesen-Bagneux, an Enterprise Architect who focuses on data at GN and the author of an upcoming book on data catalogs with O’Reilly. To be clear, Ole was only representing himself and not GN.
The two main topics, which are somewhat intertwined, were: 1) how can we better understand and handle the concept of a domain when discussing data; and 2) how can we build systems that better enable us to search “for” data, not just search “in” data that we know exists?
Some practical advice and general conclusions from Ole:
- Leverage what the Library and Information Sciences discipline – which is centuries (millennia?) old – has formed around the domain concept. It will help you dig into the actual business dealings of the domain first, before focusing too much on the technical/software aspects. When you start from a DDD perspective in data, the software aspects hinder your initial domain mapping – especially in depth – and your understanding of the business context.
- Spend a lot more time on enabling people to understand what data is available. We focus a lot on optimizing for searching “in” data, but we don’t spend nearly enough time setting up our systems to allow people to search “for” data. To do that, work seriously on your metadata tooling and look for ways to harmonize data across those tools.
Ole started the conversation by sharing his view that Domain Driven Design (DDD) has some shortcomings when used in data in general, and for data domain mapping in particular. In his view, DDD is overly tied to software engineering, so there is too much of a technical bent to understanding and even mapping out domains. He recommends starting your domain mapping with the domain analysis and domain theory learnings from the Library and Information Sciences discipline, then bringing in DDD once you have a good initial understanding of your domains. DDD and domain analysis can work together harmoniously; they don’t really contradict each other, but domain analysis puts the knowledge first instead of the technical aspects.
While Ole was inspired by Zhamak’s book as well as the book by Piethein Strengholt, he believes domain analysis lowers the significant friction, and often frustration, that organizations feel when trying to start doing DDD for data. Domain analysis digs much more into what the domain does and why, instead of how the domain communicates via software. He believes that data mesh should focus more on the information sharing and less on the software, and that DDD will overcomplicate your domain mapping.
For Ole, DDD is overly concerned with modeling domains into software but you need to get to a deeper understanding of your domains and organization first before focusing in on your model. It may be that you truly can’t fully communicate your domain’s context in a data model either and it’s good to know that upfront and take steps to communicate in other ways, such as enhanced documentation.
Ole believes we focus too much on the data model and that this often sends people down the path of overly technically-focused solutions. Other guests have mentioned that documentation around data sets and data products is often much more focused on the technical aspects, not actually describing the information represented by the data. How do we store our data so we can make it usable for humans, not just software? We need to make it searchable, findable, understandable, etc., at both the micro level – a dataset or data product – and the overall macro data mesh level.
Building a semantic model is at least as important as a data model for Ole. We need to again focus on that searchability but what different search capabilities do we need? A simple search experience on a keyword or two, browsing what data is available, complicated queries with filters, etc. Can we enable querying by data lineage, by relationships with a knowledge graph, across a domain, etc.? It’s a different way of approaching data that is not similar to a data model.
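To make the different search capabilities a bit more concrete, here is a minimal sketch in Python of searching “for” data in a metadata layer. It is an illustration only, not something Ole described: the `DatasetEntry` and `Catalog` classes, their fields, and the sample datasets are all hypothetical. It shows three of the capabilities mentioned above: a simple keyword search, browsing by domain, and querying by lineage.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetEntry:
    """Hypothetical metadata record: it describes a dataset, not its contents."""
    name: str
    domain: str
    description: str
    upstream: List[str] = field(default_factory=list)  # names of source datasets


class Catalog:
    """Hypothetical in-memory metadata catalog used only for illustration."""

    def __init__(self, entries):
        self.entries = {e.name: e for e in entries}

    def keyword_search(self, term):
        """Simple search: match a keyword against names and descriptions."""
        t = term.lower()
        return sorted(
            e.name
            for e in self.entries.values()
            if t in e.name.lower() or t in e.description.lower()
        )

    def browse_domain(self, domain):
        """Browse: list every dataset registered under one domain."""
        return sorted(e.name for e in self.entries.values() if e.domain == domain)

    def lineage(self, name):
        """Lineage query: all datasets `name` is derived from, directly or indirectly."""
        seen = set()
        stack = list(self.entries[name].upstream)
        while stack:
            current = stack.pop()
            if current in seen or current not in self.entries:
                continue
            seen.add(current)
            stack.extend(self.entries[current].upstream)
        return sorted(seen)


# Made-up sample metadata for the sketch.
catalog = Catalog([
    DatasetEntry("orders_raw", "sales", "Raw customer orders from the web shop"),
    DatasetEntry("customers", "crm", "Customer master data with contact details"),
    DatasetEntry("orders_enriched", "sales", "Orders joined with customer segments",
                 upstream=["orders_raw", "customers"]),
    DatasetEntry("revenue_report", "finance", "Monthly revenue by segment",
                 upstream=["orders_enriched"]),
])

print(catalog.keyword_search("customer"))
print(catalog.browse_domain("sales"))
print(catalog.lineage("revenue_report"))
```

None of these queries touch the data itself; they only read descriptions, domains, and relationships, which is the difference between searching “for” data and searching “in” it.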
For a functional data mesh, Ole believes there needs to be a big focus on the metadata layer. You need to enable data consumers to find what data exists in their “knowledge universe”; focus on also serving the use case of searching “for” data, not just the typical searching “in” data for a specific answer.
In Ole’s view, possible ways forward on semantic knowledge sharing should come from the Library and Information Sciences space. They’ve been doing this for centuries in one way or another. We need to start thinking in a metadata way to move forward. We also need the industry to help develop better metadata tooling, and data practitioners to focus on what metadata tooling they have, working to harmonize data across those tools. It doesn’t all have to happen at once either; we can work from domain to domain on that metadata harmonization.
Ole finished the conversation talking about the fine balance between leveraging tooling and trying to do everything with tooling. There will be important roles for humans in the middle of knowledge sharing – whether they will be more consumer facing like a data concierge or more behind the scenes, we shall see, but Ole bets it will be the latter.
Ole’s LinkedIn: https://www.linkedin.com/in/ole-olesen-bagneux-2b73449a/
Early Release for Ole’s book: https://www.oreilly.com/library/view/the-enterprise-data/9781492098706/
Ole’s Other Recommended Reading:
Zhamak Dehghani’s Data Mesh book: https://www.oreilly.com/library/view/data-mesh/9781492092384/
Piethein Strengholt’s Data Management at Scale book: https://www.oreilly.com/library/view/data-management-at/9781492054771/
The Elements of Knowledge Organization: https://link.springer.com/book/10.1007/978-3-319-09357-4
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB