Data Mesh Radio Patreon – get access to interviews well before they are released
Episode list and links to all available episode transcripts (most interviews from #32 on) here
Provided as a free resource by DataStax AstraDB
In this episode, Scott interviewed Andrew Padilla, who runs a data and software consulting company – Datacequia – and serves as editor of the Data Mesh Learning community newsletter.
This one is a bit more philosophical, about sharing information/knowledge, so it’s one to sit with and think over. Things in quotes are direct from Andrew.
Some key takeaways/thoughts that come from Andrew’s view of data mesh and the data space in general:
- To move from sharing the 1s and 0s of data to actually sharing knowledge, we need to harmonize data, metadata, and code – “the digital embodiment of knowledge”. That’s where Andrew hopes data mesh’s data products can head.
- Software development isn’t cutting it for sharing knowledge. Will data product development? Do we need to move to knowledge-centered development instead? It remains to be seen.
- We still don’t know how to model well – in data – what is going on in the real world. What are the experiences of the organization? Can we really define an “organizational experience”? Event storming tries but often seems to fall short.
- We must learn to treat organizations like living entities. Organizational experiences cross multiple domains and the types of experiences will change, will evolve – possibly quite quickly. We again have to get better at modeling those and evolving how we share knowledge about the experiences.
- Knowledge graphs are the best way we currently have for combining information across domains. We still haven’t fully figured out how to leverage our cross-domain knowledge though.
- Historically, we’ve bent our ways of working to the limitations of the machines. We need to spend more time on bending the machines to better match the way humans store, process, and share knowledge.
- Data centricity is an interesting concept but might take our current imbalance between data and operational focus too far towards data. Then again, that might be what is necessary to really get to balance – it remains to be seen. But it’s crucial to understand that a data-first focus isn’t necessarily a knowledge-first approach, one with knowledge as a first-class citizen.
- It’s important to understand that mesh data products are a means to an end in data mesh. Yes, they are crucial to sharing information, but they serve a purpose; they are not the purpose themselves.
- In data mesh, it can be easy to focus too much on creating data products of immediate utility or that are high value in and of themselves. But it’s important to think about how data products together create value – and maybe not immediate value – to really drive forward our understanding of the organization’s knowledge and experiences.
Andrew started the conversation with his hope and vision for data products – or the data quantum – in data mesh. Historically, data, metadata, and code have not often been grouped together, and even less frequently have they been in harmony. They belong together, as that harmony creates a higher-level abstraction for sharing knowledge, not just the 1s and 0s of data. To get data mesh right, Andrew believes you have to really figure out how to build mesh data products with that harmonization in mind. And you probably won’t get it right at the start of your journey and that’s okay – we are all still figuring it out.
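To make the harmonization idea concrete, here is a minimal, purely illustrative sketch of a data product that keeps data, metadata, and code bundled as one shareable unit. All of the names (`DataProduct`, `describe`, `run`, the `orders` fields) are hypothetical, not from any data mesh specification:

```python
from dataclasses import dataclass, field
from typing import Any, Callable

# Hypothetical sketch: a "data product" that ships data, metadata,
# and code together, rather than sharing raw rows alone.

@dataclass
class DataProduct:
    name: str
    data: list  # the 1s and 0s: a list of row dicts
    metadata: dict = field(default_factory=dict)  # context: domain, owner, semantics
    transforms: dict = field(default_factory=dict)  # code that travels with the data

    def describe(self) -> dict:
        """Expose the metadata alongside the data's shape."""
        return {"name": self.name, "rows": len(self.data), **self.metadata}

    def run(self, transform_name: str) -> Any:
        """Run a transform that ships *with* the product."""
        transform: Callable = self.transforms[transform_name]
        return transform(self.data)

orders = DataProduct(
    name="orders",
    data=[{"id": 1, "total": 40.0}, {"id": 2, "total": 60.0}],
    metadata={"domain": "sales", "owner": "sales-team", "unit": "USD"},
    transforms={"revenue": lambda rows: sum(r["total"] for r in rows)},
)

print(orders.describe())
print(orders.run("revenue"))  # 100.0
```

The point of the sketch is only that a consumer receives the context (metadata) and the logic (transforms) in the same package as the data, which is one reading of the "harmony" Andrew describes.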
We possibly – or even probably – need to move from software development, and even data product development, to knowledge development, in Andrew’s view. Knowledge development means centering the development process on sharing knowledge. So much of the data we share lacks the actual context of what happened in the real world, the experiences of the organization. But we still don’t have a great way of sharing those organizational experiences – event storming in DDD (Domain-Driven Design) tries to address this but often falls short. How can we progress towards modeling organizational experiences?
For Andrew, as many have said, if we can figure out how to do data mesh well and create some good standards, it is very well-suited for cross-organizational knowledge sharing and collaboration. Not data selling but collaboration – similar to what Jarkko Moilanen mentioned in his episode’s discussion about the data economy. But it’s still early, and data mesh will not be a silver bullet for figuring out how to do that cross-organizational collaboration well – and, crucially, safely and compliantly.
We are just starting to understand, to develop a point of view on, what Andrew called the “knowledge of experience”. Per Andrew, “knowledge, by definition, is just the acquisition and use of experience and/or education”. We must learn to treat organizations like living entities: organizations have new experiences and are changed by them. And the types of experiences are also changing as the real world changes. In the 80s and 90s, much of the communication between entities was done by fax; faxes aren’t nearly as common for most organizations nowadays. But the evolution of experiences seems to be accelerating and we are still struggling to capture those experiences in data.
For Andrew, software and data must reflect or “be the embodiment of” those organizational experiences. Do we even know what it really means for an organization to have an experience, much less how to model it? And those experiences don’t take place in isolation within a single domain. So once we figure out experience modeling for a domain, we then get to figure out how to scale that to experiences across domains. Yes, we have some work ahead of us!
Knowledge graphs probably hold the key to cross-domain information sharing, per Andrew. He called them the “glue”. They are the best we currently have technologically for developing and leveraging the knowledge tie-ins across domains. They allow people to bring more of a “history of experiences” to the conversation.
In Andrew’s view, we’ve historically had to deal with the limitations of what technology could do when thinking about knowledge sharing. Our ways of working have bent to those limitations. But we should start trying to bend the machines more to the way humans store, process, and share knowledge – almost a higher-order law, similar to Conway’s Law: we must push the machines to communicate and work in the way humans think.
Scott asked about Dave McComb’s data centricity concept. Andrew thinks that data centricity might swing too far towards data only, instead of knowledge first or knowledge on an equal footing with the 1s and 0s of software. It might shake things up to move towards data centricity, but does it get us closer to knowledge as a first-class citizen? In a follow-up, Andrew said he sees the data, metadata, and logic as components that must work together in their specified functions, in harmony and balance, much like your vital organs. So just swinging to data doesn’t necessarily swing us towards knowledge.
Andrew brought up the concept of the 2D person in physics. What might be just a simple line in three dimensions – something you can easily avoid – is a full-stop blocker to the 2D person. Choosing between “is this data or is this code” is 2D thinking. We are living in a 3D world – even 4D when you include time – so we need to move past our current ways of working and think on a higher plane about how to approach our work.
Data products in data mesh are simply a means to an end for Andrew. They are the building blocks for building out your knowledge repository and knowledge sharing. But it’s crucial to understand that they exist for a purpose in the greater organizational sense; they are not the end accomplishment.
Andrew finished up by sharing his ideas around data monetization. If you are specifically thinking of data, even for internal sharing, as a monetary asset, does that put experimentation on the back burner? Do you only go for the sure bets? Or only focus on things you know have a specific use instead of things that are simply potentially valuable? Andrew thinks there is significant value in R&D incubation in general, and specifically for data. We aren’t yet in a world where producing those speculative or interesting – but without specific utility – data products is cheap, so maybe those come later. But it’s important not to become overly focused on the immediate utility of each data product itself instead of how it fits into the greater picture of your organization’s knowledge.
Andrew’s LinkedIn: https://www.linkedin.com/in/andrew-padilla-8988094a/
Datacequia website: https://www.datacequia.com/
Andrew’s personal Substack: https://datacequia.substack.com/
Data Mesh Community newsletter Substack: https://datameshlearning.substack.com/
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB