Sign up for Data Mesh Understanding’s free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here.
João’s Twitter: @joaoasrosa / https://twitter.com/joaoasrosa
João’s personal space: https://www.joarosa.io
Kent Beck talk at DDD Europe 2020: https://www.youtube.com/watch?v=3gib0hKYjB0
Timothy Morton book on Hyperobjects: https://www.upress.umn.edu/book-division/books/hyperobjects
In this episode, Scott interviewed João Rosa, Principal Consultant at Xebia. They discussed domain driven design for data, the importance of intentionality in preventing chaos, being effective instead of efficient, and the concept of a hyper object.
To start at the end, João talked about the need to embrace complexity when dealing with software – and we need to treat data and analytics as a software process. If we try to abstract away the complexity, we lose the nuance, and that nuance is what can make all the difference in terms of the value of your data. Software is not like manufacturing, where complexity is very costly.
This was a pretty broad-ranging conversation starting with Domain Driven Design – or DDD – for data. João believes we should apply the principles of DDD to everything controlled by software – and when thinking of data as a product, data is definitely controlled by software.
One of the big challenges with bringing something like DDD to data is that there aren’t tools for it – and most challenges in the data space have historically been addressed with a tool-first approach. There is a desire to move quickly and just solve challenges, but in João’s view it’s not possible to do that with DDD. One very interesting point of view João holds is that developing software is a learning process and working software is a consequence of that learning.
With the move to cloud and the easy consumption of new tools, creating data is very easy. But João believes that in an enterprise, there needs to be very clear boundaries and contracts between domains to prevent overlap and confusion. The conversations between teams are hard because all of them are context-dependent. Even at the software level, your interface to your data products is a form of communication.
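As a rough illustration of the point about boundaries and contracts, here is a minimal Python sketch of a contract that one domain might publish for a data product, plus a check that a record conforms to it. Nothing here comes from the episode: `DataContract`, `violations`, the `leasing` domain, and the field names are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract a domain publishes for one data product."""
    domain: str    # owning domain (assumed name for illustration)
    product: str   # data product name
    fields: dict   # field name -> expected Python type


def violations(contract: DataContract, record: dict) -> list:
    """Return readable problems; an empty list means the record conforms."""
    problems = []
    for name, expected in contract.fields.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected.__name__}, got {actual}")
    for name in record:
        if name not in contract.fields:
            problems.append(f"unexpected field: {name}")
    return problems


# Illustrative contract; the domain, product, and fields are invented.
equipment_usage = DataContract(
    domain="leasing",
    product="equipment_usage",
    fields={"equipment_id": str, "hours_in_use": float},
)
```

The check is deliberately strict about unexpected fields. Whether a real contract should reject or tolerate extra fields is exactly the kind of question the teams on both sides of the boundary would need to talk through.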
João brought up the manufacturing-oriented philosophy of software development and why it causes so many challenges. It is very much about efficiency and lean development. That works well when you are producing physical goods but he doesn’t think it does for software. Small incremental changes to software are not costly in a CI/CD world but the creation of software is expensive. So we need to move away from the manufacturing approach. But that would mean management releasing more control, which many are not willing to do.
For João, there is also a major value to discovery about what you’ve already deployed. How are people using it, what is the market / consumer-base telling us? But in general, we spend far too much time focused on new features and not discovering new things about what is already in production. And those small incremental improvements are often the things that generate real value – and if the investment is small to generate good returns, those small changes are a significant point of potential value leverage.
João brought up Kent Beck who said “once software arrives to production, it changes itself”. Measuring that feedback is crucial. Data mesh, if done well, can really set up organizations to succeed because it can make people effective rather than efficient – we create data products that are easy to use and that can be consumed in ways we did not anticipate. People can discover new things. We lower the friction to new, useful insights. Efficiency is doing the task at hand with little waste. But is that effective in creating business value?
Intentionality is a key theme for João – if you have autonomy without direction, it can create chaos. In her episode, Jessitron (Jessica Kerr) mentioned the need for agency instead of autonomy. Autonomy is “you figure it out”, whereas agency, as João quoted Jessitron, is “you provide me the direction but not the path”. We should also be constantly assessing what we are trying to accomplish and whether we are actually headed in that direction. What is the business problem you are trying to solve? Apply intentionality to your work to stay focused on the real goals.
When ~80% of our time is spent trying to code and only ~20% is spent on setting our intentions, what is the outcome? João believes if we flipped that and focused much more on what we are trying to achieve, solidifying the communication before going and coding it, that would have a far better outcome.
Right now, João believes that data is where DevOps was about five years ago – we still, as an industry, need to build the body of knowledge on how to do this right. The DevOps engineer title is starting to fade and we are calling them what they are – platform engineers. But as with DevOps, we need to look at the long-term payoff of building a platform – not all organizations should build a platform!
João brought up the need to think about the long-term viability of all data initiatives, not just the platform. Data products must be sustainable – which is why so many guests have recommended starting with source-aligned data products. Lorenzo Nicora does a great job explaining why in his episode. One of João’s clients is leasing large industrial equipment and has switched to proactive maintenance instead of waiting for things to break and fixing them then. This has created a more reliable service for customers and lowered maintenance-related downtime costs. How can we apply that to data?
A hyper object is an object that spans time and space. João sees data as a hyper object, but we typically think of data as a snapshot in time. How do we store data today to answer the questions of tomorrow? And how do we apply intentionality to data so we stop storing data for the sake of storing it? This philosophy better enables us to think of data as a product and reason about the evolution of a data product.
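One way to picture the difference between a snapshot and data that spans time is an append-only history that can be asked what was known at any given moment. The Python sketch below is only an illustration under that assumption; the `History` class, its methods, and the example key are hypothetical and were not discussed in the episode.

```python
from datetime import datetime


class History:
    """Append-only record of (timestamp, value) observations per key."""

    def __init__(self):
        # key -> list of (timestamp, value) tuples, kept in insertion order
        self._events = {}

    def record(self, key, at: datetime, value):
        """Add an observation; earlier observations are never overwritten."""
        self._events.setdefault(key, []).append((at, value))

    def as_of(self, key, at: datetime):
        """Latest value observed at or before `at`, or None if there is none."""
        candidates = [(t, v) for t, v in self._events.get(key, []) if t <= at]
        if not candidates:
            return None
        return max(candidates, key=lambda tv: tv[0])[1]


# Illustrative usage with an invented equipment identifier.
h = History()
h.record("pump-7", datetime(2022, 1, 1), "ok")
h.record("pump-7", datetime(2022, 6, 1), "worn")
```

A snapshot would keep only the latest value for `pump-7`. Keeping the history means a question nobody thought to ask at write time, such as what state the equipment was in last March, can still be answered later, at the cost of deciding intentionally which histories are worth storing.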
To attempt to sum up João’s thoughts: focus more on intentionality (why are we doing something, and is it working?), embrace complexity, and look to solve more through conversation instead of tooling.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
All music used in this episode was found on PixaBay and was created by the following artists, with slight edits by Scott Hirleman: Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf