Sign up for Data Mesh Understanding’s free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Vanya’s LinkedIn: https://www.linkedin.com/in/vanyaseth1809/
In this episode, Scott interviewed Vanya Seth, Head of Technology for Thoughtworks India and Global ‘Data Mesh Guild’ Lead for Thoughtworks. To be clear, Vanya was only representing her own views on the episode.
Some key takeaways/thoughts from Vanya’s point of view:
- Data mesh is at a similar inflection point to where microservices was a decade ago. Let’s not relearn all the hard lessons they already learned. We should adapt/contextualize to data of course but we can skip a lot of the anti-patterns.
- Similarly, many people are stuck thinking “there’s no way that could work” regarding data mesh, just as they were when people first suggested combining development and operations into DevOps. It’s understandable – it’s hard to imagine a post-monolithic world when all you’ve known is monoliths.
- ?Controversial?: We should try hard to prevent creating the fear of missing out (FOMO) for those not doing data mesh. If data mesh isn’t right for your org, especially if it isn’t right at this time, that’s perfectly okay. Don’t take on the overhead cost of data mesh if it won’t bring more value than cost. Scott note: PREACH!
- ?Controversial?: Some CDOs or CAOs sit in organizations that don’t really get the value of data, so they implement data mesh to try to prove out value and make their mark. That can obviously create issues if their organizations aren’t ready.
- A few indicators an org is ready for data mesh (see below for expanded context): A) data/AI investments are not delivering the promised/expected returns and/or it’s hard to point to the value delivered in general from data/AI investments; B) the organization is attempting to throw more people at centralized data management and it’s not working (platform included); and C) there’s extremely unclear ownership around many aspects of data, especially who owns aspects of hand-offs or who owns the end data asset – how can a consumer actually ask a question about the data with no clear owner?
- “Innovation in queue syndrome” = your innovation agenda is “in queue” and keeps getting deprioritized because you are dealing with everything else first just to keep your data practice flowing.
- Use value stream mapping to understand how your organization drives value from business processes and where there is value leakage. Especially useful if data work isn’t driving value.
- We should take a lot of learnings from how microservices service discovery evolved, especially the tooling, and apply them to data mesh. There is no need to reinvent the wheel on this.
- Some existing tooling from the microservices space is just fine for data mesh too. We don’t need to invent new tools when existing ones – which are already robust and mature – can be extended or even used as is.
- Platforms aren’t about the tooling, they are about the holistic user experience – how do you stitch things together to automate the toil and let users focus on what matters? The tooling is under the hood, not the main interface.
- Users of your various data platforms should not be directly interacting with tools for the most part. It should be about abstracting away the tools and making it easy for them to interact with data, not the tools of the platform.
- !Crucial!: “Choose your blast radius.” Far too many are looking to change the entire organization at the start of a data mesh journey instead of limiting scope to a reasonable level. Find one “courageous” domain to move forward.
- “Nothing succeeds like success itself.” Get to a data mesh win that you can tout quickly so others will get bought in and see the value and want to participate. Incremental value delivery builds interest and momentum.
- Build your platform at the same time as you’re building your initial data products. Far too many platforms are built with tools as the focus instead of automating away toil and focusing on necessary capabilities.
- !Crucial!: Evolvability should be a first class concern when building your platform, just like with any product. You must be able to continue to improve and change to meet needs.
- Focus on the abstractions and the ubiquitous language – e.g. business people don’t care what the technical underpinnings of a data product are, they care about what it means for them and how they can access/leverage it.
- When starting your data mesh journey, look at the use cases to decide how much of each pillar you really need. Don’t overbuild early. If you only need a minuscule amount of governance, great. If you don’t actually need the producing team to be overly involved in ownership, awesome. Don’t go for full data mesh at the start.
- What you should focus on relative to your early journey is unique to your own situation and use case. Don’t worry about competitors or how others are starting – their circumstances are their own.
Vanya started with a bit about her background and how deeply entrenched she’s been in the microservices space – that played into the overall conversation a lot. Both Vanya and Scott agree if we want to do data mesh right, we really should take learnings from microservices and DevOps so we don’t have to relearn what they already did the hard way.
For Vanya, data mesh is at a similar inflection point to where microservices was a decade ago – people were extremely skeptical that developers and operations could even work together, much less be combined into a singular approach with DevOps. It’s hard to imagine a post-monolith world when all of your career and experience is with monoliths. We have to be somewhat kind to those people in understanding that change is hard and scary 🙂
But, as a counter, for data mesh Vanya believes (and Scott agrees) we must try to prevent creating the same fear of missing out (FOMO) that microservices had. For many, if your organization wasn’t doing microservices, it wasn’t seen as a cool place to work – the perception was that all the best developers were at companies doing microservices. We don’t want that in data mesh because it will lead to lots of wasted effort for companies that shouldn’t be doing data mesh now, or potentially ever.
According to Vanya, there are a few really good indicators an organization might be ready for data mesh. Before we get into the 3 she listed, a few things that might be indicative of indicators (Scott note: I know, I know, silly Scott phrasing) are: constant displeasure with the kinds of initiatives the organization has been doing in the data and AI space – there is constant pressure to prove the value of data and AI investments but, really, an inability to do so; long and lengthening cycles to see returns on data work/projects; and, a biggie, an ever-growing platform that is trying to do too much and hasn’t been delivered – trying to boil the ocean.
So the 3 indicators data mesh could be a good fit that Vanya listed were:
1) Investments in data and AI aren’t delivering expected value and it’s hard to actually point to the value that is being delivered. Users aren’t getting “the right data at the right time with the right quality”.
2) Large and growing central data teams where trying to scale is done by throwing more people at the problem and it just isn’t working. When automation would be better, they add people.
3) Confusion around who owns data when and why. Who owns the handoff between systems? Who owns the documentation and metadata around data? When someone has a question, how hard is it to find who owns the data?
Vanya highly recommends using value stream mapping to understand how you drive value with business processes and especially where there are value leakages; this can be data-related or not, and should be applied to both analytical and operational data processes. It helps you better understand your business processes and expected outcomes – if something didn’t meet expectations, was that because expectations were wrong or did something happen along the way to lose value? Value stream mapping gives you an objective and neutral starting point and helps identify problem areas – value leakage – so you can prioritize what to tackle first.
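To make the value-leakage idea concrete, here is a toy sketch of one common value stream mapping calculation: compare the hands-on “process time” of each step against its total elapsed “lead time” (queues and hand-offs included), and flag steps where waiting dominates. All step names, figures, and the threshold are illustrative assumptions, not anything from the episode.

```python
# Hypothetical toy model of a value stream map. Steps where lead time
# dwarfs process time are candidate value-leakage points to prioritize.
from dataclasses import dataclass

@dataclass
class Step:
    name: str
    process_hours: float  # actual value-adding work
    lead_hours: float     # elapsed time including queues/hand-offs

def leakage_points(stream, ratio_threshold=5.0):
    """Return names of steps whose wait overhead (lead/process) exceeds the threshold."""
    return [s.name for s in stream if s.lead_hours / s.process_hours > ratio_threshold]

stream = [
    Step("ingest source data", 4, 8),
    Step("wait on central team ticket", 1, 80),  # classic queue-based hand-off
    Step("model + validate", 16, 24),
    Step("consumer sign-off", 2, 40),
]

print(leakage_points(stream))
# → ['wait on central team ticket', 'consumer sign-off']
```

In this toy stream, the hand-off to a central team and the sign-off step are where value leaks – exactly the kind of friction points a mapping exercise surfaces for prioritization.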
In microservices, Vanya pointed to how challenging service discovery became until tooling came along – she specifically mentioned Consul – so we really don’t have to reinvent everything in data mesh. The tools out there, especially in the open source space, are making nice progress – she specifically mentioned DataHub – compared to where they were 2 years ago at the infancy of bleeding-edge data mesh adoption. Overall, we should 1) look to existing tools to see if we can use them as is; 2) look to extend existing tools where possible to cover incremental needs specific to data mesh; and only then 3) look to create new tooling for data-specific challenges. Again, don’t reinvent the wheel.
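The discovery pattern Vanya references can be sketched in a few lines: producers register a product under a well-known name with ownership metadata; consumers look it up by name instead of hard-coding locations. A real implementation would sit behind a catalog like DataHub or a registry like Consul – the class, names, and endpoint below are purely an illustrative in-memory stand-in.

```python
# Minimal, hypothetical sketch of the service-discovery pattern applied
# to data products: register by name, discover by name, never hard-code
# locations in consumer code.
class DataProductRegistry:
    def __init__(self):
        self._entries = {}

    def register(self, name, owner, endpoint, **metadata):
        """Producers publish a product's owner, endpoint, and any metadata."""
        self._entries[name] = {"owner": owner, "endpoint": endpoint, **metadata}

    def discover(self, name):
        """Consumers resolve a product by name; unknown names fail loudly."""
        entry = self._entries.get(name)
        if entry is None:
            raise KeyError(f"no data product registered under {name!r}")
        return entry

registry = DataProductRegistry()
registry.register(
    "inbound-marketing-conversion",
    owner="marketing-domain-team",
    endpoint="s3://example-bucket/marketing/conversion/",  # hypothetical location
    format="parquet",
)

print(registry.discover("inbound-marketing-conversion")["owner"])
# → marketing-domain-team
```

Note how this also answers Vanya’s ownership indicator: the registry entry makes “who owns this data product?” a lookup, not an archaeology project.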
For Vanya, one thing many organizations struggle with in data mesh is the self-serve platform – what is the goal? Circling back to an earlier point, it’s not about building the most amazing, ocean-boiling platform. It’s about stitching tools together to automate the toil away – how can you create a holistic user experience to focus on doing the value-add? The value of the platform to the users is the abstractions away from the tools that make it easy to focus on what needs to be done to drive value from data, not play with the shiny tools. Focus on enabling interacting with the data, not the tools of the platform.
“Choose your blast radius” is a key phrase for Vanya. Think about scope appropriately and don’t try to bite off more than you can chew. You don’t have to reorganize your entire organization on day one to do data mesh, that is far too much of an upfront cost and makes failure a massive cost. Look at how it was done well in microservices: thin slices, not taking a sledgehammer to the monolith. Gradual evolution is sustainable, a revolution either succeeds or it doesn’t – don’t take on risk that isn’t actually beneficial!
“Nothing succeeds like success itself,” was another line from Vanya. It’s crucial to get to an early win or two to show off to the rest of the organization proving data mesh delivers value and getting them interested in participating. ‘Hey, we did this and it was a big win, who’s next?!’ It’s not just about showing value, it’s about showing there was a reasonable encapsulated timeline, not just promises. That incremental value delivery creates momentum and the more momentum you have, the more you can get people on board.
As many past guests have noted, in Vanya’s view it is a pretty bad (Scott note: fully terrible) idea to build the platform in isolation and only bring it to the users when it’s done. There are far too many unexpected friction points, and finding those and tackling/automating away the actual friction is where the platform adds value, not bells and whistles. You want to find those friction points as they emerge and work in tight feedback loops – that’s product thinking! And if you don’t make evolvability a first-class concern, you are not building your platform as a product either.
For Vanya, it’s pretty easy for tech people to focus on the tech, whether in data or not. But the overall organization doesn’t care about the tech, they care about what they can do. So it’s crucial to find the ubiquitous language and make your implementation and platform about what people are trying to do. The user isn’t accessing S3, they are accessing the Inbound Marketing Conversion data product. S3 is simply a mechanism for accessing the data and insights.
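The abstraction Vanya describes might look something like the facade below: consumers ask the platform for a data product by its business name, and the mapping to physical storage stays an internal detail. The class, the product-to-URI mapping, and the stubbed fetch are all hypothetical illustrations, not a real platform API.

```python
# Hypothetical sketch of a platform facade: consumer code speaks the
# ubiquitous language ("Inbound Marketing Conversion"); storage details
# (S3 URIs, warehouse tables, ...) never leak out of the platform.
class DataPlatform:
    # internal mapping from business names to physical storage (illustrative)
    _locations = {
        "Inbound Marketing Conversion": "s3://example-bucket/marketing/conversion/",
    }

    def read(self, product_name):
        """Consumers only ever pass the data product's business name."""
        uri = self._locations.get(product_name)
        if uri is None:
            raise LookupError(f"unknown data product: {product_name!r}")
        return self._fetch(uri)  # the URI stays inside the platform

    def _fetch(self, uri):
        # stand-in for the actual storage access (boto3, JDBC, ...);
        # returns canned illustrative rows here
        return [{"campaign": "spring-launch", "conversions": 42}]

platform = DataPlatform()
rows = platform.read("Inbound Marketing Conversion")
```

The design point is that swapping S3 for anything else changes only `_locations` and `_fetch` – consumer code, written in the ubiquitous language, is untouched.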
When considering your thin slice early in your data mesh journey, it’s okay to have a very unbalanced slice in Vanya’s view. This has been mentioned before but it’s important to reiterate. If you only need a bit of one of the pillars but you do need more capability in another of the pillars, that’s absolutely okay. Don’t build today for all the problems of 6mo from now. You want to focus on tackling the toil of today.
Vanya’s phrase “innovation in queue” is when an organization keeps putting off their innovation agenda for more immediate concerns – everything innovative ends up getting deprioritized in the queue.
Most data mesh journeys are taking six to seven months to really prove out data mesh and its value. Scott note: this seems to be standard for larger organizations but a complete POC means faster follow-on for additional use cases. It’s a balance!
In some organizations where data is not really valued, CDOs or CAOs look to implement data mesh to show the value of data, but their organizations often aren’t ready, and trying to do data mesh just creates more challenges than benefits.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/