Sign up for Data Mesh Understanding’s free roundtable and introduction programs here: https://landing.datameshunderstanding.com/
Please Rate and Review us on your podcast app of choice!
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Episode list and links to all available episode transcripts here.
Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Kineret’s LinkedIn: https://www.linkedin.com/in/kineret-kimhi/
Kineret’s Blog Post ‘Do’s and Don’ts of Data Mesh’: https://medium.com/blablacar/dos-and-don-ts-of-data-mesh-e093f1662c2d
In this episode, Scott interviewed Kineret Kimhi, Analytics Lead at BlaBlaCar.
Some key takeaways/thoughts from Kineret’s point of view:
- !Interesting Decision!: BlaBlaCar reorganized their data organization but did not fully decentralize by embedding people into domains. Instead, they kept a central team but combined multiple functions into a squad around domains – a key domain might have a data engineer, data analyst, data scientist, and a software engineer.
- !Scott Mantra Too!: Sharing your experience – data mesh or otherwise – early and often with the broader data community means better and quicker feedback, not just internal experience. It’s okay to be vulnerable about what didn’t go well; you can get better info and help save others the same pain.
- ?Crucial?: It’s very important that when you split up your teams from functional data role teams, people keep in contact with functional role peers. If not, it can be very lonely as the only data engineer inside a domain. There is a significant turnover risk and a risk to not having scalable learning and knowledge transfer of data work if not handled well.
- Data mesh will lead to a lot of potential changes to people’s ways of working, especially with each other. Don’t shy away from that, people need to know you aren’t forgetting they need career development and that you’ll support them as they learn and get used to new ways of working.
- Communication really is the most important aspect of getting data mesh right. You need to get feedback and keep people informed, aligned, and in sync. Change is less painful if people are told why it is happening and are informed instead of discovering the results of change.
- Look to align as much of your data team as possible at the start of a data mesh journey – get everyone involved in the plan as best you can. But don’t completely change your data team structure at the beginning; start small.
- On that note, starting small means there isn’t a huge disruption to your ways of working and data org so a team/domain can get comfortable with new ways of working. You can get great feedback to make it easier/better for the domains that follow if you keep in close contact.
- When doing your data mesh PoC, look for simple use cases. Set your PoC up for success and focus on learning how to do data mesh rather than tackling the hardest challenges first.
- Reorganizing your data teams can be frustrating to non-data team stakeholders. You can’t have everything fall by the wayside while you learn to do data mesh. There needs to be a balance between learning a new way of working and continuing at least some semblance of business as usual around data work so you don’t cause major disruption to execs’ ways of working.
- It’s really easy to go wrong with governance early in a data mesh journey. Getting people on the same page and on the same tooling is crucial so there is a better shared understanding, e.g. lineage, observability, catalogue, etc.
- If you do not have key core aspects of data governance in place, get those in place before starting a data mesh journey or you will make it much harder on yourself than it should be.
- ?Controversial?: Data documentation needs to get disrupted because it is still far too manual and too difficult for consumers to really understand without just experimenting with the data themselves.
- ?Often Overlooked?: It’s important to recognize data mesh isn’t the right fit for every organization. Why are you looking to do data mesh? Does data mesh even address your data challenges?
- !Important!: Make certain to give your PoC the resources – including people – needed to succeed but also set it up to succeed. Give the space and limit the pressure on your PoC to really learn if data mesh is for you and if your organization is ready to do data mesh.
- As you get domain teams to start owning their data, it’s not a switch to flip. It’s a process, work with them to get them capable and don’t ask them to do overly complex things as they learn. Crawl, walk, jog, run.
- Make the team participating in your PoC feel like the pioneers and the potential heroes. Make it as easy on them as is practically possible and try to keep as much load off them as possible. Look to make participating in the PoC as beneficial to them as possible and not a burden.
- Set up a strong governance process to prevent schema changes from causing unnecessary downstream pain. But that can’t be on the centralized team to make the changes, that’s a bottleneck and doesn’t scale.
- Don’t expect data mesh to suddenly solve all your data challenges 🙂
Kineret started off the conversation saying she was previously running data engineering at BlaBlaCar but with their move to data mesh, that isn’t really a necessary role anymore so now she is the Analytics Lead. This was part of their greater reorganization of their data org where they are organized around domains into squads instead of by functions like data engineering. So crucial domains might have a data engineer, software engineer, data analyst, and data scientist in one squad focused on their data. So BlaBlaCar has a central data team but each squad is essentially attached to a domain. They kept their chapters around functional roles to keep knowledge sharing high and promote more camaraderie between similar roles attached in a squad to the different domains. BlaBlaCar sees the value in sharing their experiences early and often with the general data community so they can take in external feedback and also help out others looking to do similar things with data. Scott note: If I had a nickel for every time I tried to preach this… 🙂
On advice as to how to start a data mesh journey, Kineret was relatively insistent that you need to form a group of people to partner with on the transformation. No matter your title, you need to have people to lean on and get feedback from. You can’t drive it simply by force of will. She believes a few things helped make BlaBlaCar’s journey (more) successful. First, they got broad alignment around their data mesh journey, including around planning. It wasn’t just a small team of people; the whole data team worked together. The second aligns well with Zhamak’s advice too: start small. It allowed them to get people used to a new way of working instead of trying to shift the entire data approach of the organization at once. They focused on collecting feedback from everyone involved in the PoC so they could see how well it worked and so that future domains could replicate the successful parts and avoid the ways of working that didn’t go as well.
As part of their three-month PoC, Kineret and team took in a LOT of feedback from that single team. The reorg made a number of the data squad members attached to the domain feel lonely and disconnected from others in a similar functional role. That is where the chapters approach came back in, to keep people connected around their role functions. Kineret said she believes if they had tried to move all the domains to data mesh at once, their journey would have failed and they likely wouldn’t have kept their data people nearly as happy because they might not have implemented the chapters approach early enough.
According to Kineret, to keep your overall organization on board with your data mesh journey, it’s important to think about how stakeholders interact with the data team and keep that stable while you are in transition. If key stakeholders across the organization have to go through an entirely new and different process with each domain, those stakeholders are not going to be happy. So plan ahead, communicate what is changing, and have things well documented, not just at the data product level but at the business process level. It would be nice to do data mesh as a separate thing unto itself, but it’s part of the business strategy, so you can’t keep it in a bubble.
For Kineret, there have been some data mesh transition pain points, especially around people moving into different roles or day-to-day responsibilities, but communication is the key to keeping everyone aligned and limiting the unnecessary pain that often comes with change. If someone is used to talking with Alice in the central team about challenges with data and then Alice is suddenly in another domain, that data consumer will feel some frustration and concern about who to talk to and how they can build a good relationship with their new contact. But a key goal of data mesh is to make data consumers’ overall experience better. So while it might be a bit challenging as things change, keeping data consumers informed of change and making sure there are low-friction processes to get what they need are crucial.
Kineret believes data documentation as a whole needs to be disrupted. We generally have some necessary pieces but it’s still overly manual to properly document and consumers still generally can’t really understand the data without digging into it themselves. The documentation generally isn’t capable of doing what’s necessary to get someone up to speed on a data product. Even though her team is doing great, it’s still a challenge to find the right balance between what’s important and how in-depth to go. And it’s still very manual work to create the documentation. Scott note: very much true. Data documentation is still an incredibly difficult task to get right and is probably far more tedious than it should be.
When asked for some general data mesh getting started advice, Kineret had some beginning questions instead. What does your general data governance look like? If it’s not robust, you should look to set that up before decentralizing. You’ll save yourself a lot of unnecessary pain. Second, what is your buy-in for data mesh like and what is the reason for thinking data mesh is the right choice to solve your challenges?* Data mesh isn’t right for every organization. Third, are you really ready to do your PoC and give it the resources necessary to succeed? Are you setting your PoC up to succeed by not putting too much pressure or trying to tackle too hard of a problem? Lastly, can you find a PoC use case that is relatively contained so this doesn’t have too much outside influence and too many stakeholders? Can you clear the space to make it possible to succeed?
*Scott note: so much this; if centralization isn’t your bottleneck, decentralizing is far more likely to cause more issues. Be realistic about what data mesh can change and what it can’t. Don’t use an excavator to dig a 3-inch-deep trench in your garden…
A non-standard approach Kineret and team took was separating out ingestion and putting that on the data platform team. That way, the data domain teams could focus on cleaning and transforming data instead of setting up extraction from databases or other data stores. This was part of BlaBlaCar’s capability building and data ownership transition strategy. They didn’t ask or expect the world from the data domains as they were learning. Find relatively simple things for them to do instead of the most complex data engineering tasks. Crawl, walk, jog, run.
Heading into the end of the conversation, Kineret really emphasized how important it is to get the people aspects of something like data mesh right. Make sure people can feel seen and heard, keep people informed, keep people in touch with those doing the same functional roles so your technical folks don’t get too lonely, etc. Really make sure you focus much more than most technical people probably want to – the tech is cool! – on making this a positive transformative experience for the people in your organization, not just the organization itself.
Governance can be really difficult to get right but it is crucial early in a journey. You want everyone on the same page about ways of working but also on the shared governance tooling, e.g. observability, cataloging, and lineage.
Make sure to treat your initial PoC domain teams like pioneers and give them the support and guidance necessary. Keep undue burden off them as best as possible, keep in constant contact for feedback, and look to make it as beneficial as possible to them. Celebrate the PoC domain team because they went and did the big, kinda scary thing that could transform your organization.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/