Data Mesh Radio Patreon – get access to interviews well before they are released
Episode list and links to all available episode transcripts (most interviews from #32 on) here
Data Governance In Action: What Does Good Governance Look Like in Data Mesh – Interview w/ Shawn Kyzer and Gustavo Drachenberg
In this episode, Scott interviewed Shawn Kyzer, Principal Data Engineer, and Gustavo Drachenberg, Delivery Lead at Thoughtworks. Both have worked on multiple data mesh engagements including with Glovo starting 2+ years ago.
From here forward in this write-up, S&G will refer to Shawn and Gustavo rather than trying to specifically call out who said which part.
Some key takeaways/thoughts from Shawn and Gustavo’s point of view:
- It’s very easy for centralized governance to become a bottleneck. Make sure any central governance team/board that is making decisions has a way to quickly work through backlog through good delegation. Not every decision needs deep scrutiny from top management.
- To do federated governance right, you need to enable the enforcement – or often more appropriately the application – of policies through the platform wherever possible. Take the burden off the engineers to comply with your governance standards/requirements.
- Domains should have the freedom to apply policies to their data products in a way that best benefits the data product consumers. So if there are data quality standard policies, the data product should adhere to the standard for measuring completeness as an aspect of data quality but might be optimized for something other than completeness.
- The cost of getting anything “wrong” in data previously has been quite high because of how rigid things have been – the cost of change was high. But with data mesh, we are finding new ways to lower the cost of change. So it is okay to start with policies that aren’t complete and will evolve as you move along.
- If you have an existing centralized governance board, moving to federated governance will sometimes be … challenging at best … so you will need a top-down mandate to reshape the board. Look to get the necessary representation across your capabilities (e.g. product, security, platform, engineering, etc.) while avoiding creating a political issue if possible.
- Look to add incremental value through each governance policy. And look to iterate quickly on policy decisions where you can. Create a feedback loop on your policies to iterate and adjust. It’s okay to not get your policies perfect the first time, you can adjust them.
- Really figure out what you are trying to prove out in your initial proof of value/concept. If it’s full data mesh capabilities, that can easily take 4-6 months.
- An interesting incremental insight: Zhamak has warned about organizations trying to scale too fast as an anti-pattern that may result in lots of tech debt or a failure of your implementation.
- An interesting incremental insight: in all of the data mesh implementations S&G have worked on thus far, the initial data product has not had any PII as that adds significant complications probably beyond what the value add of including PII would be in most cases.
- Your data mesh implementation team should be 1-2 people from every necessary capability.
- Data mesh is a large commitment – resources, time, focus, etc. – so you need to be prepared to fund it for the long-haul. This isn’t an initial big-bang approach. But this is also why you should focus on continuous incremental value delivery once you get to delivering data products to keep up momentum.
- You will get things wrong as you move forward with your data mesh implementation. Look to limit the blast radius but it’s absolutely fine and expected that you will learn and improve. Data mesh gives people flexibility and flexibility allows for making changes. Set up fast feedback loops and look to iterate rather than trying to get it perfect the first time. Perfect is the enemy of done.
S&G started off by giving the four general states of data governance in most organizations: none, centralized, decentralized, and federated. Many organizations, even quite large ones, have little to no major data governance oversight. As previous guests have mentioned, many get fed up with data governance only being a cost center – especially if it doesn’t even offer much risk mitigation or regulatory compliance – and essentially do away with their data governance. Decentralized data governance is an anti-pattern in general, with each domain or line of business coming up with their own approaches, making collaboration across domain boundaries difficult at best – it’s like each domain is speaking a different language entirely. Many companies move to a centralized approach but that often quickly ends up becoming a blocker without pretty specific controls in place. Rigid plus low throughput isn’t great. Hence why data mesh pushes for federated governance – governance with a central group to make necessary decisions and policies but where the people who understand best actually apply the policies to their work – namely the data products.
So, per S&G, the federated governance structure in data mesh in general should be a centralized board or team representing many different constituencies throughout the organization necessary to make smart and informed decisions about policies. Then the policies are codified – and coded – into the platform for domains to easily apply the policies to their data products. The centralized team should focus on making quick decisions by delegating policy research and development to people within each of their own constituent groups – e.g. software engineering, platform, product, security, legal/compliance, etc. That way, the leaders on the centralized board don’t need to have all the context themselves to make smart decisions as the people they delegated to can ensure their constituent group’s needs are met. And the application of policies to data products at the domain level is made easy – or at least far easier – through automation. This setup gives the domains more freedom in how they apply the policies to the data products.
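To make the "policies codified into the platform" idea concrete, here is a minimal sketch of what platform-side policy checks could look like. All names (the `DataProduct` descriptor, the `Policy` registry, the specific checks) are illustrative assumptions, not anything S&G or a real platform prescribes; the point is that the central board decides *what* to check while the platform runs the checks automatically.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical descriptor a domain team submits when deploying a data product.
@dataclass
class DataProduct:
    name: str
    owner_domain: str
    output_ports: list = field(default_factory=list)

# A policy is just a named check the platform can run automatically.
@dataclass
class Policy:
    name: str
    check: Callable[[DataProduct], bool]

# Centrally decided policies, codified once, applied everywhere.
POLICIES: list[Policy] = [
    Policy("has_owner", lambda dp: bool(dp.owner_domain)),
    Policy("ports_described", lambda dp: all("schema" in p for p in dp.output_ports)),
]

def validate(dp: DataProduct) -> list[str]:
    """Return names of policies the data product violates; empty means compliant."""
    return [p.name for p in POLICIES if not p.check(dp)]

product = DataProduct(
    name="orders",
    owner_domain="fulfillment",
    output_ports=[{"name": "orders_v1", "schema": "orders.avsc"}],
)
print(validate(product))  # [] -> compliant, deployment can proceed
```

In a real platform this kind of gate would sit in CI/CD or the data product deployment path, so engineers get compliance "for free" rather than having to remember each rule.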
On speaking about greenfield versus brownfield for data governance, of course greenfield – meaning little to no data governance in place – is typically far easier according to S&G. It can be disconcerting to see large organizations with very large data practices and little governance but it’s at least easier to only have to focus on creating and training instead of evolving and unlearning too. Either way, to move forward, look to build out the CYA – cover your…butt… – aspects of data governance first and work to build a minimum viable data governance board. Then you can start to ask about needs and create a backlog to start working through. But again, make sure to focus the board on making decisions and impact, not as a political entity. Easier said than done but showing them how to make decisions quickly and efficiently is great. And, with data mesh, policies can be changed or enhanced later – you don’t have to get it perfect at the start.
If you are in a brownfield deployment of your governance board, it can be a political minefield, per S&G, as there may be overrepresentation of certain teams. But you need to work to have the right representation of needs, the right diversity of capability. There needs to be a top-down mandate to really reshape how your board is composed so you can get to that fast decision-making capability. As a reminder, this is somewhat counter to what Laura Madsen recommended in her episode but aims for the same outcome. Possibly look to disrupt your governance if it ever becomes too slow and a bottleneck.
So, you’ve got your governance board together – how do you get going for something like data mesh? According to S&G, you should first focus on policies that positively impact the technical people, e.g. that all output ports on your data products should be registered in the data catalog. And it’s okay to not get your policies 100% correct upfront – you can adjust. Use a feedback loop to take in information about missing policies or currently deployed policies: are they meeting people’s needs? Every policy should add incremental value. Security is obviously a policy area that could be considered cost-only, but it’s still quite important to address, and risk mitigation is a value-add.
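The catalog-registration policy mentioned above is a good example of something the platform can enforce mechanically. Here is a hedged sketch, assuming a simple catalog lookup; the catalog representation and port names are stand-ins, not a real catalog API.

```python
# Stand-in for a data catalog lookup; in practice this would be a call to
# whatever catalog the platform team has integrated.
CATALOG = {"orders_v1", "customers_v1"}

def unregistered_ports(output_ports: list) -> list:
    """Return output ports missing from the catalog; empty means compliant."""
    return [port for port in output_ports if port not in CATALOG]

missing = unregistered_ports(["orders_v1", "refunds_v1"])
print(missing)  # ['refunds_v1'] -> block deployment and prompt registration
```

A check like this positively impacts technical people in exactly the way S&G describe: engineers don't have to remember the rule, and consumers can trust that everything deployed is discoverable.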
For S&G as previously mentioned, in order to keep things moving, delegation is crucial. If there is a truly important decision with major implications, possibly the leaders of the different capabilities represented on the board need to get more deeply involved. But for most policies, those heads should delegate as much as possible to people they trust to represent their interests and move forward. We don’t need the end approvers to be overly involved in routine decisions. Much like when purchasing a solution, the CFO typically doesn’t need to be involved in specifically deciding which data catalog to use if it is a small portion of the budget – the experts did the work and made a selection, you delegated to them for a reason presumably.
S&G gave some advice around getting started in your data mesh journey. When they were working with Glovo, company management gave the team the time and budget to really build out the platform in tandem with the governance and the first data product. That took 6 months. And coordinating across all four pillars moving forward simultaneously was certainly not easy. If you don’t have that amount of time and budget, you can do a relatively smaller proof of value/concept in probably 3-4 months; but Zhamak has warned of premature scaling causing a fair bit of issues for a number of companies trying to implement data mesh so trying to rush your proof of value/concept might not be the best idea. They also mentioned a pattern of the first data product at their clients not having PII as that complicates your initial platform needs for governance. And to pick a relatively simple source-aligned data product use case as your first data product.
As to who you should have on your data mesh “tiger team” if you are lucky enough to have some full-time heads to staff it, S&G recommend having 1-2 people from each necessary capability: 1-2 data engineers to help build out the platform and upskill your domain team, 1-2 folks on the governance team, 1-2 from product or elsewhere to do the data product management, etc. And obviously the domain you are working with needs to be heavily involved. Whoever is on the team, prepare to do a lot of data product/data mesh evangelism.
It’s important to understand that committing to data mesh is a big long-term commitment, financial and otherwise, per S&G. Your implementation can’t be a skunkworks approach; you have to be committed to moving forward together so you can drive the necessary buy-in. And it isn’t just the initial implementation – you have ongoing growth of your implementation and maintenance. This is partially why so many guests have mentioned delivering continuous incremental value: it makes it easier to secure additional necessary funding.
When asked about what parts of your federated computational governance should be in the platform versus at the data product level, S&G believe you should always look to create the affordances and the easy path in the platform. The application of policy via the platform is the best way to ensure compliance and also standardization, which makes it easier on data consumers. But any decision relative to the specific context or needs of the explicit product should be made at the product level. So, the decisions about how to measure data quality characteristics would be at the platform level but the SLAs to meet for a data product would be set at the data product level itself by the domain team.
In wrapping up, S&G wanted to reiterate that data mesh isn’t easy if you want to set yourself up for long-term success. It is going to take a lot of effort to get it going and deliver your initial data product and platform and governance policies. But by spending time to do it right, you set yourself up for gaining a lot of momentum. Don’t get discouraged. And be prepared to get things wrong and then fix them, that’s totally okay. Play, learn, iterate, improve.
Gustavo Drachenberg’s LinkedIn: https://www.linkedin.com/in/gusdrach/
Shawn Kyzer’s LinkedIn: https://www.linkedin.com/in/shawn-kyzer-msit-mba-b5b8a4b/
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB