Data Mesh Radio Patreon – get access to interviews well before they are released
Episode list and links to all available episode transcripts (most interviews from #32 on) here
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Andrew’s LinkedIn: https://www.linkedin.com/in/andrewsharp27/
Blog post “Data Mesh – Is this the evolutionary trigger to reinvigorate Data Governance?”:
In this episode, Scott interviewed Andrew Sharp, Data Governance Lead at the consulting company The Oakland Group based in Leeds in the United Kingdom.
Some key takeaways/thoughts from Andrew’s point of view:
- Data Governance changes that are necessary for data mesh are both a threat and an opportunity. Are you throwing away the good with the bad? Or can you reinvent your practices broadly to do governance better in general and clean out bad habits/approaches when moving to a federated model?
- !Controversial! Of the four pillars of data mesh, governance is the most challenging and least mature.
- There are no roadmaps to doing federated governance for data mesh well – and likely there can’t really be a specific roadmap for all as every organization is different. People are just starting to find their way on how to do governance in data mesh well and people will need to explore for their organization.
- ?Controversial? Data mesh will likely require a seismic – or maybe tectonic – shift in the way we approach data governance. That doesn’t mean organizations have to change everything at once – that is overly high risk – but making only small tweaks to the governance approach probably won’t work either. The aim is small steps that add up to a large movement.
- Many organizations are already not doing data governance very well. So adding in the complexity of data mesh and finding entirely new ways of working when the current state isn’t great? That is going to add additional challenges for many organizations on their data mesh journey.
- Try to build momentum around positive organizational changes for governance. And make sure people are bought in to the test, learn, iterate model; otherwise change is going to be much more difficult than it needs to be, data mesh or not.
- Data ownership: people generally understand why data ownership is important, but it is often hard to get domains to truly take ownership, and many don’t understand what good data ownership means/entails. It’s necessary to explain why they are the best owner and what owning data means.
- There’s already a shortage of people who are actually data governance capable and that’s likely to get worse as data mesh creates more demand. It remains to be seen if we need embedded governance professionals in each domain or if making existing people in the domain governance capable will work – basically, full time roles or capabilities/responsibilities as part of other roles, we don’t know which will win.
- Existing data governance professionals will need to learn and adapt to keep pace in a data mesh world. Governance won’t be exactly how it has been for years but there are lots of opportunities for governance capable people who also understand more technical aspects. You can teach the required tech aspects to willing governance people but they have to be willing.
- ?Controversial? There has been a shift – sometimes a large shift – towards more technical aspects in many people’s governance roles. But that is likely more of a pendulum swing than a permanent shift toward the technical over the non-technical. Scott note: Focusing on the technical and not all the other aspects will result in VERY subpar value realization of your data. It’s where data engineering often goes wrong.
- ?Controversial? There is a misconception that the central data governance team goes away completely when everything is federated – data mesh or not. But that would be pure decentralization, not federation. The reality will be somewhere in between full central control and full decentralization, and we still have to discover where the balance works best.
- In physics, there is distribution of load. Instead of one central team doing all the heavy lifting, you distribute it out to far more people and they each have a smaller load to lift. The central team shifts to focusing on the coordination and bigger picture. Just because you’ve been lifting heavy as a central team and can do it doesn’t mean you should!
- It’s a valid and common strategy to not rush into data mesh. Many are testing and learning at a smaller scale rather than jumping in with both feet. They are testing, learning, iterating, and then reflecting. There isn’t some arbitrary timetable for making the large-scale changes that data mesh requires.
Andrew started the conversation off with a potentially controversial – but probably often agreed with – statement: of the four pillars of data mesh, federated computational governance is the most challenging and is the least mature in ways of working/patterns. Organizations are starting to learn and make their way forward but it’s still a major challenge. Be prepared to explore and find the right path for you and your organization.
According to Andrew, most organizations are already not doing that well with data governance in the traditional sense, so figuring out how to do it in a federated approach will be tough. And in data mesh, the computational aspect of federated computational governance means policies are automatically applied where appropriate. That’s very hard to do even when you know exactly what needs to be done, so it will be doubly hard in data mesh, where we are still figuring it out. But changing your data governance approach can be an opportunity, not just a threat to the existing status quo. How can we leverage the change we are making to governance to be better than we ever were before? Far easier said than done, but it’s not only challenges.
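As a rough illustration of what “computational” could mean in practice, the sketch below defines governance policies once as small functions and applies them automatically to every data product descriptor. Everything in it is hypothetical: the `DataProduct` fields, the two example policies, and the `evaluate` helper are assumptions made for illustration, not something described in the episode and not any particular platform’s API.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


# Hypothetical data product descriptor; the field names are illustrative assumptions.
@dataclass
class DataProduct:
    name: str
    domain: str
    owner: Optional[str] = None
    columns: dict = field(default_factory=dict)  # column name -> set of tags


# A policy inspects a data product and returns a violation message, or None if it passes.
Policy = Callable[[DataProduct], Optional[str]]


def must_have_owner(product):
    """Hypothetical global policy: every data product needs a named owner."""
    if not product.owner:
        return f"{product.name}: no owner assigned"
    return None


def pii_must_be_masked(product):
    """Hypothetical global policy: columns tagged 'pii' must also be tagged 'masked'."""
    for column, tags in product.columns.items():
        if "pii" in tags and "masked" not in tags:
            return f"{product.name}: PII column '{column}' is not masked"
    return None


# Policies agreed centrally, then applied to every domain's data products.
GLOBAL_POLICIES = [must_have_owner, pii_must_be_masked]


def evaluate(product, policies=GLOBAL_POLICIES):
    """Run every policy against one data product and collect the violations."""
    violations = []
    for policy in policies:
        message = policy(product)
        if message is not None:
            violations.append(message)
    return violations


if __name__ == "__main__":
    orders = DataProduct(
        name="orders",
        domain="sales",
        owner="sales-data-team",
        columns={"order_id": set(), "email": {"pii"}},
    )
    for violation in evaluate(orders):
        print(violation)
```

In a sketch like this, the central team would own the policy list while each domain owns its data products, which is one way to picture the split between central coordination and federated responsibility discussed in the episode.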
To do data governance right in data mesh, Andrew believes it is likely to require a major shift in how the industry generally approaches data governance; organizations will need to make big changes – over time – rather than just a few tweaks to better align with data mesh. But it is very early days and that all remains to be seen; this is just a prediction. Scott note: I strongly agree with this belief. I think people are looking for ways to not invest effort in aspects of data mesh, but many have noted that the automated/scalable governance work pays significant dividends as your implementation goes wider.
But Andrew wanted to stress that while we need major shifts, they are more like tectonic shifts than seismic shifts, which often result in volcanic eruptions and earthquakes. Large, but not moving quite as quickly – the big bang change approach to governance is overly risky. Why put all your eggs in one basket rather than try incremental improvements? Data mesh is all about trying, getting feedback, and iterating to improvement, and governance shouldn’t be any different. Build up the momentum around your changes and work with people to communicate where you are headed and why.
When discussing how data governance is evolving, and how people who have worked in traditional governance roles for a long time compare with new people moving into the space, Andrew said he believes it is crucial for those doing the traditional type of data governance to grow and adapt their skills, especially technically. Will roles require additional responsibilities? Will domains have embedded people whose main role is data governance? Or will most data governance at the domain level be split into responsibilities handled by roles not exclusively focused on governance? He doesn’t expect widespread redundancies, but do prepare for some changes. That said, it may well be a pendulum swing rather than a shift that stays, so potentially look for an overly technical focus for a year or two before it settles into a better equilibrium.
“Turkeys voting for Christmas” is a phrase Andrew used relative to perception of the work many governance teams are doing in data mesh. Essentially, if turkey is a traditional Christmas dinner, are these governance teams that are helping lead the work to federate governance eliminating their own roles? He doesn’t believe so and Scott STRONGLY does not believe so. Look at federated government – it isn’t fully decentralized, that is just silos. Data silos are bad. So you need central coordination points and planners. Where the balance falls for governance responsibilities remains to be seen.
Historically, according to Andrew, the central governance team has been doing all the heavy lifting because they are the ones trained to do so. But if we use a fishing analogy, we can see why central teams are happy to participate – if the central team has to fish and provide food for everyone in the organization, that’s a LOT of fish to bring in. Instead, give them rods and teach them to fish. The central team can still focus on the big value fish – e.g. going and catching a swordfish or a tuna – but by breaking the workload down into manageable chunks, everyone can move faster and focus on creating more value where they have the best context. The less coordination we need across teams, the less unnecessary friction there is.
Other quick tidbits:
Most understand why data ownership is crucial. But many domains are not willing, or not able, to take real ownership of data immediately. So gradual capability building and ownership handover is probably necessary.
The role of data governance professionals in data mesh is still in flux. Will there be embedded roles in domains or will it merely be skillsets as part of broader roles? Either way, there is likely to be a significant shortage of highly capable data governance people while the need for those people is greater in data mesh than traditional approaches.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB