Data Mesh Radio Patreon – get access to interviews well before they are released
Episode list and links to all available episode transcripts (most interviews from #32 on) here
Provided as a free resource by DataStax AstraDB
In this episode, Scott interviewed Luca Paganelli, Data Architect at the Italian utility Gruppo HERA.
To start, some interesting points and/or key takeaways and questions:
- Introducing new concepts and ways of working around data slowly – not looking to make a hard shift – has worked well. When Gruppo HERA debuted their new data strategy manifesto, none of it was a surprise; it was already relatively in line with the way many were talking about data and moving forward on their data journeys.
- HERA’s Data, Analytics, and Intelligence Automation (DAIA) team is not forcing domains to comply with HERA’s data mesh-inspired guidelines but is instead working closely with them to help the domains achieve their data-related goals – delivering the “right thing”. That gives the DAIA team strong influence over the domains’ approach to data work without pushback, gives domains more confidence in the guidelines, and can mitigate the risk of analysis paralysis. This lack of rigid rules has created a better sociotechnical environment for innovation, but it can mean nothing really feels standardized because the domains can still choose to go in a different direction.
- The paradigm shift was initially “steep” for both IT and domain owners. But domain owners realized how much better they could serve themselves and external data consumers if they took over more data ownership. IT was afraid to give up control but started to buy in once they saw the leverage and expertise they can provide by empowering the business domains to do great things with their data.
- A concern with not having broad standardization is that it leads to bespoke solutions, which makes broad reuse hard. There is also the challenge of people not being sure how much they can trust the data products. The DAIA team believes the tradeoff is worth it to drive initial buy-in with domain owners.
- Defining data products has been a struggle. There is a chicken-and-egg issue: 1) they need to understand who from the business should be involved in designing a data product, but 2) data domains must be discovered first to know which subject matter experts from the business to involve.
- HERA is looking for data products to first serve their domain owners. This can be a slippery slope, as domains may hold valuable information that isn’t useful for them to analyze for their own purposes, so other domains can’t get to that data. Getting domains to freely share their data is a common incentivization problem.
- Purely technically focused data products will probably not serve demand. We need to focus on sharing information – what is the data saying, what is it about? Information is more than just the 1s and 0s of data.
Gruppo HERA had to develop a sophisticated and reliable way to do their reporting to regulators but had not focused nearly as much on their internal data and analytics. Over time, though, a progressively larger number of experiments (spanning BI to AI) emerged in which data started being used to drive the company. About 2.5 years ago, they formed a new team around data, analytics, and intelligence automation (DAIA) to start to rectify that and bring their data and analytics up to the same level as their regulatory reporting. As Luca said, they “were ready to scale data governance”.
One key change, per Luca, was that the business had embraced a digital workplace program, so teams were able to create small-scale applications to fill gaps where business processes were not yet digitized. While small-scale apps might not be scalable in the long run, they still gave people a good idea of the benefits of moving to a more digital or data-native approach.
Thus far, the most important aspect of change management around data for HERA – at least from Luca’s view – has been not making the DAIA guidelines mandatory for domains while helping domains understand what good looks like. This has meant better conversations where the DAIA team can focus on listening and responding to issues domains have instead of enforcing a rigid set of rules.
Luca’s team has seen a lot of success working with teams to deliver the “right thing” but with a lot of flexibility in how that is achieved. This has meant the DAIA team can make people feel seen and heard, which gives the team a good way to influence direction. The current risk or challenge from that is the wide variation in the quality of data products; but overall, the DAIA team views it as a success, mostly owed to not being overly rigid. They have created a better sociotechnical environment to innovate.
Patience is a crucial part of the DAIA team’s long-term strategy, per Luca. They are helping domains address their current needs while constantly rearticulating the overall vision. They are also continuing to support active projects so there can be a good transition to new ways of working. The new data strategy manifesto wasn’t like a bolt of lightning: the DAIA team had already been working with domains to move them towards the manifesto’s approach, so it aligned with the domains’ ways of thinking.
Speaking of technical and organizational challenges, Luca mentioned that historically, data ownership was mostly a technical thing and was held by a centralized IT team. Domains owned very high-level concepts at most and IT owned the rest. So, when they moved to their new approach, domain owners often reacted with fear at first. But once they got over the initial fear, they saw how owning their data could make it a great resource to the company. IT also initially reacted with fear but started to buy in when they saw how they could empower the business users. This process is also progressively moving things away from siloed ways of working and towards cross-functional, Agile teams.
Scott expressed his concerns about what a too-flexible approach means for reuse: first on the data side, if there is a wide variety of data products, and second about how to find the reusable patterns that are necessary to scale a data mesh implementation.
Luca reiterated that the DAIA team is neither IT nor business, so they have to partner with both sides to get things done. Their strategy is to use the guidelines to make the standardized way the easy pathway, but not to force the domain owners to comply. They partner with the domain owners and give them guidance but only try to influence them to do things the right way. Per Luca, by laying out the tradeoffs in an honest way, you can help the domain owners understand why you recommend a certain way. By letting people decide, they will commit more to making it work.
As for defining data products, Luca mentioned how it has been a struggle internally. Many organizations implementing data mesh are struggling with this, including the very high-level questions of how big data products should be and how many there should be. Should they mirror the source system? In general, this is an anti-pattern. How do they encapsulate the subject matter expertise in the domain into their data products? There is a chicken-and-egg issue of needing to understand who from the business should be involved in designing a data product, while needing to design the data product before knowing who the subject matter experts in that domain are.
HERA is using two different types of data products: source-aligned and consumer-aligned. Consumer-aligned data products – which they call Consumption Data Products (CDPs) – are designed to serve a specific use case. These are fit-for-purpose data products, and there is often a working-backwards process to figure out which domains need to deliver what once the use case is established. There is also a focus on limiting the scope of a data product so it doesn’t get too complex or complicated to create or maintain. The source-aligned data products – which they call Domain Data Products – are then built initially to “power” CDPs (use cases) but are also designed to be more general purpose.
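One way to picture the working-backwards step is the minimal Python sketch below. It is only an illustration under assumptions, not HERA's actual implementation: the class names, fields, and the example products and domains (`metering`, `sales`, `churn_dashboard`) are all hypothetical. It models a consumer-aligned product as a use case plus the source-aligned products it depends on, then groups those dependencies by the domain that would have to deliver them.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DomainDataProduct:
    """Source-aligned, general-purpose product owned by one domain (hypothetical model)."""
    name: str
    owning_domain: str


@dataclass
class ConsumptionDataProduct:
    """Consumer-aligned product scoped to a single use case (hypothetical model)."""
    name: str
    use_case: str
    inputs: list[DomainDataProduct] = field(default_factory=list)

    def required_domains(self) -> dict[str, list[str]]:
        """Work backwards from the use case: which domain must deliver which product."""
        needed: dict[str, list[str]] = {}
        for ddp in self.inputs:
            needed.setdefault(ddp.owning_domain, []).append(ddp.name)
        return needed


# Hypothetical example data, invented purely for illustration.
meter_readings = DomainDataProduct("meter_readings", "metering")
contracts = DomainDataProduct("customer_contracts", "sales")

cdp = ConsumptionDataProduct(
    name="churn_dashboard",
    use_case="monitor customer churn",
    inputs=[meter_readings, contracts],
)

print(cdp.required_domains())
# {'metering': ['meter_readings'], 'sales': ['customer_contracts']}
```

Keeping the list of inputs explicit on each consumer-aligned product also makes its scope visible, which fits the goal of not letting a single product grow too complex to create or maintain.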
For Luca, data products must first serve the domain owner, as it is hard to find domains so altruistic that they will create data products simply to share with other teams. This can be a slippery slope, as there are likely many use cases where a data product, or even a small part of a source-aligned data product, is not useful to the data-owning domain but is extremely useful to other domains. Incentivization can be very difficult though.
When it comes to Domain Driven Design (DDD), when you first start to share the definitions of domains, many people create an extremely complex picture of what a domain is, in Luca’s experience. He recommends making your best guess at domains and moving forward, not getting overly exact. It’s okay to make some initial guesses and work with the domain to define the boundaries. He also mentioned that data products that are purely technical solutions won’t satisfy the data product owner’s demand for information, so focus on delivering a complete product: not just data, but the information about what it is you’re sharing.
Luca wrapped up with some thoughts about how crucial it is to work on the organizational operating model, to try to embrace domain driven design for data, and to make your guidelines only as rigid as the organization can handle. Guidelines that are too rigid can be seen as regulation without any value, so start less rigid than you’d like. Domains that see no value in the rigid way will push back and often deliver nothing. “Perfect nothing is still nothing,” as Luca said.
Luca’s LinkedIn: https://www.linkedin.com/in/paganelliluca/
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB