#56 Insights from Deploying Data Mesh and Knowledge Graphs at Scale – KGC Takeover Interview w/ Veronika Haderlein-Høgberg and Guest Host Ellie Young

Provided as a free resource by DataStax AstraDB

Data Mesh Radio Patreon – get access to interviews well before they are released

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here

Knowledge Graph Conference website: https://www.knowledgegraph.tech/

Free Ticket Raffle for Knowledge Graph Conference (submissions must be by April 18 at 11:59pm PST): Google Form

In this episode of the Knowledge Graph Conference takeover week, special guest host Ellie Young (Link) interviewed Veronika Haderlein-Høgberg, PhD. Veronika was employed at Fraunhofer-Gesellschaft at the time of recording but was representing only her own view and experiences. She was invited for her special mix of both data mesh and knowledge graph know-how.

At Fraunhofer-Gesellschaft, Veronika’s employer until recently, she and her team were implementing a knowledge graph to help with decision support for the organization. Previously, Veronika worked on a data mesh-like implementation in the Norwegian public sector, at the Norwegian tax authority, before the data mesh concept was really congealed into a singular form by Zhamak.

Veronika and Ellie wrapped the conversation with a few key insights: to share data, groups need to agree on common standards to represent it, and they also need to be able to keep sharing information about that data with each other into the future. To develop those initial data standards, and to build the relationships needed to coordinate around the data long term, different departments in the enterprise have to talk with each other. Building those conversations across departments also requires building trust, and curiosity is a crucial ingredient here, at the individual level but also at the domain and organizational levels. If people don’t feel comfortable asking questions, they can’t understand each other’s perspectives well enough to contribute to that shared context.

What does this look like in practice? Different departments come together to discuss the differing definitions they have of the same terms and to find out what data they need from each other, and therefore what data they must collect and what protocols they must develop. And computer scientists discuss data with business people, understanding the business requirements while conveying what data systems need in order to provide organized, quality data.

Veronika’s recent organization, Fraunhofer, is using a knowledge graph because it needs to make its investment decisions much more data-driven. They need to run analysis across many different sources: they have some slight control over internal data sources but essentially none over external ones. They repeatedly harmonize across these sources, often repeating the same harmonizations. Veronika believes they shouldn’t have to do that harmonization manually, so they needed a translation layer: the knowledge graph.
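The harmonization pain can be sketched in plain Python: instead of each consumer re-mapping every source’s field names by hand, a shared translation layer (the role the knowledge graph plays at Fraunhofer) records each mapping once. A minimal sketch, with all source names, field names, and mappings hypothetical:

```python
# Hypothetical field mappings from two sources onto one canonical vocabulary.
# At Fraunhofer this role is played by the knowledge graph; here a plain
# dict stands in for the shared translation layer.
TRANSLATION_LAYER = {
    "internal_crm": {"kunde": "customer", "umsatz": "revenue"},
    "external_feed": {"client_name": "customer", "turnover": "revenue"},
}

def harmonize(source, record):
    """Map one source record onto the canonical vocabulary, reusing a
    mapping defined once instead of redoing it for every consumer."""
    mapping = TRANSLATION_LAYER[source]
    return {mapping.get(field, field): value for field, value in record.items()}

# Two differently shaped source records land in the same canonical form.
print(harmonize("internal_crm", {"kunde": "ACME", "umsatz": 100}))
print(harmonize("external_feed", {"client_name": "ACME", "turnover": 100}))
```

The design point is that the mapping lives in one shared place: add a third source and every existing consumer benefits without repeating the harmonization work.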

To build out their knowledge graph, they need business experts to work with the ontology experts. However, it is a struggle to get time and attention from the business experts, who also need to learn why ontologies matter and how to build them. Ellie noted that centralizing the integration might cost a lot of effort up front, but it’s necessary if you only want to do the harmonization work once.

For Veronika, adopting a data-as-a-product mindset and having data owners is crucial to getting a knowledge graph implementation right. A knowledge graph, she said, is just a different way of expressing and sharing your knowledge, one that computers can understand as well as humans. That framing helps people understand why knowledge graphs are useful and important.

Data mesh implementation background and insights:

At the tax authority, Veronika’s team of information architects was working to translate tax law into data models. They discovered the need for a common methodology to create the models (they chose UML), and then, as other authorities needed to use the data, the team began creating data standards for efficient data sharing. The Norwegian data mesh implementation has even extended into a public/private partnership. Veronika also mentioned that Denmark now has standards for how bills are written so the legal aspects can be translated into data.

Per Veronika, their data mesh journey started from pain points: they really struggled to consume data from other entities as well as internally. They began by making it easy to consume data from those other entities without creating a large burden on the producers. Along the way they learned about good and bad practices for sharing data and which tools were best for data modeling. All the decisions were made in small partnerships.

For Veronika, a big key to success was taking on small partnerships and creating an environment where asking questions was highly encouraged and even making mistakes was okay; this has been a recurring theme in many DMR episodes. Ellie pointed out that community and communication are key to making something like data mesh work. Veronika followed up that culture and “fuzzy factors” are more important than tooling and even methodologies.

Veronika discussed the silent fear that change brings and how much of an impediment it is to getting things done. There is also a fear of looking silly when asking questions. So we need to work with people to get to a place where they are far more comfortable with change and where curiosity is rewarded instead of shamed or looked down upon. Asking questions gives people the ability to grow as they learn new things and provides a conduit for far more information sharing; documentation can’t be the be-all and end-all.

Ellie mentioned the need to stop focusing so much on the specific data in data processes. There needs to be a much bigger focus on people and the data creation process. Think about creating data for others to ingest with intentionality.

Veronika saw the organizational impact of implementing a data mesh very strongly: it led to a greater sense of doing meaningful work, which in turn led to far lower churn across the organization. The cross-functional implementations had a strong impact, and the people working on them were far more engaged. Veronika believes the fear of silos in data mesh makes sense, but people naturally want to prevent those silos, so let them focus on consumer needs.

Veronika believes knowledge graphs are key to preventing silos in data mesh and somewhat vice versa – it is very difficult to maintain a knowledge graph across an organization that doesn’t think of data as a product.

The two discussed the importance of data reuse, even within the same use case, to prevent manual work. It’s akin to a golden-source concept: you don’t have to figure out which data you can trust. That manual repetition of harmonization is what kills productivity and people’s desire to work with outside data.

Veronika made an interesting point: any time you model data, you are almost building a mini knowledge graph. Each data model contains a tiny ontology of its domain.
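Veronika’s observation can be made concrete: even an ordinary relational data model implicitly defines classes (tables) and relations (foreign keys). A minimal sketch, with a hypothetical tax-domain model, that reads the implicit ontology back out:

```python
# A hypothetical relational data model: tables, columns, and foreign keys.
data_model = {
    "taxpayer": {"columns": ["id", "name"], "foreign_keys": {}},
    "tax_return": {"columns": ["id", "year", "taxpayer_id"],
                   "foreign_keys": {"taxpayer_id": "taxpayer"}},
}

def implicit_ontology(model):
    """Read a data model as a tiny ontology: each table becomes a class,
    each foreign key an edge (relation) between two classes."""
    classes = set(model)
    relations = {(table, f"has_{col.removesuffix('_id')}", target)
                 for table, spec in model.items()
                 for col, target in spec["foreign_keys"].items()}
    return classes, relations

classes, relations = implicit_ontology(data_model)
print(sorted(classes))    # the "classes" hiding in the schema
print(sorted(relations))  # the "relations" hiding in the foreign keys
```

In knowledge graph terms, every `tax_return` row would become an instance of a `tax_return` class linked to a `taxpayer` instance; the schema was an ontology all along, just never written down as one.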

At the Norwegian Tax Authority, the working groups around the initial data mesh implementation started very informally, among people who already knew each other. In a larger org, Veronika believes the key to success is talking: sharing what you are working on and discussing how it might work. Specific goals are very important; you need concrete deliverables, which also make it easier to get funding.

In the Norwegian government, they worked closely with lawmakers to define business concepts. They initially thought that everything had to harmonize but found it was more important for each domain to define its business concepts the way it understands them and to make those definitions transparent; it isn’t possible to make all data harmonizable, as the world is full of variation. You need to think about your business concepts and then not force them on others; focus on the translation instead. Otherwise, you will never get to publishing anything. They didn’t go quite as far as the Danish government, though, which has created a role in the lawmaking process for someone who understands data well enough to word the law so it can be translated into data or a data model.

Veronika believes you shouldn’t use terms to identify concepts; use URIs instead. A term is just a label, not the concept itself. She also believes it is often not necessary to make things computer readable and that you need to focus on creating a living organization.
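The URIs-not-terms point can be sketched with plain triples: two departments attach different labels to the same URI-identified concept, and lookups match on the URI, never on the wording of a label. All URIs and labels below are hypothetical illustrations:

```python
# Triples as (subject, predicate, object); the URI is the stable identifier.
# The URI and both labels are hypothetical examples.
CONCEPT = "https://example.org/tax/concept/TaxableIncome"

triples = [
    # The tax authority labels the concept in Norwegian...
    (CONCEPT, "rdfs:label", "skattepliktig inntekt"),
    # ...another agency labels the very same URI in English.
    (CONCEPT, "rdfs:label", "taxable income"),
    (CONCEPT, "rdf:type", "skos:Concept"),
]

def labels_for(uri, triples):
    """Collect every term (label) attached to a URI-identified concept."""
    return sorted(o for s, p, o in triples if s == uri and p == "rdfs:label")

# Both departments resolve to the same concept despite using different terms.
print(labels_for(CONCEPT, triples))
```

Because identity lives in the URI, each domain can keep its own terminology, which lines up with the earlier point about not forcing one set of definitions on everyone and focusing on translation instead.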

To finish up, Veronika reiterated: don’t be afraid, be curious – curiosity is a prerequisite for success with data mesh or knowledge graphs.

Ellie’s LinkedIn: https://www.linkedin.com/in/sellieyoung/

Veronika’s LinkedIn: https://www.linkedin.com/in/veronikahaderlein/

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

All music used this episode created by Lesfm (intro includes slight edits by Scott Hirleman): https://pixabay.com/users/lesfm-22579021/

Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB
