#67 All About Interoperability and Standards in Data Mesh – Interview w/ Samia Rahman

{{top-bit}}

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here

In this episode, Scott interviewed Samia Rahman, Director of Data and AI Strategy and Architecture at life sciences company Seagen. Samia is helping to lead Seagen’s early data mesh implementation after helping with two implementations at Thoughtworks since the start of 2019.

For Samia, interoperability is about taking information from two systems and combining it to get higher value. A simple definition but a good one.

Two potential key takeaways:

1) Don’t try to plan too far ahead when developing interoperability standards, but definitely keep an eye out for places where you could start to develop those standards. And your standards really should evolve – you don’t have to nail them right out of the gate.

2) Your interoperability will also evolve – you don’t need to make every data product interoperable with every other data product, and you can start with basic interoperability first. The more you can standardize around unique identifiers, the better, but it’s okay not to get it right first thing out of the gate.

Samia started her career – and even before that, in school – focusing on software, especially end-to-end development. A repeating pattern for her has been how crucial contract testing is to getting things into a trustable and scalable state. We’ve had contract tests in hardware and software for a long time, and systems without easy testing often get replaced pretty quickly. Those tests are the safety net that allows for fast and reliable evolution. And evolution is a key theme for this conversation – set yourself up to iterate and evolve as you learn, and work to not paint yourself into a corner.

Data standards, including specifically for interoperability, are everywhere in the life sciences space – FHIR, the FDA’s many standards, etc. – but they still aren’t great for truly sharing the meaning of the data. FAIR is trying to get there, but the interoperability and domain knowledge aren’t really standardized yet.

Samia strongly recommends not getting ahead of yourself on interoperability and standards. It’s perfectly okay to start small – iterate and build on your standards for interoperability. To start, have some key identifying “linkers” in place. Get things out in front of consumers so they can explore and give feedback, and use that to power your iterations. Incrementally building towards a standard is crucial.
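To make the idea of identifying “linkers” concrete, here is a minimal sketch of how a shared identifier lets consumers combine two data products. The datasets, field names, and `patient_id` linker are all illustrative assumptions, not details from the episode.

```python
# Two hypothetical data products that agree on one shared identifier,
# patient_id, so consumers can combine them for higher value.
trial_enrollments = [
    {"patient_id": "P-001", "trial": "TRIAL-A", "arm": "treatment"},
    {"patient_id": "P-002", "trial": "TRIAL-A", "arm": "control"},
]

lab_results = [
    {"patient_id": "P-001", "test": "hemoglobin", "value": 13.2},
    {"patient_id": "P-003", "test": "hemoglobin", "value": 14.1},
]

def join_on_patient(enrollments, labs):
    """Combine the two products on the shared identifier."""
    by_patient = {e["patient_id"]: e for e in enrollments}
    return [
        {**lab_row, **by_patient[lab_row["patient_id"]]}
        for lab_row in labs
        if lab_row["patient_id"] in by_patient  # keep only linkable rows
    ]

joined = join_on_patient(trial_enrollments, lab_results)
```

The point is not the join itself but the agreement: both producers standardized on `patient_id`, and rows without that linker simply can’t participate in the combined view.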

If you are going to build a standard, reusability should be your first goal. If it is only for a single use case, that isn’t a standard, it’s just an implementation detail. Samia again recommends contract testing / a schema checker. And definitely leverage existing standards. It’s also not a huge deal if you have more than one standard internally. You don’t need one standard to rule them all.
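As a sketch of the contract testing / schema checking Samia recommends, here is a tiny standard-library-only checker. The contract’s field names and types are hypothetical, not an actual Seagen contract; a real setup would more likely use something like JSON Schema.

```python
# A minimal schema checker: the data product's contract is a mapping
# of required fields to expected types (illustrative assumption).
CONTRACT = {
    "patient_id": str,
    "test": str,
    "value": float,
}

def check_contract(rows, contract=CONTRACT):
    """Return a list of violations; an empty list means the contract holds."""
    violations = []
    for i, row in enumerate(rows):
        for field, expected_type in contract.items():
            if field not in row:
                violations.append(f"row {i}: missing field '{field}'")
            elif not isinstance(row[field], expected_type):
                violations.append(
                    f"row {i}: field '{field}' is "
                    f"{type(row[field]).__name__}, expected {expected_type.__name__}"
                )
    return violations

good = [{"patient_id": "P-001", "test": "hemoglobin", "value": 13.2}]
bad = [{"patient_id": "P-001", "value": "13.2"}]  # missing field, wrong type
```

Run as part of the producer’s pipeline, a check like this is the safety net that lets a data product evolve quickly without silently breaking consumers.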

Per Samia, if you implement versioning, data consumers are usually very willing to work with data producers as they evolve data products. But without versioning, you are just pulling the rug out from underneath them. And right now, there isn’t a lot of good information or tooling out there on versioning data. The need to evolve data products is why absolute self-service is probably never possible. The human-in-the-middle is important to help consumers evolve their thinking as the business model evolves.
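One simple way to picture the versioning Samia describes: serve old and new schema versions side by side, so existing consumers keep working while new ones pick up the changes. This sketch, with made-up fields and version names, shows the idea; it is not a prescription for how to implement data versioning.

```python
# Hypothetical data product with two published schema versions.
# v2 adds a "unit" field; v1 consumers are not broken by the change.
SCHEMAS = {
    "v1": ["patient_id", "value"],
    "v2": ["patient_id", "value", "unit"],
}

RAW = [{"patient_id": "P-001", "value": 13.2, "unit": "g/dL"}]

def read(version):
    """Project the raw data onto the requested schema version."""
    fields = SCHEMAS[version]
    return [{f: row[f] for f in fields} for row in RAW]
```

Consumers pin a version and migrate on their own schedule; the producer retires `v1` only after coordinating with them, rather than pulling the rug out.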

Samia mentioned the data consumer’s responsibility to inform data producers – about needed changes, issues with their data products, etc. We can’t have data consumers all going off and creating their own fixes for data quality issues; the data producers need to know so they can fix them at the source.

You need to be on the lookout for interoperability opportunities and then validate that there is a need for interoperability. An important point: not all data needs to be interoperable.

Samia finished with her interoperability vendor wish list – some kind of tooling that can more easily detect when someone should use an existing standard and that can put those standards in front of data product producers much more easily. How can we make it very easy for data product producers to build in interoperability and leverage existing standards from the start?

Samia’s LinkedIn: https://www.linkedin.com/in/samia-rahman-b7b65216/

FHIR standard cheat sheet: https://www.healthit.gov/topic/standards-technology/standards/fhir-fact-sheets

{{bottom-bit}}

1 thought on “#67 All About Interoperability and Standards in Data Mesh – Interview w/ Samia Rahman”

  1. Several really good points on interoperability. However, for Data Mesh to become a true hit, a more proactive push will be needed from the Data Mesh community, a.k.a. interest group. So, instead of “aggressively waiting” for vendors to come up with proprietary data connectivity solutions, there needs to be a “req spec” for open data sharing standards, covering protocols, syntax, and semantics – everything needed for smooth data connectivity and cross-domain, cross-organization data product use. When a widely recognized spec is in place, it becomes a fairly straightforward exercise to monitor progress within the industry – and to raise red flags when interoperability is not enabled well or fast enough. Without open (de facto) standards, Data Mesh’s full potential will not be realised. Hence, true openness needs to be the end goal.
