#23 Where and How Can Data Virtualization Work in Data Mesh – Interview w/ Dr. Daniel Abadi

Provided as a free resource by DataStax AstraDB

Don’t forget to catch Dr. Abadi at Datanova – the Data Mesh Summit on Feb 9-10. Thanks to Starburst for sponsoring the transcripts for Data Mesh Radio; check out the transcript here. And check out Starburst’s other free data mesh resources here.

In this episode, Scott interviewed Dr. Daniel Abadi, the Darnell-Kanal Computer Science Professor at the University of Maryland with a focus on scalable data management research. Dr. Abadi will be presenting next week at the Data Mesh Summit on Data Fabric and Data Mesh alongside Zhamak and Sanjeev Mohan.

This was a pretty wide-ranging and freewheeling conversation about data virtualization in general and how it can be used in data mesh. Both agreed that there are many places where data virtualization can play a role in data mesh, whether in extracting information from operational systems, stitching together a data product once data processing has been done, or combining data across multiple data products at the mesh experience plane. Dr. Abadi specifically mentions something like a query fabric that makes use of a data virtualization approach, not just tools that only do data virtualization.

Using multiple different technologies has a natural side effect: when you give the domains the ability to use what they choose, you have to solve the difficulty of combining data from multiple sources. There is always a balance between how much data you copy and how much you access in place in the source system, and data virtualization can give you a few more options than all or nothing.
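To make the "access it in the source system" option a bit more concrete, here is a minimal sketch in Python, not something from the episode. It assumes two hypothetical domains, sales and support, each with its own store, and uses SQLite's ATTACH as a stand-in for a query layer that joins across both without first copying the data into one shared table. The table names, columns, and values are made up for illustration; a real query fabric would span different engines and push work down to each source.

```python
import sqlite3


def build_query_layer() -> sqlite3.Connection:
    """Create one connection that can see two separate (hypothetical) domain stores."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates an independent in-memory database,
    # standing in for a store owned by a different domain.
    conn.execute("ATTACH DATABASE ':memory:' AS sales")
    conn.execute("ATTACH DATABASE ':memory:' AS support")

    conn.execute("CREATE TABLE sales.orders (customer_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE support.tickets (customer_id INTEGER, severity TEXT)")

    # Made-up sample rows, purely for illustration.
    conn.executemany(
        "INSERT INTO sales.orders VALUES (?, ?)",
        [(1, 120.0), (1, 30.0), (2, 75.5)],
    )
    conn.executemany(
        "INSERT INTO support.tickets VALUES (?, ?)",
        [(1, "high"), (2, "low"), (2, "low")],
    )
    return conn


def spend_and_tickets(conn: sqlite3.Connection) -> list:
    """Join across both stores in a single query, leaving the data where it lives."""
    sql = """
        SELECT o.customer_id, o.total_spend, COALESCE(t.ticket_count, 0)
        FROM (
            SELECT customer_id, SUM(amount) AS total_spend
            FROM sales.orders
            GROUP BY customer_id
        ) AS o
        LEFT JOIN (
            SELECT customer_id, COUNT(*) AS ticket_count
            FROM support.tickets
            GROUP BY customer_id
        ) AS t ON t.customer_id = o.customer_id
        ORDER BY o.customer_id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in spend_and_tickets(build_query_layer()):
        print(row)
```

The consumer writes one query and never needs to know there are two stores behind it, which is the user experience benefit a query layer is meant to give. Whether that beats copying the data depends on the workload, which is the balance described above.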

Data virtualization has been around as a concept for 30+ years, so the term carries a lot of baggage, but Dr. Abadi sees recent advancements that mean more people should take a second look at where data virtualization tools can be useful. He warns, though, to do your homework and really think through whether they fit your use case. A query fabric can make your user experience much more pleasant. Trying to create data products entirely within a data virtualization platform probably won’t, at least according to Scott.

Additional topics included retransmitting or reprocessing data, versioning, the importance of denormalizing data for analytics and how that plays with data virtualization, and much more. It is a really fascinating deep dive into the history of computing and how it impacts what we are trying to do today.

Dr. Abadi’s blog post on data federation and data virtualization: https://blog.starburst.io/data-federation-and-data-virtualization-never-worked-in-the-past-but-now-its-different

Dr. Abadi’s contact info:


Twitter: @daniel_abadi / https://twitter.com/daniel_abadi

Starburst blog posts: https://blog.starburst.io/author/daniel-abadi

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

Music used this episode created by Lesfm (intro includes slight edits by Scott Hirleman): https://pixabay.com/users/lesfm-22579021/

Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB
