#40 Getting Data-as-a-Product Right and Other Learnings From Adevinta’s Data Mesh Journey – Interview w/ Xavier Gumara Rigol

Provided as a free resource by DataStax AstraDB

Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here (info gated)

Scott interviewed Xavier Gumara Rigol, who has been helping lead Adevinta’s data mesh implementation as Area Manager for Experimentation and Analytics Enablement. They discussed the data as a product concept and learnings from Adevinta’s journey thus far. Xavi has published some great articles and gave a Data Mesh Learning meetup presentation; both are linked below.

One key aspect of data as a product is understanding the need for data product evolution, both in terms of maturity and in terms of what is consumed. This is a common theme in many data mesh conversations because, historically, data consumption has resisted evolution and change. Consumers need to really understand that the business is evolving, so what they consume will evolve too. If you manage data products well, the change won’t be sudden, but if the goal is to share insights into a domain, those insights will change as the domain does. When thinking about data product maturity, it’s totally okay to start by thinking of a data product as a single table or view.

Xavi also mentioned some pitfalls of forced data product evolution, e.g. getting it wrong can be quite costly because changes may need to be backfilled. Adding new attributes is easy, but computing something for the previous 3 to 6 months in hindsight can run up a lot of compute charges. To do evolution right, versioning and deprecation plans are key.
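To make the versioning and deprecation point a bit more concrete, here is a minimal sketch in Python. It is not Adevinta's implementation; the class `DataProductVersion`, the fields `deprecated_on` and `sunset_on`, and the product name `ads_daily` are all hypothetical names chosen for illustration. The idea is simply that each version carries its own deprecation and sunset dates, so consumers can see when they need to migrate.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DataProductVersion:
    """One published version of a data product (hypothetical structure)."""
    name: str
    major: int
    deprecated_on: Optional[date] = None  # date consumers are asked to migrate
    sunset_on: Optional[date] = None      # date the version stops being served

    def status(self, today: date) -> str:
        """Return 'active', 'deprecated', or 'retired' for the given day."""
        if self.sunset_on is not None and today >= self.sunset_on:
            return "retired"
        if self.deprecated_on is not None and today >= self.deprecated_on:
            return "deprecated"
        return "active"


def readable_versions(versions: List[DataProductVersion], today: date) -> List[DataProductVersion]:
    """Versions that can still be queried on a given day, newest first."""
    live = [v for v in versions if v.status(today) != "retired"]
    return sorted(live, key=lambda v: v.major, reverse=True)


# Illustrative data only: v1 is deprecated in January and retired in April.
versions = [
    DataProductVersion("ads_daily", 1,
                       deprecated_on=date(2022, 1, 1),
                       sunset_on=date(2022, 4, 1)),
    DataProductVersion("ads_daily", 2),
]
```

Keeping the old version readable during the deprecation window is what turns a breaking change into a gradual one for consumers.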

To get data as a product right, Xavi recommends starting by prioritizing which data you want to make available; this is a process, not a switch to flip. You should figure out which data is important for each domain and at the broader organization level.

Applying data as a product thinking to your data sets is easier said than done. While data mesh is a leading proponent of the idea, companies not doing data mesh can also use data as a product thinking – Adevinta started down this path before embarking on their data mesh journey.

For Adevinta’s data mesh journey, they started with every data product being a single table. Data was originally centrally managed so interoperability was already established. However, the documentation was lacking and the general usability wasn’t great. They spent their first few quarters just focusing on splitting their monolithic data production into separate pipelines for each domain instead of one giant cluster.

The giant cluster was becoming a major bottleneck as changes were hard and maintainability was getting harder every day. Now, each domain essentially has one data product but with multiple dimensions/tables. Each product is layered and each layer has different granularity and SLAs.
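As a rough illustration of a layered data product, the sketch below models one product per domain with several layers, each declaring its own granularity and freshness SLA. The class names, layer names, and SLA numbers are assumptions made up for this example, not details from the interview.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Layer:
    """One layer of a data product (hypothetical fields)."""
    name: str
    granularity: str          # e.g. "event" or "daily aggregate"
    freshness_sla_hours: int  # maximum promised staleness


@dataclass
class DataProduct:
    """A domain's data product made of several layers."""
    domain: str
    layers: List[Layer] = field(default_factory=list)

    def layers_meeting(self, max_staleness_hours: int) -> List[str]:
        """Names of layers whose SLA is at least as fresh as required."""
        return [layer.name for layer in self.layers
                if layer.freshness_sla_hours <= max_staleness_hours]


# Illustrative values only.
product = DataProduct(
    domain="marketplace",
    layers=[
        Layer("raw_events", "event", freshness_sla_hours=1),
        Layer("daily_metrics", "daily aggregate", freshness_sla_hours=24),
    ],
)
```

A consumer who needs data no older than a few hours could call `product.layers_meeting(4)` to find which layers are suitable.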

A few other notable points:

  • Xavi believes all data products should be accessible via SQL but definitely not only SQL.
  • Templates/blueprints for data products are incredibly useful and important.
  • The tooling/practices to prevent application changes from breaking the data are still very lacking – Adevinta uses data model reviews but it’s still not perfect.
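One way to complement manual data model reviews is an automated schema comparison. The sketch below is an assumption about how such a check could look, not something Adevinta is described as using. The function `breaking_changes` and the example column names are hypothetical. It treats removed columns and changed types as breaking, and added columns as safe.

```python
from typing import Dict, List


def breaking_changes(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Compare two {column: type} mappings and list changes that could break consumers."""
    problems = []
    for column, old_type in old.items():
        if column not in new:
            problems.append(f"removed column: {column}")
        elif new[column] != old_type:
            problems.append(f"type change on {column}: {old_type} -> {new[column]}")
    # Columns that exist only in `new` are additive and not reported.
    return problems


# Illustrative schemas only.
old_schema = {"ad_id": "string", "price": "decimal", "posted_at": "timestamp"}
new_schema = {"ad_id": "string", "price": "float", "category": "string"}
issues = breaking_changes(old_schema, new_schema)
```

A check like this could run in the application's build pipeline so that a failing comparison prompts a review before the change ships.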

Xavier’s Twitter: @xgumara / https://twitter.com/xgumara

Xavier’s LinkedIn: https://www.linkedin.com/in/xgumara/

Adevinta meetup presentation: https://www.youtube.com/watch?v=av6cT_r4orQ

Xavier’s Medium Articles:

https://medium.com/adevinta-tech-blog/building-a-data-mesh-to-support-an-ecosystem-of-data-products-at-adevinta-4c057d06824d

https://medium.com/adevinta-tech-blog/treating-data-as-a-product-at-adevinta-c1dce5d394c5

https://towardsdatascience.com/data-as-a-product-vs-data-products-what-are-the-differences-b43ddbb0f123

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/

If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here

All music used this episode created by Lesfm (intro includes slight edits by Scott Hirleman): https://pixabay.com/users/lesfm-22579021/

Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB
