#10 Ensuring Data Quality via Data Testing and Versioning – Interview w/ Jesse Paquette

{{top-bit}}

In this episode, Scott and Jesse Paquette, Chief Science Officer and Co-founder at Tag.bio (a data platform vendor in the life sciences space), dive deeper into data quality, with a focus on data testing and versioning.

You can see the LinkedIn post that sparked this discussion here

Jesse recommends a number of things to ensure data quality, especially data testing and versioning. This includes versioning 1) the code used to create the data (generally the ETL code), 2) the schema, and 3) the business logic layer, plus 4) timestamp/temporality-based versioning.
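To make the four kinds of versioning concrete, here is a minimal sketch of a record that ties one dataset snapshot to all of them. It is an illustration only, not something described in the episode and not how Tag.bio does it: the `DatasetVersion` class, its field names, and the `fingerprint` method are all hypothetical.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import hashlib
import json


@dataclass(frozen=True)
class DatasetVersion:
    """Hypothetical record linking a data snapshot to the four versioned things."""
    etl_code_version: str        # e.g. a commit hash of the ETL code
    schema_version: str          # version label of the schema
    business_logic_version: str  # version label of the business logic layer
    as_of: datetime              # point in time the snapshot represents

    def fingerprint(self) -> str:
        """Deterministic short ID derived from all four fields."""
        payload = asdict(self)
        payload["as_of"] = self.as_of.isoformat()
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()[:12]


# Made-up example values: only the schema version differs between the two.
snapshot_time = datetime(2022, 1, 1, tzinfo=timezone.utc)
v1 = DatasetVersion("a1b2c3d", "2.0", "1.4", snapshot_time)
v2 = DatasetVersion("a1b2c3d", "2.1", "1.4", snapshot_time)

print(v1.fingerprint() == v2.fingerprint())  # a schema change yields a new ID
```

Storing a record like this alongside each dataset means a consumer can tell which of the four things changed between two snapshots, rather than only that something changed.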

Jesse’s general calls to action are 1) build data testing frameworks so that testing is much less tedious and time-consuming; 2) work with stakeholders to gain their trust in the data, then continue the dialogue to keep that trust; and 3) create schema/domain model blueprints so that domains have a starting point. Whether a domain ends up using the blueprint is irrelevant, but shortening the path to a working domain model is crucial.
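As a rough idea of what "less tedious" data testing could look like, the sketch below declares checks once as named rules and runs them over every row. It is an assumption-laden illustration, not a framework mentioned in the episode: the `run_checks` helper, the rule names, and the sample rows are all invented for this example.

```python
from typing import Callable, Dict, Iterable, List

Row = dict
Check = Callable[[Row], bool]


def run_checks(rows: Iterable[Row], checks: Dict[str, Check]) -> Dict[str, List[int]]:
    """Return, for each named check, the indices of the rows that fail it."""
    failures: Dict[str, List[int]] = {name: [] for name in checks}
    for index, row in enumerate(rows):
        for name, check in checks.items():
            if not check(row):
                failures[name].append(index)
    return failures


# Hypothetical rules for a made-up patient table.
checks = {
    "patient_id is present": lambda row: bool(row.get("patient_id")),
    "age is in a plausible range": lambda row: (
        isinstance(row.get("age"), int) and 0 <= row["age"] <= 120
    ),
}

rows = [
    {"patient_id": "P001", "age": 54},
    {"patient_id": "", "age": 61},       # fails the patient_id rule
    {"patient_id": "P003", "age": 430},  # fails the age rule
]

report = run_checks(rows, checks)
print(report)
```

Because the rules are plain named functions, the same report can be shown to stakeholders as a readable list of what was checked and which rows failed, which supports the trust-building dialogue in point 2.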

Jesse’s contact info:

Email: jesse at tag.bio

LinkedIn: https://www.linkedin.com/in/jessepaquette/

Twitter: @bzdyelnik / https://twitter.com/bzdyelnik

Website: https://tag.bio/

Tag.bio vendor interview for Data Mesh Learning: https://www.youtube.com/watch?v=acQADu7ttqQ

{{bottom-bit}}
