Provided as a free resource by DataStax AstraDB
Data Mesh Radio Patreon – get access to interviews well before they are released
This episode is part of the Data Innovation Summit Takeover week of Data Mesh Radio.
Data Innovation Summit website: https://datainnovationsummit.com/; use code DATAMESHR20G for 20% off tickets
Scott interviewed Jarkko Moilanen, a Data Economist and Country CDO ambassador for Finland. Jarkko will be presenting on “Data Monetization and Related Data Value Chain Requires Both Data Products and Services” on May 6th in track M6.
Jarkko approaches measuring the value of data, and then extracting that value, from many different angles. He is thinking about data products, data as a service, data as a product, etc.
Per Jarkko, treating data like a product can apply a LOT of the learnings from the API revolution – this time around we can skip a lot of the sharp edges. APIs are about an interface to value creation – how can we treat data products the same way?
We discussed the difference between return and return on investment. A data initiative may have a very high return, but if the investment to get that return is too high, it’s a bad initiative. How do we figure out what quality level we need to solve our challenges? There is no reason to go for five-nines quality if that doesn’t move the needle.
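To make the return versus return on investment distinction concrete, here is a minimal sketch using the standard ROI formula, (return - investment) / investment. The function name and all figures are hypothetical illustrations, not numbers from the episode.

```python
def roi(total_return: float, investment: float) -> float:
    """Return on investment as a fraction: (return - investment) / investment."""
    if investment <= 0:
        raise ValueError("investment must be positive")
    return (total_return - investment) / investment


# Hypothetical figures for illustration only (not from the episode).
# A large initiative: high absolute return, but nearly as expensive to deliver.
big_initiative = roi(total_return=1_000_000, investment=900_000)

# A small initiative: modest absolute return, but cheap to deliver.
small_initiative = roi(total_return=200_000, investment=50_000)

print(f"big initiative ROI:   {big_initiative:.0%}")
print(f"small initiative ROI: {small_initiative:.0%}")
```

Under these assumed numbers, the initiative with the larger absolute return has by far the lower return on investment, which is the trap the discussion warns about.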
Jarkko coined a new concept on the call – the half-life of data value. For a large percentage of data, Jarkko believes the value of the data starts to fall considerably over a relatively short period of time. How can we extract the value when it is most valuable, whether the half-life is weeks, days, hours, or even less? And how do we set ourselves up to get the most “bang for the buck”?
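One way to picture a half-life of data value is to borrow the exponential decay model that the term "half-life" usually implies. This is an assumption for illustration: Jarkko did not specify a formula on the call, and the function name, the starting value, and the 7-day half-life below are all hypothetical.

```python
def remaining_value(initial_value: float, age: float, half_life: float) -> float:
    """Value left after `age` time units, assuming exponential decay.

    ASSUMPTION: value halves every `half_life` time units. `age` and
    `half_life` must use the same unit (days, hours, ...).
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    if age < 0:
        raise ValueError("age must not be negative")
    return initial_value * 0.5 ** (age / half_life)


# Hypothetical data set: worth 100 units when fresh, with a 7-day half-life.
for age_days in (0, 7, 14, 28):
    value = remaining_value(initial_value=100, age=age_days, half_life=7)
    print(f"day {age_days:>2}: {value:.2f}")
```

With these assumed inputs, half of the value is gone after one week and three quarters after two weeks, so most of the extractable value sits in the first few half-lives.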
Jarkko is firmly in the camp of intentionality when it comes to data. We can’t keep betting on “this data might have value” or collecting data for the sake of collecting it. Cleansing data after the fact is difficult – what was the context at the time? Can you enrich the data further? Etc. – and the cost to do so is typically quite high compared to the value. And you keep incurring costs just to keep data around. If you subscribe to his data half-life theory, the value diminishes quickly, so stop keeping around so much data you aren’t using!
Jarkko’s data economy model has three layers that he adapted from the API economy:
- The bottom layer is private, internal-only use within the organization. In Jarkko’s view this is typically for organizations that have low data maturity and don’t have the capabilities to productize their data. If they can move toward productizing, it will enable reuse – not just for themselves but potentially for third parties.
- The middle layer is to have closed sharing agreements with other organizations, creating data sharing ecosystems. These ecosystems are typically very limited in the number of other organizations involved. These are often also about creating value for a joint purpose, e.g. two suppliers sharing data specifically to meet a customer need. To do this, you need to productize your data enough to make it generally understandable and relatively easy to use.
- The top layer is the completely public data marketplace layer. Organizations participating at this layer are packaging data or even algorithms for sale.
Jarkko also has four key elements that define a data product – whether a data mesh type data product or not – in his mind:
- The technical data flow layer – how the data is processed/created/handled by the underlying infrastructure
- The business plan layer – descriptions, plans for the data; really, what is the business objective of the data product
- The legal layer – what are the conditions for using the data
- The ethical layer – while this is becoming more important in the AI space, we should think about ethical use in all data products
Jarkko talked about trust as a measurement for value – if you can’t trust data, its value is significantly less. Lack of trust can significantly raise the total cost, as consumers have to work to enrich the data, and thus lowers the value of the data. Jarkko has been working on a trust index for data products, which is especially applicable in a data exchange scenario.
For Jarkko, there are 3 key things to managing data:
- First, treat every bit or set of data as if you’d share it externally. That means enrich it, make it trustable, usable, secure, etc. Data has a habit of going external in some way.
- Second, make your data actually usable in your scenario – what level of data literacy do you have so you know what bar you have to meet? How can you find that core 80% in a 10/80/10 split that will drive insights with data?
- Third, have a ready-made toolkit to mock data products at the business layer with consumers. This is more about the process than tooling, but have a set of canvases so you can share ideas about new data products and get good feedback. That good feedback from users before creating a data product is very useful.
Jarkko summarized his thoughts: let the business people lead the way. If they aren’t enabled to lead, we need to educate them so they can leverage the data.
Jarkko’s LinkedIn: https://www.linkedin.com/in/jarkkomoilanen/
Open Data Product Spec: http://opendataproducts.org/#open-data-product-specification
Jarkko’s Data Product Business website: https://www.dataproductbusiness.com/
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used in this episode was created by Lesfm (intro includes slight edits by Scott Hirleman): https://pixabay.com/users/lesfm-22579021/
Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB