Data Mesh Radio Patreon – get access to interviews well before they are released
Episode list and links to all available episode transcripts (most interviews from #32 on) here
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info gated) here.
Andrew’s LinkedIn: https://www.linkedin.com/in/andrewpease123/
In this episode, Scott interviewed Andrew Pease, Field CTO of North Europe at Salesforce. To be clear, he was only representing his own views on the episode.
Some key takeaways/thoughts from Andrew’s point of view (mostly written by him):
- Sensitizing people to data and improving their data fluency can be a challenge. Lots of people have had some less than perfect past experiences – perhaps a dry, abstract class has given them “statistics trauma”. It’s important to make it digestible for them to get started.
- Organizations typically evolve into silos so IT systems/approaches often evolve into silos too – Conway’s Law. The bigger those organizations and silos are, the harder they are to bridge / the deeper the divides.
- Much as we’d like one, there is not a single silver bullet architecture for all organizations to overcome these silos.
- Without relevant IT architectures and processes, it can be challenging to put relevant and timely data and actionable insights into the business people’s workflows. You won’t get it “perfect” the first time, but get started and learn to improve through experience.
- You should reiterate to people that data is there to augment their role, not to replace it. It’s there to help them be more efficient and successful in their work. That’s a key part of data fluency, not just understanding how to use data but where data can help.
- Feedback loops are very important to increase data quality levels and data value. It’s important to build in these loops to make end-users feel like they are a part of a constant and never-ending improvement exercise. It shouldn’t be a big burden but data quality is a team sport.
- It’s important for data consumers to understand not only the potential of data and analysis, but also the limitations. E.g. you can’t reliably score lead quality from simply a person’s name and their email address. The data needs to be representative enough to find useful patterns.
- AI should be perceived in the enterprise as augmented intelligence – it is there to make the human in the loop better, not replace them.
- It is crucial to inform operational teams, the data producers, about what data might be needed in the future, not just now. And then incentivize those data producers to actually create and maintain quality data products. If all we do is ask, it likely gets lost in the operational “priorities”.
- Anecdotal feedback on what data is being used and is useful is great. But it’s not going to tell the actual full story. Make sure to create ways to track usage and measure impact of data work.
- Data hackathons can be great ways to set up some cross domain collaboration and improve data fluency but also knowledge of other domains and the organization as a whole.
- It’s vital to figure out how to get people excited about data, in combination with incentivizing them to do so in appropriate contexts.
- IT and the business side need to meet and collaborate in order to make data a crucial and embedded aspect of everyone’s roles.
- As always, communication is crucial, especially around reorganization of data teams and competencies. A clumsy reorg will certainly alienate – and possibly infuriate – people.
- “The most complex system that we have in our organizations isn’t a computer, it’s the people who are operating the computers.” When we think about the composable enterprise, we need to think about the humans in the loops and how and where they interface.
- Look to have a standardized way to bring people to better data fluency. Many different roles have budgets for ongoing training in their field; everyone should have that for data, and it should be part of any organization’s new employee ramp period too.
Andrew started off by discussing the general way that organizations evolve. It’s pretty natural for most to evolve into silos, and the larger the organization, the deeper the divides between the silos and the harder it is to bridge those divides. Per Conway’s Law, IT systems/approaches then often develop into silos as well. It takes a lot of intentionality to prevent evolving into silos or to lessen divides that have already formed. And there is no “silver bullet architecture” to overcome the challenges silos create or undo the silos.
One of the big dreams of being data driven is putting timely and actionable data – “what do you want to tell them?” – in the workflows of business people. But, according to Andrew, many organizations attempting to do that treat it as an all-or-nothing goal, and that’s just not reasonable. You won’t get it “perfect” at the start, and that’s okay; it is still worth doing. As part of that process, it can be very important to reiterate that data is there to help people, not replace them – AI should mean augmented intelligence, there to help the human in the loop be better.
There are two major opposing forces regarding data quality in Andrew’s view. First, you never get a second chance to make a first impression, so your data quality has to be up to a certain level before you show it to potential consumers. But conversely, the only way to get to actual quality data – essentially what matters, why it matters, and what quality levels are acceptable – is to get data in front of consumers and then iterate towards the required quality. Feedback loops are crucial to actual data quality so you can optimize for what matters. Your data consumers must understand that data quality is a team sport, so they need to participate too.
Andrew brought up his concept of “statistics trauma” when discussing improving people’s data fluency – essentially, many have a bitter taste from past statistics/math and/or data related work/school. So to get execs more data driven, you need to sensitize them to data, but carefully. That falls to the CDO; it can be challenging but is quite rewarding when it works. It’s as much about communication as anything else in data.
In data, Andrew believes there need to be far more bi-directional conversations. Data consumers need to tell data producers what they need, and that can include data that doesn’t exist yet, so the producers need to start capturing it. The earlier a data consumer can tell a data producer about their needs, the more likely they will get what they want down the line. Data mesh helps there because a central team is no longer trying to understand requests and relay them to the producers. By cutting out the data team in the middle, you have a better chance of getting to what data consumers want more quickly. But we can’t lose sight of something that many seem to overlook – we can’t just inform data producers of what we want them to produce and maintain, we need to properly incentivize and enable them to do so.
In Andrew’s view, there is obviously value in collecting feedback on what data is viewed as valuable. But it’s going to have bias – essentially it’s valued but might not be valuable – so you should develop more concrete ways to measure what data work is useful and valuable. We should track what is being used but also how well what we thought would be valuable actually performed – that way we might better know what additional data might drive incremental value. Your feedback loops should include both quantitative and qualitative measurement where possible.
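As a rough illustration of pairing quantitative and qualitative signals, here is a minimal Python sketch. Everything in it is hypothetical and not from the episode: the product names, the access log, the survey scores, and the `usage_report` helper are made up for illustration. It counts tracked accesses per data product, ranks products by actual usage, and compares that rank with where the team expected each product to land.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DataProduct:
    name: str
    expected_rank: int    # where the team expected it to land (1 = most valuable)
    survey_score: float   # average consumer rating, 1-5 (qualitative feedback)


def usage_report(products, access_log):
    """Combine tracked usage with survey feedback for each data product."""
    counts = Counter(access_log)  # accesses per product name
    # Rank by actual usage, most used first.
    by_usage = sorted(products, key=lambda p: counts[p.name], reverse=True)
    report = []
    for actual_rank, product in enumerate(by_usage, start=1):
        report.append({
            "name": product.name,
            "accesses": counts[product.name],
            "survey_score": product.survey_score,
            "expected_rank": product.expected_rank,
            "actual_rank": actual_rank,
            # Positive gap: used more than expected; negative: less.
            "rank_gap": product.expected_rank - actual_rank,
        })
    return report


# Hypothetical example data, purely illustrative.
PRODUCTS = [
    DataProduct("churn_scores", expected_rank=1, survey_score=4.5),
    DataProduct("lead_history", expected_rank=2, survey_score=3.0),
    DataProduct("web_clicks", expected_rank=3, survey_score=4.0),
]
ACCESS_LOG = ["lead_history"] * 5 + ["churn_scores"] * 2 + ["web_clicks"]

for row in usage_report(PRODUCTS, ACCESS_LOG):
    print(row)
```

In this made-up data, a product that consumers rate highly can still see little use, and the reverse, which is the gap between "valued" and "valuable" described above. A real setup would also need some measure of business impact, which raw access counts do not capture.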
If you make ultimatums around data usage – you’re either with us as a data user or you’re against us – you won’t get buy-in per Andrew. Mandates just don’t get the buy-in some people believe they do. So you need to work to figure out why someone is not leveraging data. Again, make it less intimidating and make it rewarding. If you threaten people – do this or we’ll fire you – you will simply get people adhering to the letter instead of the spirit. The goal is to make using data useful AND fun. Gamifying learning about data and running data hackathons are two great ways to accomplish that.
Around data, if we want a “yin yang synergy” between business and IT, both parties have to meet the other _more_ than halfway in Andrew’s experience. Both sides have to be willing to partner to improve. There isn’t a silver bullet way to accomplish it, but embedding IT in the business and vice versa can certainly help, as can rotating people across different business units. However, if you are in a decentralized organization, it’s very important to make sure you share best practices.
Andrew said, “the most complex system that we have in our organizations isn’t a computer, it’s the people who are operating the computers.” There is a major change in the way our brains work between learning something and trying to get a point across. Some people are good at switching between those quickly – e.g. in a meeting – but many aren’t, and it’s important not to leave them behind. So communication is crucial to get right, and you need to think about the broad group you are trying to work with. Sometimes data should be brought into the discussion to make a point, but sometimes the discussion should purely be about increasing data fluency.
It’s easy to try to focus on hiring for data skills in many roles, but really, every organization should at least consider data training as part of new employee training, according to Andrew. Obviously don’t forget existing employees, but immersing people in data, especially the data of the organization, from the start pays off in the long run.
Data consumers need to understand what is actually possible with data. E.g. lead scoring based on a person’s name and email address is not a reasonable request.
Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/
If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here
All music used in this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under “add payment”): AstraDB