I was delighted to welcome Covalent Bonds’ data scientist Marcela Quiros Lepiz and former marketing data analyst Michelle onto the Talk Life Sciences podcast to get a clearer picture of how data accumulation and analysis intersect with my own practices as part of the marketing team. We had a frank and open discussion about what every data scientist wishes a marketer knew about conducting data-driven marketing, and about the mistakes and misunderstandings that can occur.
As you’ll hear, there are places where my own assumptions about data scientists’ work and process needed to be corrected. This kind of open and frank dialogue is necessary in order to achieve a better, fuller understanding and appreciation of one another and of one another’s work.
Analyst vs. Scientist
To begin our efforts at clarification and greater understanding, I put a very basic question to my guests: What is the actual difference between a data scientist and a data analyst?
“It means Marce is much smarter than me,” Michelle says with a laugh.
Self-deprecation aside, Marce and Michelle explain that a data analyst creates hypotheses and looks to visualize and dissect the data they receive, while the scientist tests those hypotheses and creates models by which the data is measured and parsed.
Both halves are necessary to get the fullest possible picture of what gathered data is saying, and then to build predictions and actions from the portrait that the data conveys.
Rethinking the Process
But how early should marketing bring scientists and analysts into the conversation as part of a project’s development? If Marce and Michelle had their way, it would be right from the very start.
A key element to both gathering data and then utilizing that data to the fullest extent is setting clear, comprehensive goals. You have to know what you are looking for before you begin looking for it. This might sound self-explanatory, but I promise you that many a marketer enters into testing and polling without those clear parameters built, and so the results are next to impossible to decipher.
This was a major point in my recent conversation with Michael Allen, VP of Marketing at Metrohm. Michael also stressed the importance of having well-established goals, along with well-reasoned expectations, from the very beginning of a sales and/or marketing campaign.
As Marce and Michelle elaborate in this most recent conversation, these systems of gathering and analysis become locked in once you begin using them. It is difficult, if not downright impossible, to course-correct once the process is in full swing. You essentially have to throw everything out and start from scratch, a prospect that nobody enjoys.
Bad Types of Data
Along those lines, I asked Marce and Michelle to clarify the difference between data that is “dirty” versus data that is “bad”.
“Dirty” data is data which cannot be read or understood by a computer because of how it has been stored within a given system. An example given is a list of phone numbers being listed as numeric amounts, throwing off a computer’s entire ability to make correct calculations. Dirty data encompasses everything from grammatical mistakes to data being attributed to the wrong field, to values that have been duplicated.
But “dirty” data might still be of use. It has to be identified and cleansed of whatever mistakes are prohibiting it from being read properly, but it does have that potential.
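To make the phone number example concrete, here is a minimal sketch of what that cleansing step might look like. It is an illustration only, not something Marce or Michelle walked through on the podcast: the contact records are made up, the helper names `clean_phone` and `clean_contacts` are hypothetical, and the rule that a valid number has exactly ten digits is an assumption chosen for the example.

```python
import re


def clean_phone(raw):
    """Return a phone number as a 10-digit string, or None if it cannot be recovered.

    Assumption for this sketch: a valid number has exactly ten digits.
    """
    # A phone number stored as a numeric amount often arrives as a float,
    # e.g. 5551234567.0; convert it back to whole digits first.
    if isinstance(raw, float) and raw.is_integer():
        raw = int(raw)
    digits = re.sub(r"\D", "", str(raw))  # drop spaces, dashes, brackets
    return digits if len(digits) == 10 else None


def clean_contacts(rows):
    """Normalize phone numbers, drop unreadable rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        phone = clean_phone(row["phone"])
        if phone is None or phone in seen:
            continue  # unrecoverable value or a duplicate entry
        seen.add(phone)
        cleaned.append({"name": row["name"].strip(), "phone": phone})
    return cleaned


# Made-up records showing three kinds of dirt: a number stored as a
# numeric amount, a duplicate in a different format, and a missing value.
contacts = [
    {"name": " Ana ", "phone": 5551234567.0},
    {"name": "Ana", "phone": "(555) 123-4567"},
    {"name": "Ben", "phone": "n/a"},
    {"name": "Cho", "phone": "555-987-6543"},
]

cleaned_contacts = clean_contacts(contacts)
```

After cleaning, only two usable contacts remain. The duplicate and the unreadable row are gone, and every phone number is stored as text rather than as an amount.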
“Bad data” is data that is so incomplete or negligible that it is functionally useless. This is the kind of data that is rife when testing and campaigning are undertaken without the aforementioned clear and comprehensive goals as a backing structure. You can easily end up gathering plenty of information, but it will be information that does not actually say anything.
We may want to isolate individual pieces of data and try to develop deeper meaning around them. But that goes against all best practices of data gathering and reading, a principle that I can articulate with five words.
The Golden Rule
As we discuss the different perspectives that various departments might bring to interpreting the numbers and results gleaned from testing, I have to confess that, like many a marketer, I have a tendency to leap to conclusions based on incomplete pieces of data. I assume that the data point that seems most important to a given result really is the most important.
Michelle and Marce understand this impulse, but they strongly caution against letting that impulse drive decision making.
Here are those five words I mentioned in the last section: Correlation does not equal causation. There is your golden rule. Correlation does not equal causation. You cannot just cherry-pick data points and insist that these alone are key to understanding a given question.
Michelle directs us to a website which uses humorous graphs to illustrate these fallacies, like the divorce rate in Maine being related to the consumption of margarine, or the number of Nicolas Cage movies per year relative to the number of drownings per year.
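The same trap is easy to reproduce with a few lines of code. The sketch below is an illustration only: the two series are invented numbers, not real statistics, and the `pearson` helper is a hand-written version of the standard Pearson correlation coefficient. Both series simply happen to rise over the same six months.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented monthly figures for illustration only.
webinar_signups = [120, 135, 150, 170, 185, 200]
office_coffee_kg = [10, 11, 13, 14, 16, 17]

r = pearson(webinar_signups, office_coffee_kg)
print(f"correlation: {r:.2f}")
```

The coefficient comes out close to 1, yet nobody would argue that buying more coffee drives webinar signups. Two numbers that trend in the same direction will correlate strongly whether or not one has anything to do with the other.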
Wrap Up
We finish our conversation with an affirmation that while our responsibilities and practices within the organization may be different, Marce, Michelle, and I, along with everyone from marketing through to the various data sciences, are all on one team. And as one team, we are always going to put together our best work when we work together.
If marketing brings the data team in right from the start and goes about its projects with clear goals and reasonable objectives, there is no limit to how much we can learn, and how drastically we can improve not only our sales process but the experiences of all our partners and clients.
It was such a wonderful, meaningful experience to sit down with these two terrific ladies and get a true chance to understand their side of the business in a way I never have before.
Join us for a conversation that’s as fun and spirited as it is informative.
Find the related blog: Three Things Every Data Scientist Wishes a Marketer Knew Before Starting Data-Driven Marketing