Does it really matter if you round your model results?

I worked with a fantastic marketing EVP who adored analytics and used it to drive her decision-making despite not being especially math-savvy. She quickly discovered that not all insights were equally strong. Because of differences in assumptions, measurement error, or dirty data, some insights were much better than others. She asked me to color-code the insights coming from my team with my assessment of how strong each one was. I was honored by her trust in me to net out the confidence of each insight, but I also felt the responsibility of taking all the uncertainty of an analysis and communicating it with a single color. It was challenging.

In recent years, I’ve recognized that I continue to do this communication – maybe not as directly with color coding, but with the rounding and format of my charts and insights. I try not to present data as more exact or precise than it actually is, and this helps set expectations for the viewers.

Take, for example, this picture of a speed limit sign from a downtown street. It’s a bit silly: we all know that vehicle speeds aren’t measured that precisely. Yet when you do an analysis and show a number like 5.2435, the recipient thinks that you have a very exact answer, and it doesn’t matter if elsewhere on the screen you say ‘results are accurate to +/- 1%’. Rounding is a conscious choice, and the rounding that you choose communicates your confidence in the precision of the answer. To an analyst, model outputs are sometimes just a number, but to a business person, it matters whether you round to the nearest penny, dollar, or thousand dollars. It communicates your certainty in your analysis.
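One way to make that choice intentional is to let the uncertainty decide how many digits you show. The sketch below is only an illustration, not a standard: the function name round_to_uncertainty is made up for this post, and it assumes you already have a rough estimate of the error in the same units as the value.

```python
import math


def round_to_uncertainty(value: float, uncertainty: float) -> float:
    """Round value to the decimal place of the uncertainty's leading digit.

    Hypothetical helper for illustration; assumes uncertainty is a rough,
    positive error estimate in the same units as value.
    """
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Power of ten of the uncertainty's leading digit, e.g. 0.05 -> -2.
    exponent = math.floor(math.log10(uncertainty))
    # round() takes the number of decimal places, so flip the sign.
    return round(value, -exponent)


# 5.2435 with an error of about 1% (roughly 0.05) is shown to two decimals.
print(round_to_uncertainty(5.2435, 0.05))
# A revenue forecast that is only good to about 20,000 is shown to the
# nearest ten thousand.
print(round_to_uncertainty(1234567.89, 20000))
```

The point is not the exact rule but that the number of digits on the slide comes from the error estimate rather than from whatever the model happened to print.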

Since realizing this, I have tried to always be intentional in my rounding and consider in graphs what I show and what it means. The better your communication, the better your chance to make an impact!

Chief Data/Analytics Officer

I’ve always believed that more companies should have a Chief Data/Analytics Officer. Having separate analytics departments, or lacking a C-level title, can make it harder for analytics leaders to do their job as truth tellers, because they are forced not only to navigate but to yield to the politics and interests of the C-levels in the company in order to have a seat at the table. I’m excited to hear that more companies are moving in that direction. Also, 70% of these new Chief Data/Analytics Officers have backgrounds in math. I really want to go show this to that academic adviser who told me that a career in math was only useful for those who wanted to teach. :) Analytics is definitely integrating well into the business world these days.


Big Data Bigger Flops – Lessons Learned from Big Data Projects

Natalie has been an analyst and designer on multiple big data projects — some successful and some not. In this talk, she’ll take a trip down memory lane on some of the NOT-so-successful projects, exploring her past learning experiences and discussing what she’s found is needed to make big data projects successful — and things to avoid. We’ll discuss visionary goals, prioritization, architecture, build versus buy, and how to educate the rest of your team on big data concepts.


Is third party data really that bad?

The research below raises the question of the value of third party data and its worth for marketing campaigns. I’ve always been a fan of the data, even when it’s not perfect. This just means one should 1) really evaluate whether third party data is the right approach for the business case and need, recognizing the inaccuracies before jumping in, and 2) test it on a small scale with certain data providers to see if it’s money worth spending.

Third party data is so often questioned for its value, and oftentimes the statistic of 35% inaccuracy in determining whether the subject is male or female is thrown around. But where does that number come from, and how much bearing does it have on the accuracy of third party data? In 2012 one anonymous ad tech exec told Digiday “the gender is wrong 30-35 percent of the time,” and that statistic has been spreading through the analytics market like wildfire ever since.

Marketers are then commonly forced to ask: How well does that number really hold up? How inaccurate is third party data really? Is the price I am paying for third party data worth it?

During the Digiday Programmatic Summit in November of 2016, Matt Rosenberg of ChoiceStream went over the pressure third-party data sellers are under to deliver scale, and how oftentimes accuracy is thrown to the wind in order to meet that need. “Advertisers need scale, and as a data vendor, if you can’t provide that, no one will buy your segment,” he said.

Rosenberg put it like this: “If you can get 300,000 people in a group with 95 percent confidence that they belong there, or 30 million people in a group with 60 percent confidence, well, it might not be such a hard decision to relax your model a bit, especially when no one is set up to audit you.”
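To see why that trade-off is tempting, it helps to run the numbers. The sketch below assumes, purely for illustration, that “confidence” can be read as the share of people who truly belong in the segment; the function name expected_members is made up for this example.

```python
def expected_members(segment_size: int, confidence: float) -> tuple[int, int]:
    """Return (expected correct, expected incorrect) members of a segment.

    Illustrative assumption: confidence is treated as the fraction of the
    segment that truly belongs there.
    """
    correct = round(segment_size * confidence)
    return correct, segment_size - correct


# The two segments from Rosenberg's quote.
for size, confidence in [(300_000, 0.95), (30_000_000, 0.60)]:
    correct, incorrect = expected_members(size, confidence)
    print(f"{size:,} people at {confidence:.0%}: "
          f"{correct:,} right, {incorrect:,} wrong")
```

Under that reading, the looser segment holds about 18 million people who belong there against 285,000 in the strict one, but it also carries about 12 million who do not, versus 15,000. The vendor gets paid for scale, and the advertiser pays for the misses.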

In a study done by ChoiceStream, where Rosenberg was once CMO, it was found that a particular data vendor had identified 84 percent of users as both male and female, much higher than the “35%” that is usually thrown around. While this could easily be seen as an outlier, ChoiceStream took the time to examine the two vendors that were least likely to identify people as both male and female. After getting the third-party data directly from the vendors and syncing it across datasets, it still found that about a third of the time the two vendors disagreed on what gender an individual was.

Imperfect data leads to imperfect analysis

I’m a big believer in looking at data even when it’s imperfect to see if you can gain insights, since some data is better than nothing. But it’s important to be realistic and think of it as an indicator to test rather than a TRUTH to build on. I thought this article did a good job of pointing out several potential flaws that I’ve seen occur in my career.

The three legged stool of data science

“Data Science is a three-legged stool that combines business acumen, data wrangling and analytics to create extreme value. Focusing on the hard science skills such as statistical methods is a common mistake when actually, developing the knowledge about a particular business and wrangling the relevant data are often the most important skills to bring to the table.”

In my experience, failure of data science and advanced analytics projects most often results from a lack of business understanding or a lack of clean data. These skills are often undervalued when searching for data scientists, which can result in a diminished or absent ROI on big data implementations and projects.

Christmas Carol as created by an AI

I’m not sure that AIs are really ready to create art that I can appreciate, but it has to start somewhere. Here’s a first attempt by an AI (a recurrent neural network) at creating a Christmas carol from an inspirational picture and over 100 hours of music. Give it a listen and see what you think of it.