Introductions: Who are we?
Martin Frigaard: Martin is a science and data evangelist providing analytic tools and skills to various audiences including medical professionals, engineers, product managers, and journalists. He received his leadership and team-building experience in the US Army as a non-commissioned officer and has a master's degree in clinical research from the University of California, San Francisco. He’s worked with various universities, non-profits, and private companies.
Peter Spangler: Entrepreneurially minded data science leader with experience building analytic solutions, insights, and teams at Lyft, Citrix, and Alibaba Group. Led experimentation design and data science projects focused on retention, user acquisition, and channel optimization in the SaaS and rideshare spaces. I have produced solutions for incrementality testing, segmentation, ML models, and fraud. I am a passionate advocate for building analytics teams in cross-functional environments and believe communication is core to any analytics program.
Agenda
“Computation Is Not Decision Making”
We need to ask the right questions of our data for our models to add the most business value
“Visualization is our most powerful tool”
We can surface actionable insights and areas of greatest impact by exploring relationships in our data
“Sizing supports our stakeholders”
Uncovering the drivers of our business problem will inform necessary partnerships for action
Computation Is Not Decision Making (1)
Improving business decisions
Using data effectively requires both data and domain knowledge (neither alone will suffice).
– click “S” to view speaker notes
Computation Is Not Decision Making (2)
Set the stage
Be able to articulate, “what problem are we facing?”
Computation Is Not Decision Making (3)
Know the characters in the story
Understand what’s been measured, i.e. “what are the data?”
Computation Is Not Decision Making (4)
Connect the business problem to a measurable objective
The hardest part of data science is translating a problem into a question that data can answer (and then finding those data).
Example measurable objective: “Identify predictors for customer churn.”
Visualizing the process
Step 1: Look at your data
Step 2: Get some numbers
Step 3: Make a graph
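The three steps can be sketched in a few lines of code. This is a minimal illustration using only the Python standard library. The records, field names (`tenure_months`, `payment`, `churned`), and helper functions are hypothetical and invented for this example. They are not the data or tooling used in the talk.

```python
from collections import Counter

# Hypothetical toy records: field names and values are invented for illustration.
customers = [
    {"tenure_months": 2, "payment": "electronic check", "churned": True},
    {"tenure_months": 5, "payment": "electronic check", "churned": True},
    {"tenure_months": 14, "payment": "credit card", "churned": False},
    {"tenure_months": 30, "payment": "bank transfer", "churned": False},
    {"tenure_months": 48, "payment": "credit card", "churned": False},
    {"tenure_months": 9, "payment": "mailed check", "churned": True},
]


def head(rows, n=3):
    """Step 1: look at the raw records before summarizing anything."""
    return rows[:n]


def churn_rate(rows):
    """Step 2: reduce the records to a single number."""
    return sum(r["churned"] for r in rows) / len(rows)


def text_bar_chart(counts, width=20):
    """Step 3: draw one text bar per category, scaled to the largest count."""
    top = max(counts.values())
    return [
        f"{label:<18}{'#' * round(width * n / top)} {n}"
        for label, n in sorted(counts.items())
    ]


for row in head(customers):
    print(row)
print(f"overall churn rate: {churn_rate(customers):.2f}")
payment_counts = Counter(r["payment"] for r in customers)
print("\n".join(text_bar_chart(payment_counts)))
```

A text bar chart stands in for a real plot here so that the sketch has no dependencies. In practice a plotting library would take its place.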
Visualize Your Hypotheses
Exploratory data analysis speeds up discovery and surfaces insights early
We should be able to connect an insight with a mechanism for validation
Customer segmentation can help us identify who is impacted most
Visualize Your Hypotheses
Churn is highest among customers in their first 20 months after purchase
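One way to check a tenure claim like this is to bucket customers by tenure and compare churn rates across buckets. The sketch below assumes each customer is reduced to a hypothetical `(tenure_months, churned)` pair. The toy values are invented for illustration and do not reproduce the chart on this slide.

```python
from collections import defaultdict


def churn_rate_by_tenure_bucket(rows, bucket_months=20):
    """Return {bucket_start_month: churn_rate} for fixed-width tenure buckets.

    A tenure of exactly `bucket_months` falls into the second bucket,
    so the first bucket covers months 0 through bucket_months - 1.
    """
    totals = defaultdict(int)
    churned = defaultdict(int)
    for tenure_months, did_churn in rows:
        start = (tenure_months // bucket_months) * bucket_months
        totals[start] += 1
        churned[start] += int(did_churn)
    return {start: churned[start] / totals[start] for start in sorted(totals)}


# Hypothetical (tenure_months, churned) pairs, invented for illustration only.
toy_rows = [
    (1, True), (3, True), (12, False), (19, True),
    (25, False), (33, True), (38, False), (39, False),
    (45, False), (60, False),
]

bucket = 20
rates = churn_rate_by_tenure_bucket(toy_rows, bucket_months=bucket)
for start, rate in rates.items():
    print(f"{start:>2}-{start + bucket - 1} months: {rate:.0%}")
```

The bucket width is a choice, not a given. Trying a few widths helps confirm that the pattern is not an artifact of where the boundaries fall.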
Visualize Your Hypotheses
Customers paying by electronic check are more than twice as likely to churn as customers using other payment methods.
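A "times as likely" statement is a ratio of two churn rates: the rate inside the group divided by the rate among everyone else. The sketch below shows that arithmetic on hypothetical `(payment_method, churned)` pairs. The toy data is constructed to give a ratio of exactly 2 and is not the data behind this slide.

```python
def churn_rate(rows):
    """Share of rows whose churn flag is True."""
    return sum(did_churn for _, did_churn in rows) / len(rows)


def relative_churn_risk(rows, group):
    """Churn rate inside `group` divided by the churn rate of everyone else.

    Raises ZeroDivisionError if either side is empty or if nobody
    outside the group churned; a real analysis would handle those cases.
    """
    inside = [r for r in rows if r[0] == group]
    outside = [r for r in rows if r[0] != group]
    return churn_rate(inside) / churn_rate(outside)


# Hypothetical (payment_method, churned) pairs, invented for illustration only.
toy_rows = [
    ("electronic check", True), ("electronic check", True),
    ("electronic check", False), ("electronic check", False),
    ("credit card", True), ("credit card", False), ("credit card", False),
    ("bank transfer", True), ("bank transfer", False), ("bank transfer", False),
    ("mailed check", False), ("mailed check", False),
]

ratio = relative_churn_risk(toy_rows, "electronic check")
print(f"electronic check vs. all other methods: {ratio:.1f}x")
```

Comparing against all other customers pooled together is one possible baseline. Comparing against each payment method separately could give different ratios.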
Visualize Your Hypotheses
Payment type impacts customer segments differently
Visualize Your Hypotheses
Paperless billing appears to be associated with higher churn regardless of product type.
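A claim that holds "regardless of product type" can be checked by computing churn rates within each product type and confirming the direction is the same in every one. The sketch below is one minimal way to do that. The field names (`product`, `paperless`, `churned`) and the toy rows are hypothetical and invented for illustration.

```python
from collections import defaultdict


def churn_rate_by(rows, *keys):
    """Return {(key values...): churn_rate} for every combination of `keys`."""
    totals, churned = defaultdict(int), defaultdict(int)
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group] += 1
        churned[group] += int(row["churned"])
    return {g: churned[g] / totals[g] for g in sorted(totals)}


# Hypothetical records, invented for illustration only.
toy_rows = [
    {"product": "dsl", "paperless": True, "churned": True},
    {"product": "dsl", "paperless": True, "churned": False},
    {"product": "dsl", "paperless": False, "churned": False},
    {"product": "dsl", "paperless": False, "churned": False},
    {"product": "fiber", "paperless": True, "churned": True},
    {"product": "fiber", "paperless": True, "churned": True},
    {"product": "fiber", "paperless": True, "churned": False},
    {"product": "fiber", "paperless": False, "churned": True},
    {"product": "fiber", "paperless": False, "churned": False},
    {"product": "fiber", "paperless": False, "churned": False},
]

rates = churn_rate_by(toy_rows, "product", "paperless")
products = sorted({r["product"] for r in toy_rows})

# The association holds across product types only if paperless customers
# churn more than non-paperless customers inside every product type.
higher_everywhere = all(rates[(p, True)] > rates[(p, False)] for p in products)

for (product, paperless), rate in rates.items():
    print(f"{product:<6} paperless={paperless!s:<5} churn={rate:.0%}")
print("paperless churn is higher in every product type:", higher_everywhere)
```

Looking within each stratum guards against an overall difference that is really driven by one product type. It shows an association only and says nothing about cause.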
Visualizations engage stakeholders
Learnings:
Early life customers are at the greatest risk of churn
Payment method and billing are associated with increased churn risk
Customer segmentation reveals specific opportunities in the data