Demystifying data science

Paradigm Data Group

2019-11-05

Data Visualization for Insight

Introductions: Who are we?

Martin Frigaard: Martin is a science and data evangelist who provides analytic tools and skills to a range of audiences, including medical professionals, engineers, product managers, and journalists. He gained his leadership and team-building experience as a non-commissioned officer in the US Army and holds a master's degree in clinical research from the University of California, San Francisco. He has worked with various universities, non-profits, and private companies.

Introductions: Who are we?

Peter Spangler: Entrepreneurially minded data science leader with experience building analytic solutions, insights, and teams at Lyft, Citrix, and Alibaba Group. Led experimentation design and data science projects focused on retention, user acquisition, and channel optimization in the SaaS and rideshare spaces. I have produced solutions for incrementality testing, segmentation, ML models, and fraud. I am a passionate advocate for building analytics teams in cross-functional environments and believe communication is core to any analytics program.

Agenda

  • Computation Is Not Decision Making
      • We need to ask the right questions of our data for our models to add the most business value
  • Visualization is our most powerful tool
      • We can surface actionable insights and areas of greatest impact by exploring relationships in our data
  • Sizing supports our stakeholders
      • Uncovering the drivers of our business problem will inform necessary partnerships for action

Computation Is Not Decision Making (1)

Improving business decisions

Using data effectively requires both the data and domain knowledge; neither alone will suffice.


Computation Is Not Decision Making (2)

Set the stage

Be able to articulate, “what problem are we facing?”


Computation Is Not Decision Making (3)

Know the characters in the story

Understand what’s been measured, i.e. “what are the data?”


Computation Is Not Decision Making (4)

Connect the business problem to a measurable objective

The hardest part of data science is translating a problem into a question that data can answer (and then finding those data).

Example measurable objective: “Identify predictors for customer churn.”


Visualizing the process

Step 1: Look at your data

Step 2: Get some numbers

Step 3: Make a graph
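
The three steps can be sketched in a few lines of standard-library Python. This is only an illustration: the `customers` records, their field names, and the text bar chart are assumptions made up for this example, not the data or tooling used in the talk.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical toy records; NOT the data used in the talk.
customers = [
    {"tenure_months": 2, "payment": "Electronic check", "churned": True},
    {"tenure_months": 5, "payment": "Electronic check", "churned": True},
    {"tenure_months": 14, "payment": "Mailed check", "churned": False},
    {"tenure_months": 30, "payment": "Credit card", "churned": False},
    {"tenure_months": 48, "payment": "Bank transfer", "churned": False},
    {"tenure_months": 9, "payment": "Credit card", "churned": True},
]

# Step 1: look at your data (print a few raw rows before summarizing).
for row in customers[:3]:
    print(row)

# Step 2: get some numbers (size, central tendency, overall churn rate).
tenures = [c["tenure_months"] for c in customers]
summary = {
    "n": len(customers),
    "mean_tenure": mean(tenures),
    "median_tenure": median(tenures),
    "churn_rate": mean(1 if c["churned"] else 0 for c in customers),
}
print(summary)

# Step 3: make a graph (a text bar chart of customers per payment method).
counts = Counter(c["payment"] for c in customers)
for method, n in counts.most_common():
    print(f"{method:<18}{'#' * n} ({n})")
```

In practice a plotting library would replace the text bars, but the order of work stays the same: raw rows first, summaries second, pictures third.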

Visualize Your Hypotheses

  1. Exploratory data analysis quickly empowers discovery and insights

  2. We should be able to connect an insight with a mechanism for validation

  3. Customer segmentation can help us identify who is impacted most

Visualize Your Hypotheses

Churn is highest among customers in their first 20 months after purchase.
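
A tenure claim like this one can be checked by bucketing customers into 20-month bands and comparing churn rates per band. The sketch below assumes made-up `(tenure, churned)` pairs and a hypothetical `BUCKET_WIDTH`; it does not reproduce the figures behind the slide.

```python
from collections import defaultdict

# Hypothetical (tenure in months, churned) pairs; NOT the talk's data.
records = [
    (1, True), (3, True), (8, False), (15, True), (19, False),
    (22, False), (35, True), (38, False),
    (44, False), (47, False),
    (60, False), (70, False),
]

BUCKET_WIDTH = 20  # months per tenure band (an assumption for illustration)

totals = defaultdict(int)
churned = defaultdict(int)
for tenure, did_churn in records:
    bucket = (tenure // BUCKET_WIDTH) * BUCKET_WIDTH  # 0, 20, 40, ...
    totals[bucket] += 1
    churned[bucket] += did_churn  # True counts as 1, False as 0

churn_rate = {b: churned[b] / totals[b] for b in sorted(totals)}

# Text bar chart: one row per tenure band.
for start, rate in churn_rate.items():
    label = f"{start}-{start + BUCKET_WIDTH - 1} months"
    print(f"{label:<14}{'#' * round(rate * 20):<21}{rate:.0%}")
```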

Visualize Your Hypotheses

Customers paying by electronic check are more than twice as likely to churn as customers using other payment types.
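
A "times as likely" statement is a risk ratio: the churn rate in the group of interest divided by the churn rate in the comparison group. The sketch below shows the arithmetic with hypothetical counts chosen only to make the calculation easy to follow; they are not the numbers behind the slide.

```python
# Hypothetical (churned, total) counts per payment method; NOT the talk's data.
counts = {
    "Electronic check": (40, 100),
    "Mailed check": (20, 100),
    "Bank transfer": (15, 100),
    "Credit card": (10, 100),
}

focus = "Electronic check"
focus_churned, focus_total = counts[focus]
focus_rate = focus_churned / focus_total

# Pool every other payment method into a single comparison group.
other_churned = sum(c for m, (c, t) in counts.items() if m != focus)
other_total = sum(t for m, (c, t) in counts.items() if m != focus)
other_rate = other_churned / other_total

risk_ratio = focus_rate / other_rate
print(f"{focus}: {focus_rate:.0%} vs. others: {other_rate:.0%}")
print(f"risk ratio = {risk_ratio:.2f}")
```

Pooling the other methods weights each by its customer count; comparing against each method separately is an equally reasonable choice and can give a different ratio.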

Visualize Your Hypotheses

Payment type impacts customer segments differently
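
Seeing whether an effect differs by segment means computing the churn rate in every segment-by-payment cell rather than one rate per payment type. The sketch below assumes contract type as the segment, which is a hypothetical choice, and uses made-up rows.

```python
from collections import defaultdict

# Hypothetical rows: (segment, payment method, churned); NOT the talk's data.
rows = [
    ("Month-to-month", "Electronic check", True),
    ("Month-to-month", "Electronic check", True),
    ("Month-to-month", "Electronic check", False),
    ("Month-to-month", "Credit card", True),
    ("Month-to-month", "Credit card", False),
    ("Two year", "Electronic check", True),
    ("Two year", "Electronic check", False),
    ("Two year", "Electronic check", False),
    ("Two year", "Electronic check", False),
    ("Two year", "Credit card", False),
    ("Two year", "Credit card", False),
]

# (segment, payment) -> [churned count, total count]
cells = defaultdict(lambda: [0, 0])
for segment, payment, did_churn in rows:
    cell = cells[(segment, payment)]
    cell[0] += did_churn  # True counts as 1
    cell[1] += 1

rates = {key: n_churned / n_total for key, (n_churned, n_total) in cells.items()}

# Print a small two-way table: segments as rows, payment methods as columns.
segments = sorted({s for s, _ in rates})
payments = sorted({p for _, p in rates})
print(f"{'':<16}" + "".join(f"{p:>18}" for p in payments))
for s in segments:
    row = "".join(f"{rates.get((s, p), float('nan')):>18.0%}" for p in payments)
    print(f"{s:<16}{row}")
```

With real data, each cell's customer count should sit next to its rate, since rates from small cells are unstable.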

Visualize Your Hypotheses

Paperless billing appears to be associated with higher churn regardless of product type.

Visualizations engage stakeholders

Learnings:

  1. Early-life customers are at the greatest risk of churn

  2. Payment method and billing are associated with increased churn risk

  3. Customer segmentation reveals specific opportunities in the data