Data Visualization

Goal: Tell Stories with Data
Tools: Matplotlib, Seaborn, Tableau Public

Phase 3: Data Visualization (Month 4)

Why?
- 80% of DS interviews ask: "Walk me through your plot"
- 1 chart > 1000 rows
- Strong data storytelling can add $10K+ to your salary


| Week | Focus | Hours |
|------|-------|-------|
| 1 | Python Plotting (Matplotlib/Seaborn) | 35 |
| 2 | EDA + Storytelling | 35 |
| 3 | Tableau Public Mastery | 35 |
| 4 | Capstone: Executive Dashboard | 30 |

Week 1: Python Plotting – Matplotlib & Seaborn

Core Libraries

pip install matplotlib seaborn plotly

Essential Plot Types

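| Plot | Use | Code |
|------|-----|------|
| Line | Trends | sns.lineplot(x=x, y=y) |
| Bar | Compare categories | sns.barplot(x=x, y=y) |
| Histogram | Distribution | sns.histplot(data) |
| Box | Outliers, quartiles | sns.boxplot(x=x, y=y) |
| Scatter | Correlation | sns.scatterplot(x=x, y=y) |
| Heatmap | Correlation matrix | sns.heatmap(corr) |

Pass x and y by keyword: recent seaborn releases treat the first positional argument as data.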

Pro Code Template

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Load data
df = pd.read_csv("titanic.csv")

# Style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

# Plot
fig, ax = plt.subplots(figsize=(10, 6))
sns.barplot(data=df, x="Pclass", y="Survived", hue="Sex", ax=ax, errorbar=None)

# Labels
ax.set_title("Survival Rate by Class & Gender", fontsize=16, fontweight='bold')
ax.set_xlabel("Passenger Class", fontsize=12)
ax.set_ylabel("Survival Rate", fontsize=12)
ax.legend(title="Gender")

# Annotate (skip zero-height patches, which some seaborn versions add for the legend)
for p in ax.patches:
    if p.get_height() > 0:
        ax.annotate(f'{p.get_height():.1%}',
                    (p.get_x() + p.get_width()/2, p.get_height()),
                    ha='center', va='bottom', fontsize=10)

plt.tight_layout()
plt.savefig("survival_by_class_gender.png", dpi=300)
plt.show()
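
An alternative way to label the bars in the template is Matplotlib's `bar_label`, which works on bar containers instead of raw patches. The sketch below is one possible variant, not the template's own method: it aggregates with pandas first and uses a few made-up rows in place of `titanic.csv`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for titanic.csv (hypothetical rows, same column names as the template)
df = pd.DataFrame({
    "Pclass":   [1, 1, 1, 1, 3, 3, 3, 3],
    "Sex":      ["female", "female", "male", "male"] * 2,
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# The mean of a 0/1 column is the survival rate for each group
rates = df.groupby(["Pclass", "Sex"])["Survived"].mean().unstack("Sex")

ax = rates.plot.bar(figsize=(8, 5), rot=0)
ax.set_xlabel("Passenger Class")
ax.set_ylabel("Survival Rate")
ax.legend(title="Gender")

# One BarContainer per column of `rates`; label each bar with its own value
for container in ax.containers:
    ax.bar_label(container, labels=[f"{v:.1%}" for v in container.datavalues])

plt.tight_layout()
```

Aggregating first also gives you the exact numbers to quote when you walk someone through the chart.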

Resources:
- Python Graph Gallery: python-graph-gallery.com
- Seaborn Docs: seaborn.pydata.org


Week 2: EDA + Storytelling Framework

5-Second Rule: Can a busy exec understand the chart in 5 seconds?

Storytelling Framework (McKinsey Style)

graph TD
    A[Context] --> B[Insight]
    B --> C[Action]

| Step | Example |
|------|---------|
| Context | "Titanic carried about 2,224 passengers and crew" |
| Insight | "Women in 1st class: 97% survived" |
| Action | "Prioritize women & children in evacuation" |
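
To make the Context → Insight → Action steps concrete, here is a minimal sketch that derives the first two from the data and takes the third as input. The helper name `build_story` and the toy rows are hypothetical; the action stays a human judgment that the code only carries along.

```python
import pandas as pd

def build_story(df: pd.DataFrame, action: str) -> dict:
    """Return context and insight strings computed from df, plus the given action."""
    rates = df.groupby(["Pclass", "Sex"])["Survived"].mean()
    pclass, sex = rates.idxmax()  # group with the highest survival rate
    return {
        "context": f"{len(df)} passengers in the sample",
        "insight": f"{sex.capitalize()} passengers in class {pclass}: {rates.max():.0%} survived",
        "action": action,
    }

# Toy rows (made up) with the Titanic column names used earlier
df = pd.DataFrame({
    "Pclass":   [1, 1, 1, 1, 3, 3, 3, 3],
    "Sex":      ["female", "female", "male", "male"] * 2,
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

story = build_story(df, action="Prioritize women & children in evacuation")
for step, text in story.items():
    print(f"{step.title()}: {text}")
```

The insight string doubles as a chart title, which helps a plot pass the 5-second test.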

EDA Checklist

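df.describe()        # summary statistics for numeric columns
df.isnull().sum()    # missing values per column
# numeric_only=True skips text columns such as Name or Sex, which would otherwise raise an error
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm")
sns.pairplot(df, hue="Survived")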
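
The checklist can also be wrapped in a small helper so every new dataset gets the same first look. This is a sketch under assumptions: `eda_summary` is a hypothetical name, and the sample rows are invented.

```python
import pandas as pd

def eda_summary(df: pd.DataFrame) -> dict:
    """Collect the basic EDA numbers: shape, missing values, numeric correlations."""
    numeric = df.select_dtypes(include="number")  # correlations only make sense for numbers
    return {
        "shape": df.shape,
        "missing": df.isnull().sum().to_dict(),
        "corr": numeric.corr(),
    }

# Invented sample rows with one missing Age
df = pd.DataFrame({
    "Age":  [22.0, 38.0, None, 35.0],
    "Fare": [7.25, 71.28, 7.92, 53.10],
    "Sex":  ["male", "female", "female", "female"],
})

summary = eda_summary(df)
print(summary["shape"])
print(summary["missing"])
print(summary["corr"].round(2))
```

The returned correlation matrix can be passed straight to `sns.heatmap`.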

Project: Titanic Survival Story

3 plots + 1 insight per plot → eda_titanic.ipynb


Week 3: Tableau Public – Drag, Drop, Wow

Install: Tableau Public (Free)

Core Skills

| Skill | How |
|-------|-----|
| Connect | CSV, Google Sheets |
| Calculated Field | IF [Pclass] = 1 THEN "Rich" ELSE "Poor" END |
| Parameters | Dynamic filters |
| Dashboard | 3+ sheets + actions |
| Story | Sequence of insights |
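
If you want to check a Tableau calculated field against your notebook, the same IF/ELSE logic has a direct pandas analogue. The sketch below mirrors the calculated field from the table; the output column name `Wealth` is an assumption.

```python
import numpy as np
import pandas as pd

# A few made-up passenger classes
df = pd.DataFrame({"Pclass": [1, 2, 3, 1]})

# Same rule as: IF [Pclass] = 1 THEN "Rich" ELSE "Poor" END
df["Wealth"] = np.where(df["Pclass"] == 1, "Rich", "Poor")

print(df)
```

Comparing group counts from both tools is a quick way to catch a mistyped condition.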

Build 3 Dashboards

| # | Dashboard | Dataset |
|---|-----------|---------|
| 1 | Sales Performance | Sample Superstore |
| 2 | Customer Segmentation | RFM Analysis |
| 3 | Funnel Analysis | E-commerce funnel |
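
For the customer segmentation dashboard, the RFM table (recency, frequency, monetary) usually has to be computed before Tableau can chart it. The following is a minimal pandas sketch; the transaction log and its column names (`customer_id`, `order_date`, `amount`) are hypothetical.

```python
import pandas as pd

# Hypothetical transaction log
orders = pd.DataFrame({
    "customer_id": ["a", "a", "b", "c", "c", "c"],
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-03-01", "2023-11-20",
        "2024-02-10", "2024-02-25", "2024-03-10",
    ]),
    "amount": [50.0, 70.0, 20.0, 30.0, 45.0, 25.0],
})

# Measure recency from the day after the last order in the data
snapshot = orders["order_date"].max() + pd.Timedelta(days=1)

rfm = orders.groupby("customer_id").agg(
    recency_days=("order_date", lambda d: (snapshot - d.max()).days),  # days since last order
    frequency=("order_date", "count"),                                 # number of orders
    monetary=("amount", "sum"),                                        # total spend
)

print(rfm)
```

Saving `rfm` with `to_csv` gives a file that Tableau Public can connect to directly.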

Publish: public.tableau.com → Share link


Week 4: Capstone – Executive Dashboard

Project: "Global Happiness Report 2023"

Dataset: World Happiness Report

Deliverables (GitHub: yourname/data-viz-capstone)

data-viz-capstone/
├── data/
│   └── happiness.csv
├── python/
│   ├── eda_happiness.ipynb
│   └── plots/
│       ├── happiness_vs_gdp.png
│       └── top10_happiest.png
├── tableau/
│   ├── Happiness_Dashboard.twb
│   └── Happiness_Dashboard.png
├── streamlit/
│   └── app.py
└── README.md

1. Python: Key Insights

# Top 10 happiest countries
top10 = df.nlargest(10, 'Happiness Score')
# hue + legend=False gives one color per bar without the palette warning (seaborn >= 0.13)
sns.barplot(data=top10, x='Happiness Score', y='Country', hue='Country',
            palette='viridis', legend=False)
plt.title("Top 10 Happiest Countries (2023)")
plt.xlabel("Happiness Score")
plt.savefig("plots/top10_happiest.png", dpi=300, bbox_inches='tight')
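
The second deliverable plot, `happiness_vs_gdp.png`, could be produced along the same lines. This is a sketch with invented rows; the column names follow the ones used elsewhere in this phase, and your download of the dataset may name them differently.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Invented rows standing in for the real dataset
df = pd.DataFrame({
    "Country": ["A", "B", "C", "D", "E"],
    "Region": ["North", "North", "South", "South", "South"],
    "GDP per capita": [1.4, 1.3, 0.9, 0.6, 0.4],
    "Happiness Score": [7.5, 7.2, 6.0, 5.1, 4.3],
})

fig, ax = plt.subplots(figsize=(8, 5))
sns.scatterplot(data=df, x="GDP per capita", y="Happiness Score", hue="Region", ax=ax)
ax.set_title("Happiness vs GDP per Capita")

# Written to memory here; in the project, pass "plots/happiness_vs_gdp.png" instead
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=300, bbox_inches="tight")
```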

2. Tableau: Interactive Dashboard

Sheets:
1. Map (Happiness by Country)
2. Scatter (GDP vs Happiness)
3. Bar (Top/Bottom 10)
4. Trend (Happiness over years)

Actions:
- Filter: Region
- Highlight: Click country

Publish: tableau.com/your-viz


3. Streamlit: Live App (Bonus)

# streamlit/app.py
import pandas as pd
import plotly.express as px
import streamlit as st

st.title("World Happiness Dashboard")
# Path is relative to the repo root, where the run command below is launched
df = pd.read_csv("data/happiness.csv")

# Let the user pick a region, then keep only its rows
region = st.selectbox("Select Region", df['Region'].unique())
filtered = df[df['Region'] == region]

fig = px.scatter(filtered, x="GDP per capita", y="Happiness Score",
                 size="Population", color="Country", hover_name="Country",
                 title=f"Happiness vs GDP in {region}")
st.plotly_chart(fig)

Run from the repo root:

streamlit run streamlit/app.py
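
One way to keep the app easy to test is to move the region filter into a plain function that needs no Streamlit at all. The helper name `filter_by_region` and the sample rows below are assumptions for illustration.

```python
import pandas as pd

def filter_by_region(df: pd.DataFrame, region: str) -> pd.DataFrame:
    """Return only the rows for one region, with a fresh index."""
    return df[df["Region"] == region].reset_index(drop=True)

# Invented sample rows
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Region": ["Europe", "Asia", "Europe"],
    "Happiness Score": [7.5, 5.9, 7.1],
})

filtered = filter_by_region(df, "Europe")
print(filtered)
```

In `app.py`, the selectbox value would be passed in as `region`, and the result handed to `px.scatter`.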

README.md (Portfolio Gold)

# World Happiness Dashboard

**Live**: [streamlit.app/happiness](https://yourname-happiness.streamlit.app)  
**Tableau**: [public.tableau.com](https://public.tableau.com/views/WorldHappiness2023/Dashboard)  
**Python EDA**: [notebook](python/eda_happiness.ipynb)

## Key Insights
| Insight | Action |
|-------|--------|
| GDP explains 75% of happiness | Invest in economy |
| Social support > Freedom | Build community programs |
| Nordic countries dominate top 10 | Study their policies |

## Tech
- Python: Matplotlib, Seaborn, Plotly
- Tableau Public: Interactive dashboard
- Streamlit: Live web app
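
A README claim of the form "X explains N% of Y", as in the Key Insights table, is commonly backed by a squared correlation (R²). The sketch below shows one way to compute it with NumPy; the values are invented, so the printed number says nothing about the real dataset.

```python
import numpy as np

# Invented values standing in for two dataset columns
gdp = np.array([0.4, 0.6, 0.9, 1.3, 1.4])
happiness = np.array([4.3, 5.1, 6.0, 7.2, 7.5])

r = np.corrcoef(gdp, happiness)[0, 1]  # Pearson correlation
r_squared = r ** 2                     # share of variance explained by a linear fit

print(f"r = {r:.2f}, R^2 = {r_squared:.2f}")
```

Quote the figure from your own notebook run, and remember that correlation alone does not show that GDP causes happiness.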

Interview-Ready Plots

| Question | Your Plot |
|----------|-----------|
| "Show correlation" | sns.heatmap(corr, annot=True) |
| "Outliers?" | sns.boxplot() |
| "Trend over time?" | sns.lineplot() |
| "Compare groups?" | sns.catplot() |

Assessment: Can You Build This?

| Task | Yes/No |
|------|--------|
| Python: 5-plot EDA | |
| Tableau: Interactive dashboard | |
| Streamlit: Live filter | |
| 3 insights with actions | |
| Published + shared | |

All Yes → You’re visualization-ready!


Free Resources Summary

| Tool | Link |
|------|------|
| Python Graph Gallery | python-graph-gallery.com |
| Seaborn Examples | seaborn.pydata.org/examples |
| Tableau Public | public.tableau.com |
| Sample Superstore | tableau.com/sample-data |
| Streamlit Docs | docs.streamlit.io |

Pro Tips

  1. Never use default colors: sns.set_palette("colorblind")
  2. Annotate everything: %, n=, p<0.01
  3. Export high-res: dpi=300
  4. Tell a story: Context → Insight → Action
  5. Add to resume:

    "Built interactive Tableau dashboard with 10K+ views"


Next: Phase 4 – Machine Learning Core

You can show data → now predict it.


Start Now:
1. Download World Happiness Report
2. Open Jupyter:

import pandas as pd
import seaborn as sns

df = pd.read_csv("happiness.csv")
sns.scatterplot(data=df, x="GDP per capita", y="Happiness Score", hue="Region")

3. Save plot → Push to GitHub

Tag me when you publish your Tableau viz!
You now communicate like a senior analyst.

Last updated: Nov 19, 2025
