Detailed SQL for Data Analysis

(Phase 1.5 | 4 Weeks | 4–6 hrs/day) Goal: Master SQL for data extraction, aggregation, and insight generation — the #1 skill for Data Analyst & Data Scientist roles.

Detailed SQL for Data Analysis

Detailed SQL for Data Analysis

Detailed SQL for Data Analysis

(Phase 1.5 | 4 Weeks | 4–6 hrs/day)

Goal: Master SQL for data extraction, aggregation, and insight generation — the #1 skill for Data Analyst & Data Scientist roles.

Used by: 95% of companies | Appears in: 80% of DS interviews | Salary boost: +$10–20K


Why SQL?

Business Need SQL Does
"Show me sales last quarter" WHERE date BETWEEN ...
"Top 10 customers by revenue" ORDER BY revenue DESC LIMIT 10
"YoY growth %" Window functions
"Funnel drop-off" CTEs + COUNT(DISTINCT)

Week-by-Week Roadmap

Week Focus Hours
1 SQL Basics + SELECT 25
2 Filtering, Aggregation, Joins 30
3 Subqueries, CTEs, Window Functions 30
4 Real-World Project + Interview Prep 25

Tools Setup (Day 1)

Tool Install
DB Browser for SQLite sqlitebrowser.org
PostgreSQL (Optional) postgresql.org
Online Sandbox SQLFiddle, DB Fiddle

Practice DB: Download Chinook Database (Music store with 11 tables)


Week 1: SQL Basics & SELECT

Core Syntax

SELECT column1, column2
FROM table_name
WHERE condition
ORDER BY column
LIMIT 10;

Practice Queries (Chinook DB)

# Question Query
1 List all albums SELECT * FROM Album LIMIT 5;
2 Find tracks longer than 5 min SELECT Name, Milliseconds FROM Track WHERE Milliseconds > 300000;
3 Artists from Brazil SELECT * FROM Artist WHERE Country = 'Brazil';

Key Concepts

Concept Example
DISTINCT SELECT DISTINCT Country FROM Customer;
LIKE SELECT Name FROM Track WHERE Name LIKE '%love%';
IN WHERE GenreId IN (1, 3, 5)
BETWEEN WHERE UnitPrice BETWEEN 0.99 AND 1.99

Daily Task: Write 10 queries/day → save in week1.sql


Week 2: Aggregation & Joins

Aggregation

SELECT 
    GenreId,
    COUNT(*) AS track_count,
    AVG(Milliseconds)/60000 AS avg_min
FROM Track
GROUP BY GenreId
HAVING COUNT(*) > 100
ORDER BY track_count DESC;

JOIN Types

Type Use
INNER JOIN Matching rows
LEFT JOIN All from left
FULL OUTER All from both

Example: Customer + Invoice

SELECT 
    c.FirstName || ' ' || c.LastName AS customer,
    SUM(i.Total) AS total_spent
FROM Customer c
LEFT JOIN Invoice i ON c.CustomerId = i.CustomerId
GROUP BY c.CustomerId
ORDER BY total_spent DESC
LIMIT 5;

Project: Top 5 Customers by Country

Output: Country, Customer Name, Total Spent


Week 3: Subqueries, CTEs, Window Functions

1. Subquery

SELECT Name, Composer
FROM Track
WHERE GenreId = (
    SELECT GenreId FROM Genre WHERE Name = 'Rock'
);

2. CTE (Common Table Expression)

WITH rock_tracks AS (
    SELECT * FROM Track 
    WHERE GenreId = (SELECT GenreId FROM Genre WHERE Name = 'Rock')
)
SELECT Composer, COUNT(*) 
FROM rock_tracks 
GROUP BY Composer 
ORDER BY COUNT(*) DESC;

3. Window Functions (GAME CHANGER)

-- Rank tracks by length per genre
SELECT 
    Name,
    GenreId,
    Milliseconds,
    RANK() OVER (PARTITION BY GenreId ORDER BY Milliseconds DESC) AS rank_longest
FROM Track;

Advanced Query:

"For each customer, show their total spend and % of country total"

WITH customer_spend AS (
    SELECT 
        c.CustomerId,
        c.Country,
        SUM(i.Total) AS spend
    FROM Customer c
    JOIN Invoice i ON c.CustomerId = i.CustomerId
    GROUP BY c.CustomerId
),
country_total AS (
    SELECT Country, SUM(spend) AS country_spend
    FROM customer_spend
    GROUP BY Country
)
SELECT 
    cs.CustomerId,
    cs.Country,
    cs.spend,
    ROUND(100.0 * cs.spend / ct.country_spend, 2) AS pct_of_country
FROM customer_spend cs
JOIN country_total ct ON cs.Country = ct.Country
ORDER BY pct_of_country DESC;

Week 4: Real-World Project + Interview Prep

Final Project: Music Store Business Report

Deliverables (GitHub Repo: yourname/sql-music-analysis)

sql-music-analysis/
├── queries/
│   ├── 01_top_customers.sql
│   ├── 02_genre_performance.sql
│   ├── 03_employee_sales.sql
│   └── report.md
├── data/
│   └── chinook.db
└── README.md

Key Queries to Write

# Business Question SQL File
1 Top 10 customers by total spend 01_top_customers.sql
2 Best-selling genre per country 02_genre_performance.sql
3 Employee sales performance (with manager) 03_employee_sales.sql
4 Tracks never sold 04_unsold_tracks.sql
5 Monthly revenue trend 05_revenue_trend.sql

report.md Example

# Music Store SQL Analysis

## Key Insights
- **Rock** is the top genre (54% of sales)
- Top customer: **Helena Holý** ($49.62 from Czech Republic)
- **Jane Peacock** is the top-performing employee
- 2 tracks have **never been sold**

## How to Run
```sql
-- Open in DB Browser for SQLite
-- Run queries/01_top_customers.sql
---

## Interview-Ready SQL (LeetCode Style)

| Difficulty | Problem | Link |
|----------|-------|------|
| Easy | [175. Combine Two Tables](https://leetcode.com/problems/combine-two-tables/) | `LEFT JOIN` |
| Easy | [181. Employees Earning More Than Managers](https://leetcode.com/problems/employees-earning-more-than-their-managers/) | Self-join |
| Medium | [176. Second Highest Salary](https://leetcode.com/problems/second-highest-salary/) | `LIMIT OFFSET` |
| Medium | [177. Nth Highest Salary](https://leetcode.com/problems/nth-highest-salary/) | CTE + `DENSE_RANK()` |
| Hard | [185. Department Top Three Salaries](https://leetcode.com/problems/department-top-three-salaries/) | Window + Filter |

**Goal**: Solve **50 LeetCode SQL problems** (focus on **Easy 20 + Medium 30**)

---

## SQL Cheat Sheet (Copy-Paste)

```sql
-- SELECT
SELECT *, ROUND(col, 2)
FROM table
WHERE col IS NOT NULL
  AND col LIKE '%pattern%'
  AND col IN (1,2,3)
ORDER BY col DESC
LIMIT 10;

-- AGGREGATION
SELECT category, 
       COUNT(*), 
       SUM(sales), 
       AVG(price),
       MAX(date)
FROM table
GROUP BY category
HAVING COUNT(*) > 5;

-- JOINS
SELECT *
FROM A
INNER JOIN B ON A.id = B.id
LEFT JOIN C ON B.id = C.id;

-- WINDOW
SELECT 
    col,
    ROW_NUMBER() OVER (PARTITION BY group ORDER BY val) AS rn,
    RANK() OVER (PARTITION BY group ORDER BY val DESC) AS rank
FROM table;

-- CTE
WITH cte AS (SELECT ...) 
SELECT * FROM cte WHERE ...;

Practice Platforms

Platform Link
Mode Analytics mode.com/sql-tutorial
StrataScratch stratascratch.com
HackerRank SQL hackerrank.com/domains/sql
LeetCode leetcode.com/problemset/database/
SQLZoo sqlzoo.net

Weekly Schedule

Day Task
Mon–Wed Learn + write 15 queries
Thu Solve 5 LeetCode (Medium)
Fri Build project query
Sat Review + explain aloud
Sun Rest / blog post

Assessment: Can You Write This?

Query Yes/No
Top 3 tracks per genre by length
Customers who spent > average
Monthly revenue with YoY %
Rank employees by sales

All Yes → You’re SQL-ready for interviews!


Pro Tips

  1. Always EXPLAIN your query in interviews

    “I used a CTE to avoid duplicating the subquery…”

  2. Write comments in production SQL
  3. Use aliases (c.FirstNamec."FirstName")
  4. Index matters → know when to suggest INDEX ON Customer(Country)

Next: Phase 2 – Statistics & Math

Now that you can get the data… learn to understand it.


Free Resources Summary

Resource Link
Chinook DB github.com/lerocha/chinook-database
Mode SQL Tutorial mode.com/sql-tutorial
LeetCode SQL 50 leetcode.com/study-plan/sql
SQL Cheat Sheet Save this gist!

Start Now:
1. Download Chinook.db
2. Open in DB Browser
3. Run:

SELECT Name, Milliseconds/60000.0 AS minutes 
FROM Track 
ORDER BY minutes DESC 
LIMIT 1;

You just found the longest song!

Tag me when you push your SQL repo!
Let’s make you SQL-fluent in 4 weeks.

Last updated: Nov 09, 2025

Detailed SQL for Data Analysis

(Phase 1.5 | 4 Weeks | 4–6 hrs/day) Goal: Master SQL for data extraction, aggregation, and insight generation — the #1 skill for Data Analyst & Data Scientist roles.

Detailed SQL for Data Analysis

Detailed SQL for Data Analysis

Detailed SQL for Data Analysis

(Phase 1.5 | 4 Weeks | 4–6 hrs/day)

Goal: Master SQL for data extraction, aggregation, and insight generation — the #1 skill for Data Analyst & Data Scientist roles.

Used by: 95% of companies | Appears in: 80% of DS interviews | Salary boost: +$10–20K


Why SQL?

Business Need SQL Does
"Show me sales last quarter" WHERE date BETWEEN ...
"Top 10 customers by revenue" ORDER BY revenue DESC LIMIT 10
"YoY growth %" Window functions
"Funnel drop-off" CTEs + COUNT(DISTINCT)

Week-by-Week Roadmap

Week Focus Hours
1 SQL Basics + SELECT 25
2 Filtering, Aggregation, Joins 30
3 Subqueries, CTEs, Window Functions 30
4 Real-World Project + Interview Prep 25

Tools Setup (Day 1)

Tool Install
DB Browser for SQLite sqlitebrowser.org
PostgreSQL (Optional) postgresql.org
Online Sandbox SQLFiddle, DB Fiddle

Practice DB: Download Chinook Database (Music store with 11 tables)


Week 1: SQL Basics & SELECT

Core Syntax

SELECT column1, column2
FROM table_name
WHERE condition
ORDER BY column
LIMIT 10;

Practice Queries (Chinook DB)

# Question Query
1 List all albums SELECT * FROM Album LIMIT 5;
2 Find tracks longer than 5 min SELECT Name, Milliseconds FROM Track WHERE Milliseconds > 300000;
3 Artists from Brazil SELECT * FROM Artist WHERE Country = 'Brazil';

Key Concepts

Concept Example
DISTINCT SELECT DISTINCT Country FROM Customer;
LIKE SELECT Name FROM Track WHERE Name LIKE '%love%';
IN WHERE GenreId IN (1, 3, 5)
BETWEEN WHERE UnitPrice BETWEEN 0.99 AND 1.99

Daily Task: Write 10 queries/day → save in week1.sql


Week 2: Aggregation & Joins

Aggregation

SELECT 
    GenreId,
    COUNT(*) AS track_count,
    AVG(Milliseconds)/60000 AS avg_min
FROM Track
GROUP BY GenreId
HAVING COUNT(*) > 100
ORDER BY track_count DESC;

JOIN Types

Type Use
INNER JOIN Matching rows
LEFT JOIN All from left
FULL OUTER All from both

Example: Customer + Invoice

SELECT 
    c.FirstName || ' ' || c.LastName AS customer,
    SUM(i.Total) AS total_spent
FROM Customer c
LEFT JOIN Invoice i ON c.CustomerId = i.CustomerId
GROUP BY c.CustomerId
ORDER BY total_spent DESC
LIMIT 5;

Project: Top 5 Customers by Country

Output: Country, Customer Name, Total Spent


Week 3: Subqueries, CTEs, Window Functions

1. Subquery

SELECT Name, Composer
FROM Track
WHERE GenreId = (
    SELECT GenreId FROM Genre WHERE Name = 'Rock'
);

2. CTE (Common Table Expression)

WITH rock_tracks AS (
    SELECT * FROM Track 
    WHERE GenreId = (SELECT GenreId FROM Genre WHERE Name = 'Rock')
)
SELECT Composer, COUNT(*) 
FROM rock_tracks 
GROUP BY Composer 
ORDER BY COUNT(*) DESC;

3. Window Functions (GAME CHANGER)

-- Rank tracks by length per genre
SELECT 
    Name,
    GenreId,
    Milliseconds,
    RANK() OVER (PARTITION BY GenreId ORDER BY Milliseconds DESC) AS rank_longest
FROM Track;

Advanced Query:

"For each customer, show their total spend and % of country total"

WITH customer_spend AS (
    SELECT 
        c.CustomerId,
        c.Country,
        SUM(i.Total) AS spend
    FROM Customer c
    JOIN Invoice i ON c.CustomerId = i.CustomerId
    GROUP BY c.CustomerId
),
country_total AS (
    SELECT Country, SUM(spend) AS country_spend
    FROM customer_spend
    GROUP BY Country
)
SELECT 
    cs.CustomerId,
    cs.Country,
    cs.spend,
    ROUND(100.0 * cs.spend / ct.country_spend, 2) AS pct_of_country
FROM customer_spend cs
JOIN country_total ct ON cs.Country = ct.Country
ORDER BY pct_of_country DESC;

Week 4: Real-World Project + Interview Prep

Final Project: Music Store Business Report

Deliverables (GitHub Repo: yourname/sql-music-analysis)

sql-music-analysis/
├── queries/
│   ├── 01_top_customers.sql
│   ├── 02_genre_performance.sql
│   ├── 03_employee_sales.sql
│   └── report.md
├── data/
│   └── chinook.db
└── README.md

Key Queries to Write

# Business Question SQL File
1 Top 10 customers by total spend 01_top_customers.sql
2 Best-selling genre per country 02_genre_performance.sql
3 Employee sales performance (with manager) 03_employee_sales.sql
4 Tracks never sold 04_unsold_tracks.sql
5 Monthly revenue trend 05_revenue_trend.sql

report.md Example

# Music Store SQL Analysis

## Key Insights
- **Rock** is the top genre (54% of sales)
- Top customer: **Helena Holý** ($49.62 from Czech Republic)
- **Jane Peacock** is the top-performing employee
- 2 tracks have **never been sold**

## How to Run
```sql
-- Open in DB Browser for SQLite
-- Run queries/01_top_customers.sql
---

## Interview-Ready SQL (LeetCode Style)

| Difficulty | Problem | Link |
|----------|-------|------|
| Easy | [175. Combine Two Tables](https://leetcode.com/problems/combine-two-tables/) | `LEFT JOIN` |
| Easy | [181. Employees Earning More Than Managers](https://leetcode.com/problems/employees-earning-more-than-their-managers/) | Self-join |
| Medium | [176. Second Highest Salary](https://leetcode.com/problems/second-highest-salary/) | `LIMIT OFFSET` |
| Medium | [177. Nth Highest Salary](https://leetcode.com/problems/nth-highest-salary/) | CTE + `DENSE_RANK()` |
| Hard | [185. Department Top Three Salaries](https://leetcode.com/problems/department-top-three-salaries/) | Window + Filter |

**Goal**: Solve **50 LeetCode SQL problems** (focus on **Easy 20 + Medium 30**)

---

## SQL Cheat Sheet (Copy-Paste)

```sql
-- SELECT
SELECT *, ROUND(col, 2)
FROM table
WHERE col IS NOT NULL
  AND col LIKE '%pattern%'
  AND col IN (1,2,3)
ORDER BY col DESC
LIMIT 10;

-- AGGREGATION
SELECT category, 
       COUNT(*), 
       SUM(sales), 
       AVG(price),
       MAX(date)
FROM table
GROUP BY category
HAVING COUNT(*) > 5;

-- JOINS
SELECT *
FROM A
INNER JOIN B ON A.id = B.id
LEFT JOIN C ON B.id = C.id;

-- WINDOW
SELECT 
    col,
    ROW_NUMBER() OVER (PARTITION BY group ORDER BY val) AS rn,
    RANK() OVER (PARTITION BY group ORDER BY val DESC) AS rank
FROM table;

-- CTE
WITH cte AS (SELECT ...) 
SELECT * FROM cte WHERE ...;

Practice Platforms

Platform Link
Mode Analytics mode.com/sql-tutorial
StrataScratch stratascratch.com
HackerRank SQL hackerrank.com/domains/sql
LeetCode leetcode.com/problemset/database/
SQLZoo sqlzoo.net

Weekly Schedule

Day Task
Mon–Wed Learn + write 15 queries
Thu Solve 5 LeetCode (Medium)
Fri Build project query
Sat Review + explain aloud
Sun Rest / blog post

Assessment: Can You Write This?

Query Yes/No
Top 3 tracks per genre by length
Customers who spent > average
Monthly revenue with YoY %
Rank employees by sales

All Yes → You’re SQL-ready for interviews!


Pro Tips

  1. Always EXPLAIN your query in interviews

    “I used a CTE to avoid duplicating the subquery…”

  2. Write comments in production SQL
  3. Use aliases (c.FirstNamec."FirstName")
  4. Index matters → know when to suggest INDEX ON Customer(Country)

Next: Phase 2 – Statistics & Math

Now that you can get the data… learn to understand it.


Free Resources Summary

Resource Link
Chinook DB github.com/lerocha/chinook-database
Mode SQL Tutorial mode.com/sql-tutorial
LeetCode SQL 50 leetcode.com/study-plan/sql
SQL Cheat Sheet Save this gist!

Start Now:
1. Download Chinook.db
2. Open in DB Browser
3. Run:

SELECT Name, Milliseconds/60000.0 AS minutes 
FROM Track 
ORDER BY minutes DESC 
LIMIT 1;

You just found the longest song!

Tag me when you push your SQL repo!
Let’s make you SQL-fluent in 4 weeks.

Last updated: Nov 09, 2025