Data Science Roadmap

This is a practical, step-by-step roadmap for going from zero to employable data scientist in 12–18 months (full-time) or 18–24 months (part-time). It focuses on skills that pay, portfolio projects, and real-world impact.


Phase 0: Mindset & Setup

Task | Resources
Install Python, VS Code, Git | Anaconda
Create GitHub + LinkedIn | Clean profile photo + headline
Join communities | Reddit r/datascience, Discord (DataTalks.Club), LinkedIn groups

Phase 1: Foundations

Goal: Speak the language of data

Topic | Resources
Python Basics | Automate the Boring Stuff (Ch 1–6)
Pandas & NumPy | 10 Minutes to Pandas (official)
Data Cleaning | Kaggle "Pandas" course (free)
SQL | Mode Analytics SQL Tutorial or LeetCode SQL 50

Mini-Project:

Clean + analyze a Kaggle dataset (e.g., Titanic) → GitHub repo with README.md
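A first pass at the mini-project might look like the sketch below. The inline rows are made-up stand-ins shaped like the Kaggle Titanic file, and the clean() helper and the median-fill choice are illustrative assumptions, not a prescribed method.

```python
import io

import pandas as pd

# Hypothetical rows shaped like the Kaggle Titanic train.csv (not real data).
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,Fare
1,0,3,male,22,7.25
2,1,1,female,38,71.28
3,1,3,female,,7.92
4,1,1,female,35,53.10
5,0,3,male,,8.05
5,0,3,male,,8.05
"""


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows and fill missing ages with the median age."""
    out = df.drop_duplicates().copy()
    out["Age"] = out["Age"].fillna(out["Age"].median())
    return out


# With the real dataset this would be pd.read_csv("train.csv").
raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
tidy = clean(raw)

# Survival rate by sex: one small finding that fits in a README.
survival_by_sex = tidy.groupby("Sex")["Survived"].mean()
print(survival_by_sex)
```

In the GitHub repo, the README.md can state each cleaning decision (why the median, why duplicates were dropped) next to the finding it supports.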


Phase 2: Statistics & Math

Goal: Don’t just run models — understand them

Topic | Resources
Descriptive & Inferential Stats | StatQuest (YouTube)
Probability (Bayes, distributions) | Khan Academy
Hypothesis Testing (p-values, A/B) | Practical Statistics for Data Scientists (book)
Linear Algebra (vectors, matrices) | 3Blue1Brown Essence of Linear Algebra

Practice:

Solve 20 problems on DataCamp or StrataScratch
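To make the A/B testing topic concrete, here is a minimal sketch of a two-proportion z-test using only the standard library. The function name and the visitor and conversion counts are hypothetical, and the normal approximation assumes reasonably large samples.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for H0: both variants convert at the same rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: 1,000 visitors per variant.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value says the observed gap would be unlikely if both variants truly converted at the same rate. It does not say how large or how valuable the gap is.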


Phase 3: Data Visualization

Goal: Tell stories with data

Tool | Learn
Matplotlib/Seaborn | Python Plotting for Exploratory Analysis
Tableau Public | Build 3 dashboards
Power BI | (Optional for BI roles)

Project:

World Happiness Report → Interactive dashboard (Tableau Public)


Phase 4: Machine Learning Core

Goal: Build & evaluate models

Topic | Resources
Scikit-learn pipeline | Kaggle "Intermediate ML" course
Regression (Linear, Logistic) | Andrew Ng’s ML Course (free audit)
Classification (Trees, SVM, KNN) | Hands-On ML (Aurélien Géron) Ch 2–6
Model Evaluation (AUC, F1, confusion matrix) | StatQuest
Cross-validation & Hyperparameter tuning | GridSearchCV / Optuna
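The evaluation metrics in the table are easier to remember once computed by hand. The sketch below derives precision, recall, and F1 from confusion-matrix counts in plain Python. The function names and the labels are hypothetical, and binary labels encoded as 0 and 1 are assumed.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, f1); a metric with a zero denominator is reported as 0.0."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(confusion_counts(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred))
```

In practice scikit-learn's metrics module covers the same ground. Writing the metrics out once makes it clear which errors each one penalizes.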

Projects (Pick 2):
1. House Prices → Feature eng + XGBoost
2. Customer Churn → Logistic + SHAP explanations
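Both projects lean on the same pattern: put preprocessing and the model in one scikit-learn pipeline, then tune it with cross-validation. The sketch below shows that pattern on synthetic data. The dataset, the grid of C values, and the step names are assumptions chosen for illustration, not settings recommended by the courses above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a real dataset such as customer churn.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline, so each CV fold fits the scaler
# on its own training split only (no leakage from validation data).
pipe = Pipeline(
    [
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)

# "clf__C" addresses the C parameter of the step named "clf".
search = GridSearchCV(
    pipe,
    param_grid={"clf__C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
    scoring="roc_auc",
)
search.fit(X_train, y_train)

print("best C:", search.best_params_["clf__C"])
print("cross-validated AUC:", round(search.best_score_, 3))
print("held-out AUC:", round(search.score(X_test, y_test), 3))
```

Swapping LogisticRegression for a tree-based model changes only the "clf" step and its parameter grid. The splitting and scoring code stays the same.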


Phase 5: Advanced ML & MLOps

Goal: Production-ready models

Topic | Tools/Resources
XGBoost / LightGBM | Kaggle competitions
Feature Engineering | Feature-engine library
NLP Basics | HuggingFace "NLP Course" (free)
Time Series | Store Item Demand Forecasting (Kaggle)
Docker | "Docker for Data Science" (YouTube)
MLflow / DVC | Track experiments
FastAPI | Deploy model as API

Capstone Project:

End-to-end ML system:
data → clean → model → API → Streamlit dashboard
Example: Credit Card Fraud Detection with imbalance handling (SMOTE) + API
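The capstone example names SMOTE for imbalance handling. A simpler alternative worth trying first is class weighting, sketched below in plain Python. This is not SMOTE: it creates no synthetic samples and only tells the model to count rare-class errors more heavily. The function name and the 990-to-10 label split are hypothetical. The formula is the "balanced" heuristic, n_samples / (n_classes * class_count).

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical fraud labels: 990 legitimate (0) and 10 fraudulent (1) transactions.
labels = [0] * 990 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

Many scikit-learn classifiers accept a dictionary like this through their class_weight parameter, or compute the same weights when given class_weight="balanced".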


Phase 6: Big Data & Cloud

Optional but high-paying

Skill | Platform
PySpark | Databricks Community Edition
AWS/GCP | Free tier (S3, EC2, SageMaker)
dbt (data build tool) | For analytics engineering

Project:

Process 1M+ rows with PySpark → store in S3 → query with Athena


Phase 7: Job Prep & Portfolio

Goal: Get hired

Portfolio (3 Projects)

Type | Example
Predictive | House Price Prediction (Kaggle top 20%)
NLP | Sentiment Analysis on Twitter (HuggingFace)
End-to-End | Fraud Detection API + Dashboard

Host: GitHub + Streamlit/Gradio + LinkedIn posts

Resume

  • Quantify: “Improved AUC from 0.72 → 0.89”
  • Keywords: Pandas, Scikit-learn, SQL, AWS, A/B testing

Interview Prep

Type | Resource
SQL | LeetCode (Top 50)
Python | HackerRank Data Science
Case Studies | "Cracking the Data Science Interview"
Behavioral | STAR method

Weekly Schedule (Full-Time)

Day | Focus
Mon–Wed | Learn + code (4h)
Thu | Project work
Fri | LeetCode / SQL (50 problems)
Sat | Portfolio + write blog
Sun | Rest / review

Salary Expectations (2025)

Role | USA | India | Remote
Junior DS | $95K–$130K | ₹12–20 LPA | $70K–$100K
Mid-Level | $130K–$180K | ₹20–35 LPA | $100K–$140K

Pro Tips

  1. Contribute to open source (e.g., scikit-learn bugs)
  2. Write 1 LinkedIn post/week about your project
  3. Apply to 10 jobs/week after Phase 5
  4. Get 1 mentor (via ADPList.org)

Free Resources Summary

Topic | Link
Python | Python.org
Kaggle Courses | kaggle.com/learn
StatQuest | YouTube
HuggingFace | huggingface.co/course
Streamlit | streamlit.io

Start today: open the Kaggle Titanic competition, download the data, and load it with pd.read_csv().

“The best time to start was yesterday. The next best time is now.”

Save this roadmap. Share with a friend. Tag me when you land your first DS job!

Last updated: Nov 09, 2025
