UNIT V — Applications of AI

Complete Notes with Clear Explanations, Real-Life Examples & Key Concepts (2025 Perspective)

1. AI Applications – Overview (2025 Landscape)

| Domain | Key Applications (2025) | Leading Examples / Companies |
|--------|-------------------------|------------------------------|
| Healthcare | Diagnosis, Drug Discovery, Personalized Medicine | Google DeepMind (AlphaFold 3), Merative (formerly IBM Watson Health), Tempus |
| Finance | Fraud Detection, Algorithmic Trading, Credit Scoring | JPMorgan LOXM, PayPal fraud system, Upstart |
| Transportation | Autonomous Vehicles, Traffic Optimization | Tesla FSD v13, Waymo, Aurora (which acquired Uber ATG) |
| Education | Personalized Tutoring, Automated Grading | Duolingo, Khan Academy AI, Gradescope |
| Entertainment | Content Generation, Game AI, Recommendation | Netflix, Midjourney, OpenAI Sora |
| Manufacturing | Predictive Maintenance, Quality Control | Siemens MindSphere, GE Predix |
| Agriculture | Precision Farming, Crop Monitoring | John Deere See & Spray (built by its Blue River Technology unit) |
| Defense & Security | Surveillance, Cyber Defense, Drone Swarms | Palantir, Anduril, Israel’s Lavender system |

2. Language Models (LLMs) – The Core of Modern AI

Evolution of Language Models
| Year | Model Family | Size | Breakthrough |
|------|-----------------------|--------------|-------------------------------------------|
| 2017 | Transformer | — | Attention is All You Need paper |
| 2018 | GPT-1 | 117M | Generative Pre-training |
| 2019 | GPT-2 | 1.5B | Zero-shot capabilities |
| 2020 | GPT-3 | 175B | Few-shot learning |
| 2023 | GPT-4 / Claude 2 | Undisclosed | GPT-4 accepts image input (text + image) |
| 2024 | Llama 3.1 / Grok-2 | Up to 405B (Llama 3.1); Grok-2 undisclosed | Open-weight models catching up |
| 2025 | GPT-5 class models | Undisclosed | Reasoning, planning, long context (up to 1M+ tokens in some models) |

Key Concepts (2025)
- Pre-training → Instruction Tuning → Alignment (RLHF/RLAIF/DPO)
- Retrieval-Augmented Generation (RAG) – LLMs + external knowledge
- Mixture of Experts (MoE) – Only activate needed parameters (e.g., Mixtral, Grok-1)
- Multimodal Models – Text + Image, and in some models Audio + Video (GPT-4o, Gemini 1.5; Claude 3.5 covers text and images)
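
To make the Mixture of Experts idea concrete, here is a minimal sketch of top-k routing in plain Python. It is an illustration under simplifying assumptions, not how Mixtral or Grok-1 are implemented: the "experts" are toy scaling functions, the gate is a single dot product per expert, and the names moe_forward and gate_weights are made up for this example.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate_weights, experts, k=2):
    """Route input vector x to the top-k experts and mix their outputs."""
    # One gating logit per expert: dot product of x with that expert's gate vector.
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    # Keep only the k highest-scoring experts; the others are never evaluated.
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    # Renormalise the gate scores over the selected experts only.
    probs = softmax([logits[i] for i in top])
    output = [0.0] * len(x)
    for p, i in zip(probs, top):
        y = experts[i](x)
        output = [o + p * y_j for o, y_j in zip(output, y)]
    return output, top

# Four toy experts (hypothetical): each one just scales the input.
experts = [lambda v, s=s: [s * v_j for v_j in v] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

output, chosen = moe_forward([2.0, 1.0], gate_weights, experts, k=2)
print("experts used:", chosen)
print("mixed output:", output)
```

Only two of the four experts run for this input, which is the source of the compute savings: total parameters grow with the number of experts, while per-token cost grows only with k.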

Real-Life Impact (2025)
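- A large and growing share of new code is written with AI assistance (GitHub Copilot, Cursor); published percentages vary widely by source
- Customer support: AI agents handle a large share of routine queries (Ada, Intercom AI); vendor-reported resolution rates vary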
- Education: Personalized tutors for millions (Khanmigo, Duolingo Max)

3. Information Retrieval (IR)

Finding relevant documents from large collections.

Classic IR → Modern Neural IR (2025)

| Approach | Method | Example Tools (2025) |
|----------|--------|----------------------|
| Boolean Retrieval | AND, OR, NOT | Old search engines |
| Vector Space Model | TF-IDF + cosine similarity | Elasticsearch (classic similarity) |
| BM25 | Probabilistic ranking | Still used in many systems |
| Dense Retrieval | Embeddings (BERT-based encoders, ColBERT) | Cohere, Jina AI, Voyage AI |
| Hybrid Retrieval | BM25 + Dense + Re-ranking | Most production systems |
| Learned Sparse (SPLADE) | Learned term weights and term expansion on a sparse index | Strong results on the BEIR benchmark |
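
As an illustration of the BM25 row, the sketch below scores three toy documents against a query using the Okapi BM25 formula. The corpus, the whitespace tokenisation, and the parameter defaults k1=1.5 and b=0.75 are assumptions chosen for the example; production engines add analysers, inverted indexes, and tuned parameters.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenised document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF (the "+1" variant keeps the value non-negative).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "neural retrieval uses dense embeddings".split(),
    "bm25 is a probabilistic ranking function".split(),
    "hybrid systems combine bm25 ranking with dense retrieval".split(),
]
query = "bm25 ranking".split()

scores = bm25_scores(query, docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print("scores:", scores)
print("ranking (best first):", ranking)
```

The second and third documents both contain the two query terms once, so the shorter one ranks first: that is the length normalisation controlled by b.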

Real-Life: Google Search (2025) is reported to combine models such as MUM with dense passage retrieval and Gemini-based re-ranking; the exact pipeline is not public.

4. Information Extraction (IE)

Extracting structured data from unstructured text.

Sub-tasks
- Named Entity Recognition (NER) → Person, Org, Location
- Relation Extraction → (Elon Musk, CEO_of, Tesla)
- Event Extraction → (Company X, Acquired, Company Y, $10B, 2025)
- Template Filling

2025 State-of-the-Art
- Fine-tuned or prompted LLMs (GPT-4, Llama-3-70B-Instruct) often match or outperform traditional models, especially when labeled data is scarce
- Prompt engineering + JSON output mode is a common, strong baseline for IE

Example Prompt for IE (2025 style)

Extract all company acquisitions from the text. Return as JSON:
{
  "acquisitions": [
    {"buyer": "...", "target": "...", "amount_usd": ..., "date": "..."}
  ]
}
Text: "Microsoft acquired Activision Blizzard for $69 billion in October 2023..."
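
A prompt like the one above is only half of an IE system: the reply still has to be parsed and checked. The sketch below shows one way to do that in Python. The names PROMPT_TEMPLATE, parse_acquisitions, and fake_response are hypothetical, and the model reply is hard-coded rather than fetched from any real API.

```python
import json

PROMPT_TEMPLATE = (
    "Extract all company acquisitions from the text. Return as JSON:\n"
    '{{"acquisitions": [{{"buyer": "...", "target": "...", '
    '"amount_usd": ..., "date": "..."}}]}}\n'
    'Text: "{text}"'
)

REQUIRED_KEYS = {"buyer", "target", "amount_usd", "date"}

def parse_acquisitions(raw_response):
    """Parse and validate the model's JSON reply; raise ValueError if malformed."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError as err:
        raise ValueError(f"response is not valid JSON: {err}") from err
    if not isinstance(data, dict) or not isinstance(data.get("acquisitions"), list):
        raise ValueError("missing 'acquisitions' list")
    for record in data["acquisitions"]:
        if not isinstance(record, dict):
            raise ValueError("each acquisition must be a JSON object")
        missing = REQUIRED_KEYS - record.keys()
        if missing:
            raise ValueError(f"record missing keys: {sorted(missing)}")
    return data["acquisitions"]

text = "Microsoft acquired Activision Blizzard for $69 billion in October 2023..."
prompt = PROMPT_TEMPLATE.format(text=text)

# Hypothetical reply, hard-coded here in place of a real model call.
fake_response = (
    '{"acquisitions": [{"buyer": "Microsoft", "target": "Activision Blizzard", '
    '"amount_usd": 69000000000, "date": "2023-10"}]}'
)
records = parse_acquisitions(fake_response)
print(records[0]["buyer"], "->", records[0]["target"], records[0]["amount_usd"])
```

Validating the keys before using the records matters because even with a JSON output mode a model can omit fields or return a differently shaped object.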

5. Natural Language Processing (NLP) Pipeline (2025)

| Task | Traditional Method | 2025 Method |
|------|--------------------|-------------|
| Tokenization | Rule-based | Byte-Pair Encoding (BPE), e.g. tiktoken |
| POS Tagging | HMM, CRF | Handled implicitly by LLMs; prompted only when explicit tags are needed |
| Parsing | PCFG | Rarely needed as a separate step (LLMs capture syntax implicitly) |
| Sentiment Analysis | VADER, TextBlob | Prompt GPT-4o or Claude 3.5 |
| Text Classification | BERT fine-tuning | Few-shot with Llama 3.1 405B |
| Summarization | Extractive (TextRank) | Abstractive with Gemini 1.5 Flash |
| Question Answering | BiDAF | RAG with long-context models |
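
The tokenization row mentions Byte-Pair Encoding. The sketch below learns a few BPE merge rules from a toy word list to show the core loop: count adjacent symbol pairs, merge the most frequent pair, repeat. It is a simplification: it works on characters rather than bytes, uses no end-of-word marker, and does not reproduce tiktoken's actual vocabulary or API. The function name learn_bpe and the corpus are made up for illustration.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merge rules from a list of words, starting from characters."""
    # Represent each distinct word as a tuple of symbols with its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab

corpus = ["low", "low", "lower", "lowest", "newer", "newer"]
merges, vocab = learn_bpe(corpus, num_merges=3)
print("merge rules:", merges)
print("segmented vocabulary:", dict(vocab))
```

After three merges the frequent word "low" has become a single token, while rarer words remain split into smaller pieces, which is how BPE keeps the vocabulary small without an out-of-vocabulary problem.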

6. Machine Translation (MT)

Evolution
- Rule-based (1950s–1990s)
- Statistical MT (1990s–2010s) → Google Translate (until 2016)
- Neural MT (2016+) → RNN-based at first (GNMT), Transformer-based from 2017
- 2025: SeamlessM4T v2, NLLB-200, Google Translate (Universal)

Zero-Shot & Multilingual Models (2025)
- One model translates 200+ languages
- Real-time voice-to-voice (e.g., Google Meet live translation)

7. Speech Processing

Speech Recognition (ASR) – 2025
- Whisper (OpenAI) – Widely used open-weight model
- Google USM, Deepgram, AssemblyAI – Real-time, high accuracy
- Word Error Rate (WER) < 3% reported on clean English benchmarks (e.g., LibriSpeech test-clean)
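
Word Error Rate is the word-level edit distance between the reference transcript and the ASR output, divided by the number of reference words. A minimal sketch, assuming whitespace tokenisation and no text normalisation (real evaluations usually lowercase and strip punctuation first); the function name word_error_rate is chosen for this example:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dp[i][j] = edit distance between the first i ref words and first j hyp words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # match or substitution
            )
    return dp[-1][-1] / len(ref)

wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER = {wer:.3f}")
```

Here one of six reference words is dropped, so the WER is 1/6. Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.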

Text-to-Speech (TTS)
- ElevenLabs, PlayHT, Respeecher – Voice cloning in seconds
- Emotion & style control

End-to-End Voice AI (2025)
- GPT-4o voice mode: Real-time conversation with emotion detection

8. Robotics – The Physical Embodiment of AI

A. Robot Hardware (2025)

| Component | 2025 Technology | Example Robots |
|-----------|-----------------|----------------|
| Actuators | High-torque brushless motors, series elastic actuators | Boston Dynamics Atlas, Tesla Bot |
| Sensors | LiDAR, RGB-D cameras, tactile skins, IMUs | Figure 01, Agility Robotics Digit |
| Compute | NVIDIA Jetson AGX Orin (up to 275 TOPS), custom AI chips | Many modern humanoid robots |
| Batteries | Lithium-ion packs today; solid-state batteries (higher density) emerging | Longer operation time |

B. Perception

  • Computer Vision: YOLOv10, Segment Anything Model 2 (SAM-2)
  • SLAM (Simultaneous Localization & Mapping): ORB-SLAM3, Kimera
  • Tactile Sensing: GelSight, DIGIT sensors

C. Planning & Decision Making

  • Task & Motion Planning (TAMP)
  • Large Language Models for high-level planning (2025 breakthrough)
  • SayCan, Code as Policies, RT-2

Example: LLM + Robotics (2025)

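# Sketch: LLM plans, a vision-language model grounds each step; the four helpers are hypothetical stubs.
def llm_generate_plan(command):          # stub for the LLM planning call
    return ["Go to kitchen", "Find kettle", "Fill with water"]
def get_camera_image():                  # stub: would return the latest camera frame
    return None
def vision_language_model(step, image):  # stub: maps (instruction, image) to actions
    return [f"actions for: {step}"]
def execute(actions):                    # stub: would send commands to the controller
    print(actions)

user_command = "Make me a cup of tea"
high_level_plan = llm_generate_plan(user_command)
for step in high_level_plan:
    # Pass text and image as separate inputs, and refresh the image at every step
    low_level_actions = vision_language_model(step, get_camera_image())
    execute(low_level_actions)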

D. Movement & Control

  • Reinforcement Learning (RL) for locomotion
  • Model Predictive Control (MPC)
  • Whole-body control (Boston Dynamics)
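
Model Predictive Control can be illustrated with a very small receding-horizon loop: simulate candidate action sequences with a model, pick the cheapest, apply only its first action, then re-plan. The sketch below does this by brute force for a 1-D point mass. Everything here is an assumption made for illustration (the dynamics, the three discrete accelerations, the cost weights, the horizon of 5); real robot controllers solve a continuous optimisation problem over much richer models.

```python
from itertools import product

DT = 0.1                      # time step in seconds
ACTIONS = (-1.0, 0.0, 1.0)    # candidate accelerations

def step(state, accel):
    """Advance the point-mass model (position, velocity) by one time step."""
    pos, vel = state
    vel = vel + accel * DT
    pos = pos + vel * DT
    return (pos, vel)

def rollout_cost(state, actions, target):
    """Simulate an action sequence and sum tracking, velocity and effort costs."""
    cost = 0.0
    for accel in actions:
        state = step(state, accel)
        pos, vel = state
        cost += (pos - target) ** 2 + 0.1 * vel ** 2 + 0.01 * accel ** 2
    return cost

def mpc_action(state, target, horizon=5):
    """Search all action sequences over the horizon; return the best first action."""
    best_sequence = min(
        product(ACTIONS, repeat=horizon),
        key=lambda seq: rollout_cost(state, seq, target),
    )
    return best_sequence[0]

# Closed loop: re-plan at every step and apply only the first planned action.
state, target = (0.0, 0.0), 1.0
for _ in range(100):
    state = step(state, mpc_action(state, target))
print("final state (position, velocity):", state)
```

Exhaustive search costs 3^horizon rollouts per step, which is why practical MPC relies on gradient-based or sampling-based optimisers instead.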

Leading Humanoid Robots (November 2025)
| Robot | Company | Status (2025) |
|----------------|--------------------|--------------------------------|
| Atlas | Boston Dynamics | Fully electric version, highly agile |
| Optimus Gen 2 | Tesla | Walking in factories |
| Figure 01 | Figure AI | Working in BMW plant (pilot) |
| Apollo | Apptronik | Warehouse tasks |
| Ameca | Engineered Arts | Highly expressive face |

Summary Table – Unit V (2025 Perspective)

| Area | Dominant Technology (2025) | Killer Application |
|------|----------------------------|--------------------|
| Language Models | Multimodal Transformers | AI assistants, code generation |
| Information Retrieval | Dense + Hybrid Retrieval | Semantic search engines |
| NLP | Prompting + Fine-tuning LLMs | Chatbots, content creation |
| Machine Translation | Multilingual seamless models | Real-time global communication |
| Speech | End-to-end neural (Whisper, USM) | Voice AI agents |
| Robotics | LLM-guided + Vision + RL control | Humanoid robots in homes/factories |

Key Takeaway for 2025–2030
We are moving from “AI that talks” → “AI that sees, hears, and acts in the physical world.”
The next revolution = Embodied AI (Robots + LLMs) and AI Agents that can autonomously achieve complex goals.

You now have the complete big picture of AI applications in 2025! 🚀

Last updated: Nov 19, 2025
