# Lesson 2: The Intelligence Spectrum
## Why Multiple Tiers Exist
You might wonder: why don't AI companies just offer their best model and call it a day?
The answer comes down to a fundamental tradeoff that applies to all AI models:
**The AI Tradeoff Triangle:**
1. **Intelligence** - How smart/capable the model is
2. **Speed** - How fast it responds
3. **Cost** - How much it costs per query
Here's the hard truth: **You can optimize for two, but not all three.**
- Want maximum intelligence AND speed? It'll be expensive.
- Want cheap AND intelligent? It'll be slow.
- Want fast AND cheap? It won't be as smart.
This is why every major AI provider offers multiple tiers.
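To see how quickly the cost leg of that tradeoff adds up at scale, here is a minimal sketch of the standard per-query cost calculation. The per-million-token prices below are hypothetical placeholders, not any provider's real rates:

```python
# Rough per-query cost estimate. Prices are HYPOTHETICAL placeholders,
# not real provider rates -- check your provider's pricing page.
HYPOTHETICAL_PRICES = {
    # (input $/1M tokens, output $/1M tokens)
    "premium-tier":  (15.00, 75.00),
    "balanced-tier": (3.00, 15.00),
    "light-tier":    (0.25, 1.25),
}

def query_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request at the given tier."""
    in_price, out_price = HYPOTHETICAL_PRICES[tier]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

# A 2,000-token prompt with a 500-token answer, at 1 million requests/month:
for tier in HYPOTHETICAL_PRICES:
    monthly = query_cost(tier, 2_000, 500) * 1_000_000
    print(f"{tier}: ~${monthly:,.0f}/month")
```

Run the numbers for your own traffic and the gap between tiers becomes very concrete.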
---
## The Major Provider Lineups
### Anthropic (Claude)
| Model | Intelligence | Speed | Cost | Best For |
|-------|--------------|-------|------|----------|
| **Opus** | ⭐⭐⭐⭐⭐ | ⭐⭐ | $$$$ | Complex analysis, research, nuanced reasoning |
| **Sonnet** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | $$ | General purpose, everyday tasks |
| **Haiku** | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | $ | High-volume, simple tasks |
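In practice, switching tiers is just a one-string change in the API call. A minimal sketch using the Anthropic Python SDK (the model IDs are illustrative and change over time; check the current model list before copying them):

```python
# Requires: pip install anthropic, and ANTHROPIC_API_KEY set in your environment.
# Model IDs below are illustrative -- they change over time.
import anthropic

client = anthropic.Anthropic()

def ask(model_id: str, prompt: str) -> str:
    """Send the same prompt to whichever tier you choose; only the model ID changes."""
    response = client.messages.create(
        model=model_id,
        max_tokens=512,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

# Same code path, different tier:
print(ask("claude-3-haiku-20240307", "Summarize this in one sentence: ..."))
print(ask("claude-3-opus-20240229", "Analyze the tradeoffs in this policy: ..."))
```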
### OpenAI (GPT)
| Model | Intelligence | Speed | Cost | Best For |
|-------|--------------|-------|------|----------|
| **GPT-4o** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | $$$ | Complex tasks, multimodal |
| **GPT-4o-mini** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | $ | Most everyday tasks |
### Google (Gemini)
| Model | Intelligence | Speed | Cost | Best For |
|-------|--------------|-------|------|----------|
| **Pro/Ultra** | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | $$$ | Professional, complex use |
| **Flash** | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | $ | Speed-critical applications |
### Meta (Llama - Open Source)
| Model | Intelligence | Speed | Cost | Best For |
|-------|--------------|-------|------|----------|
| **405B** | ⭐⭐⭐⭐⭐ | ⭐⭐ | Self-host | Maximum open-source capability |
| **70B** | ⭐⭐⭐⭐ | ⭐⭐⭐ | Self-host | Balanced open-source |
| **8B** | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | Self-host | Runs on consumer hardware |
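Because Llama is open-weight, the lightweight 8B model can be served on your own machine. A minimal sketch, assuming a local Ollama server is running on its default port with a Llama model already pulled (the endpoint and model tag are illustrative):

```python
# Minimal local-inference sketch. Assumes Ollama is installed and serving on
# its default port (http://localhost:11434) with a Llama model pulled,
# e.g. `ollama pull llama3.1:8b`. Endpoint and model tag are illustrative.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1:8b",  # lightweight tier: fits on consumer hardware
        "prompt": "Classify the sentiment of: 'The service was slow but friendly.'",
        "stream": False,         # return one JSON object instead of a token stream
    },
    timeout=120,
)
print(resp.json()["response"])
```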
---
## What Do the Numbers Mean? (Parameter Counts)
When you see "70B" or "405B," that's the parameter count in billions.
**Parameters** are the learned numerical weights inside the model's neural network: in rough terms, where its "knowledge" is stored. More parameters generally mean:
- ✅ More knowledge capacity
- ✅ Better reasoning ability
- ❌ Slower inference
- ❌ More expensive to run
But parameter count is not a perfect predictor of quality! Training quality, architecture, and data matter enormously. A well-trained 70B model can outperform a poorly trained 405B model.
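To make the sizes concrete, here is a back-of-the-envelope estimate of the memory needed just to hold the weights (ignoring activations and the KV cache): roughly 2 bytes per parameter at 16-bit precision, and about 0.5 bytes per parameter with 4-bit quantization.

```python
# Back-of-the-envelope memory for model WEIGHTS only (no activations / KV cache).
# 16-bit floats = 2 bytes per parameter; 4-bit quantization ~= 0.5 bytes.
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """GB needed just to hold the weights: params * bytes-per-param / 1e9."""
    return params_billion * 1e9 * bytes_per_param / 1e9

for size in (8, 70, 405):
    fp16 = weight_memory_gb(size, 2.0)
    q4 = weight_memory_gb(size, 0.5)
    print(f"{size}B params: ~{fp16:.0f} GB at fp16, ~{q4:.0f} GB at 4-bit")
```

This is why an 8B model (roughly 4-16 GB of weights) fits on consumer hardware while a 405B model needs a multi-GPU server.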
---
## Practical Application: Matching Model to Task
Here's a decision framework:
**Use the most powerful tier (Opus, GPT-4o, Pro) when:**
- Task requires complex reasoning
- Nuance and accuracy are critical
- You're doing research or analysis
- Creative work requiring sophistication
- Stakes are high (legal, medical, financial)
**Use the balanced tier (Sonnet, GPT-4o-mini, Flash) when:**
- General conversation and assistance
- Summarization
- Most coding tasks
- Content drafting
- You need good results without premium cost
**Use the lightweight tier (Haiku, Llama 8B) when:**
- High-volume, repetitive tasks
- Classification or simple extraction
- Customer service chatbots
- Quick, simple queries
- Cost is a primary concern
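The framework above can be codified as a simple lookup table. A minimal sketch (the task categories and tier labels are illustrative, not an official taxonomy):

```python
# The decision framework above as a simple lookup table.
# Task categories and tier labels are illustrative, not an official taxonomy.
TIER_BY_TASK = {
    "legal_review":      "powerful",     # high stakes, nuance critical
    "research_analysis": "powerful",
    "code_generation":   "balanced",
    "summarization":     "balanced",
    "content_draft":     "balanced",
    "sentiment_label":   "lightweight",  # high-volume classification
    "faq_chatbot":       "lightweight",
}

def pick_tier(task_type: str) -> str:
    # Default to the balanced tier when the task type isn't recognized.
    return TIER_BY_TASK.get(task_type, "balanced")

print(pick_tier("summarization"))  # balanced
print(pick_tier("legal_review"))   # powerful
print(pick_tier("unknown_task"))   # balanced (safe default)
```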
---
## Advanced: Model Routing
Production AI applications often use **model routing** - automatically choosing which model to use based on the query.
Example:
1. User asks: "What's the weather?"
→ Route to Haiku (simple, fast)
2. User asks: "Analyze the geopolitical implications of this trade policy"
→ Route to Opus (complex, needs horsepower)
This can reduce costs by 60-80% while maintaining quality for complex queries.
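One common pattern is to let the cheapest model act as the router: it classifies the incoming query's complexity, and only hard queries get escalated. A minimal sketch using the Anthropic Python SDK (the model IDs and the classification prompt are illustrative):

```python
# Minimal model-routing sketch: a cheap model classifies complexity, then the
# query is dispatched to the cheapest tier that can handle it.
# Requires ANTHROPIC_API_KEY; model IDs and the routing prompt are illustrative.
import anthropic

client = anthropic.Anthropic()
CHEAP, POWERFUL = "claude-3-haiku-20240307", "claude-3-opus-20240229"

def classify_complexity(query: str) -> str:
    """Ask the cheap model to label the query as SIMPLE or COMPLEX."""
    resp = client.messages.create(
        model=CHEAP,
        max_tokens=5,
        messages=[{
            "role": "user",
            "content": f"Answer with one word, SIMPLE or COMPLEX. "
                       f"How hard is this request?\n\n{query}",
        }],
    )
    return resp.content[0].text.strip().upper()

def route(query: str) -> str:
    """Send the query to the powerful tier only if it was classified as complex."""
    model = POWERFUL if "COMPLEX" in classify_complexity(query) else CHEAP
    resp = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=[{"role": "user", "content": query}],
    )
    return resp.content[0].text

print(route("What's the weather like in spring in Madrid?"))
print(route("Analyze the geopolitical implications of this trade policy: ..."))
```

Note that the router itself adds a small extra call per query, so routing pays off when the traffic mix is dominated by simple requests.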
---
## 🎯 Key Takeaways
1. Model tiers exist because of the intelligence/speed/cost tradeoff
2. Bigger isn't always better - match the model to the task
3. Parameter count (70B, 405B) indicates model size but not always quality
4. Smart systems route between models based on query complexity
---
## 💬 Discussion Questions
1. What tasks do you use AI for? Which tier makes sense for them?
2. Have you noticed quality differences between model tiers?
3. If you're building an application, how do you decide which model to use?
---
## 📚 Up Next
**Lesson 3: Reasoning Models** - What are OpenAI's o1/o3 models? What is "extended thinking"? When should you use a reasoning model vs. a standard model?