Learning Plan: AI/LLM Practical Applications for Solution Architecture
Goal: Develop detailed conceptual understanding of AI solution architectures to evaluate technical and economic viability, maintain technical credibility as a sanity-check resource, and position strategically in an AI-transformed landscape.
Target Depth: "AI-literate decision-maker and architect" - sufficient understanding to evaluate whether proposed solutions make sense before they're built, identify architectural limitations vs. implementation problems, and translate between technical possibilities and business requirements.
Time Commitment: 1 hour/day, sustained learning
Background: 15 years in education data/tech consulting, familiar with Karpathy's LLM content, regular Claude/ChatGPT user, data engineering background
Note on Structure: Phase 1 is designed to be completable in ~15 days before your strategy meeting. It front-loads actionable architectural knowledge. Phases 2-5 build deeper foundations and expand into specialized topics.
Phase 2: Neural Network & Statistics Foundations (Weeks 3-4)
Purpose: Fill in the "why" behind what you learned in Phase 1. Understanding how neural networks learn and basic probability helps you reason about model limitations, training trade-offs, and when fine-tuning makes sense.
Week 3: Neural Networks Fundamentals
Primary Resource:
3Blue1Brown: "Neural Networks" Playlist (YouTube, 4 videos, ~1 hour total)
- https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
- Watch at 1.25x speed
- Best visual explanations of backprop, gradient descent, activation functions
- Videos: "What is a Neural Network?", "Gradient Descent", "Backpropagation", "Backpropagation Calculus"
Supplementary:
Fast.ai: Practical Deep Learning Lesson 1 (video + notebooks)
- https://course.fast.ai/Lessons/lesson1.html
- Watch video (~2 hours at 1.5x)
- Skim notebooks to see code structure
- Focus on: what models actually do, transfer learning intuition, training loop
Why this matters: You learned what fine-tuning and training are from Karpathy. Now you need to understand how learning happens (gradient descent, backpropagation) to reason about: (1) why fine-tuning differs from RAG, (2) why more data/compute improves performance, (3) what can go wrong in training (overfitting, underfitting, vanishing gradients), and (4) realistic resource requirements for training.
Key concepts:
- Forward pass: inputs → layers → outputs
- Loss functions: measuring how wrong the model is
- Backpropagation: computing gradients (how to adjust weights)
- Gradient descent: actually adjusting weights
- Learning rate: trade-off between speed and stability
- Overfitting vs. underfitting
- Why deep networks work: hierarchical feature learning
- Transfer learning: starting from pretrained model
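To ground these concepts, here is a minimal NumPy sketch of the full loop (forward pass, loss, gradient, update) on a toy linear model with made-up data; it is illustrative only and nothing like LLM-scale training:

```python
import numpy as np

# Toy data: 100 examples, 3 input features, 1 target value (all made up)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

w = np.zeros(3)          # parameters (weights) the model will learn
learning_rate = 0.1      # hyperparameter: step size for each update

for step in range(200):
    y_pred = X @ w                           # forward pass: inputs -> outputs
    loss = np.mean((y_pred - y) ** 2)        # loss: how wrong the model is (mean squared error)
    grad = 2 * X.T @ (y_pred - y) / len(y)   # gradient: which direction to adjust each weight
    w -= learning_rate * grad                # gradient descent: actually adjust the weights
    if step % 50 == 0:
        print(f"step {step}: loss={loss:.4f}")

print("learned weights:", w)  # should approach [1.5, -2.0, 0.5]
```

Backpropagation is what generalizes the single `grad` line to networks with many layers, via the chain rule; learning-rate choice and overfitting are about how this same loop behaves on real models and data.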
Conceptual exercise (1-2 hours):
Explain in plain language to a non-technical colleague:
- Why does a model need millions of examples to learn?
- What's actually happening when we "train" a model?
- Why does fine-tuning cost less than training from scratch?
- What's the difference between parameters and hyperparameters?
Write this out as if preparing a briefing document.
Daily breakdown:
- Days 1-2: 3Blue1Brown videos 1-2, begin fast.ai Lesson 1
- Days 3-4: 3Blue1Brown videos 3-4, finish fast.ai Lesson 1
- Days 5-6: Review, conceptual exercise
- Day 7: Buffer/catch-up day
Week 4: Statistics & Probability for LLMs
Primary Resource:
StatQuest: "Machine Learning Fundamentals" Playlist (YouTube, selected videos, ~2 hours total)
- https://www.youtube.com/playlist?list=PLblh5JKOoLUICTaGLRoHQDuF_7q2GfuJF
- Watch videos on: probability distributions, cross-entropy, logistic regression, overfitting/regularization
- StatQuest's style is accessible and visual
- Watch at 1.25x
Supplementary:
"Understanding LLM Sampling" Article (search for recent explainer)
- Look for articles explaining temperature, top-p, top-k
- ~20 minutes reading
- Anthropic's docs on sampling might also help
Why this matters: LLM outputs are probabilistic. Understanding sampling (temperature, top-k, top-p) helps you: (1) configure models for different tasks (creative vs. factual), (2) understand why models sometimes produce different outputs for the same input, (3) reason about confidence and uncertainty, and (4) evaluate claims about model behavior.
Key concepts:
- Probability distributions: what it means when a model outputs "probabilities"
- Sampling strategies: greedy, temperature, top-k, top-p (nucleus)
- Cross-entropy: how model "confidence" is measured
- Temperature: controlling randomness (high = creative, low = deterministic)
- Why models are stochastic: sampling from distribution vs. always choosing most likely
- Confidence vs. correctness: models can be confidently wrong
- Regularization intuition: preventing overfitting
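To make the sampling controls concrete, here is a minimal NumPy sketch over an invented next-token distribution (the vocabulary and logits are made up; real providers implement these controls server-side):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "a", "mat"]          # toy vocabulary
logits = np.array([2.0, 1.5, 0.5, 0.2, -0.5, -1.0])     # made-up model scores

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sample(logits, temperature=1.0, top_k=None, top_p=None):
    if temperature == 0:
        return int(np.argmax(logits))          # greedy: always pick the most likely token
    # Temperature: divide logits before softmax; low temp sharpens, high temp flattens
    probs = softmax(np.asarray(logits, dtype=float) / temperature)

    order = np.argsort(probs)[::-1]            # tokens from most to least likely
    keep = np.ones_like(probs, dtype=bool)
    if top_k is not None:                      # top-k: keep only the k most likely tokens
        keep[order[top_k:]] = False
    if top_p is not None:                      # top-p (nucleus): smallest set with cumulative prob >= p
        cum = np.cumsum(probs[order])
        cutoff = np.searchsorted(cum, top_p) + 1
        keep[order[cutoff:]] = False

    probs = np.where(keep, probs, 0.0)
    probs /= probs.sum()                       # renormalize over the kept tokens
    return int(rng.choice(len(probs), p=probs))

for t in (0, 0.3, 1.0):
    picks = [vocab[sample(logits, temperature=t, top_p=0.9)] for _ in range(5)]
    print(f"temperature={t}: {picks}")
```

Note that temperature=0 collapses to the greedy, effectively deterministic case, which is why low temperatures suit factual or extraction tasks while higher temperatures suit creative ones.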
Practical exercise (1-2 hours):
Test sampling parameters yourself:
- Use Claude/ChatGPT API playground (or similar)
- Same prompt, vary temperature from 0 to 1
- Observe output differences
- Document: when would you use temp=0? temp=0.7? temp=1?
- Create guideline: "For task X, use temperature Y because..."
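If you prefer to script the comparison rather than click through a playground, here is a minimal sketch using the Anthropic Python SDK; it assumes an `ANTHROPIC_API_KEY` environment variable and a small credit balance, and the model name is an example to check against current docs:

```python
# pip install anthropic  (minimal sketch; assumes ANTHROPIC_API_KEY is set in your environment)
import anthropic

client = anthropic.Anthropic()
prompt = "Suggest a title for a report on district-wide reading assessment results."

for temperature in (0.0, 0.7, 1.0):
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",   # example model alias; check current docs for availability
        max_tokens=100,
        temperature=temperature,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"temperature={temperature}: {response.content[0].text}")
```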
Daily breakdown:
- Days 1-2: StatQuest videos on probability, distributions
- Days 3-4: StatQuest videos on cross-entropy, regularization
- Day 5: Sampling explainer articles, sampling exercise
- Days 6-7: Review, consolidate notes on when to use different sampling strategies
Phase 3: MCP, Skills, and Claude-Specific Tooling (Week 5)
Purpose: Understand Anthropic's contributions to the ecosystem (MCP, Skills) as case studies for general patterns. These are increasingly industry-standard, not just Claude-specific.
Days 1-3: Model Context Protocol (MCP)
Primary Resource:
Official MCP Documentation
- https://modelcontextprotocol.io/
- Read: Introduction, Quickstart, Core Concepts
- ~1 hour total reading
"Model Context Protocol Explained" by Nir Diamant (Substack)
- https://diamantai.substack.com/p/model-context-protocol-mcp-explained
- ~30 minutes
- Excellent practical examples
Supplementary Video:
"Building Agents with Model Context Protocol" Workshop (AI Engineer Summit)
- Search for Mahesh Murag's workshop on YouTube
- ~1 hour at 1.5x speed
- Demos of building simple MCP servers
Why this matters: MCP is becoming an industry standard for connecting LLMs to data sources and tools. You need to understand: (1) client-server architecture, (2) what MCP servers provide (prompts, resources, tools), (3) implementation realities (how big is the task, what languages, frameworks available), (4) security considerations, and (5) when MCP makes sense vs. custom API integration.
Key concepts:
- Architecture: host (the AI app) runs one or more clients, each connected 1:1 to a server (data/tools)
- MCP primitives: Prompts (templates), Resources (data), Tools (functions)
- JSON-RPC protocol: how messages are structured
- Server implementation: Python or TypeScript SDK, ~100-500 lines for basic server
- Security: authentication, authorization, sandboxing
- Testing: unit tests for tools, integration tests for full flow
- When to use MCP: standard integrations, multiple clients, model-agnostic
- When NOT to use MCP: simple one-off integration, extreme latency requirements
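For a sense of scale, a "hello world" MCP server in Python is short. Below is a minimal sketch assuming the official Python SDK's `FastMCP` helper, with a hypothetical student-lookup tool standing in for a real integration:

```python
# pip install mcp  — minimal sketch using the official Python SDK's FastMCP helper.
# The student-lookup tool and its data are hypothetical, for illustration only.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("student-info")   # server name shown to MCP clients

# Stand-in for a real student information system API call
FAKE_RECORDS = {"12345": {"name": "A. Student", "grade": 7, "reading_level": "on track"}}

@mcp.tool()
def get_student_summary(student_id: str) -> dict:
    """Return a summary record for a student ID (a Tool exposed to the LLM)."""
    return FAKE_RECORDS.get(student_id, {"error": "student not found"})

@mcp.resource("students://roster")
def roster() -> str:
    """Expose the current roster as a read-only Resource."""
    return ", ".join(FAKE_RECORDS.keys())

if __name__ == "__main__":
    mcp.run()   # serves over stdio by default, so a local client (e.g., Claude Desktop) can connect
```

Most of the protocol plumbing (JSON-RPC framing, capability negotiation) is handled by the SDK; in a real project the custom work is the API integration, authentication, and error handling inside each tool.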
Implementation realities assessment (2-3 hours):
For a hypothetical MCP server project, estimate:
Scenario: Build an MCP server that connects to the company's student information system API
Your assessment:
- Developer skill level needed: junior/mid/senior? Specialization?
- Implementation time: hours/days/weeks?
- What's handled by SDK vs. custom code?
- Testing strategy: what needs to be tested?
- Ongoing maintenance: what breaks when APIs change?
- Cost to run: server hosting, API calls, monitoring
- Alternative approaches: when would you not use MCP here?
Daily breakdown:
- Day 1: Official MCP docs (Introduction, Quickstart)
- Day 2: Official MCP docs (Core Concepts), Nir Diamant article
- Day 3: Workshop video, implementation realities assessment
Days 4-5: Claude Skills & System Prompts
Primary Resource:
Anthropic's Skills Documentation
- Available in Claude desktop app under Skills
- Examine existing skills: docx, pptx, xlsx, pdf skills
- Read SKILL.md files to understand structure
- ~1 hour exploration
"Building Effective System Prompts" Guide
- Search Anthropic docs for prompt engineering guidance
- Focus on: instruction hierarchy, example patterns, structured outputs
- ~30 minutes
Why this matters: Skills are structured knowledge injection. Understanding them helps you: (1) recognize when to provide context via skills vs. RAG, (2) design effective system prompts for custom applications, (3) understand how LLMs follow layered instructions, and (4) evaluate "custom GPT" proposals.
Key concepts:
- Skill architecture: markdown files with instructions, examples, best practices
- Why skills work: injected into system prompt, high-priority instructions
- SKILL.md design patterns: clear objectives, step-by-step guidance, examples, troubleshooting
- Interaction with computer use: skills guide file creation, tool usage
- Portability: skills are just markdown, transferable to other LLMs
- When to use skills: reusable workflows, best practices for specific tasks
- When to use RAG instead: dynamic data, large knowledge bases
Design exercise (2-3 hours):
Create a SKILL.md file for a specific education use case:
Scenario: "Analyzing standardized test data to identify learning gaps"
Your skill should include:
- Clear objective statement
- Step-by-step analysis workflow
- Examples of good analysis patterns
- Common pitfalls to avoid
- Output format requirements
- When to escalate to human review
This exercise forces you to think through structured instruction design.
Daily breakdown:
- Day 4: Explore existing Claude skills, read SKILL.md files
- Day 5: System prompt guidance, skill design exercise
Days 6-7: Claude Computer Use Environment
Primary Resource:
Anthropic's Computer Use Documentation
- https://platform.claude.com/docs/en/agents-and-tools/tool-use/computer-use-tool
- Read all sections
- ~45 minutes
Computer Use Demo Videos
- Search for "Claude computer use demo" on YouTube
- Watch 2-3 recent demos
- ~30 minutes total
Why this matters: Understanding the computer use environment helps you: (1) evaluate proposals involving file creation/manipulation, (2) understand what's possible vs. what requires special setup, (3) reason about security and sandboxing, and (4) assess developer requirements for building computer-use applications.
Key concepts:
- Container environment: Ubuntu, isolated, ephemeral
- File system: /home/claude (work), /mnt/user-data/uploads (inputs), /mnt/user-data/outputs (deliverables)
- Available tools: bash, file operations, package installation
- Limitations: network restrictions, no persistent state between sessions
- Caching behavior: what persists, what resets
- Security model: sandboxing, what Claude can/cannot access
- Use cases: document creation, data analysis, code generation
Architectural pattern exercise (1-2 hours):
Design a workflow using computer use:
Scenario: "Automated report generation from uploaded CSV data"
Your design:
- Input: what does user provide?
- Processing steps: bash commands, file operations, tools used
- Output: what gets delivered to /outputs?
- Error handling: what could go wrong?
- Skill integration: what SKILL.md instructions are needed?
- Cost/time estimate: per report generated
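As a reference point for the processing step, here is a minimal sketch of the kind of script Claude might write and run inside the container; the directory paths follow the layout listed earlier, while the CSV columns and analysis are invented:

```python
# Sketch of a report-generation script run inside the computer-use container.
# Paths follow the documented layout; the CSV columns are invented for illustration.
from pathlib import Path
import pandas as pd

UPLOADS = Path("/mnt/user-data/uploads")     # where the user's files arrive
OUTPUTS = Path("/mnt/user-data/outputs")     # where deliverables must be written
OUTPUTS.mkdir(parents=True, exist_ok=True)

csv_files = sorted(UPLOADS.glob("*.csv"))
if not csv_files:
    raise SystemExit("No CSV uploaded; nothing to report on.")  # error handling: fail loudly

df = pd.read_csv(csv_files[0])

# Hypothetical analysis: per-school averages on an assumed 'score' column
summary = df.groupby("school")["score"].agg(["count", "mean"]).round(1)

report = ["# Assessment Summary", "", summary.to_string()]
(OUTPUTS / "report.md").write_text("\n".join(report))
print("Report written to", OUTPUTS / "report.md")
```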
Daily breakdown:
- Day 6: Computer use docs, demo videos
- Day 7: Architectural pattern exercise, Phase 3 review
Reference Materials (Keep Accessible)
Essential Documentation
| Resource | Purpose | URL |
|---|---|---|
| Anthropic API Docs | Tool use, caching, models | https://docs.anthropic.com |
| OpenAI Platform Docs | Embeddings, fine-tuning | https://platform.openai.com/docs |
| MCP Specification | Protocol details | https://modelcontextprotocol.io |
| Pinecone RAG Guide | RAG best practices | https://www.pinecone.io/learn/ |
Video Resources
- Karpathy's "Deep Dive into LLMs" (3.5 hours): https://youtube.com/watch?v=7xTGNNLPyMI
- Karpathy's "Let's Build GPT Tokenizer" (2 hours): https://youtube.com/watch?v=zduSFxRajkE
- 3Blue1Brown Neural Networks Playlist: https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
- StatQuest Machine Learning Playlist: https://www.youtube.com/playlist?list=PLblh5JKOoLUICTaGLRoHQDuF_7q2GfuJF
Cost Calculators & Tools
- OpenAI Tokenizer: https://platform.openai.com/tokenizer
- Anthropic Pricing: https://www.anthropic.com/pricing
- Model comparison (LMSYS): https://chat.lmsys.org/?leaderboard
Your Created Materials
Keep these in an accessible reference folder:
- RAG Cost Analysis Exercise (Phase 1, Days 4-7)
- Token Economics Spreadsheet (Phase 1, Days 8-10)
- BS Detection Checklist (Phase 1, Days 14-15)
- MCP Implementation Assessment (Phase 3, Days 1-3)
- Production RAG Checklist (Phase 4, Week 6)
- All Phase 5 Decision Frameworks
Pacing Notes & Adjustments
If you're moving faster:
- Deep dive into Karpathy's full "Neural Networks: Zero to Hero" course
- Implement actual RAG system (LangChain + Chroma + OpenAI embeddings)
- Take fast.ai full Practical Deep Learning course
- Build actual MCP server for a real use case
If you're moving slower:
- Phase 1 is the priority—extend it to 3 weeks if needed
- Phase 2 (foundations) can be compressed or skipped if time-pressured
- Phases 4-5 can be done "on-demand" when you encounter those specific needs
- Focus on exercises over reading—hands-on builds intuition faster
The key metric: Can you evaluate an AI solution proposal and write a 1-page technical assessment covering: viability, cost structure, failure modes, alternative approaches, and team requirements? That's the goal.
Cost Summary
| Resource | Cost |
|---|---|
| All video courses (YouTube, fast.ai, Coursera auditing) | Free |
| Documentation (Anthropic, OpenAI, Microsoft, etc.) | Free |
| API experimentation (OpenAI, Anthropic playgrounds) | ~$5-10 (optional) |
| Optional: Coursera verified certificates | ~$49 each |
| Optional: Hands-on RAG implementation | ~$20 (API credits) |
Minimum cost: $0 (all core resources are free; API experimentation is optional)
Success Indicators by Phase
After Phase 1 (Pre-meeting):
- You can explain RAG to a non-technical executive and identify when it's appropriate
- You can estimate token costs for a proposed AI solution and spot economic red flags
- You can distinguish between genuine architectural complexity and unnecessary "agentic" framing
- You have a checklist of questions to ask about any AI proposal
After Phase 2 (Foundations):
- You understand why fine-tuning differs from RAG at a mechanical level
- You can explain when more training data helps vs. when it doesn't
- You understand model behavior (sampling, temperature) well enough to configure systems appropriately
After Phase 3 (MCP & Claude Tooling):
- You can evaluate MCP server proposals and estimate implementation effort
- You understand when to use system prompts/skills vs. RAG for knowledge injection
- You know what's possible with computer use and what requires custom infrastructure
After Phase 4 (Production RAG):
- You can design evaluation frameworks for RAG systems
- You understand production considerations beyond MVP (monitoring, iteration, cost optimization)
- You can recommend specific architectural patterns for RAG use cases
After Phase 5 (Decision Frameworks):
- You have reusable frameworks for rapid evaluation of AI proposals
- You can generate technical assessments of proposals in <30 minutes
- You can confidently recommend offshore-suitable vs. senior-required work
- You maintain technical credibility while translating between technical and business stakeholders
Meta Notes on Learning Approach
Why this structure:
- Front-loaded actionability: Phase 1 gets you to "credible evaluator" in 15 days, even though it's pedagogically backwards
- Foundations when they're most useful: After seeing practical applications, foundations make more sense
- Exercise-heavy: Each phase includes hands-on work because concepts without application don't stick
- Reference-optimized: Materials chosen for ongoing utility, not just one-time reading
- Economic focus: Unusual for learning plans, but critical for your role as solution architect
Learning philosophy: You're not trying to become an ML engineer—you're building "informed buyer" expertise. The goal is knowing enough to ask the right questions, spot impossible claims, and translate between technical possibilities and business requirements. This requires deeper understanding than typical "intro to AI" content, but different depth than an implementer needs.
