Introduction to Machine Learning
What ML practitioners actually do
Machine learning isn't one job. It's a set of overlapping roles that share a common workflow. Here are the main ones you'll see in job listings:
ML Engineer. Builds and deploys models. Takes a business problem, prepares data, trains a model, evaluates it, deploys it as an API, and monitors it in production. This is the broadest role and the one this track most closely follows.
AI Engineer. Builds applications around pre-trained models: RAG systems, evaluation pipelines, agent workflows. Less model training, more system design. Growing fast as LLM applications become standard infrastructure.
Data Scientist (ML-focused). Heavier on analysis and experimentation, lighter on deployment. Explores data, builds models, communicates findings. Often works in notebooks rather than production pipelines.
MLOps Engineer. Focuses on the infrastructure: CI/CD pipelines, model registries, monitoring, drift detection, containerization. Makes sure models work reliably in production, not just in notebooks.
These roles overlap significantly. An ML engineer at a small company does all of it. At a larger company, the work is more specialized. Either way, the underlying workflow is the same.
The professional loop
Every ML project, whether it's a simple prediction model or a complex LLM application, moves through the same cycle:
1. Problem framing. Is ML the right approach? What kind of system is needed? What does success look like? A surprising number of ML projects fail because nobody asked these questions first.
2. Data strategy. For classical ML: what features matter? How do you split the data without leaking information from the future into the past? For LLM applications: what documents go into the retrieval system? How do you chunk them so the model gets useful context?
3. Model strategy. Train a custom model, use a pre-trained one, or fine-tune something in between? This decision depends on your data, your budget, your latency requirements, and what the problem actually needs.
4. Evaluation design. Define what "good" means before you build anything. Choose metrics. Build test cases. Set thresholds. This step exists separately because skipping it is one of the most common mistakes in ML: building first and asking "is it good?" later.
5. Build and integrate. Write the training pipeline or application pipeline. This is where most of the code lives.
6. Systematic evaluation. Run your evaluation suite against what you built. Not just "does it work?" but "where does it fail, and does that matter?"
7. Deploy and serve. Package the system, put it behind an API, containerize it, deploy it to infrastructure. The gap between "works in a notebook" and "works in production" is where most of the engineering lives.
8. Monitor and observe. Watch the system in production. Data changes over time. Models degrade. Costs accumulate. Without monitoring, you won't know until someone complains.
9. Iterate. Use what you learned from monitoring and evaluation to improve the system. Retrain, re-prompt, re-architect, whatever the signals tell you.
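One pitfall from step 2 is worth making concrete: if your data is time-stamped, a random train/test split leaks future information into training. A minimal sketch of a time-based split, using hypothetical field names on plain Python records:

```python
from datetime import date

# Hypothetical records: each row has a timestamp and a label.
rows = [
    {"day": date(2024, 1, 5), "label": 0},
    {"day": date(2024, 2, 10), "label": 1},
    {"day": date(2024, 3, 15), "label": 0},
    {"day": date(2024, 4, 20), "label": 1},
]

# Time-based split: train only on records strictly before the cutoff,
# evaluate on records at or after it. A random split would mix future
# rows into training -- the "leaking the future into the past" mistake.
cutoff = date(2024, 3, 1)
train = [r for r in rows if r["day"] < cutoff]
test = [r for r in rows if r["day"] >= cutoff]

print(len(train), len(test))  # 2 2
```

The same principle applies at any scale: the model must never see data from after the moment it would have made its prediction.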
You'll run this loop in every project in this track. What changes is the complexity: early projects give you a clean dataset and a specified algorithm. Later projects give you a vague client need and expect you to figure out the rest.
The two branches
The track starts with classical ML (projects 1-13) and then moves into LLM applications (projects 14 onward). This order matters.
Classical ML teaches you the fundamentals that don't change: evaluation discipline, data leakage prevention, experiment tracking, the gap between development and production. These transfer directly to LLM work.
LLM applications introduce a different workflow (retrieval instead of feature engineering, prompt design instead of hyperparameter tuning, embedding similarity instead of learned weights), but the same loop applies. Problem framing, evaluation design, deployment, monitoring. The discipline is the same; the terrain is different.
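The "embedding similarity" point can be made concrete: retrieval typically ranks documents by the cosine similarity between embedding vectors. A toy sketch with hand-made 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions, and the document names here are invented):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [0.9, 0.1, 0.0]
docs = {
    "refund policy": [0.8, 0.2, 0.1],
    "api reference": [0.1, 0.0, 0.9],
}

# Retrieval = rank documents by similarity to the query vector.
ranked = sorted(docs, key=lambda name: cosine_similarity(query, docs[name]),
                reverse=True)
print(ranked[0])  # refund policy
```

No weights are learned at query time; the "intelligence" sits in the embedding model that produced the vectors.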
By the end of the track, you'll work on projects that combine both: a production system where classical ML handles structured prediction and an LLM handles natural language, with routing logic that decides which component handles which request.
What you'll work on
Each project is built for a client with a specific problem. You'll direct AI to build the system, interact with the client to clarify requirements, verify the output, and deliver something that works. Here's a sample of what that looks like across the track:
- A churn prediction model deployed as an API for a subscription business
- An experiment tracking pipeline that compares multiple approaches systematically
- A containerized ML service with CI/CD and evaluation gates
- A RAG system that retrieves and synthesizes information from a document corpus
- A fine-tuned model for a domain-specific task
- A production system that combines classical ML and LLM components with routing logic
The projects get harder in specific ways. The data gets messier. The requirements get vaguer. The client stops telling you exactly what they want. The tools multiply. You go from running everything locally to deploying on cloud infrastructure with cost constraints. And throughout, AI is your primary tool: capable and fast, but prone to specific mistakes that you'll learn to catch.
Core tools
These are the tools ML practitioners use daily. You'll set up the core ones in the track setup; the rest are introduced as projects need them.
Terminal. Your command line. Everything runs through it: code, tools, deployments, Claude Code itself.
Claude Code. Your AI coding agent. You'll direct it to write training pipelines, build APIs, analyze data, and debug problems. It's strong at generating ML code, and it makes specific, predictable mistakes with evaluation, data leakage, and production readiness that you'll learn to catch.
Git and GitHub. Version control. Every project lives in a repository. Every change is tracked.
Python. The language of ML. Nearly every ML library, framework, and tool is Python-first. You don't need to be an expert programmer (you're directing AI to write the code), but you need to be comfortable reading Python and understanding what it does.
Jupyter notebooks. Interactive documents where you run code in cells and see results immediately. The standard environment for data exploration, prototyping, and analysis. Most ML work starts in a notebook before moving to production scripts.
scikit-learn. The foundational library for classical ML. Preprocessing, model training, evaluation, pipelines. Clean API, excellent documentation, battle-tested. You'll use it from project 1.
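A minimal sketch of that workflow, chaining preprocessing and a model in a Pipeline on scikit-learn's built-in iris dataset (the model choice and split here are illustrative, not a recommendation):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load a built-in toy dataset and hold out a test set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# A Pipeline chains preprocessing and the model, so the scaler is fit
# only on training data -- another guard against leakage.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
print(round(model.score(X_test, y_test), 2))
```

The Pipeline object is also what you later hand to evaluation, serialization, and serving code, which is why it shows up in nearly every project.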
MLflow. Experiment tracking and model registry. Records what you tried, what worked, and what the results were. Essential for reproducibility. Without it, you're running experiments and losing the results.
FastAPI. Turns your model into an API endpoint. The standard way to serve ML models: send a request with input data, get back a prediction. You'll build your first API in project 1.
Docker. Packages your application and all its dependencies into a container that runs the same way everywhere. The bridge between "works on my machine" and "works in production."
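A minimal Dockerfile for a FastAPI-style service might look like the following sketch (the file names and the contents of `requirements.txt` are assumptions):

```dockerfile
FROM python:3.12-slim
WORKDIR /app

# Install dependencies first so this layer is cached between code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and serve it with uvicorn.
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

`docker build` turns this into an image; `docker run` starts it identically on your laptop, a CI runner, or a cloud host.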
You'll install additional tools as the track progresses: PyTorch for deep learning, Hugging Face for pre-trained models, and others. Each project tells you what's needed.