
CLAUDE.md

Tunde Mobile Churn Prediction

Build a churn prediction model and serve it as an API for Emeka Okafor's retention team at Tunde Mobile, Lagos.

Client

  • Emeka Okafor — Head of Customer Retention, Tunde Mobile (MVNO, Lagos)
  • Wants: churn predictions ranked by risk, feature importances, and an API endpoint his team can query weekly
  • 200,000 subscribers, losing 2-3% per month
  • Has 12 months of billing/subscriber data

Stack

  • Python 3.11+ (conda environment: ml)
  • pandas — data loading, profiling, preprocessing
  • scikit-learn — RandomForestClassifier, preprocessing, evaluation
  • Jupyter — notebook workflow
  • FastAPI + uvicorn — model serving
  • joblib — model serialization
  • curl — API testing
  • Git/GitHub — version control
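
The handoff between the notebook and the API runs through joblib: the notebook serializes the fitted model, and the FastAPI process loads it at startup. A minimal round-trip sketch, using synthetic data as a stand-in since it doesn't load subscribers.csv:

```python
import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Stand-in training data (the real project fits on subscribers.csv)
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
model = RandomForestClassifier(n_estimators=50, random_state=42).fit(X, y)

# Notebook side: persist the fitted model
joblib.dump(model, "churn_model.joblib")

# API side: load at startup and score
loaded = joblib.load("churn_model.joblib")
probs = loaded.predict_proba(X[:1])  # shape (1, 2); column 1 is the positive class
```

The loaded model reproduces the original's predictions exactly, which is what makes training and serving safe to split across processes.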

File structure

materials/
  emeka-brief.md        — Emeka's email brief (the project's starting point)
  data-dictionary.md    — Column-level documentation for subscribers.csv
  subscribers.csv       — 7,043-row subscriber dataset (12 months ending March 2025)
  tickets.md            — Work breakdown: T1-T12
  CLAUDE.md             — This file
  images/               — Project images (populated during authoring)
  scripts/              — Generation scripts

Working files (notebooks, scripts, model artifacts, logs) go in the project root as you create them.

Tickets

  • T1: Read and summarize the brief
  • T2: Profile the dataset
  • T3: Review data dictionary against profile
  • T4: Impute missing values
  • T5: Encode categorical features
  • T6: Scale features and stratified train/test split
  • T7: Train RandomForestClassifier
  • T8: Evaluate model (confusion matrix, classification report)
  • T9: Extract feature importances
  • T10: Build FastAPI endpoint
  • T11: Test endpoint (valid + invalid input)
  • T12: Add request logging

Evaluation targets

  • Churn class recall >= 0.55 on the test set
  • Stratified split preserving ~8% churn rate in both train and test sets
  • API returns HTTP 200 with a probability between 0 and 1 for valid requests
  • Confusion matrix and classification report generated on test set
  • Feature importance ranking produced
  • Endpoint handles invalid input gracefully

Commit convention

  • Meaningful commit messages in imperative mood
  • One commit per logical unit of work
  • Examples:
    • "Add data profiling notebook"
    • "Train RandomForest with balanced class weights"
    • "Build FastAPI churn prediction endpoint"
    • "Add request logging to prediction API"