Tunde Mobile Churn Prediction
Build a churn prediction model and serve it as an API for Emeka Okafor's retention team at Tunde Mobile, Lagos.
Client
- Emeka Okafor — Head of Customer Retention, Tunde Mobile (MVNO, Lagos)
- Wants: churn predictions ranked by risk, feature importance, API endpoint his team queries weekly
- 200,000 subscribers, losing 2-3% per month
- Has 12 months of billing/subscriber data
Stack
- Python 3.11+ (conda environment: ml)
- pandas — data loading, profiling, preprocessing
- scikit-learn — RandomForestClassifier, preprocessing, evaluation
- Jupyter — notebook workflow
- FastAPI + uvicorn — model serving
- joblib — model serialization
- curl — API testing
- Git/GitHub — version control
File structure
materials/
emeka-brief.md — Emeka's email brief (the project's starting point)
data-dictionary.md — Column-level documentation for subscribers.csv
subscribers.csv — 7,043-row subscriber dataset (12 months ending March 2025)
tickets.md — Work breakdown: T1-T12
CLAUDE.md — This file
images/ — Project images (populated during authoring)
scripts/ — Generation scripts
Working files (notebooks, scripts, model artifacts, logs) go in the project root as you create them.
Tickets
- T1: Read and summarize the brief
- T2: Profile the dataset
- T3: Review data dictionary against profile
- T4: Impute missing values
- T5: Encode categorical features
- T6: Scale features and stratified train/test split
- T7: Train RandomForestClassifier
- T8: Evaluate model (confusion matrix, classification report)
- T9: Extract feature importances
- T10: Build FastAPI endpoint
- T11: Test endpoint (valid + invalid input)
- T12: Add request logging
Evaluation targets
- Churn class recall >= 0.55 on the test set
- Stratified split preserving ~8% churn rate in both train and test sets
- API returns HTTP 200 with a probability between 0 and 1 for valid requests
- Confusion matrix and classification report generated on test set
- Feature importance ranking produced
- Endpoint handles invalid input gracefully
Commit convention
- Meaningful commit messages in imperative mood
- One commit per logical unit of work
- Examples:
  - "Add data profiling notebook"
  - "Train RandomForest with balanced class weights"
  - "Build FastAPI churn prediction endpoint"
  - "Add request logging to prediction API"