Learn by Directing AI

tickets.md

Tickets — Tunde Mobile Churn Prediction

Unit 1: The Brief and the Data

T1: Read and summarize the brief

Load emeka-brief.md and confirm understanding of requirements. AC: Can state what Emeka needs in one sentence.

T2: Profile the dataset

Load subscribers.csv and print shape, column types, summary statistics, missing values, and class distribution. AC: Profile output shows row count, column count, types, and churn class split.
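
The profiling step could be sketched as below. This is a minimal sketch, not the course's reference solution: the `profile` helper, the stand-in rows, and the column names (`tenure_months`, `plan_type`, `churn`) are assumptions for illustration, and the real ticket would read `subscribers.csv` instead.

```python
import pandas as pd


def profile(df: pd.DataFrame, target: str) -> dict:
    """Collect the items the ticket asks for (hypothetical helper)."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isna().sum().to_dict(),
        # Share of each class in the target column
        "class_split": df[target].value_counts(normalize=True).to_dict(),
        "summary": df.describe(include="all"),
    }


# Stand-in data with assumed column names; in the ticket this would be
# df = pd.read_csv("subscribers.csv")
demo = pd.DataFrame({
    "tenure_months": [3, 24, None, 12],
    "plan_type": ["prepaid", "postpaid", "prepaid", "prepaid"],
    "churn": [1, 0, 0, 0],
})

report = profile(demo, target="churn")
for key, value in report.items():
    print(f"--- {key} ---")
    print(value)
```

Returning a dictionary rather than only printing makes the profile easy to compare against data-dictionary.md in T3.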

T3: Review data dictionary

Compare profile output against data-dictionary.md. AC: Columns match, types are correct, and no unexpected values appear.

Unit 2: Preprocessing and Splitting

T4: Impute missing values

Handle missing values with an appropriate strategy per column. AC: No nulls remaining. Imputation strategy documented.
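
One possible per-column approach is median for numeric columns and mode for categorical ones. The sketch below assumes that choice; the `impute` helper and the column names are hypothetical, and the right strategy for the real data depends on what the profile in T2 shows.

```python
import pandas as pd


def impute(df, numeric_cols, categorical_cols):
    """Fill nulls and return the cleaned frame plus a record of what was done."""
    out = df.copy()
    strategy = {}
    for col in numeric_cols:
        fill = out[col].median()
        out[col] = out[col].fillna(fill)
        strategy[col] = f"median ({fill})"
    for col in categorical_cols:
        fill = out[col].mode().iloc[0]  # most frequent value
        out[col] = out[col].fillna(fill)
        strategy[col] = f"mode ({fill})"
    return out, strategy


# Stand-in data with assumed column names
demo = pd.DataFrame({
    "tenure_months": [3, 24, None, 12],
    "plan_type": ["prepaid", None, "prepaid", "postpaid"],
})

clean, strategy = impute(demo, ["tenure_months"], ["plan_type"])
print(clean)
print(strategy)  # serves as the documented imputation strategy
```

The returned `strategy` dictionary is one way to satisfy the "documented" part of the acceptance criteria without writing it up by hand.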

T5: Encode categorical features

Apply one-hot or ordinal encoding as appropriate. AC: All features numeric. Encoding choices documented.
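
As a rough illustration of the two encodings: ordinal encoding suits categories with a natural order, one-hot suits categories without one. The columns `usage_tier` and `plan_type` and the order in `tier_order` are assumptions, not columns confirmed by the data dictionary.

```python
import pandas as pd

# Stand-in data with assumed column names
demo = pd.DataFrame({
    "plan_type": ["prepaid", "postpaid", "prepaid"],  # no natural order
    "usage_tier": ["low", "high", "medium"],          # ordered categories
    "tenure_months": [3, 24, 12],
})

encoded = demo.copy()

# Ordinal encoding: map each level to its rank (assumed order)
tier_order = {"low": 0, "medium": 1, "high": 2}
encoded["usage_tier"] = encoded["usage_tier"].map(tier_order)

# One-hot encoding: one 0/1 column per category
encoded = pd.get_dummies(encoded, columns=["plan_type"], dtype=int)

print(encoded)
print(encoded.dtypes)
```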

T6: Scale and split

Scale numerical features and perform stratified train/test split (80/20, random_state=42). AC: Train and test sets exist. Churn proportion within 1 percentage point of original in both sets.
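
A sketch of the split and the proportion check, on synthetic data with assumed column names. It makes one design choice the ticket does not specify: the scaler is fitted on the training set only and then applied to both sets, so that test-set statistics do not leak into preprocessing.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed subscriber data (about 20% churn)
rng = np.random.default_rng(42)
n = 1000
X = pd.DataFrame({
    "tenure_months": rng.integers(1, 72, n),
    "monthly_spend": rng.normal(50, 15, n),
})
y = pd.Series((rng.random(n) < 0.2).astype(int), name="churn")

# Stratified 80/20 split, as specified in the ticket
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit on train only, transform both (assumption, see lead-in)
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Acceptance check: churn share within 1 percentage point of the original
for name, part in [("train", y_train), ("test", y_test)]:
    gap_pp = abs(part.mean() - y.mean()) * 100
    print(f"{name}: churn share {part.mean():.3f}, gap {gap_pp:.2f} pp")
```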

Unit 3: Training and Evaluation

T7: Train the model

Train RandomForestClassifier with class_weight='balanced' and random_state=42. AC: Model object exists, training completes without error.
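
The training call itself is short. The sketch below uses `make_classification` as a stand-in for the real training set, which is an assumption for illustration; only the classifier settings come from the ticket.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (roughly 80/20)
X, y = make_classification(
    n_samples=500, n_features=6, weights=[0.8, 0.2], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# class_weight="balanced" reweights classes inversely to their frequency,
# so the minority (churn) class is not drowned out during training
model = RandomForestClassifier(class_weight="balanced", random_state=42)
model.fit(X_train, y_train)

print(model)
print("classes:", model.classes_)
```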

T8: Evaluate the model

Generate confusion matrix and classification report on the test set. AC: Churn class recall >= 0.55.
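
Evaluation might look like the following sketch. The data is synthetic, so the printed recall says nothing about the real model; the labels `stayed` and `churned` are assumed names for classes 0 and 1.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data and model, mirroring the T7 settings
X, y = make_classification(
    n_samples=500, n_features=6, weights=[0.8, 0.2], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
model = RandomForestClassifier(class_weight="balanced", random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)
print(classification_report(y_test, y_pred, target_names=["stayed", "churned"]))

# Recall for the churn class: churners caught / all actual churners
churn_recall = recall_score(y_test, y_pred, pos_label=1)
print(f"churn recall = {churn_recall:.2f}, AC met: {churn_recall >= 0.55}")
```

Churn recall can also be read straight from the matrix as `cm[1, 1] / cm[1].sum()`.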

T9: Extract feature importances

Print ranked feature importances from the trained model. AC: Feature importance list produced, top features identified.
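
A ranked list can be built by pairing `feature_importances_` with the column names. In this sketch the data is synthetic and the names `feature_0` to `feature_5` are placeholders for the real column names.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data with placeholder feature names
X, y = make_classification(
    n_samples=500, n_features=6, weights=[0.8, 0.2], random_state=42
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

model = RandomForestClassifier(class_weight="balanced", random_state=42)
model.fit(X, y)

# Impurity-based importances, sorted from most to least important
ranked = pd.Series(model.feature_importances_, index=feature_names)
ranked = ranked.sort_values(ascending=False)

print(ranked)
print("top 3:", list(ranked.index[:3]))
```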

Unit 4: Serving the Model

T10: Build the API endpoint

Create a FastAPI app that loads the trained model, accepts subscriber features as JSON, and returns churn probability and binary prediction. AC: Server starts, responds to valid requests with probability between 0 and 1.

T11: Test the endpoint

Test with valid input (curl), test with missing features, test with wrong types. AC: Valid input returns 200 with probability. Invalid input returns appropriate error.

T12: Add request logging

Log each prediction request and response with timestamp. AC: Log file records predictions after curl requests.
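
Request logging can be done with the standard library alone. In this sketch the file name `predictions.log`, the logger name, and the `log_prediction` helper are assumptions; the helper would be called from inside the endpoint after each prediction.

```python
import json
import logging

# Dedicated logger that writes timestamped lines to a file
logger = logging.getLogger("predictions")
logger.setLevel(logging.INFO)
handler = logging.FileHandler("predictions.log")
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
logger.addHandler(handler)


def log_prediction(request_body: dict, response_body: dict) -> None:
    """Write one line per prediction: timestamp, request, and response."""
    logger.info(json.dumps({"request": request_body, "response": response_body}))


# Example call with made-up values
log_prediction(
    {"tenure_months": 3, "monthly_spend": 20.0},
    {"churn_probability": 0.81, "churn": True},
)

with open("predictions.log") as f:
    print(f.read())
```

Writing each entry as one JSON object per line keeps the log easy to parse later.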

Unit 5: Project Close

No tickets. Unit 5 covers committing to Git, pushing to GitHub, writing the README, and delivering results to Emeka.