Churn Prediction Pipeline
An end-to-end pipeline that scores customer churn risk every week, retrains itself on a schedule, and flags model drift before it silently degrades decisions downstream.
The problem
The product team could see churn happening in hindsight — by the time a customer cancelled, it was too late to intervene. They needed a forward-looking risk score that updated automatically and stayed honest as customer behavior shifted over time.
Approach
The pipeline pulls behavioral and billing events nightly, builds a rolling feature set (engagement decay, support ticket velocity, billing friction), and scores every active account weekly with a gradient-boosted model.
- Feature pipeline built on scheduled Airflow DAGs, with each stage independently retriable.
- XGBoost classifier retrained on a sliding 12-month window, versioned and compared against the live model before promotion.
- Drift monitoring on both feature distributions and prediction calibration, with automatic alerts if either drifts past a threshold.
- Inference and retraining jobs run on scheduled AWS batch compute, with results written back to the warehouse for the CS team's dashboards.
Screenshot
Weekly churn risk distribution and drift monitor, as surfaced to the CS team.
Results
What I'd do differently
Calibration monitoring came later than it should have — building it in from day one would have caught an early drift issue faster. Next iteration moves feature computation to streaming rather than nightly batch, to shorten the response window.