Churn Prediction Pipeline

An end-to-end pipeline that scores customer churn risk every week, retrains itself on a schedule, and flags model drift before it silently degrades decisions downstream.

Python
XGBoost
Airflow
AWS

The problem

The product team could see churn only in hindsight: by the time a customer cancelled, it was too late to intervene. They needed a forward-looking risk score that updated automatically and stayed honest as customer behavior shifted over time.

Approach

The pipeline pulls behavioral and billing events nightly, builds a rolling feature set (engagement decay, support ticket velocity, billing friction), and scores every active account weekly with a gradient-boosted model.
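As a rough illustration of what an "engagement decay" feature can look like, the sketch below weights each past event by how recent it is. It is not the project's actual feature code: the function name, the 14-day half-life, and the choice of an exponential half-life weighting are assumptions made for the example.

```python
from datetime import date


def engagement_decay(event_dates, as_of, half_life_days=14.0):
    """Recency-weighted event count (hypothetical feature definition).

    Each event contributes 0.5 ** (age_in_days / half_life_days), so an
    event from today counts as 1.0 and one from a half-life ago as 0.5.
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    score = 0.0
    for event_date in event_dates:
        age_days = (as_of - event_date).days
        if age_days < 0:
            # Skip events dated after the scoring date to avoid leakage.
            continue
        score += 0.5 ** (age_days / half_life_days)
    return score


# Example: one event on the scoring date, one event 14 days earlier.
events = [date(2024, 1, 15), date(2024, 1, 1)]
score = engagement_decay(events, as_of=date(2024, 1, 15))
```

A score that keeps shrinking from week to week means the account's activity is getting older, which is the kind of signal a churn model can pick up.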

Feature pipeline built on scheduled Airflow DAGs, with each stage independently retriable.
XGBoost classifier retrained on a sliding 12-month window, versioned and compared against the live model before promotion.
Drift monitoring on both feature distributions and prediction calibration, with automatic alerts if either drifts past a threshold.
Inference and retraining jobs run on scheduled AWS batch compute, with results written back to the warehouse for the CS team's dashboards.
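One common way to put a number on feature drift is the population stability index (PSI), which compares the binned distribution of a feature at training time with its distribution in recent data. The sketch below is illustrative rather than the pipeline's actual monitor: the function name, the ten equal-width bins, and the 0.2 alert threshold (a widely used rule of thumb) are all assumptions.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a recent sample of one feature.

    Bins are equal-width over the range of the reference sample; values
    outside that range are clamped into the first or last bin.
    """
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / n_bins

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor each share at eps so the log ratio stays defined.
        return [max(c / len(values), eps) for c in counts]

    e_shares = bin_shares(expected)
    a_shares = bin_shares(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_shares, a_shares))


# Hypothetical threshold; a real monitor would tune this per feature.
PSI_ALERT_THRESHOLD = 0.2

reference = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in reference]
should_alert = population_stability_index(reference, shifted) > PSI_ALERT_THRESHOLD
```

Running the same comparison for each feature on every scoring run gives one drift number per feature, which can then be checked against the alert threshold.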

Screenshot

Weekly churn risk distribution and drift monitor, as surfaced to the CS team.

Results

+18% retention on flagged at-risk accounts

Weekly automated scoring cadence

0 manual retrains needed since launch

What I'd do differently

Calibration monitoring came later than it should have. Building it in from day one would have caught an early drift issue sooner. The next iteration would move feature computation from nightly batch to streaming, to shorten the response window.
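For a sense of what a day-one calibration check could look like, here is a minimal sketch of expected calibration error (ECE): predictions are grouped into probability bins, and each bin's average predicted churn probability is compared with the churn rate actually observed. The function name and bin count are assumptions for illustration, not the monitor the project ended up with.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between predicted and observed churn rates.

    probs  -- predicted churn probabilities in [0, 1]
    labels -- observed outcomes, 1 for churned and 0 for retained
    """
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equal length")
    buckets = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        buckets[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for bucket in buckets:
        if not bucket:
            continue
        mean_pred = sum(p for p, _ in bucket) / len(bucket)
        observed_rate = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_pred - observed_rate)
    return ece


# An overconfident model: predicts 0.9 for accounts where only 1 in 4 churned.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 0, 0])
```

Tracked weekly once outcomes are known, a rising ECE signals that the risk scores no longer mean what they say, even if ranking metrics such as AUC look unchanged.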