Context
A basic telco churn dataset intended to challenge students and academics.
Content
Info | Description |
file | churn_100k.csv |
n_samples | 101K |
n_features | 28 |
pct_missing | 1% |
suggested model features
numeric_features = ['monthly_minutes', 'customerServiceCalls', 'streaming_minutes', 'TotalBilled', 'PrevBalance', 'latePayments']
categorical_features = ['ip_address_asn', 'phone_area_code', 'customer_reg_date', 'email_domain', 'phoneModel', 'billing_city', 'billing_postal', 'billing_state', 'partner', 'PhoneService', 'MultipleLines', 'streamingPlan', 'mobileHotspot', 'wifiCallingText', 'OnlineBackup', 'device_protection', 'number_phones', 'contract_code', 'currency_code', 'maling_code', 'paperlessBilling', 'paymentMethod']
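As a minimal sketch of how the two lists above might be used, the snippet below reads rows with the standard-library `csv` module and separates each row into numeric and categorical values. The helper names `split_row` and `load_features` are hypothetical. It assumes that missing values appear as empty cells and that the numeric columns parse as floats; neither is stated in the dataset description. The demo values are made up.

```python
import csv
import io

numeric_features = ['monthly_minutes', 'customerServiceCalls', 'streaming_minutes',
                    'TotalBilled', 'PrevBalance', 'latePayments']
categorical_features = ['ip_address_asn', 'phone_area_code', 'customer_reg_date',
                        'email_domain', 'phoneModel', 'billing_city', 'billing_postal',
                        'billing_state', 'partner', 'PhoneService', 'MultipleLines',
                        'streamingPlan', 'mobileHotspot', 'wifiCallingText',
                        'OnlineBackup', 'device_protection', 'number_phones',
                        'contract_code', 'currency_code', 'maling_code',
                        'paperlessBilling', 'paymentMethod']


def split_row(row):
    """Return (numeric, categorical) dicts for one CSV row.

    Assumption: a missing value is an empty (or absent) cell; it becomes None.
    """
    numeric = {}
    for name in numeric_features:
        raw = (row.get(name) or "").strip()
        numeric[name] = float(raw) if raw else None
    categorical = {}
    for name in categorical_features:
        raw = (row.get(name) or "").strip()
        categorical[name] = raw if raw else None
    return numeric, categorical


def load_features(handle):
    """Read every row from an open CSV file handle."""
    return [split_row(row) for row in csv.DictReader(handle)]


# Tiny in-memory demo with made-up values (not taken from the dataset).
demo = io.StringIO("monthly_minutes,phoneModel\n12.5,\n,abc\n")
demo_rows = load_features(demo)
```

On the real file this would be called as `load_features(open("churn_100k.csv", newline=""))`; columns that are absent or empty simply come back as `None`, which leaves the choice of imputation to the modelling step.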
dataset performance
Random sampling into 70/15/15 train/eval/test splits
Train AUC Score : 0.967279
Eval AUC Score : 0.958073
Test AUC Score : 0.946909
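The description does not say which model or tooling produced these scores, so the following is only a sketch of the evaluation procedure: a random 70/15/15 split over row indices and a rank-based AUC, both in plain Python. The function names and the seed are assumptions for illustration; in practice a library routine such as scikit-learn's `roc_auc_score` would normally be used.

```python
import random


def split_70_15_15(n_rows, seed=0):
    """Shuffle row indices and cut them into train/eval/test parts.

    Integer arithmetic avoids float rounding; any leftover rows go to test.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = n_rows * 70 // 100
    n_eval = n_rows * 15 // 100
    train = indices[:n_train]
    eval_part = indices[n_train:n_train + n_eval]
    test = indices[n_train + n_eval:]
    return train, eval_part, test


def roc_auc(labels, scores):
    """AUC: chance that a random positive scores above a random negative.

    Computed from average ranks, so tied scores count as half a win.
    Labels are expected to be 0 or 1.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

Computing `roc_auc` separately on the train, eval, and test indices gives three scores in the same form as those listed above; a gap between the train and test values is the usual sign of some overfitting.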
Inspiration
Fun and simple dataset to practice with.