Customer churn – Survival analysis

Customer churn

The definition of customer churn varies across industries and across stages of the customer life cycle. In general, though, a churned customer is one who stops using a company’s products and services over a given period of time.

There are many machine learning models for predicting customer churn. Anticipating customer churn has several benefits:

  • Customers who are likely to leave can be identified early, so retention measures can be applied before they churn;
  • Customers at risk can be analyzed to find the biggest differences between churned and retained customers;
  • A timely and effective early warning mechanism can be built around the observed churn patterns.

A churn prediction model tells us which customers are likely to leave and how likely they are to do so, and we then have to design strategies to keep those at-risk customers. A problem remains, however: the model tells us who will probably churn and which features matter, but it does not give us the “hook” for retaining those customers. Data analysts can only break down the churned customers by the most influential features and look for clues.

Survival analysis

Cox Proportional Hazards Model

The Cox model is a semi-parametric regression model proposed by the British statistician D. R. Cox in 1972. It is commonly used in medical research to analyze the influence of one or more predictor variables on patient survival time.

The most interesting aspect of this survival modeling is its ability to examine the relationship between survival time and predictive variables.

For example, if we are examining patient survival, the predictive variables could be age, blood pressure, gender, smoking habits, etc. These predictive variables are usually referred to as covariates.

Explanation of model parameters:

  • Hazard function λ(t): gives the instantaneous risk of the event (death in medical studies, churn here) at time t;
  • Covariates Z: the feature vector;
  • Baseline hazard function λ₀(t): describes how the risk of the event changes over time. It is the hazard when all covariates are equal to 0.
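
Written out, the three quantities above combine into the standard form of the Cox model. The coefficient vector β is notation introduced here for illustration (it does not appear in the text above); its entries are the per-feature weights that the fit estimates.

```latex
% Cox proportional hazards model: baseline hazard scaled by the covariates
\lambda(t \mid Z) = \lambda_0(t)\,\exp\!\left(\beta^{\top} Z\right)

% Ratio of hazards for two customers with covariates Z_1 and Z_2:
% the baseline cancels, so the ratio does not depend on t
\frac{\lambda(t \mid Z_1)}{\lambda(t \mid Z_2)}
  = \exp\!\left(\beta^{\top}(Z_1 - Z_2)\right)
```

Because the baseline hazard cancels in the ratio, a one-unit increase in feature k multiplies the hazard by exp(β_k) at every time t, which is the "proportional hazards" property.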

In addition, unlike Kaplan-Meier curves, which are commonly used in univariate analysis, the Cox model is a multivariate survival analysis method: it can include both categorical variables (such as gender) and numerical variables (such as age).

Kaplan-Meier curves can only compare groups defined by categorical variables. Cox regression extends survival analysis to evaluate the impact of several risk factors on survival time simultaneously, which makes it more widely applicable.
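
To make the contrast concrete, here is a minimal sketch of the Kaplan-Meier estimator in plain Python. The function name `kaplan_meier` and the toy data are assumptions made for illustration, not part of the original analysis. Note that the estimator takes only durations and event flags, so comparing customers means computing one curve per category.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    durations: time each customer was observed (e.g. months)
    events:    1 if churn was observed, 0 if the customer is censored
    """
    churned = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)          # everyone who exits the risk set at t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = churned.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: one curve per category of a binary feature
group_a = kaplan_meier([1, 2, 2, 3], [1, 0, 1, 1])
group_b = kaplan_meier([2, 4, 5, 6], [1, 1, 0, 1])
print(group_a)
print(group_b)
```

A numerical covariate such as age would first have to be binned into groups before curves like these could be compared, which is the limitation the Cox model removes.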

Model application

Taking a telecom customer churn data set from Kaggle as an example, we use the lifelines package to build the hazard model.

  1. Read the data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv('Telecom_customer churn.csv')
df = df.dropna()
df.set_index('Customer_ID', inplace=True)
  2. Drop categorical features with more than two categories
# Select the string-typed (categorical) columns
df_str = df.loc[:, df.dtypes == object]

# Keep only binary categorical features
for col in df_str.columns:
    if df_str[col].nunique() > 2:
        del df[col]
  3. One-hot encode the remaining categorical features
df_str = df.loc[:, df.dtypes == object]
for col in df_str.columns:
    # Name each dummy column <feature>_<category>
    one_hot = pd.get_dummies(df[col], prefix=col, dtype=int)
    df = df.drop(col, axis=1)
    df = df.join(one_hot)

# Set aside the duration and event columns before filtering features
survival_time = df.pop('months').values
churn = df.pop('churn').values
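  4. Drop highly correlated features
# Absolute pairwise correlations; the upper triangle checks each pair once
corr_matrix = df.corr().abs()
upper = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))
to_drop = [column for column in upper.columns if any(upper[column] > 0.98)]
df.drop(to_drop, axis=1, inplace=True)

# Keep the first 69 features, then restore the duration and event columns
df = df[list(df.columns[:69])]
df['months'] = survival_time
df['churn'] = churn
# Keep only customers who have churned, so every row has an observed event
df = df[df['churn'] == 1]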
  5. Select variables and fit the Cox model
from lifelines import CoxPHFitter

# Fit on a random sample of 1,000 customers
df_sampled = df.sample(n=1000)

# penalizer adds a penalty on the size of the coefficients
cph = CoxPHFitter(penalizer=0.01)
cph.fit(df_sampled, duration_col='months', event_col='churn')
df_stats = cph.summary

# Keep features whose hazard ratio exp(coef) is clearly different from 1
hr = df_stats['exp(coef)']
features_valuable = list(hr[hr > 1.01].index) + list(hr[hr < 0.98].index)
df = df[features_valuable + ['churn', 'months']]

One of the basic assumptions of the Cox proportional hazards (CPH) model is that the features are not multicollinear, so multicollinearity between features has to be handled. There are two ways to do this:

  • Remove the multicollinearity before fitting the Cox model, as the correlation filter above does.
  • Apply a penalty to the size of the coefficients during regression (the penalizer argument above), which improves the stability of the estimates and controls for high correlations between covariates.
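
A pairwise correlation filter can miss a feature that is a combination of several others. One common complementary check before fitting is the variance inflation factor (VIF). The sketch below is an assumption-laden illustration, not part of the original pipeline: the helper name `variance_inflation_factors` and the synthetic data are hypothetical, and it uses only NumPy.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X.

    VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing column j
    (with an intercept) on all the other columns.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        vifs.append(float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return vifs


# Synthetic example: b is almost a copy of a, c is independent
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = a + 0.01 * rng.normal(size=200)
c = rng.normal(size=200)
print(variance_inflation_factors(np.column_stack([a, b, c])))
```

Columns with a very large VIF (a and b here) are candidates for removal before the Cox fit; which threshold to use is a judgment call that the original article does not specify.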
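  6. Interpret the model results
# Table of coefficients, hazard ratios exp(coef), confidence intervals and p-values
cph.summary

# Plot each coefficient with its confidence interval
cph.plot()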

The hazard ratio (HR) is equal to exp(coef), where coef is the fitted weight (coefficient) of the feature.

If exp(coef) = 1 for a feature, the feature has no effect. If exp(coef) > 1, the feature increases the hazard, so the survival rate is lower. If exp(coef) < 1, the hazard is reduced and the survival rate is improved.
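
As a quick illustration of reading hazard ratios, the following sketch converts a coefficient into its hazard ratio and a plain-language reading. The helper names and the sample coefficients are hypothetical; in practice the values would come from the coef column of cph.summary.

```python
import math


def hazard_ratio(coef):
    """Hazard ratio for a one-unit increase in the feature."""
    return math.exp(coef)


def describe(coef, tol=1e-9):
    """Translate a Cox coefficient into its effect on churn."""
    hr = hazard_ratio(coef)
    if abs(hr - 1.0) <= tol:
        return "no effect"
    return "raises churn hazard" if hr > 1.0 else "lowers churn hazard"


# Hypothetical coefficients, for illustration only
for name, coef in {"feature_a": 0.25, "feature_b": 0.0, "feature_c": -0.40}.items():
    print(f"{name}: HR = {hazard_ratio(coef):.2f} ({describe(coef)})")
```

A hazard ratio of 1.28, for instance, means that each additional unit of the feature multiplies the instantaneous churn risk by 1.28, holding the other features fixed.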

The best way to understand the impact of each feature or decision is to plot a survival curve for an individual feature or decision by holding the other feature data constant.

To do this, call the plot_partial_effects_on_outcome() method, passing the feature of interest and the values to display.
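
The idea behind such a plot can be sketched in plain Python. Under the proportional hazards form, a customer's survival curve is the baseline survival raised to the power exp(β·z), so varying one feature while holding the rest fixed produces a family of curves. Everything below is an illustrative assumption: the baseline values, the coefficients, the "overage" feature and the helper name are made up, and this is not lifelines' implementation.

```python
import math


def survival_curve(baseline_survival, coefs, covariates):
    """S(t | z) = S0(t) ** exp(sum_k beta_k * z_k) at each baseline time point."""
    risk = math.exp(sum(coefs[name] * value for name, value in covariates.items()))
    return [s0 ** risk for s0 in baseline_survival]


# Hypothetical baseline survival at successive time points
baseline = [1.0, 0.95, 0.90, 0.85, 0.80]
# Hypothetical coefficients: negative lowers the churn hazard, positive raises it
coefs = {"models": -0.10, "overage": 0.30}
# One customer, held fixed except for the feature of interest
customer = {"models": 2, "overage": 1}

# Vary "models" only, keeping every other feature constant
curves = {
    value: survival_curve(baseline, coefs, {**customer, "models": value})
    for value in (1, 5, 9)
}
for value, curve in curves.items():
    print(value, [round(s, 3) for s in curve])
```

With a negative coefficient, larger values of the feature give higher survival at every time point, which is the pattern such a plot makes visible.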

For the feature models = 9, survival is 80% after 42 months; the other values of models give lower survival.

By plotting survival curves for features like this, we can see which actions improve the survival rate of customers.

We can even plot a survival curve for each customer and analyze the causes of low survival rates by looking at customer characteristics:

We can also compare the impact of strategies on the survival curve by applying different strategies to a customer:

Here strategy 1 (orange line) performed better than strategy 2 (green line), with higher survival rates. Therefore, each customer can be analyzed and proactive strategies designed to ensure the highest statistical survival rates.

Summary

The Cox proportional hazards model not only identifies the factors that affect churn and the direction of their effects. By analyzing individual features, it also lets us derive personalized strategies for reducing the churn rate, and even compare different strategies to find the one that best improves retention.