Explaining Machine Learning models


The following is a predicted* conversation between a Data Analyst and a Customer Retention Manager:

Data Analyst: I have this great model, which can predict customer churn with really high accuracy!! Yayyy

Customer Retention Manager: Awesome. Can you please send me the list of all customers likely to churn?

Data Analyst: Here is the spreadsheet with the list of customers whose probability of churning is greater than 60%.

Customer Retention Manager: I see that customer abc123 has a 90% probability of churning. Why?

Data Analyst: Well, I just built a very complex Machine Learning model. The prediction is based on 50 different factors, and the model owes its accuracy to a complex combination of those variables.

Generally, this is where the conversation stops. Machine Learning research has focused heavily on creating complex and accurate models. Explainability, however, has been the casualty of that greater accuracy and complexity.

According to the authors of the brilliant LIME (Local Interpretable Model-agnostic Explanations) algorithm:

“Despite widespread adoption, Machine Learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model.”

In the scenario above, giving the Customer Retention Manager the list of customers with a high probability of churn is a brilliant first step. However, to build a good customer retention strategy, the manager needs to know which factors push the churn probability up or down. This is where LIME comes to the rescue.

The following is an excerpt from the author of the lime algorithm:

“The purpose of lime is to explain the predictions of black box classifiers. What this means is that for any given prediction and any given classifier it is able to determine a small set of features in the original data that has driven the outcome of the prediction.”
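To give a feel for the idea, here is a minimal Python sketch of a LIME-style local explanation. It is not the lime package's implementation, only an illustration of the general recipe: perturb the instance, weight the perturbed samples by how close they are to the original, and fit a simple weighted linear model whose coefficients act as the explanation. The `black_box` function, its coefficients, and the three binary feature names are made up for illustration.

```python
import math
import random

FEATURES = ["fiber_optic", "streaming_movies", "two_year_contract"]  # hypothetical


def black_box(x):
    """Hypothetical churn model standing in for any opaque classifier."""
    fiber, movies, contract = x
    score = -1.0 + 1.5 * fiber + 0.8 * movies - 2.5 * contract + 0.5 * fiber * movies
    return 1.0 / (1.0 + math.exp(-score))  # churn probability


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def explain(instance, predict, n_samples=2000, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around one instance."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # Create a neighbour by randomly flipping each binary feature.
        z = [v if rng.random() < 0.5 else 1 - v for v in instance]
        dist = sum(a != b for a, b in zip(z, instance))
        # Neighbours closer to the instance get a larger weight.
        weights.append(math.exp(-(dist ** 2) / kernel_width ** 2))
        rows.append([1.0] + [float(v) for v in z])  # leading 1.0 is the intercept
        targets.append(predict(z))
    # Weighted least squares via the normal equations (X'WX) beta = X'Wy.
    k = len(instance) + 1
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
            for i in range(k)]
    coef = solve(xtwx, xtwy)
    return dict(zip(FEATURES, coef[1:]))  # drop the intercept


# Explain a customer with fiber optic, streaming movies, and no two-year contract.
explanation = explain([1, 1, 0], black_box)
for name, weight in explanation.items():
    print(f"{name}: {weight:+.3f}")
```

A positive weight means the feature pushes this customer's churn probability up in the neighbourhood of the instance, and a negative weight means it pulls the probability down.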

Using the publicly available IBM Watson Telco Customer Churn Data Set, I created a model that predicts customer churn. I then used the LIME algorithm to explain the results.

The model building is inspired by a blog post by Matt Dancho.

I used a generalised linear model because such models are quick to train and produce accurate results. I built a glmnet model using standard caret parameters. The AUC score was 85%, which is accurate enough to proceed to the next step.

I applied the model to the test data; the results for the first five cases are below. The model got Case No 1 wrong, but the remaining four are correct:
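For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The Python sketch below computes it directly from that definition. The labels and scores are invented for illustration and are not taken from the Telco data set or from the glmnet model.

```python
def auc(labels, scores):
    """AUC as the share of (churner, non-churner) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = churned, 0 = stayed.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]
print(round(auc(labels, scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

This pairwise version is quadratic in the number of customers, so it is only meant to show what the number means, not to replace a library implementation.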

Explaining Machine Learning Models

As mentioned earlier, giving the above list to the Customer Retention Manager would be a good first step. However, to turn these predictions into actionable intelligence, a data scientist should be able to explain the probabilities. Why, for instance, was the customer in the first row given a 39% predicted probability of churning, compared to 4% for the customer in the fifth row?

I ran these five cases through the LIME algorithm and explain the results here.

For each case, the LIME algorithm shows the seven most important factors that contributed to the model output on whether a particular customer will churn. A green bar means that the feature supports the model's conclusion, and a red bar means that the feature contradicts it.
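The way such a plot is assembled can be sketched in a few lines of Python: rank the features by the absolute size of their explanation weight, keep the top few, and label each one by the sign of its weight relative to the predicted outcome. The feature names and weights below are hypothetical and do not come from the model in this post.

```python
def top_features(weights, n=7):
    """Return the n features with the largest absolute weight.

    A positive weight supports the predicted label (a green bar),
    a negative weight contradicts it (a red bar).
    """
    ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]
    return [(name, w, "supports" if w > 0 else "contradicts") for name, w in ranked]


# Hypothetical explanation weights for one customer.
weights = {
    "fiber_optic": 0.30,
    "two_year_contract": -0.50,
    "streaming_movies": 0.10,
}
for name, w, role in top_features(weights, n=2):
    print(f"{name:20s} {w:+.2f} {role}")
```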

 

In Case No 2, the model predicted a 54% probability that the customer will churn. According to the LIME algorithm, having fiber optic and streaming movies supports the model's prediction that the customer will churn. If we see the same pattern for multiple customers, it could mean that customers who have fiber optic and like to stream movies are not happy with the product.

In Case No 4, the model predicts that the probability of churn is 2%. Total charges and having a two-year contract support the model's conclusion, whereas fiber optic and streaming TV contradict it. It is likely that this 2% chance of churn stems from the fact that this customer has fiber optic and streams TV.

Overall, the LIME algorithm is an excellent addition to the Machine Learning toolbox. It enables data scientists to understand why a specific prediction was made and provides a framework to explain the results in an easy-to-understand format. The LIME algorithm is a valuable way to combine the power of statistics with the gut feeling that front-line managers have about their customers.

I fully recommend that data scientists read the paper on the LIME algorithm. Special thanks to Matt Dancho, who made understanding the LIME algorithm very easy with his valuable blog posts.

This post is by Suraj Banjade, a data scientist from our Advanced Analytics team. His post first appeared on medium.com.
