Churn prediction is a powerful business asset for preventing attrition, but only if your models predict with high accuracy. That doesn’t mean the work has to be more complicated; it just needs a strategic approach. With the right planning and process, you can avoid some common mistakes and build more accuracy into predictive churn analysis.

Let’s take a quick look at pitfalls to avoid, and steps you can take to create more accurate models.

Go unique, not generic

Attrition is a common problem, but you don’t want to use an off-the-shelf attrition model. Each company serves a different mix of customers, and has its own business practices and ways of recording data. To accurately predict the behavior of your customers, you need churn models that are fine-tuned to the specifics of your business.

Choose the right data

Understandably, subscription-based software companies (and others) using churn analysis are in a rush for results. But for more accurate predictions, it’s best not to dive into making models until you’ve identified the appropriate data to use. In part, it depends on the specific question you want to answer (e.g., future churn rate, risk of attrition for a specific group, etc.). What you choose to include in the machine learning model also depends on the quality of the data.

Data curation and data preparation are critical to success. Companies often find that their initial churn models reveal idiosyncrasies in how people enter data into their systems. “Up to 80% of the effort in machine learning goes toward selecting and preparing the data,” notes Beyond the Arc data scientist Bruce Johnson.
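To make the data preparation step concrete, here is a minimal sketch of the kind of cleanup that data entry idiosyncrasies tend to require. The field names (customer_id, plan, cancel_date), the sample records, and the list of "missing" markers are all hypothetical; your own systems will have different quirks.

```python
# Hypothetical raw records; field names and values are illustrative only.
RAW_RECORDS = [
    {"customer_id": "001", "plan": " Premium ", "cancel_date": "2023-03-01"},
    {"customer_id": "002", "plan": "premium", "cancel_date": ""},
    {"customer_id": "003", "plan": "BASIC", "cancel_date": "N/A"},
    {"customer_id": "003", "plan": "basic", "cancel_date": "n/a"},  # duplicate entry
]

# Assumed spellings people use when no cancellation date exists.
MISSING_MARKERS = {"", "n/a", "na", "none", "null"}


def clean_record(record):
    """Trim and lowercase the plan name, and derive a churn flag from cancel_date."""
    cancel = record["cancel_date"].strip()
    has_cancelled = cancel.lower() not in MISSING_MARKERS
    return {
        "customer_id": record["customer_id"].strip(),
        "plan": record["plan"].strip().lower(),
        "churned": has_cancelled,
    }


def prepare(records):
    """Clean every record and keep the first occurrence of each customer_id."""
    seen = set()
    cleaned = []
    for record in records:
        row = clean_record(record)
        if row["customer_id"] in seen:
            continue
        seen.add(row["customer_id"])
        cleaned.append(row)
    return cleaned


CLEANED = prepare(RAW_RECORDS)
```

In this sketch, " Premium " and "premium" end up as the same plan, and blank or "N/A" cancellation dates are treated as "not churned" rather than as dates.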

Look at a problem from many angles

To accurately predict the likelihood of churn, you need data that helps you uncover any pain points (or other reasons) that might trigger customers to leave. It can help to first brainstorm a list of likely reasons, and then design models to see if those reasons are supported by your data or not.

One approach, using historical and demographic data, is to compare a group of your most successful customers with a group who were quickest to churn. Look for characteristics or behaviors that the members of each group have in common. Also compare how each group interacts with your product or service, such as usage trends, support requests, and how often they engage with their account manager. For B2B SaaS customers, it also helps to look at their volume of users and how activity changes over time.
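A group comparison like this can start very simply. The sketch below averages a few behavioral measures for each group and reports the gap between them. The records, the group labels, and the feature names (monthly_logins, support_tickets, active_users) are hypothetical stand-ins for whatever your own data contains.

```python
from statistics import mean

# Hypothetical customer records; all values are made up for illustration.
CUSTOMERS = [
    {"group": "retained", "monthly_logins": 42, "support_tickets": 1, "active_users": 30},
    {"group": "retained", "monthly_logins": 38, "support_tickets": 2, "active_users": 25},
    {"group": "churned", "monthly_logins": 9, "support_tickets": 6, "active_users": 4},
    {"group": "churned", "monthly_logins": 13, "support_tickets": 4, "active_users": 6},
]
FEATURES = ["monthly_logins", "support_tickets", "active_users"]


def group_means(customers, group):
    """Average each feature over the customers in one group."""
    rows = [c for c in customers if c["group"] == group]
    return {f: mean([r[f] for r in rows]) for f in FEATURES}


def compare_groups(customers):
    """Return retained-minus-churned gaps; a large gap flags a feature worth modeling."""
    retained = group_means(customers, "retained")
    churned = group_means(customers, "churned")
    return {f: retained[f] - churned[f] for f in FEATURES}


GAPS = compare_groups(CUSTOMERS)
for feature, gap in GAPS.items():
    print(f"{feature}: {gap:+.1f}")
```

With this made-up data, retained customers log in more often and file fewer support tickets. On real data, gaps like these are only candidates to test in a model, not conclusions.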

Consider decision tree algorithms

For predictive churn analysis, many data science experts favor machine learning models using decision tree or random forest algorithms. A decision tree splits the data into progressively smaller groups based on the features of the data, branching down until each group is small enough that it has only one label (a decision point). [1]
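To illustrate how a tree chooses where to branch, here is a minimal sketch of a single split on one numeric feature. It assumes Gini impurity as the splitting criterion, which is one common choice and not necessarily what any given tool uses. The login counts and churn labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels; 0.0 means the group has one label."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best 'value <= threshold' split."""
    best = (None, gini(labels))  # start from the unsplit data
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: monthly logins per customer, and whether they churned (1 = yes).
LOGINS = [2, 5, 8, 20, 25, 30]
CHURNED = [1, 1, 1, 0, 0, 0]

THRESHOLD, IMPURITY = best_split(LOGINS, CHURNED)
print(THRESHOLD, IMPURITY)
```

A full tree repeats this search on each resulting group, across all features, until the groups are pure or a stopping rule is reached.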

Two related algorithms build on decision trees. “Bagged trees” aggregate a set of decision trees to increase the accuracy of predictions: “A single decision tree has high variance, so by bagging together many weak learners into strong learners, you can average away the variance.” The random forest algorithm also uses bagged trees, but at each split of a decision tree the model considers only a small random subset of the features rather than all of them. [2]
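The two ideas, bootstrap aggregation and random feature subsets, can be sketched in a few dozen lines. This is a simplified illustration, not a production random forest: each "tree" is a one-split stump, so choosing features per tree is the same as choosing them per split, and the Gini criterion is an assumption. The data and all function names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_stump(rows, labels, feature_indexes):
    """Fit a one-split tree, considering only the given feature indexes."""
    majority = Counter(labels).most_common(1)[0][0]
    best = None  # (impurity, feature, threshold, left_label, right_label)
    for f in feature_indexes:
        for threshold in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # every sampled row had the same value: no split possible
        return lambda row: majority
    _, f, threshold, left_label, right_label = best
    return lambda row: left_label if row[f] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train stumps on bootstrap samples, each limited to a random feature subset."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        # Bagging: draw a sample of the same size, with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Random forest twist: only a random subset of features is considered.
        features = rng.sample(range(len(rows[0])), n_features)
        trees.append(fit_stump(sample_rows, sample_labels, features))
    return trees


def predict(trees, row):
    """Aggregate the trees by majority vote."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical rows: (monthly_logins, support_tickets); label 1 = churned.
ROWS = [(2, 7), (4, 6), (5, 8), (7, 5), (25, 1), (28, 2), (30, 0), (35, 1)]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

FOREST = fit_forest(ROWS, LABELS)
print(predict(FOREST, (1, 9)), predict(FOREST, (40, 0)))
```

Any single stump here depends heavily on which rows landed in its bootstrap sample. The vote across many of them is what smooths that variance out.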

At Beyond the Arc, our data scientists sometimes use the CHAID decision tree. It helps us find significant attributes, and interactions between them, that might not be obvious from just looking at the data. In the example below, the decision tree explores variables to predict which customer segments might be more likely to accept a specific financial services offer.

[Figure: CHAID decision tree]
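CHAID (Chi-squared Automatic Interaction Detection) chooses its splits with chi-square tests of association between each categorical predictor and the outcome. The sketch below shows only that core idea: it computes the Pearson chi-square statistic for two hypothetical predictors and picks the stronger one. It is a simplification, since full CHAID also merges similar categories and compares adjusted p-values. Comparing raw statistics is reasonable here only because both predictors have the same number of categories. The data is made up and is not the data behind the figure.

```python
from collections import Counter


def chi_square(categories, outcomes):
    """Pearson chi-square statistic for a category-by-outcome contingency table."""
    n = len(outcomes)
    cat_totals = Counter(categories)
    out_totals = Counter(outcomes)
    observed = Counter(zip(categories, outcomes))
    stat = 0.0
    for c, c_total in cat_totals.items():
        for o, o_total in out_totals.items():
            expected = c_total * o_total / n
            stat += (observed[(c, o)] - expected) ** 2 / expected
    return stat


# Hypothetical data: did each customer accept the offer (1 = yes)?
ACCEPTED = [1, 1, 1, 0, 0, 0, 0, 1]
PREDICTORS = {
    "age_band": ["young", "young", "young", "young", "older", "older", "older", "older"],
    "region": ["west", "east", "west", "east", "west", "east", "west", "east"],
}

SCORES = {name: chi_square(values, ACCEPTED) for name, values in PREDICTORS.items()}
BEST = max(SCORES, key=SCORES.get)
print(SCORES, BEST)
```

In this toy data, acceptance differs by age band but not by region, so age band would be the first split.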

The moral of the story is that there’s no single ideal churn model. Increasing predictive accuracy takes strategic data preparation and creative experimentation — and it’s worth the effort. Getting it right can deliver insights you can translate into big wins for your customer experience and your bottom line.

How Beyond the Arc can help with churn analysis

Our data scientists are passionate about using data to help companies make better decisions and take action to improve outcomes.

Interested in exploring how machine learning models can help you reduce attrition? Let’s talk.


[1], [2] “Why Random Forest Is My Favorite Machine Learning Model,” Towards Data Science, Oct 2018.
