Data Science 101 – Part 1: How to use data science for predictions
Everyone’s talking about data science for predictions: how it helps a business transform data into actionable insights to improve marketing ROI, increase customer satisfaction, drive operational efficiencies, and more. Many companies know they need to get in the game, but that can be challenging when the terrain is new.
At Beyond the Arc, our data science team has decades of experience creating innovative approaches to using data to solve business problems. In this blog (the first in a series), we’ll walk through the basics of how to approach a data science project.
First, generate ideas about business problems
Using data science for predictions doesn’t start with data. Our approach is to start by interviewing the business experts and stakeholders about how they operate, the problems they need to solve, and what they think is causing them. Our data scientists also generate hypotheses by combining our expertise in machine learning and customer experience.
Dozens of ideas can come from an exercise like this.
Test ideas with machine learning predictions
Next, we translate those ideas into questions that data can answer (e.g., which customers are most likely to accept offer A, or most likely to churn?). Then we use machine learning algorithms to find patterns in the data that help predict the answers.
To validate a modeling approach, it helps to start with the attributes from the brainstorm that the business already uses in reporting or dashboards. Because those attributes are familiar, the patterns the algorithms find can be checked against past reporting.
If predictive accuracy is good at this point (for example, clearly better than a simple baseline on data the model has not seen), it’s safe to pilot a model. In other words, put this first model into use and start to generate insights for taking action. From there, the business can gather feedback about how accurate or useful the predictions are, and consider ways to improve the model.
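One way to judge whether a first model is good enough to pilot is to compare it with a trivial baseline on data held back from training. The sketch below is only an illustration under stated assumptions: the data is made up, the single attribute (months of inactivity) and the simple cutoff rule are hypothetical stand-ins for a real model, and the every-fourth-row holdout is just one convenient way to split.

```python
# Toy, made-up data: (months_inactive, churned). Not from any real customer set.
rows = [(i % 10, 1 if i % 10 >= 6 else 0) for i in range(40)]

# Deterministic holdout: every 4th row is set aside for testing.
train = [r for i, r in enumerate(rows) if i % 4 != 0]
test = [r for i, r in enumerate(rows) if i % 4 == 0]


def accuracy(predict, data):
    """Share of rows where predict(feature) matches the recorded outcome."""
    return sum(predict(x) == y for x, y in data) / len(data)


def fit_majority_baseline(data):
    """Baseline that always predicts the most common outcome in the data."""
    positives = sum(y for _, y in data)
    majority = 1 if positives * 2 > len(data) else 0
    return lambda x: majority


def fit_threshold_rule(data):
    """Pick the cutoff t that best fits the rule 'predict 1 if x >= t'."""
    candidates = sorted({x for x, _ in data})
    best_t = max(candidates, key=lambda t: accuracy(lambda x: int(x >= t), data))
    return best_t, (lambda x: int(x >= best_t))


baseline = fit_majority_baseline(train)
cutoff, model = fit_threshold_rule(train)

# Both scores are measured on the held-out rows, never on the training rows.
baseline_acc = accuracy(baseline, test)
model_acc = accuracy(model, test)
print(f"cutoff={cutoff} baseline={baseline_acc:.2f} model={model_acc:.2f}")
```

If the model does not beat the baseline on the held-out rows, that is a signal to revisit the attributes before piloting. Real data will be noisier than this toy set, so the gap will rarely be this clean.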
Refine the predictive model
While the first model is in its pilot phase, development of a revised model can begin by adding new features based on the interviews and on inputs from data scientists. Here are a few best practices:
Look for new attributes that make actionable patterns easier to spot. You rarely have access to all of the potential data inputs at the beginning of a machine learning project.
Focus on ideas that people think are important. Start with inputs that are easy to derive and that are based on data that is readily available at the time you will generate predictions.
Try to avoid adding attributes that are highly correlated with each other, because the overlap makes it harder to tell which attribute is actually driving a prediction.
Make sure any attributes you add make the model better. Better can mean more accurate, more stable, or easier to interpret.
If there are so many predictors that you can’t keep track of what they mean, then you have gone too far.
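The advice about highly correlated attributes can be checked before a new attribute goes into the model. The following is a minimal sketch, assuming numeric attributes and using the Pearson correlation coefficient; the attribute names, the sample values, and the 0.9 cutoff are all hypothetical choices for illustration, not a recommended standard.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def highly_correlated_pairs(columns, threshold=0.9):
    """Return attribute pairs whose absolute correlation meets the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Made-up candidate attributes for six customers (hypothetical names and values).
candidates = {
    "monthly_spend": [20, 35, 50, 65, 80, 95],
    "annual_spend": [240, 420, 600, 780, 960, 1140],  # exactly 12 x monthly
    "support_calls": [3, 0, 4, 1, 5, 2],
}

flagged = highly_correlated_pairs(candidates)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} overlap heavily (r={r}); consider keeping one")
```

Here only the two spend attributes are flagged, since one is a rescaled copy of the other. Which of a flagged pair to keep is a judgment call; the one that is easier to explain to the business is often a reasonable choice.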
Evolve the model over time
Machine learning models are never “one and done.” After you complete the first few model iterations, you’ll have a better understanding of the model’s strengths and weaknesses. Making further progress to improve a predictive model may require additional data elements, changes to the data preparation, or experimentation with new algorithms.
And as changes take place in the underlying business, it is important to ensure those changes are reflected in the data the model learns from.
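A lightweight way to notice when the business has changed underneath a model is to compare how an attribute was distributed when the model was built with how it looks in recent data. The sketch below is a simple share comparison, not a formal statistical test; the channel attribute, the sample values, and the 10-point tolerance are assumptions made up for illustration.

```python
from collections import Counter


def category_shares(values):
    """Fraction of rows that fall into each category."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def shifted_categories(training_values, recent_values, tolerance=0.10):
    """Return categories whose share moved by more than the tolerance."""
    before = category_shares(training_values)
    after = category_shares(recent_values)
    shifts = {}
    for category in sorted(set(before) | set(after)):
        change = after.get(category, 0.0) - before.get(category, 0.0)
        if abs(change) > tolerance:
            shifts[category] = round(change, 2)
    return shifts


# Made-up example: a new mobile app launched after the model was trained.
training_channel = ["branch"] * 60 + ["web"] * 40
recent_channel = ["branch"] * 30 + ["web"] * 40 + ["mobile_app"] * 30

shifts = shifted_categories(training_channel, recent_channel)
for category, change in shifts.items():
    print(f"{category}: share changed by {change:+.2f}")
```

A category that appears only in recent data, like the hypothetical mobile app channel here, is a strong hint that the model should be retrained on data that includes it.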
You can also progress to more complex “black box” algorithms, such as deep learning. They can deliver highly accurate predictions, but they are so complex that the pathway to a prediction cannot be interpreted directly. If the team is comfortable getting predictions without being able to see exactly how the answers were determined, a black box algorithm is worth a try. Even then, you can still test the model’s accuracy and reliability.
Using data science for predictions can help companies grow revenue and increase operating efficiencies. Think about it as a new way to solve problems and a toolkit for getting there. Putting the tools to good use requires a mix of talent, domain knowledge, and technology. These are a few of the topics we’ll address in future articles in this series.
Our team includes experts in data science who are passionate about making decisions and taking action based on data. They specialize in using statistics and machine learning to deliver actionable business insights.