User Churn Prediction. Modern data warehousing and Machine… | by 💡Mike Shakhomirov

Modern data warehousing and Machine Learning

User retention is a crucial performance metric for many companies and online apps. In this story, we will discuss how to use built-in data warehouse machine learning capabilities to run propensity models on user behaviour data and estimate the likelihood of user churn. I would like to focus on dataset preparation and model training using standard SQL, which modern data warehouses allow. Retention helps us understand the mechanics of user behaviour. It gives a high-level view of how successful our application is by answering one simple question: is our app good enough at retaining users? It is also widely accepted that retaining an existing user is cheaper than acquiring a new one.

In one of my previous articles, I wrote about modern data warehousing [1].

Modern DWHs have many useful features and components that differentiate them from other data platform types [2].

ML model support seems to be a foundational DWH component when dealing with big data.

In this story, I will use binary logistic regression, one of the fastest models to train, and demonstrate how we can use it to predict a user's propensity to churn. Indeed, we don't need to know every machine learning model.
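To make the idea concrete, here is a minimal sketch of binary logistic regression in plain Python. It is not the warehouse SQL workflow this story focuses on, only an illustration of what such a model computes: a weighted sum of features passed through a sigmoid to give a churn probability. The feature names, the toy data, and the helper functions (`train_logistic_regression`, `churn_probability`) are all hypothetical and chosen for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a probability in (0, 1), avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(rows, labels, lr=0.05, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def churn_probability(weights, bias, x):
    """Return the predicted probability that a user with features x churns."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical toy data: [sessions in first week, days since last visit].
# Label 1 means the user churned, 0 means the user was retained.
rows = [[9, 1], [7, 2], [8, 1], [6, 3], [1, 9], [2, 8], [1, 7], [3, 9]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic_regression(rows, labels)
active_user = churn_probability(weights, bias, [9, 1])
inactive_user = churn_probability(weights, bias, [1, 9])
print(f"active user churn propensity:   {active_user:.3f}")
print(f"inactive user churn propensity: {inactive_user:.3f}")
```

On this toy data, the active user should receive a propensity below 0.5 and the inactive user one above 0.5. A data warehouse with built-in ML runs the same kind of fit for us, so in practice we only prepare the training dataset and declare the model type.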

We can’t compete with cloud service providers such as Amazon and Google in machine learning and data science, but we need to know how to use their tools.

I previously wrote about it in my article here [3]:

In this tutorial, we will learn how to transform raw event data to create a training dataset for our ML…