Data Science, Machine Learning
A ready-to-run tutorial in Python and scikit-learn to evaluate a classification model compared to a baseline model
The other day, I needed to understand whether my classification algorithm’s performance was decent. I had obtained a precision, recall, and accuracy of 64%, and I honestly thought it was a terrible result: 64% seemed only a little better than a random model. In reality, that judgment holds only if the problem is simple. For example, with two classes, a random algorithm has a 50% probability of predicting the correct result, so in that case an algorithm with an accuracy of 64% is indeed better than a random one.
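To make the random baseline concrete, here is a minimal simulation (my own illustration, not part of the tutorial): random guessing scores about 50% accuracy with two balanced classes, but only about 0.1% with 1,000 classes, which is why the same 64% means very different things in the two settings.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Binary case: random predictions against balanced random labels
y_true = rng.integers(0, 2, size=n)
y_pred = rng.integers(0, 2, size=n)
acc_binary = (y_true == y_pred).mean()  # hovers around 0.50

# 1000-class case: the same random strategy
y_true_multi = rng.integers(0, 1000, size=n)
y_pred_multi = rng.integers(0, 1000, size=n)
acc_multi = (y_true_multi == y_pred_multi).mean()  # hovers around 0.001

print(acc_binary, acc_multi)
```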
The problem is different if you are dealing with a multiclass algorithm in which the number of classes is greater than two. In my case, I had about 1,000 classes, so the problem was much harder than the binary case, and an accuracy of 64% might even indicate a good algorithm!
But then, how can you tell whether the performance you obtained is satisfactory? The solution is to compare the model with a dummy model, that is, a baseline that makes predictions using a simple rule such as always choosing the most frequent class. If our model performs better than the dummy model, the results are promising. Conversely, if our model performs worse than the dummy model, then it is worth reviewing our model.
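scikit-learn ships such a baseline as `sklearn.dummy.DummyClassifier`. A minimal sketch, using a synthetic dataset as a stand-in for real data (the tutorial itself uses the Pima dataset):

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data as a stand-in
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# "most_frequent" always predicts the majority class of the training set;
# other strategies include "stratified" and "uniform"
dummy = DummyClassifier(strategy="most_frequent")
dummy.fit(X_train, y_train)
baseline_acc = dummy.score(X_test, y_test)
print(baseline_acc)
```

Whatever accuracy the dummy model reaches here is the bar a real model must clear to be worth anything.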
Let’s implement a practical case to see how to proceed. We will use a classic dataset, the Pima Indians Diabetes Database, released by UCI Machine Learning under the CC0: Public Domain license. This is a binary classification problem, but you can generalize the described concepts to multiclass classification as well.
We will divide the tutorial into three parts. In the first part, we will load the dataset, divide it into training and test sets, and use a simple scaler to normalize the data.
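The first part can be sketched as follows. Since the article has not yet shown how it reads the Pima CSV, this sketch uses a random stand-in matrix with the same shape (768 rows, 8 features) and assumes a `MinMaxScaler` as the "simple scaler"; fitting the scaler on the training set only avoids leaking test-set information.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Stand-in feature matrix; the tutorial loads the Pima CSV at this step
rng = np.random.default_rng(0)
X = rng.normal(loc=100, scale=20, size=(768, 8))
y = rng.integers(0, 2, size=768)

# Hold out 20% of the rows as a test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training set only, then apply it to both splits
scaler = MinMaxScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)
print(X_train_s.min(), X_train_s.max())
```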
In the second part, we will implement our classic Machine Learning model using a K-Nearest Neighbors classifier, and alongside it a dummy classifier to serve as the baseline.
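The shape of that second part, again sketched on synthetic stand-in data (768 rows and 8 features, mirroring the Pima dataset), looks like this; `n_neighbors=5` is simply scikit-learn's default, not a value the article has committed to:

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the Pima features (8 columns, binary target)
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# The "real" model and the baseline, trained on the same split
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
dummy = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

knn_acc = knn.score(X_test, y_test)
dummy_acc = dummy.score(X_test, y_test)
print(knn_acc, dummy_acc)
```

Training both on the same split is what makes the comparison in the third part meaningful.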
Finally, in the third part, we will compare the two models to understand whether it is worth using our model or…