Unlocking Insights: Random Forests for PCA and Feature Importance | by Christopher Karg | Mar, 2024



How a tried and tested solution can yield excellent results in tackling a day-to-day ML problem

source: https://www.pexels.com/photo/a-tractor-on-a-crop-18410308/

With so much attention on generative AI and vast neural networks, it is easy to overlook the tried and tested Machine Learning algorithms of yore (they’re actually not that old…). I would go so far as to argue that for most business cases, a straightforward Machine Learning solution will go further than a complex AI implementation. Not only do classic ML algorithms scale extremely well, their far lower model complexity is (in my opinion) what makes them superior in most scenarios. Not to mention, I have also had a far easier time tracking the performance of such ML solutions.

In this article, we will tackle a classic ML problem using a classic ML solution. More specifically, I will show how one can, in only a few lines of code, identify feature importance within a dataset using a Random Forest classifier. I’ll start by demonstrating the effectiveness of this technique. I’ll then take a ‘back-to-basics’ approach to show how this method works under the hood, building a Decision Tree and a Random Forest from scratch whilst benchmarking the models along the way.
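As a taste of the "few lines of code" version, here is a minimal sketch of extracting feature importances with scikit-learn's `RandomForestClassifier`. The synthetic dataset and hyperparameters are my own illustrative choices, not the article's actual data:

```python
# Illustrative sketch: feature importance via a Random Forest.
# The dataset here is synthetic (make_classification), chosen so that
# only some features are genuinely informative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# 10 features, of which 4 carry real signal
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=4, random_state=42
)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X, y)

# Impurity-based importances: one score per feature, summing to 1
for i, score in enumerate(clf.feature_importances_):
    print(f"feature_{i}: {score:.3f}")
```

The informative features should receive noticeably higher scores than the noise features. Note that these impurity-based importances can overstate high-cardinality features; scikit-learn's `permutation_importance` is a common cross-check.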
