Random Forest Disadvantages

The advantages of Random Forest far outweigh its disadvantages. Still, we compiled a short list of its shortcomings, because knowing them makes for a better practical experience with Random Forests and more realistic expectations.

In this article we will elaborate on the disadvantages of Random Forests and share some pointers on how to address them through tuning.

1- Black Box

If you are spoiled by the interpretability that decision trees offer, you may not like that a random forest is not interpretable in the same way.

In principle you could inspect every split in every tree of a Random Forest, but with dozens or hundreds of trees that just wouldn’t be very intuitive.

However, Random Forest makes up for it with many useful characteristics such as accuracy, effective parameters for tuning, high performance and more.
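The forest is not completely opaque, though. As a minimal sketch, assuming scikit-learn and its bundled iris dataset (neither is named in this article), the code below prints the aggregate feature importances of a fitted forest and the top levels of one of its individual trees. The variable names are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

iris = load_iris()
X, y = iris.data, iris.target

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, y)

# Aggregate view: impurity-based importance of each feature across all trees.
importances = dict(zip(iris.feature_names, forest.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")

# Single-tree view: the fitted trees are stored in forest.estimators_.
first_tree = forest.estimators_[0]
tree_rules = export_text(
    first_tree,
    feature_names=list(iris.feature_names),
    max_depth=2,  # print only the top levels to keep the output short
)
print(tree_rules)
```

Reading one tree out of a hundred says little about the ensemble as a whole, which is exactly the interpretability gap described above. The importances give a coarser but forest-wide picture.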

2- Overfitting

Random Forests, like all models, can overfit, especially if you are not careful with tuning.

It is a powerful model with many powerful parameters, but if you go overboard you will end up with an overfit model that struggles to predict real-world data accurately.

Don’t worry though, there are many ways to address it. See our Random Forests Tuning page for ideas.
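One common way to spot overfitting is to compare accuracy on the training data with accuracy on held-out data. The sketch below assumes scikit-learn and a synthetic, deliberately noisy dataset. The helper name train_test_gap and the values max_depth=4 and min_samples_leaf=10 are illustrative assumptions, not recommendations taken from the tuning page.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data with label noise (flip_y) so that memorizing it is harmful.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, flip_y=0.2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)


def train_test_gap(**params):
    """Fit a forest with the given hyperparameters; return (train, test) accuracy."""
    model = RandomForestClassifier(n_estimators=100, random_state=0, **params)
    model.fit(X_train, y_train)
    return model.score(X_train, y_train), model.score(X_test, y_test)


# Fully grown trees versus trees limited in depth and leaf size.
unconstrained = train_test_gap()
constrained = train_test_gap(max_depth=4, min_samples_leaf=10)

print("unconstrained train/test accuracy:", unconstrained)
print("constrained   train/test accuracy:", constrained)
```

A training accuracy far above the test accuracy suggests the forest is memorizing noise. Limiting tree depth or requiring more samples per leaf reduces how closely the trees can fit the training set. Whether that also improves test accuracy depends on the data, so it is worth checking with cross-validation.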

3- Might Train Slowly

Since Random Forest consists of multiple trees (100 by default in scikit-learn), it tends to train more slowly than a single decision tree model.

Random Forest still handles big data and high-dimensional data fairly well, and there is a lot you can do to address performance issues if they arise.

For example, Random Forest often reaches good accuracy even with 5 to 10 trees, although this depends on the dataset, and you can take advantage of parallel processing by setting the n_jobs parameter (n_jobs=-1 uses all available CPU cores).
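To make the trade-off concrete, here is a rough sketch, assuming scikit-learn and a synthetic dataset, that times training for a small forest, a default-sized forest, and a default-sized forest trained in parallel. The helper fit_and_time is a hypothetical name used only for this illustration, and actual timings will vary by machine.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def fit_and_time(n_estimators, n_jobs):
    """Train a forest and return (model, seconds spent fitting, test accuracy)."""
    model = RandomForestClassifier(
        n_estimators=n_estimators, n_jobs=n_jobs, random_state=0
    )
    start = time.perf_counter()
    model.fit(X_train, y_train)
    elapsed = time.perf_counter() - start
    return model, elapsed, model.score(X_test, y_test)


results = {}
# (number of trees, n_jobs): n_jobs=1 is a single core, n_jobs=-1 is all cores.
for n_estimators, n_jobs in [(10, 1), (100, 1), (100, -1)]:
    model, elapsed, accuracy = fit_and_time(n_estimators, n_jobs)
    results[(n_estimators, n_jobs)] = (len(model.estimators_), elapsed, accuracy)
    print(f"trees={n_estimators:3d} n_jobs={n_jobs:2d} "
          f"fit={elapsed:.3f}s test accuracy={accuracy:.3f}")
```

On small datasets the overhead of parallelism can cancel out its benefit, so it is worth measuring on your own data before settling on a configuration.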

4- Random Forest Learning Curve

Another potential downside of having plenty of significant hyperparameters is the learning curve. Although very intuitive and enjoyable, random forest introduces several hyperparameters of its own on top of those it inherits from decision trees.

So, it can be slightly overwhelming to learn, understand and master random forests in the beginning compared to a plain vanilla machine learning model such as kNN. It is just something to keep in mind and be prepared for.

Random Forests will make a lot more sense if you start learning from Decision Trees and allocate the necessary time to comprehend fundamental tree concepts such as splitting, nodes, branches, leaf nodes and tree depth, as well as ensemble ideas such as bagging and boosting.
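To see how the forest's hyperparameters build on those of a single tree, the short sketch below, assuming scikit-learn, compares the parameter names of the two estimators. The exact lists depend on the installed version, so no output is shown here.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# get_params() returns a dict of every constructor hyperparameter.
forest_params = set(RandomForestClassifier().get_params())
tree_params = set(DecisionTreeClassifier().get_params())

# Hyperparameters the forest shares with a single decision tree.
shared = sorted(forest_params & tree_params)
# Hyperparameters that only exist at the forest (ensemble) level.
forest_only = sorted(forest_params - tree_params)

print("Shared with decision trees:", shared)
print("Specific to the forest:", forest_only)
```

Learning the shared group first on a single decision tree leaves only the forest-specific group, such as n_estimators and bootstrap, to pick up afterwards.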

Random Forest Disadvantages Summary

So we have a few Random Forest disadvantages, none of which are very serious and most of which are easy to address. Random Forest doesn’t have any significant feasibility issues, whether the data is large, noisy or has missing values. It’s fast, accurate and has very useful hyperparameters for tuning.

Limited interpretability is not a serious disadvantage, and you can also resort to Decision Trees for demonstration purposes. Overfitting can be addressed and it’s a drawback shared by all machine learning models. And slow training is not that slow; it can be improved easily by adjusting one or two of the many Random Forest hyperparameters, with little or no effect on predictive performance.

This also helps explain why Random Forest is a favorite machine learning model for many. If you are curious about who invented it, it was Leo Breiman, who also co-invented Decision Trees. You can read our post about Leo Breiman and the original Random Forest paper.

For the many selling points of the Random Forest machine learning algorithm, you can check out our post here: Advantages of Random Forest.