Decision Tree Classification

DecisionTreeClassifier is a built-in Scikit-learn model that makes it simple to implement decision tree classification in Python.

DecisionTreeClassifier trains quickly: datasets with up to a million rows and 100 features can typically be fit in a few minutes, depending on hardware and tree settings. Predictions are also very fast, because classifying a sample only requires walking from the root of the tree to a leaf. The cost therefore grows with the depth of the tree (roughly logarithmic in the number of training samples for a balanced tree) rather than with the size of the training set.

In this tutorial we will demonstrate the basics of using DecisionTreeClassifier with Scikit-learn in Python.

How to Construct?

DecisionTreeClassifier

You can import DecisionTreeClassifier from the sklearn.tree module as shown below and use it to create a decision tree model object.

Creating DecisionTreeClassifier Model:

from sklearn.tree import DecisionTreeClassifier
DT = DecisionTreeClassifier()

Once the model object is created, the next steps are to fit it on training data and then use it for prediction.

Training DecisionTreeClassifier Model:

In this phase the model is trained on the training data, where X_train holds the feature values and y_train holds the corresponding class labels.

DT.fit(X_train, y_train)
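
The snippet above assumes X_train and y_train already exist. As a minimal sketch of where they might come from, the example below builds them from the Iris dataset bundled with Scikit-learn using train_test_split. The dataset, the 25% test size, and the random_state values are illustrative assumptions, not part of the original tutorial.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Iris is only a stand-in dataset (an assumption for this sketch);
# any feature matrix X and label vector y would work the same way.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Create and train the model on the training portion only.
DT = DecisionTreeClassifier(random_state=0)
DT.fit(X_train, y_train)

# Inspect the size of the fitted tree.
print("depth:", DT.get_depth(), "leaves:", DT.get_n_leaves())
```

Keeping a separate test set matters because a fully grown tree can memorize its training data, so training accuracy alone says little about how the model generalizes.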

Predicting with DecisionTreeClassifier Model:

After training, the model is ready to make predictions.

yhat = DT.predict(X_test)
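
To judge how good the predictions are, they can be compared with the true labels of the test set. The sketch below does this with accuracy_score from sklearn.metrics. The Iris dataset and the split settings are assumptions made for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (an assumption for this sketch).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

DT = DecisionTreeClassifier(random_state=0)
DT.fit(X_train, y_train)

# Predict class labels for rows the model has not seen during training.
yhat = DT.predict(X_test)

# Fraction of test rows whose predicted label matches the true label.
acc = accuracy_score(y_test, yhat)
print(f"Test accuracy: {acc:.3f}")
```

DT.score(X_test, y_test) returns the same accuracy value in a single call.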

DecisionTreeClassifier has plenty of hyperparameters that can be tuned. Tuning the model can help to:

increase accuracy
increase performance
decrease overfitting
avoid bias
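
One common way to tune these hyperparameters is a cross-validated grid search. The sketch below uses GridSearchCV from sklearn.model_selection. The dataset, the two parameters chosen (max_depth and min_samples_leaf), and the candidate values are illustrative assumptions rather than recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (an assumption for this sketch).
X, y = load_iris(return_X_y=True)

# Candidate values to try; these are arbitrary examples, not tuned defaults.
param_grid = {
    "max_depth": [2, 3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}

# Evaluate every combination with 5-fold cross-validation.
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

After fitting, search.best_estimator_ holds a model refit on all of the data with the best combination found.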

Which is the most commonly tuned decision tree parameter?

max_depth

max_depth is the parameter that defines the maximum depth the tree is allowed to reach. It is probably the most commonly adjusted decision tree parameter.

Lower values speed up training and prediction and help avoid overfitting, but if max_depth is set too low the tree cannot capture enough structure in the data and accuracy will likely suffer (underfitting).
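
The trade-off can be observed by training the same model with different max_depth values and comparing training and test accuracy. This is a minimal sketch. The Iris dataset and the depth values tried are assumptions for illustration, and the exact numbers will differ on other data.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (an assumption for this sketch).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

results = {}
# None means no depth limit: the tree grows until its leaves are pure.
for depth in (1, 3, None):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    results[depth] = (
        model.get_depth(),              # depth actually reached
        model.score(X_train, y_train),  # training accuracy
        model.score(X_test, y_test),    # test accuracy
    )

for depth, (actual, train_acc, test_acc) in results.items():
    print(f"max_depth={depth}: depth={actual}, "
          f"train={train_acc:.3f}, test={test_acc:.3f}")
```

A large gap between training and test accuracy suggests overfitting, while low accuracy on both suggests the tree is too shallow.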

To learn more about max_depth and its usage, as well as other decision tree hyperparameters that can be tuned, check out:

Tuning Decision Trees

Decision trees can also be used for regression. If you are considering implementing decision trees to predict continuous values, you can visit:

Decision Tree Regression

DecisionTreeClassifier Summary

Thank you for visiting this Decision Tree Tutorial. We have seen the basics of decision tree classification with Scikit-learn’s built-in DecisionTreeClassifier model: creating the model, training it, making predictions, and tuning max_depth.