DecisionTreeClassifier is a built-in model in Scikit-learn that makes it simple to implement Decision Tree Classification models in Python.
DecisionTreeClassifier trains quickly: datasets with up to a million rows and 100 features can typically be handled in a few minutes, depending on hardware and tree settings. Furthermore, DecisionTreeClassifier makes very fast predictions, because each prediction only follows a single path from the root to a leaf, so its cost grows with the depth of the tree rather than with the size of the training set.
In this tutorial we will demonstrate the basics of a DecisionTreeClassifier implementation with Scikit-learn in Python.
You can import DecisionTreeClassifier from sklearn.tree module as below and use it to create a Decision Tree model object.
from sklearn.tree import DecisionTreeClassifier
DT = DecisionTreeClassifier()
Once the model is created, the next step is to fit it. In this phase the model is trained on the training data.
DT.fit(X_train, y_train)
After training, the model is ready to make predictions.
yhat = DT.predict(X_test)
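The snippets above assume that X_train, y_train and X_test already exist. As a minimal end-to-end sketch, the example below builds them from Scikit-learn's bundled Iris dataset, which is only a stand-in assumption here for your own data, and then repeats the same create, fit and predict steps. The split ratio and random_state values are arbitrary illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: Iris is used purely as example data; substitute your own X and y.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; stratify keeps class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Create the model, train it, and predict labels for the held-out rows.
DT = DecisionTreeClassifier(random_state=0)
DT.fit(X_train, y_train)
yhat = DT.predict(X_test)

# Compare predictions with the true labels of the test set.
accuracy = accuracy_score(y_test, yhat)
print(f"Test accuracy: {accuracy:.3f}")
```

Fixing random_state makes the split and the tree reproducible between runs, which helps when comparing different settings.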
DecisionTreeClassifier has plenty of hyperparameters that can be tuned, for example to reduce overfitting or to speed up training and prediction.
max_depth is the parameter that defines the maximum tree depth allowed. This is probably the most commonly adjusted decision tree parameter.
Lower values of max_depth speed up training and prediction and help avoid overfitting, but if you go too low the tree cannot capture enough of the structure in the data, and accuracy and information gain will likely suffer.
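The max_depth trade-off described above can be observed by training the same model with a few different depth limits and comparing training and test accuracy. The sketch below again assumes the Iris dataset as example data, and the depth values 1, 3 and None (no limit) are arbitrary illustrative choices rather than recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: Iris is used purely as example data.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

results = {}
for depth in (1, 3, None):  # None lets the tree grow without a depth limit
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    results[depth] = (
        model.get_depth(),               # depth the fitted tree actually reached
        model.score(X_train, y_train),   # accuracy on the training data
        model.score(X_test, y_test),     # accuracy on the held-out data
    )

for depth, (actual_depth, train_acc, test_acc) in results.items():
    print(f"max_depth={depth}: depth={actual_depth}, "
          f"train={train_acc:.3f}, test={test_acc:.3f}")
```

A large gap between training and test accuracy suggests overfitting, while low accuracy on both suggests the depth limit is too restrictive.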
To learn more about max_depth and its usage, as well as other decision tree hyperparameters that can be tuned, check out:
Thank you for visiting this Decision Tree Tutorial. We have seen the basics of decision tree classification and Scikit-learn’s built-in DecisionTreeClassifier model.