Visualizing a Decision Tree Using Graphviz and PyDotPlus
Graphviz is open-source graph visualization software. Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks.
export_graphviz:
This scikit-learn function generates a Graphviz representation of the decision tree in DOT format and writes it to the output file given by ‘out_file’.
We’ll be using the iris dataset to visualize the decision tree formation.
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_graphviz
# Load the iris dataset and fit a decision tree on all of it
iris = datasets.load_iris()
clf = DecisionTreeClassifier()
clf.fit(iris.data, iris.target)
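Before drawing the tree, it can help to check how large it is, since a deep tree produces a wide image. The following is a minimal sketch, assuming a scikit-learn version that provides `get_depth()` and `get_n_leaves()`; the `random_state=0` argument is my own addition to make the fit repeatable and is not part of the original code.

```python
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
print(iris.data.shape)      # (samples, features)
print(iris.target_names)    # names of the classes being predicted

# random_state is an assumption added here for repeatable results
clf = DecisionTreeClassifier(random_state=0)
clf.fit(iris.data, iris.target)

# Size of the fitted tree: this is what the diagram will have to show
depth = clf.get_depth()
n_leaves = clf.get_n_leaves()
n_nodes = clf.tree_.node_count
print(f"depth={depth}, leaves={n_leaves}, nodes={n_nodes}")
```

If the tree turns out too large to read, `export_graphviz` also accepts a `max_depth` argument that limits how many levels are drawn.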
Creating the DOT data.
dot_data = export_graphviz(clf, out_file=None, feature_names=iris.feature_names, filled=True)
out_file
The name (or file handle) of the output file. If None, nothing is written to disk and the result is returned as a string.
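The two behaviours of out_file can be compared side by side. This is a rough sketch rather than a recommended workflow: the file name `tree.dot`, the temporary directory, and `random_state=0` are all assumptions made for the example.

```python
import os
import tempfile

from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = datasets.load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# out_file=None: the DOT source comes back as a string
dot_string = export_graphviz(clf, out_file=None)
print(dot_string[:40])

# out_file=<path>: the DOT source is written to disk and None is returned
dot_path = os.path.join(tempfile.mkdtemp(), "tree.dot")  # hypothetical file name
result = export_graphviz(clf, out_file=dot_path)

with open(dot_path) as fh:
    dot_from_file = fh.read()
print(result, len(dot_from_file))
```

Returning a string is convenient inside a notebook, because the DOT source can be passed straight to another library without touching the file system.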
feature_names
Names of each of the features in the dataset. If None, generic index-based names such as x[0], x[1], x[2], … are used.
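To see what feature_names changes, the root node of the same tree can be printed with and without it. This is an illustrative sketch: `first_split` is a hypothetical helper written for this example, and it assumes each node occupies one line of the DOT source beginning with its node id.

```python
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = datasets.load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

default_dot = export_graphviz(clf, out_file=None)
named_dot = export_graphviz(clf, out_file=None, feature_names=iris.feature_names)


def first_split(dot):
    """Return the DOT line describing the root node (node 0), or '' if absent."""
    for line in dot.splitlines():
        if line.startswith("0 [label="):
            return line
    return ""


print(first_split(default_dot))  # split shown with a generic, index-based name
print(first_split(named_dot))    # same split shown with the iris column name
```

The tree itself is identical in both cases; only the labels differ.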
filled
A boolean value. If True, each node is colored to indicate the majority class for classification, the extremity of values for regression, or the purity of the node for multi-output.
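The effect of filled is visible in the DOT source itself, before anything is rendered. The sketch below assumes the coloring appears as a `fillcolor` attribute in the exported text, which is how current scikit-learn releases write it; `random_state=0` is again an added assumption.

```python
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = datasets.load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

plain_dot = export_graphviz(clf, out_file=None, filled=False)
filled_dot = export_graphviz(clf, out_file=None, filled=True)

# Without filled, no fill colors are written to the DOT source
print("fillcolor" in plain_dot)

# With filled=True, nodes carry a fillcolor attribute
n_colored = filled_dot.count("fillcolor")
print(n_colored, clf.tree_.node_count)
```

For a classifier, a deeper shade generally means a purer node, so the leaves stand out once the image is rendered.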
PyDotPlus
PyDotPlus provides a Python interface to Graphviz’s DOT language.
import pydotplus
from IPython.display import Image
# Parse the DOT string, then render it to PNG (this needs the Graphviz binaries installed)
graph = pydotplus.graph_from_dot_data(dot_data)
Image(graph.create_png())
The Jupyter notebook for this post is here.
In the next post, we will take on our first project: building a decision tree from scratch in Python.