What is a decision tree?
A decision tree is a unique kind of probability tree, and a popular and powerful tool used for prediction and classification. A decision tree is structured like a flowchart with a tree shape: each internal node represents an attribute test, each branch represents an outcome of the test, and each leaf (terminal) node holds a class label.
A decision tree is an illustration of the decision-making process. In artificial intelligence (AI), decision trees are used to reach conclusions based on data available from past decisions. These conclusions are assigned values that are then used to forecast the action to be taken in the near future.
Supervised learning
A decision tree is an algorithmic and statistical machine-learning model that learns and interprets the responses to various problems and their consequences. The decision tree thus learns the decision-making rules of particular contexts depending on the available data. In a decision tree, the learning process is continuous, and feedback is used to improve the learning outcome. This type of learning is known as supervised learning, and decision tree models support supervised learning.
Terminologies of decision tree
The important terminologies used in decision trees are:
- Root node represents the entire sample or population; this node is further divided into two or more homogeneous sets.
- Decision node is represented by a square shape. A decision node splits a sub-node into additional sub-nodes.
- Chance node is represented by a circle shape and shows the probabilities of certain results.
- End node is represented by a triangle shape and shows the final output of the decision path.
- Splitting is the process of dividing a node into two or more sub-nodes.
- Pruning is the process of removing a sub-node from the decision tree.
- Branch tree (sub-tree) is a subsection of the entire tree.
Types of decision tree
Decision trees are broadly classified into two kinds:
Classification tree
Classification tree analysis is used when the predicted outcome is a discrete class. In other words, each data point is classified into one of a set of available classes. Example: examining a Facebook comment and classifying the text as either positive or negative.
Regression tree
Regression tree analysis is used when the predicted outcome is a real number (a continuous value). In other words, the prediction depends on one or more predictors. Example: the length of a patient's stay at the hospital.
Classification and regression tree (CART) analysis is an umbrella term that refers to both of the above procedures; it was introduced by Breiman et al. in 1984. Both trees share certain similarities, but the major difference is the procedure used to decide where to split.
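As a minimal sketch of the two tree types (not part of the original text), the following example uses scikit-learn's DecisionTreeClassifier and DecisionTreeRegressor on two of the library's bundled datasets:

```python
from sklearn.datasets import load_iris, load_diabetes
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification tree: predicts a discrete class label.
X_cls, y_cls = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="gini", max_depth=3)
clf.fit(X_cls, y_cls)
print("Predicted class:", clf.predict(X_cls[:1]))

# Regression tree: predicts a continuous real number.
X_reg, y_reg = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor(max_depth=3)
reg.fit(X_reg, y_reg)
print("Predicted value:", reg.predict(X_reg[:1]))
```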
Metrics
The algorithm for building a decision tree works in a top-down manner: at each step it determines the variable that best splits the set of items. Different algorithms use different metrics to measure the best split; some of these metrics are:
Gini impurity
Gini impurity measures how often a randomly chosen item from the set would be labeled incorrectly if it were labeled randomly according to the distribution of labels in the subset. The CART algorithm uses Gini impurity for a classification tree. The Gini impurity of a data set $S$ is defined as

$$\mathrm{Gini}(S) = 1 - \sum_{i=1}^{J} p_i^2,$$

where $p_i$ is the fraction of items in $S$ labeled with class $i$ and $J$ is the number of classes.
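A minimal sketch of this definition in plain Python (the function name gini_impurity is an illustrative choice, not from the original text):

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_i^2) over the class proportions p_i."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

# A pure set has impurity 0; a 50/50 split of two classes gives 0.5.
print(gini_impurity(["pos", "pos", "pos"]))         # 0.0
print(gini_impurity(["pos", "neg", "pos", "neg"]))  # 0.5
```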
Entropy
Entropy is the measure of randomness in the data being processed: the higher the entropy, the harder it is to draw a conclusion from the data. Mathematically, the entropy for a single attribute is

$$H(S) = -\sum_{i=1}^{J} p_i \log_2 p_i,$$

where $p_i$ is the proportion of items in $S$ belonging to class $i$.
Information gain
Tree algorithms such as ID3, C4.5, and C5.0 use information gain (IG). It is based on the concepts of information content and entropy from information theory, and it is used to decide which feature to split on at each step of tree construction. The mathematical representation of IG is

$$IG(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \, H(S_v),$$

where $S_v$ is the subset of $S$ for which attribute $A$ has value $v$.
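A minimal sketch of entropy and information gain in plain Python, assuming labels are passed as plain lists (the helper names are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) = -sum(p_i * log2(p_i))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent_labels, child_label_sets):
    """IG = H(parent) - weighted average of the children's entropies."""
    n = len(parent_labels)
    weighted = sum(len(s) / n * entropy(s) for s in child_label_sets)
    return entropy(parent_labels) - weighted

# Splitting a mixed set into two pure subsets yields the maximum gain (1 bit).
parent = ["yes", "yes", "no", "no"]
print(information_gain(parent, [["yes", "yes"], ["no", "no"]]))  # 1.0
```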
Variance reduction
Variance reduction is used when the decision tree performs regression, where the output is continuous. The algorithm splits the population using the variance formula

$$\mathrm{Var} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2,$$

where $x_i$ is an actual value, $\bar{x}$ is the mean of the values, and $n$ is the number of values.
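A minimal sketch of variance reduction for a candidate regression split, using the population variance defined above (the helper names are illustrative):

```python
def variance(values):
    """Population variance: mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def variance_reduction(parent, children):
    """Drop in weighted variance achieved by a candidate split."""
    n = len(parent)
    weighted = sum(len(c) / n * variance(c) for c in children)
    return variance(parent) - weighted

# A split that separates low values from high values reduces variance most.
print(variance_reduction([1, 2, 8, 9], [[1, 2], [8, 9]]))  # 12.25
```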
Decision tree algorithm
The decision tree algorithm is a supervised learning algorithm that solves both regression problems and classification problems. The aim is to build a training model that predicts the value or class of the target variable using decision rules learned from the training data (prior data). To predict a class label for a record, the process starts at the root of the tree: the root attribute is compared with the record's attribute value, the branch corresponding to that value is followed, and the process jumps to the next node, repeating until a leaf is reached (see the prediction sketch after the list below). The following algorithms are used to create decision trees:
- Iterative Dichotomiser 3 (ID3).
- C4.5 (successor of ID3).
- CART.
- Multivariate Adaptive Regression Splines (MARS).
- Chi-square Automatic Interaction Detector (CHAID).
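As a minimal sketch of the prediction process described above, assuming the tree is stored as nested dictionaries (the hand-built tree below is a hypothetical example, not from the original text):

```python
# Hypothetical tree: internal nodes test an attribute, branches carry
# the test outcomes, and leaves hold class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"attribute": "humidity",
                  "branches": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rainy": {"attribute": "wind",
                  "branches": {"strong": "no", "weak": "yes"}},
    },
}

def predict(node, record):
    """Walk from the root: compare the record's attribute value with the
    node's test, follow the matching branch, repeat until a leaf."""
    while isinstance(node, dict):
        node = node["branches"][record[node["attribute"]]]
    return node  # leaf: the class label

print(predict(tree, {"outlook": "sunny", "humidity": "normal"}))  # yes
```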
Example
The following example illustrates the options for mobile phone production. Each option leads to chance nodes for high and low profit margins and ends in terminal nodes with their results. On that basis, Technology A is chosen while Technology B is rejected.
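A minimal sketch of the expected-value rule behind this choice; the probabilities and profits below are hypothetical stand-ins, since the original figure's numbers are not given in the text:

```python
# Hypothetical (probability, profit) pairs for each chance node.
options = {
    "Technology A": [(0.6, 1_000_000), (0.4, 200_000)],
    "Technology B": [(0.3, 1_200_000), (0.7, 100_000)],
}

for name, outcomes in options.items():
    ev = sum(p * profit for p, profit in outcomes)
    print(f"{name}: expected profit = {ev:,.0f}")

# Pick the option with the highest expected profit.
best = max(options, key=lambda k: sum(p * v for p, v in options[k]))
print("Choose:", best)  # Technology A under these assumed numbers
```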
Context and Applications
This topic is important for postgraduate and undergraduate courses, particularly for:
- Bachelors in computer science engineering.
- Associate of science in computer science.
Practice Problems
Question 1: ____ is used to predict and classify data.
a) Flowchart
b) Decision tree
c) B+ tree
d) Regression tree
Answer: Option b is correct.
Explanation: A decision tree is a unique kind of probability tree, which is both a popular and powerful tool used for prediction and classification. The internal node refers to the attribute test, the branch refers to the test outcome, and the leaf node contains the class label.
Question 2: How many types of decision trees are there?
a) 5
b) 3
c) 2
d) 4
Answer: Option c is correct.
Explanation: A decision tree is a unique kind of probability tree, which is categorized into two kinds, namely regression trees and classification trees. Both trees share certain similarities, but the major difference is the procedure used to decide where to split.
Question 3: The subsection of a whole tree is called ___.
a) Branch tree
b) Internal node
c) Training data
d) Regression tree
Answer: Option a is correct.
Explanation: A subtree is a subsection of a whole tree; branch tree is another name for a subtree. This is one of the important terminologies used in decision trees; the others are root node, decision node, splitting, end node, chance node, and pruning.
Question 4: Select the metric used in the construction of the decision tree.
a) AdaBoost
b) Gini impurity
c) Linear regression
d) None of the above
Answer: Option b is correct.
Explanation: Gini impurity is a metric of the decision tree. Decision trees support several metrics, such as information gain, chi-square, variance reduction, gain ratio, entropy, and Gini impurity. Different algorithms use different metrics to measure the best split.
Question 5: The end node is represented in _____ shape.
a) Triangle
b) Circle
c) Rectangle
d) None of the above
Answer: Option a is correct.
Explanation: The decision tree uses different shapes to represent different nodes. The end node is represented in a triangle shape and shows the final output of the decision path.