Saturday, May 06, 2017

Decision trees

Decision trees provide a way of assigning a probability to a choice (Ledolter, 2013). They are a visual representation of a decision-making process, and they are useful as an aid to decision making when a cost or reward is attached to a decision with an estimated probability of success. Figure 1 shows a contrived example of deciding whether to take an umbrella. The example becomes more interesting once the umbrella both costs something and can save money: for instance, if you had to rent the umbrella but having it spared you a dry-cleaning expense when it rained.
The expected value of a decision is calculated by multiplying the value of each outcome by the probability of that outcome and summing the results (Kirkwood, 2002). If a decision is decomposed into subsequent decisions, the expected values of those lower-level decisions are summed up to the higher level.
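To make the calculation concrete, here is a minimal sketch in Python (my own illustration, not from the cited sources; the $5 rental fee, $15 dry-cleaning cost, and 30% chance of rain are all made-up numbers):

# Expected value of each choice in the umbrella example.
# All figures are hypothetical; costs are written as negative values.
P_RAIN = 0.30
RENT_COST = -5        # hypothetical umbrella rental fee
DRY_CLEANING = -15    # hypothetical cost of being caught in the rain

# Take the umbrella: pay the rental fee whether or not it rains.
ev_take = P_RAIN * RENT_COST + (1 - P_RAIN) * RENT_COST   # -5.00

# Skip the umbrella: pay dry cleaning only if it rains.
ev_skip = P_RAIN * DRY_CLEANING + (1 - P_RAIN) * 0        # -4.50

print(f"take: {ev_take:.2f}, skip: {ev_skip:.2f}")

At these made-up numbers the tree slightly favors skipping the umbrella; raising the chance of rain above one third flips the decision, which is exactly the kind of sensitivity a decision tree makes visible.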
Decision trees require that the results of a decision point be mutually exclusive (Ahlemeyer-Stubbe & Coleman, 2014). In the tree analogy, once a limb branches, the resulting limbs must never reconnect or rejoin. The goal is to partition the data at each decision.
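To illustrate that partitioning, here is a minimal sketch using scikit-learn in Python (my own choice of library, not one the cited sources prescribe): a fitted tree assigns every observation to exactly one leaf, so the leaves form mutually exclusive groups.

# Minimal sketch: a fitted decision tree partitions the data so that
# each row lands in exactly one leaf (mutually exclusive branches).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# apply() returns the index of the single leaf each sample falls in.
leaves = tree.apply(X)
unique_leaves, counts = np.unique(leaves, return_counts=True)
for leaf, count in zip(unique_leaves, counts):
    print(f"leaf {leaf}: {count} samples")

# The leaf counts sum to the full data set: the leaves partition it.
assert counts.sum() == len(X)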

Comparison of Approaches

The following is a breakdown of the strengths and weaknesses of linear and logistic regression, decision trees, and neural networks (Ahlemeyer-Stubbe & Coleman, 2014, pp. 141-142). I include neural networks as they are both a research interest of mine and a common data mining technique. A short code sketch after the lists fits all three to the same data.

Linear and logistic regression

Advantages

·        Possible to do parameter estimation and hypothesis testing
·        Model is directly interpretable
·        Fast execution
·        Good for screening purposes
·        Standard software suffices

Disadvantages

·        Large amount of manual work
·        Assumptions must be met about distribution and linearity
·        Sensitive to outliers
·        Mediocre performance

Decision Trees

Advantages

·        Results simple to interpret and implement
·        No assumptions about distribution and linearity
·        Not sensitive to outliers
·        Fast execution
·        Good for screening purposes
·        Good performance

Disadvantages

·        Often too few (6–8) final nodes
·        Results can be of limited value because there are too few groups
·        No hypothesis testing and parameter estimation
·        Needs specialized software

Neural Networks

Advantages

·        All types of data can be analyzed
·        No assumptions about distribution and linearity
·        Good performance
·        Derives generally applicable predictive equations
·        Little manual work

Disadvantages

·        Difficult to interpret
·        Sensitive to outliers in continuous data
·        No hypothesis testing and parameter estimation
·        Slower execution
·        Needs specialized software
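
To see these trade-offs side by side, here is a minimal sketch (my own; scikit-learn in Python, synthetic data, and illustrative hyperparameters that would normally be tuned) fitting all three model families to the same data:

# Minimal sketch comparing the three model families on one synthetic
# data set. Settings are illustrative, not tuned.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "neural network": MLPClassifier(max_iter=2000, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy {model.score(X_test, y_test):.3f}")

The logistic regression yields interpretable coefficients, the tree yields readable rules, and the neural network yields only predictions, which mirrors the interpretability trade-off in the lists above.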


Figure 1. Decision tree for the umbrella example.

References

Ahlemeyer-Stubbe, A., & Coleman, S. (2014). A practical guide to data mining for business and industry. John Wiley & Sons.

Kirkwood, C. W. (2002). Decision tree primer. Available online at http://www.public.asu.edu/~kirkwood/DAStuff/decisiontrees/index.html

Ledolter, J. (2013). Data mining and business analytics with R. John Wiley & Sons.
