Logistic regression is a widely used way to perform regression on variables with a binary outcome (Alice, 2015). It relies on the logit function, a non-linear transformation (Ahlemeyer-Stubbe & Coleman, 2014) that maps a probability p to its log-odds, log(p / (1 - p)). The regression is performed on the probability that a value takes a particular categorical outcome, rather than on the outcome value itself.
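As a minimal sketch of the transformation described above, the following Python snippet defines the logit and its inverse and applies them to a one-predictor model. The coefficients b0 and b1 and the helper names are hypothetical, chosen only for illustration; they do not come from any fitted model.

```python
import math

def logit(p: float) -> float:
    """Map a probability in (0, 1) to its log-odds."""
    return math.log(p / (1.0 - p))

def inverse_logit(x: float) -> float:
    """Map a log-odds value back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical coefficients (assumed for illustration, not estimated from data).
# The model is linear on the log-odds scale: log-odds = b0 + b1 * x
b0, b1 = -1.5, 0.8

def predict_probability(x: float) -> float:
    """Probability of the outcome for predictor value x under the assumed model."""
    return inverse_logit(b0 + b1 * x)
```

The model is linear in the log-odds, and the inverse logit converts that linear value back into a probability between 0 and 1.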
One way to use logistic regression to produce categorical values is to test for a single category value at a time. For example, if the categorical output variable is color, with values of red, black, or white, one model can predict whether the color is black or one of the other colors. This requires multiple iterations and one model per category.
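The one-category-at-a-time approach can be sketched in Python as a recoding step: each category gets its own 0/1 response vector, and each vector would feed a separate binary logistic regression. The observations and function names below are made up for illustration. Combining the separate models by picking the largest predicted probability is a common convention and an assumption here, not something the cited sources prescribe.

```python
def one_vs_rest_targets(labels):
    """Return one 0/1 response vector per category (1 = 'is this category')."""
    categories = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in categories}

# Made-up observations of the color variable.
colors = ["red", "black", "white", "black", "red"]

# One binary response per category; each would need its own fitted model.
targets = one_vs_rest_targets(colors)

def pick_category(probabilities):
    """Assumed combination rule: choose the category whose model gives the highest probability."""
    return max(probabilities, key=probabilities.get)
```

With three colors this produces three response vectors, which is why the approach needs one model per category.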
A better approach is multinomial logistic regression, an extension of the logit link function that supports categorical variables that can take on one of several values (Ledolter, 2013).
The multinom function in the nnet package performs multinomial logistic regression in R (Anonymous, 2017). Multinomial logistic regression differs from binary logistic regression in that one of the output categories must be chosen as the reference category (Marley, 2015). The odds for each remaining category are then expressed relative to the reference category.
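To make the role of the reference category concrete, here is a small Python sketch of how category probabilities follow from log-odds measured against a reference. It is not the multinom function itself, only the underlying arithmetic. The log-odds values and the choice of white as the reference color are hypothetical.

```python
import math

def multinomial_probabilities(log_odds_vs_reference, reference):
    """Convert log-odds against a reference category into probabilities.

    log_odds_vs_reference maps each non-reference category to its
    log-odds relative to the reference category.
    """
    exp_terms = {c: math.exp(v) for c, v in log_odds_vs_reference.items()}
    # The reference category contributes exp(0) = 1 to the denominator.
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {c: e / denominator for c, e in exp_terms.items()}
    probabilities[reference] = 1.0 / denominator
    return probabilities

# Hypothetical log-odds for one observation (not estimated from data).
log_odds = {"red": -0.4, "black": 0.3}
probs = multinomial_probabilities(log_odds, reference="white")
```

Dividing the probability of any category by the probability of the reference recovers the exponentiated log-odds, which is the sense in which every category is compared to the reference.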
One application of multinom is predicting a program of study (academic, general, or vocational) from socioeconomic status (a categorical value) and written test scores (a continuous value) (Anonymous, 2017). The example is somewhat difficult to follow. A key element is that the reference category is academic, so the model reports the odds of general compared to academic and of vocational compared to academic. The multinom output does not include p-values. The referenced example computes them with a two-tailed z-test using the pnorm function.
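The two-tailed z-test can be sketched in Python as follows. The z statistic is the coefficient divided by its standard error, and the p-value is twice the upper-tail area of the standard normal distribution. The normal_cdf helper stands in for what pnorm returns with its default arguments; the function names and the inputs used with them are hypothetical, not values from the referenced example.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def two_tailed_p_value(coefficient: float, standard_error: float) -> float:
    """Two-tailed p-value for a Wald z-test of one coefficient."""
    z = coefficient / standard_error
    # Double the area beyond |z| to cover both tails.
    return 2.0 * (1.0 - normal_cdf(abs(z)))
```

A coefficient about 1.96 standard errors from zero gives a p-value close to 0.05, and the sign of the coefficient does not affect the result.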
References
Ahlemeyer-Stubbe, A., & Coleman, S. (2014). A practical guide to data mining for business and industry. John Wiley & Sons.
Alice, M. (2015, September 13). How to perform a Logistic Regression in R. Retrieved from https://www.r-bloggers.com/how-to-perform-a-logistic-regression-in-r/
Anonymous. (2017). R Data Analysis Examples: Multinomial Logistic Regression. Retrieved from http://www.ats.ucla.edu/stat/r/dae/mlogit.htm
Ledolter, J. (2013). Data mining and business analytics with R. John Wiley & Sons.
Marley, S. (2015, April 8). Analysing Categorical Data Using Logistic Regression Models - Select Statistical Consultants. Retrieved from https://select-statistics.co.uk/blog/analysing-categorical-data-using-logistic-regression-models/