Thursday, 18 August 2016

Proper Analysis of Categorical Outcomes: Make Sure You Understand the Model

Introduction: This letter responds to the recent publication by Darabont et al. on the possible relationship between Acute Pulmonary Edema (APE) and Renal Artery Stenosis (RAS). In that publication, the authors used a statistical technique known as “linear discriminant analysis” to assess the relationships between selected predictor variables (including APE) and their study outcome (RAS). Although the authors are to be commended for taking on this investigation, linear discriminant analysis is an inappropriate choice for their data: it rests on technical assumptions that make it unsuitable for the way Darabont et al. used it.


Briefly, linear discriminant analysis is meant to find a linear combination of factors that correctly predicts or characterizes a certain event. This may sound like technical jargon, but simply put, it is a way to predict a categorical outcome variable using continuous predictor variables. To the statistically fluent reader, this may sound a lot like logistic regression, and it should: logistic regression also creates a model that uses a linear combination of factors to predict a categorical outcome variable. However, there are key technical differences between the two, and there is a reason that logistic regression remains highly prevalent today while linear discriminant analysis is rarely seen. In 1978, Press and Wilson compared linear discriminant analysis to logistic regression and found that logistic regression was the superior technique in the vast majority of cases.
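To make the similarity concrete, here is a minimal sketch in plain Python that fits both models to the same one-predictor data set. The data are made up purely for illustration and have nothing to do with the Darabont et al. study; the function names `fit_lda` and `fit_logistic` are hypothetical helpers, not part of any library. Both fits return an intercept and a slope for the log-odds of the outcome, which is the "linear combination" described above.

```python
import math

# Toy, made-up data: one continuous predictor and a two-level outcome.
x_neg = [1.0, 2.0, 3.0, 4.0]  # predictor values where the outcome is absent (y = 0)
x_pos = [3.0, 4.0, 5.0, 6.0]  # predictor values where the outcome is present (y = 1)


def fit_lda(x0, x1):
    """Two-group LDA with one predictor.

    Assumes the predictor is normally distributed in each group with a
    common variance. Returns (intercept, slope) of the implied log-odds.
    """
    n0, n1 = len(x0), len(x1)
    m0, m1 = sum(x0) / n0, sum(x1) / n1
    # Pooled within-group variance.
    ss = sum((v - m0) ** 2 for v in x0) + sum((v - m1) ** 2 for v in x1)
    var = ss / (n0 + n1 - 2)
    slope = (m1 - m0) / var
    intercept = math.log(n1 / n0) - (m1 ** 2 - m0 ** 2) / (2 * var)
    return intercept, slope


def fit_logistic(x0, x1, iterations=25):
    """Logistic regression with one predictor, fitted by Newton's method.

    Makes no distributional assumption about the predictor.
    Returns (intercept, slope) of the log-odds.
    """
    xs = x0 + x1
    ys = [0.0] * len(x0) + [1.0] * len(x1)
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        ps = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]
        ws = [p * (1.0 - p) for p in ps]
        # Gradient of the log-likelihood.
        g0 = sum(y - p for y, p in zip(ys, ps))
        g1 = sum((y - p) * x for y, p, x in zip(ys, ps, xs))
        # Entries of the 2x2 information matrix.
        h00 = sum(ws)
        h01 = sum(w * x for w, x in zip(ws, xs))
        h11 = sum(w * x * x for w, x in zip(ws, xs))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system directly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


lda_intercept, lda_slope = fit_lda(x_neg, x_pos)
log_intercept, log_slope = fit_logistic(x_neg, x_pos)

# The predictor value at which each model puts the odds at 50:50.
lda_boundary = -lda_intercept / lda_slope
log_boundary = -log_intercept / log_slope
print("LDA:      intercept, slope, boundary =", lda_intercept, lda_slope, lda_boundary)
print("Logistic: intercept, slope, boundary =", log_intercept, log_slope, log_boundary)
```

The two models share the same linear form and differ in how the coefficients are estimated. The discriminant fit derives them from group means and a pooled variance, so it leans on the normality and equal-variance assumptions. The logistic fit maximizes the likelihood of the outcome directly and needs neither assumption.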
