Many papers refer to the use of GLMs in their analyses – but are you sure you know to which statistical approach they refer? Professor Daniel Blumstein and Associate Professor Noa Pinter-Wollman (University of California, Los Angeles) are here to clear up any confusion, and suggest a path going forwards…
Statistics has evolved rapidly and the proliferation of acronyms sometimes creates novel problems, particularly for those who use statistics as a tool rather than as their subject discipline. Two similar analyses are now abbreviated GLM: the general linear model and the generalized linear model. Using the same acronym for two different statistical approaches creates confusion and miscommunication when reading the methods sections of scientific papers.
Traditionally, the general linear model was viewed as a broad term for linear regression, analysis of variance, or analysis of covariance, all of which minimize the residual sum of squares to explain variation in a continuous dependent variable as a function of categorical and/or continuous independent variables. Commercially produced statistical packages, like SPSS, give users a choice of fitting a GLM or of using separate procedures to fit a regression or an ANOVA, depending on the nature of the independent variables. In R, the lm() function fits all of these models and produces the same results.
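As a minimal sketch of this point, the R code below fits a regression, an ANOVA and an ANCOVA with the same lm() call. The data are simulated and every variable name (group, x, y and the fit_* objects) is an assumption made for illustration, not something taken from a particular study.

```r
# Simulated data; all names and effect sizes are illustrative assumptions.
set.seed(1)
n <- 60
group <- factor(rep(c("a", "b", "c"), each = n / 3))   # categorical predictor
x <- runif(n, 0, 10)                                   # continuous predictor
y <- 2 + 0.5 * x + c(0, 1, 2)[as.integer(group)] + rnorm(n)

# One function, three "classical" analyses:
fit_regression <- lm(y ~ x)          # linear regression
fit_anova      <- lm(y ~ group)      # analysis of variance
fit_ancova     <- lm(y ~ group + x)  # analysis of covariance

# aov() fits the same least-squares model as lm(), so the residual
# sum of squares of the one-way ANOVA should match.
fit_aov <- aov(y ~ group)
rss_lm  <- sum(residuals(fit_anova)^2)
rss_aov <- sum(residuals(fit_aov)^2)
print(all.equal(rss_lm, rss_aov))

# ANOVA table for the ANCOVA fit
print(anova(fit_ancova))
```

Only the right-hand side of the formula changes between the three fits; the estimation method, least squares, stays the same.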
More recently, generalized linear models, also abbreviated GLM, have extended general linear models to response variables with non-normal error distributions. A link function relates the mean of the response to the linear combination of predictors, and the models are fitted by maximum likelihood rather than by minimizing the sum of squares. Their use has become common due to the development of computationally intensive maximum likelihood techniques to fit statistical models and the availability of powerful personal computers. One of the functions in R that implements such models is glm().
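The relationship between the two kinds of model can be sketched in R as follows. The example assumes simulated data, and the variable names (y_cont, y_count and the fit_* objects) are hypothetical. It first shows that glm() with a Gaussian family and identity link gives the same coefficients as lm(), and then fits a Poisson model with a log link to count data, which lm() is not designed for.

```r
# Simulated data; all names and parameter values are illustrative assumptions.
set.seed(2)
n <- 100
x <- runif(n, 0, 2)
y_cont  <- 1 + 2 * x + rnorm(n)                    # continuous, normal errors
y_count <- rpois(n, lambda = exp(0.5 + 0.8 * x))   # counts, Poisson errors

# A general linear model is the special case of a generalized linear
# model with normal errors and an identity link.
fit_lm       <- lm(y_cont ~ x)
fit_gaussian <- glm(y_cont ~ x, family = gaussian(link = "identity"))
same_coefs   <- all.equal(coef(fit_lm), coef(fit_gaussian))
print(same_coefs)

# A generalized linear model proper: Poisson errors with a log link,
# fitted by maximum likelihood.
fit_poisson <- glm(y_count ~ x, family = poisson(link = "log"))
print(coef(fit_poisson))   # coefficients are on the log scale
```

The family argument is where the error distribution and link function are declared, so reporting it in a methods section removes any doubt about which GLM was fitted.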
We suggest that, to avoid confusion, the original general linear model be referred to exclusively as LM, while the newer generalized linear model be referred to by the acronym GLM. By adopting this as convention we will reduce the opportunity for statistical consumers trained in the Anthropocene to argue with their elders, trained in the Holocene, about which GLM is being used in a particular instance.
One thought on “Which GLM?”
Very true! While I feel authors do specify whether they used lm() or glm() early in the methods, the first time they mention the abbreviations, this aspect may nonetheless get neglected. As suggested, abbreviating linear and generalized linear models as LM and GLM is totally in sync with the R functions lm() and glm(), respectively.