For those readers who have met me, it will come as no surprise that I was a bit of a geek when I was doing my undergraduate studies. And that was long before geek was in any way sexy. Sheldon (from ‘The Big Bang Theory’, and not @ben_sheldon_EGI) probably hadn’t been born. However, one day one of the cool gang of undergraduates did talk to me. She wondered whether she could use my results from a practical she had been ‘unable’ to attend. I wanted to help, but I was also concerned she’d copy my data and I’d end up being the one hauled over the coals for plagiarism. So I came up with a cunning plan. I wrote some code on the VAX (look it up online if you’re under 45) that took my data and generated a pseudo-random dataset with many of the same statistical properties as the dataset I had collected. It took me most of the night. Nicole seemed happy, but not sufficiently so to come for a drink with me.
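The original VAX code is long gone, but the trick is easy to reconstruct. Here is a minimal sketch in Python of the idea (the ‘real’ data below are invented stand-ins for the practical’s measurements): estimate the means and covariances of the real data, then draw a fresh dataset from a distribution with those moments.

```python
import numpy as np

rng = np.random.default_rng(1984)

# Pretend these are the measurements from the practical:
# three variables observed on 30 samples (entirely made up here).
real = rng.normal(loc=[5.0, 12.0, 0.8], scale=[1.0, 3.0, 0.2], size=(30, 3))

# Estimate the first two moments of the real data...
mu = real.mean(axis=0)
sigma = np.cov(real, rowvar=False)

# ...and draw a fresh dataset from a multivariate normal with those
# moments. The fake data share the means, variances and correlations
# of the original, but no individual value is copied.
fake = rng.multivariate_normal(mu, sigma, size=30)

print(np.corrcoef(real, rowvar=False).round(2))
print(np.corrcoef(fake, rowvar=False).round(2))
```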
This exercise was a great way to start to learn about statistics, and it made me appreciate that statistics is primarily about decomposing variance. I say ‘start to learn’ because each time I learn a new statistical method, someone, with probability 1, invents a more powerful one that I am supposed to use. I can illustrate this with papers I have been involved with that analyze survival rates in Soay sheep. When I started my first post-doc in 1994, logistic regression was all the rage. When I applied it, I found that age, sex, density, body weight and weather explained considerable amounts of variation. But, of course, this analysis was biased because we didn’t correct for imperfect recapture rates. So we did that. Then we had to go all Bayesian, as that had become the trend. Then we explored model space as well as parameter space within each model using Reversible Jump Markov Chain Monte Carlo. By this point I am using ‘we’ in the loosest possible sense: I was collaborating with, among others, Byron Morgan, Ted Catchpole, Steve Brooks and Ruth King, who are some of the most capable statisticians on the planet. At this point the methods started to get a little more vague. I have a recollection of discussing a frequentist version of Reversible Jump Markov Chain Monte Carlo called Trans-Dimensional Simulated Annealing. I remember thinking that it sounded cool, and that I could make myself sound more intelligent than I am by casually dropping the method into a blog at some point in the future. Anyway, by the end of all this statistical wizardry we had come to the conclusion that survival was influenced by age, sex, density, body weight and the weather. This is not to knock all this work at all: reaching the same conclusions was by no means guaranteed, and we now had less biased estimates with more appropriate levels of uncertainty. I suspect that one of the reasons we reached the same conclusion across statistical methods of ever-increasing complexity is that the data are pretty complete (the recapture rate is very close to unity), the signatures in the data are strong, the data are measured with limited measurement error, and we always used the same linearized association between survival and our explanatory variables.
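For readers who have never run one, that 1994-vintage analysis boils down to a few lines today. A minimal sketch in Python using statsmodels, with simulated data standing in for the Soay records (every number, and the ‘winter_nao’ weather index, is invented for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1994)
n = 500

# Simulated stand-in for the Soay records: none of these numbers
# are real, they just have the right flavour.
df = pd.DataFrame({
    "age": rng.integers(0, 12, n),
    "sex": rng.choice(["F", "M"], n),
    "density": rng.normal(400, 80, n),
    "weight": rng.normal(20, 4, n),
    "winter_nao": rng.normal(0, 1, n),  # an invented weather index
})

# Build survival with known effects so the regression has something to find.
logit_p = (0.5 - 0.15 * df["age"] - 0.4 * (df["sex"] == "M")
           - 0.004 * (df["density"] - 400) + 0.1 * (df["weight"] - 20)
           - 0.3 * df["winter_nao"])
df["survived"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# The 1994-vintage analysis: logistic regression, with no correction
# for imperfect recapture.
model = smf.logit("survived ~ age + sex + density + weight + winter_nao",
                  data=df).fit()
print(model.summary())
```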
Collaborating with brilliant statisticians proved useful for a multitude of reasons. I was forced to learn a fair bit about each of the methods we used. Usually this was not sufficient for me to apply the methods efficiently myself, but it was sufficient to understand the theory behind them. It also reinforced my earlier insight from helping Nicole: all statistical methods are based on decomposing variance in data, but they differ in what variance is being decomposed (your y variable or the process being modeled), the choice of underlying model being fit (additive, non-additive, linear, non-linear), the components into which the variance is decomposed (measurement variance, sampling variance, variance due to fixed or random effects), the way the parameters are estimated, and the way goodness of model fit is assessed.
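To make ‘decomposing variance’ concrete, the simplest version of the idea is the one-way sum-of-squares split that underlies ANOVA, and that every fancier method elaborates on. A toy sketch in Python (the data and the three group means are invented):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data: measurements grouped by, say, study site.
groups = [rng.normal(loc=mu, scale=1.0, size=25) for mu in (4.0, 5.0, 7.0)]
y = np.concatenate(groups)
grand_mean = y.mean()

# Decompose the total sum of squares into a between-group part
# (variance explained by site) and a within-group remainder.
ss_total = ((y - grand_mean) ** 2).sum()
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

print(f"total   {ss_total:8.2f}")
print(f"between {ss_between:8.2f}")
print(f"within  {ss_within:8.2f}")   # between + within == total
```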
Personally, I don’t ask authors to fit more complicated statistical models when I do not know how to assess how well the model fits the data. Many reviewers feel similarly, and often suggest that a more statistically minded reviewer look over the paper. However, not all reviewers are so restrained. I have (at least) two bugbears that repeatedly irritate me, and I see them quite a lot as an editor of JAE. One is reviewers who, I know, struggle to conduct a t-test demanding that authors fit something like a non-linear mixed-effects model with a rather exotic error structure, when it is clear from the figures in the paper that no matter how the nicely designed experiment is analyzed, the extremely strong patterns in the data cannot be forced away. As an editor I quite frequently tell authors that they do not need to run the more complex analysis, primarily because assessing the goodness of fit of very complex models is usually a challenge, and one that statisticians can’t always agree on.
My second bugbear is prophets of information criteria. I don’t have any problem with things like the AIC per se, but each criterion is just one measure of goodness of fit, and no single number can perfectly summarize something as complex as how well a model fits the data. My personal preference would be to include plots of model fit in appendices wherever possible. Or, if plots are hard to produce, why not try cross-validation? Being able to predict part of the data from the remainder is often a great test. There are cases when cross-validation may not be an option, for example when data are sparse.
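For anyone who hasn’t tried it, cross-validation takes only a few lines. A minimal sketch in Python using scikit-learn (the data and the three candidate models are invented; the point is that the held-out score punishes both under- and over-fitting, which a single in-sample number cannot):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Toy data: a gently curved relationship plus noise.
x = rng.uniform(0, 10, 80).reshape(-1, 1)
y = 2.0 + 0.5 * x.ravel() - 0.03 * x.ravel() ** 2 + rng.normal(0, 0.5, 80)

# Score each candidate model by how well it predicts held-out folds.
for degree in (1, 2, 8):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores = cross_val_score(model, x, y, cv=5, scoring="r2")
    print(f"degree {degree}: mean held-out R^2 = {scores.mean():.3f}")
```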
The other day I read a review of a paper whose authors had conducted cross-validation and included plots of goodness of model fit. To my mind, the paper was an exemplar of how statistics should be done. But one reviewer demanded AIC statistics and quoted Burnham and Anderson. I have my doubts that the reviewer had even read the book.
My take on statistics is that, where possible, simple statistical analyses should be used. This is particularly the case when data come from a well-planned experiment. In some cases more complex analyses are required, particularly when measurement or sampling error is likely to be considerable. Complex statistical methods certainly have an important place in biology, but they can never make up for more accurate data and larger sample sizes. Most of us are also not capable of applying technically demanding methods like Reversible Jump Markov Chain Monte Carlo (RJMCMC) without a huge investment of time, even if we understand that they offer a fantastic opportunity to explore both model and parameter space. Perhaps we can find code to fit these models, but without understanding what the code does, there is a risk we will apply the approach incorrectly. One thing I learned about RJMCMC is that the order in which the algorithm explores models can impact results and conclusions. I am not sure it would be within my capabilities to use the algorithm correctly!
In spite of the breath of life breathed into statistics by the computationally intensive methods developed over the past decade or so, statistics may still not be particularly sexy. A few years ago some of us were chatting with a colleague on St Kilda. I will change the colleague’s name to spare his blushes; I shall call him Jon. He is from a Northern European country, and he was bemoaning his lack of success in finding an English girlfriend. We asked about his approach, and he told us his opening gambit was “hi, I’m Jon, and I teach statistics”. My colleagues and I decided to try to help him out, and came up with a few alterations. He left St Kilda with a handful of chat-up lines, including
“Hi, I’m Jon, I teach statistics and chi-square I’ve met you before” and
“Hi, I’m Jon, I teach statistics and you’ve just run out of degrees of freedom”.
He is now happily married, so perhaps statistics is sexy, though I find it hard to believe our chat-up lines made any difference. But it made me wonder what happened to Nicole. I have a strong suspicion she is now a signed-up member of the Church of the AIC, and I wouldn’t be surprised if she’s telling people to fit generalized non-linear mixed-effects models with a multinomial error structure. Fortunately she always preferred cell biology to ecology and evolution, so hopefully I won’t have to respond to one of her reviews!
Tim Coulson
Senior Editor, Journal of Animal Ecology
@tncoulson