In his thought-provoking blog post, Tim asks a fundamental question that every ecologist has to think about occasionally: how many terms should I include in my model? Tim argues that models with a high heuristic value include only a few parameters, as in Verhulst’s logistic model of population dynamics and the Lotka-Volterra predator-prey model. Tim also advises that ecologists in quest of universal laws should limit the number of parameters in their models to as few as necessary to get the job done. However, I shall argue that the devil is in the detail!
I agree that there is a trade-off between the general value of a model and its complexity. This corresponds to the well-known trade-off between external and internal validity. The mere existence of this trade-off means that the complexity of the model an ecologist should construct will depend on the context. When the aim is to get some idea of how a complex system behaves, we need simple models that allow us to identify how model predictions respond to a change in an individual parameter. In addition, in such cases, we can learn more from any deviation between observations and model predictions than by simply examining parameter values estimated from data. However, using a general model to address a question from a specific case study is likely to give the wrong result. From my viewpoint, it is trivial to state that model complexity should be determined by the question being asked. I follow this simple principle by building models of ungulate populations with very different levels of complexity. For instance, I sometimes fit detailed age-specific models to show that senescence occurs in both survival and reproduction, but I have had no concerns about neglecting senescence for questions where I was confident that senescence was not altering model predictions. However, to date I have always included a minimum of two age-classes, because empirical studies of ungulates have consistently reported that the survival rates of individuals between birth and one year of age differ markedly in their values, amount of variation, drivers, and demographic impact from those of individuals older than one year of age.
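To make the "minimum of two age-classes" concrete, here is a minimal sketch of what such a model can look like. The matrix structure, the function names, and the parameter values are illustrative assumptions, not estimates from any ungulate population or the models actually used in the work described above.

```python
import math


def growth_rate(s_juv, s_ad, fec):
    """Asymptotic growth rate (lambda) of a two-age-class model.

    Assumed projection matrix (hypothetical, for illustration only):
        [[0,     fec ],
         [s_juv, s_ad]]
    where s_juv is survival from birth to one year of age, s_ad is annual
    survival of older individuals, and fec is the number of juveniles
    produced per adult per year. Lambda is the positive root of
    lambda**2 - s_ad * lambda - fec * s_juv = 0.
    """
    return (s_ad + math.sqrt(s_ad ** 2 + 4.0 * fec * s_juv)) / 2.0


def project(n_juv, n_ad, s_juv, s_ad, fec, years):
    """Project juvenile and adult numbers forward by the given number of years."""
    for _ in range(years):
        n_juv, n_ad = fec * n_ad, s_juv * n_juv + s_ad * n_ad
    return n_juv, n_ad


if __name__ == "__main__":
    # Illustrative values only: low juvenile survival, high adult survival.
    lam = growth_rate(s_juv=0.5, s_ad=0.9, fec=0.4)
    print(f"lambda = {lam:.4f}")
    # Changing juvenile survival alone shifts the growth rate.
    print(f"lambda with s_juv = 0.3: {growth_rate(0.3, 0.9, 0.4):.4f}")
```

Keeping juvenile and adult survival as separate parameters is what lets such a model represent their different values, variation, and drivers, which a single pooled survival rate cannot do.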
I consequently closely follow Einstein’s statement quoted by Tim that ‘Everything should be made as simple as possible, but no simpler’, but my view of where simple becomes too simple differs slightly from Tim’s. The main difference between our viewpoints is, I guess, whether we should consider the threshold between “simple” and “too simple” as fixed or as increasing in complexity with time. I interpret (maybe wrongly) the quest for “universal laws” that Tim mentions as assuming a fixed threshold, whereas I definitely see the threshold as being dynamic, such that the truth of today is not that of tomorrow. Having now reached the ripe old age of 50, I have been witness to a continuing “statistical revolution” in ecological research, nicely summarized by Olivier Gimenez and colleagues in a recent issue of Biology Letters. When I began my research career in the mid-eighties, running a one-way ANOVA was considered to be a complex statistical analysis, and getting time-dependent capture-mark-recapture estimates of survival from a 15-year-long monitoring of individuals of some species took several hours. Nowadays, most field ecologists are used to running a large number of generalized linear mixed models to assess the effects of five or more independent variables, or they fit complex resource selection functions to assess habitat selection by animals from large GPS datasets. It thus seems obvious to me that our knowledge is continuously progressing along with the tools we use, and this means that what was acceptable, or even great, some years ago is not necessarily relevant today.
Let me provide some examples to support my dynamic view of how to tease apart “simple” and “too simple”. We have very few laws in biology (and maybe no universal laws, but I leave that philosophical viewpoint for later!), but one of the best known is the “law of mortality” proposed by Gompertz almost two centuries ago. The exponential increase of mortality rate with age inherent in the “law of mortality” has long been assumed to hold across all living organisms. However, by revealing a great diversity of patterns of senescence across the tree of life, including the occurrence of negligible senescence in some lineages, Owen Jones and colleagues demonstrated last year in Nature that the “law of mortality” is not universal. Therefore, one should not model actuarial senescence of, for example, hydra using the Gompertz model. The logistic model is another very good example. As rightly stated by Tim, this model has contributed a lot to improving our knowledge, but it has no value nowadays in the empirical world. For instance, not a single manager would pretend today that maximum sustainable yield would occur at K/2 for most ungulate populations. Note that Dale McCullough made this point as early as 1979! As a consequence, I (like many others) have changed the type of models I use over the years. Most empirical analyses published today are multifactorial, and people are increasingly concerned with confounding factors when interpreting results. My view is that we should use our current knowledge about the process and the population under study to decide what should be the simplest acceptable model, not some hypothetical universal law.
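For readers who want to see where the K/2 figure comes from, the sketch below computes the density at which surplus production peaks. Under the logistic model the peak sits exactly at K/2. The theta-logistic generalization used for comparison is my own choice of illustration, not a model taken from Tim's post or from McCullough's work, and the parameter values are hypothetical.

```python
def surplus_production(n, r, k, theta=1.0):
    """Surplus production r * n * (1 - (n / k) ** theta).

    theta = 1 gives the Verhulst logistic model. Other values of theta give
    the theta-logistic, used here only as an illustrative assumption about
    how density dependence might deviate from the logistic form.
    """
    return r * n * (1.0 - (n / k) ** theta)


def msy_density(k, theta=1.0):
    """Density that maximizes surplus production.

    Setting the derivative r * (1 - (1 + theta) * (n / k) ** theta) to zero
    gives n = k * (1 + theta) ** (-1 / theta).
    """
    return k * (1.0 + theta) ** (-1.0 / theta)


def argmax_on_grid(r, k, theta=1.0, steps=10000):
    """Numerical check: the grid density between 0 and k with the highest production."""
    grid = (k * i / steps for i in range(steps + 1))
    return max(grid, key=lambda n: surplus_production(n, r, k, theta))


if __name__ == "__main__":
    k = 100.0  # hypothetical carrying capacity
    print(msy_density(k, theta=1.0))  # logistic: peak at K/2
    print(msy_density(k, theta=3.0))  # density dependence acting mostly near K: peak above K/2
```

When theta exceeds 1, so that density dependence acts mostly close to K, the peak moves above K/2. This shows how strongly the K/2 result depends on the logistic form itself.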
A last point that struck me in Tim’s blog post was the distinction he made between “field biologists” and “theoreticians”. I don’t think such a Manichean distinction is real. For example, Darwin was not a modeller and made careful observations without formulating a mathematical theory. Fisher did that later on. While, as a Fisherian evolutionary biologist, I rank Fisher very highly, I will not pretend he was more influential than Darwin. More importantly, I see a potential negative side effect of maintaining a distinction between theoreticians and field biologists. As ecologists, we all face the huge problem of conserving biodiversity in an increasingly changing world. To have a chance of being successful, we need to join efforts by combining a large panel of skills, whether they come from theory, modelling, observation, or experience. Given this, I believe that being in the middle, between modellers and field biologists, as Tim is, corresponds to a privileged rather than an uncomfortable position. Having had the chance to collaborate a lot with Tim, I can tell you that he is far too modest in his writing. Believe me, he is the living proof that it is possible for an ecologist to be a skilled theoretician and a clever empiricist at the same time!
Senior Editor, Journal of Animal Ecology