We have a new edition, a complete rewrite and rearrangement, still published by Cambridge (official details here). We have updated many things and, we hope, explained some topics more clearly. The big change is that we now use a linear modelling framework for most of the book: many “conventional” statistical approaches are introduced as particular flavours of linear models. For more details, see the New Edition tab.
Statistical analysis is at the core of most modern biology. Many biological hypotheses, even deceptively simple ones, correspond to complex statistical models. Before the development of modern desktop computers, determining whether the data fit these complex models was the province of professional statisticians, and many biologists instead opted for simpler models whose structure had been simplified quite arbitrarily. Now, powerful statistical software is available to nearly everyone. This makes complex models easy to fit, but it also creates a new set of demands and problems for biologists.
We need to:
- Know the pitfalls and assumptions of particular statistical models.
- Identify the model appropriate for the sampling design and kind of data that we plan to collect.
- Interpret the output of analyses using these models.
- Design experiments and sampling programs optimally, i.e. with the best possible use of our limited time and resources.
Data analysis may be done by professional statisticians, rather than statistically trained biologists, especially in large research groups or multidisciplinary teams. In these situations, we need to be able to speak a common language with the statistician. We must:
- Frame our questions in such a way as to get a sensible answer.
- Be aware of biological considerations that can cause statistical problems. We can’t expect a statistician to be familiar with the biological idiosyncrasies of our particular study. Without that information, we may get misleading or incorrect advice.
- Understand the advice or analyses we get, and be able to translate them back into biology.
Our book aims to place biologists in a better position to do these things. It arose from our involvement in designing our own studies and analysing our own data, providing advice to students and colleagues, and teaching classes in design and analysis. As part of these activities, we became aware of our limitations, which prompted us to read more widely in the primary statistical literature. More importantly, we realised the complexity of the statistical models underlying much biological research. We continually encountered experimental designs that were not described comprehensively in many of our favourite texts. This book describes many of the common designs used in biological research, presents the statistical models underlying those designs, and provides enough information to highlight their benefits and pitfalls.
Our emphasis here is on dealing with biological data: how to design sampling programs that represent the best use of our resources, how to avoid mistakes that make analysing our data difficult, and how to analyse the data we collect. We emphasise the problems encountered in real-world biological situations.
