Analysis of Variance (ANOVA): Experimental Design for Fixed and Random Effects

Authors:

David M. Francis, The Ohio State University

Heather L. Merk, The Ohio State University

Matthew Robbins, The Ohio State University

This page is a continuation of the overview of Analysis of Variance (ANOVA) and is intended to help plant breeders consider fixed and random effects. The concepts of fixed and random effects are discussed in the context of experimental design and analysis. Reference ANOVA tables are provided.

Introduction

This page continues the Overview of Analysis of Variance page and is intended to help plant breeders consider fixed and random effects and how they affect ANOVA in the context of plant breeding. Briefly, ANOVA is a statistical procedure that partitions the total variation among known causes and leaves a residual portion, called the experimental error, for uncontrolled or unexplained variation. Because variability is measured as sums of squared deviations from the overall mean of all observations, the variation assigned to the different controlled causes is additive. It is therefore important to define the statistical model completely. Otherwise, sources of variation left out of the model are absorbed into the experimental error, which may be unnecessarily inflated (McIntosh, 1983).

Fixed and Random Effects

In the Overview of Analysis of Variance page, we considered the following linear model:

Y = m + f(treatment) + error

where

  • Y is equal to the trait value
  • m is the population mean
  • f(treatment) is a function of the treatment
  • error represents the residual
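The additivity implied by this model can be checked numerically. The following is a minimal sketch in Python, using made-up yield values for three hypothetical cultivars (the data and the function name are illustrative assumptions, not taken from the source). It splits the total sum of squares into a treatment part and an error part.

```python
from statistics import mean

# Hypothetical yields (arbitrary units): three treatments, four plots each.
yields = {
    "cultivar_A": [5.1, 4.9, 5.4, 5.0],
    "cultivar_B": [6.2, 6.0, 5.8, 6.4],
    "cultivar_C": [4.4, 4.7, 4.5, 4.8],
}


def partition_sums_of_squares(groups):
    """Split the total sum of squares into treatment and error parts."""
    all_values = [y for values in groups.values() for y in values]
    grand_mean = mean(all_values)  # sample estimate of m

    # Total SS: squared deviations of every observation from the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)

    # Treatment SS: squared deviation of each treatment mean from the grand
    # mean, counted once per observation in that treatment.
    ss_treatment = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )

    # Error SS: squared deviations of observations from their own treatment mean.
    ss_error = sum(
        (y - mean(values)) ** 2
        for values in groups.values()
        for y in values
    )
    return ss_total, ss_treatment, ss_error


ss_total, ss_treatment, ss_error = partition_sums_of_squares(yields)
print(f"total={ss_total:.3f} treatment={ss_treatment:.3f} error={ss_error:.3f}")
```

Up to floating-point rounding, the treatment and error sums of squares add up to the total, which is the additivity that ANOVA relies on.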

Fixed Effects

Intuitively, we may think of treatments as being under our control, or “fixed.” Usually we are interested in comparing the dependent variable among the levels of a fixed effect. For example, we may want to evaluate whether yield (the dependent variable) differs between field locations for some elite cultivars we’ve been developing. To conduct this experiment, we would select the cultivars we want to evaluate and find suitable locations for our trial. We could think of both the cultivars and the locations as fixed, because we purposely chose which cultivars and locations to study. In this case, we are only interested in the performance of the elite cultivars we’re testing in the specific locations we’re testing.

Random Effects

Random effects, in contrast to fixed effects, are typically used to account for variance in the dependent variable. Unlike fixed effects, we aren’t looking to compare one level of a random effect to another. Instead, the levels are assumed to be chosen randomly from an infinite population, and we want to make inferences that extend beyond the sample. In our example, we could instead consider location as a random effect; the cultivars would still be fixed effects. If we felt our locations were representative of all possible locations, we could use them to evaluate how well the cultivars perform across locations as a whole, not just at the locations we’ve tested. The classification of effects as fixed or random determines the appropriate F-test.

ANOVA tables

McIntosh (1983) provides a set of reference tables for use during experimental design and analysis. These tables are intended for field experiments conducted over two or more locations or years. Some of the tables are reproduced below.

Table 1. Expected mean squares for randomized complete block experiments combined over locations.

| Sources of variation | df | Mean squares | Expected mean squares, RL-RT¹ | Expected mean squares, RL-FT¹ | Expected mean squares, FL-FT¹ |
|---|---|---|---|---|---|
| Locations (l) | l-1 | M1 | σ²e + rσ²TL + tσ²R(L) + rtσ²L | σ²e + tσ²R(L) + rtσ²L | σ²e + tσ²R(L) + rtσ²L |
| Blocks(Location) (r) | l(r-1) | M2 | σ²e + tσ²R(L) | σ²e + tσ²R(L) | σ²e + tσ²R(L) |
| Treatment (t) | t-1 | M3 | σ²e + rσ²TL + rlσ²T | σ²e + rσ²TL + rlσ²T | σ²e + rlσ²T |
| Location x treatment | (l-1)(t-1) | M4 | σ²e + rσ²TL | σ²e + rσ²TL | σ²e + rσ²TL |
| Pooled error | l(r-1)(t-1) | M5 | σ²e | σ²e | σ²e |

¹ R = random, F = fixed, L = location, T = treatment; for example, RL-FT denotes random locations with fixed treatments.
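The entries of Table 1 can be put to work in two ways: the degrees of freedom can be checked against the total number of plots, and, under the RL-RT model, the expected mean squares can be solved for the individual variance components (a method-of-moments approach). The sketch below is illustrative only. It assumes l, r, and t are the numbers of locations, blocks per location, and treatments; the function names, the dictionary keys, and the mean square values are hypothetical.

```python
def degrees_of_freedom(l, r, t):
    """Degrees of freedom for each source of variation in Table 1."""
    return {
        "locations": l - 1,
        "blocks_in_location": l * (r - 1),
        "treatment": t - 1,
        "location_x_treatment": (l - 1) * (t - 1),
        "pooled_error": l * (r - 1) * (t - 1),
    }


def variance_components_rl_rt(ms, l, r, t):
    """Solve the RL-RT expected mean squares of Table 1 for each variance.

    ms maps "M1".."M5" to observed mean squares. Estimates can come out
    negative in practice; this sketch returns them unchanged.
    """
    return {
        "sigma2_e": ms["M5"],
        "sigma2_TL": (ms["M4"] - ms["M5"]) / r,
        "sigma2_R(L)": (ms["M2"] - ms["M5"]) / t,
        "sigma2_T": (ms["M3"] - ms["M4"]) / (r * l),
        # E(M1) - E(M2) - E(M4) + E(M5) leaves only r*t*sigma2_L.
        "sigma2_L": (ms["M1"] - ms["M2"] - ms["M4"] + ms["M5"]) / (r * t),
    }


# Hypothetical trial: 3 locations, 4 blocks per location, 5 treatments.
l, r, t = 3, 4, 5
df = degrees_of_freedom(l, r, t)
print(df, "total:", sum(df.values()))  # total should equal l*r*t - 1

# Made-up mean squares, for illustration only.
mean_squares = {"M1": 120.0, "M2": 12.0, "M3": 80.0, "M4": 20.0, "M5": 8.0}
print(variance_components_rl_rt(mean_squares, l, r, t))
```

The degrees of freedom sum to lrt - 1, one less than the number of plots, which is a quick sanity check on a planned design.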

Table 2. F-ratios used to test effects for randomized complete block experiments combined over locations.

| Sources of variation | Mean squares | F-ratio, RL-RT¹ | F-ratio, RL-FT¹ | F-ratio, FL-FT¹ |
|---|---|---|---|---|
| Locations (l) | M1 | (M1+M5)/(M2+M4) | M1/M2 | M1/M2 |
| Blocks(Location) (r) | M2 | | | |
| Treatment (t) | M3 | M3/M4 | M3/M4 | M3/M5 |
| Location x treatment | M4 | M4/M5 | M4/M5 | M4/M5 |
| Pooled error | M5 | | | |

¹ R = random, F = fixed, L = location, T = treatment; for example, RL-FT denotes random locations with fixed treatments.

In a genetic/breeding experiment, treatments would likely be genotypes or varieties.
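The choice of F-ratio in Table 2 can be written as a small lookup. The following is a minimal sketch, not a full ANOVA routine: it assumes the mean squares M1 to M5 have already been computed, and the function name, dictionary keys, and example values are hypothetical.

```python
def f_ratios(ms, model):
    """Return the Table 2 F-ratios for one of "RL-RT", "RL-FT", or "FL-FT".

    ms maps "M1".."M5" to observed mean squares.
    """
    if model not in ("RL-RT", "RL-FT", "FL-FT"):
        raise ValueError(f"unknown model: {model!r}")

    m1, m2, m3, m4, m5 = (ms[key] for key in ("M1", "M2", "M3", "M4", "M5"))

    # Locations: a synthesized ratio is needed when both effects are random.
    if model == "RL-RT":
        f_locations = (m1 + m5) / (m2 + m4)
    else:
        f_locations = m1 / m2

    # Treatments: tested against the interaction unless locations are fixed.
    f_treatment = m3 / m5 if model == "FL-FT" else m3 / m4

    # The interaction is tested against the pooled error in every model.
    f_interaction = m4 / m5

    return {
        "locations": f_locations,
        "treatment": f_treatment,
        "location_x_treatment": f_interaction,
    }


# Made-up mean squares, for illustration only.
mean_squares = {"M1": 120.0, "M2": 12.0, "M3": 80.0, "M4": 20.0, "M5": 8.0}
for model in ("RL-RT", "RL-FT", "FL-FT"):
    print(model, f_ratios(mean_squares, model))
```

With the same mean squares, the treatment F-ratio changes when locations move from random to fixed, because the denominator switches from M4 to M5. The sketch returns only the ratios; the matching degrees of freedom, which for the synthesized RL-RT location test are usually approximated, are not computed here.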

Conclusion

When designing experiments, plant breeders must consider the question they want to answer and, consequently, which statistical analyses are appropriate for answering it. With regard to ANOVA, three important points should be considered in this context.

  1. Unaccounted sources of variation will be pooled into the error term resulting in an inflated error.
  2. The appropriate F-tests differ depending on whether the effects are fixed or random.
  3. Fixed effects influence the mean and random effects influence the variance.
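The first point can be illustrated with a small numerical sketch. The data below are made up (three hypothetical treatments in three blocks, one plot per combination), and the function name is an assumption for illustration. The same data are analyzed twice: once with blocks in the model and once with blocks left out, so that the block variation is pooled into the error.

```python
from statistics import mean

# Hypothetical plot values: each row is a treatment, each column a block.
plots = {
    "A": [10.0, 12.5, 14.0],
    "B": [11.5, 13.0, 15.5],
    "C": [13.0, 15.0, 16.5],
}


def error_with_and_without_blocks(data):
    """Compare the error term when blocks are modeled versus ignored."""
    t = len(data)                        # number of treatments
    r = len(next(iter(data.values())))   # number of blocks
    values = [y for row in data.values() for y in row]
    grand_mean = mean(values)

    ss_total = sum((y - grand_mean) ** 2 for y in values)
    ss_treatment = r * sum((mean(row) - grand_mean) ** 2 for row in data.values())
    block_means = [mean(column) for column in zip(*data.values())]
    ss_block = t * sum((b - grand_mean) ** 2 for b in block_means)

    # Full model: blocks are removed from the error.
    ss_error_full = ss_total - ss_treatment - ss_block
    ms_error_full = ss_error_full / ((t - 1) * (r - 1))

    # Reduced model: block variation is pooled into the error.
    ss_error_reduced = ss_total - ss_treatment
    ms_error_reduced = ss_error_reduced / (t * (r - 1))

    return {
        "ss_block": ss_block,
        "ss_error_full": ss_error_full,
        "ms_error_full": ms_error_full,
        "ss_error_reduced": ss_error_reduced,
        "ms_error_reduced": ms_error_reduced,
    }


result = error_with_and_without_blocks(plots)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

In this made-up data set the blocks differ strongly, so leaving them out of the model gives a much larger error mean square and therefore a smaller F-ratio for treatments.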

References Cited

  • McIntosh, M. S. 1983. Analysis of combined experiments. Agronomy Journal 75: 153-155.

Additional Information

Many statistics textbooks provide a good discussion of theory and applications of ANOVA. Two examples are listed below.

  • Clewer, A. G., and D. H. Scarisbrick. 2001. Practical statistics and experimental design for plant and crop science. John Wiley & Sons, New York.
  • Steel, R.G.D., J. H. Torrie, and D. A. Dickey. 1997. Principles and procedures of statistics: A biometrical approach. McGraw–Hill, New York.

Funding Statement

Development of this page was supported in part by the National Institute of Food and Agriculture (NIFA) Solanaceae Coordinated Agricultural Project, agreement 2009-85606-05673, administered by Michigan State University. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the United States Department of Agriculture.
