David M. Francis, The Ohio State University
Heather L. Merk, The Ohio State University
Deana Namuth-Covert, University of Nebraska-Lincoln
Once a source, often a wild accession, has been identified as possessing a trait of interest to a breeding program, the next logical question for plant breeders is, “How can this trait best be incorporated into valuable breeding material?” The answer depends largely on the genetic nature of the trait. Combining phenotypic data with genotypic data from molecular markers makes it possible to detect associations between markers and traits, which in turn helps determine the number and nature of the genes or quantitative trait loci (QTL) controlling the trait.
Data analysis approaches for detecting associations between molecular markers and traits of interest include single marker analysis, simple interval mapping (SIM), multiple interval mapping (MIM), and composite interval mapping (CIM). Although these approaches are designed for QTL analysis, they are also typically employed whenever a trait’s method of genetic control is unknown. This article focuses on single marker analysis.
Single marker analysis can be conducted using a variety of statistical analyses, including t-tests, ANOVA, regression, maximum likelihood estimations, and log likelihood ratios. Because molecular marker genotypes can be classified into groups, they can be used as classifying variables for a t-test or ANOVA, or as explanatory variables for regression analysis. The null hypothesis tested is that the genotypic classes at a given molecular marker do not differ in phenotype. Single marker analysis therefore tests whether phenotype values differ among the genotypes at that marker. For example, do resistant and susceptible individuals have different genotypes at a given molecular marker? Significant differences suggest that the marker genotype and the phenotype are associated.
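As a minimal sketch of the ANOVA form of this test, the following Python function groups phenotype values by marker genotype and computes the one-way F statistic. The genotype labels (AA, AB, BB), the phenotype scores, and the function name are hypothetical and chosen only for illustration. The sketch stops at the F statistic and its degrees of freedom; statistical software would normally convert these into a p-value.

```python
from collections import defaultdict


def single_marker_anova(genotypes, phenotypes):
    """One-way ANOVA with marker genotype as the classifying variable.

    Returns (F statistic, between-class df, within-class df).
    """
    # Group phenotype values by the genotype observed at this marker.
    groups = defaultdict(list)
    for genotype, value in zip(genotypes, phenotypes):
        groups[genotype].append(value)

    n = len(phenotypes)
    k = len(groups)
    grand_mean = sum(phenotypes) / n

    # Variation among genotype class means.
    ss_between = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    # Variation of individuals around their own class mean.
    ss_within = sum(
        (value - sum(vals) / len(vals)) ** 2
        for vals in groups.values()
        for value in vals
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical F2 data: marker genotype and a disease score per plant.
genotypes = ["AA", "AA", "AA", "AB", "AB", "AB", "BB", "BB", "BB"]
phenotypes = [2, 3, 4, 4, 5, 6, 6, 7, 8]

f_stat, df_b, df_w = single_marker_anova(genotypes, phenotypes)
print(f"F = {f_stat:.2f} on {df_b} and {df_w} degrees of freedom")
```

A large F statistic relative to the critical value for the given degrees of freedom leads to rejecting the null hypothesis that the genotypic classes do not differ in phenotype.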
In the simplest case, a linear equation of the following form can be developed to describe the relationship between a trait and each molecular marker:
Y = µ + f(marker) + error
where:
- Y is the trait value
- µ is the population mean
- f(marker) is a function of the molecular marker
- error is the residual variation not explained by the marker
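One way to make this linear form concrete is to treat f(marker) as a slope multiplied by a centered allele count. The sketch below fits that model by ordinary least squares. The 0/1/2 dosage coding, the example data, and the function name are assumptions made for illustration; other choices of f(marker), such as adding a dominance term, are equally valid.

```python
def fit_single_marker(dosage, phenotypes):
    """Fit Y = mu + b * (dosage - mean dosage) + error by least squares.

    dosage: number of copies of one marker allele per individual (0, 1, or 2).
    Returns (mu, b, r_squared). Because the marker code is centered,
    mu equals the population mean of the trait.
    """
    n = len(phenotypes)
    x_mean = sum(dosage) / n
    mu = sum(phenotypes) / n

    sxx = sum((x - x_mean) ** 2 for x in dosage)
    sxy = sum((x - x_mean) * (y - mu) for x, y in zip(dosage, phenotypes))
    b = sxy / sxx  # estimated effect of one extra copy of the allele

    ss_total = sum((y - mu) ** 2 for y in phenotypes)
    ss_model = b * sxy  # variation explained by the marker
    r_squared = ss_model / ss_total
    return mu, b, r_squared


# Hypothetical data: allele dosage at one marker and a trait value per plant.
dosage = [0, 0, 0, 1, 1, 1, 2, 2, 2]
phenotypes = [2, 3, 4, 4, 5, 6, 6, 7, 8]

mu, b, r_squared = fit_single_marker(dosage, phenotypes)
print(f"mu = {mu:.2f}, b = {b:.2f}, R^2 = {r_squared:.2f}")
```

The R^2 value estimates the proportion of phenotypic variation associated with the marker, and whatever the marker does not explain is absorbed by the error term.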
Single marker analysis using ANOVA was used in the bacterial spot start-to-finish example to determine associations between phenotype for bacterial spot resistance in the field and genotype for F2 populations.
- View an example of single marker analysis in a BC1 population using ANOVA in SAS
The Plant and Soil Sciences eLibrary provides a helpful animation that is complementary to this lesson. The animation guides users through single marker analysis using ANOVA in Microsoft Excel.
To perform single marker analysis, the plant breeder must first develop a population that is segregating for the trait of interest. When developing a population for QTL analysis, the population structure and size, as well as the number and type of molecular markers, must be considered. Population structure and size are briefly considered here.
When analyzing populations with balanced structure (e.g., backcross one [BC1], F2, and recombinant inbred line [RIL] populations), the analyses can be performed easily using genetic mapping software such as QTL Cartographer, and parametric statistical analyses (e.g., ANOVA) are appropriate. As mentioned above, this type of analysis was used to determine associations between phenotype for bacterial spot resistance in the field and genotype for F2 populations in the bacterial spot start-to-finish example.
When analyzing populations with unbalanced structure (e.g., inbred backcross [IBC] populations like a BC2S5 population, which has an expected 7:1 genotypic ratio), non-parametric statistics such as the Kruskal–Wallis statistic may be appropriate. Unbalanced populations typically have a phenotype and/or genotype class that has too few individuals to make parametric statistics appropriate. The IBC population developed in the start-to-finish example provides an example.
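For illustration, the Kruskal–Wallis statistic can be computed from ranks alone, as in the sketch below. The data, which mimic a 7:1 split between two marker classes, and the function name are hypothetical. The statistic is usually compared against a chi-square distribution with (number of classes - 1) degrees of freedom, an approximation that becomes rough when a class contains very few individuals.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, with tie correction, for a dict that
    maps each marker genotype to a list of phenotype values."""
    pooled = sorted(value for vals in groups.values() for value in vals)
    n = len(pooled)

    # Assign average ranks so that tied values share the same rank.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(
        sum(rank_of[value] for value in vals) ** 2 / len(vals)
        for vals in groups.values()
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    # The tie correction leaves H unchanged when there are no ties.
    return h / (1 - tie_term / (n ** 3 - n))


# Hypothetical unbalanced population: most lines carry the recurrent
# parent genotype (AA) and only a few carry the donor genotype (BB).
groups = {
    "AA": [3.1, 2.8, 3.5, 2.9, 3.3, 3.0, 3.4],
    "BB": [1.2, 1.5],
}

h = kruskal_wallis_h(groups)
print(f"H = {h:.2f}")
```

With two genotype classes there is one degree of freedom, for which the chi-square critical value at the 0.05 level is 3.84.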
Population Size Considerations
The ability to detect associations between molecular markers and bacterial spot resistance is in part dependent on population size. In general, the smaller the effect of the QTL, the larger the number of individuals required to detect it.
Detecting an additive QTL that explains 50% of the phenotypic variation of a trait in an F2 population requires a population of at least 16 individuals. This assumes that the marker is completely linked to the QTL, that the significance level is 0.05, and that the probability of missing a true association is 10% (i.e., a power of 90%). Under the same assumptions, at least 206 individuals are required to detect an additive QTL that explains only 5% of the phenotypic variation.
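The relationship between QTL effect size and required population size can be explored with a rough calculation. The sketch below uses the Fisher z approximation for the sample size needed to detect a correlation, treating the square root of the proportion of variation explained as the marker-trait correlation. This approximation is an assumption introduced here for illustration and is not necessarily the method behind the figures quoted above, although under the same conditions it gives values close to them.

```python
import math
from statistics import NormalDist


def approx_population_size(r_squared, alpha=0.05, miss_rate=0.10):
    """Approximate number of individuals needed to detect a marker-trait
    association that explains a proportion r_squared of the phenotypic
    variation, using a two-sided test and the Fisher z approximation.

    alpha: significance level.
    miss_rate: probability of missing a true association (1 - power).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(1 - miss_rate)
    fisher_z = math.atanh(math.sqrt(r_squared))
    return ((z_alpha + z_power) / fisher_z) ** 2 + 3


for r_squared in (0.50, 0.05):
    n = approx_population_size(r_squared)
    print(f"QTL explaining {r_squared:.0%} of variation: n is about {n:.1f}")
```

Because the required size grows roughly in inverse proportion to the variation explained, small-effect QTL quickly demand populations of several hundred individuals.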
Advantages of Single Marker Analysis
The advantages of single marker analysis are based on Collard et al. (2005).
- It is the simplest method of QTL detection.
- Analysis can be performed using basic statistical software.
- Analysis does not require a complete linkage map.
Disadvantages/Limitations of Single Marker Analysis
The disadvantages and limitations of single marker analysis are based on Collard et al. (2005).
- The further a marker is from a QTL, the more difficult the QTL is to detect, due to recombination between the marker and QTL.
- QTL effects may be underestimated due to recombination between the marker and QTL.
These limitations may be overcome by using a large number of molecular markers spread throughout the genome.
Single marker analysis is a relatively simple method of QTL analysis that can be conducted to detect associations between molecular markers and traits of interest.
- Collard, B.C.Y., M.Z.Z. Jahufer, J. B. Brouwer, and E.C.K. Pang. 2005. An introduction to markers, quantitative trait locus (QTL) mapping and marker-assisted selection for crop improvement: The basic concepts. Euphytica 142: 169–196. (Available online at: http://dx.doi.org/10.1007/s10681-005-1681-5) (verified 29 Dec 2010).
- Robbins, M. D., A. Darrigues, S. Sim, M.A.T. Masud, and D. M. Francis. 2009. Characterization of hypersensitive resistance to bacterial spot race T3 (Xanthomonas perforans) from tomato accession PI 128216. Phytopathology 99: 1037–1044. (Available at: http://apsjournals.apsnet.org/doi/abs/10.1094/PHYTO-99-9-1037) (verified 24 Sept 2010).
- Byrne, P. Quantitative trait locus (QTL) analysis 1 [Online lesson]. Plant and Soil Sciences eLibrary, University of Nebraska – Lincoln. Available at: http://plantandsoil.unl.edu/croptechnology2005/pages/index.jsp?what=topicsD&topicOrder=1&informationModuleId=1031263034 (verified 24 Sept 2010).
- Byrne, P. Quantitative trait locus (QTL) analysis 2 [Online lesson]. Plant and Soil Sciences eLibrary, University of Nebraska – Lincoln. Available at: http://plantandsoil.unl.edu/croptechnology2005/pages/index.jsp?what=topicsD&topicOrder=1&informationModuleId=1067442598 (verified 24 Sept 2010).
- Duffy, D., and P. Byrne. QTL analysis: Single factor ANOVA in Excel [Online animation]. Plant and Soil Sciences eLibrary, University of Nebraska – Lincoln. Available at: http://plantandsoil.unl.edu/croptechnology2005/pages/animationOut.cgi?anim_name=QTLSingleFactorANOVAinExcel.swf (verified 24 Sept 2010).
- Lee, D. Quantitative traits. [Online animation]. Plant and Soil Sciences eLibrary, University of Nebraska – Lincoln. Available at: http://plantandsoil.unl.edu/croptechnology2005/pages/index.jsp?what=topicsD&topicOrder=1&informationModuleId=979259458 (verified 24 Sept 2010).
Ben Hui Liu provides a thorough explanation of QTL analysis in his text, Statistical Genomics.
- Liu, B. H. 1998. Statistical genomics: Linkage, mapping, and QTL analysis. CRC Press, Boca Raton, FL.
Development of this page was supported in part by the National Institute of Food and Agriculture (NIFA) Solanaceae Coordinated Agricultural Project, agreement 2009-85606-05673, administered by Michigan State University. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the United States Department of Agriculture.