SWEEP: A Tool for Filtering High-Quality SNPs in Polyploid Crops
Josh P. Clevenger and Peggy Ozias-Akins
Corresponding author: The University of Georgia, Department of Horticulture, 2356 Rainwater Road, Tifton, GA 31793. E-mail: pozias@uga.edu
Abstract
High-throughput next-generation sequence-based genotyping and single nucleotide polymorphism (SNP) detection opens the door for emerging genomics-based breeding strategies such as genome-wide association analysis and genomic selection. In polyploids, SNP detection is confounded by a highly similar homeologous sequence where a polymorphism between subgenomes must be differentiated from a SNP. We have developed and implemented a novel tool called SWEEP: Sliding Window Extraction of Explicit Polymorphisms. SWEEP uses subgenome polymorphism haplotypes as contrast to identify true SNPs between genotypes. The tool is a single command script that calls a series of modules based on user-defined options and takes sorted/indexed bam files or vcf files as input. Filtering options are highly flexible and include filtering based on sequence depth, alternate allele ratio, and SNP quality on top of the SWEEP filtering procedure. Using real and simulated data we show that SWEEP outperforms current SNP filtering methods for polyploids. SWEEP can be used for high-quality SNP discovery in polyploid crops.
See http://www.g3journal.org/content/5/9/1797.abstract?etoc (G3, September 1, 2015, vol. 5, no. 9, 1797-1803)
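The abstract lists three filters that can be applied on top of the SWEEP procedure: sequence depth, alternate allele ratio, and SNP quality. The following is a minimal Python sketch of what such a filter could look like. It is not SWEEP's code: the `VariantCall` record, the `passes_filters` function, and every threshold value are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    """One candidate SNP call (hypothetical record layout, not SWEEP's own)."""
    contig: str
    pos: int
    depth: int      # total reads covering the site
    alt_reads: int  # reads supporting the alternate allele
    qual: float     # variant quality score


def passes_filters(call, min_depth=10, min_alt_ratio=0.2,
                   max_alt_ratio=0.8, min_qual=30.0):
    """Return True if the call clears all three illustrative thresholds."""
    # Reject uncovered or shallow sites first (also avoids dividing by zero).
    if call.depth <= 0 or call.depth < min_depth:
        return False
    if call.qual < min_qual:
        return False
    alt_ratio = call.alt_reads / call.depth
    return min_alt_ratio <= alt_ratio <= max_alt_ratio


# Made-up example calls, one failing each filter in turn.
calls = [
    VariantCall("contig1", 101, depth=40, alt_reads=18, qual=60.0),
    VariantCall("contig1", 250, depth=6, alt_reads=3, qual=55.0),    # too shallow
    VariantCall("contig1", 310, depth=50, alt_reads=2, qual=70.0),   # alt ratio too low
    VariantCall("contig1", 512, depth=35, alt_reads=15, qual=12.0),  # low quality
]
kept = [c.pos for c in calls if passes_filters(c)]
print(kept)
```

In a polyploid, a bounded alternate allele ratio is a plausible way to discard sites supported by only a handful of reads, but suitable bounds depend on ploidy and coverage and would need to be tuned for each data set.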
Figure 1. (A) Logic for the SWEEP pipeline. (B) Example of detection of a SNP between genotypes. The blue bar represents the reference consensus sequence. Green bars represent one subgenome-derived sequence. Orange bars represent the alternative subgenome-derived sequence. Bases in red are within-genome polymorphisms and in this instance are the anchor SNPs. Bases in yellow are the true between-genotype SNPs.
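The anchor idea in the figure can be illustrated with a simplified Python sketch. It rests on assumptions that are not taken from the paper: a site called with the same alternate base in both genotypes is treated as an anchor (a subgenome polymorphism), a site called in only one genotype is treated as a candidate between-genotype SNP, and a candidate is kept only if an anchor falls inside a window of adjacent called sites. The function name `candidate_snps`, the input layout, and the default window size are hypothetical. The sketch also skips the read-level haplotype check that a real tool working from bam files would need.

```python
def candidate_snps(calls_a, calls_b, window=2):
    """Return positions called in only one genotype that sit next to an anchor.

    calls_a, calls_b: dict mapping position -> alternate base for one contig,
    one dict per genotype (hypothetical input layout).
    window: number of consecutive called sites in one sliding window.
    """
    positions = sorted(set(calls_a) | set(calls_b))
    # Anchor: the same alternate base is called in both genotypes.
    is_anchor = [
        p in calls_a and p in calls_b and calls_a[p] == calls_b[p]
        for p in positions
    ]
    kept = []
    for i, pos in enumerate(positions):
        # Candidate: called in exactly one of the two genotypes.
        if (pos in calls_a) == (pos in calls_b):
            continue
        # Every window of `window` adjacent sites that contains site i
        # lies between these two indices.
        lo = max(0, i - (window - 1))
        hi = min(len(positions), i + window)
        if any(is_anchor[lo:hi]):
            kept.append(pos)
    return kept


# Made-up calls: 100 and 400 are shared by both genotypes (anchors).
genotype_a = {100: "T", 130: "G", 400: "C"}
genotype_b = {100: "T", 400: "C", 950: "A", 990: "G"}
result = candidate_snps(genotype_a, genotype_b)
print(result)
```

Here positions 130 and 950 each have an anchor as an immediate neighbour and are kept, while 990 has no anchor within the window and is dropped. Measuring the window in called sites rather than base pairs is a simplification made for this sketch.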