Ning Jiang, Fengjun Zhang, Jinhua Wu, Yue Chen, Xiaohua Hu, Ou Fang, Lindsey J. Leach, Di Wang, Zewei Luo
Abstract
Key message
This optimized approach provides both a computational tool and a library construction protocol that maximize the number of genomic sequence reads uniformly covering a plant genome and minimize the number of reads representing chloroplast DNA and rRNA genes. Researchers can use the computational tool to design their own RAD-seq experiments and achieve the expected coverage of sequence variant markers in large plant populations, using the genome sequence and, ideally though not necessarily, information on the distribution of sequence polymorphisms in the genome.
Abstract
The advent of next-generation sequencing techniques has motivated recent interest in sequence-based identification and genotyping of genome-wide genetic variants in large populations, with RAD-seq being a typical example. Because chloroplast and rRNA genes may account for up to 60 % of the resulting sequence reads, current RAD-seq designs that do not take this into account can be very inefficient for plant and crop species. We present here a generic computational tool to optimize RAD-seq design in any plant species, and we experimentally tested the optimized design by using it to screen for and genotype sequence variants in four plant populations of diploid and autotetraploid Arabidopsis and potato (Solanum tuberosum). Sequence data from the optimized RAD-seq experiments show that the undesirable reads contributed by chloroplast and rRNA genes can be held to 3–10 %. Additionally, the optimized RAD-seq method enables pre-design of the required uniformity and density of coverage of high-quality polymorphic sequence markers over the genome of interest, and genotyping of large plant or crop populations at a cost competitive with other mainstream methods in the literature.
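The idea of screening candidate restriction enzymes in silico before building a library can be illustrated with a minimal sketch. This is not the authors' tool. The function names, the `chloroplast_copies` weighting, and the choice of enzymes are assumptions made for illustration, and the sketch only counts recognition sites as a rough proxy for how many RAD tags each compartment would contribute.

```python
# Illustrative sketch only: rank enzymes by the estimated share of RAD tags
# that would come from chloroplast DNA. All names here are hypothetical.

def count_sites(seq, site):
    """Count occurrences (overlaps allowed) of a recognition site in a sequence."""
    seq, site = seq.upper(), site.upper()
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def organelle_tag_fraction(nuclear_seq, chloroplast_seq, site, chloroplast_copies=1):
    """Estimate the fraction of cut sites located in chloroplast DNA.

    chloroplast_copies is an assumed weight for how many chloroplast genome
    copies are present per nuclear genome copy in the DNA sample.
    """
    nuclear = count_sites(nuclear_seq, site)
    chloroplast = count_sites(chloroplast_seq, site) * chloroplast_copies
    total = nuclear + chloroplast
    return chloroplast / total if total else 0.0


def rank_enzymes(nuclear_seq, chloroplast_seq, enzymes, chloroplast_copies=1):
    """Return (name, nuclear_sites, chloroplast_fraction) tuples, best first.

    'Best' means the lowest chloroplast fraction, with ties broken in favour
    of more nuclear sites (denser marker coverage).
    """
    rows = []
    for name, site in enzymes.items():
        rows.append((
            name,
            count_sites(nuclear_seq, site),
            organelle_tag_fraction(nuclear_seq, chloroplast_seq, site, chloroplast_copies),
        ))
    return sorted(rows, key=lambda row: (row[2], -row[1]))


# Palindromic recognition sites, so scanning one strand is sufficient.
# These enzymes are examples; the text above does not name the ones used.
CANDIDATE_ENZYMES = {"EcoRI": "GAATTC", "PstI": "CTGCAG", "SbfI": "CCTGCAGG"}
```

In practice the two sequences would be read from the reference genome and chloroplast genome files, and a realistic design would also consider rRNA gene clusters, fragment size selection, and the spacing of cut sites along each chromosome, none of which this sketch attempts.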
See: http://link.springer.com/article/10.1007/s00122-016-2736-9
Theoretical and Applied Genetics, September 2016; Volume 129, Issue 9, pp 1739–1757
Fig. 2 Distribution of the number of short reads across 12 barcoded samples in each pooled RAD-seq dataset. The red dashed line shows the average number of paired reads per sample.