Quinoa genome assembly employing genomic variation for guided scaffolding | Vien Khoa Hoc Ky Thuat Nong Nghiep Mien Nam

Alexandrina Bodrug-Schepers, Nancy Stralis-Pavese, Hermann Buerstmayr, Juliane C. Dohm & Heinz Himmelbauer

Theoretical and Applied Genetics, November 2021; vol. 134: 3577–3594

Key message

We propose to use the natural variation between individuals of a population for genome assembly scaffolding. In today’s genome projects, multiple accessions get sequenced, leading to variant catalogs. Using such information to improve genome assemblies is attractive both in terms of cost and scientifically, because the value of an assembly increases with its contiguity. We conclude that haplotype information is a valuable resource to group and order contigs toward the generation of pseudomolecules.

Abstract

Quinoa (Chenopodium quinoa) has been under cultivation in Latin America for more than 7500 years. Recently, quinoa has gained increasing attention due to its stress resistance and its nutritional value. We generated a novel quinoa genome assembly for the Bolivian accession CHEN125 using PacBio long-read sequencing data (assembly size 1.32 Gbp, initial N50 size 608 kbp). Next, we re-sequenced 50 quinoa accessions from Peru and Bolivia. This set of accessions differed at 4.4 million single-nucleotide variant (SNV) positions compared to CHEN125 (1.4 million SNV positions on average per accession). We show how to exploit variation in accessions that are distantly related to establish a genome-wide ordered set of contigs for guided scaffolding of a reference assembly. The method is based on detecting shared haplotypes and their expected continuity throughout the genome (i.e., the effect of linkage disequilibrium), as an extension of what is expected in mapping populations where only a few haplotypes are present. We test the approach using Arabidopsis thaliana data from different populations. After applying the method to our CHEN125 quinoa assembly, we validated the results with mate-pairs, genetic markers, and another quinoa assembly originating from a Chilean cultivar. We show consistency between these information sources and the haplotype-based relations as determined by us and obtain an improved assembly with an N50 size of 1079 kbp and ordered contig groups of up to 39.7 Mbp. We conclude that haplotype information in distantly related individuals of the same species is a valuable resource to group and order contigs according to their adjacency in the genome toward the generation of pseudomolecules.
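The idea of shared haplotypes continuing across contig boundaries can be illustrated with a small sketch. This is not the authors' implementation: it is a toy illustration that assumes each accession's alleles near a contig end are available as tuples (0 = reference, 1 = variant), groups accessions by identical allele pattern, and scores two contig ends by the fraction of accession pairs on which the two groupings agree (the Rand index, a similarity measure chosen here for simplicity). The function names, accession names and toy data are all hypothetical.

```python
from itertools import combinations


def haplotype_groups(genotypes):
    """Assign each accession a group label based on its allele pattern.

    genotypes: dict mapping accession name -> tuple of alleles
    (0 = reference, 1 = variant) at the SNV positions near one contig end.
    """
    labels = {}
    group_of = {}
    for accession, alleles in genotypes.items():
        group_of[accession] = labels.setdefault(tuple(alleles), len(labels))
    return group_of


def rand_index(groups_a, groups_b):
    """Fraction of accession pairs on which two groupings agree."""
    accessions = sorted(set(groups_a) & set(groups_b))
    pairs = list(combinations(accessions, 2))
    if not pairs:
        return 0.0
    agree = sum(
        (groups_a[x] == groups_a[y]) == (groups_b[x] == groups_b[y])
        for x, y in pairs
    )
    return agree / len(pairs)


# Hypothetical toy data: six accessions, three contig ends.
contig1_end = {"a1": (1, 0), "a2": (1, 0), "a3": (1, 0),
               "a4": (0, 1), "a5": (0, 1), "a6": (0, 1)}
contig2_start = {"a1": (0, 0, 1), "a2": (0, 0, 1), "a3": (0, 0, 1),
                 "a4": (1, 1, 0), "a5": (1, 1, 0), "a6": (1, 1, 0)}
contig3_start = {"a1": (1,), "a2": (0,), "a3": (1,),
                 "a4": (0,), "a5": (1,), "a6": (0,)}

score_1_2 = rand_index(haplotype_groups(contig1_end),
                       haplotype_groups(contig2_start))
score_1_3 = rand_index(haplotype_groups(contig1_end),
                       haplotype_groups(contig3_start))
print(score_1_2, score_1_3)
```

In this toy data, contig 1 and contig 2 split the accessions into the same two groups, so they score higher than contig 1 and contig 3. Under linkage disequilibrium, such agreement would be weak evidence that the two contig ends lie close together in the genome.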

See https://link.springer.com/article/10.1007/s00122-021-03915-x

Figure 1: Variants of 22 individuals (1–22) compared to a reference assembly (Ref), only variation positions are shown (schematic). In the genomic interval A, two haplotypes can be distinguished represented by individuals 1–12 and 13–22, respectively. In the genomic interval B, three haplotypes can be distinguished, represented by individuals 1–5, 6–12 and 13–22, respectively. In the whole genomic region (C), four haplotypes can be distinguished, represented by individuals 1–5, 6–12, 13–16 and 17–22, respectively. In total there are three different patterns of variation per genomic position: the first pattern in interval A, the second in interval D, the third in interval E.
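The haplotype counts in the caption can be reproduced with a minimal sketch. It assumes one representative variant position for each of the three patterns, and it assumes which group of individuals carries the variant in each pattern. The figure is schematic and shows neither detail, so both are illustrative choices, as are the function and variable names.

```python
def count_haplotypes(matrix, columns):
    """Count distinct allele patterns across individuals in the given columns."""
    return len({tuple(row[c] for c in columns) for row in matrix.values()})


# Assumed toy encoding of the three variation patterns (0 = reference allele):
#   column 0: pattern separating individuals 1-12 from 13-22
#   column 1: pattern separating individuals 6-12 from the rest
#   column 2: pattern separating individuals 17-22 from the rest
matrix = {
    i: (int(i >= 13), int(6 <= i <= 12), int(i >= 17))
    for i in range(1, 23)
}

interval_a = [0]        # first pattern only
interval_b = [0, 1]     # first and second patterns
region_c = [0, 1, 2]    # all three patterns

print(count_haplotypes(matrix, interval_a))
print(count_haplotypes(matrix, interval_b))
print(count_haplotypes(matrix, region_c))
```

With this encoding, interval A yields two haplotypes, interval B three, and the whole region C four, matching the groups of individuals listed in the caption.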
