Chromosome-level assembly of Arabidopsis thaliana Ler reveals the extent of translocation and inversion polymorphisms
Sunday, 2016/07/17 | 05:32:12
Luis Zapata, Jia Ding, Eva-Maria Willing, Benjamin Hartwig, Daniela Bezdan, Wen-Biao Jiao, Vipul Patel, Geo Velikkakam James, Maarten Koornneef, Stephan Ossowski, and Korbinian Schneeberger
GENETICS
Significance
Despite widespread reports on deciphering the sequences of all kinds of genomes, most of these reconstructed genomes rely on a comparison of short DNA sequencing reads to a reference sequence, rather than being independently reconstructed. This method limits the insights on genomic differences to local, mostly small-scale variation, because large rearrangements are likely overlooked by current methods. We have de novo assembled the genome of a common strain of Arabidopsis thaliana Landsberg erecta and revealed hundreds of rearranged regions. Some of these differences suppress meiotic recombination, impacting the haplotypes of a worldwide population of A. thaliana. In addition to sequence changes, this work, which, to our knowledge is the first comparison of an independent, chromosome-level assembled A. thaliana genome, revealed hundreds of unknown, accession-specific genes.
Abstract
Resequencing or reference-based assemblies reveal large parts of the small-scale sequence variation. However, they typically fail to separate such local variation into colinear and rearranged variation, because they usually do not recover the complement of large-scale rearrangements, including transpositions and inversions. Besides the availability of hundreds of genomes of diverse Arabidopsis thaliana accessions, there is so far only one full-length assembled genome: the reference sequence. We have assembled 117 Mb of the A. thaliana Landsberg erecta (Ler) genome into five chromosome-equivalent sequences using a combination of short Illumina reads, long PacBio reads, and linkage information.
Whole-genome comparison against the reference sequence revealed 564 transpositions and 47 inversions comprising ∼3.6 Mb, in addition to 4.1 Mb of nonreference sequence, mostly originating from duplications. Although rearranged regions are not different in local divergence from colinear regions, they are drastically depleted for meiotic recombination in heterozygotes. Using a 1.2-Mb inversion as an example, we show that such rearrangement-mediated reduction of meiotic recombination can lead to genetically isolated haplotypes in the worldwide population of A. thaliana. Moreover, we found 105 single-copy genes, which were only present in the reference sequence or the Ler assembly, and 334 single-copy orthologs, which showed an additional copy in only one of the genomes. To our knowledge, this work gives first insights into the degree and type of variation, which will be revealed once complete assemblies will replace resequencing or other reference-dependent methods.
See http://www.pnas.org/content/113/28/E4052.full PNAS July 12, 2016 vol. 113 no. 28: E4052–E4060
Fig. 2. Higher-order sequence variation. (A) Schematic of local (Upper) and higher-order (Lower) sequence variation as revealed by a whole-genome alignment. Local sequence divergence does not only include small-scale variation like SNPs and small indels, but also structural variation like large indels and HDRs. Higher-order variation includes transpositions and inversions, which do not reside in the orthologous regions in the other genome. Both colinear (allelic) and rearranged (nonallelic) regions can harbor local variation. (B) Amount of aligned and nonaligned regions in a nonredundant whole-genome alignment of Col-0 and Ler. Aligned regions can be separated into colinear (gray) and rearranged regions [inversions and transpositions (transpos.); red]. Nonaligned regions, typically residing in the breaks between allelic and nonallelic regions, are shown for Col-0 and Ler separately, including the amount of putatively duplicated regions. (C) Location of transpositions and inversions. (D) Genomic space involved in different types of local sequence variation, separately shown for allelic and nonallelic regions. (E) Sequence divergence in allelic and nonallelic alignments. (F) Schematic examples for the consequences of meiotic recombination (CO) events in transposed (Upper) and inverted (Lower) regions. Chromosome arm exchange in nonallelic regions can lead to extreme chromosomal rearrangements. (G) Distribution of the location of 362 CO events in respect to their occurrence in allelic (gray), nonaligned (green), and nonallelic (red) regions in contrast to the genomic fractions of these regions; shown are complete genome (Upper) and only chromosome arms (Lower).
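The separation of aligned regions into colinear (allelic) and rearranged (nonallelic) blocks described in Fig. 2A and 2B can be illustrated with a small sketch. This is not the authors' pipeline. It assumes a hypothetical input of nonredundant alignment blocks on one chromosome, each with reference and query coordinates and a strand, and it uses a simple heaviest-chain heuristic: the forward-strand blocks that keep the same order in both genomes and cover the most bases are called colinear, remaining reverse-strand blocks are called inversions, and remaining forward-strand blocks are called transpositions. The `Block` type, the `classify` function, and the labeling rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One alignment block (hypothetical format, half-open coordinates)."""
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int
    strand: str  # "+" for same orientation, "-" for reverse orientation

    @property
    def length(self) -> int:
        return self.ref_end - self.ref_start


def classify(blocks):
    """Return a dict mapping each block to 'colinear', 'inversion', or 'transposition'.

    Simplification: a reverse-strand block is always labeled 'inversion',
    even if it has also moved to a different position.
    """
    forward = sorted((b for b in blocks if b.strand == "+"),
                     key=lambda b: b.ref_start)
    n = len(forward)
    best = [b.length for b in forward]  # heaviest chain ending at block i
    prev = [-1] * n                     # back-pointer for chain recovery

    # O(n^2) dynamic programming over forward blocks sorted by reference start.
    for i in range(n):
        for j in range(i):
            same_order = (forward[j].ref_end <= forward[i].ref_start
                          and forward[j].qry_end <= forward[i].qry_start)
            if same_order and best[j] + forward[i].length > best[i]:
                best[i] = best[j] + forward[i].length
                prev[i] = j

    # Walk back from the end of the heaviest chain to collect colinear blocks.
    chain = set()
    if n:
        i = max(range(n), key=best.__getitem__)
        while i != -1:
            chain.add(forward[i])
            i = prev[i]

    labels = {}
    for b in blocks:
        if b in chain:
            labels[b] = "colinear"
        elif b.strand == "-":
            labels[b] = "inversion"
        else:
            labels[b] = "transposition"
    return labels
```

Summing `length` per label over the returned dictionary would give the kind of per-class totals plotted in Fig. 2B, under the simplifying assumptions stated above.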