GenBank is a reliable resource for 21st century biodiversity research

Matthieu Leray, Nancy Knowlton, Shian-Lei Ho, Bryan N. Nguyen, and Ryuji J. Machida

PNAS, November 5, 2019, 116 (45): 22651-22656

Significance

As loss of biodiversity and ecosystem degradation become major concerns worldwide, scientists increasingly depend on DNA-based characterization of animal communities for monitoring and impact assessments. These analyses ultimately depend on the taxonomic reliability of genetic databases for taxonomic assignments. Concerns have been raised about the reliability of GenBank, the largest and most widely used genetic database. We show that, contrary to expectations, the proportion of mislabeled sequences in GenBank is surprisingly low. Major taxonomic errors are vanishingly small (0.01% at the class level, 0.05% at the order level), and likely <1% even at the genus level. These results show that GenBank is much more reliable for a range of applications, including studies of environmental change, than previously thought.

Abstract

Traditional methods of characterizing biodiversity are increasingly being supplemented and replaced by approaches based on DNA sequencing alone. These approaches commonly involve extraction and high-throughput sequencing of bulk samples from biologically complex communities or samples of environmental DNA (eDNA). In such cases, vouchers for individual organisms are rarely obtained, often unidentifiable, or unavailable. Thus, identifying these sequences typically relies on comparisons with sequences from genetic databases, particularly GenBank. While concerns have been raised about biases and inaccuracies in laboratory and analytical methods, comparatively little attention has been paid to the taxonomic reliability of GenBank itself. Here we analyze the metazoan mitochondrial sequences of GenBank using a combination of distance-based clustering and phylogenetic analysis. Because of their comparatively rapid evolutionary rates and consequent high taxonomic resolution, mitochondrial sequences represent an invaluable resource for the detection of the many small and often undescribed organisms that represent the bulk of animal diversity. We show that metazoan identifications in GenBank are surprisingly accurate, even at low taxonomic levels (likely <1% error rate at the genus level). This stands in contrast to previously voiced concerns based on limited analyses of particular groups and the fact that individual researchers currently submit annotated sequences to GenBank without significant external taxonomic validation. Our encouraging results suggest that the rapid uptake of DNA-based approaches is supported by a bioinformatic infrastructure capable of assessing both the losses to biodiversity caused by global change and the effectiveness of conservation efforts aimed at slowing or reversing these losses.
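To make the idea of distance-based clustering for spotting mislabeled records more concrete, here is a minimal Python sketch. It is not the authors' pipeline: the greedy centroid clustering, the 0.9 identity threshold, the assumption that sequences are already aligned to equal length, and the function names `identity`, `greedy_cluster`, and `flag_conflicts` are all illustrative assumptions. The sketch groups similar sequences and flags any record whose taxonomic label disagrees with the majority label of its cluster.

```python
from collections import Counter


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(records, threshold=0.9):
    """Greedy centroid clustering (a simplified stand-in for a real clustering tool).

    records: list of (seq_id, label, sequence) tuples.
    Each record joins the first cluster whose centroid (its first member)
    is at least `threshold` identical; otherwise it starts a new cluster.
    """
    clusters = []
    for rec in records:
        for cluster in clusters:
            if identity(rec[2], cluster[0][2]) >= threshold:
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters


def flag_conflicts(clusters):
    """Return (seq_id, label, majority_label) for records that disagree with their cluster."""
    flagged = []
    for cluster in clusters:
        counts = Counter(label for _, label, _ in cluster)
        if len(counts) < 2:
            continue  # all labels agree, nothing to flag
        majority, _ = counts.most_common(1)[0]
        flagged.extend(
            (seq_id, label, majority)
            for seq_id, label, _ in cluster
            if label != majority
        )
    return flagged


# Hypothetical toy records: (id, class-level label, aligned sequence)
records = [
    ("s1", "Insecta", "ACGTACGTACGTACGTACGT"),
    ("s2", "Insecta", "ACGTACGTACGTACGTACGA"),
    ("s3", "Mammalia", "ACGTACGTACGTACGTACGT"),  # same sequence as s1, conflicting label
    ("s4", "Mammalia", "TTTTGGGGCCCCAAAATTTT"),
]

clusters = greedy_cluster(records)
suspects = flag_conflicts(clusters)
```

In this toy data, `s3` shares a cluster with two Insecta records but carries a Mammalia label, so it is the kind of record that would be passed on to a closer check such as the phylogenetic analysis the abstract mentions. A flagged conflict is only a candidate error, not proof of one.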

 

See https://www.pnas.org/content/116/45/22651

 

Figure 1: Percentage of sequences in multisequence clusters for 13 protein and 2 ribosomal RNA-coding metazoan mitochondrial encoded genes. Clustering was performed on sequences retrieved from the GenBank BLAST nucleotide database using VSEARCH.
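The quantity plotted in Figure 1 can be computed from cluster assignments with a few lines of Python. This is a sketch under stated assumptions: the input is one cluster identifier per sequence, the function name `percent_in_multisequence_clusters` is hypothetical, and the gene names and numbers below are made-up toy data, not values from the paper. How the assignments are exported from VSEARCH is not covered here.

```python
from collections import Counter


def percent_in_multisequence_clusters(cluster_ids):
    """Percentage of sequences that fall in clusters containing more than one sequence.

    cluster_ids: a list with one cluster identifier per sequence.
    """
    if not cluster_ids:
        return 0.0
    sizes = Counter(cluster_ids)
    in_multi = sum(size for size in sizes.values() if size > 1)
    return 100.0 * in_multi / len(cluster_ids)


# Hypothetical toy assignments, one list of cluster ids per gene
assignments = {
    "COI": [0, 0, 0, 1, 2, 2, 3, 4],
    "16S": [0, 1, 2, 3],
}

for gene, ids in assignments.items():
    print(gene, percent_in_multisequence_clusters(ids))
```

Sequences in singleton clusters have no close neighbor to compare labels against, so this percentage indicates how much of each gene's data can be checked by clustering at all.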
