Genome 10K: A Proposition to Obtain Whole Genome Sequence for 10,000 Vertebrate Species



Non-avian reptile diversity includes snakes, lizards, turtles, crocodiles, and two species of tuatara. Because the traditional view of interfamilial relationships (based on morphology) differs appreciably from recent molecular phylogenies, and the molecular phylogenies differ from one another, major issues such as the origin of snakes (which are clearly nested within lizards) remain controversial (references). Despite these uncertainties, the content of the major groups of reptiles (i.e., the families) is quite stable. Following online databases and Townsend et al. (2004) [27], reptile diversity is distributed among the following groupings: Snakes are divided among 19 families, 477 genera, and 3,048 species. Lizards comprise 33 families, 490 genera, and 5,057 species. Turtle diversity is divided among 13 families, 91 genera, and 313 species. Crocodiles include 23 species divided among 7 genera in 3 taxonomic families. The tuatara is the only extant member of the formerly diverse and widespread Rhynchocephalians. Total reptile diversity therefore includes 69 families, 1,066 genera, and 8,442 species. The Genome 10K collection has 67%, 34% and 10% of these, respectively (Appendix 2, reptiles).


The Class Amphibia is divided into three orders: Anura (frogs), Caudata (salamanders) and Gymnophiona (caecilians), derived from a common ancestor approximately 370 MYA and representing the only three surviving lineages from a much greater diversity that existed before the Permian extinction 250 MYA [28]. These major clades contain 5,532 frog species, 552 salamander species, and 176 caecilian species, respectively. Amphibian taxonomy is currently in a state of flux, if not disarray, with competing groups of systematists making frequent and conflicting changes. Although it is controversial, we largely follow the work of Frost et al. (2006) [29] in summarizing amphibian diversity and tissue holdings (Appendix 2, amphibians), because this schema tends to split groups, adding more families and genera, which helps us label the major lineages using the standard ranks of Linnaean classification. This taxonomy contains 61 families of amphibians distributed among the three orders, containing a total of 510 genera and 6,260 species. The Genome 10K collection contains a total of 394 species (6%), 144 genera (28%), and 45 families (74%).
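The tallies in this paragraph can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the counts are those quoted above, while the variable and function names are hypothetical, and rounding to the nearest whole percent is an assumption about how the quoted percentages were derived.

```python
# Species counts per amphibian order, as quoted in the text.
species_by_order = {"Anura": 5532, "Caudata": 552, "Gymnophiona": 176}

# Taxonomy totals and Genome 10K holdings per rank, as quoted in the text.
taxonomy_totals = {"families": 61, "genera": 510, "species": 6260}
collection_holdings = {"families": 45, "genera": 144, "species": 394}


def coverage_percent(held: int, total: int) -> int:
    """Return holdings as a percentage of the total, rounded to the nearest integer."""
    return round(100 * held / total)


# Sum of the three orders; this should match the stated species total.
total_species = sum(species_by_order.values())

# Coverage of the collection at each taxonomic rank.
coverage = {
    rank: coverage_percent(collection_holdings[rank], taxonomy_totals[rank])
    for rank in taxonomy_totals
}

print(total_species)
print(coverage)
```

Under these assumptions the three orders sum to the stated species total, and the rounded coverage at each rank agrees with the percentages given in the text.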

Collectively, amphibians are of global conservation concern, most recently because of the pathogenic effects of a chytrid fungus, Batrachochytrium dendrobatidis [30]. Habitat loss, pollutants, pesticides, herbicides, fertilizers and climatic changes further contribute to their rapid decline.


“Fishes” include all non-tetrapod vertebrates, comprising (i) jawless vertebrates (hagfishes and lampreys, 108 species), (ii) chondrichthyans (sharks, rays, chimeras, 970 species), (iii) actinopterygians (ray-finned fishes, 26,950 species), and (iv) piscine sarcopterygians (coelacanths, lungfishes, 8 species). Total described diversity comprises approximately 28,036 species [31], but actual diversity is probably greater than 50,000 species. A broad outline of the evolution of these most deeply coalescing of the vertebrate clades is provided by Stiassny et al. (2004) [32].

In keeping with their very ancient origins, fishes account for nearly 50% of all living species of vertebrates, exhibiting a vast diversity in their morphology, physiology, behavior and ecological adaptations, and providing an exceptional opportunity to study basic vertebrate biology.

Fishes are also commercially important as a food source, with trade totaling about US$51 billion per year, increasingly driven by the aquaculture industry [33]. Some 16% of all protein consumed is fish protein, and about one billion people depend on fishes as their major source of protein. Because of the great demand, many groups of fishes are over-exploited. Molecular data for commercially important species of fishes, especially those that are currently endangered and those raised by aquaculture, will be valuable in designing strategies for maintaining sustainable stocks and combating disease and other threats.

Fish tissues for the Genome 10K project reside in a number of institutions and are usually curated as parts of formal institutional collections. The total number of species represented by tissue samples is not known, but 6,400 species have been DNA-barcoded and collections of new species continue to be added. Fresh material from many commonly available species can be obtained easily from fishing boats and the pet-trade industry for both genome and other molecular projects. The Genome 10K project has in hand suitable samples from 60/62 orders (97%), 382/492 families (78%), 1,270 of about 3,006 genera (42%) and 2,667 of about 11,430 species (23%) (Appendix 2, fish).

The largest known animal genome is that of the marbled lungfish, Protopterus aethiopicus, with a haploid size of 133 pg (about 130 Gb), followed by the salamanders Necturus lewisi and Necturus punctatus at 120 pg (about 117 Gb) [10]. These genomes are bloated by transposon activity; the resulting repeats, combined with the genomes' enormous size, make sequencing and assembly extremely challenging. While RNA-sequencing is one avenue by which we may get direct access to interesting biology in these species, we recommend that full genome sequencing projects for selected large-genome species be undertaken nonetheless. There are important questions pertaining to gene regulation, genome structure and genome evolution that cannot be answered from analysis of transcribed RNA alone.
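The picogram and gigabase figures quoted above are related by a simple scaling. The sketch below assumes the commonly used conversion of roughly 0.978 × 10^9 base pairs per picogram of DNA; that factor is not stated in the text and is introduced here only for illustration, as are the function and variable names.

```python
# Assumed conversion factor (not stated in the text): 1 pg of DNA
# corresponds to roughly 0.978e9 base pairs, i.e. about 0.978 Gb.
GIGABASES_PER_PICOGRAM = 0.978


def picograms_to_gigabases(mass_pg: float) -> float:
    """Convert a haploid genome mass in picograms to an approximate size in gigabases."""
    return mass_pg * GIGABASES_PER_PICOGRAM


# Haploid genome masses in picograms, as quoted in the text.
haploid_sizes_pg = {
    "Protopterus aethiopicus": 133,
    "Necturus lewisi": 120,
    "Necturus punctatus": 120,
}

# Approximate genome sizes in gigabases, rounded to the nearest whole Gb.
sizes_gb = {
    species: round(picograms_to_gigabases(mass_pg))
    for species, mass_pg in haploid_sizes_pg.items()
}

for species, size in sizes_gb.items():
    print(f"{species}: about {size} Gb")
```

With this assumed factor, the rounded results agree with the approximate gigabase values given in the text for both the lungfish and the two salamanders.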


Careful observations of the morphological and functional adaptations in vertebrates have formed the basis of biological studies for a millennium, but it is only very recently that we have been able to observe the action of evolution directly at the genetic level. It is not known whether convergent adaptations in independent lineages are often governed by analogous changes in a small number of orthologous genome loci or if macro-evolutionary events in separate lineages usually result from entirely idiosyncratic combinations of mutations. Evidence from several recent studies points toward the former hypothesis (still need to get the best cites). For example, adaptive hind limb reduction occurred independently many times in different lineages and even within the same species, as when sticklebacks in different lakes adapted from an oceanic to a freshwater environment [34]. These stickleback adaptations are all traced to independent deletions of the same distal enhancer of the PITX1 developmental gene, demonstrating remarkable convergent evolution at the genomic level [35]. By cataloging the footprints of adaptive evolution in every genomic locus on every vertebrate lineage, the Genome 10K project will provide the power to thoroughly test the “same adaptation, same loci” hypothesis, along with other fundamental questions about molecular adaptive mechanisms.

In the course of this investigation, we will discover the genetic loci governing fundamental vertebrate processes. The study of the evolution of viviparity is an outstanding example. Birds, crocodiles and turtles all lay eggs, whereas mammals, apart from monotremes, are all live-bearers. Thus, we find one fundamental transition from oviparity to viviparity in these amniotes, which caused a fundamental reorganization in the developmental program and large-scale changes in gene interactions that we are only just beginning to understand.

Remarkably, however, reptiles have an estimated 108 independent origins of viviparity. Fish have an equally spectacular variety of such transitions, along with some amphibians, such as the frog genus Gastrotheca, which includes species with placental-like structures. These many independent instances of the evolution of viviparity afford an extraordinary opportunity to explore the genomics behind this reproductive strategy. The architecture of sex determination in vertebrates is similarly diverse, with examples of XY, ZW and temperature-sensitive determination methods. This provides an equally exciting opportunity. In fact, many vertebrate species have abandoned sex altogether. What happens when an asexual genome descends from an ancestral sexual genome, as has occurred repeatedly in Cnemidophorus lizard lineages? Are the independent parthenogenetic genomes parallel in any way? In one group of lizards, the formation of unisexual species is phylogenetically constrained, yet in others it is not. Many species of lizards and snakes are also known to have facultative parthenogenesis: unmated females produce viable eggs and offspring. Unisexuality also occurs in amphibians and fish by gynogenesis and hybridogenesis, and in amphibians also by kleptogenesis. Do these parallel convergent changes involve the same genes? By identifying the genomic loci that support different evolutionary innovations, such analyses will drive fundamental progress in molecular and developmental biology, as have discoveries of the molecular basis for human diseases.

The symphony of vertebrate species that cohabit on our planet attests to underlying life processes with remarkable potential. Genomics reveals a unity behind these life processes that is unrivaled by any other avenue of investigation, exposing the undeniable molecular relatedness and common origin of all vertebrate species. By revealing genetic vulnerabilities in endangered species and tracking host-pathogen interactions, genomics plays an increasing role in sustaining biodiversity and tracking emerging infectious diseases. Access to the information contained in the genomes of threatened and endangered species is an important component of the Genome 10K project and provides crucial information for comparative vertebrate genomics, while generating information that may assist conservation efforts [36] [37]. In studying the genomes of recently extinct species as well, molecular aspects of species vulnerability are revealed and vital gaps in the vertebrate record restored. In all these ways, the Genome 10K project will re-engage the public in the quest for the scientific basis of animal diversity and in the application of the knowledge we gain to improve animal health and species conservation.

As Gutenberg altered the course of human history with the printing of the first book, so did the Human Genome Project forever change the course of the life sciences with the publication of the first full vertebrate genome sequence. When Gutenberg’s success was followed by the publication of other books, libraries naturally emerged to hold the fruits of this new technology for the benefit of all who sought to imbibe the vast knowledge made available by the new print medium. We must now follow the Human Genome Project with a library of vertebrate genomes; a genomic ark for thriving and threatened species alike; and a permanent digital record of countless molecular triumphs and stumbles across some 500 million years of evolutionary episodes that forged the “endless forms most beautiful” on our living world.