Genome 10K: A Proposition to Obtain Whole Genome Sequence for 10,000 Vertebrate Species



To this end, we propose to assemble a “virtual collection” of frozen or otherwise suitably preserved tissues or DNA samples representing on the order of 10,000 extant vertebrate species, as well as some recently extinct species that are amenable to genomic sequencing (Table 1). This collection represents combined specimen materials from at least 15 participating institutions (Table 2). In many cases, we have collected both male and female samples and, for certain species, several samples reflecting geographic diversity and/or diversity within localized populations.

Tissues in genetic resources collections have been stored using different methods, which yield varying results with regard to DNA quality [18]. Many tissues sampled in the field may be left at ambient temperature for several hours before they are finally frozen in liquid nitrogen and subsequently stored there or at -80°C. Nonetheless, many of these tissues still yield high-quality DNA [19]. In other cases, non-cryogenic field buffers [20] are used, although with varying results. In addition to DNA quality, permits and species validation are also important issues (Appendix 1). We follow four general guidelines for Genome 10K sample collection.

  • We seek 10 μg of genomic DNA or about 0.5 g of frozen tissue for each target species.

  • Tissues may be field preserved in liquid nitrogen, ethanol, or DMSO. Initial preservation in liquid nitrogen is strongly recommended for new acquisitions when field conditions permit. Transport in liquid nitrogen or on dry ice is encouraged. Tissues should be stored in -80°C freezers or in liquid nitrogen.

  • Tissues should be documented with voucher specimens when feasible. Preserved carcasses are preferred; photo vouchers are acceptable, and DNA barcode information will be collected for all samples (DNA barcode reference). In the case of rare or endangered species, a tissue sample, locality, identification by a competent biologist, and DNA barcode confirmation or listing in the International Species Information System are nominally acceptable.

  • All specimens used in the G10K project will be obtained and transported in accordance with local and international statutes regulating the collection, use, and transport of biological specimens.
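
The four guidelines above can be read as a simple checklist. The following is a minimal sketch of such a check, not a procedure described by the project: the `Sample` record, its field names, and the `guideline_issues` helper are hypothetical, and the numeric thresholds are simply the 10 μg and "about 0.5 g" figures quoted above, treated here as hard cut-offs for illustration.

```python
from dataclasses import dataclass
from typing import List

# Thresholds and options taken from the four guidelines above.
MIN_DNA_UG = 10.0      # 10 ug of genomic DNA
MIN_TISSUE_G = 0.5     # "about 0.5 g" of frozen tissue (treated as a hard limit here)
FIELD_PRESERVATIVES = {"liquid nitrogen", "ethanol", "DMSO"}
STORAGE_OPTIONS = {"-80C freezer", "liquid nitrogen"}


@dataclass
class Sample:
    """Hypothetical record for one candidate specimen (field names are assumptions)."""
    species: str
    dna_ug: float = 0.0
    tissue_g: float = 0.0
    preservative: str = ""
    storage: str = ""
    has_voucher: bool = False
    permits_ok: bool = False


def guideline_issues(sample: Sample) -> List[str]:
    """Return the guidelines a sample does not meet (empty list if none)."""
    issues = []
    # Guideline 1: enough DNA *or* enough tissue.
    if sample.dna_ug < MIN_DNA_UG and sample.tissue_g < MIN_TISSUE_G:
        issues.append("less than 10 ug DNA and less than 0.5 g tissue")
    # Guideline 2: accepted field preservative and long-term storage.
    if sample.preservative not in FIELD_PRESERVATIVES:
        issues.append("field preservative is not liquid nitrogen, ethanol, or DMSO")
    if sample.storage not in STORAGE_OPTIONS:
        issues.append("not stored at -80C or in liquid nitrogen")
    # Guideline 3: voucher documentation (requested when feasible).
    if not sample.has_voucher:
        issues.append("no voucher recorded")
    # Guideline 4: legal collection and transport.
    if not sample.permits_ok:
        issues.append("permits not confirmed")
    return issues


example = Sample(species="Gallus gallus", dna_ug=12.0, preservative="ethanol",
                 storage="liquid nitrogen", has_voucher=True, permits_ok=True)
print(guideline_issues(example))
```

A real intake process would need softer rules than this sketch, for example the separate documentation path the guidelines allow for rare or endangered species.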

In addition to samples for DNA extraction, we have included 1,006 cryopreserved fibroblast cell lines derived from 602 different vertebrate species. These are primarily mammals but include representatives of 300 taxa comprising 42 families of non-mammalian amniotes and one amphibian species. These resources provide an additional window into the unique cell biology of these species. With the recent development of transformation techniques to create induced pluripotent stem (iPS) cells from fibroblast lines [21, 22] [23] [24], the potential of cell line studies is greatly expanded. While it is still unclear how well current cell line generation methods can be extended to all vertebrate clades [25] [26], we propose to initiate primary fibroblast cell cultures for as many species as possible, with a target of at least 2,000 diverse species, as a corollary to the Genome 10K project. These cell cultures, along with cDNA derived from primary tissues, will provide direct access to gene expression and regulation data in the vertebrate species we catalog, and will provide a renewable experimental resource to complement the Genome 10K genome sequences. For at least one species of each vertebrate order, we propose to assemble additional genomic resources, including physical maps and a BAC library, cell lines, and primary tissues for transcriptome analysis. For these species, we will also sequence multiple individuals to assess within-species diversity, including members of both sexes to assess sex chromosome differences. A resource of this magnitude would help catalyze a much-needed extension of experimental molecular biology beyond the very limited set of model organisms it currently explores.

The Genome 10K species collection will include tissue/DNA specimens from five major taxonomic groups: mammals, birds, amphibians, reptiles, and fishes (Table 1).


Mammals comprise a morphologically and behaviorally diverse assemblage of approximately 5,400 species distributed in three major lineages: monotremes (platypus and echidnas; 5 species), marsupials (~330 species, including the koala, kangaroos, opossums, etc.), and the speciose eutherian, or placental, mammals (~5,000 species).

The Genome 10K collections contain exemplars of all 139 families and we have access to ~90% of non-muroid and non-sciurid rodent genera and non-vespertilionid bat genera (Appendix 2, mammals). Ultimately, we will target all 1,200-1,300 genera.

Additional sampling will be applied to deeply divergent, and especially endangered, or EDGE (Evolutionarily Distinct and Globally Endangered) species, including all Zaglossus (echidna) species, the Cuban and Hispaniolan solenodons, the saddle-backed tapir, the aardvark, etc. For fundamental biological investigation, it is a high priority to sequence “extremophiles”, such as deep-sea divers and high-elevation species, along with aquatic versus terrestrial taxa, species spanning the range of brain size and body size, cases of morphological convergence, gliders, lifespan extremes, nocturnal versus diurnal species, species with distinct sensory modalities such as echolocation, and social versus solitary species.

Capturing wide ecological diversity holds great potential for identifying the genomic changes underlying the major mammalian anatomical and behavioral transformations. We project coverage of approximately 3,000 of the 5,400 living mammal species.


Like eutherian mammals, modern birds (Neornithes) arose in the Late Cretaceous and diversified soon after the K-T extinction event, 65 million years ago. Since then, birds have dispersed across the globe and now occupy a panoply of Earth's habitats and diverse ecosystems, representing a wide array of functional lifestyles. At this time, we know very little about the genetic and developmental underpinnings of this biological diversity, having high-quality genome sequences of only two species, the chicken (Gallus gallus) and zebra finch (Taeniopygia guttata). We expect that many key questions can and will be addressed as additional whole genome sequences are accumulated and interpreted in the context of an increasingly accurate comparative framework (e.g., Hackett et al. 2008).

During recent decades, the avian systematic community has built large frozen tissue collections that house high-quality genetic samples of a substantial portion of avian diversity. These collections provide an essential resource for future genomic analyses of avian structural, functional, and behavioral diversity. With representation from 15 natural history collections distributed globally, the Genome 10K collection includes specimens from 100% of the 27 orders, 87% of the 205 families, 71% of the 2,172 genera, and 50% of the 9,723 species of birds (Appendix 2, birds). Every order is represented in multiple collections, as are all but 5 families and all but 104 genera, which is important for ensuring at least one sample of high quality. We plan to sequence both sexes for a number of lineages, especially the ratite birds, many of which have apparently monomorphic or near-monomorphic sex chromosomes. Sampling each genus may result in oversampling of some avian orders and families (such as the extremely diverse passerines and hummingbirds), but we will strive to capture maximal phylogenetic coverage across the avian tree.
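
The coverage percentages quoted above can be turned into approximate specimen counts per taxonomic rank. The short sketch below does only that arithmetic. The totals and percentages are the ones given in the paragraph; the resulting counts are estimates, because the percentages are rounded, and are not figures reported by the project. The names `coverage` and `approx_count` are illustrative.

```python
# Totals and coverage percentages as quoted in the paragraph above:
# rank -> (number of bird taxa at that rank, percent represented in the collection)
coverage = {
    "orders": (27, 100),
    "families": (205, 87),
    "genera": (2172, 71),
    "species": (9723, 50),
}


def approx_count(total: int, percent: int) -> int:
    """Estimate how many taxa a rounded percentage corresponds to."""
    return round(total * percent / 100)


# Estimated number of taxa represented at each rank.
approx = {rank: approx_count(total, pct) for rank, (total, pct) in coverage.items()}

for rank, n in approx.items():
    total = coverage[rank][0]
    print(f"{rank}: roughly {n} of {total} represented, about {total - n} not yet sampled")
```

On these numbers, the largest remaining gap in absolute terms is at the species level, where roughly half of the 9,723 species are not yet represented.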
