
alignment1-combined-extant-56mor

DataCite Commons. Updated 2025-02-02. Indexed 2025-04-16.
Download link:
https://www.scidb.cn/en/detail?dataSetId=61d7df11786d4e31b5e4be25d6edc305
Description:
This is a combined morphological and mitochondrial gene dataset for all extant Erinaceidae species (26 species) and three outgroup species, provided as an alignment in NEXUS (.nex) format. Sample numbers, sources and other information are provided in the table. Twenty-nine specimens (marked in yellow) were collected from the field, the National Museum of Natural History (USNM), and the Field Museum of Natural History for mitochondrial DNA extraction, sequencing, assembly, annotation, and alignment. Three samples (marked in gray) have morphological data only, with no mitochondrial gene data. The remaining 24 sample sequences were downloaded from GenBank. The morphological character dataset for all samples follows Gould (1995, 1997, 2001) and He et al. (2012).

DNA extraction, mitogenome enrichment and sequencing

Total DNA was extracted using 10% sodium dodecyl sulphate (SDS) with proteinase K, and purified using the phenol/chloroform approach (Sambrook and Russell, 2001). For DNA extracted from fresh tissue samples, we sheared the DNA into small fragments using NEBNext dsDNA Fragmentase (New England Biolabs, Canada). This step was skipped for DNA extracted from museum samples. We constructed barcoded DNA libraries using a NEBNext Fast DNA Library Prep Set (New England Biolabs, Canada) with barcode adapters (NEXTflex DNA Barcodes, BIOO Scientific, USA). We purified the libraries using magnetic beads, size-selected them using a 2% E-Gel on an E-Gel electrophoresis system (Thermo Fisher), and re-amplified them using a NEBNext High-Fidelity 2X PCR Master Mix (New England Biolabs). We enriched mitochondrial genes using a capture-hybridization protocol following He et al. (2018). In brief, we first prepared biotin-linked DNA probes using long-range PCR amplicons. We hybridized our DNA libraries with the probes at 65 °C for 24-48 hours. We reamplified the enriched libraries, then purified them and measured their concentration using a Qubit Fluorometric Quantitation system (Thermo Fisher Scientific, Canada). We ran the sequencing using a 316 chip on an Ion Torrent Personal Genome Machine (PGM).

Assembly, annotation and alignment

We used Trimmomatic v0.36 (Bolger et al., 2014) to remove adapters and BFC to correct sequencing errors (Li, 2015). The mitochondrial genome was then assembled using SPAdes v3.11 (Bankevich et al., 2012) and Geneious R11 (Olsen & Qaadri, 2014). The assemblies produced by the two procedures were then aligned using MAFFT, and the regions where they disagreed were carefully checked by eye. MITOS was used to annotate the mitochondrial genome (Bernt et al., 2013).

For mitochondrial genome comparison, we first downloaded all available erinaceid mitochondrial genomes from GenBank and selected Solenodon paradoxus, Crocidura russula, and Uropsilus gracilis as outgroups. Subsequently, two ribosomal RNA genes and 13 protein-coding genes were aligned using MAFFT v.7 in Geneious R11 (Katoh & Standley, 2013). Protein-coding genes (PCGs) were carefully inspected visually and manually corrected for premature stop codons.

Recombination detection

Because mammalian mitochondria are maternally inherited, a "recombination" event implies that one of the two sequence segments may be contaminated. We used RDP v.5 for recombination detection (Martin et al., 2021). RDP detects potential recombination sites (breakpoints) and estimates UPGMA trees from the two fragments on either side of the breakpoint (that is, the parental sequences). If the UPGMA trees indicated a recombination event, we masked the affected positions in the sequence of that sample (Table S2).

References

Bankevich, A., Nurk, S., Antipov, D., Gurevich, A. A., Dvorkin, M., Kulikov, A. S., Lesin, V. M., Nikolenko, S. I., Pham, S., Prjibelski, A. D., Pyshkin, A. V., Sirotkin, A. V., Vyahhi, N., Tesler, G., Alekseyev, M. A., & Pevzner, P. A. (2012). SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. Journal of Computational Biology, 19(5), 455–477. https://doi.org/10.1089/cmb.2012.0021
Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics, 30(15), 2114–2120. https://doi.org/10.1093/bioinformatics/btu170
Bernt, M., Donath, A., Jühling, F., Externbrink, F., Florentz, C., Fritzsch, G., Pütz, J., Middendorf, M., & Stadler, P. F. (2013). MITOS: Improved de novo metazoan mitochondrial genome annotation. Molecular Phylogenetics and Evolution, 69(2), 313–319. https://doi.org/10.1016/j.ympev.2012.08.023
Gould, G. C. (1995). Hedgehog Phylogeny (Mammalia, Erinaceidae) - the Reciprocal Illumination of the Quick and the Dead. American Museum Novitates, 1–45.
Gould, G. C. (1997). Systematic revision of the Erinaceidae (Mammalia) - a comprehensive phylogeny based on the morphology of all known taxa. Columbia University, New York.
Gould, G. C. (2001). The phylogenetic resolving power of discrete dental morphology among extant hedgehogs and the implications for their fossil record. American Museum Novitates, 3340, 1–52.
He, K., Chen, J. H., Gould, G. C., Yamaguchi, N., Ai, H. Sen, Wang, Y. X., Zhang, Y. P., & Jiang, X. L. (2012). An estimation of Erinaceidae phylogeny: A combined analysis approach. PLoS ONE, 7(6). https://doi.org/10.1371/journal.pone.0039304
He, K., Chen, X., Chen, P., He, S., Cheng, F., Jiang, X., & Campbell, K. L. (2018). A new genus of Asiatic short-tailed shrew (Soricidae, Eulipotyphla) based on molecular and morphological comparisons. Zoological Research, 39(5), 309–323. https://doi.org/10.24272/j.issn.2095-8137.2018.058
Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Molecular Biology and Evolution, 30(4), 772–780. https://doi.org/10.1093/molbev/mst010
Li, H. (2015). BFC: Correcting Illumina sequencing errors. Bioinformatics, 31(17), 2885–2887. https://doi.org/10.1093/bioinformatics/btv290
Martin, D. P., Varsani, A., Roumagnac, P., Botha, G., Maslamoney, S., Schwab, T., Kelz, Z., Kumar, V., & Murrell, B. (2021). RDP5: A computer program for analyzing recombination in, and removing signals of recombination from, nucleotide sequence datasets. Virus Evolution, 7(1). https://doi.org/10.1093/ve/veaa087
Olsen, C., & Qaadri, K. (2014). Geneious R7: A Bioinformatics Platform for Biologists. Plant and Animal Genome XXII Conference, 22(21), 9490. https://pag.confex.com/pag/xxii/webprogram/Paper12272.html
Sambrook, J., & Russell, D. (2001). Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press.
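
Note on the premature stop codon check

The description says the protein-coding gene alignments were inspected visually and corrected by hand for premature stop codons. For users who want to re-check the alignment themselves, a minimal sketch of an automated screen is given below. It is not the authors' procedure. It assumes each gene alignment starts at codon position 1, and it uses the stop codons of the vertebrate mitochondrial genetic code (NCBI translation table 2: TAA, TAG, AGA, AGG). The function name and the toy sequences are hypothetical.

```python
# Stop codons under the vertebrate mitochondrial genetic code
# (NCBI translation table 2). Note that TGA encodes Trp here, not stop.
VERT_MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def find_premature_stops(aligned_seq, stops=VERT_MITO_STOPS):
    """Return 1-based codon indices of in-frame stops before the final codon.

    Assumption: the sequence is in frame (its first base is codon position 1).
    Codons containing gaps or ambiguity codes never match a stop codon,
    so they are skipped rather than guessed.
    """
    seq = aligned_seq.upper()
    n_codons = len(seq) // 3
    hits = []
    # The last codon is excluded because a terminal stop is legitimate.
    for i in range(n_codons - 1):
        codon = seq[3 * i:3 * i + 3]
        if codon in stops:
            hits.append(i + 1)
    return hits


# Toy in-frame sequences (hypothetical, not taken from the dataset).
toy_alignment = {
    "sample_A": "ATGACCTAAGGATAA",  # internal TAA at codon 3
    "sample_B": "ATGACC---GGATAA",  # gap codon, no internal stop
}

for name, seq in toy_alignment.items():
    print(name, find_premature_stops(seq))
```

Any codon index reported by such a screen would still need to be checked by eye against the reads, since an internal stop can reflect an alignment frame shift as well as a sequencing error.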
Provider:
Science Data Bank
Created:
2022-10-25