Assembled exon data for Gobioid phylogenetic study
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.s1rn8pkb8
Description:
This dataset contains sequence data for hundreds of protein-coding loci used to build a high-resolution phylogeny of Gobiidae that confidently resolved the interrelationships of the major lineages within the family. It is the combined gobioid dataset: the assembled coding sequences of the exons, trimmed of adaptors, in FASTA format. It does not include the flanking sequences of the exons. These unaligned data were further processed and filtered at several levels of stringency before being used to infer phylogenies.
Methods
These data were obtained using the exon-capture method of Li et al. (2013). RNA baits were designed with EvolMarkers to target approximately 12,000 exons present in eight model fish genomes. The samples used for gene capture came from collections in the Fish Systematics and Conservation Lab at Texas A&M University – Corpus Christi or from collaborators at other institutions. The dataset includes 170 samples representing 158 species and 130 genera. Samples were sequenced on an Illumina HiSeq 2500 or an Illumina HiSeq 4000 using paired-end 150 bp sequencing. The Assexon pipeline (Yuan et al., 2019) was used to assemble the raw gene-capture data into coding regions without the flanking regions. The assembled sequence files are in FASTA format and are named according to the corresponding gene in the bait set.
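As an illustration of how the per-gene FASTA files might be read, here is a minimal Python sketch. It assumes that each record header identifies a sample, which this description does not confirm; the sample names and sequences below are invented, and `parse_fasta` is a hypothetical helper, not part of the Assexon pipeline.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return {header: sequence} from FASTA-formatted lines.

    Assumption: the first whitespace-delimited token of each header
    is the sample identifier.
    """
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()[0]
            records[header] = ""
        elif header is not None:
            # sequences may wrap over several lines; join them
            records[header] += line.upper()
    return records


# Invented toy input standing in for one gene file from the dataset.
example = """>Sample_A
ATGGCC
TTGA
>Sample_B
ATGGCA
""".splitlines()

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

For real files, pass an open file handle instead of the toy list, for example `parse_fasta(open(path))`, where `path` points at one of the gene files.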
Two files, "concat.goby_50pct_JE2.nex.fas" and "concat.goby_75pct_JE2.nex.fas", were processed further: the taxon set was reduced to the 102 samples used in the final phylogenies for the submitted manuscript. These are the aligned sequence files from which the phylogenies were built. "concat.goby_50pct_JE2.nex.fas" contains only genes present in at least 50% of the samples, whereas "concat.goby_75pct_JE2.nex.fas" contains only genes present in at least 75% of the samples.
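The 50% and 75% thresholds amount to an occupancy filter over genes. The following Python sketch shows that idea on invented data. It is not the authors' filtering code; the function name `filter_by_occupancy`, the gene names, and the sample names are hypothetical.

```python
from typing import Dict, List, Set


def filter_by_occupancy(
    gene_to_samples: Dict[str, Set[str]],
    n_samples: int,
    min_fraction: float,
) -> List[str]:
    """Return the genes present in at least min_fraction of the samples."""
    kept = []
    for gene, samples in sorted(gene_to_samples.items()):
        if len(samples) / n_samples >= min_fraction:
            kept.append(gene)
    return kept


# Invented toy data: four samples, four genes with decreasing occupancy.
toy = {
    "geneA": {"s1", "s2", "s3", "s4"},  # 100% of samples
    "geneB": {"s1", "s2", "s3"},        # 75%
    "geneC": {"s1", "s2"},              # 50%
    "geneD": {"s1"},                    # 25%
}

genes_50 = filter_by_occupancy(toy, n_samples=4, min_fraction=0.50)
genes_75 = filter_by_occupancy(toy, n_samples=4, min_fraction=0.75)
print(genes_50)
print(genes_75)
```

With the dataset's 102 retained samples, `n_samples` would be 102, so the 75% threshold keeps a gene only if at least 77 samples carry it (0.75 × 102 = 76.5).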
Created:
2025-07-22



