Data from: Identification and qualification of 500 nuclear, single-copy, orthologous genes for the Eupulmonata (Gastropoda) using transcriptome sequencing and exon capture
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.fn627
Description:
The qualification of orthology is a significant challenge when developing
large, multiloci phylogenetic data sets from assembled transcripts.
Transcriptome assemblies have various attributes, such as fragmentation,
frameshifts and mis-indexing, which pose problems to automated methods of
orthology assessment. Here, we identify a set of orthologous single-copy
genes from transcriptome assemblies for the land snails and slugs
(Eupulmonata) using a thorough approach to orthology determination
involving manual alignment curation, gene tree assessment and sequencing
from genomic DNA. We qualified the orthology of 500 nuclear,
protein-coding genes from the transcriptome assemblies of 21 eupulmonate
species to produce the most complete phylogenetic data matrix for a major
molluscan lineage to date, both in terms of taxon and character
completeness. Exon capture targeting 490 of the 500 genes (those with at
least one exon >120 bp) from 22 species of Australian Camaenidae
successfully captured sequences of 2825 exons (representing all targeted
genes), with only a 3.7% reduction in the data matrix due to the presence
of putative paralogs or pseudogenes. The automated pipeline Agalma
retrieved the majority of the manually qualified 500 single-copy gene set
and identified a further 375 putative single-copy genes, although it
failed to account for fragmented transcripts, resulting in lower data
matrix completeness when considering the original 500 genes. This could
potentially explain the minor inconsistencies we observed in the supported
topologies for the 21 eupulmonate species between the manually curated and
‘Agalma-equivalent’ data sets (sharing 458 genes). Overall, our study
confirms the utility of the 500 gene set to resolve phylogenetic
relationships at a range of evolutionary depths and highlights the
importance of addressing fragmentation at the homolog alignment stage for
probe design.
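
The description compares data sets by their taxon and character completeness. As a rough illustration only, the sketch below computes both for a toy matrix. The definitions are assumptions, not the authors' method: taxon completeness is taken as the fraction of taxon-by-gene cells with a sequence, and character completeness as the fraction of alignment positions that are not a gap, `?` or `N`. The function names, the dict layout and the toy sequences are all hypothetical.

```python
# Characters counted as missing data (an assumption for this sketch).
MISSING = set("-?Nn")


def taxon_completeness(matrix, genes):
    """Fraction of taxon-by-gene cells that contain a sequence."""
    total = len(matrix) * len(genes)
    present = sum(1 for seqs in matrix.values() for g in genes if seqs.get(g))
    return present / total if total else 0.0


def character_completeness(matrix, genes):
    """Fraction of alignment positions filled with a non-missing character.

    A taxon lacking a gene counts as fully missing over that gene's
    alignment length (the longest sequence present for the gene).
    """
    filled = 0
    total = 0
    for g in genes:
        lengths = [len(seqs[g]) for seqs in matrix.values() if seqs.get(g)]
        if not lengths:
            continue  # no taxon has this gene; it adds no positions
        aln_len = max(lengths)
        for seqs in matrix.values():
            seq = seqs.get(g) or ""
            total += aln_len
            filled += sum(1 for c in seq if c not in MISSING)
    return filled / total if total else 0.0


# Toy data: taxon -> gene -> aligned sequence (hypothetical values).
toy_genes = ["gene1", "gene2"]
toy_matrix = {
    "taxon_A": {"gene1": "ATGC", "gene2": "AT--"},
    "taxon_B": {"gene1": "ATGC"},  # gene2 was not recovered for this taxon
}

print(taxon_completeness(toy_matrix, toy_genes))
print(character_completeness(toy_matrix, toy_genes))
```

Under these definitions, a fragmented transcript lowers character completeness even when the gene counts as present for the taxon, which is the kind of effect the description attributes to fragmentation.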
Provider:
Dryad
Created:
2016-05-24



