Full-length transcriptomes of 25 grassland plant species

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.z08kprrpv


Description:

Grasslands are essential, biodiverse ecosystems of economic importance that play a critical role in carbon storage and soil health. Despite their ecological and economic importance, transcriptomic resources for wild grassland species that would facilitate eco-evolutionary and functional genomic studies remain limited. In this study, we present full-length transcriptomes for shoot tissue of natural accessions of 25 wild grassland plant species collected from the field site of a long-term grassland biodiversity experiment (Jena Experiment). Using PacBio Iso-Seq technology, we generated a total of 522.45 million subreads, which were assembled into isoforms for each species separately. This resulted in an average of 49,180 isoforms per species, of which 68.6% were successfully annotated against the Swiss-Prot database. Fifty-six percent of the transcripts had complete open reading frames (ORFs), and 29.6% of the transcripts were identified as non-coding RNAs (ncRNAs) by two prediction tools. This dataset provides a valuable full-length transcriptomic resource for exploring gene expression, alternative splicing, and evolutionary patterns in wild grassland plant species, paving the way for future functional genomics and conservation studies.

Methods

Subreads from the PacBio Sequel II platform were processed using the PacBio Iso-Seq pipeline. Circular Consensus Sequences (CCS) were generated from subreads using the ccs tool (version 6.4.0) with default parameters, which builds high-fidelity consensus reads from multiple passes of each molecule. These CCS reads were further processed with lima (version 2.9.0, using the --isoseq option) to remove sequencing adapters and barcodes. Poly(A) tails and artificial concatemers were removed using the isoseq tool (version 4.0.0), yielding Full-Length Non-Chimeric (FLNC) reads. The FLNC reads were then clustered using isoseq cluster to generate polished isoforms, with the --singletons option enabled to retain singleton transcripts.
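The four processing steps described above can be laid out as a command sequence. The following is only a minimal sketch, not the authors' actual script: it builds the argument lists without running anything. All file names and the function name build_isoseq_commands are hypothetical. The use of the refine subcommand with --require-polya for the poly(A) and concatemer step is an assumption based on the standard Iso-Seq workflow; the dataset description names only "the isoseq tool" for that step.

```python
import shlex
from pathlib import Path


def build_isoseq_commands(subreads_bam, primers_fasta, out_dir, lima_output_bam=None):
    """Return the four pipeline steps as argument lists (nothing is executed).

    All paths are placeholders chosen for illustration.
    """
    out = Path(out_dir)
    ccs_bam = out / "sample.ccs.bam"
    lima_bam = out / "sample.fl.bam"
    # lima names its output files after the detected primer pair, so the
    # caller may need to pass the real file name once lima has run.
    refine_input = Path(lima_output_bam) if lima_output_bam else lima_bam
    flnc_bam = out / "sample.flnc.bam"
    clustered_bam = out / "sample.clustered.bam"

    return [
        # Step 1: circular consensus calling with default parameters.
        ["ccs", str(subreads_bam), str(ccs_bam)],
        # Step 2: adapter and barcode removal in Iso-Seq mode.
        ["lima", str(ccs_bam), str(primers_fasta), str(lima_bam), "--isoseq"],
        # Step 3 (assumed subcommand and flag): poly(A) trimming and
        # concatemer removal, yielding FLNC reads.
        ["isoseq", "refine", "--require-polya",
         str(refine_input), str(primers_fasta), str(flnc_bam)],
        # Step 4: clustering into isoforms, keeping singleton transcripts.
        ["isoseq", "cluster", str(flnc_bam), str(clustered_bam), "--singletons"],
    ]


if __name__ == "__main__":
    for cmd in build_isoseq_commands("sample.subreads.bam", "primers.fasta", "out"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them makes it easy to check the plan for one species before running it; each list could then be passed to subprocess.run(cmd, check=True) once the tool versions and file names have been confirmed.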

Created:

2025-04-17
