
Full-length transcriptomes of 25 grassland plant species

Source: DataONE. Updated 2025-04-17. Indexed 2025-05-10.
Download link:
https://search.dataone.org/view/sha256:baa71b29423c611ec5212c40d515a9cc67862befe7e677003c00ed79cfd59d43
Description:
Grasslands are essential, biodiverse ecosystems of economic importance that play a critical role in carbon storage and soil health. Despite this ecological and economic importance, transcriptomic resources for wild grassland species that would facilitate eco-evolutionary and functional genomic studies remain limited. In this study, we present full-length transcriptomes for shoot tissue of natural accessions of 25 wild grassland plant species collected from the field site of a long-term grassland biodiversity experiment (Jena Experiment). Using PacBio Iso-Seq technology, we generated a total of 522.45 million subreads, which were assembled into isoforms for each species separately. This resulted in an average of 49,180 isoforms per species, of which 68.6% were successfully annotated against the Swiss-Prot database. Fifty-six percent of the transcripts had complete open reading frames (ORFs), and 29.6% of the transcripts were identified as non-coding RNAs (ncRNAs) by two prediction tools. Thi...

Subreads from the PacBio Sequel II platform were processed using the PacBio Iso-Seq pipeline. Circular Consensus Sequences (CCS) were generated from the subreads using the ccs tool (version 6.4.0) with default parameters, which produces high-fidelity full-length reads from multiple passes of each molecule. These CCS reads were further processed with lima (version 2.9.0, using the --isoseq option) to remove sequencing adapters and barcodes. Poly(A) tails and artificial concatemers were removed using the isoseq tool (version 4.0.0), yielding Full-Length Non-Chimeric (FLNC) reads. The FLNC reads were then clustered using isoseq cluster to generate polished isoforms, with the --singletons option enabled to retain singleton transcripts.

# Full-length transcriptomes of 25 grassland plant species

[https://doi.org/10.5061/dryad.z08kprrpv](https://doi.org/10.5061/dryad.z08kprrpv)

## Description of the data and file structure

This dataset contains 25 FASTA files, each representing the final assembled IsoSeq data for a grassland plant species. The respective file names are listed below:

| Species | Assembly |
| :------------------------ | :------------------------------------------------- |
| Lotus corniculatus | BMK230426-BJ202-01P0001.flnc.clustered.isoforms.fa |
| Medicago variegata | BMK230426-BJ202-01P0002.flnc.clustered.isoforms.fa |
| Trisetum flavescens | BMK230426-BJ202-01P0003.flnc.clustered.isoforms.fa |
| Crepis biennis | BMK230426-BJ202-01P0004.flnc.clustered.isoforms.fa |
| Geranium pratense | BMK230426-BJ202-01P0005.flnc.clustered.isoforms.fa |
| Alopecurus pratensis | BMK230426-BJ202-01P0006.flnc.clustered... |
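
Because each species is delivered as a single FASTA file of clustered isoforms, a quick per-file summary is a natural first check after download. The following is a minimal sketch, not part of the dataset, that counts isoforms and reports basic length statistics using only the Python standard library. The function names `read_fasta_lengths` and `summarize` are hypothetical, and the sketch assumes the files are uncompressed plain-text FASTA in the current directory.

```python
from pathlib import Path
from statistics import mean


def read_fasta_lengths(path):
    """Return a list with the sequence length of every record in a FASTA file."""
    lengths = []
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                lengths.append(0)  # a header line starts a new record
            elif lengths:
                lengths[-1] += len(line)  # sequences may wrap over several lines
    return lengths


def summarize(path):
    """Summarize one isoform FASTA file: record count, total bases, mean length, N50."""
    lengths = sorted(read_fasta_lengths(path), reverse=True)
    total = sum(lengths)
    n50 = 0
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            # N50: length of the record at which half of all bases are covered
            n50 = length
            break
    return {
        "isoforms": len(lengths),
        "total_bases": total,
        "mean_length": mean(lengths) if lengths else 0,
        "n50": n50,
    }


if __name__ == "__main__":
    # Assumption: the downloaded files sit in the current working directory
    # and keep the naming pattern shown in the table above.
    for fasta in sorted(Path(".").glob("*.flnc.clustered.isoforms.fa")):
        print(fasta.name, summarize(fasta))
```

Comparing the resulting isoform counts with the reported average of 49,180 isoforms per species is one way to confirm that a download is complete.
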
Created:
2025-04-18