Full-length transcriptomes of 25 grassland plant species
Source: DataONE | Updated: 2025-04-17 | Indexed: 2025-05-10
Download link:
https://search.dataone.org/view/sha256:baa71b29423c611ec5212c40d515a9cc67862befe7e677003c00ed79cfd59d43
Resource description:
Grasslands are essential, biodiverse ecosystems of economic importance that play a critical role in carbon storage and soil health. Despite their ecological and economic importance, transcriptomic resources for wild grassland species that would facilitate eco-evolutionary and functional genomic studies remain limited. In this study, we present full-length transcriptomes for shoot tissue of natural accessions of 25 wild grassland plant species collected from the field site of a long-term grassland biodiversity experiment (Jena Experiment). Using PacBio Iso-Seq technology, we generated a total of 522.45 million subreads, which were assembled into isoforms for each species separately. This resulted in an average of 49,180 isoforms per species, of which 68.6% were successfully annotated against the Swiss-Prot database. Fifty-six percent of the transcripts had complete open reading frames (ORFs), and 29.6% of the transcripts were identified as non-coding RNAs (ncRNAs) by two prediction tools. Thi...

Subreads from the PacBio Sequel II platform were processed using the PacBio Iso-Seq pipeline. Circular Consensus Sequences (CCS) were generated from subreads using the ccs tool (version 6.4.0) with default parameters, which derives high-fidelity full-length reads from multiple passes of each molecule. These CCS reads were further processed with lima (version 2.9.0, using the --isoseq option) to remove sequencing adapters and barcodes. Poly(A) tails and artificial concatemers were removed using the isoseq tool (version 4.0.0), yielding Full-Length Non-Chimeric (FLNC) reads. The FLNC reads were then clustered using isoseq cluster to generate polished isoforms, with the --singletons option enabled to retain singleton transcripts.

# Full-length transcriptomes of 25 grassland plant species
[https://doi.org/10.5061/dryad.z08kprrpv](https://doi.org/10.5061/dryad.z08kprrpv)
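
The processing steps described in the summary above (ccs, then lima, then isoseq, then isoseq cluster) could be laid out roughly as follows. This is only a sketch that builds and prints the command lines without running them. The sample prefix, the primer file name and all intermediate file names are hypothetical. The `refine` subcommand and its `--require-polya` flag are assumptions about how the poly(A) and concatemer removal step was invoked; the source names only the isoseq tool for that step.

```python
import shlex
from typing import List


def isoseq_commands(sample: str, primers: str = "primers.fasta") -> List[List[str]]:
    """Build the four command lines for one sample (nothing is executed).

    All file names are hypothetical placeholders, not names from the dataset.
    """
    subreads = f"{sample}.subreads.bam"
    ccs_bam = f"{sample}.ccs.bam"
    # Note: lima usually writes one output per primer pair, so the real
    # file passed to the next step may carry an extra primer-pair suffix.
    fl_bam = f"{sample}.fl.bam"
    flnc_bam = f"{sample}.flnc.bam"
    clustered_bam = f"{sample}.clustered.bam"
    return [
        # 1. Circular consensus calling with default parameters.
        ["ccs", subreads, ccs_bam],
        # 2. Adapter and barcode removal in Iso-Seq mode.
        ["lima", ccs_bam, primers, fl_bam, "--isoseq"],
        # 3. Poly(A) tail and concatemer removal (subcommand and flag assumed).
        ["isoseq", "refine", fl_bam, primers, flnc_bam, "--require-polya"],
        # 4. Clustering into isoforms, keeping singleton transcripts.
        ["isoseq", "cluster", flnc_bam, clustered_bam, "--singletons"],
    ]


if __name__ == "__main__":
    for command in isoseq_commands("sample"):
        print(shlex.join(command))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without the PacBio tools installed; each list could be passed to `subprocess.run` once the real file names are known.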
## Description of the data and file structure
This dataset contains 25 FASTA files, each representing the final assembled IsoSeq data for a grassland plant species.
The respective file names are listed below:
| Species | Assembly |
| :------------------------ | :------------------------------------------------- |
| Lotus corniculatus | BMK230426-BJ202-01P0001.flnc.clustered.isoforms.fa |
| Medicago variegata | BMK230426-BJ202-01P0002.flnc.clustered.isoforms.fa |
| Trisetum flavescens | BMK230426-BJ202-01P0003.flnc.clustered.isoforms.fa |
| Crepis biennis | BMK230426-BJ202-01P0004.flnc.clustered.isoforms.fa |
| Geranium pratense | BMK230426-BJ202-01P0005.flnc.clustered.isoforms.fa |
| Alopecurus pratensis | BMK230426-BJ202-01P0006.flnc.clustered...,
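
As a minimal sketch of how the assemblies listed above might be inspected, the following counts isoforms and basic length statistics per FASTA file using only the Python standard library. It assumes plain, uncompressed FASTA and a hypothetical local directory holding the downloaded files. Only the first five rows of the table are included in the mapping, and the function names are illustrative.

```python
import os
from typing import Dict, Iterator, Tuple

# Species-to-file mapping copied from the first five rows of the table.
ASSEMBLIES: Dict[str, str] = {
    "Lotus corniculatus": "BMK230426-BJ202-01P0001.flnc.clustered.isoforms.fa",
    "Medicago variegata": "BMK230426-BJ202-01P0002.flnc.clustered.isoforms.fa",
    "Trisetum flavescens": "BMK230426-BJ202-01P0003.flnc.clustered.isoforms.fa",
    "Crepis biennis": "BMK230426-BJ202-01P0004.flnc.clustered.isoforms.fa",
    "Geranium pratense": "BMK230426-BJ202-01P0005.flnc.clustered.isoforms.fa",
}


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a plain-text FASTA file."""
    header = None
    chunks = []
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarize(path: str) -> Dict[str, float]:
    """Return isoform count and simple length statistics for one file."""
    lengths = [len(seq) for _, seq in read_fasta(path)]
    count = len(lengths)
    return {
        "isoforms": count,
        "total_bases": sum(lengths),
        "mean_length": sum(lengths) / count if count else 0.0,
        "longest": max(lengths, default=0),
    }


def summarize_dataset(data_dir: str = ".") -> Dict[str, Dict[str, float]]:
    """Summarize every listed assembly found in data_dir (hypothetical path)."""
    results = {}
    for species, filename in ASSEMBLIES.items():
        path = os.path.join(data_dir, filename)
        if os.path.exists(path):
            results[species] = summarize(path)
    return results


if __name__ == "__main__":
    for species, stats in summarize_dataset(".").items():
        print(species, stats)
```

Files that are missing from the directory are skipped silently, so the sketch can be run on a partial download.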
Created:
2025-04-18



