
Test data for the Large Genome Assembly tutorial

Source: NIAID Data Ecosystem, indexed 2026-03-13
Download link:
https://zenodo.org/record/7055934
Description:
A set of test data for the Galaxy Training Network tutorial "Large genome assembly". The data is publicly available from other sources, but has been combined and subsampled here for easier use in the tutorial. We do not claim ownership of this data; full attribution for each data source is given below.

Sequencing reads: From the snow gum, Eucalyptus pauciflora. NCBI BioProject number: PRJNA450887. Paper: Wang W, Das A, Kainer D, Schalamun M, Morales-Suarez A, Schwessinger B, Lanfear R; 2020, doi: 10.1093/gigascience/giz160. Three read files were imported from NCBI into Galaxy for this tutorial: nanopore reads (SRR7153076) and paired Illumina reads (SRR7153045). For the test data set, these were randomly subsampled to 10% of the original file size, and reads mapping to related chloroplast gene sequences (rbcL sequence: accession KM360776.1; matK sequence: accession KT632904.1) were excluded.

Files: Nanopore reads; Illumina reads, R1 and R2.

Reference genome: Arabidopsis thaliana. Although this is not the same species as above, it can serve as an example for a comparison step in the tutorial. It was downloaded from The Arabidopsis Information Resource at https://www.arabidopsis.org/index.jsp (Genes: Download: TAIR10 genome release: TAIR10 chromosome files: file TAIR10_chr_all.fas.gz), then unzipped into a FASTA file.
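The description does not say which tool was used for the random subsampling, so the following is only a minimal Python sketch of the idea, not the authors' method. It assumes plain four-line FASTQ records; the function names `read_fastq` and `subsample`, the seed value, and the synthetic reads are all hypothetical. Keeping each read with probability 0.1 yields roughly 10% of the reads, and reusing the same seed for R1 and R2 keeps paired Illumina reads in step, provided both files list the pairs in the same order.

```python
import random
from typing import Iterable, Iterator, List


def read_fastq(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield FASTQ records as lists of four lines.

    Assumes the simple four-line layout: header, sequence, '+', quality.
    """
    record: List[str] = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield record
            record = []


def subsample(records: Iterable[List[str]], fraction: float, seed: int) -> Iterator[List[str]]:
    """Keep each record independently with probability `fraction`.

    The same seed applied to two files with the same number of records
    selects the same positions in both, so read pairs stay together.
    """
    rng = random.Random(seed)
    for record in records:
        if rng.random() < fraction:
            yield record


# Synthetic paired reads standing in for the real R1 and R2 files (illustrative only).
r1_lines = [x for i in range(1000) for x in (f"@read{i}/1", "ACGT", "+", "IIII")]
r2_lines = [x for i in range(1000) for x in (f"@read{i}/2", "TGCA", "+", "IIII")]

kept_r1 = list(subsample(read_fastq(r1_lines), fraction=0.1, seed=42))
kept_r2 = list(subsample(read_fastq(r2_lines), fraction=0.1, seed=42))
print(len(kept_r1), len(kept_r2))
```

For real files, the line lists would be replaced by open file handles (for example via `gzip.open(path, "rt")` for compressed input). Excluding reads that map to the chloroplast gene sequences needs a separate alignment step, which this sketch does not cover.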
Created:
2022-09-07