
Dataset supporting the tool 'delfies: a Python package for the detection of DNA breakpoints with neo-telomere addition'

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14101797
Resource description:

Purpose

These data can be used to test my tool `delfies` on real data, to get a concrete sense of its inputs/outputs and to test that it is properly installed.

Description

Genome

I downloaded the genome of Oscheius onirici, accession: GCA_932521025. I subsampled the genome to the last 2kbp of chromosome I, which contains an elimination breakpoint, using `seqkit` v2.8.2, giving the FASTA file in this release.

Sequencing data

I then downloaded the following sequencing data for *O. onirici* from the European Nucleotide Archive:

- ERR5967937: Illumina NovaSeq 6000 paired-end short reads. Reads are 2x150bp with average per-base quality of Q27.
- ERR10796202: Oxford Nanopore PromethION long reads. Reads have average length 11.9kbp and average per-base quality Q11.4.
- ERR7979900: Pacific Biosciences (PacBio) Sequel II long reads. Reads have average length 11.1kbp and average per-base quality Q28.

I aligned them to the above genome with `minimap2` version 2.26-r1175, using the following presets: "map-ont" for the Nanopore data, "map-hifi" for the PacBio data, "sr" for the Illumina data. After sorting with `samtools`, this gives the BAM files in this release.

Running delfies

I then ran `delfies` version 0.6.0 on each BAM and genome, as:

```sh
delfies --threads 16 \
    --telo_forward_seq TTAGGC \
    --breakpoint_type all \
    --min_mapq 20 \
    --min_supporting_reads 6 \
    ${genome} ${bam} ${odirname}
```

The three resulting output directories are in this release, prefixed with `delfies_`. A single, identical breakpoint is found using all three BAMs (see files `*breakpoint_locations.bed`).

Data source

The above raw data were produced and released by the Wellcome Sanger Institute as part of projects PRJEB51305 and PRJEB59023.
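The alignment step described above could be reproduced roughly as follows. This is a minimal sketch, not the author's actual script: the helper names `preset_for` and `align_cmd` and all file names are hypothetical, and the exact `minimap2`/`samtools` options used for the release are not stated in the description beyond the presets. The sketch only prints the pipeline so it can be inspected before anything is run.

```sh
#!/bin/sh
# Map a sequencing technology label to the minimap2 preset named in the
# description. The labels themselves are hypothetical, chosen for this sketch.
preset_for() {
    case "$1" in
        nanopore) echo "map-ont" ;;
        pacbio)   echo "map-hifi" ;;
        illumina) echo "sr" ;;
        *) echo "unknown technology: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) an align-and-sort pipeline for one read set.
# Arguments: technology, genome FASTA, reads file(s), output BAM.
# For paired-end Illumina data, pass both read files as one quoted string.
align_cmd() {
    tech=$1; genome=$2; reads=$3; out_bam=$4
    preset=$(preset_for "$tech") || return 1
    # -a asks minimap2 for SAM output; samtools sort reads it from stdin (-).
    echo "minimap2 -a -x ${preset} ${genome} ${reads} | samtools sort -o ${out_bam} -"
}

# Hypothetical file names, for illustration only.
align_cmd nanopore genome.fa ont_reads.fastq.gz ont.bam
align_cmd pacbio   genome.fa pacbio_reads.fastq.gz pacbio.bam
align_cmd illumina genome.fa "reads_1.fastq.gz reads_2.fastq.gz" illumina.bam
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without `minimap2` or `samtools` installed; piping a printed line to `sh` would execute it.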
Created:
2024-12-05