HG002-HG004 R9 methylation basecalled data
Source: DataCite Commons. Updated: 2025-09-12. Indexed: 2026-05-05.
Download link:
https://www.scidb.cn/detail?dataSetId=01d012668c644416a6014bdb79af3e67
Description:
Raw FAST5 files were downloaded from the Human Pangenome Reference Consortium S3 bucket (s3://human-pangenomics/NHGRI_UCSC_panel/). Basecalling was performed using Guppy v4.2.2 or v6.3.8 with the command “guppy_basecaller --bam out -r -i ${fast5_data} -s ${output_directory} -x cuda:0,1 -c ${config.cfg}”. The resulting BAM files were merged using samtools with the command “samtools merge -l ${bam_list.txt} -o ${merged.bam}”. The merged BAM file was aligned to the GRCh38 reference genome (GenBank assembly ID GCA_000001405.15) using dorado v0.9.5 (https://github.com/nanoporetech/dorado) with the command “dorado align -t 80 --mm2-opts "-x map-ont" ${reference.fa} > ${output.bam}”, followed by sorting and indexing with samtools. Note that BAM files basecalled with Guppy v4.2.2 contained “Mm” and “Ml” tags instead of the standard “MM” and “ML” tags. These BAM files were therefore updated using the command “modkit update-tags ${input.bam} ${output.bam}” before methylation calling.
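To make the tag issue concrete, the following minimal Python sketch shows what renaming the legacy “Mm”/“Ml” tags to “MM”/“ML” could look like on a single SAM-formatted record. It is an illustration only, not modkit's implementation: the function name `rename_legacy_mod_tags` and the example record are hypothetical, and it assumes plain-text SAM lines such as those produced by `samtools view -h`. For the actual dataset, the `modkit update-tags` command quoted above was used.

```python
# Illustrative sketch: rename legacy base-modification tags (Mm/Ml) to the
# standard names (MM/ML) in one SAM text record. Not modkit's implementation.

LEGACY_TO_STANDARD = {"Mm": "MM", "Ml": "ML"}


def rename_legacy_mod_tags(sam_line: str) -> str:
    """Return the SAM line (without trailing newline) with Mm/Ml renamed."""
    line = sam_line.rstrip("\n")
    if line.startswith("@"):
        return line  # header lines carry no per-read tags
    fields = line.split("\t")
    # The first 11 columns are mandatory SAM fields; optional tags follow.
    for i in range(11, len(fields)):
        name, sep, rest = fields[i].partition(":")
        if sep and name in LEGACY_TO_STANDARD:
            fields[i] = LEGACY_TO_STANDARD[name] + sep + rest
    return "\t".join(fields)


if __name__ == "__main__":
    # Hypothetical unaligned record carrying legacy tags.
    record = "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!\tMm:Z:C+m,0;\tMl:B:C,200"
    print(rename_legacy_mod_tags(record))
```

Only the optional tag columns are touched, so a read name or sequence that happens to begin with “Mm:” is left unchanged.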
Provider:
Science Data Bank
Created:
2025-09-12



