Nanopore R9.4.1 CHM13 methylation frequencies

DataCite Commons. Updated 2025-06-01; indexed 2024-07-29.
Download link:
https://figshare.com/articles/dataset/Nanopore_R9_4_1_HG2_methylation_frequencies/21520950/2
Description:
This dataset contains methylation frequencies for CHM13, computed from the complete public nanopore dataset available at https://github.com/marbl/CHM13.

Raw nanopore signals for 390 Gbp of data (126x coverage) were downloaded and converted to BLOW5 format using slow5tools. They were then basecalled using buttery-eel with Guppy 6.3.7 in high-accuracy mode. Reads that passed the qscore filter (>7) were mapped to the hg38noAlt genome using minimap2 2.17. Methylation calling was then performed using f5c 1.1. Finally, the methylation frequencies output by f5c in TSV format were converted to bigWig format.

Commands:

Basecall GridION data:
    buttery-eel -i min_grid.blow5 --guppy_bin /install/ont-guppy-6.3.7/bin/ --config dna_r9.4.1_450bps_hac.cfg -x cuda:all -q 7 -o reads_min_grid.fastq --port 5555 --use_tcp

Basecall PromethION data:
    buttery-eel -i prom.blow5 --guppy_bin /install/ont-guppy-6.3.7/bin/ --config dna_r9.4.1_450bps_hac_prom.cfg -x cuda:all -q 7 -o reads_prom.fastq --port 5556 --use_tcp

Alignment:
    minimap2 -ax map-ont -t40 --secondary=no /genome/hg38noAlt.idx chm13_merged_pass.fastq > chm13_merged_pass.sam
    samtools sort -@40 -o chm13_merged_pass.bam chm13_merged_pass.sam
    samtools index chm13_merged_pass.bam

Methylation calling:
    f5c index -t20 chm13_merged_pass.fastq --skip-slow5-idx --slow5 chm13_merged.blow5
    f5c call-methylation -x hpc-low -t20 -g /genome/hg38noAlt.fa -r chm13_merged_pass.fastq -b chm13_merged_pass.bam --slow5 chm13_merged.blow5 > chm13_merged_pass_f5c_meth.tsv
    f5c meth-freq -s -i chm13_merged_pass_f5c_meth.tsv -o chm13_merged_pass_f5c_methfreq.tsv

Convert to bigWig:
    tail -n +2 chm13_merged_pass_f5c_methfreq.tsv | awk '{print $1"\t"$2"\t"$3+1"\t"$7}' | sort -k1,1 -k2,2n > meth_freq.bedgraph
    bedGraphToBigWig meth_freq.bedgraph /genome/hg38.chrom.sizes chm13_merged_pass_f5c_methfreq.bigwig
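The TSV-to-bedGraph step above can be tried on a small input without running the full pipeline. The sketch below is illustrative only: the three data rows are made up, the directory `f5c_bedgraph_demo` and the file `sample_methfreq.tsv` are hypothetical names, and the column order is an assumption about what `f5c meth-freq -s` writes (chromosome, start, end, num_cpgs_in_group, called_sites, called_sites_methylated, methylated_frequency, group_sequence, with 0-based inclusive coordinates). `LC_ALL=C` is not in the original command; it is added here so the sort order does not depend on the locale.

```shell
#!/bin/sh
# Sketch only: runs the TSV -> bedGraph step on three made-up rows.
set -eu

demo_dir=f5c_bedgraph_demo            # hypothetical scratch directory
mkdir -p "$demo_dir"

# Made-up per-CpG rows in the column order assumed for `f5c meth-freq -s`:
# chromosome, start, end, num_cpgs_in_group, called_sites,
# called_sites_methylated, methylated_frequency, group_sequence
{
  printf 'chromosome\tstart\tend\tnum_cpgs_in_group\tcalled_sites\tcalled_sites_methylated\tmethylated_frequency\tgroup_sequence\n'
  printf 'chr2\t500\t500\t1\t8\t2\t0.250\tsplit-group\n'
  printf 'chr1\t10470\t10470\t1\t10\t8\t0.800\tsplit-group\n'
  printf 'chr1\t10468\t10468\t1\t12\t9\t0.750\tsplit-group\n'
} > "$demo_dir/sample_methfreq.tsv"

# Drop the header, keep chromosome, start and frequency, and add 1 to the end
# column so each site spans one base in the half-open intervals bedGraph uses.
# Then sort by chromosome name and numeric start position.
tail -n +2 "$demo_dir/sample_methfreq.tsv" \
  | awk '{print $1"\t"$2"\t"$3+1"\t"$7}' \
  | LC_ALL=C sort -k1,1 -k2,2n > "$demo_dir/meth_freq.bedgraph"

cat "$demo_dir/meth_freq.bedgraph"
```

The sorted four-column file has the same shape as the `meth_freq.bedgraph` that the original command passes to bedGraphToBigWig together with the chromosome sizes file.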

Provider:
figshare
Created:
2022-11-11