
Reference files and small demo files for MisMatchFinder

Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/12754453
Description:
This Zenodo repository contains the reference files and small demo files for running MisMatchFinder. MisMatchFinder is a fast computational tool for detecting variants and inferring mutational signatures from low-coverage, next-generation sequencing data of plasma from cancer patients, without requiring sequencing of matched normal tissue. MisMatchFinder is developed by the Molecular Biomarkers and Translational Genomics Lab at the Peter MacCallum Cancer Centre (Australia). The tool and its instructions are available in the MisMatchFinder Bitbucket repository.

When running the tool, it is highly recommended to activate the germline filter, which removes common germline variants, and to use a whitelist, which restricts variant detection to high-mappability regions of the human genome. For example:

$ mismatchfinder --germline_file /path/to/echtvarfile/gnomad.v3.1.2.echtvar.v2.zip --whitelist_bed /path/to/whitelist.bed -o /path/to/outputfolder/ --strict_overlaps --only_overlaps /path/to/plasma_DNA_demo.bam

The files included in this dataset are described below:

gnomad.v3.1.2.echtvar.v2.zip: the pre-built gnomAD data used for the MisMatchFinder germline filter.

GCA_000001405.15_GRCh38_full_analysis_set.100mer.highMappability.bed: the high-mappability regions file for the GRCh38 human genome, used as the MisMatchFinder whitelist.

plasma_DNA_demo.bam: simulated human plasma sequencing data for chromosome 19. This file and the corresponding MisMatchFinder output (plasma_DNA_demo_bamsites.vcf.gz) should be used only for sanity testing of MisMatchFinder.

clean_10x_rmd.bam and clean_3x_rmd.bam: two BAM files used to assess the effectiveness of the MisMatchFinder quality filters. These data were computationally simulated with ART (Weichun Huang, Bioinformatics, 2012) and are intended to contain sequencing errors similar to those of Illumina platforms. They can be used to understand how stringent quality filters, such as requiring high base and mapping qualities and selecting variants only from the overlapping regions of reads, help to reduce errors in the input data (similar to Fig. 1b in the MisMatchFinder manuscript).
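The example invocation in the description can also be assembled from a script, which makes it easier to check that the downloaded reference files are in place before starting a run. The following is a minimal sketch, not part of MisMatchFinder: the function names build_mismatchfinder_command and missing_inputs are hypothetical, the paths are the placeholders from the example, and only the flags that appear in that example are used.

```python
import shlex
from pathlib import Path


def build_mismatchfinder_command(bam, germline_file, whitelist_bed, output_dir):
    """Return the argument list for the example invocation in the description.

    Hypothetical helper: it only reuses the flags shown in that example and
    does not know about any other MisMatchFinder options.
    """
    return [
        "mismatchfinder",
        "--germline_file", str(germline_file),  # pre-built gnomAD echtvar archive
        "--whitelist_bed", str(whitelist_bed),  # high-mappability regions
        "-o", str(output_dir),
        "--strict_overlaps",
        "--only_overlaps",
        str(bam),
    ]


def missing_inputs(*paths):
    """Return the given input paths that do not exist on disk."""
    return [str(p) for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    # Placeholder paths copied from the example; replace with real locations.
    bam = "/path/to/plasma_DNA_demo.bam"
    germline = "/path/to/echtvarfile/gnomad.v3.1.2.echtvar.v2.zip"
    whitelist = "/path/to/whitelist.bed"

    absent = missing_inputs(bam, germline, whitelist)
    if absent:
        print("Missing input files:", ", ".join(absent))
    else:
        cmd = build_mismatchfinder_command(bam, germline, whitelist, "/path/to/outputfolder/")
        # Print the shell-quoted command; pass cmd to subprocess.run to execute it.
        print(shlex.join(cmd))
```

Returning an argument list rather than a single string keeps paths containing spaces intact if the list is later handed to subprocess.run.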
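A rough calculation can illustrate why base-quality thresholds and overlap-only variant selection reduce sequencing errors. The sketch below is a simplified model and not MisMatchFinder's implementation: it uses the standard Phred relation p = 10^(-Q/10) and assumes that errors in the two overlapping reads are independent and that each of the three wrong bases is equally likely. The function names are hypothetical.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to an error probability 10^(-Q/10)."""
    return 10 ** (-q / 10)


def concordant_error_prob(q1, q2):
    """Probability that two overlapping reads show the same wrong base.

    Simplifying assumptions (not taken from MisMatchFinder): the two errors
    are independent, and a miscall picks one of the three wrong bases
    uniformly, so the second read matches the first error with chance 1/3.
    """
    return phred_to_error_prob(q1) * phred_to_error_prob(q2) / 3


if __name__ == "__main__":
    for q in (20, 30):
        single = phred_to_error_prob(q)
        paired = concordant_error_prob(q, q)
        print(f"Q{q}: single-read error {single:.1e}, concordant overlap error {paired:.1e}")
```

Under these assumptions, a mismatch that has to be seen in both reads of an overlap is far less likely to be a sequencing error than one seen in a single read at the same base quality.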
Created:
2024-10-03