Reference files and small demo files for MisMatchFinder
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/12754453
Description:
This Zenodo repository contains the reference files and small demo files for running MisMatchFinder.
MisMatchFinder is a fast computational tool for detecting variants and inferring mutational signatures from low-coverage, next-generation sequencing data of plasma from cancer patients, without requiring sequencing of matched normal tissue.
MisMatchFinder is developed by the Molecular Biomarkers and Translational Genomics Lab at the Peter MacCallum Cancer Centre (Australia). The tool and its instructions are available in the project's Bitbucket repository.
When running the tool, it is highly recommended to activate the germline filter, which removes common germline variants, and to supply a whitelist, which restricts variant detection to high-mappability regions of the human genome.
For example:
$ mismatchfinder --germline_file /path/to/echtvarfile/gnomad.v3.1.2.echtvar.v2.zip --whitelist_bed /path/to/whitelist.bed -o /path/to/outputfolder/ --strict_overlaps --only_overlaps /path/to/plasma_DNA_demo.bam
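When the command above is launched from a script, it can help to assemble it as an argument list and to check the input files first. The following is a minimal Python sketch, not part of MisMatchFinder: the helper names build_mismatchfinder_command and missing_inputs are hypothetical, and only the flags shown in the example command are used, in the same order.

```python
import shlex
from pathlib import Path


def build_mismatchfinder_command(germline_file, whitelist_bed, output_dir, bam):
    """Assemble the example invocation as an argument list.

    Hypothetical helper: it reproduces the flags of the example command
    in the same order and adds nothing else.
    """
    return [
        "mismatchfinder",
        "--germline_file", str(germline_file),
        "--whitelist_bed", str(whitelist_bed),
        "-o", str(output_dir),
        "--strict_overlaps",
        "--only_overlaps",
        str(bam),
    ]


def missing_inputs(*paths):
    """Return the paths that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


# Placeholder paths, as in the example command.
command = build_mismatchfinder_command(
    "/path/to/echtvarfile/gnomad.v3.1.2.echtvar.v2.zip",
    "/path/to/whitelist.bed",
    "/path/to/outputfolder/",
    "/path/to/plasma_DNA_demo.bam",
)
print(shlex.join(command))
```

Once missing_inputs returns an empty list for the germline file, whitelist and BAM, the argument list can be passed to subprocess.run(command, check=True).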
The files included in this dataset are described below:
gnomad.v3.1.2.echtvar.v2.zip: the pre-built gnomAD data used by the MisMatchFinder germline filter.
GCA_000001405.15_GRCh38_full_analysis_set.100mer.highMappability.bed: the high mappability regions file for the GRCh38 human genome, used as the MisMatchFinder whitelist.
Note: plasma_DNA_demo.bam is simulated human plasma sequencing data for chromosome 19. This file and the corresponding MisMatchFinder output (plasma_DNA_demo_bamsites.vcf.gz) should only be used for sanity testing of MisMatchFinder.
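One possible sanity test is to compare the sites in a freshly generated output with those in the provided plasma_DNA_demo_bamsites.vcf.gz. The sketch below assumes the output is a standard tab-separated, gzip-compressed VCF. The function names are hypothetical, and whether results are expected to match exactly across tool versions is not stated in this record.

```python
import gzip


def read_vcf_sites(path):
    """Return the set of (CHROM, POS, REF, ALT) tuples in a gzip-compressed VCF."""
    sites = set()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            sites.add((chrom, int(pos), ref, alt))
    return sites


def compare_vcf_sites(expected_path, observed_path):
    """Count the sites shared by, or unique to, the two VCF files."""
    expected = read_vcf_sites(expected_path)
    observed = read_vcf_sites(observed_path)
    return {
        "shared": len(expected & observed),
        "only_expected": len(expected - observed),
        "only_observed": len(observed - expected),
    }
```

For example, compare_vcf_sites("plasma_DNA_demo_bamsites.vcf.gz", "/path/to/outputfolder/plasma_DNA_demo_bamsites.vcf.gz") returns three counts. The second path is an assumption about where a local run writes its output.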
clean_10x_rmd.bam and clean_3x_rmd.bam: two BAM files used to assess the effectiveness of the MisMatchFinder quality filters. These data were computationally simulated with ART (Weichun Huang, Bioinformatics, 2012) and are intended to contain sequencing errors similar to those of Illumina platforms. They can be used to understand how stringent quality filters, such as requiring high base and mapping qualities and selecting variants only from the overlapping regions of reads, help to reduce the errors in the input data (similar to Fig. 1b in the MisMatchFinder manuscript).
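The filtering idea described above can be illustrated with a toy example. This is a conceptual sketch only, not MisMatchFinder's implementation: the Mismatch record, the function names and both thresholds are hypothetical, chosen to show how quality cutoffs and an overlap-only rule each discard candidate mismatches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mismatch:
    """One observed read/reference mismatch (toy record, not a MisMatchFinder type)."""
    position: int
    base_quality: int      # Phred-scaled quality of the mismatching base
    mapping_quality: int   # Phred-scaled mapping quality of the read
    in_read_overlap: bool  # True if the base lies where both mates of a pair overlap


# Hypothetical thresholds, for illustration only.
MIN_BASE_QUALITY = 30
MIN_MAPPING_QUALITY = 30


def passes_filters(mismatch, only_overlaps=True):
    """Apply the base-quality, mapping-quality and overlap-only rules."""
    if mismatch.base_quality < MIN_BASE_QUALITY:
        return False
    if mismatch.mapping_quality < MIN_MAPPING_QUALITY:
        return False
    if only_overlaps and not mismatch.in_read_overlap:
        return False
    return True


def filter_mismatches(mismatches, only_overlaps=True):
    """Keep only the mismatches that pass every active filter."""
    return [m for m in mismatches if passes_filters(m, only_overlaps)]
```

Running filter_mismatches on mismatches tallied from the simulated BAM files, with and without only_overlaps, gives a rough view of how many candidates each rule removes.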
Created:
2024-10-03



