five

ReMM score

收藏
NIAID Data Ecosystem2026-03-14 收录
下载链接:
https://zenodo.org/record/1197578
下载链接
链接失效反馈
官方服务:
资源简介:
The Regulatory Mendelian Mutation (ReMM) score was created for relevance prediction of non-coding variations (SNVs and small InDels) in the human genome (hg19/hg38) in terms of Mendelian diseases.   Usage The ReMM score is genome position wise (nucleotide changes are neglected). We precomputed all positions in the human genome (hg19 and hg38 release) and stored the values in a tabix file (1-based). The scores ranging from 0 (non-deleterious) to 1 (deleterious). If you want to use the ReMM score together with the Genomiser, please have a look at the Exomiser framework manual For more information, direct VCF file scoring, and an API please have a look at our website: https://remm.bihealth.org ReMM score changelog 0.4: Features: For missing values using genome mean of feature for sequence and conservaton features. 1 for p-value. All other features have zero as missing value Updating DGVCount to 02/25/2020 on hg19/hg38 Update dbVARCount to 10/20/2021 on hg19/hg38 Update ISCApath to 11/03/2021 on hg19/hg38 Replace tfbsConsSites with UCSC table encRegTfbsClustered on hg19/hg38 Software: Using parSMURF for training Complete retraining of hg19 and hg38 builds (hg19: AUROC=0.993; AUPRC=0.394; hg38: AUROC=0.996; AUPRC=0.610) 0.3.1.post1: New hg38 release. Completely retrained on the new genome build. Training data: Liftover positives. No change in size. Negatives used from CADD v1.4 GRCh38 (human derived), filtered as described in the original paper (https://doi.org/10.1016/j.ajhg.2016.07.005). Size slightly different (hg38: 13,902,234; hg19: 14,755,199) . Features Same size as in hg19: 26 features. We tried to use the same features as in hg19. Sometimes new versions of data have to be used (e.g. DGV, ISCA, dbVAR). Training was done with the parSMURF implementation of hyperSMURF. Same hg19 parameters are used. Metrics via 10-fold cytoband cross-validation (same cytoband to fold map): Area under the ROC curve: 0.996 (hg19: 0.989, see https://doi.org/10.1016/j.ajhg.2016.07.005) Area under the precision recall curve: 0.548 (hg19: 0.441, see https://doi.org/10.1016/j.ajhg.2016.07.005) Scores for hg19 in this release are the same as version 0.3.1. Only the files have been renamed. 0.3.1: Bugfix of region chr17:79759050-81195210. Region is missing in older versions. 0.3: First official public version. Values for positions in training data are computed by cytoband-aware 10 fold cross-validation. Other position scores are compted by a generalized model of all training data. This version was used in the Genomiser publication (Smeley et.al. A Whole-Genome Analysis Framework for Effective Identification of Pathogenic Regulatory Variants in Mendelian Disease. AHJG. 2016)
创建时间:
2023-03-14
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作