gnomAD V4.1 Annotation Resource for Talos
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/15110229
Description:
This file is an echtvar-encoded extract of the gnomAD V4.1 joint dataset, limited to genic regions +/-2kb.
---
Talos is a variant prioritisation tool designed for clinical analysis of rare disease cases. It sifts through entire datasets (whole genome, exome, or panel) and identifies variants that are likely to be causative of disease, based on a range of heuristics. Variants are selected and filtered on the basis of annotations: population frequencies, per-transcript consequences, in silico predictions of consequence, and presence of variants in the ClinVar clinical database.
To be as portable as possible, and to present a consistent collection of annotations regardless of site, Talos takes responsibility for all annotation steps. As described in the Talos project README, an annotation pipeline written in Nextflow can take individual VCFs and a handful of annotation sources, and generate a fully formatted MatrixTable object, ready to be used in Talos. The majority of the annotation inputs (an Ensembl GFF3 file, a summary file from MANE, AlphaMissense data) are small enough to be collected and processed as part of this workflow, but the gnomAD population frequencies are too large to distribute easily.
The Talos pipeline makes use of Echtvar to rapidly apply gnomAD annotations to an input VCF. Echtvar contains an encoding process, where select attributes from an annotation source are extracted and densified into a compressed representation. This file contains the compressed representation of:
gnomAD v4.1 Joint (exomes and genomes)
Reduced to genic regions (a union of all gene/ncRNA/snRNA, +/-2kb) in the Ensembl 113 GFF3 file
Echtvar encoding extracted and renamed the fields defined in this config file
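The region reduction described above (a union of gene features, each padded by 2kb) can be illustrated with a minimal sketch. This is not the code used to build the file; the function name `pad_and_merge`, the tuple layout, and the example coordinates are assumptions made for illustration, and features are assumed to have already been parsed from the GFF3 file as 1-based inclusive intervals.

```python
from typing import Iterable

# (chromosome, start, end), 1-based inclusive, as in GFF3.
Interval = tuple[str, int, int]


def pad_and_merge(features: Iterable[Interval], pad: int = 2000) -> list[Interval]:
    """Pad each feature by `pad` bases on both sides, then take the union."""
    # Pad first, clamping the start so it never drops below position 1.
    padded = sorted(
        (chrom, max(1, start - pad), end + pad) for chrom, start, end in features
    )
    merged: list[Interval] = []
    for chrom, start, end in padded:
        # Merge with the previous region if it overlaps or directly abuts it.
        if merged and merged[-1][0] == chrom and start <= merged[-1][2] + 1:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical gene coordinates, not taken from Ensembl 113.
genes = [
    ("chr1", 10_000, 12_000),
    ("chr1", 15_000, 18_000),
    ("chr1", 50_000, 51_000),
    ("chr2", 500, 900),
]
regions = pad_and_merge(genes)
for region in regions:
    print(*region, sep="\t")
```

The first two hypothetical genes sit 3kb apart, so their padded windows overlap and collapse into a single region, which is why the union is smaller than the sum of the padded features.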
This file can be dropped into the Talos annotation workflow per the instructions in the README. It is not a general-purpose annotation source for applying gnomAD frequencies to whole-genome data: it contains only a subset of annotations and a subset of sites.
Created:
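Because only sites inside the padded genic regions are present, a variant outside them receives no gnomAD frequency from this file, which is different from the variant being absent from gnomAD. The sketch below shows one way to check whether a position falls inside the covered regions; the region list, `build_index`, and `is_covered` are hypothetical and are not part of Talos or Echtvar.

```python
from bisect import bisect_right

# Hypothetical covered regions: sorted, non-overlapping, 1-based inclusive.
covered = [("chr1", 8000, 20000), ("chr1", 48000, 53000), ("chr2", 1, 2900)]


def build_index(regions):
    """Group region starts and ends by chromosome for binary search."""
    index = {}
    for chrom, start, end in sorted(regions):
        starts, ends = index.setdefault(chrom, ([], []))
        starts.append(start)
        ends.append(end)
    return index


def is_covered(index, chrom, pos):
    """Return True if `pos` lies inside any region on `chrom`."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Find the last region whose start is <= pos, then check its end.
    i = bisect_right(starts, pos) - 1
    return i >= 0 and pos <= ends[i]


index = build_index(covered)
print(is_covered(index, "chr1", 8000))   # inside the first region
print(is_covered(index, "chr1", 30000))  # intergenic gap, no annotation available
```

A check like this can help separate "not covered by this resource" from "covered but not observed in gnomAD" when interpreting missing frequencies.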
2025-04-10



