Data and code for "Predicting the functional impact of single nucleotide variants in Drosophila melanogaster with FlyCADD"
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14887337
Resource description:
Purpose
Single nucleotide polymorphisms (SNPs), the most common form of genomic variation, play key roles in micro-evolution and adaptation. In Drosophila melanogaster, many SNPs have been associated with phenotypes through association studies, yet functional validation remains challenging and experimental evidence for functional impact is rare.
Here, we present FlyCADD, an impact prediction tool that integrates high-quality D. melanogaster genome annotations into a single score reflecting the predicted impact of a SNP. FlyCADD can be applied to distinguish causal from neutral variants: 1) for variant ranking and prioritization of SNPs for functional studies, 2) to improve genome-editing experimental design or evaluation, and 3) to enhance interpretation of naturally occurring SNPs, thereby improving our understanding of genotype-phenotype relationships.
Dataset content
FlyCADD impact prediction scores are available in two forms: as precomputed scores for all possible single nucleotide variants on the D. melanogaster reference genome, and through a locally executable pipeline for scoring novel variants of interest. If you need FlyCADD scores for your SNPs of interest, they can often be found in the precomputed score files without running the pipeline. If you have any questions, feel free to contact j.beets@vu.nl.
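As a rough illustration of looking up a few SNPs in a precomputed score file without running the pipeline, the sketch below streams a .csv file and keeps only the requested variants. The column names (chrom, pos, ref, alt, score) and the helper lookup_scores are assumptions made for this example and are not taken from the FlyCADD files; check the header of the actual files and adjust the constants accordingly. The demo table contains made-up values.

```python
import csv
import io

# Assumed column names (hypothetical); adjust to the header of the real .csv files.
CHROM_COL, POS_COL, REF_COL, ALT_COL, SCORE_COL = "chrom", "pos", "ref", "alt", "score"


def lookup_scores(csv_handle, variants):
    """Return {(chrom, pos, ref, alt): score} for the requested variants.

    csv_handle: an open text file (or any iterable of lines) with a header row.
    variants:   iterable of (chrom, pos, ref, alt) tuples, with pos as an int.
    """
    wanted = set(variants)
    found = {}
    # Stream row by row so a large score file never has to fit in memory.
    for row in csv.DictReader(csv_handle):
        key = (row[CHROM_COL], int(row[POS_COL]), row[REF_COL], row[ALT_COL])
        if key in wanted:
            found[key] = float(row[SCORE_COL])
            if len(found) == len(wanted):
                break  # every requested variant has been found
    return found


# Tiny made-up table standing in for a precomputed score file.
demo = io.StringIO(
    "chrom,pos,ref,alt,score\n"
    "2L,100,A,C,0.12\n"
    "2L,100,A,G,0.80\n"
    "2L,101,T,A,0.45\n"
)
result = lookup_scores(demo, [("2L", 100, "A", "G"), ("2L", 101, "T", "A")])
print(result)
```

With a real file, the in-memory table would be replaced by something like `open(path, newline="")`, where the path points to one of the downloaded score files. Variants that are absent from the file are simply missing from the returned dictionary.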
This repository provides:
Precomputed FlyCADD scores for all possible SNPs on the D. melanogaster reference genome Release 6 (.csv files)
A locally executable FlyCADD pipeline for scoring novel variants, including the trained logistic regression model files, annotation files and scripts (Python)
166-way multi-species alignment file underlying FlyCADD (.maf file)
Reconstructed ancestral sequence underlying FlyCADD (.fasta files)
Derived and simulated variants used for model training and testing FlyCADD (.vcf files)
FlyCADD scores at all codon positions of unique transcripts in D. melanogaster (.txt files)
Additional resources
The full pipeline of FlyCADD development is available on GitHub (https://github.com/JuliaBeets/FlyCADD/). For any additional information regarding the collection and generation of data please contact us.
Created:
2025-02-27



