
ClinVar Variant ACMG Evidence Prediction Results

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/ClinVar_Variant_ACMG_Evidence_Prediction_Results/29297546
Description:
This CSV file contains predictions for three types of evidence (functional, population, and computational) for all ClinVar variant submissions. The output CSV contains the following columns:

Identifier Columns
- SCV: Submission accession number from ClinVar (format: SCV000000000)
- VCV: Variation accession number from ClinVar (format: VCV000000000)
- RCV: Record accession number from ClinVar (format: RCV000000000)
- VariationID: Numerical identifier for the genetic variation

Genomic Coordinates
- GRCh38_Chr: Chromosome number
- GRCh38_Start: Start position on chromosome (GRCh38/hg38 assembly)
- GRCh38_Stop: Stop position on chromosome (GRCh38/hg38 assembly)
- GRCh38_ReferenceAllele: Reference allele sequence
- GRCh38_AlternateAllele: Alternate allele sequence

Protein-Level Information
- aapos: Amino acid position in the protein
- aaref: Reference amino acid (single letter code)
- aaalt: Alternate amino acid (single letter code)
- gene: Gene symbol (e.g., BRCA1, TP53)

Original Classification
- SubmissionClassification: Original classification provided by the submitter; mainly includes pathogenic, likely pathogenic, uncertain significance, benign, likely benign, but also includes other values

Input Text
- Comment: The textual comment/description provided with the variant submission that was used as input to the model

Model Predictions
- has_evidence: Whether the text summary contains a specific type of evidence (TRUE or FALSE)
- evidence_confidence: Model prediction confidence (0.0 to 1.0)
- predicted_evidence: If the text summary contains a specific type of evidence, the model prediction of whether it is pathogenic or benign (P or B for population or computational evidence, PS3 or BS3 for functional evidence)
- B_Score: Model prediction confidence for being a benign type of evidence (0.0 to 1.0)
- P_Score: Model prediction confidence for being a pathogenic type of evidence (0.0 to 1.0)

Notes on Probability Scores
- Probability scores (B_Score, P_Score) sum to 1.0 for each row.
- Higher probability indicates greater model confidence for that classification.
- The predicted_evidence corresponds to the classification with the highest probability score.

Model Type: Fine-tuned transformer model for sequence classification
Input: Textual comments from variant submissions
Training Data: ClinVar variant submissions with known classifications and mentions of ACMG evidence guideline
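The two properties stated in the notes on probability scores (B_Score and P_Score sum to 1.0, and predicted_evidence follows the higher score) can be checked after download. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the column names listed in the description, that rows with has_evidence other than TRUE may have empty prediction fields, and that scores may be rounded (hence the tolerance). The helper names `check_row` and `check_csv` and the sample rows are hypothetical and made up for illustration; they are not taken from the real file.

```python
import csv
import io
import math

# Label groups taken from the column description:
# P / B for population or computational evidence, PS3 / BS3 for functional.
PATHOGENIC_LABELS = {"P", "PS3"}
BENIGN_LABELS = {"B", "BS3"}


def check_row(row, tol=1e-3):
    """Return a list of problems found in one row (empty if none)."""
    # Assumption: rows without evidence may carry empty scores, so skip them.
    if row["has_evidence"].strip().upper() != "TRUE":
        return []
    problems = []
    b = float(row["B_Score"])
    p = float(row["P_Score"])
    if not math.isclose(b + p, 1.0, abs_tol=tol):
        problems.append("scores do not sum to 1.0")
    label = row["predicted_evidence"].strip()
    if label and p != b:
        expected = PATHOGENIC_LABELS if p > b else BENIGN_LABELS
        if label not in expected:
            problems.append("label disagrees with higher score")
    return problems


def check_csv(handle):
    """Map SCV accession -> problems, for rows that have any."""
    report = {}
    for row in csv.DictReader(handle):
        problems = check_row(row)
        if problems:
            report[row["SCV"]] = problems
    return report


# Made-up rows for illustration only (not real ClinVar data).
SAMPLE = """SCV,has_evidence,predicted_evidence,B_Score,P_Score
SCV000000001,TRUE,P,0.1,0.9
SCV000000002,TRUE,PS3,0.7,0.3
SCV000000003,TRUE,B,0.6,0.5
SCV000000004,FALSE,,,
"""

if __name__ == "__main__":
    for scv, problems in check_csv(io.StringIO(SAMPLE)).items():
        print(scv, problems)
```

To run the same check on the downloaded file, pass an open file handle instead of the in-memory sample, for example `check_csv(open(path, newline="", encoding="utf-8"))`, where `path` is wherever the CSV was saved.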
Created:
2025-06-11