ClinVar Variant ACMG Evidence Prediction Results
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/ClinVar_Variant_ACMG_Evidence_Prediction_Results/29297546
Description:
This CSV file contains predictions for three types of evidence (functional, population, and computational) for all ClinVar variant submissions.
The output CSV contains the following columns:
Identifier Columns
SCV: Submission accession number from ClinVar (format: SCV000000000)
VCV: Variation accession number from ClinVar (format: VCV000000000)
RCV: Record accession number from ClinVar (format: RCV000000000)
VariationID: Numerical identifier for the genetic variation
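The accession formats listed above can be checked before joining this file against other ClinVar exports. The following is a minimal sketch, not part of the dataset: it assumes each accession is a three-letter prefix followed by nine digits, and it additionally tolerates a ".N" version suffix, which is an assumption about how values might appear. The names ACCESSION_RE and parse_accession are hypothetical.

```python
import re

# Assumption: prefix plus nine digits, as in the formats listed above.
# The optional ".N" version suffix is a guess and may not occur in the file.
ACCESSION_RE = re.compile(r"^(SCV|VCV|RCV)(\d{9})(?:\.(\d+))?$")


def parse_accession(value):
    """Return (prefix, number, version) or None if the value does not match."""
    match = ACCESSION_RE.match(value.strip())
    if match is None:
        return None
    prefix, digits, version = match.groups()
    return prefix, int(digits), (int(version) if version else None)
```

Values that return None are worth inspecting by hand rather than discarding, since the true accession width in the file has not been verified here.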
Genomic Coordinates
GRCh38_Chr: Chromosome number
GRCh38_Start: Start position on chromosome (GRCh38/hg38 assembly)
GRCh38_Stop: Stop position on chromosome (GRCh38/hg38 assembly)
GRCh38_ReferenceAllele: Reference allele sequence
GRCh38_AlternateAllele: Alternate allele sequence
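The five coordinate columns together identify a variant on GRCh38, so they can be combined into a single lookup key. The sketch below is illustrative only: the "chrom-start-ref-alt" layout and the function name variant_key are assumptions, and the description does not say whether positions are 1-based or how missing values are encoded.

```python
def variant_key(row):
    """Build a 'chrom-start-ref-alt' key from one row (a dict keyed by column name)."""
    chrom = str(row["GRCh38_Chr"]).strip()
    # Assumes the position is stored as a plain integer string.
    start = int(row["GRCh38_Start"])
    ref = row["GRCh38_ReferenceAllele"].strip().upper()
    alt = row["GRCh38_AlternateAllele"].strip().upper()
    return f"{chrom}-{start}-{ref}-{alt}"
```

GRCh38_Stop is left out of the key because, for a given start and reference allele, it adds no information for simple substitutions; for other variant types that choice may need revisiting.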
Protein-Level Information
aapos: Amino acid position in the protein
aaref: Reference amino acid (single letter code)
aaalt: Alternate amino acid (single letter code)
gene: Gene symbol (e.g., BRCA1, TP53)
Original Classification
SubmissionClassification: Original classification provided by the submitter. Most values are pathogenic, likely pathogenic, uncertain significance, benign, or likely benign, but other values also occur
Input Text
Comment: The textual comment/description provided with the variant submission that was used as input to the model
Model Predictions
has_evidence: Whether the text summary contains a specific type of evidence (TRUE or FALSE)
evidence_confidence: Model confidence in the has_evidence prediction (0.0 to 1.0)
predicted_evidence: If the text summary contains that type of evidence, the predicted direction of the evidence: P (pathogenic) or B (benign) for population or computational evidence, PS3 (pathogenic) or BS3 (benign) for functional evidence
B_Score: Model confidence that the evidence is of a benign type (0.0 to 1.0)
P_Score: Model confidence that the evidence is of a pathogenic type (0.0 to 1.0)
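A common first step with these columns is to keep only submissions where the model reports evidence with high confidence. The sketch below uses only the standard csv module and the column names listed above; the function name confident_evidence_rows and the 0.9 default threshold are arbitrary choices for illustration, not recommendations from the dataset authors.

```python
import csv


def confident_evidence_rows(csv_file, min_confidence=0.9):
    """Yield rows flagged as containing evidence with confidence >= min_confidence."""
    for row in csv.DictReader(csv_file):
        # Assumes the flag is written literally as TRUE or FALSE.
        has_evidence = row["has_evidence"].strip().upper() == "TRUE"
        if has_evidence and float(row["evidence_confidence"]) >= min_confidence:
            yield row
```

It can be called on an open file handle, for example with open("predictions.csv", newline="") as handle, where the file name is a placeholder for wherever the download was saved.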
Notes on Probability Scores
Probability scores (B_Score, P_Score) sum to 1.0 for each row. Higher probability indicates greater model confidence for that classification. The predicted_evidence corresponds to the classification with the highest probability score.
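The two properties stated above (scores sum to 1.0, and the label follows the higher score) can be verified row by row after loading the file. This is a minimal sketch under stated assumptions: the function name check_scores is hypothetical, the tolerance is arbitrary and may need loosening if scores were rounded on export, and a tie is treated as pathogenic only because some rule is needed.

```python
import math


def check_scores(b_score, p_score, predicted_evidence, tol=1e-6):
    """Return a list of problems found in one row's scores (empty if consistent)."""
    problems = []
    if not math.isclose(b_score + p_score, 1.0, abs_tol=tol):
        problems.append("scores do not sum to 1.0")
    # Map both label vocabularies onto a benign/pathogenic direction.
    direction = {"B": "B", "BS3": "B", "P": "P", "PS3": "P"}.get(predicted_evidence)
    # Assumption: ties go to "P"; the description does not define tie handling.
    expected = "B" if b_score > p_score else "P"
    if direction is None:
        problems.append("unrecognised label")
    elif direction != expected:
        problems.append("label disagrees with the higher score")
    return problems
```

The check is meant for rows that carry a predicted_evidence value; rows where has_evidence is FALSE may leave that column empty and would be reported as "unrecognised label".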
Model Type: Fine-tuned transformer model for sequence classification
Input: Textual comments from variant submissions
Training Data: ClinVar variant submissions with known classifications and mentions of ACMG evidence guidelines
Created:
2025-06-11



