huggingworld/genomic-reasoning-qa
Source: Hugging Face. Updated: 2026-04-28. Indexed: 2026-05-03.
Download link:
https://hf-mirror.com/datasets/huggingworld/genomic-reasoning-qa
Resource description:
The dataset contains 631,455 SNPs (single nucleotide polymorphisms) from 23andMe customer data. It is used to train and evaluate a multi-step reasoning agent system that interprets personal genomic variants against biomedical knowledge databases such as ClinVar, GWAS Catalog, and gnomAD. The SNPs are annotated via the ClinVar API, and from those annotations a QA dataset of 36 question-answer pairs is built, covering three task types: variant interpretation, genotype interpretation, and pathway reasoning. Each QA pair has a verifiable answer grounded in the ClinVar annotations.
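To make "verifiable answer" concrete, the following is a minimal sketch of how a QA pair might be represented and scored per task type. The listing does not document the dataset's schema, so the field names (`task_type`, `question`, `answer`), the task-type strings, and the exact-match scoring rule are all assumptions for illustration, not the dataset's actual format or the authors' evaluation method.

```python
from collections import defaultdict
from dataclasses import dataclass

# The three task types named in the description; the string labels are assumed.
TASK_TYPES = (
    "variant_interpretation",
    "genotype_interpretation",
    "pathway_reasoning",
)


@dataclass(frozen=True)
class QAPair:
    # Hypothetical field names; the real schema may differ.
    task_type: str
    question: str
    answer: str  # ground-truth answer derived from a ClinVar annotation

    def __post_init__(self):
        if self.task_type not in TASK_TYPES:
            raise ValueError(f"unknown task type: {self.task_type!r}")


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(text.lower().split())


def is_correct(pair, predicted):
    """Exact match after normalization (an assumed scoring rule)."""
    return normalize(predicted) == normalize(pair.answer)


def accuracy_by_task(pairs, predictions):
    """Return {task_type: fraction correct} for the task types present."""
    if len(pairs) != len(predictions):
        raise ValueError("pairs and predictions must have the same length")
    totals = defaultdict(int)
    hits = defaultdict(int)
    for pair, predicted in zip(pairs, predictions):
        totals[pair.task_type] += 1
        if is_correct(pair, predicted):
            hits[pair.task_type] += 1
    return {task: hits[task] / totals[task] for task in totals}
```

Reporting accuracy separately for each task type keeps a weak category, such as pathway reasoning, from being hidden by a strong one. With only 36 pairs in total, each per-task figure rests on few examples and should be read accordingly.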
Provider:
huggingworld



