
Rain021217/clinvar-pathogenicity-prediction-dataset

Source: Hugging Face. Updated 2026-04-22; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Rain021217/clinvar-pathogenicity-prediction-dataset
Description:
This dataset contains the final processed modeling table of the ClinVar-Pathogenicity-Prediction project. It is derived from the ClinVar variant summary and the UniProt human reference proteome and is intended for machine-learning experiments on predicting the pathogenicity of protein variants. It provides training, test, and full datasets in different file formats. Labels are binary: 1 marks pathogenic or likely pathogenic variants, and 0 marks benign or likely benign variants. The dataset card also reports the number and proportion of variants in each class, lists the 20 engineered protein-mutation features in detail, and gives usage instructions and the source data.
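
The labeling rule described above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the project's actual preprocessing code: the function names `label_from_significance` and `class_proportions` are hypothetical, and the exact set of ClinVar clinical-significance strings accepted by the dataset authors is not stated on this page.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Assumed ClinVar clinical-significance strings for each class (lowercased).
# The dataset authors' exact inclusion rules are not documented here.
PATHOGENIC = {"pathogenic", "likely pathogenic", "pathogenic/likely pathogenic"}
BENIGN = {"benign", "likely benign", "benign/likely benign"}


def label_from_significance(significance: str) -> Optional[int]:
    """Map a clinical-significance string to 1, 0, or None (excluded)."""
    key = significance.strip().lower()
    if key in PATHOGENIC:
        return 1  # pathogenic or likely pathogenic
    if key in BENIGN:
        return 0  # benign or likely benign
    return None  # e.g. uncertain or conflicting: not part of the binary task


def class_proportions(labels: Iterable[int]) -> Dict[int, float]:
    """Return the fraction of variants carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    raw = ["Pathogenic", "Likely benign", "Uncertain significance", "Benign"]
    labels = [y for y in map(label_from_significance, raw) if y is not None]
    print(labels, class_proportions(labels))
```

Variants whose significance falls outside both sets return `None` and are dropped before computing class proportions, which mirrors the binary label definition given in the description.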
Provider:
Rain021217