Spike Protein Sequences
arXiv, indexed 2025-09-30
Download link:
https://github.com/caiocheohen/AI-driven-COVID-health-stats-predictor
Description:
This dataset contains 3,467 spike protein sequences from South American patients, of which 2,313 are classified as severe cases and 1,154 as mild cases. It also includes demographic and clinical metadata, and the viral genomes show extensive lineage diversity. The task is to predict the severity of COVID-19 infection from the spike protein sequences and the clinical data.
Provided by:
GISAID
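
The class counts in the description imply an imbalanced binary task, so a useful reference point is the accuracy obtained by always predicting the majority class. The sketch below computes that baseline from the stated counts and shows one possible way to turn an amino-acid string into features. The k-mer featurization, the function names, and the toy sequence are illustrative assumptions, not the method used by the dataset authors.

```python
from collections import Counter

# Class counts as stated in the dataset description.
N_SEVERE = 2313
N_MILD = 1154
N_TOTAL = N_SEVERE + N_MILD  # should match the stated 3,467 sequences


def majority_baseline_accuracy(counts):
    """Accuracy of a classifier that always predicts the most frequent class."""
    return max(counts.values()) / sum(counts.values())


def kmer_counts(sequence, k=3):
    """Count overlapping k-mers in an amino-acid string.

    Hypothetical featurization for illustration only; the source does not
    say how sequences are encoded.
    """
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


baseline = majority_baseline_accuracy({"severe": N_SEVERE, "mild": N_MILD})
print(f"samples: {N_TOTAL}, majority-class baseline accuracy: {baseline:.3f}")

# Toy amino-acid string, not taken from the dataset.
toy_features = kmer_counts("MFVFLV", k=3)
print(dict(toy_features))
```

Any model trained on this data would need to beat the majority-class baseline to be informative, and metrics that account for class imbalance may be more telling than raw accuracy.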



