Brazilian datasets classified to support differential diagnosis of Severe Acute Respiratory Syndrome (SARS) caused by COVID-19 and influenza

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/f6sjz6by8k
Description:
The SIVEP-Gripe database contains 3,395,398 records with 166 attributes, covering the years 2020 to 2022. The records document cases of Severe Acute Respiratory Syndrome (SARS) caused by COVID-19, influenza, other etiological agents, and other respiratory viruses, as well as cases of unspecified cause. Of the total, 1,872,106 records relate to SARS due to COVID-19 and only 21,490 to SARS due to influenza, a ratio of roughly 87 to 1 that makes class balancing necessary. Four datasets were created with different balancing configurations:

* Balanced by age range (1BAR): The majority class was reduced to match the number of records in the minority class, based on age ranges. Specifically, majority-class records were selected from within the minimum and maximum age range of the minority class.
* Balanced by age, sex, and same distribution (2BASD): For each record in the minority class, an equal number of records with the same sex and age were selected from the majority class.
* Balanced by age, sex, region, and same distribution (3BARD): This approach balances by region in addition to age and sex.
* Balanced by age, sex, outcome, and same distribution (4BASED): This method balances records by age, sex, and outcome (recovery or death), so that these factors keep the same distribution in both classes.

After preprocessing, all datasets retain 24 attributes and one target attribute, "classi_fin", where 1 represents SARS due to influenza and 5 represents SARS due to COVID-19. These subsets were created to evaluate the performance of machine learning models during training.
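
The matched balancing described above (2BASD and its variants) can be illustrated with a minimal Python sketch. It is not the dataset authors' code. It assumes records are loaded as dictionaries, and the field names "age" and "sex" are hypothetical placeholders; only "classi_fin" and its codes 1 and 5 come from the description. For each combination of the chosen attributes, it draws as many majority-class records as there are minority-class records.

```python
import random
from collections import Counter, defaultdict


def balance_by_strata(records, strata, target="classi_fin",
                      minority=1, majority=5, seed=0):
    """Undersample the majority class so that every stratum holds as many
    majority records as minority records (fewer if the pool runs short)."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible

    def key(record):
        # A stratum is the tuple of values of the matching attributes.
        return tuple(record[field] for field in strata)

    minority_rows = [r for r in records if r[target] == minority]
    # How many majority records each stratum should contribute.
    quota = Counter(key(r) for r in minority_rows)

    # Group majority records by stratum, skipping strata with no minority case.
    pools = defaultdict(list)
    for r in records:
        if r[target] == majority and key(r) in quota:
            pools[key(r)].append(r)

    sampled = []
    for stratum, pool in pools.items():
        sampled.extend(rng.sample(pool, min(quota[stratum], len(pool))))
    return minority_rows + sampled


# Toy records; "age" and "sex" are assumed field names, not the real schema.
toy = [
    {"age": 30, "sex": "F", "classi_fin": 1},
    {"age": 30, "sex": "F", "classi_fin": 5},
    {"age": 30, "sex": "F", "classi_fin": 5},
    {"age": 40, "sex": "M", "classi_fin": 1},
    {"age": 40, "sex": "M", "classi_fin": 5},
    {"age": 50, "sex": "M", "classi_fin": 5},
]
balanced = balance_by_strata(toy, ["age", "sex"])
```

Passing a longer list of attributes, for example one that adds a region or outcome field, would mimic the 3BARD and 4BASED configurations under the same assumptions. Strata with no minority-class record contribute nothing, so the two classes end up with the same distribution over the matched attributes.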
Created:
2024-10-01