Processed Synthetic Real-World Data for binary modelling

NIAID Data Ecosystem, indexed 2026-03-14

Download link:

https://zenodo.org/record/7410141

Description:

This model-learning dataset is derived from the Raw Synthetic RWD dataset and retains some of the original attributes. It is distributed as JOBLIB files, where .joblib files contain the vectors and _ids.joblib files contain the ID of the person from whom each vector was extracted. The IDs make it possible to map the vectors to the metadata about each person in the original raw dataset. Note that corresponds to , or , depending on the dataset. The split is by person: roughly 60% of the people are in the training dataset, and 20% in each of the validation and testing datasets.

The input attributes are the age; the short-term averages and trends of the current week's BMI, steps walked, calories burned, sleep quality, mood and water consumption; and the previous week's short-term average and trend of the answer to the health self-assessment question. The outcome to be predicted is the binary quantized health self-assessment answer to be given in the current week.

The dataset is normalized using statistics computed on the training set. The means and standard deviations used can be found in the train_statistics.joblib file. Finally, the output_descriptions.joblib file contains descriptions of the outcomes to be predicted (not strictly needed, since they are given here).
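The file layout and the normalization described above can be illustrated with a minimal Python sketch. Several things here are assumptions rather than facts from the record: the `name` argument of `load_split` is a hypothetical file stem (the actual stems are not preserved in this listing), the normalization is assumed to be a per-attribute z-score, the standard deviation is assumed to be the population form, and the internal layout of train_statistics.joblib is not documented here, so the statistics are passed in as plain lists. All function names are illustrative.

```python
from collections import defaultdict
from pathlib import Path
from statistics import mean, pstdev


def load_split(directory, name):
    """Load one split's vectors and the matching person IDs.

    `name` is a hypothetical file stem; substitute the stems actually
    present in the downloaded record.
    """
    import joblib  # third-party; imported lazily so the helpers below run without it

    base = Path(directory)
    vectors = joblib.load(str(base / f"{name}.joblib"))
    ids = joblib.load(str(base / f"{name}_ids.joblib"))
    return vectors, ids


def group_by_person(vectors, ids):
    """Collect the vectors of each person, keeping their original order."""
    grouped = defaultdict(list)
    for person_id, vector in zip(ids, vectors):
        grouped[person_id].append(vector)
    return dict(grouped)


def column_stats(rows):
    """Per-attribute mean and (population) standard deviation of the rows."""
    columns = list(zip(*rows))
    return [mean(c) for c in columns], [pstdev(c) for c in columns]


def standardize(rows, means, stds):
    """Apply z-score normalization; assumes every standard deviation is nonzero."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def destandardize(rows, means, stds):
    """Invert standardize() to recover values in their original units."""
    return [[x * s + m for x, m, s in zip(row, means, stds)] for row in rows]


if __name__ == "__main__":
    # Toy numbers, not taken from the dataset.
    train = [[20.0, 1.0], [40.0, 3.0]]
    means, stds = column_stats(train)
    # Validation and test rows reuse the training statistics.
    print(standardize([[50.0, 2.0]], means, stds))
    print(group_by_person([[1], [2], [3]], ["a", "b", "a"]))
```

Reusing the training statistics for the validation and testing vectors keeps information from those splits out of the normalization, and `destandardize` is what would be needed to read a normalized attribute back in its original units.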

Created:

2022-12-09
