Composition of datasets used for model training and evaluation. “Random, Continental” refers to randomly sampled geographical locations across continental Africa, excluding the country of interest. “Random, National” refers to randomly sampled geographical locations in the country of interest. Satellite images containing active settlements were designated with “Positive” labels and all other images were designated with “Negative” labels. The training subset was used to learn model parameters, the validation subset was used to select model hyperparameters, and the test subset was used to perform a final evaluation. We used the Omo Valley dataset for all model experiments, while the Samburu County dataset was specifically used to assess model generalizability.
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Composition_of_datasets_used_for_model_training_and_evaluation_Random_Continental_refers_to_randomly_sampled_geographical_locations_across_continental_Africa_excluding_the_country_of_interest_Random_National_refers_to_randomly_sampled_geogr/28857917
Created:
2025-04-24



