
MIAP (More Inclusive Annotations for People)

OpenDataLab · Updated 2026-05-24 · Indexed 2024-05-09
Download link:
https://opendatalab.org.cn/OpenDataLab/MIAP
Description:
MIAP is a dataset created by collecting a new set of annotations on a subset of the Open Images dataset. The annotations include bounding boxes and attributes for all visible people in those images; the original Open Images annotations are not exhaustive, providing bounding boxes and attribute labels for only a subset of the classes in each image. MIAP is aimed at machine learning (ML) fairness research. It provides additional annotations for 100,000 images (70k from the training split, 30k from the validation/test splits) that contained at least one person bounding box in the original annotations. The additional annotations give exhaustive bounding boxes for every person in each image, and each person box is further labeled with attributes for fairness research: perceived gender presentation (predominantly feminine, predominantly masculine, or unknown) and perceived age range (young, middle, older, or unknown). This process added nearly 100,000 new boxes that were not annotated under the original labeling pipeline. The exhaustive annotations make it possible to study the fairness of models trained on partial annotations, as well as of the pipelines that produce such annotations.
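
As a rough illustration of how per-person boxes with perceived-attribute labels might be tallied for a fairness audit, here is a minimal Python sketch. The column names (`ImageID`, `GenderPresentation`, `AgePresentation`), the label strings, and the sample rows are all assumptions made up for this example; check the downloaded annotation files for the actual schema before relying on any of them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows. Column names and label values are assumptions
# for illustration only, not a confirmed copy of the MIAP file format.
SAMPLE_CSV = """ImageID,XMin,XMax,YMin,YMax,GenderPresentation,AgePresentation
img_001,0.10,0.40,0.20,0.90,Predominantly Feminine,Young
img_001,0.55,0.80,0.25,0.95,Predominantly Masculine,Middle
img_002,0.05,0.30,0.10,0.70,Unknown,Unknown
img_002,0.35,0.60,0.15,0.75,Predominantly Masculine,Older
"""


def attribute_counts(csv_text, column):
    """Count how many person boxes carry each value of `column`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


def boxes_per_image(csv_text):
    """Count person boxes per image, a simple check on exhaustiveness."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["ImageID"] for row in reader)


gender_counts = attribute_counts(SAMPLE_CSV, "GenderPresentation")
age_counts = attribute_counts(SAMPLE_CSV, "AgePresentation")
per_image = boxes_per_image(SAMPLE_CSV)

for label, count in sorted(gender_counts.items()):
    print(f"{label}: {count}")
```

Comparing such counts between the original partial annotations and the exhaustive ones is one simple way to see which groups the original pipeline tended to leave unboxed.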
Provided by:
OpenDataLab
Created:
2022-08-16
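Background overview
The MIAP dataset adds person bounding boxes and attribute annotations to a subset of Open Images to compensate for the incompleteness of the original annotations, with a focus on machine learning fairness research. It provides exhaustive person bounding boxes for 100,000 images, labeled with gender presentation and age range attributes, for analyzing the fairness of models trained on partial annotations.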
The summary above was collected and generated by 遇见数据集.