Social B(eye)as Dataset v2.0
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://doi.org/10.7910/DVN/4W5GOW
Description:
Researchers of the Web and social media rely extensively on image analysis tools to understand users' sharing behaviors and engagement with content at scale. However, it has become clear in recent years that these tools treat images depicting people from different social groups differently. Previously, we released the Social B(eye)as Dataset, consisting of machine- and human-generated descriptions of a controlled set of images of people without context. That resource allows researchers to systematically compare the behaviors of automated taggers and humans. This update adds a process that imposes the images of people onto backgrounds. The current release uses four stereotypically "feminine" and four stereotypically "masculine" contexts, which makes it possible to consider the possible influences of context on the gender inferences made by tagging algorithms. We also provide an updated typology of the tags used by the six proprietary taggers, as well as initial analyses. Our methodology for imposing semi-transparent images onto background images is publicly available, allowing others to repeat the process with other combinations of images for various research topics.
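The description mentions imposing semi-transparent images of people onto background images. As a rough illustration only, the sketch below uses the standard Porter-Duff "over" operator on RGBA pixels in plain Python. It is not the authors' released code: the function names `over` and `impose`, the nested-list image representation, and the assumption that "semi-transparent" means an image with an alpha channel are all hypothetical choices made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: an image is a list of rows of RGBA pixels.
Pixel = Tuple[int, int, int, int]  # (R, G, B, A), each in 0..255
Image = List[List[Pixel]]


def over(fg: Pixel, bg: Pixel) -> Pixel:
    """Composite one foreground pixel over one background pixel."""
    fa = fg[3] / 255.0
    ba = bg[3] / 255.0
    out_a = fa + ba * (1.0 - fa)
    if out_a == 0.0:
        # Both pixels fully transparent: nothing visible.
        return (0, 0, 0, 0)
    # Blend each color channel, weighting by the alpha values.
    rgb = [
        round((fg[i] * fa + bg[i] * ba * (1.0 - fa)) / out_a)
        for i in range(3)
    ]
    return (rgb[0], rgb[1], rgb[2], round(out_a * 255))


def impose(person: Image, background: Image, top: int, left: int) -> Image:
    """Return a copy of `background` with `person` composited at (top, left).

    Pixels of `person` that fall outside the background are ignored.
    """
    result = [row[:] for row in background]
    for y, row in enumerate(person):
        for x, px in enumerate(row):
            by, bx = top + y, left + x
            if 0 <= by < len(result) and 0 <= bx < len(result[by]):
                result[by][bx] = over(px, result[by][bx])
    return result
```

In practice an imaging library would likely replace the per-pixel loop; the point here is only that a transparent region of the person image leaves the background context visible, while opaque regions cover it.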
Created:
2020-09-08



