Crowdsourced Stereotype Pairs (CrowS-Pairs)
Source: arXiv. Updated: 2020-10-01. Indexed: 2024-06-21.
Download link:
https://github.com/nyu-mll/crows-pairs/
Description:
CrowS-Pairs (Crowdsourced Stereotype Pairs) is a challenge dataset created at New York University for measuring social biases in pretrained language models. It contains 1,508 examples covering nine types of bias, such as race, religion, and age. Each example is a pair of sentences: one is more stereotyping and the other is less stereotyping. The dataset focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. Because the examples were collected through crowdsourcing, they cover a diverse range of stereotype expressions and sentence structures. The dataset can be used to track progress toward building less biased models and to highlight the risks of deploying recent masked language models (MLMs) directly.
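As a rough illustration of how paired sentences like these can be used for evaluation, the sketch below computes the fraction of pairs for which a scoring function prefers the more stereotyping sentence. It is a minimal sketch under stated assumptions, not the dataset authors' evaluation code: the `SentencePair` class, its field names, `stereotype_preference_rate`, and `toy_score` are all hypothetical, the example sentences are neutral placeholders rather than dataset entries, and a real evaluation would replace `toy_score` with a language model's sentence score.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class SentencePair:
    """One paired example (field names are assumptions for illustration)."""
    sent_more: str  # the more stereotyping sentence
    sent_less: str  # the less stereotyping sentence
    bias_type: str  # one of the nine bias categories, e.g. "race" or "age"


def stereotype_preference_rate(
    pairs: Iterable[SentencePair],
    score: Callable[[str], float],
) -> float:
    """Return the fraction of pairs where `score` rates the more
    stereotyping sentence strictly higher than the less stereotyping one."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one sentence pair is required")
    preferred = sum(1 for p in pairs if score(p.sent_more) > score(p.sent_less))
    return preferred / len(pairs)


def toy_score(sentence: str) -> float:
    """Placeholder scorer (assumption): longer sentences score higher.
    A real evaluation would use a language model's sentence score instead."""
    return float(len(sentence))


# Neutral placeholder text, not taken from the dataset.
EXAMPLE_PAIRS = [
    SentencePair("placeholder sentence, longer variant", "placeholder sentence", "age"),
    SentencePair("short", "a somewhat longer placeholder", "race"),
]

rate = stereotype_preference_rate(EXAMPLE_PAIRS, toy_score)
print(f"Preference rate for the more stereotyping sentence: {rate:.2f}")
```

Under this framing, a scorer with no systematic preference would land near 0.5, and values well above that would suggest a preference for the more stereotyping sentences.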
Provider:
New York University
Created:
2020-10-01



