Crowdsourced Stereotype Pairs (CrowS-Pairs)

Name: Crowdsourced Stereotype Pairs (CrowS-Pairs)
Creator: New York University
Published: 2020-10-01 06:38:40
License: No description available

arXiv · Updated 2020-10-01 · Indexed 2024-06-21

Download link:

https://github.com/nyu-mll/crows-pairs/

Resource description:

CrowS-Pairs is a challenge dataset created by New York University for measuring social biases in pre-trained language models. It contains 1,508 examples covering nine types of bias, such as race, religion, and age. Each example is a pair of sentences: one is more stereotyping and the other is less stereotyping. The dataset focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. CrowS-Pairs was collected through crowdsourcing, which allows for a wider variety of stereotype expressions and sentence structures. The dataset can be used to evaluate progress toward building less biased models and to expose the risks of directly deploying recent masked language models (MLMs).
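The paired structure described above can be sketched in Python. This is an illustrative sketch only, not the dataset's official evaluation code: the names `SentencePair`, `stereotype_preference_rate`, and `score` are hypothetical, the toy pairs are placeholders rather than dataset entries, and the sentence-length scorer merely stands in for a real language-model score. It assumes that bias is summarized as the fraction of pairs in which the scorer rates the more stereotyping sentence higher.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class SentencePair:
    """One example: two sentences plus its bias category (hypothetical layout)."""
    more_stereotyping: str   # sentence that expresses the stereotype more strongly
    less_stereotyping: str   # paired sentence that is less stereotyping
    bias_type: str           # e.g. "race", "religion", "age"


def stereotype_preference_rate(
    pairs: Iterable[SentencePair],
    score: Callable[[str], float],
) -> float:
    """Return the fraction of pairs in which `score` rates the more
    stereotyping sentence strictly higher than the less stereotyping one."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    preferred = sum(
        score(p.more_stereotyping) > score(p.less_stereotyping) for p in pairs
    )
    return preferred / len(pairs)


# Placeholder pairs and a stand-in scorer (string length), for illustration only.
toy_pairs = [
    SentencePair("aaaa", "aa", "age"),
    SentencePair("b", "bbb", "race"),
]
rate = stereotype_preference_rate(toy_pairs, score=len)
```

In practice, `score` would be replaced by a function that returns a model's likelihood-style score for a sentence; under the assumption above, a scorer with no systematic preference would yield a rate close to 0.5.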

Provider:

New York University

Created:

2020-10-01
