nyu-mll/crows_pairs
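Dataset Overview
Name: CrowS-Pairs
Language: English (en)
License: Creative Commons Attribution-ShareAlike 4.0 International License (cc-by-sa-4.0)
Multilinguality: monolingual
Size: 1K<n<10K
Source: original data
Task category: text classification
Task ID: text scoring
Tags: bias evaluation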
Dataset Structure
Features:
- id: integer (int32)
- sent_more: string
- sent_less: string
- stereo_antistereo: class label (stereo, antistereo)
- bias_type: class label (race-color, socioeconomic, gender, disability, nationality, sexual-orientation, physical-appearance, religion, age)
- annotations: sequence of class labels (same classes as bias_type)
- anon_writer: string
- anon_annotators: sequence of strings
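The feature list above can be turned into a small decoding helper. This is a minimal sketch, not an official loader: it assumes that integer-encoded class labels follow the order in which the classes are listed on this card, and the names `decode_label`, `decode_row`, and `load_crows_pairs` are hypothetical helpers introduced here for illustration. `load_crows_pairs` additionally assumes the Hugging Face `datasets` package is installed and that the dataset ID at the top of this card resolves on the Hub; the arguments it needs may vary across library versions. The example row uses placeholder values, not real data.

```python
# Class names in the order listed on this card.
# Assumption: integer-encoded labels index into these lists in this order.
STEREO_DIRECTIONS = ["stereo", "antistereo"]
BIAS_TYPES = [
    "race-color", "socioeconomic", "gender", "disability", "nationality",
    "sexual-orientation", "physical-appearance", "religion", "age",
]


def decode_label(value, names):
    """Map an integer label (or a possibly nested list of them) to its name."""
    if isinstance(value, list):
        return [decode_label(v, names) for v in value]
    # Leave values that are already strings untouched.
    return names[value] if isinstance(value, int) else value


def decode_row(row):
    """Return a copy of `row` with the class-label fields decoded to strings."""
    out = dict(row)
    out["stereo_antistereo"] = decode_label(row["stereo_antistereo"], STEREO_DIRECTIONS)
    out["bias_type"] = decode_label(row["bias_type"], BIAS_TYPES)
    out["annotations"] = decode_label(row["annotations"], BIAS_TYPES)
    return out


def load_crows_pairs():
    """Load the test split (hypothetical helper; needs `datasets` and network access)."""
    from datasets import load_dataset  # third-party import, kept local on purpose
    return load_dataset("nyu-mll/crows_pairs", split="test")


# Placeholder row shaped like the feature list; values are illustrative only.
example = {
    "id": 0,
    "sent_more": "<placeholder sentence A>",
    "sent_less": "<placeholder sentence B>",
    "stereo_antistereo": 0,
    "bias_type": 2,
    "annotations": [2, 2, 0],
    "anon_writer": "w0",
    "anon_annotators": ["a0", "a1", "a2"],
}
decoded = decode_row(example)
print(decoded["stereo_antistereo"], decoded["bias_type"], decoded["annotations"])
```

If the `datasets` library is used, its own class-label metadata is the more reliable source for the integer-to-name mapping; the hard-coded lists here only mirror the card.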
Data splits:
- Test: 1508 examples, 419976 bytes
Download size: 437764 bytes
Dataset size: 419976 bytes
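Because each record pairs two sentences and the task is listed as text scoring, one plausible way to use the data is to score both sentences with a model and count how often sent_more receives the higher score. The sketch below illustrates that idea under stated assumptions: it assumes, from the field names alone, that sent_more is the sentence treated as more stereotypical; `score` is a hypothetical callable standing in for whatever sentence-scoring function a model provides; and `preference_rate` and `toy_score` are illustrative names, not part of the dataset or any library. It is not necessarily the metric used by the dataset authors.

```python
from typing import Callable, Dict, Iterable


def preference_rate(pairs: Iterable[Dict[str, str]],
                    score: Callable[[str], float]) -> float:
    """Fraction of pairs in which `score` rates sent_more above sent_less."""
    total = 0
    preferred = 0
    for pair in pairs:
        total += 1
        if score(pair["sent_more"]) > score(pair["sent_less"]):
            preferred += 1
    if total == 0:
        raise ValueError("no pairs given")
    return preferred / total


def toy_score(sentence: str) -> float:
    """Stand-in scorer (string length); a real scorer would query a model."""
    return float(len(sentence))


# Placeholder pairs; the strings are dummies, not dataset content.
toy_pairs = [
    {"sent_more": "xxxx", "sent_less": "xx"},
    {"sent_more": "x", "sent_less": "xxx"},
    {"sent_more": "xxx", "sent_less": "xx"},
]
rate = preference_rate(toy_pairs, toy_score)
print(f"sent_more preferred in {rate:.0%} of pairs")
```

Under this reading, a rate near 50% would mean the scorer shows no systematic preference between the two sentences of a pair, and grouping pairs by bias_type before calling `preference_rate` would give a per-category breakdown.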
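Dataset Creation
Licensing information: The dataset is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.
Source data: The dataset was created using prompts taken from the ROCStories corpora and the fiction portion of MNLI.
Contributors: Thanks to @patil-suraj for adding this dataset.
Citation information: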
@inproceedings{nangia-etal-2020-crows,
    title = "{C}row{S}-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models",
    author = "Nangia, Nikita and Vania, Clara and Bhalerao, Rasika and Bowman, Samuel R.",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.emnlp-main.154",
    doi = "10.18653/v1/2020.emnlp-main.154",
    pages = "1953--1967",
}




