GUS dataset
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/Ethical-Spectacle/fair-ly/tree/main/resources
Resource overview:
GUS is a dataset created for token-level, multi-label classification of bias in text. It focuses on identifying biased entities within token sequences. The dataset combines three categories of bias (generalizations, unfairness, and stereotypes) and aims to improve bias detection in natural language processing (NLP) by capturing nuanced, context-dependent bias. Because the task is multi-label, a single token can carry more than one of these labels at once.
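To make the task format concrete, here is a minimal sketch of token-level multi-label classification in plain Python. The label names, the example sentence, and the 0.5 threshold are illustrative assumptions, not taken from the dataset, whose actual tag scheme may differ. Each token gets a multi-hot vector with one slot per category, and predictions are decoded with an independent sigmoid per label rather than a softmax across labels.

```python
import math

# Hypothetical label names for the three categories named in the description;
# the dataset's real tag scheme may differ.
LABELS = ["GENERALIZATION", "UNFAIRNESS", "STEREOTYPE"]


def encode(token_tags):
    """Turn per-token sets of label names into multi-hot vectors."""
    return [[1 if label in tags else 0 for label in LABELS] for tags in token_tags]


def decode(logits, threshold=0.5):
    """Apply an independent sigmoid per label and keep labels at or above threshold."""
    decoded = []
    for row in logits:
        probs = [1.0 / (1.0 + math.exp(-z)) for z in row]
        decoded.append({LABELS[i] for i, p in enumerate(probs) if p >= threshold})
    return decoded


# Illustrative sentence and annotations (made up for this sketch).
tokens = ["All", "teenagers", "are", "lazy"]
tags = [
    {"GENERALIZATION", "STEREOTYPE"},
    {"GENERALIZATION", "STEREOTYPE"},
    {"STEREOTYPE"},
    {"STEREOTYPE"},
]

targets = encode(tags)
for token, vector in zip(tokens, targets):
    print(token, vector)
```

The independent sigmoid is what distinguishes this from ordinary single-label token tagging: a token can be assigned any subset of the labels, including none.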



