fairnlp/weat

Name: fairnlp/weat
Creator: fairnlp
Published: 2024-02-03 12:36:06
License: No description available

Hugging Face. Updated 2024-02-03; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/fairnlp/weat

Resource overview:

This dataset contains the source words of the original Word Embedding Association Test (WEAT) as described by Caliskan et al. (2016). It contains the word lists and attribute lists used to compute several WEAT scores for different embedding associations. For details on the methodology, please refer to the original paper. The dataset is contributed to Hugging Face as part of the WEAT implementation in the FairNLP `fairscore` library.
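As a rough illustration of what a WEAT score is, the sketch below computes the effect size and an exact permutation-test p-value in the form commonly given for the test: each target word w gets an association s(w, A, B), the mean cosine similarity to attribute set A minus the mean cosine similarity to attribute set B, and the effect size is the difference of the mean associations of target sets X and Y divided by the standard deviation over all target words. This is not the `fairscore` API; the function names and the two-dimensional toy vectors are hypothetical, and the use of the sample standard deviation is an assumption of this sketch.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """s(w, A, B): mean similarity of w to A minus mean similarity of w to B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def effect_size(X, Y, A, B):
    """Difference of the mean associations of X and Y, scaled by their spread."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    # Assumption: sample standard deviation; some implementations
    # use the population standard deviation instead.
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


def p_value(X, Y, A, B):
    """One-sided exact permutation test over all splits of the target words."""
    scores = [association(w, A, B) for w in X + Y]
    total = sum(scores)
    n_x = len(X)
    # Test statistic for a split: sum over the X part minus sum over the Y part.
    observed = 2 * sum(scores[:n_x]) - total
    splits = list(combinations(range(len(scores)), n_x))
    exceed = sum(
        1 for idx in splits if 2 * sum(scores[i] for i in idx) - total > observed
    )
    return exceed / len(splits)


# Toy 2-D "embeddings" (hypothetical, not taken from the dataset).
A = [(1.0, 0.0), (0.9, 0.1)]   # attribute set A
B = [(0.0, 1.0), (0.1, 0.9)]   # attribute set B
X = [(1.0, 0.2), (0.8, 0.1)]   # target set X, close to A
Y = [(0.2, 1.0), (0.1, 0.7)]   # target set Y, close to B

print("effect size:", effect_size(X, Y, A, B))
print("p-value:", p_value(X, Y, A, B))
```

With real data, the vectors would come from an embedding model looked up for each word in the dataset's target and attribute lists. Enumerating every split is only practical for small word sets; larger sets are usually handled by sampling random splits.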

Provider:

fairnlp

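Original information summary

Dataset Card: Word Embedding Association Test (WEAT)

Dataset Details

This dataset contains the source words of the original Word Embedding Association Test (WEAT) as described by Caliskan et al. (2016). It contains the word lists and attribute lists used to compute several WEAT scores for different embedding associations. For details on the methodology, please refer to the original paper. The dataset is contributed to Hugging Face as part of the WEAT implementation in the FairNLP `fairscore` library.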
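Since the dataset is hosted on Hugging Face, it can presumably be fetched with the `datasets` library. The sketch below is a minimal example under that assumption: the helper name `load_weat_configs` is hypothetical, and because this page does not state the configuration names or column layout, the code discovers the configurations at run time instead of hard-coding them.

```python
def load_weat_configs(repo_id="fairnlp/weat"):
    """Load every configuration of the dataset repo into a dict keyed by config name."""
    # Imported lazily so that defining this helper does not require the library.
    from datasets import get_dataset_config_names, load_dataset

    return {
        name: load_dataset(repo_id, name)
        for name in get_dataset_config_names(repo_id)
    }
```

Calling `load_weat_configs()` requires network access and an installed `datasets` package. Printing the returned dictionary shows the available splits and columns, which is the safest way to check the schema before relying on any field name.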

Dataset Sources

Paper: lcs.bath.ac.uk/~jjb/ftp/CaliskanSemantics-Arxiv.pdf

BibTeX:

@article{DBLP:journals/corr/IslamBN16,
  author     = {Aylin Caliskan Islam and Joanna J. Bryson and Arvind Narayanan},
  title      = {Semantics derived automatically from language corpora necessarily contain human biases},
  journal    = {CoRR},
  volume     = {abs/1608.07187},
  year       = {2016},
  url        = {http://arxiv.org/abs/1608.07187},
  eprinttype = {arXiv},
  eprint     = {1608.07187},
  timestamp  = {Sat, 23 Jan 2021 01:20:12 +0100},
  biburl     = {https://dblp.org/rec/journals/corr/IslamBN16.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}

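Collected summary

Dataset introduction

Background and challenges

Background overview

This dataset provides the source words and attribute lists for the Word Embedding Association Test (WEAT), used to compute WEAT scores for different embedding associations. It originates from a 2016 paper on human biases in the semantics learned from language corpora. The dataset is small, stored in parquet format, and in English.

The above summary was collected and generated by 遇见数据集.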
