MilaNLProc/a-tale-of-pronouns
Source: Hugging Face. Updated: 2023-11-18. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/MilaNLProc/a-tale-of-pronouns
Resource overview:
This dataset contains pre-computed feature attribution scores accompanying a paper on gender bias mitigation. It provides token-level integrated gradients attributions for Flan-T5-XXL and mT0-XXL on WinoMT examples translated into Spanish and German. The files are gzip-compressed and must be loaded with a specific version of the inseq library. The dataset was created by Giuseppe Attanasio, covers Spanish and German, and is licensed under Apache 2.0.
Provider:
MilaNLProc
Summary of original information
A Tale of Pronouns: Attributions on WinoMT
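Dataset details
- Dataset name: A Tale of Pronouns
- License: Apache 2.0
- Languages: Spanish, German
- Dataset description: This dataset contains pre-computed feature attribution scores accompanying the paper "A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation". It releases token-level integrated gradients attributions for Flan-T5-XXL and mT0-XXL for every WinoMT example translated into Spanish and German.
- Data sources:
  - GitHub repository: MilaNLProc/interpretability-mt-gender-bias
  - Paper: arXiv:2310.12127
- Data format: Files are stored gzip-compressed and contain a FeatureAttributionOutput, which can be loaded with the inseq library's load() function.
- inseq version requirements:
  - Flan-T5-XXL (En-Es): v0.5.0
  - Flan-T5-XXL (En-De): WIP
  - mT0-XXL: v0.4.0
- Dataset maintainer: Giuseppe Attanasio
- Contact: Giuseppe Attanasio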
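
The format and version notes above can be combined into a small loading helper. The following is only a sketch under stated assumptions, not the maintainers' own loading code. The helper names and the file name in the usage note are hypothetical. Applying the mT0-XXL pin to both language pairs is an assumption, since the listing gives a single version for that model. The call `inseq.FeatureAttributionOutput.load(path, decompress=True)` reflects the inseq API as commonly documented; check it against the pinned release before relying on it.

```python
from importlib import metadata
from typing import Optional


def required_inseq_version(model: str, lang: str) -> Optional[str]:
    """Return the inseq version listed for a model/target-language pair.

    Returns None where the listing says "WIP" (no pinned version yet).
    """
    model, lang = model.lower(), lang.lower()
    if lang not in ("es", "de"):
        raise ValueError(f"unsupported target language: {lang}")
    if model == "mt0-xxl":
        # Assumption: the single listed pin covers both En-Es and En-De.
        return "0.4.0"
    if model == "flan-t5-xxl":
        return "0.5.0" if lang == "es" else None  # En-De is listed as WIP
    raise ValueError(f"unknown model: {model}")


def installed_inseq_version() -> Optional[str]:
    """Return the installed inseq version, or None if inseq is not installed."""
    try:
        return metadata.version("inseq")
    except metadata.PackageNotFoundError:
        return None


def load_attributions(path: str, model: str, lang: str):
    """Load a gzipped FeatureAttributionOutput after checking the version pin."""
    required = required_inseq_version(model, lang)
    installed = installed_inseq_version()
    if installed is None:
        raise RuntimeError("inseq is not installed")
    if required is not None and installed != required:
        raise RuntimeError(
            f"inseq {required} is listed for {model} ({lang}), found {installed}"
        )
    import inseq  # imported lazily so the version check can run first

    # Assumption: decompress=True is how gzipped files are read in this release.
    return inseq.FeatureAttributionOutput.load(path, decompress=True)
```

A call might look like `load_attributions("attributions.json.gz", "flan-t5-xxl", "es")`, where the file name is a placeholder for one of the downloaded files. Checking the version before importing keeps the failure explicit when the installed release does not match the one listed for a given model.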



