MilaNLProc/a-tale-of-pronouns

Name: MilaNLProc/a-tale-of-pronouns
Creator: MilaNLProc
Published: 2023-11-18 10:22:12
License: Apache 2.0

Source: Hugging Face. Updated: 2023-11-18. Indexed: 2024-03-04.

Download link:

https://hf-mirror.com/datasets/MilaNLProc/a-tale-of-pronouns

Overview:

This dataset contains pre-computed feature attribution scores accompanying a paper on gender bias mitigation. Specifically, it provides Integrated Gradients token-level attributions, computed with the Flan-T5-XXL and mT0-XXL models, for WinoMT examples translated into Spanish and German. The files are gz-compressed and must be loaded with a specific version of the inseq library. The dataset was created by Giuseppe Attanasio, covers Spanish and German, and is licensed under Apache 2.0.
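
For readers unfamiliar with the attribution method named above, the following is a minimal sketch of Integrated Gradients on a toy function, not the pipeline used to build this dataset. It approximates the path integral of gradients from a baseline to the input with a midpoint Riemann sum. The function name `integrated_gradients`, the toy model F(x) = sum of x_i squared, and the all-zero baseline are illustrative assumptions. The released scores were computed on the translation models through inseq, which also handles the aggregation to token level.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_fn: Callable[[Sequence[float]], Sequence[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 50,
) -> List[float]:
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    grad_fn returns the gradient of the model output with respect to
    each input feature at a given point (hypothetical helper).
    """
    totals = [0.0] * len(x)
    for k in range(steps):
        # Interpolation coefficient at the midpoint of the k-th sub-interval.
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_fn(point)):
            totals[i] += g
    # Scale the averaged gradient by the input-baseline difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


def toy_grad(point: Sequence[float]) -> List[float]:
    """Gradient of the toy model F(x) = sum(x_i ** 2)."""
    return [2.0 * p for p in point]


x = [1.0, -2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
scores = integrated_gradients(toy_grad, x, baseline)

# Completeness property: attributions sum to F(x) - F(baseline).
total = sum(scores)
```

For this toy model each score is approximately the square of the corresponding input, and the scores sum to approximately F(x) - F(baseline), which is the completeness property that makes the method attractive for attribution.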

Provider:

MilaNLProc

Summary of original information

A Tale of Pronouns: Attributions on WinoMT

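Dataset details

Dataset name: A Tale of Pronouns
License: Apache 2.0
Languages: Spanish, German
Description: This dataset contains pre-computed feature attribution scores accompanying the paper "A Tale of Pronouns: Interpretability Informs Gender Bias Mitigation for Fairer Instruction-Tuned Machine Translation". It releases Integrated Gradients token-level attributions from Flan-T5-XXL and mT0-XXL for every WinoMT example translated into Spanish and German.
Sources:
- GitHub repository: MilaNLProc/interpretability-mt-gender-bias
- Paper: arXiv:2310.12127
Data format: Files are gz-compressed and contain a FeatureAttributionOutput object, which can be loaded with the inseq library's load() function.
Required inseq versions: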
- Flan-T5-XXL (En-Es): v0.5.0
- Flan-T5-XXL (En-De): WIP
- mT0-XXL: v0.4.0
Maintainer: Giuseppe Attanasio
Contact: Giuseppe Attanasio
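
The format and version notes above can be sketched as a small loading helper. This is an illustration under stated assumptions, not the maintainers' code: the helper names `required_inseq_version` and `load_attributions` and the file name in the usage comment are hypothetical, the single mT0-XXL version is assumed to apply to both language pairs, and the call `inseq.FeatureAttributionOutput.load(path, decompress=True)` should be checked against the inseq version actually installed.

```python
from importlib import metadata
from typing import Optional

# Versions copied from the dataset details; None marks the entry listed as WIP.
REQUIRED_INSEQ = {
    "flan-t5-xxl/en-es": "0.5.0",
    "flan-t5-xxl/en-de": None,
    "mt0-xxl": "0.4.0",  # one version is listed for the model as a whole
}


def required_inseq_version(model: str, pair: str) -> Optional[str]:
    """Return the inseq version listed for a model and language pair."""
    key = f"{model.lower()}/{pair.lower()}"
    if key in REQUIRED_INSEQ:
        return REQUIRED_INSEQ[key]
    # Fall back to a model-wide entry; raises KeyError for unknown models.
    return REQUIRED_INSEQ[model.lower()]


def load_attributions(path: str, model: str, pair: str):
    """Load a gz-compressed FeatureAttributionOutput after a version check."""
    expected = required_inseq_version(model, pair)
    installed = metadata.version("inseq")  # PackageNotFoundError if absent
    if expected is not None and installed != expected:
        raise RuntimeError(
            f"inseq {expected} is listed for {model} ({pair}), found {installed}"
        )
    import inseq  # imported lazily so the helpers above work without it

    # Assumed call: decompress=True for the gz-compressed files.
    return inseq.FeatureAttributionOutput.load(path, decompress=True)


# Usage (hypothetical local file name):
# out = load_attributions("attributions.json.gz", "Flan-T5-XXL", "En-Es")
```

Pinning the check to the listed version reflects the note that the files need a specific inseq release. For the En-De entry marked WIP the helper skips the check rather than guessing a version.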
