zouharvi/nmt-pe-effects
Hugging Face · Updated 2024-03-04 · Indexed 2024-06-22
Download link:
https://hf-mirror.com/datasets/zouharvi/nmt-pe-effects
Resource overview:
---
license: cc
configs:
- config_name: default
  data_files:
  - split: phase_1
    path: "phase_1.json"
  - split: phase_2
    path: "phase_2.json"
task_categories:
- translation
language:
- en
- cs
tags:
- post editing
- quality
size_categories:
- 1K<n<10K
---
# Neural Machine Translation Quality and Post-Editing Performance
This repository contains the data for an experiment relating NMT quality to post-editing effort, presented at EMNLP 2021 ([presentation recording](https://youtu.be/rCuoUbmJ5Uk)).
Please cite the following [paper](https://aclanthology.org/2021.emnlp-main.801/) if you use this work:
```
@inproceedings{zouhar2021neural,
title={Neural Machine Translation Quality and Post-Editing Performance},
author={Zouhar, Vil{\'e}m and Popel, Martin and Bojar, Ond{\v{r}}ej and Tamchyna, Ale{\v{s}}},
booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
pages={10204--10214},
year={2021},
url={https://aclanthology.org/2021.emnlp-main.801/}
}
```
You can [access the data on Hugging Face](https://huggingface.co/datasets/zouharvi/nmt-pe-effects):
```python3
from datasets import load_dataset

# Returns a DatasetDict with two splits: phase_1 and phase_2
data = load_dataset("zouharvi/nmt-pe-effects")
phase_1 = data["phase_1"]
phase_2 = data["phase_2"]
```
The first phase is the main experiment: it shows the effect of NMT quality on post-editing time.
The second phase estimates the quality of the post-edits produced in the first phase.
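The card does not list the fields stored in each record, so it can help to inspect both splits before analysing them. The following is a minimal sketch, not part of the original repository: `describe_splits` is a hypothetical helper name, and it only assumes that each split is a sequence of dict-like records, which is how `datasets` exposes rows.

```python
def describe_splits(splits):
    """Summarise each split as (number_of_records, sorted_field_names).

    `splits` is any mapping from split name to a sequence of dict-like
    records, for example the DatasetDict returned by load_dataset.
    No particular field names are assumed.
    """
    summary = {}
    for name, records in splits.items():
        count = len(records)
        # Read the field names from the first record, if there is one.
        fields = sorted(records[0].keys()) if count > 0 else []
        summary[name] = (count, fields)
    return summary
```

With the `data` object loaded above, `describe_splits(data)` returns one entry for `phase_1` and one for `phase_2`, which gives a quick check that both phases were downloaded and shows which fields each one carries.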
The [code is also public](https://github.com/ufal/nmt-pe-effects-2021).
This dataset supports the study of the relationship between neural machine translation quality and post-editing performance. It contains data from two phases: the first for observing the effect of NMT quality on post-editing time, and the second for estimating the quality of the first round of post-editing. The data covers English and Czech, is categorized under translation, and is tagged with post editing and quality.
Provider:
zouharvi
Original information summary
Dataset overview
License
- License type: cc
Configuration
- Default configuration
  - Data files:
    - Phase 1: phase_1.json
    - Phase 2: phase_2.json
Task categories
- Translation
Languages
- English (en)
- Czech (cs)
Tags
- Post editing
- Quality
Dataset size
- 1K<n<10K



