FBK-MT/MAGNETbenchmark4CALAMITA24
Source: Hugging Face. Updated 2024-10-29; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/FBK-MT/MAGNETbenchmark4CALAMITA24
Description:
---
license: cc-by-sa-4.0
task_categories:
- translation
language:
- en
- it
size_categories:
- 1K<n<10K
configs:
- config_name: OPEN
  data_files:
  - split: dev
    path:
    - data/OPEN/dev/dev.en-it.tsv
  - split: devtest
    path:
    - data/OPEN/devtest/devtest.en-it.tsv
  default: true
---
# OPEN evaluation sets for the MAGNET Challenge @ CALAMITA 2024
#### Last update: 16 Sept 2024
### Overview
This dataset is the OPEN portion of the benchmark for the MAGNET Challenge @ CALAMITA 2024.
It consists of two Italian/English parallel sets, dev and devtest, both taken from the FLORES+ collection (https://github.com/openlanguagedata/flores).
The other three files of the benchmark are not publicly distributed.
Please contact the organizers for information on them.
### Contents
The two sets are provided in .tsv files organized as follows:
```
MAGNETbenchmark4CALAMITA24/
|-- README.md
`-- data
    `-- OPEN
        |-- dev
        |   `-- dev.en-it.tsv
        `-- devtest
            `-- devtest.en-it.tsv
```
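As a minimal sketch of how one of these .tsv files might be read from a local copy of the repository, the helper below uses only the Python standard library. The card does not document the column layout or say whether a header row is present, so the function (the name `read_tsv_rows` is hypothetical) makes no assumption about either and simply returns each line as a list of tab-separated fields.

```python
import csv
from pathlib import Path


def read_tsv_rows(path):
    """Return every row of a tab-separated file as a list of string fields.

    Assumption: the file is UTF-8 text with one record per line. Nothing is
    assumed about the number of columns or about a header row.
    """
    with Path(path).open(encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps quotation marks inside sentences as literal text
        # instead of treating them as CSV field delimiters.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return [row for row in reader]
```

For example, `read_tsv_rows("MAGNETbenchmark4CALAMITA24/data/OPEN/dev/dev.en-it.tsv")` would read the dev set using the path shown in the tree above. Alternatively, with the Hugging Face `datasets` library installed, `load_dataset("FBK-MT/MAGNETbenchmark4CALAMITA24", "OPEN", split="dev")` should load the same split through the `OPEN` config declared in the card metadata.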
Provider:
FBK-MT



