speedcell4/wmt16
Source: Hugging Face. Updated: 2024-09-22. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/speedcell4/wmt16
Overview:
This dataset contains Romanian (ro) and English (en) translation pairs in three configurations: ro-en-idx (serialized integer representation), ro-en-moses (raw string representation), and ro-en-tok (tokenized string representation). Each configuration has its own features and data file paths, and each is split into training, validation, and test sets for model training, validation, and testing. The dataset's metadata lists the number of examples and byte size of each split, along with the download size and total dataset size.
Provided by:
speedcell4
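
As a minimal sketch of how one of the three configurations described above might be loaded, the snippet below uses the Hugging Face `datasets` library. The repository ID and configuration names come from this page; the helper name `load_config` is hypothetical, and the sketch assumes the `datasets` package is installed and the Hub is reachable.

```python
# Configuration names as listed in the overview above.
CONFIGS = ("ro-en-idx", "ro-en-moses", "ro-en-tok")
REPO_ID = "speedcell4/wmt16"


def load_config(name: str):
    """Load one configuration of the dataset (hypothetical helper)."""
    if name not in CONFIGS:
        raise ValueError(f"unknown configuration {name!r}; expected one of {CONFIGS}")
    # Imported lazily so the name check works even without the library installed.
    from datasets import load_dataset

    # Returns the splits of the chosen configuration; downloads on first use.
    return load_dataset(REPO_ID, name)
```

Calling `load_config("ro-en-moses")` and printing the result should show the available splits with their features and row counts. The column names are not stated on this page, so inspect the printed features before indexing into examples.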



