quickmt/quickmt-train.ru-en
Source: Hugging Face. Updated 2025-09-07. Indexed 2025-08-30.
Download link:
https://hf-mirror.com/datasets/quickmt/quickmt-train.ru-en
Description:
The quickmt ru-en Training Corpus is a collection of sub-datasets that have each been deduplicated and passed through basic filtering. The sub-datasets come from a range of sources and projects, including commoncrawl, news_commentary, tedtalks, ELRC, and OPUS, and span the period from 2012 to 2021. The corpus contains parallel Russian (ru) and English (en) text and is suitable for machine translation and other natural language processing tasks.
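The listing does not say how the deduplication and basic filtering were done. As an illustration only, here is a minimal sketch of what such a step might look like for parallel sentence pairs. The function name `dedup_and_filter` and the thresholds `max_len_ratio` and `max_chars` are hypothetical and are not taken from the quickmt project.

```python
from typing import Iterable, Iterator, Tuple

Pair = Tuple[str, str]  # (Russian source, English target)


def dedup_and_filter(
    pairs: Iterable[Pair],
    max_len_ratio: float = 3.0,  # hypothetical threshold, not from the dataset card
    max_chars: int = 1000,       # hypothetical threshold, not from the dataset card
) -> Iterator[Pair]:
    """Yield unique, plausibly aligned sentence pairs in input order."""
    seen = set()
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        # Drop pairs with an empty side.
        if not src or not tgt:
            continue
        # Drop overly long segments.
        if len(src) > max_chars or len(tgt) > max_chars:
            continue
        # Drop pairs whose character lengths differ too much,
        # a common heuristic for misaligned pairs.
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_len_ratio:
            continue
        # Exact-match deduplication on the (source, target) pair.
        key = (src, tgt)
        if key in seen:
            continue
        seen.add(key)
        yield src, tgt
```

A real pipeline would likely add language identification and normalization before deduplicating; this sketch covers only exact duplicates and simple length checks.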
Provider:
quickmt



