quickmt/quickmt-train.id-en
Source: Hugging Face. Updated: 2025-09-07. Indexed: 2025-08-30.
Download link:
https://hf-mirror.com/datasets/quickmt/quickmt-train.id-en
Description:
The `quickmt` id-en training corpus consists of multiple subsets, each deduplicated and given basic filtering, drawn from sources such as Statmt, Facebook, Neulab, ELRC, and OPUS. It covers several text types, including news commentary, Wikipedia, TED talks, and subtitles, and provides Indonesian-English parallel data for training machine translation and other natural language processing models.
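To give a sense of what "deduplicated and basic filtered" can mean for parallel data, here is a minimal sketch in Python. It is not the pipeline quickmt used, which this page does not describe. The function name `clean_pairs`, the length-ratio threshold, and the sample sentences are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, Tuple

Pair = Tuple[str, str]  # (Indonesian sentence, English sentence)


def clean_pairs(pairs: Iterable[Pair], max_len_ratio: float = 3.0) -> Iterator[Pair]:
    """Yield unique, plausibly aligned sentence pairs.

    Hypothetical helper: the filters and the threshold are illustrative
    assumptions, not the actual quickmt preprocessing.
    """
    seen = set()
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        # Basic filter: drop pairs with an empty side.
        if not src or not tgt:
            continue
        # Basic filter: drop pairs whose character lengths differ too much.
        longer, shorter = max(len(src), len(tgt)), min(len(src), len(tgt))
        if longer / shorter > max_len_ratio:
            continue
        # Deduplicate on the whitespace-trimmed pair.
        key = (src, tgt)
        if key in seen:
            continue
        seen.add(key)
        yield src, tgt


# Made-up sample data, not taken from the corpus.
sample = [
    ("Selamat pagi.", "Good morning."),
    ("Selamat pagi.", "Good morning."),  # exact duplicate
    ("Terima kasih.", ""),               # empty target side
    ("Ya.", "Yes, that is absolutely and completely correct."),  # length mismatch
]
cleaned = list(clean_pairs(sample))
print(cleaned)
```

Only the first pair survives here: the second is a duplicate, the third has an empty side, and the fourth exceeds the assumed length ratio.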
Provider:
quickmt



