Nexdata | Multilingual Parallel Corpus Data | 200 Million Pair | Text AI & ML Training Data | Natural Language Processing Data | Translation Data
Source: Datarade. Indexed on 2024-04-19.
Download link:
https://datarade.ai/data-products/nexdata-multilingual-parallel-corpus-data-200-million-pai-nexdata
Resource description:
1. Overview
Off-the-shelf parallel corpus data (Translation Data) covers many fields, including spoken language, travel, medical treatment, news, and finance. Data cleaning, desensitization, and quality inspection have been carried out.

2. Specifications
Storage format: TXT
Data content: Parallel Corpus Data
Data size: 200 million pairs
Language: 20 languages
Application scenario: machine translation
Accuracy rate: 90%

3. About Nexdata
Nexdata owns 200,000 hours of off-the-shelf speech recognition data, 800 TB of Annotated Imagery Data, and about 2 billion pieces of Natural Language Processing (NLP) Data. This ready-to-go Translation Data supports instant delivery and can quickly improve the accuracy of AI models. For more details, please visit us at https://www.nexdata.ai/naturalLanguage?source=Datarade
Provider:
Nexdata
Collected summary
Dataset introduction

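Background and challenges
Background overview
The dataset contains 200 million parallel corpus pairs across 20 languages, covering spoken language, travel, medical treatment, and other fields. It has been cleaned and quality-inspected, with a stated accuracy rate of 90%, and is intended for machine translation. It is provided by Nexdata, which supports fast delivery to help improve AI model accuracy.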
The content above was collected and summarized by 遇见数据集.



