
[SAMPLE] Nexdata | Multilingual Parallel Corpus Data | 200 Million Pair |Text AI & ML Training ...

Databricks, indexed 2024-05-09
Download link:
https://marketplace.databricks.com/details/a913472d-479b-4fc2-b079-b34056b7ad85/Nexdata_SAMPLE-Nexdata-Multilingual-Parallel-Corpus-Data-200-Million-Pair-Text-AI-&-ML-Training-
Resource description:
1. Overview
Off-the-shelf parallel corpus data (translation data) covering many fields, including spoken language, travel, medical treatment, news, and finance. Data cleaning, desensitization, and quality inspection have been carried out.

2. Specifications
Storage format: TXT
Data content: parallel corpus data
Data size: 200 million pairs
Languages: 20 languages
Application scenario: machine translation
Accuracy rate: 90%

3. About Nexdata
Nexdata owns 200,000 hours of off-the-shelf speech recognition data, 800 TB of annotated imagery data, and about 2 billion pieces of natural language processing (NLP) data. This ready-to-go translation data supports instant delivery and can quickly improve the accuracy of AI models. For more details, visit https://www.nexdata.ai/naturalLanguage?source=Datarade
Provider:
Nexdata