Nexdata/English-Russian_Parallel_Corpus_Data
Source: Hugging Face. Updated 2024-04-17; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/Nexdata/English-Russian_Parallel_Corpus_Data
Resource overview:
---
task_categories:
- translation
language:
- ru
- en
---
# Dataset Card for Nexdata/English-Russian_Parallel_Corpus_Data
## Description
An English-Russian parallel corpus containing 1,080,000 pairs in total. Political, pornographic, personal-information, and other sensitive vocabulary has been excluded. It can serve as a base corpus for text-based data analysis and is used in machine translation and other fields.
For more details, please refer to the link: https://www.nexdata.ai/datasets/1161?source=Huggingface
# Specifications
## Storage format
TXT
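
The card does not describe how the TXT files are laid out. As a minimal sketch, assuming one pair per line with the English and Russian sides separated by a tab (an assumption, not something the card confirms), the pairs could be read as shown here. The function name `iter_pairs` and the sample lines are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def iter_pairs(lines: Iterable[str], sep: str = "\t") -> Iterator[Tuple[str, str]]:
    """Yield (english, russian) pairs from lines of a parallel-corpus TXT file.

    Assumes one pair per line, with the two sides separated by `sep`.
    Lines that do not split into exactly two non-empty fields are skipped.
    """
    for raw in lines:
        fields = raw.rstrip("\r\n").split(sep)
        if len(fields) != 2:
            continue
        en, ru = (f.strip() for f in fields)
        if en and ru:
            yield en, ru


# Hypothetical sample lines, not taken from the dataset.
sample = [
    "Good morning.\tДоброе утро.\n",
    "malformed line without a separator\n",
    "Thank you.\tСпасибо.\n",
]
pairs = list(iter_pairs(sample))
```

With a real file, an open handle such as `open(path, encoding="utf-8")` can be passed in place of `sample`. If the actual files use a different layout (for example, separate files per language), the separator and parsing logic would need to change accordingly.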
## Data content
English-Russian Parallel Corpus Data
## Data size
1.08 million English-Russian parallel pairs
## Language
English, Russian
## Application scenario
machine translation
# Licensing Information
Commercial License
Provider:
Nexdata



