
wecover/OPUS_OpenSubtitles

Hugging Face | Updated 2024-01-31 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/wecover/OPUS_OpenSubtitles
Resource description:
---
configs:
- config_name: default
  data_files:
  - {split: train, path: '*/*/train.parquet'}
  - {split: valid, path: '*/*/valid.parquet'}
  - {split: test, path: '*/*/test.parquet'}
- config_name: af
  data_files:
  - {split: train, path: '*/*af*/train.parquet'}
  - {split: test, path: '*/*af*/test.parquet'}
  - {split: valid, path: '*/*af*/valid.parquet'}
- config_name: ar
  data_files:
  - {split: train, path: '*/*ar*/train.parquet'}
  - {split: test, path: '*/*ar*/test.parquet'}
  - {split: valid, path: '*/*ar*/valid.parquet'}
- config_name: bg
  data_files:
  - {split: train, path: '*/*bg*/train.parquet'}
  - {split: test, path: '*/*bg*/test.parquet'}
  - {split: valid, path: '*/*bg*/valid.parquet'}
- config_name: bn
  data_files:
  - {split: train, path: '*/*bn*/train.parquet'}
  - {split: test, path: '*/*bn*/test.parquet'}
  - {split: valid, path: '*/*bn*/valid.parquet'}
- config_name: bs
  data_files:
  - {split: train, path: '*/*bs*/train.parquet'}
  - {split: test, path: '*/*bs*/test.parquet'}
  - {split: valid, path: '*/*bs*/valid.parquet'}
- config_name: cs
  data_files:
  - {split: train, path: '*/*cs*/train.parquet'}
  - {split: test, path: '*/*cs*/test.parquet'}
  - {split: valid, path: '*/*cs*/valid.parquet'}
- config_name: da
  data_files:
  - {split: train, path: '*/*da*/train.parquet'}
  - {split: test, path: '*/*da*/test.parquet'}
  - {split: valid, path: '*/*da*/valid.parquet'}
- config_name: de
  data_files:
  - {split: train, path: '*/*de*/train.parquet'}
  - {split: test, path: '*/*de*/test.parquet'}
  - {split: valid, path: '*/*de*/valid.parquet'}
- config_name: el
  data_files:
  - {split: train, path: '*/*el*/train.parquet'}
  - {split: test, path: '*/*el*/test.parquet'}
  - {split: valid, path: '*/*el*/valid.parquet'}
- config_name: en
  data_files:
  - {split: train, path: '*/*en*/train.parquet'}
  - {split: test, path: '*/*en*/test.parquet'}
  - {split: valid, path: '*/*en*/valid.parquet'}
- config_name: eo
  data_files:
  - {split: train, path: '*/*eo*/train.parquet'}
  - {split: test, path: '*/*eo*/test.parquet'}
  - {split: valid, path: '*/*eo*/valid.parquet'}
- config_name: es
  data_files:
  - {split: train, path: '*/*es*/train.parquet'}
  - {split: test, path: '*/*es*/test.parquet'}
  - {split: valid, path: '*/*es*/valid.parquet'}
- config_name: et
  data_files:
  - {split: train, path: '*/*et*/train.parquet'}
  - {split: test, path: '*/*et*/test.parquet'}
  - {split: valid, path: '*/*et*/valid.parquet'}
- config_name: fa
  data_files:
  - {split: train, path: '*/*fa*/train.parquet'}
  - {split: test, path: '*/*fa*/test.parquet'}
  - {split: valid, path: '*/*fa*/valid.parquet'}
- config_name: fi
  data_files:
  - {split: train, path: '*/*fi*/train.parquet'}
  - {split: test, path: '*/*fi*/test.parquet'}
  - {split: valid, path: '*/*fi*/valid.parquet'}
- config_name: fr
  data_files:
  - {split: train, path: '*/*fr*/train.parquet'}
  - {split: test, path: '*/*fr*/test.parquet'}
  - {split: valid, path: '*/*fr*/valid.parquet'}
- config_name: he
  data_files:
  - {split: train, path: '*/*he*/train.parquet'}
  - {split: test, path: '*/*he*/test.parquet'}
  - {split: valid, path: '*/*he*/valid.parquet'}
- config_name: hi
  data_files:
  - {split: train, path: '*/*hi*/train.parquet'}
  - {split: test, path: '*/*hi*/test.parquet'}
  - {split: valid, path: '*/*hi*/valid.parquet'}
- config_name: hr
  data_files:
  - {split: train, path: '*/*hr*/train.parquet'}
  - {split: test, path: '*/*hr*/test.parquet'}
  - {split: valid, path: '*/*hr*/valid.parquet'}
- config_name: hu
  data_files:
  - {split: train, path: '*/*hu*/train.parquet'}
  - {split: test, path: '*/*hu*/test.parquet'}
  - {split: valid, path: '*/*hu*/valid.parquet'}
- config_name: id
  data_files:
  - {split: train, path: '*/*id*/train.parquet'}
  - {split: test, path: '*/*id*/test.parquet'}
  - {split: valid, path: '*/*id*/valid.parquet'}
- config_name: it
  data_files:
  - {split: train, path: '*/*it*/train.parquet'}
  - {split: test, path: '*/*it*/test.parquet'}
  - {split: valid, path: '*/*it*/valid.parquet'}
- config_name: ja
  data_files:
  - {split: train, path: '*/*ja*/train.parquet'}
  - {split: test, path: '*/*ja*/test.parquet'}
  - {split: valid, path: '*/*ja*/valid.parquet'}
- config_name: lt
  data_files:
  - {split: train, path: '*/*lt*/train.parquet'}
  - {split: test, path: '*/*lt*/test.parquet'}
  - {split: valid, path: '*/*lt*/valid.parquet'}
- config_name: mk
  data_files:
  - {split: train, path: '*/*mk*/train.parquet'}
  - {split: test, path: '*/*mk*/test.parquet'}
  - {split: valid, path: '*/*mk*/valid.parquet'}
- config_name: ml
  data_files:
  - {split: train, path: '*/*ml*/train.parquet'}
  - {split: test, path: '*/*ml*/test.parquet'}
  - {split: valid, path: '*/*ml*/valid.parquet'}
- config_name: ms
  data_files:
  - {split: train, path: '*/*ms*/train.parquet'}
  - {split: test, path: '*/*ms*/test.parquet'}
  - {split: valid, path: '*/*ms*/valid.parquet'}
- config_name: nl
  data_files:
  - {split: train, path: '*/*nl*/train.parquet'}
  - {split: test, path: '*/*nl*/test.parquet'}
  - {split: valid, path: '*/*nl*/valid.parquet'}
- config_name: no
  data_files:
  - {split: train, path: '*/*no*/train.parquet'}
  - {split: test, path: '*/*no*/test.parquet'}
  - {split: valid, path: '*/*no*/valid.parquet'}
- config_name: pl
  data_files:
  - {split: train, path: '*/*pl*/train.parquet'}
  - {split: test, path: '*/*pl*/test.parquet'}
  - {split: valid, path: '*/*pl*/valid.parquet'}
- config_name: pt
  data_files:
  - {split: train, path: '*/*pt*/train.parquet'}
  - {split: test, path: '*/*pt*/test.parquet'}
  - {split: valid, path: '*/*pt*/valid.parquet'}
- config_name: ro
  data_files:
  - {split: train, path: '*/*ro*/train.parquet'}
  - {split: test, path: '*/*ro*/test.parquet'}
  - {split: valid, path: '*/*ro*/valid.parquet'}
- config_name: ru
  data_files:
  - {split: train, path: '*/*ru*/train.parquet'}
  - {split: test, path: '*/*ru*/test.parquet'}
  - {split: valid, path: '*/*ru*/valid.parquet'}
- config_name: si
  data_files:
  - {split: train, path: '*/*si*/train.parquet'}
  - {split: test, path: '*/*si*/test.parquet'}
  - {split: valid, path: '*/*si*/valid.parquet'}
- config_name: sk
  data_files:
  - {split: train, path: '*/*sk*/train.parquet'}
  - {split: test, path: '*/*sk*/test.parquet'}
  - {split: valid, path: '*/*sk*/valid.parquet'}
- config_name: sl
  data_files:
  - {split: train, path: '*/*sl*/train.parquet'}
  - {split: test, path: '*/*sl*/test.parquet'}
  - {split: valid, path: '*/*sl*/valid.parquet'}
- config_name: sq
  data_files:
  - {split: train, path: '*/*sq*/train.parquet'}
  - {split: test, path: '*/*sq*/test.parquet'}
  - {split: valid, path: '*/*sq*/valid.parquet'}
- config_name: sr
  data_files:
  - {split: train, path: '*/*sr*/train.parquet'}
  - {split: test, path: '*/*sr*/test.parquet'}
  - {split: valid, path: '*/*sr*/valid.parquet'}
- config_name: sv
  data_files:
  - {split: train, path: '*/*sv*/train.parquet'}
  - {split: test, path: '*/*sv*/test.parquet'}
  - {split: valid, path: '*/*sv*/valid.parquet'}
- config_name: ta
  data_files:
  - {split: train, path: '*/*ta*/train.parquet'}
  - {split: test, path: '*/*ta*/test.parquet'}
  - {split: valid, path: '*/*ta*/valid.parquet'}
- config_name: th
  data_files:
  - {split: train, path: '*/*th*/train.parquet'}
  - {split: test, path: '*/*th*/test.parquet'}
  - {split: valid, path: '*/*th*/valid.parquet'}
- config_name: tr
  data_files:
  - {split: train, path: '*/*tr*/train.parquet'}
  - {split: test, path: '*/*tr*/test.parquet'}
  - {split: valid, path: '*/*tr*/valid.parquet'}
- config_name: uk
  data_files:
  - {split: train, path: '*/*uk*/train.parquet'}
  - {split: test, path: '*/*uk*/test.parquet'}
  - {split: valid, path: '*/*uk*/valid.parquet'}
- config_name: vi
  data_files:
  - {split: train, path: '*/*vi*/train.parquet'}
  - {split: test, path: '*/*vi*/test.parquet'}
  - {split: valid, path: '*/*vi*/valid.parquet'}
- config_name: br
  data_files:
  - {split: train, path: '*/*br*/train.parquet'}
  - {split: test, path: '*/*br*/test.parquet'}
  - {split: valid, path: '*/*br*/valid.parquet'}
- config_name: ca
  data_files:
  - {split: train, path: '*/*ca*/train.parquet'}
  - {split: test, path: '*/*ca*/test.parquet'}
  - {split: valid, path: '*/*ca*/valid.parquet'}
- config_name: eu
  data_files:
  - {split: train, path: '*/*eu*/train.parquet'}
  - {split: test, path: '*/*eu*/test.parquet'}
  - {split: valid, path: '*/*eu*/valid.parquet'}
- config_name: gl
  data_files:
  - {split: train, path: '*/*gl*/train.parquet'}
  - {split: test, path: '*/*gl*/test.parquet'}
  - {split: valid, path: '*/*gl*/valid.parquet'}
- config_name: hy
  data_files:
  - {split: train, path: '*/*hy*/train.parquet'}
  - {split: test, path: '*/*hy*/test.parquet'}
  - {split: valid, path: '*/*hy*/valid.parquet'}
- config_name: is
  data_files:
  - {split: train, path: '*/*is*/train.parquet'}
  - {split: test, path: '*/*is*/test.parquet'}
  - {split: valid, path: '*/*is*/valid.parquet'}
- config_name: ka
  data_files:
  - {split: train, path: '*/*ka*/train.parquet'}
  - {split: test, path: '*/*ka*/test.parquet'}
  - {split: valid, path: '*/*ka*/valid.parquet'}
- config_name: kk
  data_files:
  - {split: train, path: '*/*kk*/train.parquet'}
  - {split: test, path: '*/*kk*/test.parquet'}
  - {split: valid, path: '*/*kk*/valid.parquet'}
- config_name: ko
  data_files:
  - {split: train, path: '*/*ko*/train.parquet'}
  - {split: test, path: '*/*ko*/test.parquet'}
  - {split: valid, path: '*/*ko*/valid.parquet'}
- config_name: te
  data_files:
  - {split: train, path: '*/*te*/train.parquet'}
  - {split: test, path: '*/*te*/test.parquet'}
  - {split: valid, path: '*/*te*/valid.parquet'}
- config_name: tl
  data_files:
  - {split: train, path: '*/*tl*/train.parquet'}
  - {split: test, path: '*/*tl*/test.parquet'}
  - {split: valid, path: '*/*tl*/valid.parquet'}
- config_name: ur
  data_files:
  - {split: train, path: '*/*ur*/train.parquet'}
  - {split: test, path: '*/*ur*/test.parquet'}
  - {split: valid, path: '*/*ur*/valid.parquet'}
---
Provider:
wecover
Summary of original information

Dataset configurations

Default configuration

  • Training set: */*/train.parquet
  • Validation set: */*/valid.parquet
  • Test set: */*/test.parquet
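
The default patterns above select every file with the given name that sits exactly two directories below the repository root. A minimal sketch of that matching rule follows, assuming that `*` never crosses a `/` separator; the helper `matches` and the example paths are hypothetical and are not taken from the dataset itself.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, path: str) -> bool:
    """Segment-wise glob match in which '*' never crosses a '/' separator."""
    pattern_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pattern_parts) != len(path_parts):
        return False
    return all(
        fnmatchcase(segment, part)
        for part, segment in zip(pattern_parts, path_parts)
    )


# Hypothetical repository paths, invented for illustration only.
paths = [
    "v1/en-fr/train.parquet",
    "v1/en-fr/valid.parquet",
    "v1/train.parquet",              # only one directory deep
    "v1/extra/en-fr/train.parquet",  # three directories deep
]

train_files = [p for p in paths if matches("*/*/train.parquet", p)]
print(train_files)  # ['v1/en-fr/train.parquet']
```

Only the path with exactly two leading directories and the file name `train.parquet` is selected; the other three are rejected by depth or by file name.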

Language-specific configurations

  • Afrikaans (af)
    • Training set: */*af*/train.parquet
    • Validation set: */*af*/valid.parquet
    • Test set: */*af*/test.parquet
  • Arabic (ar)
    • Training set: */*ar*/train.parquet
    • Validation set: */*ar*/valid.parquet
    • Test set: */*ar*/test.parquet
  • Bulgarian (bg)
    • Training set: */*bg*/train.parquet
    • Validation set: */*bg*/valid.parquet
    • Test set: */*bg*/test.parquet
  • Bengali (bn)
    • Training set: */*bn*/train.parquet
    • Validation set: */*bn*/valid.parquet
    • Test set: */*bn*/test.parquet
  • Bosnian (bs)
    • Training set: */*bs*/train.parquet
    • Validation set: */*bs*/valid.parquet
    • Test set: */*bs*/test.parquet
  • Czech (cs)
    • Training set: */*cs*/train.parquet
    • Validation set: */*cs*/valid.parquet
    • Test set: */*cs*/test.parquet
  • Danish (da)
    • Training set: */*da*/train.parquet
    • Validation set: */*da*/valid.parquet
    • Test set: */*da*/test.parquet
  • German (de)
    • Training set: */*de*/train.parquet
    • Validation set: */*de*/valid.parquet
    • Test set: */*de*/test.parquet
  • Greek (el)
    • Training set: */*el*/train.parquet
    • Validation set: */*el*/valid.parquet
    • Test set: */*el*/test.parquet
  • English (en)
    • Training set: */*en*/train.parquet
    • Validation set: */*en*/valid.parquet
    • Test set: */*en*/test.parquet
  • Esperanto (eo)
    • Training set: */*eo*/train.parquet
    • Validation set: */*eo*/valid.parquet
    • Test set: */*eo*/test.parquet
  • Spanish (es)
    • Training set: */*es*/train.parquet
    • Validation set: */*es*/valid.parquet
    • Test set: */*es*/test.parquet
  • Estonian (et)
    • Training set: */*et*/train.parquet
    • Validation set: */*et*/valid.parquet
    • Test set: */*et*/test.parquet
  • Persian (fa)
    • Training set: */*fa*/train.parquet
    • Validation set: */*fa*/valid.parquet
    • Test set: */*fa*/test.parquet
  • Finnish (fi)
    • Training set: */*fi*/train.parquet
    • Validation set: */*fi*/valid.parquet
    • Test set: */*fi*/test.parquet
  • French (fr)
    • Training set: */*fr*/train.parquet
    • Validation set: */*fr*/valid.parquet
    • Test set: */*fr*/test.parquet
  • Hebrew (he)
    • Training set: */*he*/train.parquet
    • Validation set: */*he*/valid.parquet
    • Test set: */*he*/test.parquet
  • Hindi (hi)
    • Training set: */*hi*/train.parquet
    • Validation set: */*hi*/valid.parquet
    • Test set: */*hi*/test.parquet
  • Croatian (hr)
    • Training set: */*hr*/train.parquet
    • Validation set: */*hr*/valid.parquet
    • Test set: */*hr*/test.parquet
  • Hungarian (hu)
    • Training set: */*hu*/train.parquet
    • Validation set: */*hu*/valid.parquet
    • Test set: */*hu*/test.parquet
  • Indonesian (id)
    • Training set: */*id*/train.parquet
    • Validation set: */*id*/valid.parquet
    • Test set: */*id*/test.parquet
  • Italian (it)
    • Training set: */*it*/train.parquet
    • Validation set: */*it*/valid.parquet
    • Test set: */*it*/test.parquet
  • Japanese (ja)
    • Training set: */*ja*/train.parquet
    • Validation set: */*ja*/valid.parquet
    • Test set: */*ja*/test.parquet
  • Lithuanian (lt)
    • Training set: */*lt*/train.parquet
    • Validation set: */*lt*/valid.parquet
    • Test set: */*lt*/test.parquet
  • Macedonian (mk)
    • Training set: */*mk*/train.parquet
    • Validation set: */*mk*/valid.parquet
    • Test set: */*mk*/test.parquet
  • Malayalam (ml)
    • Training set: */*ml*/train.parquet
    • Validation set: */*ml*/valid.parquet
    • Test set: */*ml*/test.parquet
  • Malay (ms)
    • Training set: */*ms*/train.parquet
    • Validation set: */*ms*/valid.parquet
    • Test set: */*ms*/test.parquet
  • Dutch (nl)
    • Training set: */*nl*/train.parquet
    • Validation set: */*nl*/valid.parquet
    • Test set: */*nl*/test.parquet
  • Norwegian (no)
    • Training set: */*no*/train.parquet
    • Validation set: */*no*/valid.parquet
    • Test set: */*no*/test.parquet
  • Polish (pl)
    • Training set: */*pl*/train.parquet
    • Validation set: */*pl*/valid.parquet
    • Test set: */*pl*/test.parquet
  • Portuguese (pt)
    • Training set: */*pt*/train.parquet
    • Validation set: */*pt*/valid.parquet
    • Test set: */*pt*/test.parquet
  • Romanian (ro)
    • Training set: */*ro*/train.parquet
    • Validation set: */*ro*/valid.parquet
    • Test set: */*ro*/test.parquet
  • Russian (ru)
    • Training set: */*ru*/train.parquet
    • Validation set: */*ru*/valid.parquet
    • Test set: */*ru*/test.parquet
  • Sinhala (si)
    • Training set: */*si*/train.parquet
    • Validation set: */*si*/valid.parquet
    • Test set: */*si*/test.parquet
  • Slovak (sk)
    • Training set: */*sk*/train.parquet
    • Validation set: */*sk*/valid.parquet
    • Test set: */*sk*/test.parquet
  • Slovenian (sl)
    • Training set: */*sl*/train.parquet
    • Validation set: */*sl*/valid.parquet
    • Test set: */*sl*/test.parquet
  • Albanian (sq)
    • Training set: */*sq*/train.parquet
    • Validation set: */*sq*/valid.parquet
    • Test set: */*sq*/test.parquet
  • Serbian (sr)
    • Training set: */*sr*/train.parquet
    • Validation set: */*sr*/valid.parquet
    • Test set: */*sr*/test.parquet
  • Swedish (sv)
    • Training set: */*sv*/train.parquet
    • Validation set: */*sv*/valid.parquet
    • Test set: */*sv*/test.parquet
  • Tamil (ta)
    • Training set: */*ta*/train.parquet
    • Validation set: */*ta*/valid.parquet
    • Test set: */*ta*/test.parquet
  • Thai (th)
    • Training set: */*th*/train.parquet
    • Validation set: */*th*/valid.parquet
    • Test set: */*th*/test.parquet
  • Turkish (tr)
    • Training set: */*tr*/train.parquet
    • Validation set: */*tr*/valid.parquet
    • Test set: */*tr*/test.parquet
  • Ukrainian (uk)
    • Training set: */*uk*/train.parquet
    • Validation set: */*uk*/valid.parquet
    • Test set: */*uk*/test.parquet
  • Vietnamese (vi)
    • Training set: */*vi*/train.parquet
    • Validation set: */*vi*/valid.parquet
    • Test set: */*vi*/test.parquet
  • Breton (br)
    • Training set: */*br*/train.parquet
    • Validation set: */*br*/valid.parquet
    • Test set: */*br*/test.parquet
  • Catalan (ca)
    • Training set: */*ca*/train.parquet
    • Validation set: */*ca*/valid.parquet
    • Test set: */*ca*/test.parquet
  • Basque (eu)
    • Training set: */*eu*/train.parquet
    • Validation set: */*eu*/valid.parquet
    • Test set: */*eu*/test.parquet
  • Galician (gl)
    • Training set: */*gl*/train.parquet
    • Validation set: */*gl*/valid.parquet
    • Test set: */*gl*/test.parquet
  • Armenian (hy)
    • Training set: */*hy*/train.parquet
    • Validation set: */*hy*/valid.parquet
    • Test set: */*hy*/test.parquet
  • Icelandic (is)
    • Training set: */*is*/train.parquet
    • Validation set: */*is*/valid.parquet
    • Test set: */*is*/test.parquet
  • Georgian (ka)
    • Training set: */*ka*/train.parquet
    • Validation set: */*ka*/valid.parquet
    • Test set: */*ka*/test.parquet
  • Kazakh (kk)
    • Training set: */*kk*/train.parquet
    • Validation set: */*kk*/valid.parquet
    • Test set: */*kk*/test.parquet
  • Korean (ko)
    • Training set: */*ko*/train.parquet
    • Validation set: */*ko*/valid.parquet
    • Test set: */*ko*/test.parquet
  • Telugu (te)
    • Training set: */*te*/train.parquet
    • Validation set: */*te*/valid.parquet
    • Test set: */*te*/test.parquet
  • Tagalog (tl)
    • Training set: */*tl*/train.parquet
    • Validation set: */*tl*/valid.parquet
    • Test set: */*tl*/test.parquet
  • Urdu (ur)
    • Training set: */*ur*/train.parquet
    • Validation set: */*ur*/valid.parquet
    • Test set: */*ur*/test.parquet
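
Every language entry in the list above follows the same template: the language code is wrapped in wildcards in the second path segment, and the file name is the split name. The sketch below rebuilds those patterns from a language code; the function name `data_files_for` and the `SPLITS` constant are hypothetical helpers introduced only for illustration.

```python
# Split names as they appear in the file names listed above.
SPLITS = ("train", "valid", "test")


def data_files_for(lang=None):
    """Return a split -> glob pattern mapping.

    With no language code this yields the default patterns; with a code
    such as "af" it yields the language-specific patterns from the listing.
    """
    directory = "*" if lang is None else f"*{lang}*"
    return {split: f"*/{directory}/{split}.parquet" for split in SPLITS}


print(data_files_for())
print(data_files_for("af"))

# Possible use with the Hugging Face `datasets` library (not executed here,
# since it requires network access):
#   from datasets import load_dataset
#   ds = load_dataset("wecover/OPUS_OpenSubtitles", "af")
```

Because the code is wrapped in wildcards, a pattern such as `*/*en*/train.parquet` selects any second-level directory whose name contains `en`. If the directories are named by language pair (an assumption; the listing does not show directory names), one directory can be selected by two language configurations.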