
Jotschi/coco-karpathy-opus-de

Hugging Face | Updated 2024-03-02 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/Jotschi/coco-karpathy-opus-de
Resource overview:
---
language:
- de
license_name: cc-by-4.0
license_link: https://creativecommons.org/licenses/by/4.0/legalcode
tags:
- coco
- mscoco
- german
annotations_creators:
- machine-generated
pretty_name: MS COCO Karpathy in german
size_categories:
- n<650k
source_datasets:
- mscoco
task_categories:
- text-generation
- image-to-text
- text-to-image
---

# Dataset Card for MS COCO Karpathy in German language

This dataset contains captions that were machine translated using [opus-mt-en-de](https://huggingface.co/Helsinki-NLP/opus-mt-en-de).

## Dataset Details

### Dataset Description

- **Curated by:** {{ curators | default("[More Information Needed]", true)}}
- **Language(s) (NLP):** {{ language | default("[More Information Needed]", true)}}
- **License:** {{ license | default("[More Information Needed]", true)}}

### Dataset Sources

The processed [MS COCO datasets](https://cocodataset.org/#download) (Karpathy Split) in this repo are based on the following sources:

| Type       | MD5                              | URL                                                                                           |
|------------|----------------------------------|-----------------------------------------------------------------------------------------------|
| Train      | aa31ac474cf6250ebb81d18348a07ed8 | https://storage.googleapis.com/sfr-vision-language-research/datasets/coco_karpathy_train.json |
| Validation | b273847456ef5580e33713b1f7de52a0 | https://storage.googleapis.com/sfr-vision-language-research/datasets/coco_karpathy_val.json   |
| Test       | 3ff34b0ef2db02d01c37399f6a2a6cd1 | https://storage.googleapis.com/sfr-vision-language-research/datasets/coco_karpathy_test.json  |

MS COCO:

- **Download:** https://cocodataset.org/#download
- **Paper:** http://arxiv.org/abs/1405.0312

## Dataset Creation

This dataset was generated by processing the annotations via [opus-mt-en-de](https://huggingface.co/Helsinki-NLP/opus-mt-en-de).
Provided by:
Jotschi
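Summary of original information

Dataset Card for MS COCO Karpathy in German language

Dataset Details

Dataset Description

  • Language(s) (NLP): German
  • License: CC BY 4.0

Dataset Sources

The processed MS COCO datasets (Karpathy Split) in this repo are based on the following sources: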

Type MD5 URL
Train aa31ac474cf6250ebb81d18348a07ed8 https://storage.googleapis.com/sfr-vision-language-research/datasets/coco_karpathy_train.json
Validation b273847456ef5580e33713b1f7de52a0 https://storage.googleapis.com/sfr-vision-language-research/datasets/coco_karpathy_val.json
Test 3ff34b0ef2db02d01c37399f6a2a6cd1 https://storage.googleapis.com/sfr-vision-language-research/datasets/coco_karpathy_test.json
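
Since the dataset card publishes an MD5 checksum for each source file, a downloaded copy can be checked before use. The following is a minimal sketch, assuming the files were saved locally under their original file names. The helper names `md5_of_file` and `verify_download` are hypothetical and not part of the dataset repository.

```python
import hashlib
from pathlib import Path

# Checksums as listed in the dataset card, keyed by source file name.
EXPECTED_MD5 = {
    "coco_karpathy_train.json": "aa31ac474cf6250ebb81d18348a07ed8",
    "coco_karpathy_val.json": "b273847456ef5580e33713b1f7de52a0",
    "coco_karpathy_test.json": "3ff34b0ef2db02d01c37399f6a2a6cd1",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Compare a local file against the checksum listed for its file name.

    Raises KeyError if the file name is not one of the three listed sources.
    """
    expected = EXPECTED_MD5[Path(path).name]
    return md5_of_file(path) == expected
```

Calling `verify_download("coco_karpathy_val.json")` returns `True` only if the local file matches the listed digest. MD5 is adequate here for detecting corrupted or truncated downloads, not as protection against deliberate tampering.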

MS COCO:

  • Download: https://cocodataset.org/#download
  • Paper: http://arxiv.org/abs/1405.0312

Dataset Creation

This dataset was generated by processing the annotations via opus-mt-en-de.
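
The card does not include the processing script, so the following is only a sketch of how such a translation pass could look, not the author's actual method. It assumes each annotation record is a dictionary with a string `caption` field (records with other shapes would need extra handling), and it uses the `transformers` translation pipeline to load `Helsinki-NLP/opus-mt-en-de`. The function names `build_opus_translator` and `translate_annotations` are hypothetical.

```python
from typing import Callable, Dict, List

TranslateFn = Callable[[List[str]], List[str]]


def build_opus_translator(model_name: str = "Helsinki-NLP/opus-mt-en-de") -> TranslateFn:
    """Wrap the opus-mt-en-de model as a function from English to German texts.

    Imported lazily so the rest of the sketch works without transformers installed.
    """
    from transformers import pipeline

    pipe = pipeline("translation", model=model_name)

    def translate(texts: List[str]) -> List[str]:
        return [out["translation_text"] for out in pipe(texts)]

    return translate


def translate_annotations(
    annotations: List[Dict], translate: TranslateFn, batch_size: int = 32
) -> List[Dict]:
    """Return copies of the records with 'caption' replaced by its translation.

    Assumption: every record has a string 'caption'; other fields are kept as-is.
    """
    result = []
    for start in range(0, len(annotations), batch_size):
        batch = annotations[start:start + batch_size]
        translated = translate([record["caption"] for record in batch])
        for record, text in zip(batch, translated):
            result.append({**record, "caption": text})
    return result
```

A run over one split would then be `translate_annotations(records, build_opus_translator())`, where `records` is the parsed JSON list. Passing the translator in as a function keeps the batching logic testable without downloading the model.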
