romrawinjp/multilingual-coco
Source: Hugging Face | Updated: 2024-10-25 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/romrawinjp/multilingual-coco
Description:
This dataset is a multilingual version of the COCO dataset, containing image captions in ten languages: English, Thai, Russian, Japanese, Italian, German, Vietnamese, Chinese, Arabic, and Spanish. Each language's captions have their own source and translation method: the English captions come from the original COCO annotation files, while captions in the other languages were obtained through machine translation or human translation. The dataset splits follow Andrej Karpathy's split, and the dataset is intended for non-commercial and research use.
Provided by:
romrawinjp



