Obscure-Entropy/GBC10M_HU
Source: Hugging Face. Updated: 2024-08-14. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/Obscure-Entropy/GBC10M_HU
Description:
This dataset extends an existing image captioning dataset: it is enhanced for graph-based captioning (GBC) and augmented with Hungarian translations. It contains roughly 10 million captions split across 10 parquet files, so it can be downloaded in part or in full. Each record holds the image URL, the image itself, an English caption, and a Hungarian caption. The dataset is intended mainly for image-to-text tasks, in particular GBC and cross-lingual applications. Known limitations include the accuracy of the machine translations, the lack of explicit graph annotations, and reduced image quality.
Provider:
Obscure-Entropy
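
Because the data is split into parquet files, a partial download can be as simple as fetching only the first few shards. The following is a minimal sketch, not the provider's documented workflow. It assumes the repository is reachable on the Hugging Face Hub under the id `Obscure-Entropy/GBC10M_HU` and that the `huggingface_hub` package is installed. The helper names `pick_parquet_shards` and `download_partial` are hypothetical, and the actual shard file names are not stated in this listing, so they are discovered at run time rather than hard-coded.

```python
from typing import Iterable, List


def pick_parquet_shards(repo_files: Iterable[str], max_shards: int) -> List[str]:
    """Return up to `max_shards` parquet paths from a repo file listing, sorted by name."""
    if max_shards < 0:
        raise ValueError("max_shards must be non-negative")
    # Keep only parquet files; other repo files (README, etc.) are ignored.
    shards = sorted(f for f in repo_files if f.endswith(".parquet"))
    return shards[:max_shards]


def download_partial(
    repo_id: str = "Obscure-Entropy/GBC10M_HU",  # assumption: same id on the Hub
    max_shards: int = 1,
) -> List[str]:
    """Download the first `max_shards` parquet files and return their local paths.

    Needs network access and the `huggingface_hub` package.
    """
    # Imported lazily so pick_parquet_shards works without the package installed.
    from huggingface_hub import hf_hub_download, list_repo_files

    files = list_repo_files(repo_id, repo_type="dataset")
    return [
        hf_hub_download(repo_id=repo_id, filename=name, repo_type="dataset")
        for name in pick_parquet_shards(files, max_shards)
    ]
```

Calling `download_partial(max_shards=2)` would fetch two shards into the local Hub cache; the returned paths can then be opened with any parquet reader. The column names inside the files are not given here, so inspect the schema before relying on specific fields.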



