
gagan3012/multilingual-llava-bench

Hugging Face | Updated 2024-04-13 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/gagan3012/multilingual-llava-bench
Resource description:
---
dataset_info:
- config_name: arabic
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22342774.0
    num_examples: 60
  download_size: 9778993
  dataset_size: 22342774.0
- config_name: bengali
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22378020.0
    num_examples: 60
  download_size: 9783130
  dataset_size: 22378020.0
- config_name: chinese
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22317502.0
    num_examples: 60
  download_size: 9772605
  dataset_size: 22317502.0
- config_name: french
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22327391.0
    num_examples: 60
  download_size: 9773783
  dataset_size: 22327391.0
- config_name: hindi
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22385129.0
    num_examples: 60
  download_size: 9799590
  dataset_size: 22385129.0
- config_name: japanese
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22333016.0
    num_examples: 60
  download_size: 9782382
  dataset_size: 22333016.0
- config_name: russian
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22355236.0
    num_examples: 60
  download_size: 9792575
  dataset_size: 22355236.0
- config_name: spanish
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22326471.0
    num_examples: 60
  download_size: 9781970
  dataset_size: 22326471.0
- config_name: urdu
  features:
  - name: question_id
    dtype: int64
  - name: image
    dtype: image
  - name: question
    dtype: string
  - name: caption
    dtype: string
  - name: image_id
    dtype: string
  - name: gpt_answer
    dtype: string
  - name: category
    dtype: string
  splits:
  - name: train
    num_bytes: 22349409.0
    num_examples: 60
  download_size: 9784751
  dataset_size: 22349409.0
configs:
- config_name: arabic
  data_files:
  - split: train
    path: arabic/train-*
- config_name: bengali
  data_files:
  - split: train
    path: bengali/train-*
- config_name: chinese
  data_files:
  - split: train
    path: chinese/train-*
- config_name: french
  data_files:
  - split: train
    path: french/train-*
- config_name: hindi
  data_files:
  - split: train
    path: hindi/train-*
- config_name: japanese
  data_files:
  - split: train
    path: japanese/train-*
- config_name: russian
  data_files:
  - split: train
    path: russian/train-*
- config_name: spanish
  data_files:
  - split: train
    path: spanish/train-*
- config_name: urdu
  data_files:
  - split: train
    path: urdu/train-*
---
Provider:
gagan3012
Summary of original information

Dataset overview

Dataset configurations

Config name  Language
arabic       Arabic
bengali      Bengali
chinese      Chinese
french       French
hindi        Hindi
japanese     Japanese
russian      Russian
spanish      Spanish
urdu         Urdu
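
Each configuration above is selected by name when loading the dataset. The following is a minimal sketch, assuming the Hugging Face `datasets` package is installed and the repository is reachable. The helper name `load_language` is hypothetical and not part of the dataset card.

```python
# Repository ID and configuration names as listed in the dataset card.
REPO_ID = "gagan3012/multilingual-llava-bench"
CONFIGS = (
    "arabic", "bengali", "chinese", "french", "hindi",
    "japanese", "russian", "spanish", "urdu",
)


def load_language(config: str):
    """Return the train split for one language configuration.

    `load_language` is an illustrative helper, not an official API.
    """
    if config not in CONFIGS:
        raise ValueError(f"unknown config {config!r}; expected one of {CONFIGS}")
    # Imported lazily so the name check works even without the library.
    from datasets import load_dataset
    return load_dataset(REPO_ID, config, split="train")
```

Calling `load_language("arabic")` downloads data on first use, so it needs network access. An unknown name such as `"german"` is rejected before any download is attempted.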

Dataset features

Feature name  Data type  Description
question_id   int64      Question ID
image         image      Image
question      string     Question text
caption       string     Image caption
image_id      string     Image ID
gpt_answer    string     GPT-generated answer
category      string     Category
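
The feature list above can be turned into a simple record check. This is a sketch under stated assumptions: records are plain Python dicts, and the `image` column is only checked for presence because its decoded type depends on the loader. The name `schema_problems` is hypothetical.

```python
from typing import Any

# Expected Python types for the integer and string columns listed above.
# `image` is checked for presence only (its runtime type is an assumption).
EXPECTED_TYPES = {
    "question_id": int,
    "question": str,
    "caption": str,
    "image_id": str,
    "gpt_answer": str,
    "category": str,
}
REQUIRED_FIELDS = set(EXPECTED_TYPES) | {"image"}


def schema_problems(record: dict[str, Any]) -> list[str]:
    """Return human-readable schema problems; an empty list means none found."""
    problems = []
    # Report missing columns first, in alphabetical order.
    for field in sorted(REQUIRED_FIELDS - set(record)):
        problems.append(f"missing field: {field}")
    # Then report columns whose value has an unexpected type.
    for field, expected in EXPECTED_TYPES.items():
        if field in record and not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems
```

A check like this is cheap to run over all 60 rows of a configuration before starting an evaluation.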

Dataset splits

Config name  Split  Examples  Dataset size (bytes)  Download size (bytes)
arabic       train  60        22342774.0            9778993
bengali      train  60        22378020.0            9783130
chinese      train  60        22317502.0            9772605
french       train  60        22327391.0            9773783
hindi        train  60        22385129.0            9799590
japanese     train  60        22333016.0            9782382
russian      train  60        22355236.0            9792575
spanish      train  60        22326471.0            9781970
urdu         train  60        22349409.0            9784751
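
The per-configuration figures above can be aggregated to estimate the total footprint of the benchmark. The sketch below copies the values from the table, with sizes written as integers. The names `SPLITS` and `totals` are illustrative.

```python
# (config, num_examples, dataset_size_bytes, download_size_bytes)
SPLITS = [
    ("arabic", 60, 22342774, 9778993),
    ("bengali", 60, 22378020, 9783130),
    ("chinese", 60, 22317502, 9772605),
    ("french", 60, 22327391, 9773783),
    ("hindi", 60, 22385129, 9799590),
    ("japanese", 60, 22333016, 9782382),
    ("russian", 60, 22355236, 9792575),
    ("spanish", 60, 22326471, 9781970),
    ("urdu", 60, 22349409, 9784751),
]


def totals(rows):
    """Sum examples, dataset bytes, and download bytes across configs."""
    examples = sum(row[1] for row in rows)
    dataset_bytes = sum(row[2] for row in rows)
    download_bytes = sum(row[3] for row in rows)
    return examples, dataset_bytes, download_bytes


examples, dataset_bytes, download_bytes = totals(SPLITS)
# Ratio of bytes downloaded to bytes after preparation.
ratio = download_bytes / dataset_bytes
print(f"{examples} examples, {dataset_bytes / 1e6:.1f} MB prepared, "
      f"{download_bytes / 1e6:.1f} MB download, ratio {ratio:.2f}")
```

Every configuration has the same number of examples and a nearly identical size, which suggests that the configurations share one image set and differ mainly in the text fields. That reading is an inference from the numbers, not something the card states.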

Dataset file paths

Config name  Split  File path
arabic       train  arabic/train-*
bengali      train  bengali/train-*
chinese      train  chinese/train-*
french       train  french/train-*
hindi        train  hindi/train-*
japanese     train  japanese/train-*
russian      train  russian/train-*
spanish      train  spanish/train-*
urdu         train  urdu/train-*
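
The file paths above all follow one pattern, `<config>/<split>-*`. The sketch below builds that glob and tests a path against it with the standard-library `fnmatch` module. The function names and the example shard filename are hypothetical; the card only gives the glob, not the actual file names.

```python
from fnmatch import fnmatch


def shard_pattern(config: str, split: str = "train") -> str:
    """Build the glob used in the card, e.g. 'arabic/train-*'."""
    return f"{config}/{split}-*"


def belongs_to(path: str, config: str, split: str = "train") -> bool:
    """Return True if a repository-relative path matches the config's glob."""
    return fnmatch(path, shard_pattern(config, split))


# Hypothetical shard name, used only to illustrate the match.
example_path = "arabic/train-00000-of-00001.parquet"
print(belongs_to(example_path, "arabic"))
print(belongs_to(example_path, "urdu"))
```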