AllenNella/mmmb
Source: Hugging Face. Updated: 2025-01-28. Indexed: 2025-02-15.
Download link:
https://hf-mirror.com/datasets/AllenNella/mmmb
Resource description:
The Parrot dataset is a multilingual, multimodal dataset with two parts: the multimodal training datasets sharegpt-4v-ar, sharegpt-4v-pt, sharegpt-4v-ru, sharegpt-4v-tr, and sharegpt-4v-zh, and the multimodal evaluation benchmarks MMBench and MMMB. Because of the translation and review process, each translated language in MMBench has fewer samples than the original English dataset, although every language retains more than 95% of them. The images in the MMMB dataset are stored as base64 strings and must be decoded before use.
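Because the images are stored as base64 strings, a small decoding step is needed before they can be opened. The following is a minimal Python sketch using only the standard library. The function name `decode_base64_image` is hypothetical, and handling of an optional `data:` URI prefix is an assumption; the listing does not say which column holds the image or whether such a prefix is present.

```python
import base64
import binascii


def decode_base64_image(encoded: str) -> bytes:
    """Return the raw image bytes for a base64-encoded string."""
    # Assumption: some exports prepend a data-URI header such as
    # "data:image/png;base64,". Drop it if it is present.
    if encoded.startswith("data:") and "," in encoded:
        encoded = encoded.split(",", 1)[1]
    # Remove whitespace and line breaks so strict validation can be used.
    encoded = "".join(encoded.split())
    try:
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("input is not valid base64") from exc


# Round-trip check on a stand-in payload (the 8-byte PNG signature),
# not on real MMMB data.
sample = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
assert decode_base64_image(sample) == b"\x89PNG\r\n\x1a\n"
```

The returned bytes can be written to a file or wrapped in `io.BytesIO` and passed to an image library of your choice.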
Provided by:
AllenNella



