FreedomIntelligence/ALLaVA-4V-Arabic

Name: FreedomIntelligence/ALLaVA-4V-Arabic
Creator: FreedomIntelligence
Published: 2024-04-29 16:09:37
License: Apache-2.0

Source: Hugging Face. Updated 2024-04-29; indexed 2024-06-22.

Download link:

https://hf-mirror.com/datasets/FreedomIntelligence/ALLaVA-4V-Arabic

Resource description:

---
license: apache-2.0
task_categories:
  - question-answering
  - text-generation
language:
  - ar
tags:
  - GPT-4V
  - LVLM
  - Vision
  - Language
size_categories:
  - 1M<n<10M
configs:
  - config_name: allava_laion
    data_files:
      - split: caption
        path: "allava_laion/ALLaVA-Caption-LAION-4V_Arabic.json"
      # - split: instruct
      #   path: "allava_laion/ALLaVA-Instruct-LAION-4V_Chinese.json"
  - config_name: allava_vflan
    data_files:
      - split: caption
        path: "allava_vflan/ALLaVA-Caption-VFLAN-4V_Arabic.json"
      # - split: instruct
      #   path: "allava_vflan/ALLaVA-Instruct-VFLAN-4V_Chinese.json"
# - config_name: allava_laion_instruction
#   data_files: "allava_laion/ALLaVA-Instruct-LAION-4V.json"
# configs:
#   - config_name: default
#     data_files:
#       - split: allava_laion_caption
#         path: "allava_laion/ALLaVA-Caption-LAION-4V.json"
#       - split: allava_laion_instruction
#         path: "allava_laion/ALLaVA-Instruction-LAION-4V.json"
# configs:
#   - config_name: default
#   - data_files:
#     - split: allava_laion_caption
#     - path:
#       - "allava_laion/ALLaVA-Caption-LAION-4V.json"
#     - split: allava_laion_instruction
#     - path:
#       - "allava_laion/ALLaVA-Instruction-LAION-4V.json"
---

## ALLaVA-4V for Arabic

This is the Arabic version of the ALLaVA-4V data. We have translated the ALLaVA-4V data into Arabic through ChatGPT and instructed ChatGPT not to translate content related to OCR.

The original dataset can be found [here](https://huggingface.co/datasets/FreedomIntelligence/ALLaVA-4V), and the image data can be downloaded from [ALLaVA-4V](https://huggingface.co/datasets/FreedomIntelligence/ALLaVA-4V).

#### Citation

If you find our data useful, please consider citing our work! We are FreedomIntelligence from Shenzhen Research Institute of Big Data and The Chinese University of Hong Kong, Shenzhen.

```
@misc{chen2024allava,
      title={ALLaVA: Harnessing GPT4V-synthesized Data for A Lite Vision-Language Model},
      author={Guiming Hardy Chen and Shunian Chen and Ruifei Zhang and Junying Chen and Xiangbo Wu and Zhiyi Zhang and Zhihong Chen and Jianquan Li and Xiang Wan and Benyou Wang},
      year={2024},
      eprint={2402.11684},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

Provider:

FreedomIntelligence

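Summary of original information

Dataset overview

Basic information

License: Apache-2.0
Task categories:
- Question answering
- Text generation
Language: Arabic
Tags:
- GPT-4V
- LVLM
- Vision
- Language
Dataset size: 1M<n<10M

Configuration details

Config name: allava_laion
- Data files:
  - Split: caption
  - Path: "allava_laion/ALLaVA-Caption-LAION-4V_Arabic.json"
Config name: allava_vflan
- Data files:
  - Split: caption
  - Path: "allava_vflan/ALLaVA-Caption-VFLAN-4V_Arabic.json"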
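
The two configurations listed above can be addressed by name when loading the data. The following is a minimal sketch, not an official loader: the helper names `caption_file_url` and `load_captions` are hypothetical, the direct-file URL assumes the repository's default branch is `main`, and `load_captions` assumes the Hugging Face `datasets` package is installed and the Hub is reachable. The record schema is not described on this page, so the sketch does not inspect individual fields.

```python
from typing import Dict

REPO_ID = "FreedomIntelligence/ALLaVA-4V-Arabic"

# Config name -> JSON file behind its "caption" split, copied from the listing above.
CAPTION_FILES: Dict[str, str] = {
    "allava_laion": "allava_laion/ALLaVA-Caption-LAION-4V_Arabic.json",
    "allava_vflan": "allava_vflan/ALLaVA-Caption-VFLAN-4V_Arabic.json",
}


def _check_config(config_name: str) -> None:
    """Raise KeyError for a config name that is not in the listing."""
    if config_name not in CAPTION_FILES:
        raise KeyError(
            f"unknown config {config_name!r}; expected one of {sorted(CAPTION_FILES)}"
        )


def caption_file_url(config_name: str) -> str:
    """Build a direct URL to a config's caption file (hypothetical helper).

    Assumes the default branch is named "main".
    """
    _check_config(config_name)
    path = CAPTION_FILES[config_name]
    return f"https://huggingface.co/datasets/{REPO_ID}/resolve/main/{path}"


def load_captions(config_name: str):
    """Load the "caption" split of one config (hypothetical helper).

    Needs the `datasets` package and network access.
    """
    _check_config(config_name)
    # Imported lazily so the helpers above work without the package installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config_name, split="caption")


if __name__ == "__main__":
    for name in CAPTION_FILES:
        print(name, "->", caption_file_url(name))
```

Calling `load_captions("allava_vflan")` would download the Arabic captions only. The images themselves are not part of this repository and have to be fetched from the original ALLaVA-4V dataset.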

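Dataset description

Version: Arabic version
Translation notes: The ALLaVA-4V dataset was translated into Arabic with ChatGPT, which was instructed not to translate OCR-related content.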
