LLaVA-OneVision-Mid-Data

Name: LLaVA-OneVision-Mid-Data
Creator: maas
Published: 2025-12-03 17:06:38
License: No description provided

ModelScope community. Updated 2025-12-03; indexed 2024-10-12.

Download link:

https://modelscope.cn/datasets/lmms-lab/LLaVA-OneVision-Mid-Data

Resource description:

# Dataset Card for LLaVA-OneVision

> For unknown reasons, we are unable to convert a dataset of this size into the required HF format, so we directly upload the json files and image folders (compressed into tar.gz files).
> You can use the following link to download and decompress them directly.
> https://huggingface.co/datasets/lmms-lab/LLaVA-OneVision-Mid-Data/tree/main/evol_instruct

We provide the full details of the LLaVA-OneVision dataset. This dataset includes the data splits used in our mid-stage training. For more details, please check our [paper](https://arxiv.org/abs/2408.03326).

## Dataset Description

- **Curated by:** Bo Li, Kaichen Zhang, Hao Zhang, Yuanhan Zhang, Renrui Zhang, Feng Li, Dong Guo
- **Language(s) (NLP):** English, Chinese
- **License:** Apache License 2.0

## Dataset Sources

- **Dataset Collection:** We include a few subsets from the existing dataset collections [Cambrian](https://huggingface.co/datasets/nyu-visionx/Cambrian-10M), [Cauldron](https://huggingface.co/datasets/HuggingFaceM4/the_cauldron), and [UReader](https://arxiv.org/abs/2310.05126). Since we used only a few subsets from these datasets and applied our own cleaning and re-annotation process, we uploaded our processed versions to our own repository. We thank the authors for providing the original datasets.
- **Other Datasets:** For the remaining single-source datasets, such as AI2D and OKVQA, we cite and link the original sources in our paper.

## Uses

This dataset is used for training the LLaVA-OneVision mid-stage model. We only allow the use of this dataset for academic research and educational purposes. For data generated by OpenAI GPT-4, we recommend that users check the [OpenAI Usage Policy](https://openai.com/policies/usage-policies/).

## Dataset Structure

We explain the data composition for the mid-stage and final-stage training in the [**training doc**](https://github.com/LLaVA-VL/LLaVA-NeXT/tree/main/scripts/train#about-the-llava-onevision-data) in our repo.

## Citation

**BibTeX:**

[More Information Needed]

## Glossary

The dataset collection was carried out by all of the authors. We thank Feng Li and Renrui Zhang for providing the [LLaVA-M4-Instruct Data](https://huggingface.co/datasets/lmms-lab/M4-Instruct-Data) and Yuanhan for providing the video datasets (to be released separately later).

After the dataset collection, the cleaning and re-annotation process, including the final mixture of the dataset, was conducted by Bo Li with the great help of Kaichen Zhang.

## Dataset Card Authors

The dataset is curated by the following authors: Bo Li, Kaichen Zhang, Hao Zhang, Yuanhan Zhang, Renrui Zhang, Feng Li

## Dataset Card Contact

- [Bo Li](https://brianboli.com/): drluodian@gmail.com
- [Kaichen Zhang](https://www.linkedin.com/in/kaichen-zhang-014b17219/?originalSubdomain=sg)
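The note at the top of the card says the data ships as json files plus image folders compressed into tar.gz archives. As a minimal sketch of the decompression step, assuming the files have already been downloaded into a local directory, the archives could be unpacked and an annotation file loaded with the Python standard library alone. The function names `extract_archives` and `load_records`, and all directory and file names, are hypothetical; the internal layout of the archives and the schema of the json files are not described in the card, so the code makes no assumptions about them.

```python
import json
import tarfile
from pathlib import Path


def extract_archives(download_dir: str, output_dir: str) -> list[str]:
    """Extract every *.tar.gz archive found under download_dir into output_dir.

    Returns the file names of the archives that were extracted, in sorted order.
    Both directory arguments are placeholders chosen by the caller.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Use the safer "data" extraction filter when this Python version has it;
    # it rejects absolute paths and path-traversal members.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}

    extracted = []
    for archive in sorted(Path(download_dir).rglob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(out, **kwargs)
        extracted.append(archive.name)
    return extracted


def load_records(json_path: str):
    """Parse one annotation json file and return its content unchanged.

    The schema is not documented in the card, so nothing is assumed about it.
    """
    with open(json_path, encoding="utf-8") as f:
        return json.load(f)
```

A call such as `extract_archives("downloads", "images")` would unpack every archive found under the hypothetical `downloads` directory; the returned list makes it easy to check that no archive was skipped.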

Provider: maas
Created: 2024-10-07
