
lmms-lab/LLaVA-OneVision-Mid-Data

Source: Hugging Face | Updated: 2024-08-26 | Indexed: 2025-04-08
Download link:
https://hf-mirror.com/datasets/lmms-lab/LLaVA-OneVision-Mid-Data
Resource description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- zh
- en
tags:
- multimodal
pretty_name: llava-onevision-mid
size_categories:
- 1M<n<10M
---

# Dataset Card for LLaVA-OneVision

> For unknown reasons, we were unable to convert a dataset of this size into the required HF format, so we uploaded the JSON files and image folders (compressed as tar.gz files) directly.
> You can use the following link to download and decompress them.
> https://huggingface.co/datasets/lmms-lab/LLaVA-OneVision-Mid-Data/tree/main/evol_instruct

We provide the full details of the LLaVA-OneVision dataset. This dataset includes the data splits used in our mid-stage training. For more details, please check our [paper](https://arxiv.org/abs/2408.03326).

## Dataset Description

- **Curated by:** Bo Li, Kaichen Zhang, Hao Zhang, Yuanhan Zhang, Renrui Zhang, Feng Li, Dong Guo
- **Language(s) (NLP):** English, Chinese
- **License:** Apache License 2.0

## Dataset Sources

- **Dataset Collection:** We include a few subsets from the existing dataset collections [Cambrian](https://huggingface.co/datasets/nyu-visionx/Cambrian-10M), [Cauldron](https://huggingface.co/datasets/HuggingFaceM4/the_cauldron), and [UReader](https://arxiv.org/abs/2310.05126). Because we used only a few subsets from these collections and applied our own cleaning and re-annotation process, we uploaded our processed versions to our own repository. We thank the authors for providing the original datasets.
- **Other Datasets:** For the remaining single-source datasets, such as AI2D and OKVQA, we cite and link the original sources in our paper.

## Uses

This dataset is used to train the LLaVA-OneVision Mid-Stage model. We only allow the use of this dataset for academic research and educational purposes. For data generated with OpenAI GPT-4, we recommend that users check the [OpenAI Usage Policy](https://openai.com/policies/usage-policies/).

## Dataset Structure

We explain the data composition for the mid-stage and final-stage training in the [**training doc**](https://github.com/LLaVA-VL/LLaVA-NeXT/tree/main/scripts/train#about-the-llava-onevision-data) in our repo.

## Citation

**BibTeX:**

[More Information Needed]

## Glossary

The dataset collection was carried out by all of the authors. We thank Feng Li and Renrui Zhang for providing the [LLaVA-M4-Instruct Data](https://huggingface.co/datasets/lmms-lab/M4-Instruct-Data) and Yuanhan for providing the video datasets (to be released separately later).

After the dataset collection, the cleaning and re-annotation process, including the final mixture of the dataset, was carried out by Bo Li with the great help of Kaichen Zhang.

## Dataset Card Authors

The dataset is curated by the following authors:

Bo Li, Kaichen Zhang, Hao Zhang, Yuanhan Zhang, Renrui Zhang, Feng Li

## Dataset Card Contact

[Bo Li](https://brianboli.com/): drluodian@gmail.com

[Kaichen Zhang](https://www.linkedin.com/in/kaichen-zhang-014b17219/?originalSubdomain=sg)
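
The download-and-decompress step mentioned in the note at the top of the card could be scripted roughly as follows. This is a minimal sketch, not the authors' tooling. It assumes the `huggingface_hub` package is installed and that the image archives inside a subfolder such as `evol_instruct` end in `.tar.gz`. The helper names `download_subset` and `extract_archives`, and the local directory name in the usage comment, are hypothetical.

```python
import tarfile
from pathlib import Path


def download_subset(local_dir: str, subset: str = "evol_instruct") -> Path:
    """Download one subfolder of the dataset repo into `local_dir`."""
    # Imported lazily so extract_archives() works without huggingface_hub.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id="lmms-lab/LLaVA-OneVision-Mid-Data",
        repo_type="dataset",
        allow_patterns=[f"{subset}/*"],  # assumption: one folder per subset
        local_dir=local_dir,
    )
    return Path(path)


def extract_archives(root: Path) -> list[Path]:
    """Extract every *.tar.gz under `root` next to its archive.

    Returns the sorted list of archives that were found and extracted.
    """
    archives = sorted(root.rglob("*.tar.gz"))
    for archive in archives:
        with tarfile.open(archive, "r:gz") as tar:
            # Only extract archives from a source you trust.
            tar.extractall(path=archive.parent)
    return archives


# Example usage (requires network access):
#   root = download_subset("./llava_ov_mid")
#   extract_archives(root)
```

Restricting the download with `allow_patterns` avoids fetching the whole repository when only one subset is needed. The JSON annotation files are left untouched, since the card does not describe their schema.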
Provider:
lmms-lab