shi-labs/Eagle-1.8M
Source: Hugging Face. Updated 2024-08-29; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/shi-labs/Eagle-1.8M
Summary:
Eagle-1.8M is a multimodal dataset of 1.8 million samples, used primarily to train the Eagle model. The data is in English and is released under the cc-by-nc-nd-4.0 license. It combines several source datasets: LLaVA v1.5, DocVQA, synDog-EN, ChartQA, DVQA, AI2D, ShareGPT-4V, laion-GPT4V, LVIS-Instruct4V, LRV-Instruct, Geo170k, LLaVAR, Visual7W, and Open-Hermes 2.5. Together these cover multimodal conversation, document understanding, OCR, chart understanding, math, visual question answering, and text-only tasks. Use is restricted to academic research and education, and part of the data was generated with the OpenAI API.
Provider:
shi-labs



