dahwinsingularity/vision_m
Source: Hugging Face. Updated 2024-07-10. Indexed 2024-07-22.
Download link:
https://hf-mirror.com/datasets/dahwinsingularity/vision_m
Description:
InternVL2-8B is a multimodal large language model from the latest generation of the InternVL series, which features instruction-tuned models ranging from 2 billion to 108 billion parameters. The model surpasses most open-source models and is competitive with proprietary commercial models across a range of capabilities, including document and chart comprehension, infographics QA, scene text understanding and OCR, scientific and mathematical problem solving, and cultural understanding and integrated multimodal tasks. It is trained with an 8k context window on data that includes long texts, multiple images, and videos, which significantly improves its ability to handle these input types.
Provider:
dahwinsingularity



