DAMO-NLP-SG/multimodal_textbook
Source: Hugging Face | Updated: 2025-03-17 | Indexed: 2025-02-15
Download link:
https://hf-mirror.com/datasets/DAMO-NLP-SG/multimodal_textbook
Description:
Multimodal-Textbook-6.5M is a multimodal textbook corpus for vision-language pretraining. It contains 6.5M keyframes extracted from instructional videos and about 0.8B text tokens transcribed by automatic speech recognition (ASR). The material covers fundamental subjects such as mathematics, physics, and chemistry, and provides rich knowledge and coherent context for image-text alignment.
Provided by:
DAMO-NLP-SG



