中国移动生活百科问答数据集
收藏国家数据集管理服务平台2026-05-28 更新2026-04-29 收录
下载链接:
https://www.ndsms.cn/dataRetrieval/datasetDetail/?id=43ed08b144068775f00d5819b8dc66f3
下载链接
链接失效反馈官方服务:
资源简介:
本数据集为大规模生活百科类图文对话资源,包含101万条高质量样本。数据覆盖生活用品、科普知识、动物植物、食品菜品、工业用品、艺术作品、服饰美妆、家居清洁、家电使用、个人护理、手工制作、健康养生等丰富主题。每条数据由全局唯一标识、高清图像、场景分类标签、画面全局描述文本以及单轮与多轮对话组成。其中,标签支持按场景类别快速检索,描述文本全面概括画面主体与氛围,单轮对话针对图像内容进行直观问答,多轮对话则围绕细节展开层层递进的上下文关联追问与精准作答,完整还原真实交互逻辑。
This dataset is a large-scale image-text dialogue resource themed around life encyclopedias, containing 1.01 million high-quality samples.
It covers a wide range of themes including daily necessities, popular science knowledge, animals and plants, food and dishes, industrial supplies, artistic works, apparel and cosmetics, household cleaning, household appliance usage, personal care, handmade crafts, health and wellness, and more.
Each sample consists of a globally unique identifier, a high-definition image, a scene classification label, a global descriptive text of the image, as well as single-turn and multi-turn dialogues.
Specifically, the labels enable fast retrieval based on scene categories; the descriptive text comprehensively summarizes the main subjects and overall ambiance of the image; single-turn dialogues provide intuitive question-and-answer interactions targeting the image content; while multi-turn dialogues carry out stepwise, contextually relevant follow-up inquiries and precise responses focused on details, fully restoring real-world interactive logic.
提供机构:
中移九天人工智能科技(北京)有限公司
创建时间:
2026-04-25
搜集汇总
数据集介绍

背景与挑战
背景概述
该数据集是一个大规模生活百科类图文对话资源,包含101万条高质量样本,覆盖生活用品、科普知识等多个主题。每条数据由高清图像、场景标签、描述文本以及单轮与多轮对话组成,适用于多模态大模型的训练与评测,如视觉推理和对话交互任务。
以上内容由遇见数据集搜集并总结生成



