qualcomm-interactive-cooking-dataset
ModelScope community · updated 2026-01-02 · indexed 2025-12-06
Download link:
https://modelscope.cn/datasets/qualcomm/qualcomm-interactive-cooking-dataset
# Qualcomm Interactive Cooking Dataset
## Description
The Qualcomm Interactive Cooking Dataset is designed to evaluate the ability of multi-modal LLMs to provide step-by-step instructions, focusing on the cooking domain.
## Dataset Details
The Qualcomm Interactive Cooking Dataset includes step-by-step instructions and feedback pairs. The videos are from the [CaptainCook4D dataset](https://captaincook4d.github.io/captain-cook/) - licensed under Apache 2.0.
## Dataset Collection Process
The text annotations and timestamps were created manually.
## Data Format
The dataset can be loaded using the following command:
`load_dataset("qualcomm/qualcomm-interactive-cooking-dataset", <set>, split=<split>)`
Here, `set` is one of `{"main", "advanced_planning"}` and `split` is one of `{"train", "validation", "test"}`.
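As a minimal sketch of the loading command above, the wrapper below validates the two arguments before calling `load_dataset`. It assumes `load_dataset` refers to the Hugging Face `datasets` library and that the repository id resolves on the hub you are using; the names `REPO_ID`, `CONFIGS`, `SPLITS`, and `load_cooking_split` are illustrative and not part of the dataset.

```python
# Illustrative constants taken from the values listed in this README.
REPO_ID = "qualcomm/qualcomm-interactive-cooking-dataset"
CONFIGS = ("main", "advanced_planning")
SPLITS = ("train", "validation", "test")


def load_cooking_split(config: str = "main", split: str = "train"):
    """Load one set/split pair, rejecting names the README does not list."""
    if config not in CONFIGS:
        raise ValueError(f"unknown set {config!r}; expected one of {CONFIGS}")
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Assumption: the Hugging Face `datasets` package is installed.
    # Imported lazily so the argument checks work without it.
    from datasets import load_dataset

    return load_dataset(REPO_ID, config, split=split)


# Example call (requires network access and the `datasets` package):
# validation_set = load_cooking_split("main", "validation")
```

Checking the names up front gives a short error message for a misspelled set or split instead of a failure partway through a download.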
Each row of the dataset corresponds to a video from the CaptainCook4D dataset. The row contains the following columns:
1. `video_id`: The identifier of the video from CaptainCook4D.
2. `activity_name`: The name of the recipe in the video.
3. `output_texts`: The instruction and feedback messages.
4. `output_timestamps`: The timestamp corresponding to each instruction and feedback message.
5. `output_types`: The type of each message in `output_texts`, classified as described in Appendix B.
6. `output_actions`: The action that the user performed corresponding to each feedback message.
7. `remaining_plan`: The remaining step-by-step plan before each instruction or feedback message.
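The sketch below shows one way to walk through the messages of a single row. It assumes that `output_texts`, `output_timestamps`, and `output_types` are parallel lists of equal length, which the column descriptions suggest but do not state outright. The row is hypothetical: the recipe name, messages, timestamp values (written here as seconds), and type labels are made up for illustration, and the real type labels are the ones defined in Appendix B.

```python
from typing import Any, Dict, List, Tuple


def iter_messages(row: Dict[str, Any]) -> List[Tuple[Any, str, str]]:
    """Pair each message with its timestamp and type.

    Assumes the three columns are aligned index by index.
    """
    texts = row["output_texts"]
    stamps = row["output_timestamps"]
    types = row["output_types"]
    if not (len(texts) == len(stamps) == len(types)):
        raise ValueError("expected texts, timestamps and types of equal length")
    return list(zip(stamps, types, texts))


# Hypothetical row; none of these values come from the dataset.
example_row = {
    "video_id": "example_video",
    "activity_name": "Example Recipe",
    "output_texts": ["Crack an egg into the bowl.", "Good, now whisk it."],
    "output_timestamps": [0.0, 12.5],
    "output_types": ["instruction", "feedback"],
}

for stamp, kind, text in iter_messages(example_row):
    print(f"[{stamp}] {kind}: {text}")
```

`output_actions` and `remaining_plan` are left out because the README does not say whether they hold one entry per message or only one per feedback message.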
## Dataset license
This dataset is intended for research purposes only.
Data License Agreement - Research Use
## Dataset Citation Instructions
Please cite our paper if you use this dataset in your research.
```
@inproceedings{interactivecooking,
title = {Can Multi-Modal LLMs Provide Live Step-by-Step Task Guidance?},
author = {Apratim Bhattacharyya and Bicheng Xu and Sanjay Haresh and Reza Pourreza and Litian Liu and Sunny Panchal and Leonid Sigal and Roland Memisevic},
booktitle = {NeurIPS},
year = {2025}
}
```
## Qualcomm AI Research
At Qualcomm AI Research, we are advancing AI to make its core capabilities – perception, reasoning, and action – ubiquitous across devices. Our mission is to make breakthroughs in fundamental AI research and scale them across industries. By bringing together some of the best minds in the field, we’re pushing the boundaries of what’s possible and shaping the future of AI.
Qualcomm AI Research continues to invest in and support deep-learning research in computer vision. The publication of this dataset for use by the AI research community is one of our many initiatives.
Find out more about [Qualcomm AI Research](https://developer.qualcomm.com/forums/software/ai-research-datasets).
For questions or technical support, please contact us at research.datasets@qti.qualcomm.com.
*Qualcomm AI Research is an initiative of Qualcomm Technologies, Inc.*
Provider: maas
Created: 2025-11-28



