
stanford_kuka_multimodal_dataset

Source: ModelScope community (updated 2025-12-05, indexed 2025-02-08)
Download link:
https://modelscope.cn/datasets/lerobot/stanford_kuka_multimodal_dataset
Description:
This dataset was created using [LeRobot](https://github.com/huggingface/lerobot).

## Dataset Description

- **Homepage:** https://sites.google.com/view/visionandtouch
- **Paper:** https://arxiv.org/abs/1810.10191
- **License:** mit

## Dataset Structure

[meta/info.json](meta/info.json):

```json
{
  "codebase_version": "v2.0",
  "robot_type": "unknown",
  "total_episodes": 3000,
  "total_frames": 149985,
  "total_tasks": 1,
  "total_videos": 3000,
  "total_chunks": 3,
  "chunks_size": 1000,
  "fps": 20,
  "splits": {
    "train": "0:3000"
  },
  "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet",
  "video_path": "videos/chunk-{episode_chunk:03d}/{video_key}/episode_{episode_index:06d}.mp4",
  "features": {
    "observation.images.image": {
      "dtype": "video",
      "shape": [128, 128, 3],
      "names": ["height", "width", "channel"],
      "video_info": {
        "video.fps": 20.0,
        "video.codec": "av1",
        "video.pix_fmt": "yuv420p",
        "video.is_depth_map": false,
        "has_audio": false
      }
    },
    "language_instruction": {
      "dtype": "string",
      "shape": [1],
      "names": null
    },
    "observation.state": {
      "dtype": "float32",
      "shape": [7],
      "names": {
        "motors": ["motor_0", "motor_1", "motor_2", "motor_3", "motor_4", "motor_5", "motor_6"]
      }
    },
    "action": {
      "dtype": "float32",
      "shape": [7],
      "names": {
        "motors": ["motor_0", "motor_1", "motor_2", "motor_3", "motor_4", "motor_5", "motor_6"]
      }
    },
    "timestamp": {
      "dtype": "float32",
      "shape": [1],
      "names": null
    },
    "episode_index": {
      "dtype": "int64",
      "shape": [1],
      "names": null
    },
    "frame_index": {
      "dtype": "int64",
      "shape": [1],
      "names": null
    },
    "next.reward": {
      "dtype": "float32",
      "shape": [1],
      "names": null
    },
    "next.done": {
      "dtype": "bool",
      "shape": [1],
      "names": null
    },
    "index": {
      "dtype": "int64",
      "shape": [1],
      "names": null
    },
    "task_index": {
      "dtype": "int64",
      "shape": [1],
      "names": null
    }
  }
}
```

## Citation

**BibTeX:**

```bibtex
@inproceedings{lee2019icra,
  title={Making sense of vision and touch: Self-supervised learning of multimodal representations for contact-rich tasks},
  author={Lee, Michelle A and Zhu, Yuke and Srinivasan, Krishnan and Shah, Parth and Savarese, Silvio and Fei-Fei, Li and Garg, Animesh and Bohg, Jeannette},
  booktitle={2019 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2019},
  url={https://arxiv.org/abs/1810.10191}
}
```
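
The `data_path` and `video_path` templates in `meta/info.json` can be expanded with plain Python string formatting. The sketch below is illustrative only: `episode_paths` and `mean_episode_stats` are hypothetical helpers, not part of LeRobot, and it assumes that an episode's chunk number is `episode_index // chunks_size`, which the metadata suggests (3000 episodes, 3 chunks of 1000) but does not state outright.

```python
# Values copied from meta/info.json; only the fields used here are kept.
INFO = {
    "total_episodes": 3000,
    "total_frames": 149985,
    "chunks_size": 1000,
    "fps": 20,
    "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet",
    "video_path": "videos/chunk-{episode_chunk:03d}/{video_key}/episode_{episode_index:06d}.mp4",
}


def episode_paths(episode_index, video_key="observation.images.image"):
    """Hypothetical helper: return (parquet_path, video_path) for one episode.

    Assumption: the chunk number is episode_index // chunks_size.
    """
    if not 0 <= episode_index < INFO["total_episodes"]:
        raise IndexError(f"episode_index {episode_index} is out of range")
    chunk = episode_index // INFO["chunks_size"]
    data = INFO["data_path"].format(
        episode_chunk=chunk, episode_index=episode_index
    )
    video = INFO["video_path"].format(
        episode_chunk=chunk, video_key=video_key, episode_index=episode_index
    )
    return data, video


def mean_episode_stats():
    """Return (mean frames per episode, mean duration in seconds)."""
    frames = INFO["total_frames"] / INFO["total_episodes"]
    return frames, frames / INFO["fps"]


if __name__ == "__main__":
    print(episode_paths(1234))
    print(mean_episode_stats())
```

Under that assumption, episode 1234 resolves to `data/chunk-001/episode_001234.parquet` and `videos/chunk-001/observation.images.image/episode_001234.mp4`. The totals also imply that an average episode has just under 50 frames, or roughly 2.5 seconds at 20 fps.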

Provider:
maas
Created:
2025-02-06