RoboManip-Traj-Demo

Name: RoboManip-Traj-Demo
Creator: maas
Published: 2025-12-05 16:57:34
License: No description provided

ModelScope community; updated 2025-12-05; indexed 2025-12-06

Download link:

https://modelscope.cn/datasets/Codatta/RoboManip-Traj-Demo


Resource description:

# Codatta Robotic Manipulation Trajectory (Sample)

## Dataset Summary

This dataset contains high-quality annotated trajectories of robotic gripper manipulations. It is designed to train models for fine-grained control, trajectory prediction, and object interaction tasks.

Produced by **Codatta**, this dataset focuses on third-person views of robotic arms performing pick-and-place or manipulation tasks. Each sample includes the raw video, a visualization of the trajectory, and a rigorous JSON annotation of keyframes and coordinate points.

**Note:** This is a sample dataset containing **50 annotated examples**.

## Supported Tasks

* **Trajectory Prediction:** Predicting the path of a gripper based on visual context.
* **Keyframe Extraction:** Identifying critical moments in a manipulation task (e.g., contact, velocity change).
* **Robotic Control:** Imitation learning from human-demonstrated or teleoperated data.

## Dataset Structure

### Data Fields

* **`id`** (string): Unique identifier for the trajectory sequence.
* **`total_frames`** (int32): Total number of frames in the video sequence.
* **`video_path`** (string): Path to the source MP4 video file recording the manipulation action.
* **`trajectory_image`** (image): A JPEG preview showing the overlaid trajectory path or keyframe visualization.
* **`annotations`** (string): A JSON-formatted string containing the detailed coordinate data.
  * *Structure:* Contains lists of keyframes, timestamps, and the 5-point coordinates for the gripper in each annotated frame.

### Data Preview

*(Hugging Face's viewer will automatically render the `trajectory_image` here)*

## Annotation Standards

The data was annotated following a strict protocol to ensure precision and consistency.

### 1. Viewpoint Scope

* **Included:** Third-person views (fixed camera recording the robot).
* **Excluded:** First-person views (Eye-in-Hand) are explicitly excluded to ensure consistent coordinate mapping.

### 2. Keyframe Selection

Annotations are sparse rather than dense (not every frame is labeled). They focus on **Keyframes** that define the motion logic. A Keyframe is defined by the following events:

1. **Start Frame:** The gripper first appears on screen.
2. **End Frame:** The gripper leaves the screen.
3. **Velocity Change:** Frames where the direction of motion changes abruptly (marked at the point of minimum speed).
4. **State Change:** Frames where the gripper opens or closes.
5. **Contact:** The precise moment the gripper touches the object.

### 3. The 5-Point Annotation Method

For every annotated keyframe, the gripper is labeled with **5 specific coordinate points** to capture its pose and state accurately:

| Point ID | Description | Location Detail |
| :--- | :--- | :--- |
| **Point 1 & 2** | **Fingertips** | Center of the bottom edge of the gripper tips. |
| **Point 3 & 4** | **Gripper Ends** | The rearmost points of the closing area (indicating the finger direction). |
| **Point 5** | **Tiger's Mouth** | The center of the crossbeam (base of the gripper). |

### 4. Quality Control

* **Accuracy:** All datasets passed a rigorous quality assurance process with a minimum **95% accuracy rate**.
* **Occlusion Handling:** If the gripper is partially occluded, points are estimated based on object geometry. Sequences where the gripper is fully occluded or only shows a side profile without clear features are discarded.
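The 5-point scheme described under Annotation Standards lends itself to a few simple derived quantities. The following is a minimal sketch, not part of the dataset or its tooling. It assumes each point is a 2D pixel coordinate `(x, y)` in image space, which the dataset card does not state explicitly; the `GripperKeyframe` class, its field names, and the sample coordinates are all hypothetical.

```python
import math
from dataclasses import dataclass

Point = tuple[float, float]  # ASSUMPTION: (x, y) pixel coordinates


@dataclass(frozen=True)
class GripperKeyframe:
    """Hypothetical container for one annotated keyframe (not the dataset's schema)."""
    fingertip_a: Point  # Point 1
    fingertip_b: Point  # Point 2
    end_a: Point        # Point 3
    end_b: Point        # Point 4
    base: Point         # Point 5 ("Tiger's Mouth")

    def opening_width(self) -> float:
        """Pixel distance between the two fingertips."""
        return math.dist(self.fingertip_a, self.fingertip_b)

    def tip_center(self) -> Point:
        """Midpoint between the two fingertips."""
        ax, ay = self.fingertip_a
        bx, by = self.fingertip_b
        return ((ax + bx) / 2, (ay + by) / 2)

    def approach_angle_deg(self) -> float:
        """Angle of the base -> fingertip-midpoint vector, in degrees, in image axes."""
        cx, cy = self.tip_center()
        bx, by = self.base
        return math.degrees(math.atan2(cy - by, cx - bx))


# Made-up coordinates purely for illustration.
kf = GripperKeyframe(
    fingertip_a=(100.0, 200.0),
    fingertip_b=(140.0, 200.0),
    end_a=(100.0, 150.0),
    end_b=(140.0, 150.0),
    base=(120.0, 120.0),
)
print(kf.opening_width(), kf.tip_center(), kf.approach_angle_deg())
```

Tracking `opening_width()` across consecutive keyframes is one possible way to spot the open/close state changes listed under Keyframe Selection, though the annotation protocol itself relies on human labeling rather than this heuristic.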
## Usage Example

```python
from datasets import load_dataset
import json

# Load the dataset
ds = load_dataset("Codatta/robotic-manipulation-trajectory", split="train")

# Access a sample
sample = ds[0]

# View the image
print(f"Trajectory ID: {sample['id']}")
sample['trajectory_image'].show()

# Parse annotations
annotations = json.loads(sample['annotations'])
print(f"Keyframes count: {len(annotations)}")
```
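The dataset card does not spell out the exact JSON layout of the `annotations` field, so it can help to inspect the structure before writing a parser. The sketch below is schema-agnostic: the `summarize` helper is hypothetical, and the `example` string is a made-up stand-in for `sample['annotations']` whose keys (`keyframes`, `frame`, `points`) are assumptions, not the real field names.

```python
import json


def summarize(node, max_depth=3):
    """Return a compact description of the shape of a parsed JSON value."""
    if isinstance(node, dict):
        if max_depth == 0:
            return "dict"
        return {key: summarize(value, max_depth - 1) for key, value in node.items()}
    if isinstance(node, list):
        if not node or max_depth == 0:
            return f"list[{len(node)}]"
        # Describe only the first element, assuming the list is homogeneous.
        return [f"list[{len(node)}] of", summarize(node[0], max_depth - 1)]
    return type(node).__name__


# Made-up stand-in for sample['annotations']; the real layout may differ.
example = json.dumps({"keyframes": [{"frame": 0, "points": [[10, 20], [30, 40]]}]})
shape = summarize(json.loads(example))
print(shape)
```

Running `summarize(json.loads(sample['annotations']))` on a real sample should reveal the actual keys and nesting, which can then guide a proper parser.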


Provider: maas

Created: 2025-11-28
