Robot AI Autonomous Learning and Cross-Morphology Transfer Dataset (机器人AI自主学习与跨形态迁移数据集)

Name: Robot AI Autonomous Learning and Cross-Morphology Transfer Dataset (机器人AI自主学习与跨形态迁移数据集)
Creator: 北京海百川科技有限公司
License: Not specified

Indexed by the Beijing International Big Data Exchange (北京国际大数据交易所) on 2026-01-06

Download link:

https://webs.bjidex.com/sys-bsc-home/#/bscConsole/tradingMarket/detail?id=5905

Resource description:

I. Product Overview

This product is a high-standard, structured, multimodal dataset for frontier research on general-purpose robot intelligence. Its core design goal is to support skill transfer across robot morphologies and autonomous continual learning. It systematically integrates perception, control, and interaction data produced by multiple robot platforms across diverse tasks, then applies deep semantic decomposition and aligned annotation to that data. The provider positions it as a foundational data resource for next-generation robot models that understand physical common sense, abstract the essence of skills, and adapt quickly to new environments.

II. Core Features

1. True cross-morphology alignment: humanoid robots, autonomous mobile platforms, and dexterous manipulator arms are recorded and annotated synchronously, along multiple dimensions, under the same task semantics. The provider describes this as a first and as a basis for research on general skill representations.
2. Full-chain multimodal fusion: strictly time-synchronized visual, auditory, force, tactile, and proprioceptive motion data support end-to-end model training from perception to execution, as well as cross-modal joint learning.
3. Deep semantic decomposition: beyond the raw data, tasks are decomposed into skill primitives and annotated with their logical relations. This gives the data transferable, composable semantics, which the provider says greatly improves learning efficiency.

III. Detailed Data Content

The dataset consists of five logically related, mutually supporting core sub-libraries.

3.1 Cross-platform Multimodal Perception Sub-library

This sub-library contains synchronized perception data from different robot platforms. Its main contents are multi-view RGB-D video streams (1080p, 30 fps) captured by the provider's multimodal-perception-fusion human-machine interaction recognition system and its virtual-interaction facial capture camera; high-precision 3D point cloud sequences; and ambient audio streams. All perception data carry fine-grained object-level annotations, including 2D/3D bounding boxes, semantic segmentation masks, and scene relationship graphs.

3.2 Robot Proprioceptive State and Execution Sub-library

This sub-library records the robots' internal states and execution trajectories in full. The data sources are intelligent humanoid companion robots, intelligent autonomous mobile service robots, and manipulator arms equipped with a flexible-joint force-feedback and tactile-perception control platform. The data cover whole-body joint encoder readings and torques, base odometry, and IMU data, plus six-axis force/torque and tactile array data collected by the companion robot's tactile fusion system. Every execution sequence is segmented and labeled with its corresponding "skill primitives" (for example "precision grasping" or "force-controlled insertion").

3.3 Human-robot Interaction and Semantic Instruction Sub-library

This sub-library focuses on the high-level semantics and interaction context of tasks. It contains multimodal task instructions given as speech, text, and graphics; manually annotated task decomposition steps (in both natural language and a formal language); and robot emotional response parameters generated by the humanoid robot's expression generation and emotional expression control platform. It provides the mapping from human intent to robot skill chains and includes interaction quality assessment labels.

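
The cross-morphology alignment and skill-primitive labels described in 3.2 and 3.3 can be pictured with a small Python sketch. The listing publishes no schema, so the `SkillSegment` record, its field names, the platform labels, and the sample rows (including the `navigate_to_goal` primitive) are all hypothetical. The sketch only shows one way to find the primitives that were recorded on several platforms.

```python
from collections import defaultdict
from dataclasses import dataclass


# Hypothetical record layout: the listing does not document the real schema.
@dataclass(frozen=True)
class SkillSegment:
    platform: str   # illustrative labels: "humanoid", "mobile_base", "arm"
    task_id: str    # shared task identifier across platforms (assumed)
    primitive: str  # skill-primitive label, e.g. "precision_grasp"
    start_s: float  # segment start, seconds on the shared clock
    end_s: float    # segment end, seconds on the shared clock


def platforms_by_primitive(segments):
    """Map each skill primitive to the set of platforms that performed it."""
    index = defaultdict(set)
    for seg in segments:
        index[seg.primitive].add(seg.platform)
    return dict(index)


def shared_primitives(segments, platforms):
    """Return, sorted, the primitives observed on every listed platform."""
    required = set(platforms)
    index = platforms_by_primitive(segments)
    return sorted(p for p, seen in index.items() if required <= seen)


# Made-up sample rows, for illustration only.
segments = [
    SkillSegment("humanoid", "task-001", "precision_grasp", 0.0, 2.4),
    SkillSegment("arm", "task-001", "precision_grasp", 0.0, 1.9),
    SkillSegment("arm", "task-001", "force_controlled_insertion", 1.9, 4.2),
    SkillSegment("mobile_base", "task-002", "navigate_to_goal", 0.0, 8.0),
]

print(shared_primitives(segments, ["humanoid", "arm"]))  # ['precision_grasp']
```

Primitives that appear on more than one platform are the natural candidates for transfer experiments. Primitives seen on a single platform would need a different evaluation setup.
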
3.4 Simulation and Digital Twin Support Sub-library

To support simulation-to-reality transfer research, this sub-library provides digital assets that match the real-world data: high-fidelity URDF models of all robot platforms, 3D mesh models of scene objects together with their measured physical properties (mass, friction coefficient, and so on), and configuration files that accurately reproduce each task's initial state. These assets make large-scale pre-training and validation possible in a safe, controllable simulation environment.

3.5 Metadata and Benchmark Evaluation Sub-library

This sub-library keeps the dataset systematic and evaluable. It includes a global index and relationship graph describing the content and structure of the whole dataset, a benchmark task set designed specifically to evaluate "cross-morphology transfer ability" (20 tasks in 5 categories), and the corresponding standardized evaluation scripts and metric calculation tools.

IV. Technical Specifications

Total capacity: no less than 1.5 PB of raw data.
Task scale: more than 100,000 independently executable task sequences.
Annotation count: more than 8.7 million 2D image annotations and 2.1 million 3D instance annotations.
Collection period: data for the base version (V1.0) was collected from August 2024 to October 2025.
Synchronization accuracy: global timestamp synchronization error across all sensors is below 1 millisecond.
Update plan: quarterly incremental updates and an annual major version upgrade.
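
A back-of-the-envelope reading of the technical specifications can be sketched in Python. It assumes decimal petabytes and treats the quoted bounds ("no less than", "more than") as if they were exact values, so the results are rough orders of magnitude only. The constant and function names are illustrative and not part of the dataset.

```python
# Figures quoted in the listing; most are stated as bounds, not exact values.
RAW_BYTES = 1.5e15          # "no less than 1.5 PB" (decimal PB assumed)
TASK_SEQUENCES = 100_000    # "more than 100,000"
ANNOTATIONS_2D = 8_700_000  # "more than 8.7 million"
ANNOTATIONS_3D = 2_100_000  # "2.1 million"
FPS = 30                    # RGB-D video frame rate
SYNC_ERROR_S = 1e-3         # "less than 1 millisecond"


def summarize():
    """Derive rough per-sequence and timing figures from the quoted specs."""
    return {
        "gb_per_sequence": RAW_BYTES / TASK_SEQUENCES / 1e9,
        "2d_annotations_per_sequence": ANNOTATIONS_2D / TASK_SEQUENCES,
        "3d_annotations_per_sequence": ANNOTATIONS_3D / TASK_SEQUENCES,
        "frame_interval_ms": 1000 / FPS,
        "sync_error_fraction_of_frame": SYNC_ERROR_S * FPS,
        # August 2024 through October 2025, counting both end months.
        "collection_months": (2025 - 2024) * 12 + (10 - 8) + 1,
    }


for name, value in summarize().items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the averages are about 15 GB of raw data, 87 2D annotations, and 21 3D annotations per task sequence. The 1 ms synchronization bound is about 3% of the 33.3 ms interval between 30 fps video frames, and the V1.0 collection window spans 15 calendar months.
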

Provider:

北京海百川科技有限公司

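Collected summary

Dataset introduction

Background and challenges

Background overview

This is a high-standard, multimodal robotics research dataset intended to support cross-morphology skill transfer and autonomous continual learning. It comprises five interrelated sub-libraries covering perception data, robot state, human-robot interaction, simulation support, and evaluation benchmarks, and serves as a foundational data resource for next-generation robot intelligence models.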

The content above was collected and summarized by 遇见数据集.