
MuBlE | Robot Manipulation Dataset | Simulation Environment Dataset

Source: arXiv. Updated 2025-03-05. Indexed 2025-03-06.
Robot manipulation
Simulation environment
Download link:
https://github.com/michaal94/MuBlE
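Overview:
MuBlE is a simulation environment developed by Huawei Noah's Ark Lab, the Czech Technical University in Prague, and other institutions. Built on the MuJoCo physics engine and the Blender renderer, it is designed for long-horizon robot manipulation tasks. The dataset contains 12,000 scenes covering ten classes of single-step and multi-step tabletop manipulation tasks, and it supports closed-loop methods that combine vision, language, and manipulation. It can generate high-quality multimodal data and includes a new benchmark, SHOP-VRB2, made up of multi-step reasoning scenarios that require both visual and physical measurements.
Provided by:
Huawei Noah's Ark Lab
Created:
2025-03-05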
AI-collected summary
Dataset Introduction
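Construction
MuBlE is built on the robosuite framework and combines the MuJoCo physics engine with the Blender renderer. MuJoCo provides realistic physics simulation so that interactions between objects obey physical laws, while Blender produces high-quality images that give visual recognition realistic input. MuBlE supports multimodal data generation: scene and instruction synthesis, ground-truth scene graphs of object attributes, task-completion evaluation, and a set of primitive action controllers that can measure physical object properties such as weight and stiffness. In addition, MuBlE's efficient loop of physics simulation and rendering at action keyframes makes it easier to integrate closed-loop reasoning methods.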
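To make the idea of a ground-truth scene graph more concrete, the following is a minimal sketch of how visual and physical object attributes might be stored side by side. It is not MuBlE's actual data format: the class names `SceneObject` and `SceneGraph`, every field name, and the example objects are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SceneObject:
    """One node of the scene graph (all field names are hypothetical)."""
    name: str
    color: str          # visual attribute, recoverable from a rendered image
    weight_kg: float    # physical attribute, only measurable by interaction
    stiffness: str      # physical attribute, e.g. "soft" or "rigid"


@dataclass
class SceneGraph:
    """Objects plus (subject, predicate, target) relations between them."""
    objects: Dict[str, SceneObject] = field(default_factory=dict)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_object(self, obj: SceneObject) -> None:
        self.objects[obj.name] = obj

    def relate(self, subject: str, predicate: str, target: str) -> None:
        # Reject relations that mention objects missing from the graph.
        for name in (subject, target):
            if name not in self.objects:
                raise KeyError(f"unknown object: {name}")
        self.relations.append((subject, predicate, target))

    def visual_view(self) -> Dict[str, Dict[str, str]]:
        """Attributes a vision-only model could read; physical ones are hidden."""
        return {n: {"color": o.color} for n, o in self.objects.items()}


# Hypothetical two-object tabletop scene.
graph = SceneGraph()
graph.add_object(SceneObject("mug", "blue", 0.31, "rigid"))
graph.add_object(SceneObject("sponge", "yellow", 0.02, "soft"))
graph.relate("sponge", "left_of", "mug")
```

Separating `visual_view()` from the full record mirrors the distinction drawn above: some attributes can be read from an image, while weight and stiffness have to be obtained through a measuring action.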
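Usage
Using MuBlE involves scene generation, instruction generation, action planning, and execution. First, the scene generator creates varied scenes with randomly placed objects. Next, the instruction generator produces natural-language instructions for each scene. In the action-planning stage, MuBlE interacts with the reasoning method by sending it visual output and receiving primitive actions as input. Finally, in the physics loop, the MuJoCo engine computes physical interactions from the control signals and produces observations such as end-effector position and orientation, gripper state, and weight measurements. MuBlE can be used to train and evaluate robot manipulation methods, and it supports closed-loop reasoning and task planning, which makes it a useful resource for research on robot manipulation.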
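The interaction between the planning stage and the physics loop can be sketched as two nested loops. The code below is a toy stand-in, not the MuBlE API: `ToySim`, `Observation`, `run_primitive`, `run_episode`, and `weigh_once` are hypothetical names, and the "physics" is reduced to bookkeeping so that the control flow stays visible.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Observation:
    ee_position: Tuple[float, float, float]  # end-effector position
    gripper_closed: bool                     # gripper state
    weight_kg: float                         # 0.0 when nothing is held


class ToySim:
    """Hypothetical simulator stand-in; real physics is replaced by bookkeeping."""

    def __init__(self, object_weights):
        self.object_weights = dict(object_weights)
        self.held: Optional[str] = None
        self.ee = (0.0, 0.0, 0.5)
        self.physics_steps = 0

    def physics_step(self, control) -> None:
        # One tick of the inner control-physics loop.
        self.physics_steps += 1
        kind, arg = control
        if kind == "grasp":
            self.held = arg
        elif kind == "release":
            self.held = None
        elif kind == "move":
            self.ee = arg

    def observe(self) -> Observation:
        weight = self.object_weights[self.held] if self.held else 0.0
        return Observation(self.ee, self.held is not None, weight)


def run_primitive(sim, control, ticks=10):
    """Inner loop: apply one control signal for several physics ticks."""
    for _ in range(ticks):
        sim.physics_step(control)
    return sim.observe()  # keyframe observation handed back to the planner


def run_episode(sim, policy, max_keyframes=20):
    """Outer loop: the policy picks each primitive from the latest observation."""
    obs = sim.observe()
    log = [obs]
    for _ in range(max_keyframes):
        control = policy(obs)
        if control is None:  # the policy signals that the task is finished
            break
        obs = run_primitive(sim, control)
        log.append(obs)
    return log


def weigh_once(target):
    """Hypothetical policy: grasp the target, read its weight, then release."""
    state = {"weight": None}

    def policy(obs):
        if obs.gripper_closed:
            state["weight"] = obs.weight_kg  # physical measurement
            return ("release", None)
        if state["weight"] is None:
            return ("grasp", target)
        return None

    return policy, state
```

Because the policy sees a fresh observation after every primitive, it can react to measurements it could not have known in advance, which is what distinguishes a closed-loop method from executing a fixed plan.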
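Background and Challenges
Background Overview
MuBlE aims to address the challenges that current embodied agents face when performing long-horizon tasks. Such tasks require the robot to interact with the physical world to obtain necessary information, for example when sorting objects by weight. Improving agents of this kind requires suitable training environments. MuBlE is a robosuite-based simulation environment that uses the MuJoCo physics engine and the high-quality Blender renderer to provide realistic visual observations while keeping the physical state accurate. It is described as the first simulator focused on long-horizon robot manipulation tasks that retains accurate physical modeling. MuBlE can also generate multimodal training data and supports the design of closed-loop methods through environment interaction at two levels: a vision-action loop and a control-physics loop.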
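Current Challenges
The challenges associated with MuBlE fall into two groups: 1) the domain problem it addresses: agents performing long-horizon tasks must interact with the physical world to obtain necessary information, which requires a simulation environment with both realistic visual observations and accurate physical modeling; 2) the challenges met during construction: achieving real-time physics computation while maintaining high-quality rendering, and designing a simulator that supports closed-loop reasoning methods.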
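Common Scenarios
Typical Use Cases
MuBlE is a simulation environment designed for task planning in robot manipulation. It uses the MuJoCo physics engine and the high-quality Blender renderer to provide visual observations that are both realistic and physically accurate. Its typical use cases are long-horizon manipulation tasks, such as ordering objects from lightest to heaviest, in which the robot must interact with the physical world to obtain the information it needs. By providing multimodal data, MuBlE forms a basis for training and designing closed-loop methods, with environment interaction at two levels: the vision-action loop and the control-physics loop.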
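The lightest-to-heaviest ordering task mentioned above can be reduced to a small sketch. It assumes a hypothetical measuring primitive `weigh(name)` that picks an object up and returns its measured weight; this is not a MuBlE function, and the object names and weights are invented for illustration.

```python
def sort_by_weight(object_names, weigh):
    """Order objects from lightest to heaviest using one measurement each.

    `weigh` is a hypothetical primitive: it stands in for a physical
    interaction that returns the measured weight of the named object.
    """
    readings = {}
    for name in object_names:
        readings[name] = weigh(name)  # weight cannot be read from the image
    order = sorted(object_names, key=lambda n: readings[n])
    return order, len(readings)


# Invented hidden weights (kg); the planner only learns them by weighing.
hidden_weights = {"mug": 0.31, "sponge": 0.02, "pan": 0.85}
order, measurements = sort_by_weight(list(hidden_weights), hidden_weights.__getitem__)
print(order)         # ['sponge', 'mug', 'pan']
print(measurements)  # 3
```

The point of the sketch is that the ordering depends on values that only become available after acting, so the plan cannot be fixed from the initial image alone.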
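Academic Problems Addressed
MuBlE addresses the difficulty that current embodied reasoning agents have in planning long-horizon manipulation tasks, which require interaction with the physical world to obtain necessary information. By providing realistic and accurate physical modeling, it allows robots to plan actions better and carry out long-horizon tasks. MuBlE also introduces a new benchmark, SHOP-VRB2, which consists of 10 classes of multi-step reasoning scenarios that require both visual and physical measurements, and which poses a new challenge for evaluating embodied, closed-loop reasoning in robot manipulation.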
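A benchmark organized into scenario classes is usually reported per class. The following is a minimal sketch of such a tally, assuming each episode yields a `(task_class, succeeded)` pair; this record format and the class names in the example are hypothetical and are not taken from SHOP-VRB2.

```python
from collections import defaultdict


def success_by_class(results):
    """Return the success rate per task class.

    `results` is an iterable of (task_class, succeeded) pairs; this record
    format is an assumption made for the example.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for task_class, succeeded in results:
        totals[task_class] += 1
        wins[task_class] += int(succeeded)
    return {cls: wins[cls] / totals[cls] for cls in totals}


# Invented episode outcomes for two made-up task classes.
episodes = [
    ("weigh_and_sort", True),
    ("weigh_and_sort", False),
    ("stack", True),
]
rates = success_by_class(episodes)
```

Reporting per class rather than as a single average keeps weak performance on measurement-heavy classes from being hidden by easier ones.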
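Practical Applications
In practice, MuBlE can be used to train and evaluate robot manipulation methods. For example, a method can be trained in simulation with MuBlE and then evaluated on the SHOP-VRB2 benchmark. MuBlE can also be used to study how robots interact with the physical world and how they can learn non-visual object properties by manipulating their surroundings.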
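Recent Research
Latest Research Directions
In task planning for robot manipulation, recent work around MuBlE focuses on building realistic physical environments that can simulate long-horizon manipulation tasks. MuBlE uses the MuJoCo physics engine and the high-quality Blender renderer to provide visual observations that are both realistic and physically accurate. Through environment interaction at two levels, the vision-action loop and the control-physics loop, it supports the design of closed-loop methods and provides multimodal data for training and evaluating task planning. MuBlE also proposes the SHOP-VRB2 benchmark, which contains 10 classes of multi-step reasoning scenarios requiring both visual and physical measurements, and which promotes research on long-horizon closed-loop reasoning. This work matters most for long-horizon tasks that require physical interaction with the world.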
Related Papers
  • 1
    MuBlE: MuJoCo and Blender simulation Environment and Benchmark for Task Planning in Robot Manipulation. Huawei Noah's Ark Lab, 2025.
The content above was collected and summarized by AI.