DAGroup-PKU/RBench

Source: Hugging Face. Updated: 2026-01-25. Indexed: 2026-02-07.
Download link:
https://hf-mirror.com/datasets/DAGroup-PKU/RBench
Description:
RBench is a curated evaluation benchmark designed to systematically assess the capabilities of video-generation models in realistic robotic scenarios. The benchmark is constructed from two complementary perspectives: task categories and robot embodiment types, covering a total of 650 image-text evaluation cases. The task-oriented split contains 250 image-text pairs, spanning five representative robotic task categories: Common Manipulation, Long-horizon Planning, Multi-entity Collaboration, Spatial Relationship, and Visual Reasoning. The embodiment-oriented split contains 400 image-text pairs, covering four mainstream robotic embodiment types: Dual-arm Robots, Humanoid Robots, Single-arm Robots, and Quadruped Robots. This benchmark is intended for image-to-video (I2V) and video generation evaluation and vision-language model (VLM / MLLM) benchmarking.
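
The composition described above can be written down as a small data structure, which makes the split sizes easy to sanity-check. This is only a sketch: the names `Split`, `SPLITS`, and `total_cases` are hypothetical and do not reflect the dataset's actual schema or file layout, and the per-category counts are left out because the description does not state them.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Split:
    """One RBench evaluation split, as summarized in the description (hypothetical type)."""
    name: str
    num_pairs: int                 # number of image-text pairs in the split
    categories: Tuple[str, ...]    # category names; per-category counts are not stated


TASK_SPLIT = Split(
    name="task-oriented",
    num_pairs=250,
    categories=(
        "Common Manipulation",
        "Long-horizon Planning",
        "Multi-entity Collaboration",
        "Spatial Relationship",
        "Visual Reasoning",
    ),
)

EMBODIMENT_SPLIT = Split(
    name="embodiment-oriented",
    num_pairs=400,
    categories=(
        "Dual-arm Robots",
        "Humanoid Robots",
        "Single-arm Robots",
        "Quadruped Robots",
    ),
)

SPLITS = (TASK_SPLIT, EMBODIMENT_SPLIT)


def total_cases(splits: Iterable[Split]) -> int:
    """Sum the image-text pairs across the given splits."""
    return sum(s.num_pairs for s in splits)


if __name__ == "__main__":
    for s in SPLITS:
        print(f"{s.name}: {s.num_pairs} pairs, {len(s.categories)} categories")
    print("total:", total_cases(SPLITS))
```

The two split sizes add up to the 650 evaluation cases quoted in the description.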
Provided by:
DAGroup-PKU