DAGroup-PKU/RBench
Source: Hugging Face | Updated: 2026-01-25 | Indexed: 2026-02-07
Download link:
https://hf-mirror.com/datasets/DAGroup-PKU/RBench
Resource description:
RBench is a curated evaluation benchmark designed to systematically assess the capabilities of video-generation models in realistic robotic scenarios. The benchmark is constructed from two complementary perspectives: task categories and robot embodiment types, covering a total of 650 image-text evaluation cases. The task-oriented split contains 250 image-text pairs, spanning five representative robotic task categories: Common Manipulation, Long-horizon Planning, Multi-entity Collaboration, Spatial Relationship, and Visual Reasoning. The embodiment-oriented split contains 400 image-text pairs, covering four mainstream robotic embodiment types: Dual-arm Robots, Humanoid Robots, Single-arm Robots, and Quadruped Robots. This benchmark is intended for image-to-video (I2V) and video generation evaluation and vision-language model (VLM / MLLM) benchmarking.
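The composition described above can be written down as a small data structure, which makes the split sizes easy to sanity-check. This is only a sketch: the split sizes and category names come from the description, while the dictionary layout, the key names (`task_oriented`, `embodiment_oriented`, `num_pairs`, `categories`), and the helper `total_cases` are hypothetical and do not reflect the dataset's actual file schema. The description gives no per-category counts, so none are assumed here.

```python
# Sketch of the RBench composition as stated in the description.
# The layout and key names below are hypothetical, not the dataset's real schema.
RBENCH_SPLITS = {
    "task_oriented": {
        "num_pairs": 250,
        "categories": [
            "Common Manipulation",
            "Long-horizon Planning",
            "Multi-entity Collaboration",
            "Spatial Relationship",
            "Visual Reasoning",
        ],
    },
    "embodiment_oriented": {
        "num_pairs": 400,
        "categories": [
            "Dual-arm Robots",
            "Humanoid Robots",
            "Single-arm Robots",
            "Quadruped Robots",
        ],
    },
}


def total_cases(splits):
    """Sum the image-text pairs across all splits."""
    return sum(split["num_pairs"] for split in splits.values())


if __name__ == "__main__":
    # The two splits together should account for all evaluation cases.
    print(total_cases(RBENCH_SPLITS))  # 250 + 400 = 650
```

Keeping the two splits as separate entries mirrors the benchmark's two perspectives (task category and embodiment type), so results can be reported per split rather than only in aggregate.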
Provided by:
DAGroup-PKU



