RealVLG-11B

Name: RealVLG-11B
Creator: maas
Published: 2026-05-15 21:45:32
License: No description available

ModelScope community · Updated 2026-05-15 · Indexed 2026-05-17

Download link:

https://modelscope.cn/datasets/cslinfeili/RealVLG-11B


Resource description:

<p align="center">
  <h1 align="center">
    RealVLG-R1: A Large-Scale Real-World Visual-Language Grounding Benchmark for Robotic Perception and Manipulation
    <br>
    [CVPR 2026]
  </h1>
  <p align="center">
    <a href="https://lif314.github.io/"><strong>Linfei Li</strong></a>
    ·
    <a href="https://scholar.google.com/citations?user=8VOk_S4AAAAJ&hl=en"><strong>Lin Zhang*</strong></a>
    ·
    <a href="https://scholar.google.com/citations?user=A0N_mS0AAAAJ&hl=en"><strong>Ying Shen</strong></a>
  </p>
  <h3 align="center"><a href="https://lif314.github.io/projects/realvlg_r1/">🌐Project page</a> | <a href="https://arxiv.org/abs/2603.14880">📝Paper(arXiv)</a> | <a href="https://github.com/lif314/RealVLG-R1">💻Code </a> </h3>
  <div align="center"></div>
</p>

## Sample

Each data sample is annotated as follows:

```json
[
  {
    "image_name": "",
    "image_path": "",
    "object_id": "",
    "mask_path": "",
    "description": "",
    "label": "",  # short description
    "bbox": [x1, y1, x2, y2],
    "grasps": [
      [x0,y0,x1,y1,x2,y2,x3,y3],
      ...
    ],
    "contact_points": [
      [x1,y1, x2, y2],
      ...
    ]
  }
]
```

The definition diagrams of bbox and grasp are shown in the figure below:

![](./assets/anno_demo.png)

## Usage

Download the dataset and extract `xxx_VLG.zip`. In each `xxx_VLG` folder, run `python metadata_viewer.py` to view the data formatting. The left/right keys switch between different objects in the same image, and the up/down keys switch between images.

The visualization of different data subsets is shown below:

| Subdata | Cornell_VLG | VMRD_VLG | OCID_VLG | GraspNet_VLG | Jacquard_VLG |
|---------|-------------|----------|----------|--------------|---------------|
| Demo | ![](./assets/cornell.png) | ![](./assets/vmrd.png) | ![](./assets/ocid.png) | ![](./assets/graspnet.png) | ![](./assets/jacquard.png) |

For more detailed data loading, please refer to `metadata_viewer.py`.

> Note: ``Jacquard_VLG`` is a simulated dataset not discussed in the paper. Its language annotations are derived from ShapeNetSem category labels.

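As a rough illustration of the annotation schema in the Sample section, the sketch below parses one record and computes the center of a grasp rectangle. It is not taken from `metadata_viewer.py`. The record values, the `VLGSample` class, and the helper names are all hypothetical, and the code assumes only what the schema shows: `bbox` is `[x1, y1, x2, y2]`, each grasp is eight numbers forming four `(x, y)` corners, and each contact-point entry is four numbers forming two `(x, y)` points. The corner ordering is defined in the annotation figure, so the sketch sticks to the center, which does not depend on that ordering.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

# Hypothetical record that follows the schema above; all values are invented.
EXAMPLE_JSON = """
[
  {
    "image_name": "0001.png",
    "image_path": "images/0001.png",
    "object_id": "0",
    "mask_path": "masks/0001_0.png",
    "description": "the red mug next to the keyboard",
    "label": "mug",
    "bbox": [100, 80, 220, 200],
    "grasps": [[140, 120, 180, 120, 180, 140, 140, 140]],
    "contact_points": [[140, 130, 180, 130]]
  }
]
"""


@dataclass
class VLGSample:
    image_path: str
    description: str
    label: str
    bbox: Tuple[float, float, float, float]
    grasps: List[List[Point]]          # each grasp: four (x, y) corners
    contact_points: List[List[Point]]  # each entry: two (x, y) points


def to_points(flat: List[float]) -> List[Point]:
    """Turn a flat [x0, y0, x1, y1, ...] list into (x, y) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError(f"expected an even number of coordinates, got {len(flat)}")
    return [(float(flat[i]), float(flat[i + 1])) for i in range(0, len(flat), 2)]


def parse_samples(text: str) -> List[VLGSample]:
    """Parse a JSON list of annotation records into VLGSample objects."""
    samples = []
    for rec in json.loads(text):
        x1, y1, x2, y2 = (float(v) for v in rec["bbox"])
        samples.append(
            VLGSample(
                image_path=rec["image_path"],
                description=rec["description"],
                label=rec["label"],
                bbox=(x1, y1, x2, y2),
                grasps=[to_points(g) for g in rec["grasps"]],
                contact_points=[to_points(c) for c in rec["contact_points"]],
            )
        )
    return samples


def grasp_center(corners: List[Point]) -> Point:
    """Mean of the corners; independent of the corner ordering."""
    n = len(corners)
    return (sum(p[0] for p in corners) / n, sum(p[1] for p in corners) / n)


def inside_bbox(point: Point, bbox: Tuple[float, float, float, float]) -> bool:
    """Check whether a point lies inside an [x1, y1, x2, y2] box (inclusive)."""
    x, y = point
    x1, y1, x2, y2 = bbox
    return x1 <= x <= x2 and y1 <= y <= y2


if __name__ == "__main__":
    sample = parse_samples(EXAMPLE_JSON)[0]
    center = grasp_center(sample.grasps[0])
    print(sample.label, center, inside_bbox(center, sample.bbox))
```

A check like `inside_bbox` can serve as a quick sanity test when iterating over a subset, though whether every grasp center falls inside its object's box in the real data is not something the README states.
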
## License

We thank all previous work. If you use this dataset, please cite the relevant work and comply with their licenses.

- [Cornell](https://www.kaggle.com/datasets/oneoneliu/cornell-grasp)
- [VMRD](https://opendatalab.com/OpenDataLab/VMRD)
- [OCID-Grasp](https://github.com/stefan-ainetter/grasp_det_seg_cnn)
- [GraspNet](https://graspnet.net/)
- [Jacquard](https://jacquard.liris.cnrs.fr/)

Provider: maas

Created: 2026-05-12
