
RealVLG-11B

ModelScope Community | Updated: 2026-05-15 | Indexed: 2026-05-17
Download link:
https://modelscope.cn/datasets/cslinfeili/RealVLG-11B
Resource description:
<p align="center">
  <h1 align="center">RealVLG-R1: A Large-Scale Real-World Visual-Language Grounding Benchmark for Robotic Perception and Manipulation <br> [CVPR 2026]</h1>
  <p align="center">
    <a href="https://lif314.github.io/"><strong>Linfei Li</strong></a>
    ·
    <a href="https://scholar.google.com/citations?user=8VOk_S4AAAAJ&hl=en"><strong>Lin Zhang*</strong></a>
    ·
    <a href="https://scholar.google.com/citations?user=A0N_mS0AAAAJ&hl=en"><strong>Ying Shen</strong></a>
  </p>
  <h3 align="center"><a href="https://lif314.github.io/projects/realvlg_r1/">🌐Project page</a> | <a href="https://arxiv.org/abs/2603.14880">📝Paper(arXiv)</a> | <a href="https://github.com/lif314/RealVLG-R1">💻Code </a></h3>
  <div align="center"></div>
</p>

## Sample

Each data sample is annotated as follows:

```json
[
  {
    "image_name": "",
    "image_path": "",
    "object_id": "",
    "mask_path": "",
    "description": "",
    "label": "",  # short description
    "bbox": [x1, y1, x2, y2],
    "grasps": [
      [x0,y0,x1,y1,x2,y2,x3,y3],
      ...
    ],
    "contact_points": [
      [x1,y1, x2, y2],
      ...
    ]
  }
]
```

The definitions of `bbox` and `grasps` are illustrated in the figure below:

![](./assets/anno_demo.png)

## Usage

Download the dataset and extract `xxx_VLG.zip`. In each `xxx_VLG` folder, run `python metadata_viewer.py` to view the data format. The left/right keys switch between objects in the same image, and the up/down keys switch between images.

Visualizations of the different data subsets are shown below:

| Subset | Cornell_VLG | VMRD_VLG | OCID_VLG | GraspNet_VLG | Jacquard_VLG |
|---------|-------------|----------|----------|--------------|---------------|
| Demo | ![](./assets/cornell.png) | ![](./assets/vmrd.png) | ![](./assets/ocid.png) | ![](./assets/graspnet.png) | ![](./assets/jacquard.png) |

For more details on data loading, please refer to `metadata_viewer.py`.

> Note: `Jacquard_VLG` is a simulated dataset not discussed in the paper. Its language annotations are derived from ShapeNetSem category labels.
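
The annotation schema in the Sample section can be explored with a few lines of Python. The block below is only a minimal sketch under stated assumptions, not the loader used by `metadata_viewer.py`: the inline record and all of its values are invented for illustration, the helper names (`bbox_size`, `grasp_corners`, `grasp_center`, `contact_distance`) are hypothetical, and it assumes that `bbox` holds two opposite corners, that each grasp lists four rectangle corners as flat `x, y` pairs, and that each `contact_points` entry holds two points.

```python
import json
import math

# Hypothetical record following the README schema; every value is invented.
SAMPLE_JSON = """
[
  {
    "image_name": "example.png",
    "image_path": "images/example.png",
    "object_id": "0",
    "mask_path": "masks/example_0.png",
    "description": "the red mug on the left side of the table",
    "label": "red mug",
    "bbox": [10, 20, 110, 70],
    "grasps": [[0, 0, 4, 0, 4, 2, 0, 2]],
    "contact_points": [[0, 1, 4, 1]]
  }
]
"""


def bbox_size(bbox):
    """Return (width, height), assuming bbox = [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = bbox
    return x2 - x1, y2 - y1


def grasp_corners(grasp):
    """Split a flat [x0, y0, ..., x3, y3] list into four (x, y) tuples."""
    return [(grasp[i], grasp[i + 1]) for i in range(0, 8, 2)]


def grasp_center(grasp):
    """Return the centroid of the four corners.

    For a rectangle the centroid is the center regardless of corner order,
    so this does not depend on the ordering convention shown in the figure.
    """
    corners = grasp_corners(grasp)
    cx = sum(x for x, _ in corners) / 4
    cy = sum(y for _, y in corners) / 4
    return cx, cy


def contact_distance(contact):
    """Return the distance between the two points in [x1, y1, x2, y2]."""
    x1, y1, x2, y2 = contact
    return math.hypot(x2 - x1, y2 - y1)


samples = json.loads(SAMPLE_JSON)
for sample in samples:
    print(sample["label"], "bbox size:", bbox_size(sample["bbox"]))
    for grasp in sample["grasps"]:
        print("  grasp center:", grasp_center(grasp))
    for contact in sample["contact_points"]:
        print("  contact distance:", contact_distance(contact))
```

To try this on real data, replace `SAMPLE_JSON` with the contents of the metadata file in an extracted `xxx_VLG` folder; the file name is not stated in this description, so check `metadata_viewer.py` for the actual path and field handling.
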
## License

We thank the authors of all previous work. If you use this dataset, please cite the relevant works and comply with their licenses.

- [Cornell](https://www.kaggle.com/datasets/oneoneliu/cornell-grasp)
- [VMRD](https://opendatalab.com/OpenDataLab/VMRD)
- [OCID-Grasp](https://github.com/stefan-ainetter/grasp_det_seg_cnn)
- [GraspNet](https://graspnet.net/)
- [Jacquard](https://jacquard.liris.cnrs.fr/)
Provider:
maas
Created:
2026-05-12