HAI-Lab/LIBERO-Para

Name: HAI-Lab/LIBERO-Para
Creator: HAI-Lab
Published: 2026-04-06 13:12:03
License: MIT

Source: Hugging Face (updated 2026-04-06, indexed 2026-04-12)

Download link:

https://hf-mirror.com/datasets/HAI-Lab/LIBERO-Para


Resource description:

---
license: mit
arxiv: "2603.28301"
authors:
- chanyoungkim
task_categories:
- robotics
tags:
- LIBERO
- VLA
- paraphrase-robustness
- robotic-manipulation
- benchmark
- vision-language-action
language:
- en
size_categories:
- 1K<n<10K
pretty_name: LIBERO-Para BDDL Files
---

# LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models

LIBERO-Para is a controlled benchmark for evaluating the **paraphrase robustness** of Vision-Language-Action (VLA) models. It independently varies **action expressions** and **object references**, the two core linguistic components of robotic manipulation instructions, enabling fine-grained analysis of how different types of linguistic variation affect VLA performance.

📄 **Paper**: [arXiv:2603.28301](https://arxiv.org/abs/2603.28301)

💻 **Code**: [GitHub](https://github.com/cau-hai-lab/LIBERO-Para)

## Overview

LIBERO-Para is constructed on top of **LIBERO-Goal**, where all tasks share an identical initial state, making the instruction the sole cue for task identification. The benchmark paraphrases only the instructions while keeping all other factors (visual scene, physics, etc.) fixed. All paraphrases are held out for evaluation only.
### Key Statistics

- **4,092** paraphrased instructions total
- **10** original LIBERO-Goal instructions
- **43** distinct paraphrase type combinations (Action × Object)
- **~100** samples per variation type cell

## Benchmark Design

### Two-Axis Paraphrase Scheme

LIBERO-Para adopts a two-axis design grounded in established linguistic taxonomies:

**Object Axis** (3 types). Lexical variation of object references, based on the Extended Paraphrase Typology (EPT; Kovatchev et al., 2018):

- **Addition**: Adding functional descriptors (e.g., "stove" → "gas stove")
- **SP-contextual**: Contextually appropriate substitution (e.g., "stove" → "cooktop")
- **SP-habitual**: Common synonym substitution (e.g., "stove" → "cooker")

**Action Axis** (10 types). Variation in how actions are linguistically expressed:

| Category | Types |
|----------|-------|
| Lexical | Addition (e.g., "Carefully turn on the stove"), SP-contextual (e.g., "Switch on the stove"), SP-habitual (e.g., "Fire up the stove") |
| Structural | Coordination (e.g., "Go to the stove and turn it on"), Subordination (e.g., "Turn on the stove so that it becomes hot") |
| Pragmatic | Need-statement, Embedded-imperative, Permission-directive, Question-directive, Hint |

### Compositional Variation

Beyond individual axes, the benchmark includes compositional paraphrases that vary both action and object expressions simultaneously, yielding 30 combined types (3 Object × 10 Action).
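
The count of 43 type combinations follows from the two axes: 3 object-only types, 10 action-only types, and 30 combined types. The following is a minimal Python sketch of that enumeration, not part of the released code. The names `OBJECT_TYPES`, `ACTION_TYPES`, and `enumerate_cells` are hypothetical, and the type labels are copied from the lists above.

```python
from itertools import product

# Type labels copied from the benchmark description; "None" marks an axis left unchanged.
OBJECT_TYPES = ["Addition", "SP-contextual", "SP-habitual"]
ACTION_TYPES = [
    "Addition", "SP-contextual", "SP-habitual",   # lexical
    "Coordination", "Subordination",              # structural
    "Need-statement", "Embedded-imperative",      # pragmatic
    "Permission-directive", "Question-directive", "Hint",
]


def enumerate_cells():
    """Return every (object_type, action_type) cell except the unparaphrased (None, None)."""
    object_axis = ["None"] + OBJECT_TYPES
    action_axis = ["None"] + ACTION_TYPES
    return [
        (obj, act)
        for obj, act in product(object_axis, action_axis)
        if (obj, act) != ("None", "None")
    ]


cells = enumerate_cells()
object_only = [c for c in cells if c[1] == "None"]
action_only = [c for c in cells if c[0] == "None"]
combined = [c for c in cells if "None" not in c]
print(len(object_only), len(action_only), len(combined), len(cells))
# prints: 3 10 30 43
```

Treating "None" as an extra value on each axis gives a 4 × 11 grid, and dropping the single unparaphrased cell leaves the 43 cells reported in the statistics table.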
### Dataset Statistics per Cell

| Object \ Action | None | add | ctx | hab | coord | subord | need | embed | perm | quest | hint | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| None | – | 100 | 79 | 74 | 98 | 75 | 93 | 93 | 83 | 87 | 88 | 870 |
| Addition | 98 | 100 | 100 | 100 | 100 | 100 | 100 | 99 | 99 | 99 | 100 | 1,095 |
| SP-contextual | 87 | 100 | 100 | 100 | 100 | 99 | 100 | 100 | 100 | 94 | 96 | 1,076 |
| SP-habitual | 74 | 100 | 98 | 100 | 97 | 94 | 100 | 95 | 100 | 95 | 98 | 1,051 |
| **Total** | 259 | 400 | 377 | 374 | 395 | 368 | 393 | 387 | 382 | 375 | 382 | **4,092** |

### Original Instructions (from LIBERO-Goal)

| Instruction | # Paraphrases |
|---|---|
| Put the wine bottle on top of the cabinet | 423 |
| Open the middle drawer of the cabinet | 416 |
| Turn on the stove | 414 |
| Put the wine bottle on the rack | 413 |
| Put the cream cheese in the bowl | 411 |
| Open the top drawer and put the bowl inside | 410 |
| Put the bowl on top of the cabinet | 410 |
| Push the plate to the front of the stove | 406 |
| Put the bowl on the stove | 403 |
| Put the bowl on the plate | 386 |

## Usage

### File Structure

```
LIBERO-Para/
├── README.md
└── bddl_files/
    └── ...
```

### Using with LIBERO

The bddl files can be used directly with the [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO) evaluation environment. Place the downloaded bddl files into the appropriate LIBERO directory:

```bash
mv bddl_files/* /path/to/libero/libero/bddl_files/
```

For detailed evaluation guides, model-specific setup instructions, and analysis scripts, please refer to our [GitHub repository](https://github.com/cau-hai-lab/LIBERO-Para).
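
To check which paraphrased instruction a given file encodes before running an evaluation, the instruction can be read from the BDDL text itself. The following is a minimal Python sketch, not taken from the repository. It assumes each file stores its instruction in a `(:language ...)` field, as upstream LIBERO BDDL files do, and that files use the `.bddl` extension. `read_instructions` and the `bddl_files` path are illustrative names. The layout under `bddl_files/` is not specified here, so the search is recursive.

```python
import re
from pathlib import Path

# Assumption: the instruction sits in a "(:language ...)" field, as in upstream LIBERO BDDL files.
# The pattern stops at the first ")", so an instruction containing parentheses would be truncated.
LANGUAGE_FIELD = re.compile(r"\(:language\s+([^)]*)\)")


def read_instructions(bddl_root):
    """Map each .bddl path under bddl_root to its instruction string (None if no field is found)."""
    instructions = {}
    for path in sorted(Path(bddl_root).rglob("*.bddl")):
        match = LANGUAGE_FIELD.search(path.read_text(encoding="utf-8"))
        instructions[path] = match.group(1).strip() if match else None
    return instructions


if __name__ == "__main__":
    # "bddl_files" is a placeholder for wherever the download was unpacked.
    for path, text in read_instructions("bddl_files").items():
        print(path.name, "->", text)
```

Files that map to `None` are worth inspecting by hand, since that indicates the assumed field format does not hold for them.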
## Citation

```bibtex
@misc{kim2026liberoparadiagnosticbenchmarkmetrics,
  title={LIBERO-Para: A Diagnostic Benchmark and Metrics for Paraphrase Robustness in VLA Models},
  author={Chanyoung Kim and Minwoo Kim and Minseok Kang and Hyunwoo Kim and Dahuin Jung},
  year={2026},
  eprint={2603.28301},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2603.28301},
}
```

## Acknowledgments

- This project is built upon [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO) by Bo Liu, Yifeng Zhu, Chongkai Gao, Yihao Feng, Qiang Liu, Yuke Zhu, and Peter Stone.
- This research was supported by the AI Computing Infrastructure Enhancement (GPU Rental Support) User Support Program funded by the Ministry of Science and ICT (MSIT), Republic of Korea (RQT-25-090040).

## License

This dataset is released under the [MIT License](LICENSE).

Provider:

HAI-Lab
