zzzzmikezzzz/libero_similar_task_desc

Name: zzzzmikezzzz/libero_similar_task_desc
Creator: zzzzmikezzzz
Published: 2026-03-25 01:01:23
License: Not specified in the listing metadata (the dataset card states Apache-2.0)

Source: Hugging Face. Updated 2026-03-25. Indexed 2026-03-29.

Download link:

https://hf-mirror.com/datasets/zzzzmikezzzz/libero_similar_task_desc

Description:

---
license: apache-2.0
language:
- en
size_categories:
- 1M<n<10M
---

# LIBERO Prompt Noise Dataset for VLA Robustness Evaluation

This dataset provides **semantically similar alternative task descriptions** for the [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO) benchmark, generated by large language models (LLMs) such as **DeepSeek**. It is designed to support **robustness testing and adversarial attacks** on Vision-Language-Action (VLA) models by introducing natural-language variations that preserve the original task intent while challenging the model's language grounding.

Reference repo: [psga](https://github.com/blackcat0615/psga)

---

## 🔍 Background

VLA models combine visual observations and language instructions to produce robot actions. Although they perform strongly on standard benchmarks, their sensitivity to paraphrased or slightly altered instructions remains underexplored. This dataset lets researchers systematically evaluate how VLA models behave when the task description is **semantically equivalent but lexically or syntactically different**. This matters for real-world deployment, where users may express the same goal in many different ways.

---

## 📦 Dataset Generation

- **Base Dataset**: [LIBERO](https://github.com/Lifelong-Robot-Learning/LIBERO) (including 4 suites: `libero_spatial`, `libero_object`, `libero_goal`, `libero_10`).
- **LLM Used**: DeepSeek (or other compatible models), prompted to generate **multiple paraphrases** of each original task description.
- **Quality Control**: Generated candidates are manually verified to ensure:
  - Semantic equivalence with the original task.
  - Fluency and naturalness in English.
  - Diversity in phrasing (e.g., different verb choices, sentence structures).

---

## 📁 Dataset Structure

The dataset is organized by LIBERO task suite. For each suite (e.g., `libero_object`), there is a **single JSON file** named after the suite. The file contains a dictionary where:

- **Keys** are the original task keys (as defined in LIBERO, e.g., `"turn_on_the_stove_and_put_the_moka_pot_on_it"`).
- **Values** are lists of paraphrased task descriptions (typically 5 alternatives per task).

```
libero_similar_task_desc/
├── libero_spatial.json
├── libero_object.json
├── libero_goal.json
├── libero_10.json
└── libero_90.json
```

**Example content (`libero_object.json`):**

```json
{
  "turn_on_the_stove_and_put_the_moka_pot_on_it": [
    "Switch on the stove and place the moka pot on top",
    "Turn the stove on and set the moka pot over the heat",
    "Activate the stove burner and position the moka pot on it",
    "Ignite the stove and rest the moka pot on the burner",
    "Start the stove and put the moka pot directly on the flame"
  ],
  ...
}
```

---

## 🚀 Usage

### 1. Load a noise description for a given task

The following Python function loads the **first** paraphrased description for a specific LIBERO task. You can easily modify it to randomly select an alternative.

```python
import os
import json

def add_language_noise_to_task(suite_name, task_key, noise_root="/path/to/libero_prompts_noise"):
    # suite_name e.g., "libero_object"
    # task_key   e.g., "turn_on_the_stove_and_put_the_moka_pot_on_it"
    suite_file = os.path.join(noise_root, f"{suite_name}.json")
    with open(suite_file, 'r', encoding='utf-8') as f:
        data = json.load(f)  # dictionary: task_key -> list of strings

    # Use the first alternative (or randomly select one)
    new_task_desc = data[task_key][0]  # default: take the first task description
    return new_task_desc
```

## 📄 License

This dataset is released under the **Apache License 2.0**. See [LICENSE](LICENSE) for details.

---

## 🤝 Contributing

We welcome contributions! If you have additional paraphrases or find issues, please open an issue or submit a pull request.

---
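The usage section mentions replacing the first-alternative default with a random choice. Below is a minimal sketch of that variant, assuming the JSON layout described under Dataset Structure (task key mapped to a list of strings). The function name `sample_task_paraphrase` and its `rng` parameter are hypothetical; they are not part of the dataset or the psga repo.

```python
import json
import os
import random


def sample_task_paraphrase(suite_name, task_key, noise_root, rng=None):
    """Return one randomly chosen paraphrase for a LIBERO task key.

    Hypothetical helper: assumes <noise_root>/<suite_name>.json holds a
    dictionary mapping task keys to lists of paraphrased descriptions.
    """
    if rng is None:
        rng = random.Random()  # unseeded, so results differ between runs

    suite_file = os.path.join(noise_root, f"{suite_name}.json")
    with open(suite_file, "r", encoding="utf-8") as f:
        data = json.load(f)  # dictionary: task_key -> list of strings

    alternatives = data.get(task_key)
    if not alternatives:
        # Fail loudly instead of silently evaluating on a wrong prompt.
        raise KeyError(f"No paraphrases for task {task_key!r} in {suite_file}")

    return rng.choice(alternatives)
```

Passing a seeded generator, for example `rng=random.Random(0)`, keeps the sampled paraphrase reproducible across evaluation runs, which makes robustness numbers easier to compare.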

Provided by:

zzzzmikezzzz
