
ImagenWorld-annotated-set

ModelScope community. Updated 2025-12-05; indexed 2025-11-03.
Download link:
https://modelscope.cn/datasets/TIGER-Lab/ImagenWorld-annotated-set
Description:
## 📦 Dataset Access

The dataset is organized as **zipped folders**, one per task, for both the `train` and `test` splits.

### 🐍 **Download with Python**

```python
from huggingface_hub import snapshot_download
import zipfile
from pathlib import Path

# Download annotated dataset
local_path = snapshot_download(
    repo_id="TIGER-Lab/ImagenWorld-annotated-set",
    repo_type="dataset",
    local_dir="ImagenWorld-annotated-set",
    local_dir_use_symlinks=False,
)

# Unzip all tasks for each split
for split in ["train", "test"]:
    split_dir = Path(local_path) / split
    for zip_file in split_dir.glob("*.zip"):
        target_dir = split_dir / zip_file.stem
        target_dir.mkdir(exist_ok=True)
        with zipfile.ZipFile(zip_file, "r") as zf:
            zf.extractall(target_dir)
        print(f"✅ Extracted {zip_file.name} → {target_dir}")
```

---

### 💻 **Download via Command Line**

```bash
hf download TIGER-Lab/ImagenWorld-annotated-set --repo-type dataset --local-dir ImagenWorld-annotated-set
cd ImagenWorld-annotated-set && for s in train test; do cd "$s"; for f in *.zip; do d="${f%.zip}"; mkdir -p "$d"; unzip -q "$f" -d "$d"; done; cd ..; done
```

---

## 📁 Dataset Structure

After download, and before unzipping, the directory looks like this:

```
ImagenWorld-annotated-set/
│
├── train/
│   ├── TIG.zip
│   ├── TIE.zip
│   ├── SRIG.zip
│   ├── SRIE.zip
│   ├── MRIG.zip
│   └── MRIE.zip
│
└── test/
    ├── TIG.zip
    ├── TIE.zip
    ├── SRIG.zip
    ├── SRIE.zip
    ├── MRIG.zip
    └── MRIE.zip
```

After unzipping, each task follows this internal structure:

### 🧩 `train/` split (with human evaluation)

```
TIG/
└── TIG_A_000001/
    ├── input/
    │   ├── metadata.json
    │   ├── 1.png
    │   └── ...
    └── outputs/
        ├── sdxl/
        │   ├── annotator1/
        │   │   ├── evaluation.json
        │   │   ├── error_mask.png      # optional; only if not 'None' or 'All'
        │   │   └── ...
        │   ├── annotator2/
        │   ├── annotator3/
        │   ├── out.png                 # model-generated output
        │   ├── som_segments.png        # Set-of-Marks segmentation map (visual)
        │   └── som_segments.npz        # corresponding NumPy map for the above
        └── gpt-image-1/
            ├── ...
```

### 🧠 `test/` split (without manual evaluation)

Same structure as `train/`, except **no `annotatorX/` folders** are included:

```
TIG/
└── TIG_A_000001/
    ├── input/
    └── outputs/
        ├── sdxl/
        │   ├── out.png
        │   ├── som_segments.png
        │   └── som_segments.npz
        └── gpt-image-1/
```

---

## 🧾 File Descriptions

| File | Description |
|------|--------------|
| `evaluation.json` | JSON file with annotator feedback and per-object or per-segment ratings. |
| `error_mask.png` | Binary mask highlighting incorrectly generated regions (present only if the annotator selected specific areas). |
| `som_segments.png` | Visual segmentation map generated by the **Set-of-Marks (SoM)** model. |
| `som_segments.npz` | NumPy array containing pixel-to-segment mappings corresponding to `som_segments.png`. |
| `out.png` | The raw image generated by the model for this condition set. |
| `metadata.json` | Input metadata and prompt from the original condition set. |

---

## 📊 Annotation Details

- Human annotations were collected from **three independent annotators per model output**.
- Each annotator could select:
  - `None`: no error found
  - `All`: the entire image contains severe issues
  - or mark **specific regions** using an error mask (`error_mask.png`).
- Evaluations include **object-level**, **segment-level**, and **score-based** ratings.

---

## 🔗 Related Datasets

| Component | Description | Repository |
|------------|--------------|-------------|
| **Condition Set** | Input prompts and reference images. | [`TIGER-Lab/ImagenWorld-condition-set`](https://huggingface.co/datasets/TIGER-Lab/ImagenWorld) |
| **Model Outputs** | Generated images from all models used in evaluation. | [`TIGER-Lab/ImagenWorld-model-outputs`](https://huggingface.co/datasets/TIGER-Lab/ImagenWorld-model-outputs) |

---

## 🧠 Notes

- The **`train/` split** includes **human annotations** from multiple annotators.
- The **`test/` split** is the remaining portion, **without** manual evaluation.
- Segmentation files (`som_segments.*`) are included for all models to support error localization and structured comparison.

---

## 📜 Citation

If you use **ImagenWorld**, please cite:

```bibtex
@misc{imagenworld2025,
  title       = {ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks},
  author      = {Samin Mahdizadeh Sani and Max Ku and Nima Jamali and Matina Mahdizadeh Sani and Paria Khoshtab and Wei-Chieh Sun and Parnian Fazel and Zhi Rui Tam and Thomas Chong and Edisy Kin Wai Chan and Donald Wai Tong Tsang and Chiao-Wei Hsu and Ting Wai Lam and Ho Yin Sam Ng and Chiafeng Chu and Chak-Wing Mak and Keming Wu and Hiu Tung Wong and Yik Chun Ho and Chi Ruan and Zhuofeng Li and I-Sheng Fang and Shih-Ying Yeh and Ho Kei Cheng and Ping Nie and Wenhu Chen},
  year        = {2025},
  doi         = {10.5281/zenodo.17344183},
  url         = {https://zenodo.org/records/17344183},
  projectpage = {https://tiger-ai-lab.github.io/ImagenWorld/},
  blogpost    = {https://blog.comfy.org/p/introducing-imagenworld},
  note        = {Community-driven dataset and benchmark release, Temporarily archived on Zenodo while arXiv submission is under moderation review.},
}
```
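
The `train/` layout described under Dataset Structure (one `annotatorX/` folder per annotator, each holding an `evaluation.json` and an optional `error_mask.png`) can be walked with the standard library alone. The sketch below is a minimal illustration, not part of the official tooling: the function name `collect_annotations` is hypothetical, and because the schema of `evaluation.json` is not documented here, the parsed JSON is stored untouched rather than interpreted.

```python
import json
from pathlib import Path


def collect_annotations(task_dir):
    """Index annotator feedback for one extracted task folder (e.g. train/TIG).

    Returns {(sample_id, model_name): {annotator_name: {...}}}.
    Hypothetical helper; relies only on the folder layout shown in the README.
    """
    index = {}
    # Layout: <task>/<sample>/outputs/<model>/annotatorN/evaluation.json
    pattern = "*/outputs/*/annotator*/evaluation.json"
    for eval_path in sorted(Path(task_dir).glob(pattern)):
        annotator_dir = eval_path.parent
        model_dir = annotator_dir.parent
        sample_dir = model_dir.parent.parent
        with eval_path.open(encoding="utf-8") as fh:
            record = json.load(fh)  # schema is not assumed; kept as parsed
        key = (sample_dir.name, model_dir.name)
        index.setdefault(key, {})[annotator_dir.name] = {
            "evaluation": record,
            # The mask exists only when the annotator marked specific regions,
            # i.e. chose neither 'None' nor 'All'.
            "has_error_mask": (annotator_dir / "error_mask.png").is_file(),
        }
    return index
```

Calling `collect_annotations("ImagenWorld-annotated-set/train/TIG/TIG")` would be a plausible starting point, though the exact nesting depends on how each zip unpacks, so adjust the path to wherever the `TIG_A_000001/`-style sample folders actually sit.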
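
Since the README lists `out.png`, `som_segments.png`, and `som_segments.npz` for each model folder and states that the `test/` split carries no `annotatorX/` folders, a quick sanity check after extraction can catch incomplete downloads. The following is a sketch under those assumptions; `find_incomplete_outputs`, `has_annotator_folders`, and `EXPECTED_FILES` are illustrative names, and treating all three files as mandatory for every model is an inference from the tree diagrams rather than a documented guarantee.

```python
from pathlib import Path

# Files the README's tree diagrams show inside each model folder (assumed mandatory).
EXPECTED_FILES = ("out.png", "som_segments.png", "som_segments.npz")


def find_incomplete_outputs(task_dir):
    """Return {relative model folder: [missing file names]} for one task folder."""
    task_dir = Path(task_dir)
    problems = {}
    # Layout: <task>/<sample>/outputs/<model>/
    for model_dir in sorted(task_dir.glob("*/outputs/*")):
        if not model_dir.is_dir():
            continue
        missing = [n for n in EXPECTED_FILES if not (model_dir / n).is_file()]
        if missing:
            problems[model_dir.relative_to(task_dir).as_posix()] = missing
    return problems


def has_annotator_folders(task_dir):
    """True if any annotatorN/ folder exists; expected for train/, not for test/."""
    return any(
        p.is_dir() for p in Path(task_dir).glob("*/outputs/*/annotator*")
    )
```

An empty dictionary from `find_incomplete_outputs` means every model folder under the task has the three expected files; `has_annotator_folders` returning `True` on a `test/` task would suggest the splits were mixed up during extraction.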

Provider:
maas
Created:
2025-10-14