iLearn-Lab/MM25-OFFSET-SegmentationData

Name: iLearn-Lab/MM25-OFFSET-SegmentationData
Creator: iLearn-Lab
Published: 2026-04-10 05:25:51
License: Apache 2.0

Source: Hugging Face. Updated: 2026-04-10. Indexed: 2026-05-10.

Download link:

https://hf-mirror.com/datasets/iLearn-Lab/MM25-OFFSET-SegmentationData

Resource description:

---
license: apache-2.0
task_categories:
- image-segmentation
- text-to-image
- image-to-text
tags:
- composed-image-retrieval
- fashioniq
- cirr
- shoes
- acm-mm-2025
---

<a id="top"></a>
<div align="center">
  <h1>(ACM MM 2025) OFFSET: Segmentation-based Focus Shift Revision for Composed Image Retrieval</h1>
  <div align="center">
    <a target="_blank" href="https://zivchen-ty.github.io/">Zhiwei Chen</a><sup>1</sup>,
    <a target="_blank" href="https://faculty.sdu.edu.cn/huyupeng1/zh_CN/index.htm">Yupeng Hu</a><sup>1&#9993;</sup>,
    <a target="_blank" href="https://lee-zixu.github.io/">Zixu Li</a><sup>1</sup>,
    <a target="_blank" href="https://zhihfu.github.io/">Zhiheng Fu</a><sup>1</sup>,
    <a target="_blank" href="https://xuemengsong.github.io">Xuemeng Song</a><sup>2</sup>,
    <a target="_blank" href="https://liqiangnie.github.io/index.html">Liqiang Nie</a><sup>3</sup>
  </div>
  <sup>1</sup>School of Software, Shandong University &#160;&#160;&#160;
  <sup>2</sup>Department of Data Science, City University of Hong Kong, &#160;&#160;&#160;
  <sup>3</sup>School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), &#160;&#160;&#160;
  &#9993; Corresponding author
  <a href="https://acmmm2025.org/"><img src="https://img.shields.io/badge/ACM_MM-2025-blue.svg?style=flat-square" alt="ACM MM 2025"></a>
  <a href="https://arxiv.org/abs/2507.05631"><img alt='arXiv' src="https://img.shields.io/badge/arXiv-2507.05631-b31b1b.svg"></a>
  <a href="https://github.com/iLearn-Lab/MM25-OFFSET"><img alt='GitHub' src="https://img.shields.io/badge/GitHub-Repository-black?style=flat-square&logo=github"></a>
</div>

This dataset contains the official pre-computed dominant portion segmentation data used in the **OFFSET** framework for Composed Image Retrieval (CIR).

---

## 📌 Dataset Information

### 1. Dataset Source

This dataset is derived from the official visual data of three widely used Composed Image Retrieval (CIR) datasets: **FashionIQ**, **Shoes**, and **CIRR**.
The segmentation data in this repository was machine-generated: a vision-language model (BLIP-2) generates image captions that serve as a role-supervised signal, and CLIPSeg uses that signal to divide each image into dominant and noisy regions.

### 2. Dataset Purpose

This data serves as the foundational input for the **Dominant Portion Segmentation** module in the OFFSET architecture. It is designed to:

* Mask noisy information in the visual data.
* Act as a guiding signal for the Dual Focus Mapping (Visual and Textual Focus Mapping branches).
* Address visual inhomogeneity and text-priority biases in Composed Image Retrieval tasks.

### 3. Field Descriptions & Structure

The dataset is provided as a single compressed archive: `OFFSET_dominant_portion_segmentation.zip`. Once extracted, it contains pre-computed segmentation masks corresponding to the reference and target images of the downstream datasets.

* **Image ID / Filename:** Corresponds directly to the original image names in FashionIQ (e.g., `B000ALGQSY.jpg`), Shoes (e.g., `img_womens_athletic_shoes_375.jpg`), and CIRR (e.g., `train-10108-0-img0.png`).
* **Segmentation Mask/Data:** The processed dominant portion arrays/tensors that separate salient regions from noisy background regions.

### 4. Data Split

The segmentation data aligns strictly with the official dataset splits of the corresponding benchmarks:

* **FashionIQ:** `train` / `val`
* **Shoes:** `train` / `test`
* **CIRR:** `train` / `dev` / `test1`

### 5. License & Commercial Use

This segmentation dataset is released under the **Apache 2.0 License**, which permits commercial use, modification, and distribution.

*Note:* While this specific segmentation data is Apache 2.0, users must still comply with the original licenses of the underlying FashionIQ, Shoes, and CIRR datasets when using this data together with them.

### 6. Usage Restrictions & Ethical Considerations

* **Limitations:** This data is specifically optimized for the OFFSET model architecture and standard CIR tasks.
  Generalizing these specific masks to completely unrelated dense prediction tasks may yield sub-optimal results.
* **Privacy & Ethics:** The source datasets consist of publicly available e-commerce product images (FashionIQ, Shoes) and natural real-world images (NLVR2/CIRR). The pre-computed segmentation process does not introduce new personally identifiable information (PII) or ethical risks beyond those present in the original public benchmarks.

---

## 🚀 How to Use

This dataset is designed to be used directly with the official OFFSET GitHub repository.

**1. Download the Data:** Download `OFFSET_dominant_portion_segmentation.zip` from the Files section and extract it.

**2. Organize the Directory:** Place the extracted segmentation data into your local environment alongside the original datasets, following the directory requirements specified in the [OFFSET GitHub Repository Data Preparation guide](https://github.com/iLearn-Lab/MM25-OFFSET#--data-preparation).

**3. Run Training/Evaluation:** Point the training script to the extracted data paths:

```bash
# Set --dataset to one of: shoes, fashioniq, cirr
python3 train.py \
  --model_dir ./checkpoints/ \
  --dataset cirr \
  --cirr_path "path/to/CIRR" \
  --fashioniq_path "path/to/FashionIQ" \
  --shoes_path "path/to/Shoes"
```

---

## 📝⭐️ Citation

If you find this dataset or the OFFSET framework useful in your research, please consider leaving a **Star**⭐️ on our GitHub repository and **Citing**📝 our ACM MM 2025 paper:

```bibtex
@inproceedings{OFFSET,
  title     = {OFFSET: Segmentation-based Focus Shift Revision for Composed Image Retrieval},
  author    = {Chen, Zhiwei and Hu, Yupeng and Li, Zixu and Fu, Zhiheng and Song, Xuemeng and Nie, Liqiang},
  booktitle = {Proceedings of the ACM International Conference on Multimedia},
  pages     = {6113--6122},
  year      = {2025}
}
```
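
As a rough illustration of steps 1 and 2 under "How to Use", the sketch below extracts the archive and builds a lookup from image name to mask file, which can help confirm that every image you plan to use has a mask before training. The archive's internal layout and the mask file format are not documented in this card, so the recursive search and the rule that a mask shares its image's filename stem are assumptions. The helper names `extract_archive`, `index_by_stem`, and `find_missing_masks` are hypothetical and are not part of the OFFSET codebase; the Data Preparation guide remains the reference for the required directory layout.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, out_dir):
    """Extract the segmentation archive and return the output directory."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
    return out_dir


def index_by_stem(root):
    """Map each file stem (filename without extension) to its path.

    Assumption: a mask shares the stem of its original image, e.g. the
    mask for `B000ALGQSY.jpg` has the stem `B000ALGQSY`. If two files
    share a stem, the one that sorts last wins.
    """
    index = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            index[path.stem] = path
    return index


def find_missing_masks(image_names, mask_index):
    """Return the image names that have no mask with a matching stem."""
    return [name for name in image_names if Path(name).stem not in mask_index]
```

A typical call would pass `OFFSET_dominant_portion_segmentation.zip` and a target folder to `extract_archive`, feed the result to `index_by_stem`, and then check a list of image filenames from one split with `find_missing_masks`. An empty result suggests that split is fully covered under the stem-matching assumption.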

Provided by: iLearn-Lab
