Aleksandar/NearID-PowerPaint

Name: Aleksandar/NearID-PowerPaint
Creator: Aleksandar
Published: 2026-04-03 12:11:00
License: CC-BY-4.0 (as stated in the dataset card)

Source: Hugging Face. Updated 2026-04-03; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/Aleksandar/NearID-PowerPaint


Description:

---
language:
- en
license: cc-by-4.0
size_categories:
- 10K<n<100K
task_categories:
- image-feature-extraction
pretty_name: NearID-PowerPaint (Near-Identity Distractors)
dataset_info:
  features:
  - name: id
    dtype: int64
  - name: category
    dtype: string
  - name: category_description
    dtype: string
  - name: nimg1
    dtype: image
  - name: nimg2
    dtype: image
  - name: nimg3
    dtype: image
  - name: n_images
    dtype: int64
  - name: objaverse_id
    dtype: string
  - name: prompts1
    dtype: string
  - name: prompts2
    dtype: string
  - name: prompts3
    dtype: string
  - name: quality
    dtype: string
  splits:
  - name: train
tags:
- nearid
- near-identity-distractors
- identity-embedding
- inpainting
- synthetic
- metric-learning
---

# NearID-PowerPaint: Near-Identity Distractors (PowerPaint inpainting)

[![Model](https://img.shields.io/badge/Model-nearid--siglip2-blue)](https://huggingface.co/Aleksandar/nearid-siglip2)
[![Paper](https://img.shields.io/badge/arXiv-2604.01973-b31b1b)](https://huggingface.co/papers/2604.01973)
[![Project Page](https://img.shields.io/badge/🌐-Project_Page-blue)](https://gorluxor.github.io/NearID/)
[![GitHub](https://img.shields.io/badge/GitHub-Repository-black?logo=github)](https://github.com/Gorluxor/NearID)
[![KAUST](https://img.shields.io/badge/KAUST-009B4D)](https://www.kaust.edu.sa/)
[![Snap Research](https://img.shields.io/badge/Snap_Research-FFFC00?logoColor=black)](https://research.snap.com/)

This dataset contains **near-identity distractors** generated by **PowerPaint inpainting** at **512×512** resolution as part of the [NearID](https://huggingface.co/Aleksandar/nearid-siglip2) project. It was presented in the paper [NearID: Identity Representation Learning via Near-identity Distractors](https://huggingface.co/papers/2604.01973).

Each sample contains up to 3 distractor images (`nimg1`, `nimg2`, `nimg3`): different but visually similar instances inpainted into the **exact same background/context** as the corresponding anchor in the base [Aleksandar/NearID](https://huggingface.co/datasets/Aleksandar/NearID) dataset. These distractors are used to train and evaluate identity embeddings that distinguish true identity from contextual shortcuts.

## Quick Start

```python
from datasets import load_dataset

# Load this negative source
ds = load_dataset("Aleksandar/NearID-PowerPaint")

# Load base positives for anchor/positive pairs
positives = load_dataset("Aleksandar/NearID")
```

## Dataset Structure

| Column | Type | Description |
|---|---|---|
| `id` | int64 | Sample ID (matches the base NearID dataset) |
| `category` | string | Object category (`rigid`) |
| `category_description` | string | Natural language description of the object |
| `nimg1`, `nimg2`, `nimg3` | image | Near-identity distractor images (up to 3 per sample) |
| `n_images` | int64 | Number of valid distractor images |
| `objaverse_id` | string | Source Objaverse object identifier |
| `prompts1`, `prompts2`, `prompts3` | string | Generation prompts for each distractor |
| `quality` | string | Quality label |

## How the Distractors Were Generated

1. For each anchor identity in the base NearID dataset, a semantically similar but **different** object instance was retrieved.
2. The distractor instance was inpainted into the **same background** as the anchor using **PowerPaint inpainting**.
3. Resolution: **512×512** pixels.

This creates a controlled test: a model must rely on intrinsic identity features, not background context, to distinguish anchor from distractor.
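
Because each sample carries a variable number of distractors and shares its `id` with the base dataset, a small amount of glue code is needed before anchors and distractors can be compared. The sketch below is one possible way to do that on plain dict-like rows. The helper names (`valid_distractors`, `index_by_id`, `attach_distractors`) are hypothetical, and it assumes that an unused `nimgK` slot is stored as `None`, which the dataset card does not state explicitly.

```python
from typing import Any, Dict, Iterable, Iterator, List, Mapping, Tuple

SLOTS = (1, 2, 3)  # the card lists nimg1..nimg3 and prompts1..prompts3


def valid_distractors(row: Mapping[str, Any]) -> List[Tuple[Any, Any]]:
    """Return (image, prompt) pairs for the distractor slots that are filled.

    Assumption (not confirmed by the card): an unused slot is None or absent.
    """
    pairs = []
    for k in SLOTS:
        image = row.get(f"nimg{k}")
        if image is not None:
            pairs.append((image, row.get(f"prompts{k}")))
    return pairs


def index_by_id(rows: Iterable[Mapping[str, Any]]) -> Dict[int, Mapping[str, Any]]:
    """Build an id -> row lookup, e.g. for rows of the base NearID dataset."""
    return {row["id"]: row for row in rows}


def attach_distractors(
    negative_rows: Iterable[Mapping[str, Any]],
    positives_by_id: Mapping[int, Mapping[str, Any]],
) -> Iterator[Tuple[Mapping[str, Any], List[Tuple[Any, Any]]]]:
    """Yield (base_row, distractor_pairs) for ids present in both sources.

    Rows without a matching base id or without any filled slot are skipped.
    """
    for neg in negative_rows:
        base = positives_by_id.get(neg["id"])
        pairs = valid_distractors(neg)
        if base is not None and pairs:
            yield base, pairs
```

With the Quick Start objects this might be called as `attach_distractors(ds["train"], index_by_id(positives["train"]))`, assuming the base dataset also exposes a `train` split. Building the lookup iterates over every base row, so for large datasets an index from `id` to row position may be lighter on memory. The column names of the base dataset are not listed here, so the base rows are returned untouched.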

## All NearID Datasets

| Dataset | Description | Resolution |
|---|---|---|
| [Aleksandar/NearID](https://huggingface.co/datasets/Aleksandar/NearID) | Multi-view positives (anchor + positive views) | Base |
| [Aleksandar/NearID-Flux](https://huggingface.co/datasets/Aleksandar/NearID-Flux) | Near-identity distractors via FLUX.1 inpainting | 512×512 |
| [Aleksandar/NearID-Flux_1024](https://huggingface.co/datasets/Aleksandar/NearID-Flux_1024) | Near-identity distractors via FLUX.1 inpainting | 1024×1024 |
| [Aleksandar/NearID-FluxC](https://huggingface.co/datasets/Aleksandar/NearID-FluxC) | Near-identity distractors via FLUX.1 Canny-guided inpainting | 512×512 |
| [Aleksandar/NearID-FluxC_1024](https://huggingface.co/datasets/Aleksandar/NearID-FluxC_1024) | Near-identity distractors via FLUX.1 Canny-guided inpainting | 1024×1024 |
| [Aleksandar/NearID-PowerPaint](https://huggingface.co/datasets/Aleksandar/NearID-PowerPaint) | Near-identity distractors via PowerPaint inpainting **(this dataset)** | 512×512 |
| [Aleksandar/NearID-Qwen](https://huggingface.co/datasets/Aleksandar/NearID-Qwen) | Near-identity distractors via Qwen-based inpainting | 512×512 |
| [Aleksandar/NearID-Qwen_1328](https://huggingface.co/datasets/Aleksandar/NearID-Qwen_1328) | Near-identity distractors via Qwen-based inpainting | 1328×1328 |
| [Aleksandar/NearID-SDXL](https://huggingface.co/datasets/Aleksandar/NearID-SDXL) | Near-identity distractors via Stable Diffusion XL inpainting | 512×512 |
| [Aleksandar/NearID-SDXL_1024](https://huggingface.co/datasets/Aleksandar/NearID-SDXL_1024) | Near-identity distractors via Stable Diffusion XL inpainting | 1024×1024 |

## Related

- **Model:** [Aleksandar/nearid-siglip2](https://huggingface.co/Aleksandar/nearid-siglip2) (NearID identity embedding model)
- **Paper:** [NearID: Identity Representation Learning via Near-identity Distractors](https://huggingface.co/papers/2604.01973)
- **Code:** [github.com/Gorluxor/NearID](https://github.com/Gorluxor/NearID)

## License & Attribution

This dataset is released under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/). It is derived from the [SynCD](https://github.com/nupurkmr9/syncd) dataset (MIT License, Copyright 2022 SynCD). If you use this dataset, please cite both NearID and SynCD.

## Citation

```bibtex
@article{cvejic2026nearid,
  title={NearID: Identity Representation Learning via Near-identity Distractors},
  author={Cvejic, Aleksandar and Abdal, Rameen and Eldesokey, Abdelrahman and Ghanem, Bernard and Wonka, Peter},
  journal={arXiv preprint arXiv:2604.01973},
  year={2026}
}
```

Provider: Aleksandar
