Aleksandar/NearID-PowerPaint
Source: Hugging Face | Updated: 2026-04-03 | Indexed: 2026-04-12
Download link:
https://hf-mirror.com/datasets/Aleksandar/NearID-PowerPaint
Dataset description:
---
language:
- en
license: cc-by-4.0
size_categories:
- 10K<n<100K
task_categories:
- image-feature-extraction
pretty_name: NearID-PowerPaint (Near-Identity Distractors)
dataset_info:
  features:
  - name: id
    dtype: int64
  - name: category
    dtype: string
  - name: category_description
    dtype: string
  - name: nimg1
    dtype: image
  - name: nimg2
    dtype: image
  - name: nimg3
    dtype: image
  - name: n_images
    dtype: int64
  - name: objaverse_id
    dtype: string
  - name: prompts1
    dtype: string
  - name: prompts2
    dtype: string
  - name: prompts3
    dtype: string
  - name: quality
    dtype: string
  splits:
  - name: train
tags:
- nearid
- near-identity-distractors
- identity-embedding
- inpainting
- synthetic
- metric-learning
---
# NearID-PowerPaint — Near-Identity Distractors (PowerPaint inpainting)
[Model](https://huggingface.co/Aleksandar/nearid-siglip2) | [Paper](https://huggingface.co/papers/2604.01973) | [Project page](https://gorluxor.github.io/NearID/) | [Code](https://github.com/Gorluxor/NearID) | [KAUST](https://www.kaust.edu.sa/) | [Snap Research](https://research.snap.com/)
This dataset contains **near-identity distractors** generated by **PowerPaint inpainting** at **512×512** resolution as part of the [NearID](https://huggingface.co/Aleksandar/nearid-siglip2) project.
It was presented in the paper [NearID: Identity Representation Learning via Near-identity Distractors](https://huggingface.co/papers/2604.01973).
Each sample contains up to 3 distractor images (`nimg1`, `nimg2`, `nimg3`): different but visually similar instances inpainted into the **exact same background/context** as the corresponding anchor in the base [Aleksandar/NearID](https://huggingface.co/datasets/Aleksandar/NearID) dataset. These distractors are used to train and evaluate identity embeddings that distinguish true identity from contextual shortcuts.
## Quick Start
```python
from datasets import load_dataset
# Load this negative source
ds = load_dataset("Aleksandar/NearID-PowerPaint")
# Load base positives for anchor/positive pairs
positives = load_dataset("Aleksandar/NearID")
```
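Because the `id` column is documented as matching the base NearID dataset, rows from the two sources can be joined on it. The following is a minimal sketch, not part of the official NearID code: `pair_by_id` is a hypothetical helper, and it assumes each row behaves like a dictionary with an `id` key.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

Row = Dict[str, Any]


def pair_by_id(
    negatives: Iterable[Row], positives: Iterable[Row]
) -> Iterator[Tuple[Row, Row]]:
    """Yield (positive_row, negative_row) pairs that share the same `id`.

    Hypothetical helper: rows are assumed to be dict-like with an `id` key.
    Negative rows without a matching positive row are skipped.
    """
    # Index the positives once so each lookup is O(1).
    positives_by_id = {row["id"]: row for row in positives}
    for neg in negatives:
        pos = positives_by_id.get(neg["id"])
        if pos is not None:
            yield pos, neg
```

With the objects from the Quick Start, a call might look like `pair_by_id(ds["train"], positives["train"])`, assuming the base dataset also exposes a `train` split. Note that the index keeps every positive row in memory, so for large datasets it may be preferable to index only the columns that are needed.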
## Dataset Structure
| Column | Type | Description |
|---|---|---|
| `id` | int64 | Sample ID (matches the base NearID dataset) |
| `category` | string | Object category (`rigid`) |
| `category_description` | string | Natural language description of the object |
| `nimg1`, `nimg2`, `nimg3` | image | Near-identity distractor images (up to 3 per sample) |
| `n_images` | int64 | Number of valid distractor images |
| `objaverse_id` | string | Source Objaverse object identifier |
| `prompts1`, `prompts2`, `prompts3` | string | Generation prompts for each distractor |
| `quality` | string | Quality label |
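Since a sample holds up to three distractors, code that consumes a row usually needs to skip the unused slots. The sketch below pairs each present image with its prompt. It assumes that unused `nimg*` slots are stored as `None`, which the card does not state; `collect_distractors` is a hypothetical helper name.

```python
from typing import Any, Dict, List, Tuple


def collect_distractors(row: Dict[str, Any]) -> List[Tuple[Any, Any]]:
    """Return (image, prompt) pairs for the distractor slots that are filled.

    Assumption: an unused slot has `None` in its `nimg*` column.
    """
    pairs = []
    for slot in (1, 2, 3):
        image = row.get(f"nimg{slot}")
        if image is not None:
            pairs.append((image, row.get(f"prompts{slot}")))
    return pairs
```

Comparing `len(collect_distractors(row))` with `row["n_images"]` is a cheap sanity check on that assumption.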
## How the Distractors Were Generated
1. For each anchor identity in the base NearID dataset, a semantically similar but **different** object instance was retrieved.
2. The distractor instance was inpainted into the **same background** as the anchor using **PowerPaint inpainting**.
3. Resolution: **512×512** pixels.
This creates a controlled test: a model must rely on intrinsic identity features, not background context, to distinguish anchor from distractor.
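One way to make this test concrete is to check, per sample, whether an embedding model ranks the true positive above every distractor. The sketch below is illustrative only and is not the evaluation protocol from the paper: it assumes embeddings are already available as plain lists of floats, and `identity_beats_context` is a hypothetical name.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def identity_beats_context(
    anchor: Sequence[float],
    positive: Sequence[float],
    distractors: Sequence[Sequence[float]],
) -> bool:
    """True if the anchor is closer to the positive than to every distractor.

    Illustrative check only; the embeddings are assumed to come from
    whatever identity model is being evaluated.
    """
    pos_sim = cosine(anchor, positive)
    return all(pos_sim > cosine(anchor, d) for d in distractors)
```

Averaging this boolean over all samples gives a simple success rate. Because anchor and distractor share a background, a model that relies on context would be expected to fail this check more often.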
## All NearID Datasets
| Dataset | Description | Resolution |
|---|---|---|
| [Aleksandar/NearID](https://huggingface.co/datasets/Aleksandar/NearID) | Multi-view positives (anchor + positive views) | Base |
| [Aleksandar/NearID-Flux](https://huggingface.co/datasets/Aleksandar/NearID-Flux) | Near-identity distractors via FLUX.1 inpainting | 512×512 |
| [Aleksandar/NearID-Flux_1024](https://huggingface.co/datasets/Aleksandar/NearID-Flux_1024) | Near-identity distractors via FLUX.1 inpainting | 1024×1024 |
| [Aleksandar/NearID-FluxC](https://huggingface.co/datasets/Aleksandar/NearID-FluxC) | Near-identity distractors via FLUX.1 Canny-guided inpainting | 512×512 |
| [Aleksandar/NearID-FluxC_1024](https://huggingface.co/datasets/Aleksandar/NearID-FluxC_1024) | Near-identity distractors via FLUX.1 Canny-guided inpainting | 1024×1024 |
| [Aleksandar/NearID-PowerPaint](https://huggingface.co/datasets/Aleksandar/NearID-PowerPaint) | Near-identity distractors via PowerPaint inpainting (**this dataset**) | 512×512 |
| [Aleksandar/NearID-Qwen](https://huggingface.co/datasets/Aleksandar/NearID-Qwen) | Near-identity distractors via Qwen-based inpainting | 512×512 |
| [Aleksandar/NearID-Qwen_1328](https://huggingface.co/datasets/Aleksandar/NearID-Qwen_1328) | Near-identity distractors via Qwen-based inpainting | 1328×1328 |
| [Aleksandar/NearID-SDXL](https://huggingface.co/datasets/Aleksandar/NearID-SDXL) | Near-identity distractors via Stable Diffusion XL inpainting | 512×512 |
| [Aleksandar/NearID-SDXL_1024](https://huggingface.co/datasets/Aleksandar/NearID-SDXL_1024) | Near-identity distractors via Stable Diffusion XL inpainting | 1024×1024 |
## Related
- **Model:** [Aleksandar/nearid-siglip2](https://huggingface.co/Aleksandar/nearid-siglip2) — NearID identity embedding model
- **Paper:** [NearID: Identity Representation Learning via Near-identity Distractors](https://huggingface.co/papers/2604.01973)
- **Code:** [github.com/Gorluxor/NearID](https://github.com/Gorluxor/NearID)
## License & Attribution
This dataset is released under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/). It is derived from the [SynCD](https://github.com/nupurkmr9/syncd) dataset (MIT License, Copyright 2022 SynCD). If you use this dataset, please cite both NearID and SynCD.
## Citation
```bibtex
@article{cvejic2026nearid,
title={NearID: Identity Representation Learning via Near-identity Distractors},
author={Cvejic, Aleksandar and Abdal, Rameen and Eldesokey, Abdelrahman and Ghanem, Bernard and Wonka, Peter},
journal={arXiv preprint arXiv:2604.01973},
year={2026}
}
```
Provided by: Aleksandar