scaleinvariant/paired-arcface-embeddings-casia-webface
Source: Hugging Face. Updated 2026-03-14; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/scaleinvariant/paired-arcface-embeddings-casia-webface
Description:
---
license: cc
configs:
- config_name: default
  data_files:
  - split: train
    path: train/*.parquet
  - split: validation
    path: validation/*.parquet
pretty_name: paired-arcface
---
# Paired ArcFace Embeddings (CASIA-WebFace)
This dataset contains randomly paired face records from CASIA-WebFace, each embedded with ArcFace 512 (512-dimensional embeddings).
CASIA-WebFace was introduced by Yi et al. in _Learning Face Representation from Scratch_ and is used for face verification and face identification tasks.
This dataset is intended to enable fast training of models that learn from paired embeddings.
## Record fields
- `image1_jpeg`, `image2_jpeg`: JPEG bytes for each image in the pair
- `image1_metadata`, `image2_metadata`: per-image metadata payloads
- `image1_embedding0`, `image2_embedding0`: ArcFace embedding vectors
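
As an illustration of how the two embedding fields might be compared, here is a minimal sketch that computes the cosine similarity of one pair. It assumes each embedding is stored as a flat sequence of floats, which is not confirmed by the card. `cosine_similarity`, `pair_similarity`, and `fake_record` are hypothetical names introduced for this example, and the record shown is synthetic (3 dimensions instead of 512).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cannot compare an all-zero vector")
    return dot / (norm_a * norm_b)


def pair_similarity(record):
    """Compare the two embeddings of one record (field names as listed above)."""
    return cosine_similarity(
        record["image1_embedding0"], record["image2_embedding0"]
    )


# Synthetic stand-in for a real record (assumption: embeddings are flat lists).
fake_record = {
    "image1_embedding0": [1.0, 0.0, 0.0],
    "image2_embedding0": [1.0, 1.0, 0.0],
}
print(round(pair_similarity(fake_record), 4))  # 0.7071
```

Because the records are randomly paired, the similarity of a given pair may be high or low; whether the metadata fields indicate a shared identity is not stated here.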
## Arc2Face naming note
We chose ArcFace 512 for the embeddings because the Arc2Face diffusion model can generate face images from them, which is useful for debugging visualizations.
See https://huggingface.co/FoivosPar/Arc2Face
## Splits
- `train`: 9 parquet files, 3.50 GB
- `validation`: 20 parquet files, 345.90 MB
## Example
```python
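from datasets import load_dataset

# Stream the split instead of downloading every parquet file up front.
ds = load_dataset("scaleinvariant/paired-arcface-embeddings-casia-webface", split="train", streaming=True)
# Print the field names of the first record.
print(next(iter(ds)).keys())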
```
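
To feed streamed records into a training loop, one option is to group them into fixed-size batches of embedding pairs. The following is a sketch only: `batched_pairs` and `demo_records` are hypothetical names, and the synthetic records stand in for the streamed dataset so the snippet runs without network access. It should work with any iterable of records that expose the two embedding fields.

```python
from itertools import islice


def batched_pairs(records, batch_size):
    """Yield (left, right) lists of embeddings, batch_size records at a time."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        left = [r["image1_embedding0"] for r in chunk]
        right = [r["image2_embedding0"] for r in chunk]
        yield left, right


# Synthetic records; real embeddings are 512-dimensional.
demo_records = [
    {"image1_embedding0": [float(i)], "image2_embedding0": [float(-i)]}
    for i in range(5)
]
batch_sizes = [len(left) for left, _ in batched_pairs(demo_records, batch_size=2)]
print(batch_sizes)  # [2, 2, 1]
```

The last batch may be smaller than `batch_size`; a training loop that needs uniform shapes would have to drop or pad it.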
Provider:
scaleinvariant



