Anti-Spoofing Real Videos Dataset - 87,340 Files for Facial Recognition and Anti-spoofing
Source: Databricks. Indexed 2026-04-10.
Download link:
https://marketplace.databricks.com/details/3c1e9e76-b879-49e3-ae40-9e054c69a5fe/Unidata_Anti-Spoofing-Real-Videos-Dataset---87,340-Files-for-Facial-Recognition-and-Anti-spoofing
Resource description:
Overview
The Anti-Spoofing Real Videos Dataset is a large-scale biometric collection produced by Unidata. It contains 87,340 files: videos and selfies of genuine, living faces recorded by participants from 170 countries. The dataset serves as a foundational resource for developing and benchmarking face anti-spoofing systems, facial recognition algorithms, and liveness detection pipelines.
Unlike attack-focused datasets, this collection captures exclusively real faces and natural behavior, providing the clean "genuine" class that every anti-spoofing system requires. Researchers can pair it with attack data to train binary classifiers, or use it standalone to evaluate how reliably a spoofing detection model accepts legitimate users, a critical metric for any production biometric authentication deployment.
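The standalone evaluation described above can be sketched as a bona fide rejection rate: the share of genuine samples a detector wrongly rejects (ISO/IEC 30107-3 calls this BPCER). This is a minimal illustration and not part of the dataset; the function name, the score convention (higher means more likely live), and the example scores are all assumptions.

```python
from typing import Sequence


def bona_fide_rejection_rate(live_scores: Sequence[float], threshold: float) -> float:
    """Fraction of genuine samples rejected, i.e. scored below the threshold.

    Assumes higher scores mean "more likely a live face".
    """
    if not live_scores:
        raise ValueError("need at least one score")
    rejected = sum(1 for score in live_scores if score < threshold)
    return rejected / len(live_scores)


# Made-up detector scores for five genuine clips, for illustration only.
scores = [0.97, 0.91, 0.42, 0.88, 0.76]
rate = bona_fide_rejection_rate(scores, threshold=0.5)
print(f"{rate:.2f}")  # prints 0.20: one of five genuine clips is rejected
```

Because the dataset contains only real faces, every rejection counted here is an error against a legitimate user; measuring acceptance of attacks would need separate attack data.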
The dataset supports development toward iBeta Level 2 certification, the leading benchmark for robust presentation attack detection (PAD) in biometric security.
Content & Structure
- 87,340 total files: video clips and selfie images
- Participants from 170 countries across all major world regions
- Recorded on a wide range of popular consumer devices: iPhone, Samsung, Xiaomi, and others
- Videos feature individuals performing natural head movements — turning left, right, up, and down — simulating real-world liveness cues
- Both video and static image formats included, supporting multi-modal recognition system development
Subject Demographics
The dataset reflects broad demographic representation, enabling models to generalize across diverse real-world populations rather than overfit to a narrow group.
Gender split:
- Male: 52%
- Female: 48%
Age groups:
- 18–25
- 26–35
- 36–45
- 46–60
- 60+
Geographic and ethnic coverage spans 170 countries and includes Caucasian, Asian, African, Latin American, Middle Eastern, and South Asian participants. This breadth directly supports domain generalization in anti-spoofing algorithms and helps reduce demographic bias in biometric systems.
Metadata & Annotations
Every file in the dataset is accompanied by structured metadata, giving researchers the context needed for rigorous model training and analysis:
1. Gender
2. Age
3. Ethnicity
4. Video resolution
5. Duration
6. Frames per second (FPS)
7. Recording device model
8. Country of origin
This annotation depth allows teams to segment training data by demographic subset, evaluate detection algorithms for fairness across age or ethnicity groups, and build anti-spoofing solutions with measurably stronger domain generalization.
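A fairness check of the kind described above might look like the following sketch, which groups genuine samples by one metadata attribute and reports the rejection rate per group. The listing does not specify how the metadata is delivered, so the `FileMetadata` field names and types, the score convention (higher means more likely live), and the sample records are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class FileMetadata:
    # Hypothetical field names mirroring the eight listed attributes.
    gender: str
    age: int
    ethnicity: str
    resolution: str
    duration_s: float
    fps: float
    device_model: str
    country: str


def rejection_rate_by(
    samples: Iterable[Tuple[FileMetadata, float]],
    attribute: str,
    threshold: float,
) -> Dict[str, float]:
    """Per-group share of genuine samples scored below the threshold."""
    totals: Dict[str, int] = defaultdict(int)
    rejected: Dict[str, int] = defaultdict(int)
    for meta, score in samples:
        group = str(getattr(meta, attribute))
        totals[group] += 1
        if score < threshold:
            rejected[group] += 1
    return {group: rejected[group] / totals[group] for group in totals}


# Made-up records and detector scores, for illustration only.
samples = [
    (FileMetadata("female", 24, "Asian", "1920x1080", 8.0, 30.0, "iPhone", "JP"), 0.93),
    (FileMetadata("male", 41, "African", "1280x720", 6.5, 30.0, "Samsung Galaxy", "NG"), 0.35),
    (FileMetadata("male", 33, "African", "1920x1080", 7.2, 30.0, "Xiaomi", "KE"), 0.81),
]
print(rejection_rate_by(samples, "ethnicity", threshold=0.5))
```

A large gap between groups at the same threshold is the signal to look for; the same function works for `gender`, `device_model`, or `country` by changing the attribute name.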
Technical Specifications
- File formats: MP4 (video), JPG (selfies)
- Recording devices: iPhone (multiple generations), Samsung Galaxy series, Xiaomi Mi series, and additional Android devices
- Resolutions: HD to Full HD (1080p), varying by device
- Head movement types captured: frontal, left, right, upward tilt, downward tilt
- Collection method: crowdsourced, reflecting natural real-world variability in lighting, backgrounds, and angles
Crowdsourced capture ensures the dataset represents the unpredictable conditions authentication systems encounter in production — varied lighting, informal backgrounds, and inconsistent camera positioning — rather than the overly clean conditions of lab recordings.
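Since the collection mixes MP4 videos and JPG selfies, a first preprocessing step is often to separate the two by file extension. The sketch below assumes nothing about the real directory layout: the example paths are invented, and accepting `.jpeg` alongside `.jpg` is an assumption.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple

VIDEO_EXTENSIONS = {".mp4"}
IMAGE_EXTENSIONS = {".jpg", ".jpeg"}  # ".jpeg" is an assumption


def split_by_format(paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (videos, selfies); files with other extensions are skipped."""
    videos: List[str] = []
    selfies: List[str] = []
    for path in paths:
        suffix = PurePosixPath(path).suffix.lower()  # case-insensitive match
        if suffix in VIDEO_EXTENSIONS:
            videos.append(path)
        elif suffix in IMAGE_EXTENSIONS:
            selfies.append(path)
    return videos, selfies


# Invented paths, for illustration only.
videos, selfies = split_by_format(
    ["a/clip_001.MP4", "a/selfie_001.jpg", "a/readme.txt"]
)
print(len(videos), len(selfies))  # prints: 1 1
```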
Use Cases
- Liveness Detection Training. The core application. Anti-spoofing techniques and detection algorithms require high-quality genuine faces as the positive class. This dataset provides 87,340 clean, diverse real-face samples to anchor that training. Paired with attack samples such as printed photos, 3D masks, and replay attacks, they help a classifier separate real users from spoofs and reduce false acceptances.
- Facial Recognition & Identity Verification. Developers building facial recognition and identity verification pipelines use demographically diverse real-face data to improve accuracy across population groups. The dataset's coverage of 170 countries directly addresses one of the most common failure modes in recognition technology: poor performance on underrepresented ethnicities.
- iBeta Level 2 Certification. Organizations pursuing iBeta Level 2 — the most rigorous standard for biometric presentation attack detection — need a validated real-face baseline for system evaluation. This dataset provides that foundation, supporting compliance with ISO/IEC 30107 requirements for liveness detection in security systems.
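For the liveness-training use case, pairing these genuine files with a separately sourced attack set amounts to building a labeled manifest. The following is a rough sketch under stated assumptions: the label values, the function name, and the example paths are illustrative, and the attack data is not part of this dataset.

```python
import random
from typing import List, Sequence, Tuple

BONA_FIDE = 1  # genuine face, treated as the positive class
ATTACK = 0     # presentation attack from a separate dataset


def build_manifest(
    genuine_paths: Sequence[str],
    attack_paths: Sequence[str],
    seed: int = 0,
) -> List[Tuple[str, int]]:
    """Combine both sources into (path, label) pairs in a reproducible shuffled order."""
    manifest = [(path, BONA_FIDE) for path in genuine_paths]
    manifest += [(path, ATTACK) for path in attack_paths]
    random.Random(seed).shuffle(manifest)  # local RNG, global state untouched
    return manifest


# Invented paths, for illustration only.
manifest = build_manifest(
    ["real/0001.mp4", "real/0002.jpg"],
    ["attack/print_01.mp4"],
)
print(sum(label for _, label in manifest), len(manifest))  # prints: 2 3
```

Using a seeded local generator keeps the shuffle repeatable across runs, which makes training and evaluation splits easier to audit.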
Compliance & Data Protection
The dataset was collected under consent-based conditions through crowdsourcing platforms. All biometric data complies with GDPR and applicable international data protection regulations. Data is stored on AWS infrastructure certified to ISO 27001 and ISO 27701 standards, meeting enterprise requirements for handling sensitive biometric information.
Summary
The Anti-Spoofing Real Videos Dataset is a production-ready collection of 87,340 genuine face videos and selfies from 170 countries, built for teams developing face anti-spoofing, liveness detection, and facial recognition systems. Its demographic breadth (52% male / 48% female, spanning all major age groups and ethnicities), combined with detailed per-file metadata, natural facial movements, and real-world capture conditions, makes it an essential component of any serious biometric security research or certification workflow. It provides the high-quality real-face baseline required for iBeta Level 2 preparation and for building anti-spoofing algorithms that perform reliably across the full diversity of real users.
Provider:
Unidata



