Real vs Fake Faces Balanced Dataset with Multiple Dataset Splits

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://zenodo.org/record/14532967
Description:
The Real vs Fake Faces Balanced Dataset with Multiple Dataset Splits is a restructured version of the existing RVF10K dataset (available at: https://www.kaggle.com/datasets/sachchitkunichetty/rvf10k). It provides structured, balanced splits to support research on binary classification tasks, particularly deepfake detection and real vs. fake face image classification. The dataset contains a total of 10,000 high-quality face images, split equally between real and fake images. The real face images are sourced from NVIDIA's Flickr Faces HQ (FFHQ) dataset, while the fake face images are derived from Bojan Tunguz's 1 Million Fake Face dataset. The folder structure of Real vs Fake Faces has been optimized for compatibility and interoperability with the benchmark 140k Real and Fake Faces dataset, so that it integrates easily into existing research workflows.

To support diverse experimental setups and to facilitate benchmarking and reproducibility, Real vs Fake Faces is divided into four different training and testing splits:

60-40 (60% Training, 40% Testing): Designed for balanced training and moderately large test sets, suitable for initial experiments.
70-30 (70% Training, 30% Testing): Ideal for scenarios requiring more data for model training while maintaining a reasonable test set size.
75-25 (75% Training, 25% Testing): Focused on experiments needing extended training data with a smaller test set.
80-20 (80% Training, 20% Testing): Suitable for fine-tuning or model validation with maximal training data.

Dataset Structure: The dataset is organized into two main folders: real and fake. Each folder contains subdirectories named train, test, and validation, corresponding to the specific split proportions. This hierarchical design is intended to make the dataset easy to use in machine learning pipelines and with pre-existing tools.
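The arithmetic behind the four splits can be illustrated with a minimal sketch. It assumes 5,000 images per class (from the stated 10,000 images split equally) and applies the same training fraction to each class so that both partitions stay balanced. The function name `stratified_split`, the seed, and the placeholder file names are hypothetical; this is not the procedure the dataset authors used, and it ignores the validation subdirectory, whose proportion the description does not state.

```python
import random


def stratified_split(paths_by_class, train_fraction, seed=0):
    """Split each class's items into train/test using the same fraction.

    Hypothetical helper for illustration; not the dataset authors' code.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    split = {"train": {}, "test": {}}
    for label, paths in paths_by_class.items():
        shuffled = sorted(paths)  # sort first so input order does not matter
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        split["train"][label] = shuffled[:cut]
        split["test"][label] = shuffled[cut:]
    return split


# Placeholder file names; the naming inside the real archive is an assumption.
images = {
    "real": [f"real_{i:04d}.jpg" for i in range(5000)],
    "fake": [f"fake_{i:04d}.jpg" for i in range(5000)],
}

for name, fraction in [("60-40", 0.60), ("70-30", 0.70),
                       ("75-25", 0.75), ("80-20", 0.80)]:
    parts = stratified_split(images, fraction)
    n_train = sum(len(items) for items in parts["train"].values())
    n_test = sum(len(items) for items in parts["test"].values())
    print(f"{name}: {n_train} train, {n_test} test")
```

Splitting per class rather than over the pooled list keeps the real/fake ratio at exactly 50/50 in both the training and the testing partition for every split.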
Created:
2025-02-12