CUHK-CSE/wider_face

Name: CUHK-CSE/wider_face
Creator: CUHK-CSE
Published: 2024-01-18 11:17:56
License: CC BY-NC-ND 4.0

Source: Hugging Face. Updated 2024-01-18. Indexed 2024-06-15.

Download link:

https://hf-mirror.com/datasets/CUHK-CSE/wider_face

Resource description:

---
annotations_creators:
- expert-generated
language_creators:
- found
language:
- en
license:
- cc-by-nc-nd-4.0
multilinguality:
- monolingual
size_categories:
- 10K<n<100K
source_datasets:
- extended|other-wider
task_categories:
- object-detection
task_ids:
- face-detection
paperswithcode_id: wider-face-1
pretty_name: WIDER FACE
dataset_info:
  features:
  - name: image
    dtype: image
  - name: faces
    sequence:
    - name: bbox
      sequence: float32
      length: 4
    - name: blur
      dtype:
        class_label:
          names:
            '0': clear
            '1': normal
            '2': heavy
    - name: expression
      dtype:
        class_label:
          names:
            '0': typical
            '1': exaggerate
    - name: illumination
      dtype:
        class_label:
          names:
            '0': normal
            '1': 'exaggerate '
    - name: occlusion
      dtype:
        class_label:
          names:
            '0': 'no'
            '1': partial
            '2': heavy
    - name: pose
      dtype:
        class_label:
          names:
            '0': typical
            '1': atypical
    - name: invalid
      dtype: bool
  splits:
  - name: train
    num_bytes: 12049881
    num_examples: 12880
  - name: test
    num_bytes: 3761103
    num_examples: 16097
  - name: validation
    num_bytes: 2998735
    num_examples: 3226
  download_size: 3676086479
  dataset_size: 18809719
---

# Dataset Card for WIDER FACE

## Table of Contents

- [Table of Contents](#table-of-contents)
- [Dataset Description](#dataset-description)
  - [Dataset Summary](#dataset-summary)
  - [Supported Tasks and Leaderboards](#supported-tasks-and-leaderboards)
  - [Languages](#languages)
- [Dataset Structure](#dataset-structure)
  - [Data Instances](#data-instances)
  - [Data Fields](#data-fields)
  - [Data Splits](#data-splits)
- [Dataset Creation](#dataset-creation)
  - [Curation Rationale](#curation-rationale)
  - [Source Data](#source-data)
  - [Annotations](#annotations)
  - [Personal and Sensitive Information](#personal-and-sensitive-information)
- [Considerations for Using the Data](#considerations-for-using-the-data)
  - [Social Impact of Dataset](#social-impact-of-dataset)
  - [Discussion of Biases](#discussion-of-biases)
  - [Other Known Limitations](#other-known-limitations)
- [Additional Information](#additional-information)
  - [Dataset Curators](#dataset-curators)
  - [Licensing Information](#licensing-information)
  - [Citation Information](#citation-information)
  - [Contributions](#contributions)

## Dataset Description

- **Homepage:** http://shuoyang1213.me/WIDERFACE/index.html
- **Repository:**
- **Paper:** [WIDER FACE: A Face Detection Benchmark](https://arxiv.org/abs/1511.06523)
- **Leaderboard:** http://shuoyang1213.me/WIDERFACE/WiderFace_Results.html
- **Point of Contact:** shuoyang.1213@gmail.com

### Dataset Summary

The WIDER FACE dataset is a face detection benchmark whose images are selected from the publicly available WIDER dataset. We choose 32,203 images and label 393,703 faces with a high degree of variability in scale, pose and occlusion, as depicted in the sample images. The WIDER FACE dataset is organized based on 61 event classes. For each event class, we randomly select 40%/10%/50% of the data as training, validation and testing sets. We adopt the same evaluation metric employed in the PASCAL VOC dataset. Similar to the MALF and Caltech datasets, we do not release bounding box ground truth for the test images. Users are required to submit final prediction files, which we shall proceed to evaluate.

### Supported Tasks and Leaderboards

- `face-detection`: The dataset can be used to train a model for Face Detection. More information on evaluating the model's performance can be found [here](http://shuoyang1213.me/WIDERFACE/WiderFace_Results.html).

### Languages

English

## Dataset Structure

### Data Instances

A data point comprises an image and its face annotations.
```
{
  'image': <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1024x755 at 0x19FA12186D8>,
  'faces': {
    'bbox': [
      [178.0, 238.0, 55.0, 73.0],
      [248.0, 235.0, 59.0, 73.0],
      [363.0, 157.0, 59.0, 73.0],
      [468.0, 153.0, 53.0, 72.0],
      [629.0, 110.0, 56.0, 81.0],
      [745.0, 138.0, 55.0, 77.0]
    ],
    'blur': [2, 2, 2, 2, 2, 2],
    'expression': [0, 0, 0, 0, 0, 0],
    'illumination': [0, 0, 0, 0, 0, 0],
    'occlusion': [1, 2, 1, 2, 1, 2],
    'pose': [0, 0, 0, 0, 0, 0],
    'invalid': [False, False, False, False, False, False]
  }
}
```

### Data Fields

- `image`: A `PIL.Image.Image` object containing the image. Note that when accessing the image column (`dataset[0]["image"]`), the image file is automatically decoded. Decoding a large number of image files can take a significant amount of time, so it is important to query the sample index before the `"image"` column, *i.e.* `dataset[0]["image"]` should **always** be preferred over `dataset["image"][0]`.
- `faces`: a dictionary of face attributes for the faces present in the image
  - `bbox`: the bounding box of each face (in the [coco](https://albumentations.ai/docs/getting_started/bounding_boxes_augmentation/#coco) format)
  - `blur`: the blur level of each face, with possible values including `clear` (0), `normal` (1) and `heavy` (2)
  - `expression`: the facial expression of each face, with possible values including `typical` (0) and `exaggerate` (1)
  - `illumination`: the lighting condition of each face, with possible values including `normal` (0) and `exaggerate` (1)
  - `occlusion`: the level of occlusion of each face, with possible values including `no` (0), `partial` (1) and `heavy` (2)
  - `pose`: the pose of each face, with possible values including `typical` (0) and `atypical` (1)
  - `invalid`: whether each face annotation is flagged as invalid.

### Data Splits

The data is split into training, validation and testing sets. The WIDER FACE dataset is organized based on 61 event classes.
For each event class, 40%/10%/50% of the data is randomly selected as the training, validation and testing sets. The training set contains 12,880 images, the validation set 3,226 images and the test set 16,097 images.

## Dataset Creation

### Curation Rationale

The curators state that current face detection datasets typically contain a few thousand faces, with limited variation in pose, scale, facial expression, occlusion, and background clutter, making it difficult to assess real-world performance. They argue that these dataset limitations have partially contributed to the failure of some algorithms to cope with heavy occlusion, small scale, and atypical pose.

### Source Data

#### Initial Data Collection and Normalization

The WIDER FACE dataset is a subset of the WIDER dataset. The images in WIDER were collected in the following three steps:

1. Event categories were defined and chosen following the Large Scale Ontology for Multimedia (LSCOM) [22], which provides around 1000 concepts relevant to video event analysis.
2. Images were retrieved using search engines like Google and Bing. For each category, 1000-3000 images were collected.
3. The data were cleaned by manually examining all the images and filtering out images without a human face. Then, similar images in each event category were removed to ensure large diversity in face appearance.

A total of 32,203 images are eventually included in the WIDER FACE dataset.

#### Who are the source language producers?

The images are selected from the publicly available WIDER dataset.

### Annotations

#### Annotation process

The curators label the bounding boxes for all the recognizable faces in the WIDER FACE dataset. The bounding box is required to tightly contain the forehead, chin, and cheek. If a face is occluded, they still label it with a bounding box but with an estimation on the scale of occlusion.
Similar to the PASCAL VOC dataset [6], they assign an 'Ignore' flag to any face that is very difficult to recognize due to low resolution and small scale (10 pixels or less). After annotating the face bounding boxes, they further annotate the following attributes: pose (typical, atypical) and occlusion level (partial, heavy). Each annotation is labeled by one annotator and cross-checked by two different people.

#### Who are the annotators?

Shuo Yang, Ping Luo, Chen Change Loy and Xiaoou Tang.

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

Shuo Yang, Ping Luo, Chen Change Loy and Xiaoou Tang

### Licensing Information

[Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)](https://creativecommons.org/licenses/by-nc-nd/4.0/).

### Citation Information

```
@inproceedings{yang2016wider,
  Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},
  Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  Title = {WIDER FACE: A Face Detection Benchmark},
  Year = {2016}}
```

### Contributions

Thanks to [@mariosasko](https://github.com/mariosasko) for adding this dataset.

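Provider:

CUHK-CSE

Summary of original information

Dataset overview

Basic dataset information

Name: WIDER FACE
Language: English
License: CC BY-NC-ND 4.0
Multilinguality: monolingual
Size category: 10K<n<100K
Task category: object detection
Task ID: face detection
PapersWithCode ID: wider-face-1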

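Dataset structure

Features

image: image data, of type PIL.Image.Image.
faces: a dictionary of face attributes
- bbox: face bounding box, in COCO format ([x, y, width, height]).
- blur: blur level; possible values are clear (0), normal (1), heavy (2).
- expression: facial expression; possible values are typical (0), exaggerate (1).
- illumination: lighting condition; possible values are normal (0), exaggerate (1).
- occlusion: occlusion level; possible values are no (0), partial (1), heavy (2).
- pose: face pose; possible values are typical (0), atypical (1).
- invalid: whether each face annotation is flagged as invalid; boolean.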
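
As a rough illustration of how these fields fit together, the sketch below regroups the column-oriented `faces` dictionary into one record per face, converts each COCO box `[x, y, width, height]` into corner coordinates, and maps integer labels back to their names. The helper names `coco_to_corners` and `decode_faces` are hypothetical and not part of the dataset or any library; the label lists are copied from the feature listing above, and the sample values are the first two faces of the card's example instance.

```python
from typing import Dict, List, Tuple

# Label names copied from the feature listing above. The card's YAML stores the
# second illumination label with a trailing space; it is dropped here.
LABEL_NAMES: Dict[str, List[str]] = {
    "blur": ["clear", "normal", "heavy"],
    "expression": ["typical", "exaggerate"],
    "illumination": ["normal", "exaggerate"],
    "occlusion": ["no", "partial", "heavy"],
    "pose": ["typical", "atypical"],
}


def coco_to_corners(bbox: List[float]) -> Tuple[float, float, float, float]:
    """Convert a COCO box [x, y, width, height] to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def decode_faces(faces: Dict[str, list]) -> List[dict]:
    """Turn the column-oriented `faces` dict into one readable record per face."""
    records = []
    for i, bbox in enumerate(faces["bbox"]):
        record = {"box": coco_to_corners(bbox), "invalid": faces["invalid"][i]}
        for attr, names in LABEL_NAMES.items():
            record[attr] = names[faces[attr][i]]
        records.append(record)
    return records


# First two faces of the example instance shown in the dataset card.
sample_faces = {
    "bbox": [[178.0, 238.0, 55.0, 73.0], [248.0, 235.0, 59.0, 73.0]],
    "blur": [2, 2],
    "expression": [0, 0],
    "illumination": [0, 0],
    "occlusion": [1, 2],
    "pose": [0, 0],
    "invalid": [False, False],
}

for face in decode_faces(sample_faces):
    print(face)
```

Keeping the conversion in a small pure function makes it easy to swap in whichever box convention a downstream detector expects.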

Data splits

Training set: 12,880 images, 12,049,881 bytes.
Validation set: 3,226 images, 2,998,735 bytes.
Test set: 16,097 images, 3,761,103 bytes.
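
The split sizes can be checked against the stated 40%/10%/50% allocation and the 32,203-image total with a few lines of arithmetic. This is only a sanity check on the published counts, not a reproduction of the per-event-class sampling; the variable names are illustrative.

```python
# Image counts per split, as listed above.
SPLIT_SIZES = {"train": 12880, "validation": 3226, "test": 16097}

total = sum(SPLIT_SIZES.values())
fractions = {name: count / total for name, count in SPLIT_SIZES.items()}

print(f"total images: {total}")
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The fractions are close to, but not exactly, 40%/10%/50%, which is expected when the split is drawn separately inside each of the 61 event classes.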

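Dataset creation

Data collection and annotation

Data source: the WIDER FACE dataset is a subset selected from the publicly available WIDER dataset.
Annotation process: annotators label bounding boxes for all recognizable faces and further annotate face attributes such as pose and occlusion level. Each annotation is produced by one annotator and cross-checked by two other people.
Annotators: Shuo Yang, Ping Luo, Chen Change Loy, Xiaoou Tang

License information

The dataset is released under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.

Citation information

@inproceedings{yang2016wider,
  Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou},
  Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  Title = {WIDER FACE: A Face Detection Benchmark},
  Year = {2016}}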

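Collected summary

Dataset introduction

Construction

In computer vision, face detection datasets have often limited algorithm evaluation because of their small scale and lack of diversity. The WIDER FACE dataset selects 32,203 images from the public WIDER dataset, covering 61 event classes, and uses expert annotation to label 393,703 faces. Annotation follows strict quality control: each annotation is produced by one annotator and then cross-checked by two other people, so that bounding boxes tightly enclose the forehead, chin, and cheeks, and attributes such as occlusion, blur, and pose are recorded systematically. The data is split at random within each event class, with the training, validation, and test sets taking 40%, 10%, and 50% respectively. This construction strategy is intended to make the dataset more representative and the evaluation more reliable.

Characteristics

The dataset is known for its high diversity and rich annotation dimensions. Faces vary markedly in scale, pose, occlusion, and lighting, reflecting the complexity of real-world scenes. Each image comes not only with face bounding box coordinates but also with attribute labels for blur level, expression, illumination, occlusion level, pose, and validity, and these fine-grained annotations give models additional supervision during training. With tens of thousands of images and hundreds of thousands of face annotations across many event scenes, it is an important benchmark for evaluating the robustness of face detection algorithms under challenging conditions.

Usage

Researchers can use the dataset to train and evaluate face detection models by loading the images together with their multi-attribute face annotations and building an end-to-end detection pipeline. The data is already divided into training, validation, and test sets: fit model parameters on the training set, tune hyperparameters on the validation set, and submit test-set predictions to the official evaluation platform to obtain performance metrics. Use must comply with the CC BY-NC-ND 4.0 license, which restricts the dataset to non-commercial purposes. Ground-truth annotations for the test set are not released, so official evaluation is required, which keeps the benchmark fair.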
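
A minimal sketch of the loading step might look like the following, assuming the Hugging Face `datasets` library is installed and the repository id `CUHK-CSE/wider_face` from the download link resolves. Depending on the installed `datasets` version, the call may need additional arguments that are not covered here. The helper names `keep_usable_faces` and `load_wider_face` and the `min_side` threshold are hypothetical choices for illustration and are not prescribed by the dataset.

```python
from typing import Dict, List


def keep_usable_faces(faces: Dict[str, list], min_side: float = 10.0) -> List[List[float]]:
    """Return COCO boxes that are not flagged invalid and are larger than `min_side` pixels.

    `min_side` is an illustrative threshold chosen for this sketch.
    """
    kept = []
    for bbox, invalid in zip(faces["bbox"], faces["invalid"]):
        _, _, width, height = bbox
        if not invalid and width > min_side and height > min_side:
            kept.append(bbox)
    return kept


def load_wider_face(split: str = "train"):
    """Load one split through the `datasets` library (requires network access)."""
    # Imported lazily so that the filtering helper above has no extra dependency.
    from datasets import load_dataset

    return load_dataset("CUHK-CSE/wider_face", split=split)


# Example usage (commented out because it downloads several gigabytes):
# ds = load_wider_face("validation")
# sample = ds[0]  # index the row first so only one image is decoded
# boxes = keep_usable_faces(sample["faces"])
```

Indexing the row before the column follows the dataset card's advice to prefer `dataset[0]["image"]` over `dataset["image"][0]`, which avoids decoding every image in the split.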

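Background and challenges

Background overview

Face detection is a foundational task in computer vision, and its performance depends heavily on the scale and diversity of the training data. The WIDER FACE dataset was created in 2016 by Shuo Yang, Ping Luo, Chen Change Loy, and Xiaoou Tang of the Multimedia Laboratory at The Chinese University of Hong Kong to address detection difficulties caused by variation in scale, pose, occlusion, and lighting in real-world scenes. The dataset selects 32,203 images from the public WIDER image collection, labels 393,703 faces, and is organized by 61 event classes, which substantially expands the training resources available for face detection. By reflecting complex real-world conditions, it has driven progress in the robustness and generalization of face detection algorithms and has become a widely recognized benchmark in the field.

Current challenges

The WIDER FACE dataset targets face detection in real-world scenes, where the core challenge is handling highly variable face scale, extreme pose, severe occlusion, and cluttered backgrounds. These factors make it hard for traditional detectors to reach the accuracy and robustness that practical applications require. During construction, data collection faced the difficulty of selecting representative event classes from a large pool of web images while keeping the images diverse and free of redundancy. Annotation required precise handling of edge cases such as blurred and occluded faces, along with cross-checking by multiple people to ensure quality. These steps added to the complexity and labor cost of building the dataset.

Common scenarios

Classic use cases

Evaluating face detection depends on datasets with rich diversity. With 61 event classes and more than 390,000 labeled faces, WIDER FACE has become a classic benchmark in the field. It is commonly used to train and evaluate face detection models, especially to test robustness to scale variation, diverse poses, and complex occlusion in real-world scenes. By validating algorithms on this dataset, researchers can systematically measure how well a model generalizes to challenging natural environments.

Practical applications

Beyond academic research, the face detection technology for complex scenes that WIDER FACE represents is used in many practical applications. In intelligent security and surveillance systems, it supports algorithms that locate faces accurately in crowded scenes with changing light. Automatic photo tagging and content management on social media platforms, real-time beautification and focus tracking on smartphones, and crowd-flow analysis and identity verification in public places all benefit from robust detectors trained on large, difficult data of this kind.

Derived and related work

The release of WIDER FACE led to a series of research efforts on robust face detection. Many leading detection frameworks, such as improved models based on Faster R-CNN, SSD, and RetinaNet, use it as a core evaluation benchmark. The dataset also motivated research on specific sub-problems, such as detectors specialized for tiny faces, heavy occlusion, or extreme pose. These derived works continue to raise results on the dataset's official leaderboard, and the network architectures, training strategies, and loss functions they introduced have also influenced general object detection.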

The content above was collected and summarized by 遇见数据集.
