[LS2N_IPI_DisFER] Comparing the Robustness of Humans and Deep Neural Networks on Facial Expression Recognition

Source: NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/records/10650096


Description:

DisFER

Distorted-FER (DisFER) is a new facial expression recognition (FER) dataset composed of a large number of distorted images of faces.

Materials and Methods

Dataset

The source images used in our experiment come from the Facial Expression Recognition 2013 (FER-2013) dataset [1]. This dataset was first introduced in 2013 at the International Conference on Machine Learning and has been used in a large number of research works since then, as it encompasses naturalistic conditions and challenges. It consists of 35,887 images of faces at 48 × 48 pixels, collected via Google search. Its authors estimated human accuracy on FER-2013 at around 65.5% [1].

To build the Distorted-FER (DisFER) dataset, we randomly selected, from FER-2013, twelve images per basic emotion, as defined by Ekman [2] (i.e., anger, disgust, fear, happiness, neutral, sadness, and surprise). This yields a total of 84 source images. Each original stimulus was then distorted using three types of distortion: Gaussian blur (GB), Gaussian noise (GN), and salt-and-pepper noise (SP). Each distortion was applied at three levels, corresponding to low, medium, and high distortion, respectively: standard deviations of 0.8, 1.1, and 1.4 for GB; standard deviations of 10, 20, and 30 for GN; and probabilities of 0.02, 0.04, and 0.06 for SP.

Crowdsourcing Experiment

To collect as many votes as possible on our dataset, and because rating 840 images is time-consuming and can be extremely tiring for a single participant, we set up a crowdsourcing experiment. Such experiments allow large-scale subjective tests to be conducted at reduced cost and effort. The DisFER dataset was therefore split into twenty-one playlists of forty images each, to keep the tests as short as possible, since crowdsourcing experiments should not last much more than ten minutes.
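The dataset construction and split described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' pipeline: the description does not state the blur kernel size, the intensity scale of the Gaussian noise, or how salt and pepper pixels are balanced, so the 3-sigma kernel truncation, the 0-255 scale, and the even salt/pepper split below are assumptions, as are all function names. The final check also assumes that the 840 rated images are the 84 originals plus their nine distorted versions, which is an inference from the stated totals.

```python
import numpy as np

# Distortion levels quoted in the description (low, medium, high).
GB_SIGMAS = (0.8, 1.1, 1.4)
GN_SIGMAS = (10, 20, 30)
SP_PROBS = (0.02, 0.04, 0.06)


def gaussian_blur(img, sigma):
    """Separable Gaussian blur; kernel truncated at 3*sigma (assumption)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img.astype(np.float64), radius, mode="edge")
    # Filter along rows, then columns; "valid" trims the padding back off.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    out = np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


def gaussian_noise(img, sigma, rng):
    """Additive zero-mean Gaussian noise, sigma on the 0-255 scale (assumption)."""
    noisy = img.astype(np.float64) + rng.normal(0.0, sigma, img.shape)
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)


def salt_and_pepper(img, prob, rng):
    """Corrupt roughly a fraction `prob` of pixels, half to 0 and half to 255 (assumption)."""
    out = img.copy()
    u = rng.random(img.shape)
    out[u < prob / 2] = 0
    out[(u >= prob / 2) & (u < prob)] = 255
    return out


def all_variants(img, rng):
    """Return the original image plus its nine distorted versions, keyed by label."""
    variants = {"original": img}
    for s in GB_SIGMAS:
        variants[f"GB_{s}"] = gaussian_blur(img, s)
    for s in GN_SIGMAS:
        variants[f"GN_{s}"] = gaussian_noise(img, s, rng)
    for p in SP_PROBS:
        variants[f"SP_{p}"] = salt_and_pepper(img, p, rng)
    return variants


rng = np.random.default_rng(0)
# Random 48 x 48 grayscale array standing in for a FER-2013 image.
face = rng.integers(0, 256, size=(48, 48), dtype=np.uint8)
variants = all_variants(face, rng)

n_sources = 7 * 12                      # seven emotions, twelve images each
n_images = n_sources * len(variants)    # originals plus nine distortions each
print(len(variants), n_images, n_images == 21 * 40)  # prints: 10 840 True
```

Under these assumptions the totals are consistent: 84 source images times ten versions gives 840 images, which matches twenty-one playlists of forty images.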
Playlists were carefully designed to contain the same number of images of each configuration (i.e., emotion, distortion type, and distortion level). Within each playlist, images were displayed to participants in random order. Each participant was asked to choose which emotion (i.e., anger, disgust, fear, happiness, neutral, sadness, or surprise) they recognized in the displayed image. No time constraint was imposed on participants to complete the task.

A total of 1051 participants (50% female) were recruited using the Prolific platform [3]. Prolific takes researchers’ needs into consideration by maintaining a subject recruitment process similar to that of a laboratory experiment: participants are fully informed that they are being recruited for a research study. Consequently, the platform helps researchers address ethical concerns and improves the reliability of the collected data. Participants were aged between 19 and 75 years (mean 30 ± 8.53; three participants chose not to report their age). Twenty of the twenty-one playlists were each viewed and rated in full by fifty distinct participants, whereas the remaining playlist was viewed and rated by fifty-one participants.

References

[1] Goodfellow, I.J.; Erhan, D.; Carrier, P.L.; Courville, A. Challenges in Representation Learning: A Report on Three Machine Learning Contests. In Proceedings of the Neural Information Processing, Daegu, South Korea, 3–7 November 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 117–124.

[2] Ekman, P. An argument for basic emotions. Cogn. Emot. 1992, 6, 169–200.

[3] https://www.prolific.com/

Created:

2024-02-12
