Data for Prediction of Apparent Personality Traits from Selfies using the Five-Factor Model

Name: Data for Prediction of Apparent Personality Traits from Selfies using the Five-Factor Model
Creator: IEEE Dataport
License: No description available

Source: ieee-dataport.org, indexed 2025-01-21

Download link:

https://ieee-dataport.org/open-access/data-prediction-apparent-personality-traits-selfies-using-five-factor-model

Description:

Since no image-based personality dataset was available, we used the ChaLearn dataset to build a new one with the characteristics this work required: selfie-style images in which only one person appears with their face visible, each labeled with the person's apparent personality in the photo. The ChaLearn dataset is split into 6,000 videos for training, 2,000 for validation, and 2,000 for the test phase (the final evaluation of the contest). The personality labels of the test set were not published, which left us with 8,000 labeled videos. From each of them we took 3 or 4 frames, for a total of 30,935 images. These images constitute the first version of the portrait personality dataset (PortraitPersonality v1).

For the purpose of this work, we cropped each image extracted from the videos so that it looked like a selfie. Using OpenCV in Python, we ran face detection on each image and then cropped it, making sure the crop contained the full face. The result is the second version of the dataset (PortraitPersonality v2).

Each image is matched with its corresponding values (between 0 and 1) for each personality factor: Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness to Experience. This page provides the CSV file with these correspondences and a GitHub link to the captured frames (see the link at the top).
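The cropping and labeling steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the description does not say which face detector or crop margin was used, so the `margin` value, the function names, the CSV column names, and the sample row are all hypothetical. The sketch uses only the Python standard library and assumes a face box in the `(x, y, w, h)` form that OpenCV detectors such as `cv2.CascadeClassifier.detectMultiScale` return.

```python
import csv
import io

# Hypothetical column names; check the header of the real CSV file.
TRAITS = ("extraversion", "agreeableness", "conscientiousness",
          "neuroticism", "openness")


def selfie_crop_box(face, image_size, margin=0.5):
    """Expand a face box (x, y, w, h) by `margin` of its size on every side.

    The result is clamped to the image and returned as
    (left, top, right, bottom). The 0.5 default is an assumption.
    """
    x, y, w, h = face
    img_w, img_h = image_size
    pad_x = int(w * margin)
    pad_y = int(h * margin)
    left = max(0, x - pad_x)
    top = max(0, y - pad_y)
    right = min(img_w, x + w + pad_x)
    bottom = min(img_h, y + h + pad_y)
    return left, top, right, bottom


def load_trait_labels(csv_file):
    """Map each image name to its five trait scores.

    Raises ValueError if a score falls outside the [0, 1] range
    stated in the dataset description.
    """
    labels = {}
    for row in csv.DictReader(csv_file):
        scores = {trait: float(row[trait]) for trait in TRAITS}
        for trait, value in scores.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(
                    f"{row['image']}: {trait}={value} is outside [0, 1]")
        labels[row["image"]] = scores
    return labels


# Made-up sample row, for illustration only (not taken from the dataset).
SAMPLE_CSV = (
    "image,extraversion,agreeableness,conscientiousness,neuroticism,openness\n"
    "frame_0001.jpg,0.52,0.61,0.48,0.55,0.63\n"
)

labels = load_trait_labels(io.StringIO(SAMPLE_CSV))
box = selfie_crop_box(face=(100, 80, 60, 60), image_size=(320, 240))
print(labels["frame_0001.jpg"]["extraversion"], box)
```

With an image loaded through `cv2.imread`, the returned box could be applied as `image[top:bottom, left:right]`, since OpenCV images are NumPy arrays indexed by row and then column.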

Provider:

IEEE Dataport
