A standardized personality lexicon for enhancing natural language processing and personalized human-machine interaction
Source: DataCite Commons. Updated: 2026-01-08. Indexed: 2025-09-08.
Download link:
https://figshare.com/articles/dataset/A_standardized_personality_lexicon_for_enhancing_natural_language_processing_and_personalized_human-machine_interaction/29596547/1
Description:
We developed a large, reliable, high-quality personality lexicon that covers both coarse- and fine-grained personality categories and can be used for comprehensive personality perception tasks. We also detail a systematic construction and validation process, demonstrating the lexicon's robustness across the five dimensions and 30 facets using real participant data.
The main dataset is named personality_lexicon.xlsx. This file contains all annotated personality adjectives, including the source, dimension, facet, and emotional polarity of each adjective. The column “id” contains a consecutive number identifying each lexical word. The dimension and facet fields indicate which personality dimension and facet each adjective belongs to. The source and emotional valence (positive, negative, or neutral) of each adjective are also provided.
Other raw material includes the data collected in the pilot study and the formal study. The pilot data are stored in pilot_study.xlsx, and the formal data are stored in formal_data.xlsx. Notably, the file formal_data.xlsx contains two sheets: the first sheet includes the 241 personality adjectives used in the pilot study, and the second sheet includes the 3600 personality adjectives used in the formal study.
Both files contain demographic information, Big Five inventory (IPIP-NEO-120) data, and rating scores for each study participant. Within the demographic information, the column “id” contains a consecutive number identifying each study participant; other details are described in the Methods. The Big Five inventory data include responses to the 120 items as well as scores on the five personality dimensions. All remaining columns hold the rating scores that participants assigned to each lexical word. To help participants understand each adjective, each lexical word was embedded in a complete sentence during the rating process.
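The lexicon layout described above (one row per adjective, with an id, source, dimension, facet, and valence) can be explored with a few lines of Python. The following is a minimal sketch, not the authors' tooling: only the column name “id” is quoted in the description, so the keys "word", "source", "dimension", "facet", and "valence" are assumed names, and the three rows are invented placeholders rather than entries taken from personality_lexicon.xlsx.

```python
from collections import defaultdict

# Toy rows invented for illustration; they are NOT taken from the dataset.
# Only "id" is a column name quoted in the description; all other keys are assumed.
ROWS = [
    {"id": 1, "word": "talkative", "source": "example",
     "dimension": "Extraversion", "facet": "Gregariousness", "valence": "positive"},
    {"id": 2, "word": "anxious", "source": "example",
     "dimension": "Neuroticism", "facet": "Anxiety", "valence": "negative"},
    {"id": 3, "word": "reserved", "source": "example",
     "dimension": "Extraversion", "facet": "Gregariousness", "valence": "neutral"},
]


def group_by_facet(rows):
    """Map each (dimension, facet) pair to the list of words annotated with it."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["dimension"], row["facet"])].append(row["word"])
    return dict(groups)


def filter_by_valence(rows, valence):
    """Return the words whose valence label equals the requested one."""
    return [row["word"] for row in rows if row["valence"] == valence]


groups = group_by_facet(ROWS)
print(groups[("Extraversion", "Gregariousness")])
print(filter_by_valence(ROWS, "negative"))
```

With the real file, the same functions could be applied to rows loaded with, for example, pandas.read_excel("personality_lexicon.xlsx").to_dict("records"), after adjusting the assumed keys to the actual column headers.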
Provider:
figshare
Created:
2025-07-18



