Personally_Speaking_Self_Disclosure_Data

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://doi.org/10.7910/DVN/9KV9TF

Description:

PHASE 1 DESCRIPTION: Provides counts of the total number of candidate instances of self-disclosure by the Phase 1 participants (N = 33) when they interacted with the CA. For each participant, Linguistic Inquiry and Word Count (LIWC) measures are provided for the full context of the participant's conversations, that is, their full dialogue with the CA across the sessions. In addition to the count of candidate instances of self-disclosure for each participant, the data file includes (for the participant's full conversation): LIWC Word count; the four LIWC summary variables of Analytic, Clout, Authenticity, and (positive) Tone; and frequency counts for six further LIWC dialogue characterization measures, including use of the personal pronoun "I," references to Positive and Negative Emotions, and references to Social, Cognitive, and Motivational Processes. Note: To keep these data de-identified, individual participants are indicated by an Arbitrary ID Number, which has also been used to sort the participants as listed in the data file.

PHASE 2 DESCRIPTION: Provides the ontology ratings of the Phase 2 participants (N = 37) for the 32 dialogue-turns (in the H set) that were identified as candidate instances of self-disclosure from the Phase 1 CA-participant interactions. Each of the 32 dialogue-turns has a unique identifier (e.g., Stim_1117) and is presented on a separate Excel sheet. The sheets are divided across four files, with 8 stimuli (on 8 separate sheets) in each file. The first column of each sheet provides the actual text of the dialogue to be rated, as it was presented to the Phase 2 participants, with the speakers designated simply as "First Speaker" and "Second Speaker." [Note that when completing the ratings, Phase 2 participants were not informed that the first speaker was a CA and the second speaker was a human responding to the CA.]

Each Phase 2 participant is indicated by a unique identifier that also indicates which of the four subsets of stimuli (A to D) they rated (e.g., 101_A, 110_B). In addition to the text of the dialogue-turn to be rated (provided in the first column), each file gives, for the 37 Phase 2 participants, their judgments on six rating-prompts: (1) whether the instance involved self-disclosure (yes, possibly, no); (2) the topic being discussed in the self-disclosure (from a multi-option drop-down list, given below); (3) the form or mental verb of the self-disclosure (from a multi-option drop-down list, given below); (4) the intimacy level of the self-disclosure (1=low intimacy, 5=high intimacy); (5) the emotional valence of the self-disclosure (1=strongly negative, 5=strongly positive); and (6) the relatedness of the disclosure to the previous conversational context (1=not related, 5=highly related).

Response options to the topic prompt were: work/study, leisure, health, eating/sleeping, body/physical appearance, finances, family/home, romantic relationship, friendship, professional relationship, current events, personal/biographical, and other. Response options to the form or mental verb prompt were: personal preference, feeling/emotional state, thought/belief or judgment, habit, planning, memory, factual, and other.

Note: For each rated instance, Phase 2 participants were also asked to generate (in three free-response text boxes) three inferences about the second speaker that could be drawn from what was disclosed. As noted in the manuscript (in the section, Characterization of inferences), a full analysis of the naive raters' inferences drawn from self-disclosure is reserved for future work. Because the participants' inferences are part of our ongoing research, and themselves form a large textual database supplemented by additional "Phase 3" ratings, the text of the inferences generated by Phase 2 participants will be made publicly available when the follow-up Inferences-focused manuscript is published. We will cross-reference the current data set then.
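For readers who plan to work with the Phase 2 files programmatically, the following is a minimal sketch of how the rater identifiers and rating options described above might be checked. It is not part of the dataset or the authors' tooling. The identifier pattern is an assumption inferred only from the two examples given (101_A, 110_B), and the names `RaterId`, `parse_rater_id`, and `is_valid_scale` are hypothetical.

```python
import re
from typing import NamedTuple

# Option lists copied from the dataset description.
DISCLOSURE_OPTIONS = {"yes", "possibly", "no"}
TOPIC_OPTIONS = {
    "work/study", "leisure", "health", "eating/sleeping",
    "body/physical appearance", "finances", "family/home",
    "romantic relationship", "friendship", "professional relationship",
    "current events", "personal/biographical", "other",
}
FORM_OPTIONS = {
    "personal preference", "feeling/emotional state",
    "thought/belief or judgment", "habit", "planning",
    "memory", "factual", "other",
}


class RaterId(NamedTuple):
    """A Phase 2 rater identifier split into its two parts (hypothetical type)."""
    number: int
    subset: str  # one of "A".."D", the stimulus subset the rater saw


# Assumed format: digits, underscore, subset letter A-D.
# Inferred only from the examples "101_A" and "110_B".
_RATER_ID = re.compile(r"^(\d+)_([A-D])$")


def parse_rater_id(raw: str) -> RaterId:
    """Split an identifier such as '101_A' into (101, 'A')."""
    match = _RATER_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"unexpected rater identifier: {raw!r}")
    return RaterId(int(match.group(1)), match.group(2))


def is_valid_scale(value: object) -> bool:
    """True for an integer rating on the 1-5 scales (intimacy, valence, relatedness)."""
    return isinstance(value, int) and not isinstance(value, bool) and 1 <= value <= 5
```

For example, `parse_rater_id("110_B").subset` gives the subset letter for grouping raters. The actual column headers and cell encodings in the Excel sheets are not specified in this description, so any loader built on this sketch should be checked against the files themselves.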

Created:

2025-08-21
