LIWC Dataset of Known-Victim Rape Suspect Statements (Chile, 2017–2021)
Source: DataCite Commons. Updated: 2026-04-13. Indexed: 2026-05-04.
Download link:
https://data.mendeley.com/datasets/8d5z5wp3j6/3
Description:
This repository contains a fully de-identified public-use analytic dataset derived from a cross-sectional secondary analysis of routinely collected police records from Chile. The public dataset does not include raw police statements, transcripts, narrative text, audio, direct identifiers, or any other source material from individual case files. Instead, it contains only numeric variables derived from prior text processing with LIWC-22 (v1.3.0), including 51 psycholinguistic indicators across seven dimensions and total word count, together with non-identifying analytic variables required for reproducibility. The underlying study examined 250 cases from 2017–2021, grouped by victim age category (<14 years vs ≥14 years), and evaluated between-group differences in LIWC-derived linguistic markers. LIWC-22 was applied to the original statements before public release, and the repository therefore provides only derived quantitative metrics, not language samples or recoverable verbal content. The shared files are intended exclusively to support transparency, reproducibility, and secondary methodological work in forensic psycholinguistics. Because the repository contains only de-identified, derived numeric outputs, it does not permit reconstruction of the original statements or re-identification of individual participants. These materials should not be interpreted as evidence of guilt, credibility, or investigative priority, but as reproducible research data documenting group-level linguistic associations within a police-investigation sample.
Provider:
Mendeley Data
Created:
2026-04-13



