Using Sentiment Analysis and Topic Modeling in Assessing the Impact of Police "Signaling" on Investigative and Prosecutorial Outcomes in Sexual Assault Reports, Cleveland, Ohio, 1993-2011
Source: DataCite Commons. Updated: 2026-03-11. Indexed: 2026-05-03.
Download link:
https://www.icpsr.umich.edu/web/NACJD/studies/38644
Description:
The Cuyahoga County (Ohio, USA) Sexual Assault Kit (SAK) Initiative, led by the Cuyahoga County Prosecutor's Office, was launched in 2013 to test and follow up on previously untested sexual assault/rape kits that were collected as evidence from sexual assault victims. Rape reports typically include an incident report taken by the responding officer(s), who are tasked with gathering pertinent facts and evidence and then forwarding the report to an investigator (detective) for follow-up. They also include a summary of the investigative activity on the case as noted by the investigator, which can include the assigned prosecutor's decision to file or not file charges.
Signaling is defined as information conveyed by responding officers in the narratives of police reports regarding a victim's credibility and rape-myth adherence. The goal was to better understand whether and how responding officers' written reports in sexual assault cases affect investigating officers' decision-making and how cases proceed (or fail to proceed) in the criminal justice process. The objective of the study was to explore the first step in the investigative process to elucidate facilitators and barriers to sexual assault cases reaching a successful disposition.
The research team employed text mining and machine learning methods using natural language processing and advanced statistical analyses to evaluate the narratives of 5,638 police reports of sexual assaults where victims had sexual assault kits collected in Cuyahoga County over the span of nearly two decades (primarily 1993 through 2011). These reports were analyzed using topic modeling and sentiment analysis. The team addressed three research questions:
To what extent did "sentiments" in the responding officers' narratives reveal positive or negative signaling of victims' credibility?
To what extent were the "topics" and sentiments in the responding officers' narratives different in cases with increased investigative activity compared to those with less?
To what extent were both the topics and sentiments in the responding officers' narratives different for cases that were successfully investigated and prosecuted compared to those that were not?
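As a rough illustration of what a report-level sentiment score can look like, the sketch below averages word weights from a small lexicon over the tokens of a narrative. The lexicon, its weights, and the function name are hypothetical and chosen only for illustration; the description above does not specify which sentiment method or lexicon the research team used.

```python
import re

# Hypothetical toy lexicon (assumption): NOT the lexicon or model used in the study.
TOY_LEXICON = {
    "cooperative": 1.0,
    "consistent": 1.0,
    "calm": 0.5,
    "uncooperative": -1.0,
    "inconsistent": -1.0,
    "refused": -0.5,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Return the mean lexicon weight per token; words not in the lexicon count as 0."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    total = sum(lexicon.get(token, 0.0) for token in tokens)
    return total / len(tokens)


if __name__ == "__main__":
    print(sentiment_score("calm and consistent"))
```

Dividing by the token count keeps long and short narratives on the same scale, which is one common design choice among several; other normalizations are equally plausible.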
This collection includes a quantitative dataset (DS1) and a qualitative dataset (DS2). The report-level quantitative dataset contains calculated sentiment scores, categorical variables describing the incident and outcome, and demographic variables of the victim(s) and suspect(s) for all reports analyzed (n=5,639). The full text for all reports is available in a CSV file that can be merged with the main data file. The qualitative data is a subset of reports from the main dataset (n=18) with high, medium, and low sentiment scores that were manually coded by the research team.
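The merge of the full-text CSV with the main data file could be sketched as follows, using only the Python standard library. The column names (`report_id`, `sentiment_score`, `narrative`) and the inline sample rows are assumptions made for illustration; the actual key variable and file layout should be taken from the ICPSR codebook.

```python
import csv
import io

# Hypothetical stand-ins for the two files (assumption): real names and columns may differ.
MAIN_CSV = """report_id,sentiment_score
1001,0.25
1002,-0.40
1003,0.00
"""

TEXT_CSV = """report_id,narrative
1001,First placeholder narrative.
1002,Second placeholder narrative.
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def left_merge(main_rows, text_rows, key="report_id"):
    """Attach the narrative to each main-file row; rows without a match get None."""
    text_by_id = {row[key]: row for row in text_rows}
    merged = []
    for row in main_rows:
        combined = dict(row)
        combined["narrative"] = text_by_id.get(row[key], {}).get("narrative")
        merged.append(combined)
    return merged


merged = left_merge(read_rows(MAIN_CSV), read_rows(TEXT_CSV))
# Report IDs that have no matching narrative, worth checking before any analysis.
unmatched = [row["report_id"] for row in merged if row["narrative"] is None]

if __name__ == "__main__":
    print(len(merged), unmatched)
```

A left merge keeps every row of the main file, so listing the unmatched IDs afterwards gives a quick check that the two files cover the same reports.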
Provider:
ICPSR - Interuniversity Consortium for Political and Social Research
Created:
2025-12-16



