miguelribeirokk/crime_tweets_in_portuguese
Source: Hugging Face. Updated: 2024-12-06. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/miguelribeirokk/crime_tweets_in_portuguese
Description:
The CrimeTrack dataset contains 61,715 crime-related tweets in Portuguese, annotated with sentiment, toxicity, and location information. Each tweet is labeled with a crime category, such as "Assalto" (mugging or hold-up) or "Roubo" (robbery), and carries sentiment scores (Positive, Neutral, Negative) and toxicity scores (e.g., Insult, Threat). A keyword column indicates whether the tweet contains crime-related keywords, and a location column indicates whether it mentions a specific place. The tweets have been preprocessed: special characters were removed, text was converted to lowercase, and links were deleted. The data was collected from NTSScraper, the Twitter API, and a public Kaggle dataset.
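The preprocessing described above (removing links and special characters, lowercasing) can be sketched roughly as follows. This is a minimal illustration, not the dataset authors' actual pipeline: the function name `preprocess_tweet`, the regular expressions, and the order of the steps are all assumptions.

```python
import re

# Hypothetical patterns; the dataset description does not specify the exact rules.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
SPECIAL_RE = re.compile(r"[^\w\s]")  # anything that is not a letter, digit, underscore, or whitespace


def preprocess_tweet(text: str) -> str:
    """Remove links, lowercase, and strip special characters (assumed order)."""
    text = URL_RE.sub(" ", text)      # drop links first so their pieces do not survive
    text = text.lower()               # convert to lowercase
    text = SPECIAL_RE.sub(" ", text)  # remove punctuation and symbols, keep accented letters
    return " ".join(text.split())     # collapse repeated whitespace


if __name__ == "__main__":
    print(preprocess_tweet("Assalto na Av. Paulista!!! Veja: https://t.co/abc123 #SP"))
```

Because `\w` matches Unicode letters for Python `str` patterns, Portuguese accented characters such as "ã" and "ç" are kept rather than stripped.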
Provider:
miguelribeirokk



