FRENK Dataset
Source: arXiv. Updated: 2019-06-13. Indexed: 2024-08-06.
Download link:
http://arxiv.org/abs/1906.02045v2
Description:
The FRENK dataset was created by the Department of Knowledge Technologies at the Jožef Stefan Institute in Slovenia. It covers social media comments on migrants and LGBT topics in Slovenian and English. The dataset contains approximately 22,927 comments, manually annotated for the type of socially unacceptable discourse (SUD) they contain and for the target of that discourse. To build it, comments were collected from the Facebook pages of mainstream media outlets, an SVM classifier was used to identify posts on the two topics, and the comments were then annotated manually on the PyBossa platform. The dataset is intended to help understand and counter SUD on social media, and it supports both statistical analysis and the development of machine learning models.
Provided by:
Department of Knowledge Technologies, Jožef Stefan Institute, Slovenia
Created:
2019-06-05



