Examples from dataset# 13.
Source: Figshare. Updated: 2025-06-17. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Examples_from_dataset_13_/29344422
Description:
This paper presents a novel taxonomy designed to classify offensive language in Arabic, filling a notable void in existing literature primarily concentrated on Indo-European languages. Our taxonomy delineates offensive language into seven distinct levels, comprising six explicit levels and one implicit level. Drawing inspiration from the simplified offensive language (SOL) taxonomy outlined in prior work, we adapted it to accommodate the intricacies and linguistic richness of Arabic. In our study, we analyzed existing datasets containing offensive language in Arabic, examining the range of annotations employed within these datasets. This exploration allowed us to gain insights into the diversity of offensive language instances and the methodologies used for their annotation, thereby informing the development of our streamlined taxonomy for categorizing such expressions. Initial examination of datasets uncovers compelling trends and distributions, emphasizing the intricate and distinct nature of offensive expressions in Arabic. We have also analyzed the performance of pre-trained and fine-tuned Arabic transformer offensive language detection models on these datasets. Our results underscore the importance of acknowledging linguistic and cultural diversity in the study and mitigation of online abusive language. We posit that our refined taxonomy and accompanying dataset will be pivotal in advancing research across Semitic languages, including sociocultural studies, natural language processing, and linguistic analyses.
Created:
2025-06-17



