
AiActivity/ToxicDataset

Hugging Face | Updated 2025-12-23 | Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/AiActivity/ToxicDataset
Resource description:
---
license: mit
task_categories:
- text-classification
- text-generation
- other
language:
- en
tags:
- toxic-content
- hate-speech
- content-moderation
- abuse-detection
- nlp
- safety
- moderation
- offensive-language
pretty_name: Comprehensive Toxic Content Dataset
size_categories: 1M<n<10M
---

# Comprehensive Toxic Content Dataset

## Dataset Description

This dataset contains **1,000,000 synthetically generated records** of toxic, abusive, harmful, and offensive content designed for training content moderation systems and hate speech detection models.

### Dataset Summary

This comprehensive dataset includes multiple categories of toxic content:

- **Toxic** content (insults, derogatory terms)
- **Abusive** language patterns
- **Gender bias** statements
- **Dangerous/threatening** content
- **Harmful slang** and abbreviations
- **Racist** content patterns
- **Homophobic** content
- **Religious bias** statements
- **Disability bias** content
- **Mixed** category combinations

### Supported Tasks

- **Text Classification**: Multi-class classification of toxic content types
- **Severity Detection**: Classification of content severity (low, medium, high, extreme)
- **Content Moderation**: Training moderation filters and safety systems
- **Hate Speech Detection**: Identifying hate speech patterns
- **Abuse Detection**: Detecting abusive language online

### Languages

The dataset is primarily in **English (en)**, with patterns based on real-world English-language toxic content from social media platforms.

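The four severity levels named under Supported Tasks are naturally ordered, so a severity model may benefit from ordinal integer labels rather than unordered class names. The following is a minimal sketch of such an encoding; the helper names `encode_severity` and `decode_severity` are hypothetical and do not ship with the dataset.

```python
# Severity levels as listed in the dataset card, ordered from least to most severe.
SEVERITY_LEVELS = ["low", "medium", "high", "extreme"]

# Lookup table from label to ordinal id (low=0, medium=1, high=2, extreme=3).
SEVERITY_TO_ID = {name: index for index, name in enumerate(SEVERITY_LEVELS)}


def encode_severity(label: str) -> int:
    """Map a severity string to its ordinal id, ignoring case and outer whitespace."""
    key = label.strip().lower()
    if key not in SEVERITY_TO_ID:
        raise ValueError(f"unknown severity: {label!r}")
    return SEVERITY_TO_ID[key]


def decode_severity(index: int) -> str:
    """Map an ordinal id back to its severity string."""
    if not 0 <= index < len(SEVERITY_LEVELS):
        raise ValueError(f"severity id out of range: {index}")
    return SEVERITY_LEVELS[index]
```

Keeping the order explicit allows comparisons such as "at least high" to be written as `encode_severity(label) >= encode_severity("high")`.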
## Dataset Structure

### Data Fields

Each record contains the following fields:

- **id** (`int`): Unique identifier for the record
- **content** (`string`): The toxic content text
- **category** (`string`): Category of toxic content (toxic, abusive, gender_bias, dangerous, harmful_slang, racist, homophobic, religious_bias, disability_bias, mixed)
- **severity** (`string`): Severity level (low, medium, high, extreme)
- **timestamp** (`string`): ISO format timestamp
- **metadata** (`dict`): Additional metadata including:
  - `language`: Language code (en)
  - `type`: Content type (text)
  - `source`: Source identifier (generated)
  - `flagged`: Boolean flag indicating toxic content

### Data Splits

The dataset can be split into train/validation/test sets. Recommended splits:

- **Train**: 80% (800,000 records)
- **Validation**: 10% (100,000 records)
- **Test**: 10% (100,000 records)

## Dataset Statistics

### Category Distribution

- Toxic: ~10%
- Abusive: ~10%
- Gender Bias: ~10%
- Dangerous: ~10%
- Harmful Slang: ~10%
- Racist: ~10%
- Homophobic: ~10%
- Religious Bias: ~10%
- Disability Bias: ~10%
- Mixed: ~10%

### Severity Distribution

- Low: ~25%
- Medium: ~25%
- High: ~25%
- Extreme: ~25%

## Dataset Creation

### Source Data

This dataset is synthetically generated based on patterns and vocabulary from:

1. **Academic Research**:
   - Davidson et al. (2017): Hate Speech Detection on Twitter
   - Waseem & Hovy (2016): Twitter hate speech patterns
   - Founta et al. (2018): Large-scale abusive behavior
   - Zampieri et al. (2019): Offensive language identification
2. **Public Datasets**:
   - Jigsaw Unintended Bias in Toxicity Classification (2M+ comments)
   - Hate Speech and Offensive Language Dataset (25K tweets)
   - Toxic Comment Classification Challenge (160K+ comments)
   - HateXplain Dataset (20K+ posts)
   - OLID Dataset (14K tweets)
3. **Real-World Sources**:
   - Hatebase.org lexicon
   - Jigsaw Perspective API patterns
   - Documented patterns from social media platforms

### Annotation Process

- **Pattern-based generation**: Uses comprehensive word lists and pattern templates
- **Validation**: All records validated for required fields and content quality
- **Balanced distribution**: Ensures balanced representation across categories
- **Realistic variations**: Includes leetspeak, character repetition, punctuation variations

### Personal and Sensitive Information

This dataset contains **synthetic toxic content** generated for research purposes. It does not contain real personal information or actual harmful content from individuals. All content is algorithmically generated based on documented patterns.

## Considerations for Using the Data

### Ethical Considerations

⚠️ **WARNING**: This dataset contains toxic, abusive, harmful, and offensive content.

**Intended Use**:

- Training content moderation systems
- Building safety filters and detection models
- Academic research on online toxicity
- Developing hate speech detection algorithms
- Educational purposes for understanding toxic content patterns

**NOT Intended For**:

- Harassing individuals or groups
- Creating harmful content
- Targeting marginalized communities
- Any malicious purposes

### Limitations

1. **Synthetic Content**: All content is algorithmically generated, not real user-generated content
2. **English Only**: Primarily English language patterns
3. **Pattern-Based**: May not capture all nuances of real-world toxic content
4. **Bias**: Patterns based on documented research may reflect biases in source data

### Bias and Fairness

- The dataset is designed to be balanced across categories
- Patterns are based on documented research and public datasets
- Users should be aware of potential biases in source materials
- Regular evaluation and bias testing recommended for production models

## Citation

```bibtex
@dataset{toxic_content_dataset_2024,
  title={Comprehensive Toxic Content Dataset for Moderation Training},
  author={Dataset Generator},
  year={2024},
  url={https://huggingface.co/datasets/[USERNAME]/toxic-content-dataset},
  note={Generated for research and content moderation purposes only}
}
```

### Source Citations

```bibtex
@inproceedings{davidson2017automated,
  title={Automated Hate Speech Detection and the Problem of Offensive Language},
  author={Davidson, Thomas and Warmsley, Dana and Macy, Michael and Weber, Ingmar},
  booktitle={Proceedings of the 11th International AAAI Conference on Web and Social Media},
  year={2017},
  pages={512--515}
}

@inproceedings{waseem2016hateful,
  title={Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter},
  author={Waseem, Zeerak and Hovy, Dirk},
  booktitle={Proceedings of the NAACL Student Research Workshop},
  year={2016},
  pages={88--93}
}
```

## Dataset Card Contact

For questions or concerns about this dataset, please refer to the repository issues or contact the maintainers.

## License

This dataset is released under the **MIT License**. See LICENSE file for details.

## Acknowledgments

This dataset is based on patterns and vocabulary from:

- Academic research on hate speech and toxic content detection
- Public datasets from Jigsaw, Davidson et al., and other researchers
- Hatebase.org lexicon
- Documented patterns from social media platforms

We thank all researchers and organizations who have contributed to understanding and detecting toxic content online.
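
As a rough illustration of the record layout under Data Fields and the recommended 80/10/10 split under Data Splits, the sketch below checks one record against the documented fields and partitions record ids with a seeded shuffle. The function names `validate_record` and `split_ids`, the error messages, and the default seed are assumptions made for illustration; the card does not say how validation or splitting is actually performed.

```python
import random
from datetime import datetime
from typing import Dict, Iterable, List

# Allowed values copied from the Data Fields section of the card.
CATEGORIES = {
    "toxic", "abusive", "gender_bias", "dangerous", "harmful_slang",
    "racist", "homophobic", "religious_bias", "disability_bias", "mixed",
}
SEVERITIES = {"low", "medium", "high", "extreme"}
METADATA_KEYS = {"language", "type", "source", "flagged"}


def validate_record(record: dict) -> List[str]:
    """Return a list of problems; an empty list means the record fits the card's schema."""
    problems = []
    record_id = record.get("id")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(record_id, int) or isinstance(record_id, bool):
        problems.append("id must be an int")
    content = record.get("content")
    if not isinstance(content, str) or not content:
        problems.append("content must be a non-empty string")
    if record.get("category") not in CATEGORIES:
        problems.append("unknown category")
    if record.get("severity") not in SEVERITIES:
        problems.append("unknown severity")
    try:
        # Note: Python versions before 3.11 reject a trailing "Z" here.
        datetime.fromisoformat(record.get("timestamp"))
    except (TypeError, ValueError):
        problems.append("timestamp is not ISO format")
    metadata = record.get("metadata")
    if not isinstance(metadata, dict) or not METADATA_KEYS.issubset(metadata):
        problems.append("metadata is missing documented keys")
    return problems


def split_ids(ids: Iterable[int], seed: int = 0) -> Dict[str, List[int]]:
    """Shuffle ids reproducibly and cut them 80/10/10 into train/validation/test."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    total = len(shuffled)
    train_end = (total * 8) // 10
    validation_end = (total * 9) // 10
    return {
        "train": shuffled[:train_end],
        "validation": shuffled[train_end:validation_end],
        "test": shuffled[validation_end:],
    }
```

An index-based cut is used here so that, for 1,000,000 ids, the split sizes come out at exactly 800,000, 100,000, and 100,000. Because the card reports balanced categories, a split stratified by `category` would be a reasonable alternative.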
Provider:
AiActivity