KoTox
Source: arXiv | Updated: 2023-11-30 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2311.18215v1
Resource description:
KoTox is an automatically generated dataset created by Seoul National University. It contains 39,200 unethical instruction-output pairs covering three domains: political bias, crime, and hate. An automated sentence generation system builds the instructions by combining offensive vocabulary, biased expressions, and diverse predicates, which gives the dataset broad coverage within each domain. KoTox is intended for instruction tuning of large language models (LLMs), with the goal of improving their ethical awareness and their ability to respond appropriately to a range of toxic inputs, and thereby supporting safer and more responsible interactions in natural language processing applications.
Provided by:
Seoul National University
Created:
2023-11-30



