高质量大模型内容审查数据集
收藏北京市数据知识产权2024-08-09 更新2024-08-13 收录
下载链接:
https://webs.bjidex.com/sys-bsc-home/#/bscConsole/intellectualProperty/infoPublicity?action=1
下载链接
链接失效反馈官方服务:
资源简介:
高质量大模型内容审查数据集,涵盖了各种恶意攻击、人身侮辱、歧视性表达、负面,专门设计用于人工智能大模型在内容安全方面的调优与评测。1.作为调优集:能增强大模型识别及妥善应对各类潜在违规指令的能力,大模型通过深度学习这些负面案例,能够被调校得更为敏锐、安全和可靠,使大模型减少产生不良或危害性内容的风险。2. 做为评估集,用于大模型安全评估,通过模拟多样化的负面指令挑战,检测大模型是否能有效拒绝不当请求,在回复中坚守正面、符合社会公序良俗的价值导向。
This high-quality large model content moderation dataset covers various malicious attacks, personal insults, discriminatory expressions and harmful negative content, and is specifically designed for the tuning and evaluation of AI large language models in the field of content security. 1. As a tuning set: It can enhance the ability of large language models to identify and properly handle all kinds of potential violating instructions. By learning these negative cases through deep learning, large language models can be tuned to be more sensitive, secure and reliable, thereby reducing the risk of generating inappropriate or harmful content. 2. As an evaluation set: It is used for the safety evaluation of large language models. By simulating diverse negative instruction challenges, it detects whether large language models can effectively refuse improper requests and adhere to positive value orientations that conform to social public order and good customs in their responses.
提供机构:
北京海天瑞声科技股份有限公司
搜集汇总
数据集介绍

以上内容由遇见数据集搜集并总结生成



