Training Dataset for Automatic Judgment of Large Model Generated Results

Name: Training Dataset for Automatic Judgment of Large Model Generated Results (大模型生成结果自动研判训练数据集)
Creator: 北京信联数安科技有限公司
Published: 2025-02-17 00:00:00
License: No description available

Source: Beijing Data Intellectual Property (北京市数据知识产权); updated 2025-02-17; indexed 2025-03-04

Download link:

https://webs.bjidex.com/sys-bsc-home/#/bscConsole/intellectualProperty/infoPublicity?action=1


Resource description:

More than 190 generative AI large models currently offer public services in China, with over 600 million registered users. The security problems of these models cannot be ignored. The most common attack bypasses a model's built-in restrictions and misleads it into producing illegal or even harmful content, which threatens social order and individual rights. Cyberspace administration departments therefore need to carry out filing inspections and routine security evaluations of the generative large model products on the Internet in order to govern their security. In such security testing, the generated results usually have to be annotated manually, because automated security and compliance judgment is hard to implement and has low accuracy. This dataset is intended for security-focused large models. After fine-tuning on it, such a model can draw on its natural language understanding and the judgment logic learned from the training data to judge the outputs of other large models automatically, helping to prevent content that violates requirements in areas such as ideology, illegal and criminal activity, privacy and property, ethics and morality, and bias and discrimination.
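As a rough illustration of the fine-tuning use described above, the sketch below turns manually judged samples into instruction/output pairs for supervised fine-tuning. The listing does not publish the dataset's schema, so the `JudgedSample` fields, the prompt wording, and the helper names are all assumptions made for this example; only the five risk categories come from the description.

```python
import json
from dataclasses import dataclass
from typing import Dict, Iterable

# Risk categories named in the dataset description.
RISK_CATEGORIES = (
    "ideology",
    "illegal and criminal activity",
    "privacy and property",
    "ethics and morality",
    "bias and discrimination",
)


@dataclass
class JudgedSample:
    """Hypothetical record layout; the real dataset schema is not documented here."""
    prompt: str         # input sent to the model under test
    response: str       # output produced by the model under test
    compliant: bool     # annotator's verdict
    category: str = ""  # violated risk category; empty when compliant


def to_finetune_record(sample: JudgedSample) -> Dict[str, str]:
    """Build one instruction/output pair from a judged sample (assumed format)."""
    if not sample.compliant and sample.category not in RISK_CATEGORIES:
        raise ValueError(f"unknown risk category: {sample.category!r}")
    instruction = (
        "Judge whether the model response below is compliant. "
        "If it is not, name the risk category.\n"
        f"Prompt: {sample.prompt}\n"
        f"Response: {sample.response}"
    )
    verdict = "compliant" if sample.compliant else f"non-compliant: {sample.category}"
    return {"instruction": instruction, "output": verdict}


def to_jsonl(samples: Iterable[JudgedSample]) -> str:
    """Serialize samples as JSON Lines, one fine-tuning record per line."""
    return "\n".join(
        json.dumps(to_finetune_record(s), ensure_ascii=False) for s in samples
    )
```

Calling `to_jsonl` on a list of `JudgedSample` objects yields text that many fine-tuning toolchains can read after minor field renaming; the exact field names a given trainer expects would need to be checked against that trainer's documentation.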

Provided by:

北京信联数安科技有限公司

The content above was collected and summarized by 遇见数据集.