AMindToThink/wmdp-cyber-corpus_unpaired-preference
Source: Hugging Face. Updated 2024-12-04. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/AMindToThink/wmdp-cyber-corpus_unpaired-preference
Description:
This dataset repackages https://huggingface.co/datasets/cais/wmdp-corpora into an unpaired-preference format. It has three features: prompt (string), completion (string), and label (boolean). Retain data is labeled True and forget data is labeled False. The prompt field is left empty because the original text does not come with a prompt. There is a single train split with 5473 examples, totaling 81058697 bytes (about 81 MB); the download size is 34512581 bytes (about 35 MB). License: MIT. Task category: reinforcement learning. Language: English. Size category: 1K<n<10K. The author is unsure whether the dataset will actually work for reinforcement learning but plans to try it.
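The repackaging described above can be sketched roughly as follows. This is a minimal illustration and not the author's actual script: the function name `to_unpaired_preference` and the sample texts are hypothetical, and only the three field names (prompt, completion, label) and the True/False mapping come from the description.

```python
from typing import Dict, Iterable, List, Union

# One record in the unpaired-preference layout described above.
Record = Dict[str, Union[str, bool]]


def to_unpaired_preference(
    retain_texts: Iterable[str],
    forget_texts: Iterable[str],
) -> List[Record]:
    """Hypothetical helper: map retain text to label=True and forget text to label=False.

    The prompt is always the empty string, since the source corpus
    provides raw text without prompts.
    """
    records: List[Record] = []
    for text in retain_texts:
        records.append({"prompt": "", "completion": text, "label": True})
    for text in forget_texts:
        records.append({"prompt": "", "completion": text, "label": False})
    return records


# Placeholder texts for illustration only; they are not taken from the corpus.
records = to_unpaired_preference(
    retain_texts=["benign networking tutorial"],
    forget_texts=["text the model should forget"],
)
for record in records:
    print(record)
```

A list of records in this shape could then be wrapped with `datasets.Dataset.from_list(records)` if the Hugging Face `datasets` library is available, although how the published dataset was actually built is not stated here.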
Provider:
AMindToThink



