hikitoxin/chai-ppo-rm-implicit

Source: Hugging Face. Updated: 2024-11-29. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/hikitoxin/chai-ppo-rm-implicit
Description:
The dataset has three features: chosen (the preferred text), rejected (the rejected text), and margin (the difference value between the two). It is split into a training set of 162,819 samples and a test set of 1,024 samples. The download size is 314,960,195 bytes and the total size is 566,584,477 bytes. The dataset is used to train an RP reward model; per the dataset description, better formatting lets training converge faster and reach a higher final accuracy.
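The listing does not say how the margin feature enters training. One common choice for preference pairs of this shape is a pairwise logistic loss with the margin subtracted from the score gap. The following is a minimal sketch under that assumption, not the dataset author's method; the function name, the example record, and the reward scores are all hypothetical.

```python
import math


def pairwise_margin_loss(score_chosen: float, score_rejected: float,
                         margin: float = 0.0) -> float:
    """Return -log(sigmoid(score_chosen - score_rejected - margin)).

    Assumption: the margin is used as a required gap between the two
    reward scores. The dataset listing does not confirm this usage.
    """
    z = score_chosen - score_rejected - margin
    # -log(sigmoid(z)) equals log(1 + exp(-z)); branch for numerical stability.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Hypothetical record with the three fields named in the description.
record = {
    "chosen": "placeholder preferred reply",
    "rejected": "placeholder rejected reply",
    "margin": 0.5,
}

# Hypothetical reward-model scores for the two texts.
loss = pairwise_margin_loss(2.0, 1.0, record["margin"])
print(f"loss with margin: {loss:.4f}")
```

Under this formulation a larger margin raises the loss for the same pair of scores, so the model is pushed to separate the chosen and rejected texts more strongly when the recorded preference is stronger.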
Provider:
hikitoxin