zhengbang0707/REFUEL-Ultrainteract-Llama-3-Armo-iter_2_TWise_30k_CUDA

Name: zhengbang0707/REFUEL-Ultrainteract-Llama-3-Armo-iter_2_TWise_30k_CUDA
Creator: zhengbang0707
Published: 2025-04-10 00:44:56
License: 暂无描述

Hugging Face2025-04-10 更新2025-04-12 收录

下载链接：

https://hf-mirror.com/datasets/zhengbang0707/REFUEL-Ultrainteract-Llama-3-Armo-iter_2_TWise_30k_CUDA

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集包含用户选择的文本内容（chosen）和拒绝的文本内容（reject），每个内容都包含文本本身（content）和用户的角色（role）。此外，数据集还提供了关于文本的token信息、用户对文本的掩码信息、选择和拒绝的奖励值以及对应的日志概率。数据集分为训练集和测试集，训练集包含30000个示例，测试集包含500个示例。

The dataset includes user-chosen texts (chosen) and rejected texts (reject), each containing the text itself (content) and the users role (role). Additionally, the dataset provides token information of the text, user masks for the text, reward values for choosing and rejecting, and corresponding log probabilities. The dataset is split into a training set and a test set, with the training set containing 30,000 examples and the test set containing 500 examples.

提供机构：

zhengbang0707