dogtooth/off-policy-0.1-with-on-policy-0.1-uf_iter1_generated_ultrafeedback_binarized_1730418494

Name: dogtooth/off-policy-0.1-with-on-policy-0.1-uf_iter1_generated_ultrafeedback_binarized_1730418494
Creator: dogtooth
Published: 2024-10-31 23:49:04
License: No description available

Source: Hugging Face | Updated: 2024-10-31 | Indexed: 2024-12-14

Download link:

https://hf-mirror.com/datasets/dogtooth/off-policy-0.1-with-on-policy-0.1-uf_iter1_generated_ultrafeedback_binarized_1730418494

Overview:

This dataset targets generation tasks and contains preference data. According to its configuration, it mixes in data from the HuggingFaceH4/ultrafeedback_binarized dataset and generates samples via rejection sampling. The README gives no further description of the dataset.
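Because the README does not describe the procedure, the following is only a minimal sketch of one common reading of rejection sampling for preference data: score several candidate responses per prompt, then keep the highest-scoring one as "chosen" and the lowest-scoring one as "rejected". The function names `build_preference_pair` and `toy_score`, the plain-string `prompt`/`chosen`/`rejected` fields, and the length-based scorer are all hypothetical and are not taken from the dataset's actual pipeline.

```python
from typing import Callable, Dict, List


def build_preference_pair(
    prompt: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> Dict[str, str]:
    """Pick the best- and worst-scoring candidates as chosen/rejected.

    Illustrative only: the dataset's real selection rule is not documented.
    """
    if len(candidates) < 2:
        raise ValueError("need at least two candidates to form a pair")
    # Sort ascending by score: first is worst, last is best.
    ranked = sorted(candidates, key=lambda c: score_fn(prompt, c))
    return {"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]}


def toy_score(prompt: str, response: str) -> float:
    # Hypothetical stand-in for a reward model: longer answers score higher.
    return float(len(response))


pair = build_preference_pair(
    "What is 2 + 2?",
    ["4", "2 + 2 equals 4.", "It is four."],
    toy_score,
)
print(pair)
```

In practice the scorer would be a reward model or judge rather than a length heuristic, and the candidates would be sampled from the policy being trained.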

Provider:

dogtooth