dogtooth/off-policy-0.1-with-on-policy-0.1-uf_iter1_generated_ultrafeedback_binarized_1730418494

Hugging Face | Updated 2024-10-31 | Indexed 2024-12-14
Download link:
https://hf-mirror.com/datasets/dogtooth/off-policy-0.1-with-on-policy-0.1-uf_iter1_generated_ultrafeedback_binarized_1730418494
Description:
This dataset is intended for generation tasks and consists of preference data. According to its configuration, it mixes in the HuggingFaceH4/ultrafeedback_binarized dataset and generates data with rejection sampling. The README does not provide any further description of the dataset.
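
Since the README gives no details, the following is only a minimal sketch of how rejection sampling is commonly used to build preference pairs and how such pairs might be mixed with an existing preference dataset. Every name here (`build_preference_pair`, `mix`, the `generate` and `score` callables, the `prompt`/`chosen`/`rejected` keys) is a hypothetical choice for illustration, not the authors' pipeline, and the best-versus-worst selection rule is an assumption.

```python
import random
from typing import Callable, Dict, List


def build_preference_pair(
    prompt: str,
    generate: Callable[[str], str],      # hypothetical: samples one response from a policy
    score: Callable[[str, str], float],  # hypothetical: reward for a (prompt, response) pair
    num_candidates: int = 4,
) -> Dict[str, str]:
    """Sample several responses and keep the best and worst as a preference pair."""
    candidates = [generate(prompt) for _ in range(num_candidates)]
    # Sort ascending by score: lowest-scoring first, highest-scoring last.
    ranked = sorted(candidates, key=lambda response: score(prompt, response))
    return {"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]}


def mix(on_policy: List[dict], off_policy: List[dict], seed: int = 0) -> List[dict]:
    """Concatenate two lists of preference pairs and shuffle them reproducibly."""
    mixed = on_policy + off_policy
    random.Random(seed).shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Toy stand-ins: a fixed pool of responses and a length-based "reward".
    pool = iter(["ok", "a longer answer", "short", "the longest answer here"])
    pair = build_preference_pair(
        "What is rejection sampling?",
        generate=lambda p: next(pool),
        score=lambda p, r: float(len(r)),
    )
    existing = [{"prompt": "p", "chosen": "good", "rejected": "bad"}]
    print(mix([pair], existing))
```

In practice `generate` would call the model being trained and `score` would call a reward model or judge; the toy versions above exist only so the sketch runs on its own.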
Provider:
dogtooth