GenRM/reddit-dpo-nbeerbower
Hugging Face, updated 2025-05-11, indexed 2025-11-01
Download link:
https://hf-mirror.com/datasets/GenRM/reddit-dpo-nbeerbower
Description:
The reddit-dpo dataset is a filtered version of the euclaise/reddit-instruct dataset: every sample whose post or comment contains a hyperlink was removed. It is intended for model training and was used to tune the mistral-nemo-narwhal-12B model, although the tuned model performed significantly worse than before tuning.
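The hyperlink filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the dataset author's actual script: the field names `post` and `comment`, the helper names, and the definition of a hyperlink (an `http(s)://` URL, a `www.` address, or a markdown-style link) are all hypothetical choices made for the example.

```python
import re

# Assumption: a "hyperlink" is an http(s):// URL, a www. address,
# or a markdown-style link such as [text](target).
LINK_PATTERN = re.compile(r"https?://|www\.|\[[^\]]*\]\([^)]*\)", re.IGNORECASE)


def has_link(text):
    """Return True if the text appears to contain a hyperlink."""
    return bool(LINK_PATTERN.search(text or ""))


def drop_samples_with_links(samples, fields=("post", "comment")):
    """Keep only samples where none of the given fields contains a link.

    The field names are hypothetical; adjust them to the real column names.
    """
    return [
        sample
        for sample in samples
        if not any(has_link(sample.get(field, "")) for field in fields)
    ]


# Toy records, invented for illustration only.
samples = [
    {"post": "How do magnets work?", "comment": "Moving charges create fields."},
    {"post": "Good explanation here: https://example.com", "comment": "Thanks!"},
    {"post": "Any tips for sourdough?", "comment": "See [this guide](/r/breadit)."},
]

kept = drop_samples_with_links(samples)
```

With these toy records only the first sample survives, since the second has a link in the post and the third has one in the comment.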
Provider:
GenRM



