RLHFlow/reinforce_ada_hard_prompt
Source: Hugging Face. Updated: 2025-10-10. Indexed: 2025-10-18.
Download link:
https://hf-mirror.com/datasets/RLHFlow/reinforce_ada_hard_prompt
Description:
Selected hard prompts used to train the Qwen2.5-Math-7B and Qwen3-4B-Instruct-2507 models.
Provider:
RLHFlow



