
jackf857/llama-3-8b-base-scheduled-beta-margin-dpo-hh-helpful-margin-log

Source: Hugging Face. Updated 2026-04-27. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/jackf857/llama-3-8b-base-scheduled-beta-margin-dpo-hh-helpful-margin-log
Resource description:
# jackf857/llama-3-8b-base-scheduled-beta-margin-dpo-hh-helpful-margin-log

Per-step margin summary statistics exported from a margin-DPO training run.

## Source Run

- Model repo id: `jackf857/llama-3-8b-base-scheduled-beta-margin-dpo-hh-helpful`
- Base model: `W-61/llama-3-8b-base-sft-hh-helpful-4xh200`
- Run name: `llama-3-8b-base-scheduled-beta-margin-dpo-hh-helpful`
- Margin log path: `margin_outputs/llama-3-8b-base-margin-dpo-hh-helpful/margin_logs`
- Published split: `train`
- Rows: `681`

## Columns

- `epoch`
- `step`
- `batch_size`
- `mean`
- `std`
- `min`
- `p10`
- `median`
- `p90`
- `max`
- `pos_frac`
- `sample` (per-example margins for the effective batch on that logged step)
- `npy` (optional path to the saved full margin array when `margin_save_full=true`)

## Dataset Mixer

```json
{
  "Anthropic/hh-rlhf": 1.0
}
```
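As a rough illustration of how the summary columns relate to the `sample` column, the sketch below recomputes them from a list of per-example margins. The card does not say how the training run computed these values, so the details here are assumptions: `pos_frac` is taken to be the fraction of margins strictly greater than zero, `std` is the population standard deviation, and percentiles use linear interpolation. The helper names `percentile` and `summarize_margins` are hypothetical and not part of the source run.

```python
import statistics
from typing import Dict, Sequence


def percentile(sorted_vals: Sequence[float], q: float) -> float:
    """Linearly interpolated percentile (q in [0, 100]) of pre-sorted values."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def summarize_margins(sample: Sequence[float]) -> Dict[str, float]:
    """Recompute the per-step summary columns from per-example margins.

    Assumptions (not confirmed by the dataset card): population std,
    linear-interpolation percentiles, pos_frac = share of margins > 0.
    """
    vals = sorted(float(v) for v in sample)
    return {
        "batch_size": len(vals),
        "mean": statistics.fmean(vals),
        "std": statistics.pstdev(vals),
        "min": vals[0],
        "p10": percentile(vals, 10),
        "median": percentile(vals, 50),
        "p90": percentile(vals, 90),
        "max": vals[-1],
        "pos_frac": sum(v > 0 for v in vals) / len(vals),
    }


# Made-up margins for illustration only; these are not values from the dataset.
row = summarize_margins([-1.0, 0.0, 1.0, 2.0, 3.0])
```

Comparing the recomputed values against the logged columns for a few rows would show whether these assumptions match the run's conventions, for example whether `std` is the population or the sample estimate.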
Provided by:
jackf857