trillionlabs/NemoSlides-DPO-mix-v1.0
Source: Hugging Face. Updated 2026-04-22. Indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/trillionlabs/NemoSlides-DPO-mix-v1.0
Description:
Slide-DPO is a Direct Preference Optimization (DPO) dataset for training Large Language Models (LLMs) to generate slide presentations in Slidev markdown format. It is derived from the Slides-Align human preference rankings and the SlidesGen-Bench benchmark. Each row contains one preference pair: a prompt (a brief plus the pool of available images) and two Slidev-markdown responses, each with a <think> reasoning trace. The dataset comprises 571 DPO pairs across 187 unique (difficulty, topic) groups and includes 16,142 PNG images. The construction pipeline covers pair generation, PPTX-to-Slidev conversion, image preservation, image description mining, and reasoning trace generation. Known limitations include lossy layout, image references without layout, a maximum pool size of 40, and synthetic reasoning.
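To make the row structure concrete, here is a minimal Python sketch of one preference pair and a helper that separates the <think> trace from the Slidev markdown. The field names (`prompt`, `image_pool`, `chosen`, `rejected`) and the assumption that each response starts with a single `<think>...</think>` block are illustrative guesses, not the dataset's documented schema; check the dataset card for the actual column names.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Assumption: a response is an optional <think>...</think> block
# followed by the Slidev markdown deck.
THINK_RE = re.compile(r"\s*<think>(.*?)</think>\s*", re.DOTALL)

# Taken from the stated limitation "maximum pool size of 40".
MAX_POOL_SIZE = 40


@dataclass
class PreferencePair:
    """One DPO row. All field names here are hypothetical."""
    prompt: str            # the brief given to the model
    image_pool: list[str]  # identifiers of images available to the deck
    chosen: str            # preferred response (trace + Slidev markdown)
    rejected: str          # dispreferred response (trace + Slidev markdown)


def split_reasoning(response: str) -> tuple[str, str]:
    """Return (reasoning_trace, slidev_markdown) for one response."""
    match = THINK_RE.match(response)
    if match is None:
        # No leading trace: treat the whole response as markdown.
        return "", response.strip()
    return match.group(1).strip(), response[match.end():].strip()


def check_pool(pair: PreferencePair) -> None:
    """Raise if the image pool exceeds the documented maximum."""
    if len(pair.image_pool) > MAX_POOL_SIZE:
        raise ValueError(f"image pool has {len(pair.image_pool)} entries")
```

Splitting the trace from the deck is useful when only the rendered Slidev markdown should be scored or displayed, while the full response is kept for training.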
Provider:
trillionlabs



