vprak17/synthetic_audio_paired_preferences
Source: Hugging Face | Updated: 2026-04-28 | Indexed: 2026-05-03
Download link:
https://hf-mirror.com/datasets/vprak17/synthetic_audio_paired_preferences
Description:
A synthetic spoken-dialogue DPO (Direct Preference Optimization) dataset with 8,482 examples, designed for preference alignment of textless speech language models. Each example corresponds to a single assistant turn in a two-speaker conversation and provides three parts: the spoken prompt (the last user utterance, as audio and text), a chosen response (the best model-generated spoken reply, as scored by an LLM judge), and a rejected response (a deliberately degraded reply with either the wrong response type or a lower LLM-judge score). Conversations were synthetically generated with Qwen3-32B, converted to speech with a Voxtral-based TTS backend, and scored by an LLM judge. Audio is 24 kHz mono WAV.
Provider:
vprak17
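
The description above gives the shape of each example (prompt, chosen, rejected) and the audio format (24 kHz mono WAV), but not the actual column names. The following is a minimal sketch of how one such preference pair could be represented and checked against the stated format, using only the Python standard library. The field names (`prompt_text`, `prompt_wav`, `chosen_wav`, `rejected_wav`) and the helper `silent_wav` are hypothetical and are not taken from the dataset's schema; the 16-bit sample width in the demo helper is likewise an assumption.

```python
import io
import wave
from typing import NamedTuple


class PreferencePair(NamedTuple):
    """One assistant turn. All field names here are hypothetical."""
    prompt_text: str     # transcript of the last user utterance
    prompt_wav: bytes    # audio of the last user utterance
    chosen_wav: bytes    # preferred spoken reply
    rejected_wav: bytes  # degraded spoken reply


def wav_format(data: bytes) -> tuple[int, int]:
    """Return (sample_rate_hz, channels) read from a WAV header."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        return wav.getframerate(), wav.getnchannels()


def matches_stated_format(pair: PreferencePair) -> bool:
    """Check that all three clips are 24 kHz mono, as the description states."""
    clips = (pair.prompt_wav, pair.chosen_wav, pair.rejected_wav)
    return all(wav_format(clip) == (24000, 1) for clip in clips)


def silent_wav(sample_rate: int = 24000, channels: int = 1,
               seconds: float = 0.1) -> bytes:
    """Build a short silent WAV in memory, for demonstration only.

    The 16-bit PCM sample width is an assumption, not a documented property.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample (assumed)
        wav.setframerate(sample_rate)
        frame_count = int(sample_rate * seconds)
        wav.writeframes(b"\x00" * (2 * channels * frame_count))
    return buffer.getvalue()


# Usage: a well-formed pair passes, a pair with a 16 kHz clip does not.
good = PreferencePair("hello", silent_wav(), silent_wav(), silent_wav())
bad = PreferencePair("hello", silent_wav(), silent_wav(16000), silent_wav())
print(matches_stated_format(good), matches_stated_format(bad))
```

A check like this may be useful before training, since a sample-rate mismatch between chosen and rejected clips could otherwise act as an unintended preference signal. The real column names should be read from the dataset card on Hugging Face before adapting the sketch.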



