gijs/emolia-balanced-5M-subset

Name: gijs/emolia-balanced-5M-subset
Creator: gijs
Published: 2026-04-26 12:40:53
License: no description provided

Source: Hugging Face. Updated: 2026-04-26. Indexed: 2026-04-26.

Download link:

https://hf-mirror.com/datasets/gijs/emolia-balanced-5M-subset

Description:

This is the emolia-balanced-5M-subset corpus repackaged as WebDataset tars, with per-clip voice-dimension annotations generated by MOSS-Audio-8B-Instruct. For every audio clip, the JSON sidecar is augmented with 18 top-level group keys, each containing 3–4 short-code fields; each field describes one dimension of the voice, for 59 voice dimensions in total. The layout is plain WebDataset: each tar contains paired <key>.mp3 and <key>.json samples. The audio is byte-identical to the source; only the sidecar JSON is enriched with the 18 annotation keys. The annotations were generated with the OpenMOSS-Team/MOSS-Audio-8B-Instruct model: each clip was prompted 18 times, once per group, and each time the model returned a JSON object with that group's short-code keys. Two caveats apply. A very small fraction of clips have an _error / _raw tag inside a group instead of parsed fields. The annotations also come from a neural model, so spot-checking against human-labelled references is recommended for high-stakes downstream use. The data is derived from the emolia-balanced-5M-subset corpus.
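The layout and the _error / _raw caveat described above can be illustrated with a minimal Python sketch that uses only the standard library. It pairs <key>.mp3 and <key>.json members from one shard and lists the top-level sidecar groups that carry an _error or _raw tag. The function names iter_samples and failed_groups are hypothetical helpers, not part of the dataset or of any library. The sketch assumes that a failed group is a JSON object containing _error or _raw as a key, which is one reading of the description rather than a confirmed schema.

```python
import io
import json
import tarfile
from typing import BinaryIO, Dict, Iterator, List, Tuple


def iter_samples(fileobj: BinaryIO) -> Iterator[Tuple[str, bytes, dict]]:
    """Yield (key, mp3_bytes, sidecar) for each paired sample in one tar shard.

    `fileobj` is a seekable binary file object opened on a shard.
    """
    pending: Dict[str, Dict[str, bytes]] = {}
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # WebDataset convention: files sharing a basename form one sample.
            key, _, ext = member.name.rpartition(".")
            if ext not in ("mp3", "json"):
                continue
            parts = pending.setdefault(key, {})
            parts[ext] = tar.extractfile(member).read()
            if "mp3" in parts and "json" in parts:
                del pending[key]
                yield key, parts["mp3"], json.loads(parts["json"])


def failed_groups(sidecar: dict) -> List[str]:
    """Return top-level keys whose value is a dict holding _error or _raw.

    Assumption: failed annotation groups look like {"_error": ...} or
    {"_raw": ...}; adjust this check if the real sidecars differ.
    """
    return [
        name
        for name, value in sidecar.items()
        if isinstance(value, dict) and ("_error" in value or "_raw" in value)
    ]
```

A typical use would be to open a downloaded shard in binary mode, loop over iter_samples, and skip or log any clip for which failed_groups returns a non-empty list before relying on its annotations.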

Provider:

gijs
