AudioTokenBench
Source: ModelScope community. Updated 2025-12-05; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/bosonai/AudioTokenBench
Description:
# AudioTokenBench
This is the evaluation dataset for [HiggsTokenizer](https://github.com/boson-ai/higgs-audio/blob/main/tech_blogs/TOKENIZER_BLOG.md). It contains 3,150 audio samples at 24 kHz across 4 subsets:
- **Speech**: 1,000 ten-second clips, randomly sampled from [DAPS](https://ccrma.stanford.edu/~gautham/Site/daps.html).
- **Music**: 1,000 ten-second clips, randomly sampled from [MUSDB](https://sigsep.github.io/datasets/musdb.html).
- **Sound Event**: 1,000 ten-second clips, randomly sampled from [AudioSet](https://research.google.com/audioset/index.html).
- **Audiophile**: 150 thirty-second clips, curated from eleven high-fidelity test discs. The clips feature both music and sound events and were selected for high-quality audio evaluation.
For detailed evaluation metrics, please refer to our [blog](https://github.com/boson-ai/higgs-audio/blob/main/tech_blogs/TOKENIZER_BLOG.md) and [GitHub repository](https://github.com/boson-ai/higgs-audio/tree/main).
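
As a quick sanity check on the composition described above, the following minimal sketch tallies the clip count and nominal audio duration per subset. It assumes every clip has exactly its nominal length (10 or 30 seconds) and uses the stated 24 kHz sample rate; the `Subset` class and `summarize` function are hypothetical helpers written for this illustration, not part of any HiggsTokenizer or ModelScope API. The channel count is not stated in the description, so sample counts are given per channel.

```python
from dataclasses import dataclass

SAMPLE_RATE_HZ = 24_000  # sample rate stated in the dataset description


@dataclass(frozen=True)
class Subset:
    """Hypothetical record describing one subset of the benchmark."""
    name: str
    num_clips: int
    clip_seconds: int  # nominal clip length; real files may differ slightly


# Subset sizes and clip lengths as listed in the dataset description.
SUBSETS = [
    Subset("Speech", 1_000, 10),
    Subset("Music", 1_000, 10),
    Subset("Sound Event", 1_000, 10),
    Subset("Audiophile", 150, 30),
]


def summarize(subsets, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return total clip count, nominal duration, and per-channel sample count."""
    total_clips = sum(s.num_clips for s in subsets)
    total_seconds = sum(s.num_clips * s.clip_seconds for s in subsets)
    return {
        "clips": total_clips,
        "seconds": total_seconds,
        "hours": total_seconds / 3600,
        "samples_per_channel": total_seconds * sample_rate_hz,
    }


if __name__ == "__main__":
    for s in SUBSETS:
        print(f"{s.name}: {s.num_clips} clips x {s.clip_seconds} s")
    print(summarize(SUBSETS))
```

Under these assumptions the four subsets add up to the 3,150 clips stated above, with the three 10-second subsets contributing most of the nominal duration.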
Provider:
maas
Created:
2025-07-29



