Dalision/Omni2Sound_Benchmark
Source: Hugging Face. Updated: 2026-04-24. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Dalision/Omni2Sound_Benchmark
Description:
The Omni2Sound Benchmark includes two main resources: SoundAtlas and VGGSound-Omni. SoundAtlas is a large-scale, high-quality dataset of audio-caption pairs with tight Audio-Visual-Text (A-V-T) alignment, derived from the VGGSound and AudioSet datasets. Its captions are generated by a novel multi-turn agentic annotation pipeline. VGGSound-Omni is a unified evaluation benchmark for the video-and-text-to-audio (VT2A), video-to-audio (V2A), and text-to-audio (T2A) tasks; it includes both standard and off-screen tracks so that robustness can be evaluated. The dataset is released under the CC BY-NC 4.0 license, for non-commercial use only.
Provided by:
Dalision
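
As a rough illustration of how the download link listed above might be used programmatically, the sketch below splits the URL into a Hub endpoint and a repository id, then passes both to `snapshot_download` from the `huggingface_hub` library. Several things here are assumptions rather than facts from this listing: that the mirror host accepts the same API calls as the official Hub, that the repository can be fetched without authentication or a gating step, and the helper names `split_dataset_url` and `download_benchmark` and the `omni2sound` target directory, which are hypothetical.

```python
from urllib.parse import urlparse

# Download link as given in this listing.
MIRROR_URL = "https://hf-mirror.com/datasets/Dalision/Omni2Sound_Benchmark"


def split_dataset_url(url: str) -> tuple[str, str]:
    """Split a Hub-style dataset URL into (endpoint, repo_id).

    Hypothetical helper: assumes the path has the form
    /datasets/<owner>/<name>.
    """
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a dataset URL: {url}")
    endpoint = f"{parsed.scheme}://{parsed.netloc}"
    return endpoint, f"{parts[1]}/{parts[2]}"


def download_benchmark(url: str = MIRROR_URL, local_dir: str = "omni2sound") -> str:
    """Download the whole dataset repository and return the local path.

    Hypothetical helper: assumes the mirror serves the standard Hub API
    and that no login or access request is required.
    """
    # Imported lazily so the URL helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    endpoint, repo_id = split_dataset_url(url)
    return snapshot_download(
        repo_id=repo_id,
        repo_type="dataset",  # the repository is a dataset, not a model
        endpoint=endpoint,    # point the client at the mirror host
        local_dir=local_dir,
    )
```

Calling `download_benchmark()` would then attempt to fetch the full repository into `omni2sound/`. The repository layout is not described in this listing, so any file paths inside the download would need to be checked by hand. Because the license is CC BY-NC 4.0, downloaded files may only be used non-commercially.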



