
Dalision/Omni2Sound_Benchmark

Source: Hugging Face. Updated 2026-04-24; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/Dalision/Omni2Sound_Benchmark
Description:
The Omni2Sound Benchmark includes two main resources: SoundAtlas and VGGSound-Omni. SoundAtlas is a large-scale, high-quality dataset of audio-caption pairs with tight Audio-Visual-Text (A-V-T) alignment, derived from the VGGSound and AudioSet datasets; its captions are generated by a novel multi-turn agentic annotation pipeline. VGGSound-Omni is a unified evaluation benchmark for the VT2A (video-and-text-to-audio), V2A (video-to-audio), and T2A (text-to-audio) tasks, with standard and off-screen tracks for evaluating robustness. The dataset is released under the CC BY-NC 4.0 license, for non-commercial use only.
Provider:
Dalision
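
The download link above points to a Hugging Face mirror, so one plausible way to fetch the files is the `huggingface_hub` client. The following is a minimal sketch, not an official loading script: it assumes `huggingface_hub` is installed, that the repository is publicly accessible without a token, and that downloading the whole repository is acceptable. The helper names `download_kwargs` and `download_benchmark` and the target directory `omni2sound_benchmark` are hypothetical.

```python
import os

REPO_ID = "Dalision/Omni2Sound_Benchmark"
MIRROR_ENDPOINT = "https://hf-mirror.com"  # mirror host taken from the download link


def download_kwargs(local_dir="omni2sound_benchmark"):
    """Build keyword arguments for huggingface_hub.snapshot_download."""
    return {
        "repo_id": REPO_ID,
        "repo_type": "dataset",  # the repository is a dataset, not a model
        "local_dir": local_dir,  # hypothetical target directory
    }


def download_benchmark(local_dir="omni2sound_benchmark", use_mirror=True):
    """Download the full dataset repository and return its local path."""
    if use_mirror:
        # HF_ENDPOINT is read when huggingface_hub is imported,
        # so it has to be set before the import below.
        os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)
    # Third-party dependency: pip install huggingface_hub
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```

Calling `download_benchmark()` would start the actual download. The file layout inside the repository is not described on this page, so inspect the downloaded directory before writing any loading code, and keep the CC BY-NC 4.0 non-commercial restriction in mind.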