Dalision/Omni2Sound_Benchmark

Name: Dalision/Omni2Sound_Benchmark
Creator: Dalision
Published: 2026-04-24 14:54:54
License: CC BY-NC 4.0 (non-commercial use only)

Source: Hugging Face. Updated: 2026-04-24. Indexed: 2026-04-26.

Download link:

https://hf-mirror.com/datasets/Dalision/Omni2Sound_Benchmark
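A minimal sketch of fetching the repository through the mirror listed above, assuming the `huggingface_hub` package is installed and that the mirror accepts the standard Hub API. The helper names `dataset_url` and `download_benchmark` and the local directory name are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Repository ID and mirror host are taken from the download link above.
REPO_ID = "Dalision/Omni2Sound_Benchmark"
MIRROR_ENDPOINT = "https://hf-mirror.com"


def dataset_url(repo_id: str = REPO_ID, endpoint: str = MIRROR_ENDPOINT) -> str:
    """Build the dataset page URL for a given Hub endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def download_benchmark(local_dir: str = "Omni2Sound_Benchmark") -> Path:
    """Download the full dataset repository via the mirror.

    Assumption: the mirror is API-compatible with the Hugging Face Hub.
    The import is local so the URL helper works without the package.
    """
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",       # this is a dataset repo, not a model repo
        local_dir=local_dir,       # hypothetical target directory
        endpoint=MIRROR_ENDPOINT,  # route requests through the mirror
    )
    return Path(path)
```

Calling `download_benchmark()` returns the local path of the downloaded snapshot. Passing `endpoint="https://huggingface.co"` to `dataset_url` gives the corresponding page on the main Hub instead of the mirror.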

Overview:

The Omni2Sound Benchmark comprises two main resources: SoundAtlas and VGGSound-Omni. SoundAtlas is a large-scale, high-quality dataset of audio-caption pairs with tight Audio-Visual-Text (A-V-T) alignment, derived from the VGGSound and AudioSet datasets. Its captions are generated by a novel multi-turn agentic annotation pipeline. VGGSound-Omni is a unified evaluation benchmark for the VT2A (video-and-text-to-audio), V2A (video-to-audio), and T2A (text-to-audio) tasks, with both standard and off-screen tracks for evaluating robustness. The dataset is released under the CC BY-NC 4.0 license for non-commercial use only.

Provided by:

Dalision
