Ref-AVS

Name: Ref-AVS
Creator: Gewu Lab
License: No description available

Indexed from arXiv on 2025-09-30

Download link:

https://gewu-lab.github.io/Ref-AVS

Description:

Ref-AVS is a dataset built for the reference audio-visual segmentation (Ref-AVS) task. It provides pixel-level annotations for the objects described by expressions that carry multimodal cues. The dataset covers audible objects from 48 categories and static objects from 3 categories. Its expressions combine auditory, visual, and temporal information, and expression difficulty is graded into three levels: easy, medium, and hard. The data spans a wide range of objects and scenes, with an emphasis on complexity and interaction.

Provided by:

Gewu Lab
