tsinghua-ee/SAVEBench

Name: tsinghua-ee/SAVEBench
Creator: tsinghua-ee
Published: 2024-10-14 07:43:59
License: apache-2.0 (as declared in the dataset card)

Source: Hugging Face. Updated 2024-10-14; indexed 2025-04-12.

Download link:

https://hf-mirror.com/datasets/tsinghua-ee/SAVEBench

Description:

license: apache-2.0

Audio:
1. LibriSpeech: test-clean, full set.
2. AudioCaps: test, full set.

Video:
1. NExTQA: nextqa_test.json. ID provided in the "image" field.

Image:
1. Flickr30k: flickr30k_captions.json (this is the standard 1k test set). ID provided in the "image" field.
2. TextVQA: textvqa.json. ID provided in the "image" field.
3. GQA: testdev_balanced_questions_with_images.json. ID provided in the "image" field.

Audio-visual:
1. How2: how2_test.json. ID provided in the "image" field. Format: <video_id>_<start_second>_<end_second>.mp4 or .wav.
2. Audio-Visual Sound Source Detection (AVSSD): testdata_formatted.json. ID provided in the "image" field. The first value is the image and the second is the corresponding audio.
3. Audio-Visual Matching (AVM): audiovisualmatching_combined.json. ID provided in the "image" field as a list of two values. The first is the image and the second is the audio/speech. The ID also indicates whether the sample comes from VGGSS or from SpokenCOCO.
4. Audio-visual question answering (AVQA), Ego4D-QA: ego4d_qa.json. The video ID indicates the frame index.
5. Audio-visual question answering (AVQA), Presentation-QA: presentation_qa.json.
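The How2 ID format described above, <video_id>_<start_second>_<end_second> followed by .mp4 or .wav, can be split back into its parts with a few lines of Python. This is only a minimal sketch: the names How2Clip and parse_how2_id are hypothetical and not part of the dataset, the example ID is invented, and it assumes that the start and end seconds are plain numbers and that any underscores inside the video ID should be kept.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class How2Clip(NamedTuple):
    """Hypothetical container for the parts of a How2 clip ID."""
    video_id: str
    start_second: float
    end_second: float
    extension: str


def parse_how2_id(clip_id: str) -> How2Clip:
    """Split '<video_id>_<start_second>_<end_second>.mp4|.wav' into parts.

    Assumption: start/end are numeric; the real files may use another
    notation, so treat this as illustrative only.
    """
    path = PurePosixPath(clip_id)
    extension = path.suffix
    if extension not in (".mp4", ".wav"):
        raise ValueError(f"unexpected extension in {clip_id!r}")
    # Split from the right so underscores inside video_id are preserved.
    video_id, start, end = path.stem.rsplit("_", 2)
    return How2Clip(video_id, float(start), float(end), extension)


# Invented example ID, not taken from how2_test.json.
clip = parse_how2_id("abc_def_12_20.mp4")
print(clip.video_id, clip.start_second, clip.end_second, clip.extension)
```

Splitting from the right is a deliberate choice here: the two time fields are always last, while the video ID itself may contain underscores.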

Provided by:

tsinghua-ee
