nvidia/LongAudio

Source: Hugging Face. Updated 2025-08-08. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/nvidia/LongAudio
Description:
LongAudio-XL is a large-scale long-audio question-answering (AQA) dataset for developing (large) audio-language models on reasoning and problem-solving tasks over long audio clips (30 seconds to 10 minutes). It expands the original LongAudio collection with approximately 1 million new QA pairs for long speech, for a total of about 1.25 million diverse examples. The dataset is partitioned into subsets by each audio’s source dataset. Due to licensing constraints, the original audio files are not provided; users are responsible for retrieving the corresponding clips from their original sources. It covers audio reasoning tasks such as captioning, plot QA, and temporal QA, and is suitable for training and fine-tuning large audio-language models to improve understanding and reasoning over long audio.
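
Because the audio is not shipped with the QA pairs, a first practical step is usually to check which clips still need to be fetched for each source subset. The snippet below is a minimal sketch of that bookkeeping, not the dataset's official tooling. The record fields `source` and `audio`, the subset names, and the directory layout are assumptions made for illustration; the real annotation files may use different keys.

```python
from collections import defaultdict
from pathlib import Path


def group_by_source(records):
    """Group QA records by the source dataset their audio comes from."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["source"]].append(rec)
    return dict(groups)


def find_missing_audio(records, audio_root):
    """Return the sorted, de-duplicated relative audio paths not found under audio_root."""
    root = Path(audio_root)
    missing = {rec["audio"] for rec in records if not (root / rec["audio"]).is_file()}
    return sorted(missing)


# Hypothetical records: field names and subset names are illustrative only.
records = [
    {"source": "SubsetA", "audio": "SubsetA/clip_001.wav",
     "question": "What happens first?", "answer": "A door closes."},
    {"source": "SubsetA", "audio": "SubsetA/clip_001.wav",
     "question": "Who speaks last?", "answer": "The narrator."},
    {"source": "SubsetB", "audio": "SubsetB/talk_17.wav",
     "question": "Summarize the talk.", "answer": "An overview of the project."},
]

if __name__ == "__main__":
    for source, recs in group_by_source(records).items():
        # "audio" is a hypothetical local directory holding the retrieved clips.
        missing = find_missing_audio(recs, "audio")
        print(f"{source}: {len(recs)} QA pairs, {len(missing)} clips still to retrieve")
```

Several QA pairs can point at the same clip, so the missing-file check de-duplicates paths before reporting them.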
Provider:
nvidia