mteb/fsdkaggle2019-parquet

Name: mteb/fsdkaggle2019-parquet
Creator: mteb
Published: 2026-02-05 10:02:20
License: No description available

Hugging Face. Updated 2026-02-05, indexed 2026-02-07.

Download link:

https://hf-mirror.com/datasets/mteb/fsdkaggle2019-parquet

Description:

FSDKaggle2019 is an audio dataset of 29,266 clips annotated with 80 labels from the AudioSet Ontology. It was used for DCASE Challenge 2019 Task 2, which ran as the Kaggle competition Freesound Audio Tagging 2019. All clips are uncompressed PCM audio: 16-bit, 44.1 kHz, mono. The dataset is provided in two configurations, curated and noisy, each with train and test splits. The curated set has 4,970 clips averaging 1.2 labels per clip, 10.5 hours in total; the noisy set has 19,815 clips, also averaging 1.2 labels per clip, 80 hours in total. The test set has 4,481 clips averaging 1.4 labels per clip, 12.9 hours in total. Labels in the curated set are correct but potentially incomplete, labels in the noisy set are noisy, and labels in the test set are correct and complete.
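
The figures quoted in the description can be cross-checked with a little arithmetic. The sketch below is illustrative only: the `SUBSETS` table and the `summarize` helper are hypothetical names introduced here, the numbers are copied from the description, and the size estimate assumes raw PCM payload only (no container headers, and not the size of the Parquet files actually hosted).

```python
# Figures copied from the description: (clips, average labels per clip, hours).
# The hour values are rounded in the source, so derived numbers are approximate.
SUBSETS = {
    "curated": (4970, 1.2, 10.5),
    "noisy": (19815, 1.2, 80.0),
    "test": (4481, 1.4, 12.9),
}

# Audio format stated in the description: PCM 16-bit, 44.1 kHz, mono.
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2
CHANNELS = 1


def summarize(subsets=SUBSETS):
    """Derive totals and per-subset mean clip length from the quoted figures."""
    total_clips = sum(clips for clips, _, _ in subsets.values())
    total_hours = sum(hours for _, _, hours in subsets.values())
    mean_clip_seconds = {
        name: hours * 3600 / clips for name, (clips, _, hours) in subsets.items()
    }
    # Raw PCM payload only; an assumption, not the on-disk size of the dataset.
    pcm_bytes = total_hours * 3600 * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return {
        "total_clips": total_clips,
        "total_hours": total_hours,
        "mean_clip_seconds": mean_clip_seconds,
        "pcm_gigabytes": pcm_bytes / 1e9,
    }


if __name__ == "__main__":
    stats = summarize()
    print("total clips:", stats["total_clips"])
    print("total hours:", round(stats["total_hours"], 1))
    for name, seconds in stats["mean_clip_seconds"].items():
        print(f"mean clip length ({name}): {seconds:.1f} s")
    print(f"raw PCM payload: about {stats['pcm_gigabytes']:.1f} GB")
```

The three subset counts sum to the stated 29,266 clips, and the durations sum to roughly 103.4 hours. Under these figures the mean clip length is about 7.6 s for the curated set, 14.5 s for the noisy set, and 10.4 s for the test set.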

Provider:

mteb