BirdCLEF2023
Alibaba Cloud Tianchi; updated 2026-05-15; indexed 2025-04-26
Download link:
https://tianchi.aliyun.com/dataset/202533
Resource description:
A detailed introduction to the kaggle_birdCLEF2023 dataset:
## Dataset Overview
* **Source and Background**: This dataset comes from the BirdCLEF2023 challenge hosted on Kaggle. The challenge aims to advance research and innovation in identifying birds by sound, in support of bioacoustics and the conservation of bird populations worldwide.
* **Data Scale**: It contains about 16,900 audio recordings provided by Xeno-canto (the metadata file lists 16,941 records), covering 264 bird species found in Kenya, Africa. The calls of these species were heard in 191 ten-minute soundscape recordings.
## Dataset Composition
* **Training Data**:
  * **Audio Files**: Stored in the `train_audio` folder. These are mostly short recordings of individual bird calls uploaded by users of xeno-canto.org. Where applicable, the files have been downsampled to 32 kHz to match the test set audio and converted to OGG format.
  * **Metadata**: Provided in `train_metadata.csv`, which contains 16,941 records and 454 missing values. The most directly relevant fields are:
    - `primary_label`: the bird species code; appending the code to `ebird.org/species/` gives the species' details page.
    - `latitude` and `longitude`: the coordinates of the recording location.
    - `author`: the user who provided the recording.
    - `filename`: the name of the associated audio file.
* **Test Data**: The `test_soundscapes` folder contains approximately 200 recordings used for scoring. Each is about 10 minutes long, in OGG format, with a randomized filename.
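
The metadata fields listed above can be explored with a few lines of Python. The following is a minimal sketch using only the standard library. The sample rows, species codes, and filenames are placeholders invented for illustration, not values from the real `train_metadata.csv`, and the `https://` scheme on the eBird URL is an assumption.

```python
import csv
import io
from collections import Counter

# Placeholder rows standing in for train_metadata.csv (all values are invented).
SAMPLE_CSV = """primary_label,latitude,longitude,author,filename
spcode1,-1.29,36.82,Example Author,spcode1/XC000001.ogg
spcode1,,,Another Author,spcode1/XC000002.ogg
spcode2,0.05,37.65,Example Author,spcode2/XC000003.ogg
"""


def load_rows(handle):
    """Read metadata rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def species_url(code):
    """Build a species page URL by appending the code to ebird.org/species/."""
    return f"https://ebird.org/species/{code}"


def count_missing(rows):
    """Count empty cells across all rows."""
    return sum(1 for row in rows for value in row.values() if value == "")


rows = load_rows(io.StringIO(SAMPLE_CSV))
per_species = Counter(row["primary_label"] for row in rows)
print(per_species.most_common())
print(count_missing(rows), "missing values")
print(species_url(rows[0]["primary_label"]))
```

To run the same functions on the real file, pass an open file handle instead of the in-memory sample, for example `load_rows(open("train_metadata.csv", newline="", encoding="utf-8"))`.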
## Dataset Characteristics
* **Audio Diversity**: The dataset covers a wide range of bird vocalizations from 264 species found in Kenya, with multiple recordings per species. This goes some way toward supporting research into and analysis of the acoustic features of different bird calls.
* **Scene Complexity**: The test audio was recorded in a variety of soundscapes and contains many kinds of background sound. This makes bird call recognition harder but also closer to real-world use, which helps improve model performance in real environments.
* **High Data Quality**: The training audio has undergone some preprocessing, such as stereo-to-mono conversion, resampling to 32 kHz (consistent with the training audio description above), high-pass filtering, and normalization. This helps models extract and learn audio features.
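
The preprocessing steps named in the last bullet can be illustrated on plain lists of samples. This is only a sketch: the description does not state the filter type, the cutoff frequency, or the normalization method, so the first-order RC filter, the 150 Hz cutoff, and peak normalization used here are all assumptions. Resampling is left out because it normally relies on an audio library.

```python
import math


def to_mono(left, right):
    """Average two channels sample by sample."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


def high_pass(samples, sample_rate, cutoff_hz):
    """First-order RC high-pass: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[n] - samples[n - 1]))
    return out


def peak_normalize(samples):
    """Scale samples so the largest absolute value becomes 1.0."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]


SAMPLE_RATE = 32_000  # Hz, as stated for the training audio
CUTOFF_HZ = 150.0     # hypothetical cutoff, not taken from the description

# A constant (DC) offset is the simplest low-frequency content to remove.
dc_signal = [1.0] * 2000
filtered = high_pass(dc_signal, SAMPLE_RATE, CUTOFF_HZ)
print(abs(filtered[-1]) < 1e-6)
```

A constant input decays toward zero after filtering, which is the behavior one expects from a high-pass stage meant to suppress low-frequency rumble.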
## Dataset Applications
The dataset is primarily used for bird sound recognition tasks, such as developing machine learning models that identify bird species in audio recordings. Participants train models on the audio and labels in the training data, then predict the species present in the unlabeled test recordings. The goal is accurate recognition of East African bird sounds, which in turn provides technical support for bird population monitoring and conservation.
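
Because each test soundscape is about 10 minutes long, a model usually has to process it in shorter pieces. The sketch below computes the sample count of one soundscape at 32 kHz and splits it into consecutive windows. The 5-second window length is a hypothetical parameter chosen for illustration; this description does not specify how predictions are segmented.

```python
SAMPLE_RATE = 32_000          # Hz, per the dataset description
SOUNDSCAPE_SECONDS = 10 * 60  # each test soundscape is about 10 minutes long


def window_bounds(total_samples, sample_rate, window_seconds):
    """Return (start, end) sample indices of consecutive non-overlapping windows."""
    step = int(window_seconds * sample_rate)
    if step <= 0:
        raise ValueError("window_seconds must be positive")
    return [
        (start, min(start + step, total_samples))
        for start in range(0, total_samples, step)
    ]


total = SAMPLE_RATE * SOUNDSCAPE_SECONDS
windows = window_bounds(total, SAMPLE_RATE, window_seconds=5)  # 5 s is an assumption
print(total, len(windows), windows[0], windows[-1])
```

Each `(start, end)` pair can then be used to slice the decoded audio array before it is passed to a classifier.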
## Evaluation Criteria
The competition metric is padded cmAP, a derivative of the macro-averaged average precision score as implemented in scikit-learn. Before scoring, each submission and solution is padded so that predictions for species with zero true positive labels can be accepted and so that species with very few positive labels have less impact on the score.
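
The padding idea can be sketched in plain Python. This is not the official scoring code: the number of padding rows (5) and the choice to pad both labels and scores with ones are assumptions, since this description does not give them. `average_precision` follows the same step-wise definition that scikit-learn's `average_precision_score` uses, with tied scores sharing one threshold. The species keys and scores are made up.

```python
def average_precision(y_true, y_score):
    """Step-wise AP: sum over thresholds of (recall gain) * precision."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("average precision is undefined without positive labels")
    # Visit samples by descending score; tied scores form a single threshold.
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    ap, tp, prev_recall = 0.0, 0, 0.0
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and y_score[order[j]] == y_score[order[i]]:
            tp += y_true[order[j]]
            j += 1
        recall = tp / total_pos
        precision = tp / j  # j samples have a score at or above this threshold
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


def padded_cmap(solution, submission, padding_rows=5):
    """Macro-average AP after prepending padding_rows true positives per species."""
    per_species = []
    for species, labels in solution.items():
        padded_labels = [1] * padding_rows + list(labels)
        padded_scores = [1.0] * padding_rows + list(submission[species])
        per_species.append(average_precision(padded_labels, padded_scores))
    return sum(per_species) / len(per_species)


# Made-up labels and scores; "sp_b" has no true positives at all.
solution = {"sp_a": [1, 0, 0], "sp_b": [0, 0, 0], "sp_c": [0, 1]}
submission = {"sp_a": [0.9, 0.2, 0.1], "sp_b": [0.8, 0.1, 0.3], "sp_c": [0.9, 0.4]}
score = padded_cmap(solution, submission)
print(round(score, 4))
```

Without padding, `sp_b` could not be scored at all, and the single ranking mistake on `sp_c` would halve that species' average precision. With padding, every species has positives and one mistake on a rare species moves the macro average much less.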
Provider:
Alibaba Cloud Tianchi
Created:
2025-04-24
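Background Overview
The BirdCLEF2023 dataset comes from a Kaggle challenge focused on identifying bird species by sound, in support of bioacoustics research and bird conservation. It contains about 16,900 audio recordings of 264 bird species from Kenya, Africa. The training data has been preprocessed (for example, resampled and converted to a different format), while the test data consists of complex soundscape recordings that make recognition harder. The dataset is mainly used to develop machine learning models for bird sound classification. The evaluation metric is padded cmAP, and the aim is to improve model performance in real-world environments.
The above content was collected and summarized by 遇见数据集.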



