Bird Sound and Noise Learning Dataset
ModelScope community (魔搭社区) | Updated 2025-11-28 | Indexed 2024-05-15
Download link:
https://modelscope.cn/datasets/aoweichen/BirdDKY
Resource overview:
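## Dataset Description
This dataset was jointly created by the State Grid group of the Dian Team and the China Electric Power Research Institute of the State Grid Corporation of China. It targets the separation of bird sounds from the noise produced by transformers and reactors inside substations.
### Dataset Introduction
The dataset is used to train and test the Mossformer model on the task of separating bird sounds from transformer and reactor noise inside substations.
## Dataset Format and Structure
The audio is stored as WAV files, organized in the following directory structure: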
```
audio
├── cv
│   ├── mix
│   ├── s1
│   └── s2
├── tr
│   ├── mix
│   ├── s1
│   └── s2
└── tt
    ├── mix
    ├── s1
    └── s2
```
Here s1 is the bird sound and s2 is the in-station noise.
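A minimal sketch of how the layout above could be walked to pair each mixture with its two sources. It assumes that a mixture in `mix` and its sources in `s1` and `s2` share the same file name, which the dataset card does not state; `Triplet` and `iter_triplets` are hypothetical helper names introduced for illustration.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class Triplet(NamedTuple):
    mix: Path  # mixture (assumed to contain both sources)
    s1: Path   # bird sound
    s2: Path   # in-station noise


def iter_triplets(root: Path, split: str) -> Iterator[Triplet]:
    """Yield (mix, s1, s2) paths for one split directory: 'tr', 'cv' or 'tt'.

    Assumption: a mixture and its two sources share the same file name.
    """
    split_dir = Path(root) / split
    for mix_path in sorted((split_dir / "mix").glob("*.wav")):
        s1_path = split_dir / "s1" / mix_path.name
        s2_path = split_dir / "s2" / mix_path.name
        # Skip mixtures whose sources are missing instead of failing.
        if s1_path.is_file() and s2_path.is_file():
            yield Triplet(mix_path, s1_path, s2_path)
```

Calling `list(iter_triplets(Path("audio"), "tr"))` would then return one entry per complete training example, provided the naming assumption holds for the downloaded files.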
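### Dataset Loading Method
Detailed instructions for loading and using the dataset through the MaaS/Dataset SDK, including code examples, are to be provided here.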
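The card gives no loading code, so the following is only a sketch of what a load through the ModelScope dataset SDK might look like. The dataset name `BirdDKY` and namespace `aoweichen` are taken from the download URL; the split name passed to `MsDataset.load` is an assumption, as is the availability of named splits for this dataset. `load_birddky` is a hypothetical wrapper, and running it requires the `modelscope` package and network access.

```python
def load_birddky(split: str = "train"):
    """Load one split of BirdDKY from ModelScope.

    Assumptions: the dataset exposes a split with the given name, and
    the `modelscope` package is installed (`pip install modelscope`).
    """
    # Imported lazily so that defining this function does not require modelscope.
    from modelscope.msdatasets import MsDataset

    # Name and namespace come from https://modelscope.cn/datasets/aoweichen/BirdDKY
    return MsDataset.load("BirdDKY", namespace="aoweichen", split=split)
```

If the split names differ from `train`, `test`, and `validation`, downloading the files and reading the `tr`, `cv`, and `tt` directories directly may be the simpler route.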
### Data Splits
The dataset is pre-split into three subsets: train, test, and validation. Their sizes are:
```
Training set size: 4000
Test set size: 400
Validation set size: 400
```
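After downloading, the stated sizes could be checked against the files on disk with a sketch like the one below. Mapping the `tr`, `cv`, and `tt` directories to the train, validation, and test splits is an assumption based on the usual naming convention for separation datasets; the card does not spell it out. `count_mixtures` and `size_mismatches` are hypothetical helper names.

```python
from pathlib import Path

# Split sizes as stated in the dataset card. Mapping tr/cv/tt to
# train/validation/test is an assumption, not something the card states.
EXPECTED_SIZES = {"tr": 4000, "cv": 400, "tt": 400}


def count_mixtures(root: Path) -> dict:
    """Count WAV files in <root>/<split>/mix for each split directory."""
    return {
        split: sum(1 for _ in (Path(root) / split / "mix").glob("*.wav"))
        for split in EXPECTED_SIZES
    }


def size_mismatches(root: Path) -> dict:
    """Return {split: (found, expected)} for every split whose count differs."""
    found = count_mixtures(root)
    return {
        split: (found[split], expected)
        for split, expected in EXPECTED_SIZES.items()
        if found[split] != expected
    }
```

An empty result from `size_mismatches(Path("audio"))` would indicate that the download matches the sizes listed above.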
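## Dataset Generation Information
### Raw Data
The raw data was collected by the relevant project team at the China Electric Power Research Institute of the State Grid Corporation of China, then normalized and resampled.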
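The card does not say which normalization or resampling method was used, nor the target sampling rate. Purely as an illustration of what these two steps mean, the sketch below uses peak normalization and linear-interpolation resampling; both choices are assumptions and are not claimed to be the authors' actual preprocessing.

```python
from typing import List, Sequence


def peak_normalize(samples: Sequence[float], peak: float = 1.0) -> List[float]:
    """Scale samples so that the largest absolute value equals `peak`."""
    max_abs = max((abs(x) for x in samples), default=0.0)
    if max_abs == 0.0:
        return list(samples)  # silence: nothing to scale
    return [x * peak / max_abs for x in samples]


def resample_linear(samples: Sequence[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample by linear interpolation (no anti-aliasing filter is applied)."""
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate  # fractional position in the source signal
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

In practice a resampler with a proper low-pass filter would be preferable when downsampling, since plain interpolation can introduce aliasing.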
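## Other Relevant Information
The dataset supports the algorithm research of a joint project between the Dian Team and the China Electric Power Research Institute of the State Grid Corporation of China, titled "变电站厂界背景噪声剔除与智能分离算法研究" (Research on Background Noise Removal and Intelligent Separation Algorithms for Substation Boundary Noise).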
Provider:
maas
Created:
2023-04-26
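Collected Summary
Dataset Introduction

Background and Challenges
Background Overview
This audio dataset is built for separating bird sounds from transformer and reactor noise inside substations. It was produced jointly by the Dian Team and the China Electric Power Research Institute of the State Grid Corporation of China, consists of WAV files, and is split into a training set (4000 clips), a test set (400 clips), and a validation set (400 clips). It is intended for training and testing the Mossformer model.
The content above was collected and summarized by 遇见数据集.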



