
nccratliri/vad-bengalese-finch

Source: Hugging Face. Updated: 2023-10-03. Indexed: 2024-03-04.
Download link:
https://hf-mirror.com/datasets/nccratliri/vad-bengalese-finch
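Resource summary:
This dataset is dedicated to voice activity detection (VAD) for the Bengalese finch (a bird species) and is part of the WhisperSeg project. WhisperSeg uses the pre-trained Whisper Transformer for human and animal voice activity detection: it processes the entire spectrogram of long audio and generates text representations of the onset, offset, and type of each voice activity. Processing a longer audio context with a larger network substantially improves detection accuracy, particularly when labeled data is scarce. The method also shows positive transfer of detection performance to new animal species, making it viable in data-scarce multi-species settings.
Provider:
nccratliri
Summary of original information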

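Dataset overview

Dataset name

Bengalese finch dataset

Dataset purpose

Training and evaluating the WhisperSeg model for animal voice activity detection (vocal segmentation).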
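
To make the idea of describing voice activity as onset, offset, and type in plain text more concrete, here is a minimal sketch. The `Segment` class, the helper names, and the `onset offset label` text layout are all hypothetical illustrations; they are not WhisperSeg's actual output format or the annotation format of this dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One vocal segment: onset and offset in seconds, plus a type label."""
    onset: float
    offset: float
    label: str


def segments_to_text(segments):
    """Serialize segments as 'onset offset label' entries joined by ';'.

    This layout is a made-up illustration, not WhisperSeg's output format.
    """
    return ";".join(f"{s.onset:.3f} {s.offset:.3f} {s.label}" for s in segments)


def text_to_segments(text):
    """Parse the illustrative text layout back into Segment objects."""
    segments = []
    for entry in filter(None, (part.strip() for part in text.split(";"))):
        onset, offset, label = entry.split(maxsplit=2)
        segments.append(Segment(float(onset), float(offset), label))
    return segments
```

Round-tripping a list of segments through `segments_to_text` and `text_to_segments` should return the same segments, as long as the times have at most millisecond precision.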

Dataset download

```python
from huggingface_hub import snapshot_download

# Download the full dataset repository into a local directory.
snapshot_download(
    "nccratliri/vad-bengalese-finch",
    local_dir="data/bengalese-finch",
    repo_type="dataset",
)
```
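
Once the download has finished, a quick inventory of the local directory can confirm that files actually arrived. The sketch below assumes the `data/bengalese-finch` directory used in the download call; `count_files_by_extension` is a hypothetical helper, and because this page does not state which file types the dataset contains, it simply counts whatever it finds (including any hidden metadata files).

```python
from collections import Counter
from pathlib import Path


def count_files_by_extension(root):
    """Return a Counter mapping lowercase file extension to file count under root."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} is not a directory; run the download first")
    # Files without an extension are grouped under the key "<none>".
    return Counter(
        p.suffix.lower() or "<none>" for p in root.rglob("*") if p.is_file()
    )


if __name__ == "__main__":
    local_dir = Path("data/bengalese-finch")  # same path as in the download call
    if local_dir.is_dir():
        for ext, n in sorted(count_files_by_extension(local_dir).items()):
            print(f"{ext}: {n}")
```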

Citation

When using this dataset, please cite the following references:

@article {10.7554/eLife.68837,
  article_type = {journal},
  title = {Fast and accurate annotation of acoustic signals with deep neural networks},
  author = {Steinfath, Elsa and Palacios-Muñoz, Adrian and Rottschäfer, Julian R and Yuezak, Deniz and Clemens, Jan},
  editor = {Calabrese, Ronald L and Egnor, SE Roian and Troyer, Todd},
  volume = 10,
  year = 2021,
  month = {nov},
  pub_date = {2021-11-01},
  pages = {e68837},
  citation = {eLife 2021;10:e68837},
  doi = {10.7554/eLife.68837},
  url = {https://doi.org/10.7554/eLife.68837},
  abstract = {Acoustic signals serve communication within and across species throughout the animal kingdom. Studying the genetics, evolution, and neurobiology of acoustic communication requires annotating acoustic signals: segmenting and identifying individual acoustic elements like syllables or sound pulses. To be useful, annotations need to be accurate, robust to noise, and fast. We here introduce \textit{DeepAudioSegmenter} (\textit{DAS)}, a method that annotates acoustic signals across species based on a deep-learning derived hierarchical presentation of sound. We demonstrate the accuracy, robustness, and speed of \textit{DAS} using acoustic signals with diverse characteristics from insects, birds, and mammals. \textit{DAS} comes with a graphical user interface for annotating song, training the network, and for generating and proofreading annotations. The method can be trained to annotate signals from new species with little manual annotation and can be combined with unsupervised methods to discover novel signal types. \textit{DAS} annotates song with high throughput and low latency for experimental interventions in realtime. Overall, \textit{DAS} is a universal, versatile, and accessible tool for annotating acoustic communication signals.},
  keywords = {acoustic communication, annotation, song, deep learning, bird, fly},
  journal = {eLife},
  issn = {2050-084X},
  publisher = {eLife Sciences Publications, Ltd},
}

@article {Gu2023.09.30.560270,
  author = {Nianlong Gu and Kanghwi Lee and Maris Basha and Sumit Kumar Ram and Guanghao You and Richard Hahnloser},
  title = {Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection},
  elocation-id = {2023.09.30.560270},
  year = {2023},
  doi = {10.1101/2023.09.30.560270},
  publisher = {Cold Spring Harbor Laboratory},
  abstract = {This paper introduces WhisperSeg, utilizing the Whisper Transformer pre-trained for Automatic Speech Recognition (ASR) for human and animal Voice Activity Detection (VAD). Contrary to traditional methods that detect human voice or animal vocalizations from a short audio frame and rely on careful threshold selection, WhisperSeg processes entire spectrograms of long audio and generates plain text representations of onset, offset, and type of voice activity. Processing a longer audio context with a larger network greatly improves detection accuracy from few labeled examples. We further demonstrate a positive transfer of detection performance to new animal species, making our approach viable in the data-scarce multi-species setting. Competing Interest Statement: The authors have declared no competing interest.},
  URL = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270},
  eprint = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270.full.pdf},
  journal = {bioRxiv}
}
