nccratliri/vad-mouse

Name: nccratliri/vad-mouse
Creator: nccratliri
Published: 2023-10-03 07:10:42
License: no description provided

Source: Hugging Face. Updated 2023-10-03; indexed 2024-03-04.

Download link:

https://hf-mirror.com/datasets/nccratliri/vad-mouse


Resource overview:

This dataset is customized for WhisperSeg and is intended for animal voice activity detection, specifically the segmentation of mouse vocalizations. WhisperSeg uses the pre-trained Whisper Transformer to perform voice activity detection (VAD) on both human and animal vocalizations.

Provided by:

nccratliri

Summary of original information

Dataset overview

Dataset name

Mouse dataset

Dataset purpose

Used for animal voice activity detection (vocal segmentation), in particular with the WhisperSeg model.

Dataset download
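As a rough illustration of what vocal segmentation produces, the sketch below models each detected vocalization as an onset, an offset, and a type label, the three quantities WhisperSeg is described as predicting. The `Segment` class, the helper functions, and the example values are hypothetical; they are not part of the dataset or of the WhisperSeg API, and the helpers assume that segments do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One detected vocalization (hypothetical structure for illustration)."""
    onset: float   # start time in seconds
    offset: float  # end time in seconds
    label: str     # type of voice activity


def total_vocal_time(segments):
    """Sum the durations of all segments, assuming they do not overlap."""
    return sum(s.offset - s.onset for s in segments)


def vocal_fraction(segments, audio_duration):
    """Fraction of the recording covered by vocal activity."""
    if audio_duration <= 0:
        raise ValueError("audio_duration must be positive")
    return total_vocal_time(segments) / audio_duration


# Made-up example values, not taken from the dataset.
example = [Segment(0.5, 0.75, "vocal"), Segment(1.0, 1.5, "vocal")]
```

For the made-up `example` above, `total_vocal_time(example)` adds 0.25 s and 0.5 s of vocal activity, and `vocal_fraction(example, 3.0)` relates that total to a 3-second recording.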

    # Download the full dataset repository into data/mouse
    from huggingface_hub import snapshot_download

    snapshot_download("nccratliri/vad-mouse", local_dir="data/mouse", repo_type="dataset")
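After the download finishes, it can help to check what actually landed on disk. The following is a minimal sketch that counts files by extension under the download directory. The helper name `count_files_by_extension` is hypothetical, and the file types it will report are not documented on this page, so no particular layout is assumed. Hidden directories are skipped on the assumption that a download tool may leave cache metadata there.

```python
from collections import Counter
from pathlib import Path


def count_files_by_extension(root):
    """Count regular files under `root` by lowercase extension, skipping hidden paths."""
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"not a directory: {root_path}")
    counts = Counter()
    for path in root_path.rglob("*"):
        relative_parts = path.relative_to(root_path).parts
        # Skip anything inside a hidden directory (assumed to be cache metadata).
        if any(part.startswith(".") for part in relative_parts):
            continue
        if path.is_file():
            counts[path.suffix.lower() or "<no extension>"] += 1
    return dict(counts)


if __name__ == "__main__":
    # "data/mouse" is the local_dir used in the download call above.
    if Path("data/mouse").is_dir():
        print(count_files_by_extension("data/mouse"))
```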

Citation

When using this dataset, please cite the following references:

@article {10.7554/eLife.68837,
  article_type = {journal},
  title = {Fast and accurate annotation of acoustic signals with deep neural networks},
  author = {Steinfath, Elsa and Palacios-Muñoz, Adrian and Rottschäfer, Julian R and Yuezak, Deniz and Clemens, Jan},
  editor = {Calabrese, Ronald L and Egnor, SE Roian and Troyer, Todd},
  volume = 10,
  year = 2021,
  month = {nov},
  pub_date = {2021-11-01},
  pages = {e68837},
  citation = {eLife 2021;10:e68837},
  doi = {10.7554/eLife.68837},
  url = {https://doi.org/10.7554/eLife.68837},
  abstract = {Acoustic signals serve communication within and across species throughout the animal kingdom. Studying the genetics, evolution, and neurobiology of acoustic communication requires annotating acoustic signals: segmenting and identifying individual acoustic elements like syllables or sound pulses. To be useful, annotations need to be accurate, robust to noise, and fast. We here introduce \textit{DeepAudioSegmenter} (\textit{DAS}), a method that annotates acoustic signals across species based on a deep-learning derived hierarchical presentation of sound. We demonstrate the accuracy, robustness, and speed of \textit{DAS} using acoustic signals with diverse characteristics from insects, birds, and mammals. \textit{DAS} comes with a graphical user interface for annotating song, training the network, and for generating and proofreading annotations. The method can be trained to annotate signals from new species with little manual annotation and can be combined with unsupervised methods to discover novel signal types. \textit{DAS} annotates song with high throughput and low latency for experimental interventions in realtime. Overall, \textit{DAS} is a universal, versatile, and accessible tool for annotating acoustic communication signals.},
  keywords = {acoustic communication, annotation, song, deep learning, bird, fly},
  journal = {eLife},
  issn = {2050-084X},
  publisher = {eLife Sciences Publications, Ltd},
}

@article {Gu2023.09.30.560270,
  author = {Nianlong Gu and Kanghwi Lee and Maris Basha and Sumit Kumar Ram and Guanghao You and Richard Hahnloser},
  title = {Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection},
  elocation-id = {2023.09.30.560270},
  year = {2023},
  doi = {10.1101/2023.09.30.560270},
  publisher = {Cold Spring Harbor Laboratory},
  abstract = {This paper introduces WhisperSeg, utilizing the Whisper Transformer pre-trained for Automatic Speech Recognition (ASR) for human and animal Voice Activity Detection (VAD). Contrary to traditional methods that detect human voice or animal vocalizations from a short audio frame and rely on careful threshold selection, WhisperSeg processes entire spectrograms of long audio and generates plain text representations of onset, offset, and type of voice activity. Processing a longer audio context with a larger network greatly improves detection accuracy from few labeled examples. We further demonstrate a positive transfer of detection performance to new animal species, making our approach viable in the data-scarce multi-species setting. Competing Interest Statement: The authors have declared no competing interest.},
  URL = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270},
  eprint = {https://www.biorxiv.org/content/early/2023/10/02/2023.09.30.560270.full.pdf},
  journal = {bioRxiv}
}

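Collected summary

Dataset introduction

Background and challenges

Background overview

vad-mouse is a small dataset for animal voice activity detection. It is built around the WhisperSeg method and is suited to cross-species vocal detection research. The dataset contains 6 audio recordings, split into a training set and a test set, with a total duration of 162 seconds, and is released under the Apache 2.0 license.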
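The reported figures of 6 recordings and 162 seconds can be checked against a local copy. The sketch below sums the durations of all WAV files under a directory using Python's standard `wave` module. It assumes the recordings are stored as PCM `.wav` files, which this page does not confirm; the function names are hypothetical, and other audio formats would need a different reader.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Duration of one PCM WAV file: number of frames divided by the sample rate."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def total_duration_seconds(root):
    """Total duration of every .wav file found recursively under `root`."""
    wav_paths = sorted(Path(root).rglob("*.wav"))
    return sum(wav_duration_seconds(p) for p in wav_paths)


if __name__ == "__main__":
    # "data/mouse" is an assumed download location; adjust as needed.
    if Path("data/mouse").is_dir():
        print(f"total duration: {total_duration_seconds('data/mouse'):.1f} s")
```

If the total differs noticeably from 162 seconds, the most likely explanations are a different audio format or file extension than assumed here, or an incomplete download.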

The content above was collected and summarized by 遇见数据集.
