Rana sierrae annotated aquatic soundscapes (2022)

Mendeley Data. Updated 2024-04-13; indexed 2024-06-27

Download link:

https://datadryad.org/stash/dataset/doi:10.5061/dryad.9s4mw6mn3


Resource description:

# Rana sierrae annotated soundscape recordings

Audio files and Raven annotations for Rana sierrae (Sierra Nevada yellow-legged frog) vocalization types, from aquatic soundscape recordings.

Name: `rana_sierrae_2022`

Version: 1.0

This dataset is associated with the following manuscript, which provides details on the methodology of data collection and annotation:

Lapp, S., Smith, T. C., Wilhelm, A., Knapp, R., Kitzes, J. In press. Aquatic soundscape recordings reveal diverse vocalizations and nocturnal activity of an endangered frog. The American Naturalist.

## Data and file structure

This dataset contains audio files annotated in Raven Pro for Rana sierrae vocalization types. Frequency-time boxes were drawn around all sounds identifiable as vocalizations of Rana sierrae. Five vocalization types are annotated and are described in the associated manuscript. In total, the dataset contains 1236 annotations of Rana sierrae vocalizations. Other sounds in the aquatic soundscape include stridulations most likely produced by members of the family Corixidae or other aquatic invertebrates.

### Audio

Audio files are provided in mp3 format in the `mp3` subdirectory. The dataset contains 672 10-second files. These are the first 10 seconds of each audio file recorded during the week of June 20-26, 2022 on one device (Device 3 in the associated paper; north corner of the lake; this device had the highest activity level). The AudioMoth 1.2.0 recorder, housed in an underwater case, recorded 1 minute of audio starting every 15 minutes, 24 hours per day, resulting in 24 × 4 × 7 = 672 audio recordings.

### Annotations

Files were annotated by Sam Lapp using Raven Pro with closed-back headphones while viewing the spectrogram. Only calls that could be both heard and seen on the spectrogram were annotated. Multiple vocalizations were included in a single annotation box if they were separated by greater than 1 second of intervening time without vocalizations of the same type.

Labels correspond to the associated manuscript:

- A: primary vocalization
- B: stuttered vocalization described in Vredenburg et al.
- C: chuck, double/triple chuck calls
- D: short downward single note
- E: frequency-modulated call
- X: could not determine if sound is R. sierrae or not; these are excluded from training and validation of the CNN

### Raven annotation files

The subdirectory `raven_selection_tables` contains one Raven-formatted text file (tab-separated values) for each audio file. The file `audio_and_raven_files.csv` is a table that lists each audio file and the corresponding Raven annotation file. The filename of each annotation file matches the corresponding audio file (for instance, `sine2022a_MSD-0558_20220620_000000_0-10s.mp3` and `sine2022a_MSD-0558_20220620_000000_0-10s.Table.1.selections.txt`).

### One-hot labels

This dataset also includes the file `labels_2s.csv`, which contains one-hot labels (0/1 per class per audio clip) for 2-second segments of audio. To generate these labels, we considered R. sierrae vocalizations to be present in a 2-second sample if any R. sierrae annotation overlapped with the sample by at least 0.2 seconds or if greater than 50% of an annotation box overlapped in time with the sample. A notebook in the associated [GitHub repository](https://github.com/kitzeslab/rana-sierrae-cnn) documents how the Raven annotations were converted to one-hot labels.

## Sharing/Access information

This data is publicly available here. Associated scripts for data analysis are located in a [GitHub repository](https://github.com/kitzeslab/rana-sierrae-cnn).

-Sam Lapp, November 2023
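
The presence rule described under "One-hot labels" can be illustrated with a minimal sketch. This is not the authors' conversion code (that is documented in the notebook in the GitHub repository). It assumes non-overlapping 2-second windows over a 10-second file and a single "present/absent" class, and the names `overlap`, `is_present`, and `label_file` are hypothetical helpers introduced only for illustration.

```python
from typing import List, Tuple

CLIP_LEN = 2.0      # seconds per labeled segment (stated in the README)
MIN_OVERLAP = 0.2   # absolute overlap threshold in seconds (stated in the README)
MIN_FRACTION = 0.5  # fraction of the annotation box that must fall in the clip

Interval = Tuple[float, float]  # (start_time_s, end_time_s) of one annotation box


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the time overlap between two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def is_present(clip_start: float, clip_end: float, annotations: List[Interval]) -> bool:
    """Apply the README rule: at least 0.2 s overlap, or more than 50% of the box."""
    for ann_start, ann_end in annotations:
        shared = overlap(clip_start, clip_end, ann_start, ann_end)
        duration = ann_end - ann_start
        if shared >= MIN_OVERLAP:
            return True
        if duration > 0 and shared / duration > MIN_FRACTION:
            return True
    return False


def label_file(annotations: List[Interval], file_len: float = 10.0) -> List[int]:
    """Return one 0/1 label per consecutive 2 s window (windowing is an assumption)."""
    n_clips = int(file_len // CLIP_LEN)
    return [
        int(is_present(i * CLIP_LEN, (i + 1) * CLIP_LEN, annotations))
        for i in range(n_clips)
    ]


if __name__ == "__main__":
    # Made-up annotation times, chosen to exercise both branches of the rule.
    example = [(1.9, 2.5), (4.0, 4.1), (7.95, 8.1)]
    print(label_file(example))
```

In this made-up example, the short box at 4.0-4.1 s is counted as present only through the 50% criterion, because its total duration is below 0.2 seconds. The second criterion exists for exactly this case: very short calls would otherwise never satisfy the absolute overlap threshold.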

Created:

2023-11-23
