Worldwide Soundscapes project metadata and analysis scripts
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/6486835
Description:
The Worldwide Soundscapes project is a global, open inventory of spatio-temporally replicated passive acoustic monitoring meta-datasets (i.e. meta-data collections). This Zenodo entry comprises the data tables that constitute its (meta-)database, as well as their description. Additionally, R scripts are provided to replicate the analysis published in [placeholder].
The overview of all sampling sites and timelines can be found on the corresponding project on ecoSound-web, as well as a demonstration collection containing selected recordings. The recordings of this collection were annotated and analysed to explore macro-ecological trends.
The audio recording criteria justifying inclusion into the meta-database are:
Stationary (no transects, towed sensors or microphones mounted on cars)
Passive (unattended, no human disturbance by the recordist)
Ambient (no directional microphone or triggered recordings, non-experimental conditions)
Spatially and/or temporally replicated (i.e. multiple sites sampled at the same time and/or multiple days - covering the same daytime - sampled at the same site)
The individual columns of the provided data tables are described below. Data tables are linked through primary keys; joining them yields the full database. The data shared here include only validated collections.
Changes from version 3.0.1
Added files needed to reproduce the metadata and the acoustic analyses found in the publication.
Dropped underused fields: spatial_selection, temporal_exclusion, freshwater_recordist_position from collections table; secondary realm, biome, and functional group from sites table.
Meta-database CSV files
collections
collection_id: unique integer, primary key
name: name of the dataset. If it is repeated, incremental integers should be used in the "subset" column to differentiate them.
ecoSound-web_link: link of validated meta-collection on ecoSound-web
primary_contributors: full names of people deemed corresponding contributors who are responsible for the dataset
secondary_contributors: full names of people who are not primary contributors but who have significantly contributed to the dataset, and who could be contacted for in-depth analyses
date_added: when the dataset was added (YYYY-MM-DD)
URL_open_recordings: internet link for openly-available recordings from this collection
URL_project: internet link for further information about the corresponding project
DOI_publication: Digital Object Identifiers of corresponding publications
core_realm_IUCN: The main, core realm of the dataset according to IUCN Global Ecosystem Typology (v2.0): https://global-ecosystems.org/
medium: the physical medium the microphone is situated in
locality: optional free text about the locality
contributor_comments: free-text field for comments by the primary contributors
collections-sites
dataset_ID: primary key of collections table
site_ID: primary key of sites table
sites
site_ID: unique integer, primary key
site_name: internal name or code of sampling site as used in respective projects
latitude_numeric: site's numeric degrees of latitude
longitude_numeric: site's numeric degrees of longitude
blurred_coordinates: whether latitude and longitude coordinates are inaccurate, boolean. Coordinates may be blurred with random offsets, rounding, snapping, etc. Indicate the blurring method inside the comments field
topography_m: vertical position of the microphone relative to sea level, in meters. For sites on land: elevation. For marine sites: depth (negative). Only indicate if the values were measured by the collaborator.
freshwater_depth_m: microphone depth, only used for sites inside freshwater bodies that also have an elevation value above the sea level
realm: Ecosystem type: main realm according to IUCN GET https://global-ecosystems.org/
biome: Ecosystem type: main biome according to IUCN GET https://global-ecosystems.org/
functional_group: Ecosystem type: main functional group according to IUCN GET https://global-ecosystems.org/
contributor_comments: free text field for contributor comments
GADM_0: Global ADMinistrative Database level 0 classification of terrestrial site or marine site that is within territorial waters. Source: https://gadm.org/download_world.html
IHO: International Hydrographic Organization classification of marine site. Source: https://marineregions.org/downloads.php
WDPA: World Database on Protected Areas classification of the site. Source: https://www.protectedplanet.net/en/thematic-areas/wdpa?tab=WDPA
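The link between collections and sites described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the project's own R workflow. It assumes that dataset_ID in the collections-sites table refers to collection_id in the collections table; the in-memory rows and the helper names (read_rows, join_collections_sites) are invented for the example.

```python
import csv
import io

# Tiny in-memory stand-ins for the CSV files; all rows are invented.
COLLECTIONS_CSV = """collection_id,name
1,Example forest survey
2,Example reef survey
"""
LINKS_CSV = """dataset_ID,site_ID
1,10
1,11
2,12
"""
SITES_CSV = """site_ID,site_name,latitude_numeric,longitude_numeric
10,F1,1.5,103.8
11,F2,1.6,103.9
12,R1,-18.2,147.7
"""


def read_rows(text):
    """Read CSV text into a list of dicts (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def join_collections_sites(collections, links, sites):
    """Inner join: one dict per (collection, site) pair in the link table.

    Column names are prefixed so that fields present in both tables
    (for example contributor_comments) do not overwrite each other.
    """
    collections_by_id = {row["collection_id"]: row for row in collections}
    sites_by_id = {row["site_ID"]: row for row in sites}
    joined = []
    for link in links:
        # Assumption: dataset_ID holds the collection_id of the collection.
        collection = collections_by_id.get(link["dataset_ID"])
        site = sites_by_id.get(link["site_ID"])
        if collection is None or site is None:
            continue  # skip link rows whose keys have no match
        row = {f"collection.{key}": value for key, value in collection.items()}
        row.update({f"site.{key}": value for key, value in site.items()})
        joined.append(row)
    return joined


rows = join_collections_sites(
    read_rows(COLLECTIONS_CSV), read_rows(LINKS_CSV), read_rows(SITES_CSV)
)
for row in rows:
    print(row["collection.name"], row["site.site_name"])
```

With the real files, the three strings would be replaced by the contents of the corresponding CSV tables; the join logic stays the same.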
deployments
dataset_ID: primary key of datasets table
deployment: identical subscript letters to denote rows that belong to the same deployment. For instance, you may use different operation times and schedules for different target taxa within one deployment.
subset_site_ID: If the deployment was not done in all the sites of the corresponding collection, site IDs where the deployment was conducted
start_date: date of deployment start
start_time_mixed: deployment start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset). Corresponds to the recording start time for continuous recording deployments. If multiple start times were used, you should mention the latest start time (corresponds to the earliest daytime from which all recorders are active). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
permanent: whether the deployment is permanent, boolean
end_date: date of deployment end (date when last scheduled operation starts)
end_time_mixed: deployment end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Corresponds to the recording end time for continuous recording deployments.
operation_mode: one of "continuous" (recording takes place from the deployment start date-time to the deployment end date-time), "periodical" (recording takes place periodically, i.e., with a duty cycle, from the deployment start date-time to the deployment end date-time), or "scheduled" (recording takes place during scheduled daily time intervals, optionally with a duty cycle)
duty_cycle_minutes: duty cycle of the recording (i.e. the fraction of minutes when it is recording), written as "recording(minutes)/period(minutes)". empty if no duty cycle is used. For example: "1/6" if the recorder is active for 1 minute and standing by for 5 minutes
operation_start_time_mixed: only for scheduled recordings: start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
operation_duration_minutes: only for scheduled recordings: duration of operation in minutes, if constant
operation_end_time_mixed: only for scheduled recordings: end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Only required if durations are variable. Do not use when end times are ambiguous (for instance, if a recording could be 1 hour or 25 hours long because the end is on the next day). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
high_pass_filter_Hz: frequency of the high-pass filter of the recorder if applied, in Hz. Otherwise, write "none". This may be called a "low-cut" filter too.
bit_depth: sampling bit depth of the recordings. Often constant for a particular recorder
channels: number of recorded audio channels
sampling_frequency_kHz: frequency at which the microphone signal was sampled by the recorder, in kHz (sounds up to half that frequency, the Nyquist frequency, can be recorded)
recorder: recorder used for deployment
microphone: microphone used for deployment
target_taxa: main IUCN animal taxa that were studied with this deployment, using the exact IUCN Red list names (http://www.iucnredlist.org/), separated by commas. Only genera, families, orders, and classes are accepted. Empty if there was no taxonomic focus (i.e., general soundscapes were the study focus).
contributor_comments: free text field for contributor comments
exact_recordings: whether the deployment data here have been superseded by inserting more exact recording date-time ranges into the meta-collection on ecoSound-web
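The "mixed" time fields and the duty_cycle_minutes field of the deployments table use small text formats that need parsing before analysis. The sketch below is one possible reading of those formats, not the project's own code. It assumes that solar offsets are whole minutes written directly after the solar daytime (as in "sunrise-60"), and that all four solar daytimes are accepted in every mixed field; the function names are hypothetical.

```python
import re
from fractions import Fraction

_CLOCK = re.compile(r"^(\d{1,2}):(\d{2})$")
_SOLAR = re.compile(r"^(sunrise|sunset|noon|midnight)([+-]\d+)?$")


def parse_mixed_time(value):
    """Parse 'HH:MM' or a solar daytime with an optional minute offset.

    Returns ('clock', minutes_after_midnight) for clock times and
    ('solar', event, offset_minutes) for solar daytimes.
    """
    text = value.strip().lower()
    match = _CLOCK.match(text)
    if match:
        hours, minutes = int(match.group(1)), int(match.group(2))
        if hours > 23 or minutes > 59:
            raise ValueError(f"not a valid clock time: {value!r}")
        return ("clock", hours * 60 + minutes)
    match = _SOLAR.match(text)
    if match:
        # A missing offset means the solar event itself.
        offset = int(match.group(2)) if match.group(2) else 0
        return ("solar", match.group(1), offset)
    raise ValueError(f"unrecognised mixed time: {value!r}")


def parse_duty_cycle(value):
    """Parse 'recording/period' (both in minutes).

    Returns (recording, period, recorded_fraction), or None when the
    field is empty, which the description uses for 'no duty cycle'.
    """
    text = value.strip()
    if not text:
        return None
    recording_text, period_text = text.split("/")
    recording = Fraction(recording_text.strip())
    period = Fraction(period_text.strip())
    if recording <= 0 or period <= 0 or recording > period:
        raise ValueError(f"implausible duty cycle: {value!r}")
    return recording, period, recording / period


print(parse_mixed_time("sunrise-60"))
print(parse_mixed_time("06:30"))
print(parse_duty_cycle("1/6"))
```

Exact fractions are used so that "1/6" stays one sixth rather than a rounded decimal; converting a solar daytime to a clock time would additionally require the site coordinates and date, which is outside this sketch.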
recordings (partial download from ecoSound-web)
recording_id: primary key of the recordings table
collection_id: ID of the collection the recording belongs to
name: name of the recording
site_id: ID of the site the recording belongs to
recorder_id: ID of the recorder used for the recording (internal ecoSound-web code)
microphone_id: ID of the microphone used for the recording (internal ecoSound-web code)
recording_gain: recording gain applied for amplifying the audio signal, in decibels
duty_cycle_recording: fraction of the recording period when the recorder is actively recording audio
duty_cycle_period: period of the duty cycle, i.e., time between the starts of two subsequent recordings
note: comments (contains the target taxon)
file_date: date of the recording start
file_time: local time of the recording start
sampling_rate: audio sampling rate in Hz
bitdepth: depth in bits for each audio sample
channel_num: number of channels
duration: duration of the recording in seconds. Note: duty-cycled recordings cover only a proportion of this duration
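Note that the recordings table gives sampling_rate in Hz, whereas the deployments table gives sampling_frequency_kHz in kHz. The short sketch below harmonises the two units and derives the highest recordable frequency (half the sampling rate). The function names are hypothetical and the input values are illustrative only.

```python
def khz_to_hz(sampling_frequency_khz):
    """Convert a deployments-table value (kHz) to Hz."""
    return sampling_frequency_khz * 1000


def nyquist_hz(sampling_rate_hz):
    """Highest frequency (Hz) representable at the given sampling rate."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return sampling_rate_hz / 2


# recordings table: sampling_rate is already in Hz
print(nyquist_hz(44100))
# deployments table: sampling_frequency_kHz must be converted first
print(nyquist_hz(khz_to_hz(48)))
```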
affiliations
affiliation_id: primary key of affiliations table
lab_research_group: Laboratory or research group name
department_school_institute: department, school, or institute name
university_institution: University or institution name
street_address: street address
region_state_province_city: region, state, province, or city name
postal_code: postal code
country: country name
primary_contributors
First_name: First, given name, anonymised when contributor is technically accepted but has not yet given publication authorisation
Last_name: Last, family name, anonymised when contributor is technically accepted but has not yet given publication authorisation
ORCiD
affiliation_IDs: primary keys of the corresponding affiliations in the affiliations table, separated by commas
first_tier_position: Author position in first-tier
publication_agreement: Has contributor explicitly agreed to share her/his meta-data in the collaboration agreement?
co_author_first_synthesis: Has contributor confirmed co-authorship intention in the collaboration agreement?
The following columns describe the contributor's role in the project according to the CRediT taxonomy.
Auxiliary files for reproducing analysis
R scripts
acoustic analysis.R: reproduces the result of the soundscape case studies
metadata analysis.R: reproduces the metadata analysis results in the publication
Data from the demonstration collection (download from ecoSound-web)
demo_recordings.csv: metadata of the recordings, see recordings table
demo_sites.csv: metadata of the sampling locations, see sites table
demo_tags.csv: data describing annotations made in demonstration recordings for the biophony, anthropophony, geophony, and unknown sound sources
spectrograms.zip: contains PNG format spectrograms used in generating Figure 5
Externally sourced data
GET_areas_2.1.1.csv: raw data obtained from Keith et al. 2023 (https://doi.org/10.5281/zenodo.10081251), then summarized in QGIS to obtain areas per functional group
Havlik_sites.csv: data obtained from Havlik et al. 2022 supplementary material (https://www.frontiersin.org/articles/10.3389/fmars.2022.919418), originally named "Data Sheet 1.CSV"
Sugai_sites_updated.csv: data obtained from Sugai et al. 2019 (https://doi.org/10.1093/biosci/biy147), personal communication with permission
taxonomy.csv: raw data obtained from IUCN Red List for all animal taxa (https://www.iucnredlist.org/)
topography_range_latitude.csv: raw topography from GEBCO sub-ice data (https://www.gebco.net/data_and_products/gridded_bathymetry_data/), summarised by bins of 10 latitudinal rows
Created:
2025-03-25



