
Audio Commons Estimation Results Data for deliverables D4.4, D4.10 and D4.12

NIAID Data Ecosystem, indexed 2026-03-11
Download link:
https://zenodo.org/record/2546642
Description:
This dataset contains the results of running the automatic audio annotation algorithms for pitch, tempo and key that were used in the evaluation of the algorithms developed during the AudioCommons H2020 EU project, which are part of the Audio Commons Audio Extractor tool. It also includes estimation results for the single-eventness audio descriptor, which was developed for the same tool.

These estimation results have been used to generate the following documents:

Deliverable D4.4: Evaluation report on the first prototype tool for the automatic semantic description of music samples
Deliverable D4.10: Evaluation report on the second prototype tool for the automatic semantic description of music samples
Deliverable D4.12: Release of tool for the automatic semantic description of music samples

All these documents are available in the materials section of the AudioCommons website.

All data in this repository is provided as CSV files. Each CSV file contains the analysis results for one musical task on one of the individual datasets used in the aforementioned deliverables. This repository does not include the audio files of the individual datasets, but it does include references to them. The following paragraphs describe the structure of the CSV files and give some notes on how to obtain the audio files if they are needed.

Structure of the CSV files

All the CSV files in this repository (with the sole exception of SINGLE EVENT - Estimation Results.csv) are named according to the following convention: "DATASET_NAME - ESTIMATION_TASK Estimation Results.csv". Estimation results for the pitch, tempo and tonality tasks are therefore stored in separate files. All these files share the same structure for the first two CSV columns:

Audio reference: reference to the corresponding audio file. This will be either a string with the filename or the Freesound ID (for one dataset based on Freesound content). See below for details about how to obtain those files.
Audio reference type: one of Filename or Freesound ID; specifies how the previous column should be interpreted.

The rest of the columns contain the estimation results for each of the algorithms included in the evaluation of each music facet. Two columns are reserved for each algorithm: the first contains the actual estimation and the second contains the confidence of that estimation (see CSV file previews below). The format of the estimation depends on the musical task; see the description of the corresponding ground truth dataset for more information. The confidence value is a floating-point number, typically in the range from 0.0 to 1.0. One or both columns may be empty for a given algorithm and CSV row. This happens when the algorithm could not produce an estimation for the audio file referenced in that row.

The remaining CSV file, SINGLE EVENT - Estimation Results.csv, has the following 4 columns:

Freesound ID: sound ID used in Freesound to identify the audio clip.
ACExtractorV2: single-eventness estimation of the algorithm included in the second version of the Audio Commons Audio Extractor tool (bool).
ACExtractorV2-opt: single-eventness estimation of the algorithm included in the second version of the Audio Commons Audio Extractor tool with optimized parameters (bool).
ACExtractorV3: single-eventness estimation of the algorithm included in the third version of the Audio Commons Audio Extractor tool (bool).

How to get the audio data

This section provides some notes on how to obtain the audio files corresponding to the estimation results provided here. Note that, due to licensing restrictions, we are not allowed to redistribute the audio data corresponding to most of these automatic annotations.

Apple Loops (APPL): This dataset includes some of the music loops included in Apple's music software such as Logic or GarageBand. Access to these loops requires owning a license for the software. Detailed instructions about how to set up this dataset are provided here.
Carlos Vaquero Instruments Dataset (CVAQ): This dataset includes single instrument recordings carried out by Carlos Vaquero as part of this master's thesis. Sounds are available as Freesound packs and can be downloaded at this page: https://freesound.org/people/Carlos_Vaquero/packs
Freesound Loops 4k (FSL4): This dataset includes a selection of music loops taken from Freesound. Detailed instructions about how to set up this dataset are provided here.
Giant Steps Key Dataset (GSKY): This dataset includes a selection of previews from Beatport annotated by key. Audio and original annotations are available here.
Good-sounds Dataset (GSND): This dataset contains monophonic recordings of instrument samples. Full description, original annotations and audio are available here.
University of Iowa Musical Instrument Samples (IOWA): This dataset was created by the Electronic Music Studios of the University of Iowa and contains recordings of instrument samples. The dataset is available upon request by visiting this website.
Mixcraft Loops (MIXL): This dataset includes some of the music loops included in Acoustica's Mixcraft music software. Access to these loops requires owning a license for the software. Detailed instructions about how to set up this dataset are provided here.
NSynth Dataset Test and Validation sets (NSYT and NSYV): NSynth is a large-scale and high-quality dataset of annotated musical notes built with synthesized sounds by Google's Magenta team. Full dataset description including original annotations and audio files is available here.
Philharmonia Orchestra Sound Samples Dataset (PHIL): This dataset includes thousands of free, downloadable sound samples specially recorded by Philharmonia Orchestra players. Audio files are freely downloadable from the Philharmonia Orchestra website.
Freesound Single Events Dataset (SINGLE EVENT): This dataset includes a selection of Freesound audio clips representing audio signals that contain either a single audio event or multiple ones. Original audio files can be retrieved by downloading individual audio clips from Freesound using the ID provided in the CSV file. A procedure similar to that described here could be followed.
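The column layout of the per-task CSV files described above (two reference columns followed by one estimation/confidence pair per algorithm) can be read with a short script. The following is only a minimal sketch under stated assumptions: the function name `parse_results`, the `Estimate` type and the header names in the sample are hypothetical, and the script assumes that the first header of each column pair names the algorithm. The actual headers should be checked against the CSV files themselves.

```python
import csv
import io
from typing import Dict, Iterable, List, NamedTuple, Optional


class Estimate(NamedTuple):
    value: Optional[str]         # raw estimation; its format depends on the task
    confidence: Optional[float]  # typically between 0.0 and 1.0


def parse_results(lines: Iterable[str]) -> List[dict]:
    """Parse one '... Estimation Results.csv' file into a list of rows.

    Assumption: after the two reference columns, the columns come in
    (estimation, confidence) pairs, and the first header of each pair
    is used as the algorithm name.
    """
    reader = csv.reader(lines)
    header = next(reader)
    algorithms = header[2::2]
    width = 2 + 2 * len(algorithms)
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        # Pad short rows so that missing trailing cells read as empty.
        record = record + [""] * (width - len(record))
        estimates: Dict[str, Estimate] = {}
        for i, name in enumerate(algorithms):
            raw_value = record[2 + 2 * i].strip()
            raw_conf = record[3 + 2 * i].strip()
            # Empty cells mean the algorithm produced no estimation.
            estimates[name] = Estimate(
                raw_value or None,
                float(raw_conf) if raw_conf else None,
            )
        rows.append({
            "reference": record[0],
            "reference_type": record[1],  # "Filename" or "Freesound ID"
            "estimates": estimates,
        })
    return rows


# Made-up sample data; algorithm names and values are illustrative only.
SAMPLE = """Audio reference,Audio reference type,AlgoA,AlgoA confidence,AlgoB,AlgoB confidence
loop_01.wav,Filename,120,0.87,,
12345,Freesound ID,98.5,,100,0.4
"""

rows = parse_results(io.StringIO(SAMPLE))
```

To read a real file, an open file object can be passed instead of the `io.StringIO` wrapper, for example `parse_results(open(path, newline="", encoding="utf-8"))`. Estimations are kept as strings because their format differs between the pitch, tempo and tonality tasks.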
Created:
2020-01-24