Good-sounds dataset
Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-28.
Download link:
https://zenodo.org/record/4588740
Description:
General description: This dataset was created in the context of the Pablo project, partially funded by KORG Inc. It contains monophonic recordings of two kinds of exercises: single notes and scales. The dataset was reported in the following article: Romaní Picas O., Parra Rodriguez H., Dabiri D., Tokuda H., Hariya W., Oishi K., & Serra X. "A real-time system for measuring sound goodness in instrumental sounds", 138th Audio Engineering Society Convention (2015). The recordings were made in the Universitat Pompeu Fabra / Phonos recording studio by 15 different professional musicians, all of them holding a music degree and having some expertise in teaching. 12 different instruments were recorded using one or up to 4 different microphones (depending on the recording session). For all the instruments the whole set of playable semitones in the instrument is recorded several times with different tonal characteristics. Each note is recorded into a separate mono .flac audio file of 48kHz and 32 bits. The tonal characteristics are explained both in the following section and the related publication. The audio files are organised in one directory for each recording session. In addition to the files, one SQLite database file is included. The structure of the database is described in the following section. Database description: The database is meant for organizing the sounds in a handy way. It is organised in four different tables: sounds, takes, packs and ratings. Sounds The table containing the sounds annotations. id instrument : flute, cello, clarinet, trumpet, violin, sax_alto, sax_tenor, sax_baritone, sax_soprano, oboe, piccolo, bass note octave dynamics : for some sounds, the musical notation of the loudness level (p, mf, f..) recorded_at : recording date and time location : recording place player : the musician who recorded it. For detailed information about the musicians please contact us. 
bow_velocity : for some string instruments the velocity of the bow (slow, medium, fast) bridge_position : for some string instruments the position of the bow (tasto, middle, ponticello) string : for some string instruments the number of the string on which the sound is played (1: lowest in pitch) csv_file : used for creation of the DB csv_id : used for creation of the DB pack_filename : used for creation of the DB pack_id : used for creation of the DB attack : for single notes, manual annotation of the onset in samples. decay : for single notes, manual annotation of the decay in samples. sustain : for single notes, manual annotation of the beginning of the sustained part in samples. release : for single notes, manual annotation of the beginning of the release part in samples. offset : for single notes, manual annotation of the offset in samples reference : 1 if sound was used to create the models in the good-sounds project, 0 if not. klass : user generated tags of the tonal qualities of the sound. They also contain information about the exercise, that could be single note or scale. "good-sound": good examples of single note "bad": bad example of one of the sound attributes defined in the project (please read the papers for a detailed explanation) "scale-good": good example of a one octave scale going up and down (15 notes). If the scale is minor a tag "minor" is also available. "scale-bad": bad example scale of one of the sounds defined in the project. (15 notes up and down). comments : if any semitone : midi note pitch_reference : the reference pitch Takes A sound can have several takes as some of them were recorded using different microphones at the same time. Each take has an associated audio file. 
id microphone filename : the name of the associated audio file original_filename : freesound_id : for some sounds uploaded to freesound.org sound_id : the id of the sound in the DB goodsound_id : for some of the sounds available in good-sounds.org Packs A pack is a group of sounds from the same recording session. The audio files are organised in the *sound_files* directory in subfolders with the pack name to which they belong. id name description Ratings Some musicians rated some sounds in a 0-10 goodness scale for the user evaluation of the first project prototype. Please read the paper for more detailed information. id mark: the rate or score. type: the klass of the sound. Related to the tags of the sound. created_at comments sound_id rater: the musician who rated the sound. License: This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
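# Dataset overview
This dataset was created in the context of the Pablo project, partially funded by KORG Inc. It contains monophonic recordings of two kinds of exercises: single notes and scales. The dataset was reported in the following article: Romaní Picas O., Parra Rodriguez H., Dabiri D., Tokuda H., Hariya W., Oishi K., & Serra X., "A real-time system for measuring sound goodness in instrumental sounds", 138th Audio Engineering Society Convention (2015).
All recordings were made in the Universitat Pompeu Fabra / Phonos recording studio by 15 professional musicians, all of them holding a music degree and having some teaching experience. 12 instruments were recorded using between one and four microphones, depending on the recording session. For every instrument, the whole set of playable semitones was recorded several times with different tonal characteristics. Each note is stored in a separate mono FLAC (.flac) audio file at 48 kHz and 32 bits. The tonal characteristics are explained in the following sections and in the related publication.
The audio files are organised in one directory per recording session. In addition to the audio files, the dataset includes one SQLite database file, whose structure is described in the following sections.
## Database description
The database is meant for organising the sounds in a convenient way. It contains four tables: sounds, takes, packs and ratings.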
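### Sounds table (sounds)
This table contains the annotations of the sounds. Its fields are:
- `id`: record identifier
- `instrument`: one of flute, cello, clarinet, trumpet, violin, sax_alto, sax_tenor, sax_baritone, sax_soprano, oboe, piccolo, bass
- `note`: note name
- `octave`: octave of the note
- `dynamics`: for some sounds, the musical notation of the loudness level (p, mf, f, ...)
- `recorded_at`: recording date and time
- `location`: recording place
- `player`: the musician who recorded the sound. For detailed information about the musicians, please contact the dataset authors
- `bow_velocity`: for some string instruments, the velocity of the bow (slow, medium, fast)
- `bridge_position`: for some string instruments, the position of the bow (tasto, middle, ponticello)
- `string`: for some string instruments, the number of the string on which the sound is played (1: lowest in pitch)
- `csv_file`: used for creation of the DB
- `csv_id`: used for creation of the DB
- `pack_filename`: used for creation of the DB
- `pack_id`: used for creation of the DB
- `attack`: for single notes, manual annotation of the onset, in samples
- `decay`: for single notes, manual annotation of the decay, in samples
- `sustain`: for single notes, manual annotation of the beginning of the sustained part, in samples
- `release`: for single notes, manual annotation of the beginning of the release part, in samples
- `offset`: for single notes, manual annotation of the offset, in samples
- `reference`: 1 if the sound was used to create the models in the good-sounds project, 0 otherwise
- `klass`: user-generated tags describing the tonal qualities of the sound. They also indicate the exercise type (single note or scale):
  - "good-sound": good example of a single note
  - "bad": bad example of one of the sound attributes defined in the project (see the papers for a detailed explanation)
  - "scale-good": good example of a one-octave scale going up and down (15 notes). If the scale is minor, a "minor" tag is also present
  - "scale-bad": bad example of a scale for one of the sound attributes defined in the project (15 notes up and down)
- `comments`: comments, if any
- `semitone`: MIDI note number
- `pitch_reference`: the reference pitch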
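Because the envelope annotations (`attack`, `decay`, `sustain`, `release`, `offset`) are expressed in samples and `semitone` holds a MIDI note number, two small conversions are often useful when working with this table. The following is a minimal sketch, not part of the dataset tooling. The helper names are hypothetical, and it assumes that `pitch_reference` is the tuning of A4 (MIDI note 69) in Hz, which the description does not state explicitly.

```python
SAMPLE_RATE = 48_000  # Hz, the sampling rate stated for the audio files


def samples_to_seconds(n_samples, sample_rate=SAMPLE_RATE):
    """Convert an annotation expressed in samples to seconds."""
    return n_samples / sample_rate


def midi_to_hz(semitone, pitch_reference=440.0):
    """Equal-tempered frequency of a MIDI note number.

    Assumption: pitch_reference is the tuning of A4 (MIDI note 69) in Hz.
    """
    return pitch_reference * 2.0 ** ((semitone - 69) / 12)


print(samples_to_seconds(24_000))  # 0.5
print(midi_to_hz(69))              # 440.0
```

If `pitch_reference` turns out to use a different convention in the actual database, only the default and the reference note in `midi_to_hz` would need to change.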
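### Takes table (takes)
A sound can have several takes, because some sounds were recorded with several microphones at the same time. Each take has an associated audio file. Its fields are:
- `id`: record identifier
- `microphone`: the microphone used
- `filename`: the name of the associated audio file
- `original_filename`: the original file name
- `freesound_id`: for some sounds uploaded to freesound.org, the corresponding ID
- `sound_id`: the id of the corresponding sound in the DB
- `goodsound_id`: for some of the sounds available on good-sounds.org, the corresponding ID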
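Since `takes.sound_id` refers to a row in `sounds`, the audio files for a given instrument can be listed with a join. The sketch below uses Python's standard `sqlite3` module. The function names are hypothetical, and `demo_connection` builds a tiny in-memory stand-in with only a few of the documented columns and invented rows; the real database file name and exact column types are not given in the description, so treat the schema here as an assumption.

```python
import sqlite3


def takes_for_instrument(conn, instrument):
    """Return (sound id, microphone, filename) for every take of an instrument."""
    query = """
        SELECT sounds.id, takes.microphone, takes.filename
        FROM sounds
        JOIN takes ON takes.sound_id = sounds.id
        WHERE sounds.instrument = ?
        ORDER BY sounds.id, takes.id
    """
    return conn.execute(query, (instrument,)).fetchall()


def demo_connection():
    """In-memory stand-in for the dataset database (assumed, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sounds (id INTEGER PRIMARY KEY, instrument TEXT,
                             note TEXT, octave INTEGER);
        CREATE TABLE takes (id INTEGER PRIMARY KEY, microphone TEXT,
                            filename TEXT, sound_id INTEGER);
        INSERT INTO sounds VALUES (1, 'flute', 'A', 4), (2, 'cello', 'C', 2);
        INSERT INTO takes VALUES (1, 'mic_a', 'a.flac', 1),
                                 (2, 'mic_b', 'b.flac', 1),
                                 (3, 'mic_a', 'c.flac', 2);
    """)
    return conn


for row in takes_for_instrument(demo_connection(), "flute"):
    print(row)
```

With the real dataset, `demo_connection()` would be replaced by `sqlite3.connect(...)` on the SQLite file shipped with the audio.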
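### Packs table (packs)
A pack is a group of sounds from the same recording session. The audio files are organised in the `sound_files` directory, in subfolders named after the pack they belong to. Its fields are:
- `id`: record identifier
- `name`: pack name
- `description`: pack description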
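The directory layout described above suggests how a take's audio file could be located on disk. This is a rough sketch under two assumptions that the description does not confirm: that `takes.filename` is a bare file name rather than a path that already includes the pack folder, and that the pack folder is named exactly as `packs.name`. The function name and the example values are hypothetical.

```python
from pathlib import PurePosixPath


def take_path(dataset_root, pack_name, filename):
    """Build <root>/sound_files/<pack name>/<filename>.

    Assumption: filename does not already contain the pack directory.
    """
    return PurePosixPath(dataset_root) / "sound_files" / pack_name / filename


# Hypothetical root, pack name, and file name, for illustration only.
print(take_path("good-sounds", "flute_session", "0001.flac"))
```

`PurePosixPath` is used only to keep the output platform-independent; `pathlib.Path` would be the natural choice when actually opening files.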
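### Ratings table (ratings)
Some musicians rated some sounds on a 0-10 goodness scale for the user evaluation of the first project prototype. See the paper for more detailed information. Its fields are:
- `id`: record identifier
- `mark`: the rating or score
- `type`: the klass of the sound, related to the tags of the sound
- `created_at`: creation time of the rating
- `comments`: comments on the rating
- `sound_id`: the id of the rated sound in the DB
- `rater`: the musician who rated the sound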
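Because a sound may be rated by more than one musician, a common first step is to aggregate the marks per sound. The following sketch again uses `sqlite3` with an in-memory stand-in; the function names, the reduced schema, and the sample rows are assumptions made for illustration and are not taken from the dataset.

```python
import sqlite3


def mean_mark_per_sound(conn):
    """Return (sound_id, mean mark, number of ratings) for each rated sound."""
    query = """
        SELECT sound_id, AVG(mark), COUNT(*)
        FROM ratings
        GROUP BY sound_id
        ORDER BY sound_id
    """
    return conn.execute(query).fetchall()


def demo_ratings():
    """In-memory stand-in with a subset of the documented ratings columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ratings (id INTEGER PRIMARY KEY, mark INTEGER,
                              type TEXT, sound_id INTEGER, rater TEXT);
        INSERT INTO ratings VALUES (1, 8, 'good-sound', 1, 'rater_a'),
                                   (2, 6, 'good-sound', 1, 'rater_b'),
                                   (3, 3, 'bad', 2, 'rater_a');
    """)
    return conn


for sound_id, mean_mark, n in mean_mark_per_sound(demo_ratings()):
    print(sound_id, mean_mark, n)
```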
## License
This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
Created:
2023-06-28
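Background overview
The Good-sounds dataset is an audio dataset focused on assessing the sound quality of musical instruments. It contains mono recordings of single notes and scales played by 15 professional musicians on 12 instruments, stored as 48 kHz, 32-bit .flac files, together with a structured SQLite database that annotates instrument, note, dynamics, and sound-quality class, among other fields. The dataset is mainly intended for research on the "goodness" of instrumental sounds and supports audio processing and music technology applications. It is released under the Creative Commons Attribution-NonCommercial license.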



