Medley-solos-DB: a cross-collection dataset for musical instrument recognition
Source: Mendeley Data. Updated 2024-05-17; indexed 2024-06-28.
Download link:
https://zenodo.org/records/3464194
Resource description:
Medley-solos-DB
=============
Version 1.2, March 2019.

Created By
--------------
Vincent Lostanlen (1), Carmine-Emanuele Cella (2), Rachel Bittner (3), Slim Essid (4).
(1): New York University
(2): UC Berkeley
(3): Spotify, Inc.
(4): Télécom ParisTech

Description
---------------
Medley-solos-DB is a cross-collection dataset for automatic musical instrument recognition in solo recordings. It consists of a training set of 3-second audio clips, which are extracted from the MedleyDB dataset of Bittner et al. (ISMIR 2014), as well as a test set of 3-second clips, which are extracted from the solosDB dataset of Essid et al. (IEEE TASLP 2009). Each of these clips contains a single instrument among a taxonomy of eight: clarinet, distorted electric guitar, female singer, flute, piano, tenor saxophone, trumpet, and violin.

Medley-solos-DB is the dataset used in the musical instrument recognition benchmarks of Lostanlen and Cella (ISMIR 2016) [1] and Andén et al. (IEEE TSP 2019) [2].

[1] V. Lostanlen, C.E. Cella. Deep convolutional networks on the pitch spiral for musical instrument recognition. Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2016.
[2] J. Andén, V. Lostanlen, and S. Mallat. Joint time-frequency scattering. IEEE Transactions on Signal Processing, vol. 67, no. 14, pp. 3704-3718, 2019. doi: 10.1109/TSP.2019.2918992

Data Files
--------------
The Medley-solos-DB contains 21571 audio clips as WAV files, sampled at 44.1 kHz, with a single channel (mono), at a bit depth of 32. Every audio clip has a fixed duration of 2972 milliseconds, that is, 65536 discrete-time samples.
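
The three figures quoted above (44.1 kHz, 2972 milliseconds, 65536 samples) can be checked against each other with simple arithmetic. The sketch below is not part of the dataset's tooling, and the function names are made up for illustration. It suggests that 65536 samples last about 1486 ms at 44.1 kHz, while a duration of 2972 ms corresponds to a rate close to 22.05 kHz, so it may be worth reading the sampling rate from the header of a downloaded file rather than relying on the description alone.

```python
# Figures quoted in the Data Files paragraph of the README.
NUM_SAMPLES = 65536        # discrete-time samples per clip
STATED_RATE_HZ = 44100     # stated sampling rate
STATED_DURATION_MS = 2972  # stated clip duration


def duration_ms(num_samples: int, rate_hz: float) -> float:
    """Duration in milliseconds of num_samples samples at rate_hz."""
    return 1000.0 * num_samples / rate_hz


def implied_rate_hz(num_samples: int, clip_duration_ms: float) -> float:
    """Sampling rate implied by a sample count and a duration."""
    return 1000.0 * num_samples / clip_duration_ms


if __name__ == "__main__":
    at_stated_rate = duration_ms(NUM_SAMPLES, STATED_RATE_HZ)
    at_half_rate = duration_ms(NUM_SAMPLES, STATED_RATE_HZ / 2)
    implied = implied_rate_hz(NUM_SAMPLES, STATED_DURATION_MS)
    print(f"duration at 44.1 kHz:  {at_stated_rate:.1f} ms")
    print(f"duration at 22.05 kHz: {at_half_rate:.1f} ms")
    print(f"rate implied by 2972 ms: {implied:.0f} Hz")
```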

Every audio file has a name of the form:
Medley-solos-DB_SUBSET-INSTRUMENTID_UUID.wav

For example:
Medley-solos-DB_test-0_0a282672-c22c-59ff-faaa-ff9eb73fc8e6.wav
corresponds to the snippet whose universally unique identifier (UUID) is 0a282672-c22c-59ff-faaa-ff9eb73fc8e6, contains clarinet sounds (clarinet has instrument id equal to 0), and belongs to the test set.

Metadata Files
-------------------
The Medley-solos-DB_metadata is a CSV file containing 21572 rows and five columns. There is one row for each audio clip; with 21571 clips, the extra row is presumably the header. The columns are:
1. subset: either "training", "validation", or "test"
2. instrument: tag in the MedleyDB taxonomy, such as "clarinet", "distorted electric guitar", etc.
3. instrument id: integer from 0 to 7. There is a one-to-one correspondence between "instrument" (string format) and "instrument id" (integer). We provide both for convenience.
4. song id: integer from 0 to 226. The track and artist names are anonymized.
5. UUID4: universally unique identifier. Assigned at random, and different for every row.

The list of instrument classes is:
0. clarinet
1. distorted electric guitar
2. female singer
3. flute
4. piano
5. tenor saxophone
6. trumpet
7. violin

Please acknowledge Medley-solos-DB in academic research
---------------------------------------------------------------------------------
When Medley-solos-DB is used for academic research, we would highly appreciate it if scientific publications of works partly based on this dataset cite the following publication:
V. Lostanlen, C.E. Cella. Deep convolutional networks on the pitch spiral for musical instrument recognition. Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2016.

The creation of this dataset was supported by ERC InvariantClass grant 320959.

Conditions of Use
------------------------
Dataset created by Vincent Lostanlen, Rachel Bittner, and Slim Essid, as a derivative work of MedleyDB and solosDB.

The Medley-solos-DB dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license:
https://creativecommons.org/licenses/by/4.0/

The dataset and its contents are made available on an "as is" basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, the authors are not liable for, and expressly exclude all liability for, loss or damage however and whenever caused to anyone by any use of the Medley-solos-DB dataset or any part of it.

Feedback
-------------
Please help us improve Medley-solos-DB by sending your feedback to:
vincent.lostanlen@nyu.edu
In case of a problem, please include as many details as possible.

Acknowledgement
-------------------------
We thank all artists, recording engineers, curators, and annotators of both MedleyDB and solosDB.
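
The file-naming scheme described under Data Files, together with the instrument ids listed under Metadata Files, can be sketched as a small parser. This is a minimal illustration and not code shipped with the dataset. The names `parse_clip_name`, `build_clip_name`, and `ClipInfo` are hypothetical. Two assumptions go beyond the README: that the SUBSET token in a file name takes the same three values as the metadata "subset" column, and that the UUID is written in lowercase hexadecimal, as in the one example given.

```python
import re
from typing import NamedTuple

# Instrument classes in id order, as listed in the README.
INSTRUMENTS = (
    "clarinet",
    "distorted electric guitar",
    "female singer",
    "flute",
    "piano",
    "tenor saxophone",
    "trumpet",
    "violin",
)

# Assumption: SUBSET is one of the metadata "subset" values, and the UUID
# is lowercase hexadecimal in 8-4-4-4-12 groups (as in the README example).
CLIP_NAME = re.compile(
    r"Medley-solos-DB_(?P<subset>training|validation|test)"
    r"-(?P<instrument_id>[0-7])"
    r"_(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\.wav"
)


class ClipInfo(NamedTuple):
    subset: str
    instrument_id: int
    instrument: str
    uuid: str


def parse_clip_name(filename: str) -> ClipInfo:
    """Split a clip file name into subset, instrument, and UUID."""
    match = CLIP_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a Medley-solos-DB clip name: {filename!r}")
    instrument_id = int(match["instrument_id"])
    return ClipInfo(
        subset=match["subset"],
        instrument_id=instrument_id,
        instrument=INSTRUMENTS[instrument_id],
        uuid=match["uuid"],
    )


def build_clip_name(subset: str, instrument_id: int, uuid: str) -> str:
    """Rebuild a file name from the corresponding metadata fields."""
    return f"Medley-solos-DB_{subset}-{instrument_id}_{uuid}.wav"


if __name__ == "__main__":
    name = "Medley-solos-DB_test-0_0a282672-c22c-59ff-faaa-ff9eb73fc8e6.wav"
    print(parse_clip_name(name))
```

Because `build_clip_name` inverts `parse_clip_name`, the pair can be used to check that each metadata row has a matching audio file on disk, provided the column values are passed in as they appear in the CSV.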
Created: 2023-06-28
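
Background overview
-------------------
Medley-solos-DB is a cross-collection dataset for automatic musical instrument recognition. It combines a training set drawn from MedleyDB with a test set drawn from solosDB, for a total of 21571 three-second audio clips covering eight instrument classes. The dataset provides audio in a uniform WAV format together with detailed metadata, is suited to machine-learning classification tasks, and is released under the CC BY 4.0 license, with a citation requested when it is used in academic research.

The summary above was collected and generated by 遇见数据集.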



