
AeroSonicDB (YPAD-0523): Labelled audio dataset for acoustic detection and classification of aircraft

Source: Mendeley Data. Updated: 2024-06-29. Indexed: 2024-06-28.
Download link:
https://zenodo.org8371595
Resource description:
Version 1.1.1 (September 2023)

Publication

When using this data in an academic work, please reference the dataset DOI and version. Please also reference the following paper, which describes the methodology for collecting the dataset and presents baseline model results.

Downward, B., & Nordby, J. (2023). The AeroSonicDB (YPAD-0523) Dataset for Acoustic Detection and Classification of Aircraft. ArXiv, abs/2311.06368.

Description

AeroSonicDB:YPAD-0523 is a specialised dataset of ADS-B labelled audio clips for research in the fields of environmental noise attribution and machine listening, particularly acoustic detection and classification of low-flying aircraft. Audio files in this dataset were recorded at locations in close proximity to a flight path approaching or departing Adelaide International Airport's (ICAO code: YPAD) primary runway, 05/23. Recordings are initially labelled from radio (ADS-B) messages received from the aircraft overhead, then human verified and annotated with the first and final moments at which the target aircraft is audible.

A total of 1,895 audio clips are distributed across two top-level classes, "Aircraft" (8.87 hours) and "Silence" (3.52 hours). The aircraft class is further broken down into four subclasses, which broadly describe the structure of the aircraft and its propulsion mechanism. A variety of additional "airframe" features are provided to give researchers finer control of the dataset, and the opportunity to develop ontologies specific to their own use case.

For convenience, the dataset has been split into training (10.04 hours) and testing (2.35 hours) subsets, with the training set further split into 5 distinct folds for cross-validation.
These splits are performed to prevent data leakage between folds and the test set, ensuring samples collected in the same recording session (distinct in time, location and microphone) are assigned to the same fold.

Researchers may find applications for this dataset in a number of fields, particularly aircraft noise isolation and noise monitoring in an urban environment, development of passive acoustic systems to assist radar technology, and understanding the sources of aircraft noise to help manufacturers design less-noisy aircraft.

Audio data

ADS-B (Automatic Dependent Surveillance–Broadcast) messages transmitted directly from aircraft are used to automatically trigger, capture and label audio samples. A 60-second recording is triggered when an aircraft transmits a message indicating it is within a specified distance of the recording device (see "Location data" below for specifics). The resulting audio file is labelled with the unique ICAO identifier code for the aircraft, as well as its last reported altitude, date, time, location and microphone. The recording is then human verified and annotated with timestamps for the first and last moments the aircraft is audible.

In total, AeroSonicDB contains 625 recordings of low-altitude aircraft, varying in length from 18 to 60 seconds, for a total of 8.87 hours of aircraft audio.

A collection of urban background noise without aircraft (silence) is included with the dataset as a means of distinguishing location-specific environmental noises from aircraft noises. 10-second background noise, or "silence", recordings are triggered only when no aircraft are broadcasting that they are within a specified distance of the recording device (see "Location data" below). These "silence" recordings are also human verified to ensure no aircraft noise is present. The dataset contains 1,270 clips of silence/urban background noise.

Location data

Recordings have been collected from three (3) locations.
GPS coordinates for each location are provided in the "locations.json" file. In order to protect privacy, coordinates have been provided for a road or public space near the recording device instead of its exact location.

Location: 0

Situated in a suburban environment approximately 15.5km north-east of the start/end of the runway. For Adelaide, typical south-westerly winds bring most arriving aircraft past this location on approach. Winds from the north or east will cause aircraft to take off to the north-east; however, not all departing aircraft will maintain a course to trigger a recording at this location. The "trigger distance" for this location is set to 3km to ensure small/slower aircraft and large/faster aircraft are captured within a sixty-second recording.

"Silence" or ambient background noises at this location include cars, motorbikes, light trucks, garbage trucks, power tools, lawn mowers, construction sounds, sirens, people talking, dogs barking and a wide range of Australian native birds (New Holland Honeyeaters, Wattlebirds, Australian Magpies, Australian Ravens, Spotted Doves, Rainbow Lorikeets and others).

Location: 1

Situated approximately 500m south-east of the south-eastern end of the runway, this location is near recreational areas (golf course, skate park and parklands) with a busy road/highway in between the location and the runway. This location features heavy winds and road traffic, as well as people talking, walking and riding, and also birds such as the Australian Magpie and Noisy Miner. The trigger distance for this location is set to 1km. Due to their low altitude, aircraft are louder but audible for a shorter time compared to "Location 0".

Location: 2

As an alternative to "Location 1", this location is situated approximately 950m south-east of the end of the runway. This location has a wastewater facility to the north, a residential area to the south and a popular beach to the west.
This location offers greater wind protection and greater distance from airport and highway noises. Ambient background sounds feature close-proximity cars and motorbikes, cyclists, people walking, nail guns and other construction sounds, as well as the local birds mentioned above.

Aircraft metadata

Supplementary "airframe" metadata for all aircraft has been gathered to help broaden the research possibilities from this dataset. Airframe information was collected and cross-checked from a number of open-source databases. The author has no reason to believe any significant errors exist in the "aircraft_meta" files; however, future versions of this dataset plan to obtain aircraft information directly from ICAO (International Civil Aviation Organization) to ensure a single, verifiable source of information.

Class/subclass ontology (minutes of recordings)

0. no aircraft (202)
    0: no aircraft (202)
1. aircraft (214)
    1: piston-propeller aeroplane (12)
    2: turbine-propeller aeroplane (37)
    3: turbine-fan aeroplane (163)
    4: rotorcraft (1.6)

The subclasses are a combination of the "airframe" and "engtype" features. Piston and Turboshaft rotorcraft/helicopters have been combined into a single subclass due to the small number of samples.

Data splits

Audio recordings have been split into training (81%) and test (19%) sets. The training set has further been split into 5 folds, giving researchers a common split to perform 5-fold cross-validation to ensure reproducibility and comparable results. Data leakage into the test set has been avoided by ensuring recordings are disjoint from the training set by time and location, meaning samples in the test set for a particular location were recorded after any samples included in the training set for that particular location.

Labelled data

The entire dataset (training and test) is referenced and labelled in the "sample_meta.csv" file. Each row contains a reference to a unique recording, its meta information, annotations and airframe features.
Alternatively, these labels can be derived directly from the filename of the sample (see below). The "aircraft_meta.csv" and "aircraft_meta.json" files can be used to reference aircraft-specific features such as manufacturer, engine type, ICAO type designator etc. (see "Columns/Labels" below for all features).

File naming convention

Audio samples are in WAV format, with some metadata stored in the filename.

Basic convention

"Aircraft ID + Date + Time + Location ID + Microphone ID"
"XXXXXX_YYYY-MM-DD_hh-mm-ss_X_X"

Sample with aircraft

{hex_id}_{date}_{time}_{location_id}_{microphone_id}.{file_ext}
7C7CD0_2023-05-09_12-42-55_2_1.wav

Sample without aircraft

"Silence" files are denoted with six (6) leading zeros rather than an aircraft hex code. All relevant metadata for "silence" samples are contained in the audio filename, and again in the accompanying "sample_meta.csv".

000000_{date}_{time}_{location_id}_{microphone_id}.{file_ext}
000000_2023-05-09_12-30-55_2_1.wav

Columns/Labels (found in sample_meta.csv, aircraft_meta.csv/json files)

- train-test: Train-test split (train, test)
- fold: Digit from 1 to 5 splitting the training data 5 ways (else test)
- filename: The filename of the audio recording
- date: Date of the recording
- time: Time of the recording
- location: ID for the location of the recording
- mic: ID of the microphone used
- class: Top-level label for the recording (eg. 0 = No aircraft, 1 = Aircraft audible)
- subclass: Subclass label for the recording (eg. 0 = No aircraft, 3 = Turbine-fan aeroplane)
- altitude: Approximate altitude of the aircraft (in feet) at the start of the recording
- hex_id: Unique ICAO 24-bit address for the aircraft recorded
- session: Unique recording session by time, location and microphone
- offset: Time stamp marking the start of the audio event
- duration: Length of the recording (in seconds)
- file_length: Total length of the audio file in seconds
- reg: Registration number of the aircraft
- airframe: Describes the mechanical structure of the aircraft (eg. Power Driven Aeroplane, Rotorcraft)
- engtype: Type of engine (eg. Piston, Turboprop, Turbofan, Turboshaft)
- engnum: Number of engines
- shortdesc: 3 character alpha-numeric code describing the airframe and engine configuration (eg. L1P, L4J, H2T)
- typedesig: ICAO type designator for make and model of aircraft (eg. PC12, C185, B738)
- manu: Aircraft manufacturer (eg. Boeing, Pilatus, Airbus)
- model: Aircraft model (eg. 737-800, A320-232, DHC-8-315)
- engmanu: Engine manufacturer (eg. Pratt & Whitney, CFM International, Rolls Royce)
- engmodel: Engine model (eg. TRENT XWB, CFM56-7B24E, PT6E-67XP)
- engfamily: Family of the engine model (eg. TRENT, CFM56, PT6)
- fueltype: Fuel type used in the engine (eg. Gasoline, Kerosine)
- propmanu: Propeller manufacturer (eg. Hartzell Propellers, Hamilton Standard, "Aircraft Not Fitted With Propeller")
- propmodel: Propeller model (eg. HC-E5A-3A/NC10245B, 14SF-15, "Not Applicable")
- mtow: Maximum take off weight (MTOW) in kilograms

Environmental evaluation audio

As a means for evaluating model performance on real-world data, a supplementary set of real-time environmental recordings has been included with AeroSonicDB (YPAD-0523). This additional dataset contains six one-hour recordings of continuous urban noise, and is accompanied by a CSV file (environment_class_mappings.csv) annotated with relevant class labels per 5-second interval. Due to the variable length of an aircraft audio event and the lack of distinct onset and offset moments, audio segments which transition between aircraft and silence periods are tagged with an "ignore" class. This is done to provide a clear boundary between silence and aircraft events, helping to avoid misclassification at event boundaries and ensure meaningful evaluation results.

Conditions of use

Dataset created by Blake Downward.
The AeroSonicDB (YPAD-0523) dataset is offered free of charge for non-commercial use under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.
https://creativecommons.org/licenses/by-nc/4.0/

Acknowledgements

Special thanks to Jon Nordby of Soundsensing AS. His contributions were pivotal in maximising the potential of this dataset for open-source release.

Feedback

Please send suggestions, feedback and comments to:
Blake Downward: aerosonicdb@gmail.com

Change log

- 1.1.1: Minor change to "sample_meta.csv": replaced "6" with "test" in the "fold" column
- 1.1: Replaced truncated aircraft samples with the original full-length files and annotated the beginning and end of each audio event. Added "ignore" statements to aircraft event boundaries in the environmental class mappings file.
- 1.0: Environmental audio and mappings added
- 0.3: locations.json file added, README updated
- 0.2: location information added to README
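
As an illustration of the file naming convention described above, the following is a minimal Python sketch that splits a sample filename into its documented parts. The names `SampleName` and `parse_sample_name` are hypothetical helpers and are not shipped with the dataset. The sketch assumes location and microphone IDs are plain digits (as in the two example filenames) and treats the timestamp as a naive local time, since the README does not state a time zone.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Documented convention:
# {hex_id}_{date}_{time}_{location_id}_{microphone_id}.{file_ext}
# Assumption: location and microphone IDs consist of digits only.
_FILENAME_RE = re.compile(
    r"^(?P<hex_id>[0-9A-Fa-f]{6})_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_"
    r"(?P<time>\d{2}-\d{2}-\d{2})_"
    r"(?P<location>\d+)_"
    r"(?P<mic>\d+)\.(?P<ext>\w+)$"
)


@dataclass(frozen=True)
class SampleName:  # hypothetical helper type, not part of the dataset
    hex_id: Optional[str]  # None for "silence" files (six leading zeros)
    recorded_at: datetime  # naive datetime; time zone is not specified
    location: int
    mic: int

    @property
    def has_aircraft(self) -> bool:
        return self.hex_id is not None


def parse_sample_name(filename: str) -> SampleName:
    """Parse a sample filename into its documented components."""
    match = _FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename format: {filename!r}")
    hex_id = match["hex_id"].upper()
    recorded_at = datetime.strptime(
        f"{match['date']} {match['time']}", "%Y-%m-%d %H-%M-%S"
    )
    return SampleName(
        hex_id=None if hex_id == "000000" else hex_id,
        recorded_at=recorded_at,
        location=int(match["location"]),
        mic=int(match["mic"]),
    )
```

Calling `parse_sample_name("000000_2023-05-09_12-30-55_2_1.wav")` yields a record with `hex_id` set to `None`, which is how this sketch marks a "silence" sample. The "sample_meta.csv" file remains the fuller source of labels, since the filename carries no class, subclass or annotation fields.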

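
The predefined data splits described above could be consumed roughly as follows. This is a sketch only: it relies on the documented "train-test" and "fold" columns of "sample_meta.csv" and assumes their values are stored as the strings the README lists ("train"/"test", and "1" to "5" or "test"). The function names are hypothetical.

```python
import csv
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, str]


def load_rows(csv_file) -> List[Row]:
    """Read sample_meta.csv-style rows from an open text file."""
    return list(csv.DictReader(csv_file))


def cross_validation_splits(
    rows: List[Row],
) -> Iterator[Tuple[str, List[Row], List[Row]]]:
    """Yield (fold, train_rows, validation_rows) for each training fold.

    Each fold is held out once for validation while the remaining
    training folds are used for fitting. Test rows are never included.
    """
    train_rows = [r for r in rows if r["train-test"] == "train"]
    folds = sorted({r["fold"] for r in train_rows})
    for fold in folds:
        held_out = [r for r in train_rows if r["fold"] == fold]
        remaining = [r for r in train_rows if r["fold"] != fold]
        yield fold, remaining, held_out


def held_out_test_rows(rows: List[Row]) -> List[Row]:
    """Return the rows reserved for final testing."""
    return [r for r in rows if r["train-test"] == "test"]
```

A typical call would open the file with `open("sample_meta.csv", newline="")`, pass it to `load_rows`, and iterate over `cross_validation_splits`. Using the shipped fold labels rather than a random split keeps recordings from the same session in the same fold, which is the leakage safeguard the README describes.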
Created: 2023-09-28