TUT Tietotalo Ambisonic Impulse Response
Mendeley Data, updated 2024-03-27, indexed 2024-06-28
Download link:
https://zenodo.org/record/1443539
Description:
Tampere University of Technology (TUT) Tietotalo Ambisonic Impulse Response

This dataset consists of impulse responses (IRs) measured in a real environment using the Eigenmike spherical microphone array. The recordings were made in a fairly spacious corridor inside the university (Tietotalo building) with classrooms around it.

The IRs were acquired using a maximum length sequence (MLS). A Genelec G Two loudspeaker continuously playing the MLS was moved slowly around the Eigenmike in a circular trajectory. The playback volume was set 30 dB above the ambient sound level. IRs were collected at elevations from −40° to 40° in 10° increments at 1 m from the Eigenmike, and at elevations from −20° to 20° in 10° increments at 2 m.

The moving-source IRs were obtained with a freely available tool from the CHiME challenge, which estimates the time-varying responses in the STFT domain by forming a least-squares regression between the known measurement signal and the far-field recording, independently at each frequency. The IR for any azimuth within one trajectory can be analyzed by assuming block-wise stationarity of the acoustic channel. The CHiME IR estimation tool was applied independently to all 32 channels of the Eigenmike. To create the dataset, we estimated the DOA of each time frame using MUSIC and extracted IRs for azimuth angles at 10° resolution (36 IRs for each elevation).

The IR file is in .mat format and can be read in both Matlab and Python. The details of the IR file are as follows.

Size: (2, 9, 1025, 36, 4, 32) = (distance_wrt_mic, elevation_wrt_mic, FFT, azimuth_wrt_mic, blocks, channels), where
distance_wrt_mic = two distances (1 m and 2 m)
elevation_wrt_mic = 9 elevation angles (-40:10:40) at distance 1 m, and 5 elevation angles (-20:10:20) at distance 2 m
azimuth_wrt_mic = 36 azimuth angles (-180:10:180) for all distance-elevation combinations

The IRs were extracted assuming block-wise stationarity (four blocks) for each frequency bin (1025 bins). During synthesis, after convolving the IR with a sound event, the 32-channel audio has to be transformed to Ambisonic format using the transformation matrix of the Eigenmike.

This dataset was collected as part of the work 'Sound event localization and detection of overlapping sources using convolutional recurrent neural network'; more details about the IR dataset can be found in that work.

Data collector(s): Fagerlund, Eemi; Koskimies, Aino
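
The six-axis layout described above can be hard to index by hand, so here is a minimal Python sketch of mapping a (distance, elevation, azimuth) request to array indices. It does not read the real file. The helper name ir_index and the stand-in array demo are hypothetical, and two points are assumptions that should be checked against the actual .mat file: that the 36 azimuth slots start at −180° in 10° steps (so 180° wraps onto the −180° slot), and that the elevation axis follows the −40:10:40 grid for both distances, with only −20° to 20° populated at 2 m.

```python
import numpy as np

# Axis order as stated in the dataset description:
# (distance, elevation, FFT bin, azimuth, block, channel)
DISTANCES_M = (1, 2)
ELEVATIONS_DEG = tuple(range(-40, 41, 10))  # 9 slots: -40, -30, ..., 40

# ASSUMPTION: azimuth slots start at -180 deg in 10-deg steps (36 slots),
# so 180 deg wraps onto the -180 deg slot. Verify against the real file.
AZ_START_DEG = -180
AZ_STEP_DEG = 10


def ir_index(distance_m, elevation_deg, azimuth_deg):
    """Return (distance, elevation, azimuth) indices for the IR array."""
    d = DISTANCES_M.index(distance_m)
    # Only elevations -20..20 were measured at 2 m.
    if distance_m == 2 and abs(elevation_deg) > 20:
        raise ValueError("no IR measured at this elevation for 2 m")
    # ASSUMPTION: the elevation axis uses the -40:10:40 grid for both distances.
    e = ELEVATIONS_DEG.index(elevation_deg)
    offset = azimuth_deg - AZ_START_DEG
    if offset % AZ_STEP_DEG != 0:
        raise ValueError("azimuth must lie on the 10-degree grid")
    a = (offset % 360) // AZ_STEP_DEG
    return d, e, a


# Stand-in array with 5 frequency bins instead of 1025 to keep the demo small.
demo = np.zeros((2, 9, 5, 36, 4, 32), dtype=np.complex64)
d, e, a = ir_index(1, 0, 90)
ir_slice = demo[d, e, :, a, :, :]  # shape: (bins, blocks, channels)
```

With the real array, the same slice would have shape (1025, 4, 32): one frequency response per block and per Eigenmike channel for the chosen source position.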
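
The description mentions a least-squares regression performed independently at each frequency under block-wise stationarity. The sketch below illustrates that idea in Python on synthetic STFT-domain data. It is not the CHiME tool: it assumes a simplified one-tap model Y[f, t] = H[f] · X[f, t] within each block, and the function name estimate_tf_per_bin and all test signals are hypothetical.

```python
import numpy as np


def estimate_tf_per_bin(X, Y, n_blocks=1, eps=1e-12):
    """Per-bin least-squares estimate of H in Y[f, t] ~ H[f] * X[f, t].

    X, Y: complex arrays of shape (n_bins, n_frames).
    Frames are split into n_blocks consecutive blocks, and the channel is
    assumed stationary within each block.
    Returns an array of shape (n_blocks, n_bins).
    """
    frame_blocks = np.array_split(np.arange(X.shape[1]), n_blocks)
    H = np.empty((n_blocks, X.shape[0]), dtype=complex)
    for b, idx in enumerate(frame_blocks):
        Xb, Yb = X[:, idx], Y[:, idx]
        # Closed-form scalar least squares for each frequency bin.
        num = np.sum(np.conj(Xb) * Yb, axis=1)
        den = np.sum(np.abs(Xb) ** 2, axis=1) + eps
        H[b] = num / den
    return H


# Synthetic check: a known per-bin response should be recovered.
rng = np.random.default_rng(0)
n_bins, n_frames = 8, 200
X = rng.standard_normal((n_bins, n_frames)) + 1j * rng.standard_normal((n_bins, n_frames))
H_true = rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
Y = H_true[:, None] * X
H_est = estimate_tf_per_bin(X, Y, n_blocks=4)
```

Using four blocks here mirrors the four blocks in the dataset layout. A real estimator would likely use longer filters per frequency and handle measurement noise.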
Created:
2023-06-28



