mit-impulse-response-survey

Name: mit-impulse-response-survey
Creator: maas
Published: 2025-10-09 16:26:41
License: No description provided

ModelScope community. Updated 2025-10-09. Indexed 2025-03-22.

Download link:

https://modelscope.cn/datasets/benjamin-paine/mit-impulse-response-survey

Resource description:

# Author's Description

> These are environmental Impulse Responses (IRs) measured in the real-world IR survey as described in [Traer and McDermott, PNAS, 2016](https://www.pnas.org/doi/full/10.1073/pnas.1612524113).
>
> The survey locations were selected by tracking the motions of 7 volunteers over the course of 2 weeks of daily life. We sent the volunteers 24 text messages every day at randomized times and asked the volunteers to respond with their location at the time the text was sent. We then retraced their steps and measured the acoustic impulse responses of as many spaces as possible. We recorded 271 IRs from a total of 301 unique locations. This data set therefore reflects the diversity of acoustic distortion our volunteers encounter in the course of daily life. All recordings were made with a 1.5 meter spacing between speaker and microphone to simulate a typical conversation.
>
> [James Traer and Josh H. McDermott, mcdermottlab.mit.edu](https://mcdermottlab.mit.edu/Reverb/IR_Survey.html)

# Repacking Notes

The following changes were made to repack for 🤗 Datasets / 🥐 Croissant:

- Mapped the beginning part of the filename to *id*.
- Mapped the second part of the filename to *location*, and turned it into a class label (enumeration).
- When present, mapped the third (but not final) part of the filename to *detail*.
- Mapped the final part of the filename to *hits*.
- Adjusted several filenames by correcting typos, homogenizing capitalization, and occasionally switching the order of *location* and *detail*.

# License

These files are licensed under an MIT Creative Commons license, [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/). Please cite the Traer and McDermott paper when used, as in the example below.

# Citation

```
@article{
doi:10.1073/pnas.1612524113,
author = {James Traer and Josh H. McDermott},
title = {Statistics of natural reverberation enable perceptual separation of sound and space},
journal = {Proceedings of the National Academy of Sciences},
volume = {113},
number = {48},
pages = {E7856-E7865},
year = {2016},
doi = {10.1073/pnas.1612524113},
URL = {https://www.pnas.org/doi/abs/10.1073/pnas.1612524113},
eprint = {https://www.pnas.org/doi/pdf/10.1073/pnas.1612524113},
abstract = {Sounds produced in the world reflect off surrounding surfaces on their way to our ears. Known as reverberation, these reflections distort sound but provide information about the world around us. We asked whether reverberation exhibits statistical regularities that listeners use to separate its effects from those of a sound’s source. We conducted a large-scale statistical analysis of real-world acoustics, revealing strong regularities of reverberation in natural scenes. We found that human listeners can estimate the contributions of the source and the environment from reverberant sound, but that they depend critically on whether environmental acoustics conform to the observed statistical regularities. The results suggest a separation process constrained by knowledge of environmental acoustics that is internalized over development or evolution. In everyday listening, sound reaches our ears directly from a source as well as indirectly via reflections known as reverberation. Reverberation profoundly distorts the sound from a source, yet humans can both identify sound sources and distinguish environments from the resulting sound, via mechanisms that remain unclear. The core computational challenge is that the acoustic signatures of the source and environment are combined in a single signal received by the ear. Here we ask whether our recognition of sound sources and spaces reflects an ability to separate their effects and whether any such separation is enabled by statistical regularities of real-world reverberation. To first determine whether such statistical regularities exist, we measured impulse responses (IRs) of 271 spaces sampled from the distribution encountered by humans during daily life. The sampled spaces were diverse, but their IRs were tightly constrained, exhibiting exponential decay at frequency-dependent rates: Mid frequencies reverberated longest whereas higher and lower frequencies decayed more rapidly, presumably due to absorptive properties of materials and air. To test whether humans leverage these regularities, we manipulated IR decay characteristics in simulated reverberant audio. Listeners could discriminate sound sources and environments from these signals, but their abilities degraded when reverberation characteristics deviated from those of real-world environments. Subjectively, atypical IRs were mistaken for sound sources. The results suggest the brain separates sound into contributions from the source and the environment, constrained by a prior on natural reverberation. This separation process may contribute to robust recognition while providing information about spaces around us.}}
```
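The filename mapping listed under Repacking Notes can be illustrated with a minimal sketch. It is not the repacker's actual code: the underscore separator, the `.wav` extension, the example filenames, and the names `IRRecord`, `parse_ir_filename`, and `build_location_labels` are all assumptions made for illustration, and *hits* is kept as a plain string because its exact format is not stated here.

```python
from pathlib import Path
from typing import Dict, Iterable, NamedTuple, Optional


class IRRecord(NamedTuple):
    """Hypothetical container for the four fields named in the notes."""
    id: str
    location: str
    detail: Optional[str]
    hits: str


def parse_ir_filename(filename: str, sep: str = "_") -> IRRecord:
    """Split a filename stem into id, location, optional detail, and hits.

    Assumes parts are joined by `sep` (an assumption, not confirmed by
    the source): first part is id, second is location, last is hits, and
    anything between the second and last part is detail.
    """
    parts = Path(filename).stem.split(sep)
    if len(parts) < 3:
        raise ValueError(f"expected at least 3 parts in {filename!r}")
    middle = parts[2:-1]
    detail = sep.join(middle) if middle else None
    return IRRecord(parts[0], parts[1], detail, parts[-1])


def build_location_labels(records: Iterable[IRRecord]) -> Dict[str, int]:
    """Map each distinct location to an integer class index (sorted order)."""
    names = sorted({record.location for record in records})
    return {name: index for index, name in enumerate(names)}


# Made-up filenames, used only to exercise the parser.
records = [
    parse_ir_filename("x001_Kitchen_3.wav"),
    parse_ir_filename("x002_Office_Conference_12.wav"),
]
labels = build_location_labels(records)
```

Sorting the location names before enumerating them keeps the class indices stable across runs, which is one reasonable way to turn *location* into an enumeration; the repacked dataset may order its labels differently.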

Provider:

maas

Created:

2025-03-18

Collected summary

Dataset introduction