
Acoustic Signal Dataset of Homogeneous Balls under Steady Excitation in Sealed Containers

Source: DataCite Commons. Updated: 2026-04-07. Indexed: 2025-05-18.
Download link:
https://www.scidb.cn/detail?dataSetId=f7a238007ae94a528219688aa4ae622f
Description:
This dataset supports research on non-contact detection of the number of objects inside sealed containers. Machine learning models built on the characteristics of collision acoustic signals are used to count the objects with high precision. A detailed description follows.

Data Collection and Structure

Experimental setup:
Containers and objects: Containers made of different materials (wood, aluminum, acrylic) and small balls (glass, wood, quartz) simulate a range of physical interaction conditions.
Excitation device: A vibration table moving in a circle at constant speed excites the balls; data is collected after the system reaches steady state.
Signal recording: A standard microphone (sampling rate 44.1 kHz) records the collision sounds for several seconds. For training, each recording is divided into 3-second samples, and each sample is stored as a 132300-dimensional time-domain signal (3 s × 44100 samples per second).

Data composition:
Categories and samples: There are 36 main categories (with ball sizes ranging from 1 to 36), each containing at least 200 samples, for a total of 7263 audio files.
Control group expansion: Multiple additional sets of experimental data are included (such as aluminum box + glass ball and acrylic box + quartz ball), with a total sample size of 22133.

Data Features and Preprocessing

Time-domain and frequency-domain analysis: The original time-domain signal is transformed into frequency-domain features with the Fast Fourier Transform (FFT), which improves inter-class linear separability. Principal Component Analysis (PCA) reduces the data to 2D or 3D to reveal its high-dimensional spherical distribution.

Key physical characteristics: Within a single category, the total sound energy (Energy = ∑ xi^2) remains constant; across categories, it first increases and then decreases as the number of balls grows. The frequency-domain features exhibit strong linear separability, which supports classification models based on logistic regression.

Application and Value

Target task: Supervised learning (classification) that predicts the number of small balls in a closed container, with recognition accuracy above 95% within an allowable error range (such as ±1).

Advantages:
Robustness across materials and sizes: Suitable for different containers, object materials, and stacking densities.
Low hardware dependency: Only a standard microphone is required; no expensive optical or electromagnetic equipment is needed.

Potential applications: Industrial quality inspection, logistics monitoring, intelligent sensing, and other non-contact object counting scenarios.

Data Format and Access

File structure: Files are organized by experimental condition (container + object material) and by number of balls, and include raw audio (WAV), FFT spectra, and PCA dimensionality reduction results.
Metadata: Vibration frequency, environmental noise, ball size, and material parameters are annotated.
License agreement: CC BY-NC 4.0; non-commercial research and secondary analysis are allowed.
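
The preprocessing described above (3-second segmentation at 44.1 kHz, the energy sum, FFT features, and PCA reduction to 2D) can be sketched roughly as follows. This is a minimal illustration using NumPy on synthetic noise, not the dataset authors' code: the function names are hypothetical, the use of the magnitude of a real-input FFT as the frequency-domain feature is an assumption, and PCA is computed here through a plain SVD.

```python
import numpy as np

SAMPLE_RATE = 44_100                         # Hz, as stated in the description
SEGMENT_SECONDS = 3                          # sample length used for training
SEGMENT_LEN = SAMPLE_RATE * SEGMENT_SECONDS  # 132300 values per sample


def split_into_segments(signal, segment_len=SEGMENT_LEN):
    """Cut a 1-D recording into non-overlapping segments; drop the remainder."""
    signal = np.asarray(signal, dtype=np.float64)
    n_segments = len(signal) // segment_len
    return signal[: n_segments * segment_len].reshape(n_segments, segment_len)


def segment_energy(segments):
    """Energy of each segment: the sum of squared sample values."""
    return np.sum(segments ** 2, axis=1)


def magnitude_spectrum(segments):
    """Frequency-domain features: magnitude of the real-input FFT (assumption)."""
    return np.abs(np.fft.rfft(segments, axis=1))


def pca_project(features, n_components=2):
    """Project features onto their leading principal components via SVD."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Synthetic stand-in for one recording: white noise, slightly over 18 seconds.
rng = np.random.default_rng(0)
recording = rng.standard_normal(SEGMENT_LEN * 6 + 1000)

segments = split_into_segments(recording)
energy = segment_energy(segments)
spectrum = magnitude_spectrum(segments)
projected = pca_project(spectrum, n_components=2)

print(segments.shape)   # (6, 132300)
print(spectrum.shape)   # (6, 66151)
print(projected.shape)  # (6, 2)
```

With real data, `recording` would instead be read from one of the WAV files, and the spectra of all segments across categories would be stacked before the PCA step.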
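
The accuracy "within an allowable error range (such as ±1)" can be read as the fraction of samples whose predicted ball count differs from the true count by at most the tolerance. The sketch below illustrates that reading; the function name and the example counts are made up for illustration and are not taken from the dataset or its results.

```python
import numpy as np


def accuracy_within_tolerance(true_counts, predicted_counts, tolerance=1):
    """Fraction of predictions whose absolute error is at most `tolerance`."""
    true_counts = np.asarray(true_counts)
    predicted_counts = np.asarray(predicted_counts)
    if true_counts.shape != predicted_counts.shape:
        raise ValueError("true and predicted counts must have the same shape")
    return float(np.mean(np.abs(true_counts - predicted_counts) <= tolerance))


# Hypothetical labels and predictions, for illustration only.
true_counts = [5, 12, 20, 36, 1]
predicted_counts = [5, 13, 18, 35, 1]

exact_accuracy = accuracy_within_tolerance(true_counts, predicted_counts, tolerance=0)
tolerant_accuracy = accuracy_within_tolerance(true_counts, predicted_counts, tolerance=1)

print(exact_accuracy)     # 0.4
print(tolerant_accuracy)  # 0.8
```

Setting `tolerance=0` recovers ordinary exact-match accuracy, so the same helper can report both figures side by side.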

Provider:
Science Data Bank
Created:
2025-05-06