Research Data for FiHi: Fusion of inertial and high-resolution acoustic data for human activity recognition
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13838075
Description
This dataset covers 20 activities performed by 15 participants (aged 20-55), recorded with Wit-motion smart inertial sensors and Double Acoustics guitar pickups. Each participant performed these daily activities in an unrestricted environment; each activity lasted at least 60 seconds and was repeated twice.
Dataset Information
1. Original_data.zip
The data were annotated by manually reviewing the audio clips and assigning the appropriate activity labels. The inertial sensor data were timestamped with the start time of the audio device, as recorded by the experimenter, which ensured correct segmentation and synchronization of the inertial and acoustic data. The total data length for all participants is over 10 hours.
(1) IMU DATA
The inertial data (accelerometer and gyroscope) are sampled at 100 Hz and transmitted via Bluetooth to the host computer. The reviewed and annotated original samples from the 15 participants are stored in separate CSV files. The columns in each CSV file are arranged as follows:
Col 1-3: 3D-acceleration data (g)
Col 4-6: 3D-gyroscope data (°/s)
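The column layout above can be illustrated with a minimal sketch that splits a six-column sample array into accelerometer and gyroscope parts and converts them to SI units. The function name `split_imu_columns` and the conversion step are assumptions added for illustration; the source does not say whether the CSV files contain a header row, so the sketch starts from an array that has already been loaded.

```python
import numpy as np

STANDARD_GRAVITY = 9.80665  # m/s^2 per g (standard gravity)

def split_imu_columns(samples):
    """Split an (N, 6) IMU array into acceleration and angular rate.

    Columns 1-3 are assumed to hold acceleration in g and columns 4-6
    angular rate in deg/s, following the layout described above.
    Returns (acceleration in m/s^2, angular rate in rad/s).
    """
    samples = np.asarray(samples, dtype=float)
    if samples.ndim != 2 or samples.shape[1] < 6:
        raise ValueError("expected an array with at least 6 columns")
    accel_g = samples[:, 0:3]
    gyro_dps = samples[:, 3:6]
    return accel_g * STANDARD_GRAVITY, np.deg2rad(gyro_dps)

# Hypothetical single sample: 1 g on the z axis, 180 deg/s about the x axis.
accel, gyro = split_imu_columns([[0.0, 0.0, 1.0, 180.0, 0.0, 0.0]])
```

Converting to SI units is optional; models trained on the raw g and °/s values do not need it.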
(2) Audio DATA
The acoustic data are sampled at 192 kHz by a Steinberg sound card and transmitted by USB cable to the host computer. These original samples are placed in separate wav files, with each file name containing all the necessary information regarding the contents of the file.
For example, the file name
100801_sitting
breaks down as Participant ID (digits 1-4): 1008, Session ID (digits 5-6): 01, Activity ID: sitting.
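The naming scheme can be decoded programmatically. The following is a minimal sketch, assuming every file name follows the pattern of six digits, an underscore, and an activity name; `RecordingName` and `parse_recording_name` are hypothetical helper names, not part of the dataset.

```python
from pathlib import Path
from typing import NamedTuple

class RecordingName(NamedTuple):
    participant_id: str  # digits 1-4 of the file name
    session_id: str      # digits 5-6 of the file name
    activity: str        # text after the first underscore

def parse_recording_name(filename: str) -> RecordingName:
    """Decode a name such as '100801_sitting' or '100801_sitting.wav'."""
    stem = Path(filename).stem  # drops a trailing extension, if any
    prefix, sep, activity = stem.partition("_")
    if not sep or len(prefix) != 6 or not prefix.isdigit() or not activity:
        raise ValueError(f"unexpected file name: {filename!r}")
    return RecordingName(prefix[:4], prefix[4:], activity)

info = parse_recording_name("100801_sitting.wav")
```

The IDs are kept as strings so that leading zeros in the session ID (for example 01) are preserved.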
2. Processed_data.zip
The last two columns of each file are as follows:
participant_id: a six-digit code such as 100101, 100102, 100201, ... The last two digits are the Session ID, which distinguishes the two sessions recorded by the same participant for the same activity.
activity_id: see the ACTIVITY SET below.
(1) IMU DATA
The original inertial data is individually aligned with the processed Audio data based on their start times, with any excess data rows at the end being trimmed. Then, these inertial data files are augmented with activity_id and participant_id for identification, and consolidated into a single csv file.
The continuous motion signal is segmented into sliding windows, each with a duration of 3 seconds and a step size of 3 seconds. Given the IMU’s sampling rate of 100 Hz, each window of inertial data consists of 300 time steps, with 6 channels of information (3 axes each for accelerometer and gyroscope). Consequently, a single inertial sample is represented by a 300 × 6 dimensional matrix.
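The windowing step can be sketched as follows. This is a minimal illustration, assuming the input is one continuous recording stored as an (N, 6) array; the function name `segment_windows` is hypothetical, and any incomplete window at the end is dropped.

```python
import numpy as np

SAMPLE_RATE_HZ = 100
WINDOW_LEN = 3 * SAMPLE_RATE_HZ  # 3-second window -> 300 time steps
STEP = 3 * SAMPLE_RATE_HZ        # 3-second step -> windows do not overlap

def segment_windows(signal, window_len=WINDOW_LEN, step=STEP):
    """Cut an (N, channels) signal into (n_windows, window_len, channels)."""
    signal = np.asarray(signal)
    if signal.ndim != 2:
        raise ValueError("expected a 2-D array of shape (N, channels)")
    if len(signal) < window_len:
        return np.empty((0, window_len, signal.shape[1]), dtype=signal.dtype)
    n_windows = (len(signal) - window_len) // step + 1
    return np.stack(
        [signal[i * step : i * step + window_len] for i in range(n_windows)]
    )

# Synthetic 10-second recording with 6 channels (not real dataset values).
recording = np.zeros((1000, 6))
windows = segment_windows(recording)  # shape (3, 300, 6); last 100 rows dropped
```

When working from the consolidated CSV file, it is safer to group rows by participant_id and activity_id first and segment each group separately, so that no window spans two recordings.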
(2) Audio DATA
The acoustic signals are processed using the Short-Time Fourier Transform (STFT) with a window length of 1024 points and an overlap of 256 points, which generates n linearly spaced frequency bins over the selected frequency range. The output is an estimate of the short-term, time-localized frequency content. We examine two levels of privacy protection: 8∼96 kHz for non-speech sound and 20∼96 kHz for inaudible sound, with 88 and 76 linearly spaced frequency bins, respectively.
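As a rough illustration of this step, the sketch below computes a magnitude STFT with the stated window length and overlap, then keeps only the bins at or above a cutoff frequency. The Hann window, the function names, and the synthetic test tone are assumptions. The source does not describe how the raw STFT output is reduced to 88 or 76 bins, so the sketch stops at band selection and its bin counts differ from those figures.

```python
import numpy as np

SAMPLE_RATE = 192_000   # Hz, as stated for the original audio
WINDOW = 1024           # STFT window length in samples
OVERLAP = 256           # overlap between consecutive windows
HOP = WINDOW - OVERLAP  # 768 samples between window starts

def stft_magnitude(audio, window=WINDOW, hop=HOP):
    """Return a (n_frames, window // 2 + 1) magnitude spectrogram."""
    audio = np.asarray(audio, dtype=float)
    n_frames = (len(audio) - window) // hop + 1
    if n_frames < 1:
        raise ValueError("signal is shorter than one window")
    frames = np.stack([audio[i * hop : i * hop + window] for i in range(n_frames)])
    frames = frames * np.hanning(window)  # Hann window: an assumption
    return np.abs(np.fft.rfft(frames, axis=1))

def band_mask(low_hz, window=WINDOW, sample_rate=SAMPLE_RATE):
    """Boolean mask selecting STFT bins at or above low_hz (up to Nyquist)."""
    freqs = np.fft.rfftfreq(window, d=1.0 / sample_rate)
    return freqs >= low_hz

# Synthetic one-second 30 kHz tone (illustrative input, not dataset audio).
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
tone = np.sin(2 * np.pi * 30_000 * t)
spec = stft_magnitude(tone)
inaudible = spec[:, band_mask(20_000)]  # keep the 20-96 kHz band only
```

With a 1024-point window at 192 kHz, the raw bins are spaced 187.5 Hz apart, so any reduction to the bin counts quoted above would need an additional aggregation step that is not documented here.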
At the original sample rate of 192 kHz, the hop size is 1024 - 256 = 768 samples, so one second of audio yields (192000 - 256)/(1024 - 256) ≈ 250 feature frames. Each frame has 88 dimensions for the non-speech (8∼96 kHz) feature and 76 dimensions for the inaudible (20∼96 kHz) feature.
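The frame-count arithmetic can be checked with a few lines of code. This sketch assumes no padding at the signal edges; `stft_frame_count` is a hypothetical helper name.

```python
def stft_frame_count(n_samples, window=1024, overlap=256):
    """Number of complete STFT windows that fit in n_samples (no padding)."""
    hop = window - overlap
    if n_samples < window:
        return 0
    return (n_samples - window) // hop + 1

# The expression quoted in the text: (192000 - 256) / (1024 - 256)
fractional_frames = (192_000 - 256) / (1024 - 256)  # about 249.67
whole_frames = stft_frame_count(192_000)            # 249 complete windows
```

The fractional value rounds to the approximate figure of 250 frames per second given above; without padding, only 249 complete windows fit in exactly one second of audio.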
ACTIVITY SET
The activity IDs and their corresponding activities are listed below:
0: use microwave
1: brush teeth
2: browse video
3: drink water
4: fry
5: lie down
6: flush
7: go downstairs
8: go upstairs
9: sit
10: stand
11: manipulate door
12: type
13: jump
14: run
15: walk
16: wash hands
17: write
18: operate light
19: eat
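For convenience when decoding the activity_id column, the list above can be written as a lookup table. This is a sketch; the name `ACTIVITY_NAMES` is an assumption and is not defined by the dataset.

```python
# Mapping from activity_id to activity name, copied from the list above.
ACTIVITY_NAMES = {
    0: "use microwave",
    1: "brush teeth",
    2: "browse video",
    3: "drink water",
    4: "fry",
    5: "lie down",
    6: "flush",
    7: "go downstairs",
    8: "go upstairs",
    9: "sit",
    10: "stand",
    11: "manipulate door",
    12: "type",
    13: "jump",
    14: "run",
    15: "walk",
    16: "wash hands",
    17: "write",
    18: "operate light",
    19: "eat",
}

# Reverse lookup: activity name -> activity_id.
ACTIVITY_IDS = {name: idx for idx, name in ACTIVITY_NAMES.items()}
```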
Created:
2024-09-27



