An audio-seismic dataset for human activity recognition
Source: IEEE, indexed 2026-04-17
Download link:
https://ieee-dataport.org/documents/audio-seismic-dataset-human-activity-recognition
Resource description:
The test-bed for data collection comprises one ⟨audio-seismic⟩ sensor pair. A Samsung Galaxy M31s is used as the audio recorder, with a default sampling rate of 22050 Hz. The audio recorder senses not only the targeted activities but also other environmental sounds, such as birds, vehicles, and construction sites. These other environmental sounds are referred to as background noise (i.e., no activity).
Similarly, any off-the-shelf seismic sensor can be used to sense the ground vibration generated by the movement of a target. We use the SparkFun SM-24 Geophone, a 1-axis (vertical) element used for seismic exploration. It converts ground vibrations to a voltage that any off-the-shelf microcontroller can read. The SM-24 Geophone has a sensitivity of 28.2 V/(m/s) with a 2.5 percent error margin, and it operates over a wide temperature range (-40 to +100 degrees Celsius). Like the audio recorder, the seismic sensor also senses activities that do not belong to the targeted activities. We label the vibrations generated by non-targeted activities as background noise (i.e., no activity). We set the sampling rate of the seismic sensor to 1000 Hz, as used in the literature.
The SM-24 Geophone is connected to a laptop via an Analog Discovery 2, a multifunction instrument that allows users to measure, visualize, generate, record, and control signals. It features a 24-bit analog-to-digital converter that provides high resolution for detecting the slightest difference in measured values, and it can be powered either over USB or by a 5-volt external power supply.
The proposed dataset for human activity recognition consists of raw audio and seismic signals collected over a period of 10 non-consecutive days.
This period was adequate for collecting the desired number of samples per activity class and for incorporating weather variation (i.e., sunny, drizzle, and overcast) into the collected data. The data is collected under nine categories: run, jog, walk, jumping jacks, jump, hammer strike, cycling, riding a bike, and background noise (other sounds).
We selected three distinct data collection locations to incorporate variation in the signals: a hockey field and two distinct trails (with and without grass). Area-1 is a uniform surface covered with short grass, whereas Area-2 is an uneven surface with variable grass length. Area-3 is a dry, rough mud trail without any grass. In the accompanying figure, the white arrow on the ground denotes the data collection path and the black circle denotes the location of the hardware setup. We also vary the time of day of data collection to introduce variation in background noise.
All activities were carried out along the data collection path, which is approximately 10-21 meters long and 2 meters wide. The sensors were placed on the ground near the midpoint of the path, so the distance between the activity and the sensors was not constant. For example, while running along the path, each step falls at a different distance from the midpoint, i.e., the sensor location. A human target goes to one of the two ends of the path (chosen at random) and performs a particular activity (such as running, jogging, walking, cycling, or riding a bike), finishing at the opposite end. The same process is then repeated from the other end. The activities jump, jumping jacks, and hammer strike are point activities, and they are performed at different distances from the sensors to introduce variation. On any given day, the sensors could be located on either side of the path, which introduces further variation in the collected data.
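The changing source-to-sensor distance described above can be illustrated with a minimal sketch. It assumes a straight path, a sensor placed at a lateral offset beside the path's midpoint, and evenly spaced footsteps. The function name and the values used (a 20 m path, a 1 m offset, 21 steps) are hypothetical and are not taken from the dataset.

```python
import math

def step_distances(path_length_m, sensor_offset_m, num_steps):
    """Distance from each footstep to a sensor beside the path midpoint.

    Assumption: the path runs along the x-axis from 0 to path_length_m and
    the sensor sits at (path_length_m / 2, sensor_offset_m). Footsteps are
    evenly spaced, which is a simplification of real gait.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    sensor_x = path_length_m / 2.0
    spacing = path_length_m / (num_steps - 1)
    return [
        math.hypot(i * spacing - sensor_x, sensor_offset_m)
        for i in range(num_steps)
    ]

if __name__ == "__main__":
    # Hypothetical geometry: 20 m path, sensor 1 m to the side, 21 footsteps.
    d = step_distances(path_length_m=20.0, sensor_offset_m=1.0, num_steps=21)
    print(f"nearest: {min(d):.2f} m, farthest: {max(d):.2f} m")
```

Under these assumptions the footstep nearest the midpoint is one offset away from the sensor, while the steps at either end of the path are roughly half the path length away, so signal amplitude can be expected to vary within a single traversal.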
The collected dataset has a total duration of 5 hours and 20 minutes. To the best of our knowledge, there exists no dataset which consists of both audio and seismic signals for the same set of activities in a synchronized manner.
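As a rough check on scale, the stated sampling rates and total duration imply the sample counts sketched below. This assumes that each modality covers the full 5 hours and 20 minutes and that the two streams share a common time origin. The function names and the 2-second window are hypothetical; the dataset's actual file layout and segmentation are not described here.

```python
AUDIO_RATE_HZ = 22050    # audio sampling rate stated in the description
SEISMIC_RATE_HZ = 1000   # seismic sampling rate stated in the description

def total_samples(hours, minutes, rate_hz):
    """Number of samples in a recording of the given duration at rate_hz."""
    seconds = hours * 3600 + minutes * 60
    return seconds * rate_hz

def window_indices(start_s, duration_s, rate_hz):
    """Half-open [start, stop) sample range covering a time window.

    Assumption: sample 0 of each stream corresponds to time 0, so the same
    (start_s, duration_s) selects time-aligned segments in both modalities.
    """
    start = round(start_s * rate_hz)
    stop = start + round(duration_s * rate_hz)
    return start, stop

if __name__ == "__main__":
    print("audio samples:  ", total_samples(5, 20, AUDIO_RATE_HZ))
    print("seismic samples:", total_samples(5, 20, SEISMIC_RATE_HZ))
    # Hypothetical 2-second window starting 10 seconds into a recording.
    print("audio window:   ", window_indices(10.0, 2.0, AUDIO_RATE_HZ))
    print("seismic window: ", window_indices(10.0, 2.0, SEISMIC_RATE_HZ))
```

Because the audio stream has 22.05 times as many samples per second as the seismic stream, paired windows are most easily defined in seconds and converted to per-modality indices, as done here.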
Provided by:
Kumari, Pratibha; Choudhary, Priyankar



