Wallhack1.8k Dataset | Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition
收藏NIAID Data Ecosystem2026-05-02 收录
下载链接:
https://zenodo.org/record/8188998
下载链接
链接失效反馈官方服务:
资源简介:
This repository contains the Wallhack1.8k dataset for WiFi-based long-range activity recognition in Line-of-Sight (LoS) and Non-Line-of-Sight (NLoS)/Through-Wall scenarios, as proposed in [1,2], as well as the CAD models (of 3D-printable parts) of the WiFi systems proposed in [2].
PyTroch Dataloader
A minimal PyTorch dataloader for the Wallhack1.8k dataset is provided at: https://github.com/StrohmayerJ/wallhack1.8k
Dataset Description
The Wallhack1.8k dataset comprises 1,806 CSI amplitude spectrograms (and raw WiFi packet time series) corresponding to three activity classes: "no presence," "walking," and "walking + arm-waving." WiFi packets were transmitted at a frequency of 100 Hz, and each spectrogram captures a temporal context of approximately 4 seconds (400 WiFi packets).
To assess cross-scenario and cross-system generalization, WiFi packet sequences were collected in LoS and through-wall (NLoS) scenarios, utilizing two different WiFi systems (BQ: biquad antenna and PIFA: printed inverted-F antenna). The dataset is structured accordingly:
LOS/BQ/ <- WiFi packets collected in the LoS scenario using the BQ system
LOS/PIFA/ <- WiFi packets collected in the LoS scenario using the PIFA system
NLOS/BQ/ <- WiFi packets collected in the NLoS scenario using the BQ system
NLOS/PIFA/ <- WiFi packets collected in the NLoS scenario using the PIFA system
These directories contain the raw WiFi packet time series (see Table 1). Each row represents a single WiFi packet with the complex CSI vector H being stored in the "data" field and the class label being stored in the "class" field. H is of the form [I, R, I, R, ..., I, R], where two consecutive entries represent imaginary and real parts of complex numbers (the Channel Frequency Responses of subcarriers). Taking the absolute value of H (e.g., via numpy.abs(H)) yields the subcarrier amplitudes A.
To extract the 52 L-LTF subcarriers used in [1], the following indices of A are to be selected:
# 52 L-LTF subcarriers
csi_valid_subcarrier_index = []
csi_valid_subcarrier_index += [i for i in range(6, 32)]
csi_valid_subcarrier_index += [i for i in range(33, 59)]
Additional 56 HT-LTF subcarriers can be selected via:
# 56 HT-LTF subcarriers
csi_valid_subcarrier_index += [i for i in range(66, 94)]
csi_valid_subcarrier_index += [i for i in range(95, 123)]
For more details on subcarrier selection, see ESP-IDF (Section Wi-Fi Channel State Information) and esp-csi.
Extracted amplitude spectrograms with the corresponding label files of the train/validation/test split: "trainLabels.csv," "validationLabels.csv," and "testLabels.csv," can be found in the spectrograms/ directory.
The columns in the label files correspond to the following: [Spectrogram index, Class label, Room label]
Spectrogram index: [0, ..., n]
Class label: [0,1,2], where 0 = "no presence", 1 = "walking", and 2 = "walking + arm-waving."
Room label: [0,1,2,3,4,5], where labels 1-5 correspond to the room number in the NLoS scenario (see Fig. 3 in [1]). The label 0 corresponds to no room and is used for the "no presence" class.
Dataset Overview:
Table 1: Raw WiFi packet sequences.
Scenario
System
"no presence" / label 0
"walking" / label 1
"walking + arm-waving" / label 2
Total
LoS
BQ
b1.csv
w1.csv, w2.csv, w3.csv, w4.csv and w5.csv
ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv
LoS
PIFA
b1.csv
w1.csv, w2.csv, w3.csv, w4.csv and w5.csv
ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv
NLoS
BQ
b1.csv
w1.csv, w2.csv, w3.csv, w4.csv and w5.csv
ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv
NLoS
PIFA
b1.csv
w1.csv, w2.csv, w3.csv, w4.csv and w5.csv
ww1.csv, ww2.csv, ww3.csv, ww4.csv and ww5.csv
4
20
20
44
Table 2: Sample/Spectrogram distribution across activity classes in Wallhack1.8k.
Scenario
System
"no presence" / label 0
"walking" / label 1
"walking + arm-waving" / label 2
Total
LoS
BQ
149
154
155
LoS
PIFA
149
160
152
NLoS
BQ
148
150
152
NLoS
PIFA
143
147
147
589
611
606
1,806
Download and UseThis data may be used for non-commercial research purposes only. If you publish material based on this data, we request that you include a reference to one of our papers [1,2].
[1] Strohmayer, Julian, and Martin Kampel. (2024). “Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition”, In IFIP International Conference on Artificial Intelligence Applications and Innovations (pp. 42-56). Cham: Springer Nature Switzerland, doi: https://doi.org/10.1007/978-3-031-63211-2_4.
[2] Strohmayer, Julian, and Martin Kampel., “Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition,” 2024 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 2024, pp. 3594-3599, doi: https://doi.org/10.1109/ICIP51287.2024.10647666.
BibTeX citations:
@inproceedings{strohmayer2024data,
title={Data Augmentation Techniques for Cross-Domain WiFi CSI-Based Human Activity Recognition},
author={Strohmayer, Julian and Kampel, Martin},
booktitle={IFIP International Conference on Artificial Intelligence Applications and Innovations},
pages={42--56},
year={2024},
organization={Springer}}@INPROCEEDINGS{10647666, author={Strohmayer, Julian and Kampel, Martin}, booktitle={2024 IEEE International Conference on Image Processing (ICIP)}, title={Directional Antenna Systems for Long-Range Through-Wall Human Activity Recognition}, year={2024}, volume={}, number={}, pages={3594-3599}, keywords={Visualization;Accuracy;System performance;Directional antennas;Directive antennas;Reflector antennas;Sensors;Human Activity Recognition;WiFi;Channel State Information;Through-Wall Sensing;ESP32}, doi={10.1109/ICIP51287.2024.10647666}}
创建时间:
2025-04-04



