
A dataset of labelled device Wi-Fi probe requests for MAC address de-randomization - 2024

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://data.mendeley.com/datasets/5tvnwhsj2p
Description:
A Wi-Fi client device can perform an active scan to speed up the connection process by transmitting Probe Request messages periodically. These are management frames of the IEEE 802.11 standard. The process of capturing these messages is called sniffing and can be performed using a Wi-Fi interface set in monitor mode and tuned to the same channel (or an adjacent channel) where the transmission occurred. Since these messages are not encrypted, they can be used to implement device counting algorithms based on MAC address analysis.

However, to prevent tracking of device owners, major operating system producers have developed MAC address randomization functionalities. Devices that periodically and randomly change their physical address pose a challenge to counting algorithms, which must then perform additional steps to cluster probe requests according to the source device through analysis of appropriate message features.

Our dataset is divided into two parts:

- Anechoic Chamber Data Collection: Data was collected from 22 devices simultaneously in a controlled environment (anechoic chamber) to ensure the absence of external interference. All devices kept the Wi-Fi interface active and the display switched off. Data was collected only on channel 6 for 30 minutes. This data is stored in the "Anechoic chamber" folder and the "Anechoic chamber - info.xlsx" file contains device information.
- Individual Device Data Collection: Data was collected from 18 individual devices on three channels simultaneously and in six different modes, including settings based on display status, Wi-Fi connection, and power saving. Collecting data from individual devices allows for labelling them and associating them with their emitting source. The data was collected in "noisy" environments (a chamber without particular shielding but devoid of other probe request sources within a two-meter radius). Data is filtered to simulate the anechoic chamber environment. Capture files last 30 minutes and cover three non-overlapping channels (1, 6, and 11) simultaneously. This data is stored in the "Individual devices" folder and the "Individual devices - info.xlsx" file contains device information.

We collected a total of 215 non-empty files, removing captures that were empty after filtering. The capture device used is a Raspberry Pi with three Wi-Fi dongle interfaces, each assigned to collect data from a specific channel.

The main characteristic of this dataset is the subdivision by device, allowing for a more accurate analysis of individual device behaviour in different modes. Additionally, the labelled data can be used to train Machine Learning algorithms or to verify the correct functioning of algorithms aimed at counting devices through probe request analysis in the presence of random MAC addresses.

Note: In version 2, we updated the folder of device L, as it was not correctly filtered.
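
The description explains that counting devices by MAC address breaks down once addresses are randomized. The sketch below is not part of the dataset or the authors' tooling; it illustrates the problem. It relies on a general property of IEEE 802 addresses: a randomized MAC is locally administered, so the 0x02 bit of its first octet is set. The function names and the sample addresses are hypothetical, and the sketch assumes the source addresses have already been extracted from the capture files by some other tool.

```python
def parse_mac(mac: str) -> bytes:
    """Parse 'aa:bb:cc:dd:ee:ff' (or '-' separated) into 6 raw bytes."""
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return bytes(int(p, 16) for p in parts)


def is_locally_administered(mac: str) -> bool:
    """True if the U/L bit (0x02 of the first octet) is set.

    Randomized addresses set this bit; globally unique,
    vendor-assigned addresses leave it cleared.
    """
    return bool(parse_mac(mac)[0] & 0x02)


def naive_device_count(source_macs):
    """Count distinct source addresses, split by the U/L bit.

    The 'global' count is a plausible device count. The 'local' count
    is only an upper bound: one device may emit many random addresses.
    """
    unique = {parse_mac(m) for m in source_macs}
    local = sum(1 for m in unique if m[0] & 0x02)
    return {"global": len(unique) - local, "local": local}


# Hypothetical source addresses, as if read from probe requests.
observed = [
    "3c:5a:b4:12:34:56",  # first octet 0x3c: U/L bit clear (global)
    "da:a1:19:00:00:01",  # first octet 0xda: U/L bit set (local)
    "da:a1:19:00:00:01",  # repeat of the same address
    "7e:11:22:33:44:55",  # first octet 0x7e: U/L bit set (local)
    "02:00:00:aa:bb:cc",  # first octet 0x02: U/L bit set (local)
]

print(naive_device_count(observed))
```

The three locally administered addresses in this sample could belong to one device or to three. Resolving that ambiguity is the clustering step the description refers to, and the per-device labels in the "Individual devices" folder give a ground truth for checking it.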
Created:
2025-07-29