
CRAWDAD microsoft/osdi2006

Source: DataCite Commons. Updated: 2022-11-10. Indexed: 2025-04-16.
Download link:
https://ieee-dataport.org/open-access/crawdad-microsoftosdi2006
Description:
Traces of network activity at OSDI 2006.

The authors gathered a detailed trace of network activity at OSDI 2006 to enable analysis of the behavior of a wireless LAN that is (presumably) heavily used.

date/time of measurement start: 2006-11-06
date/time of measurement end: 2006-11-07

collection environment: We gathered traces of wireless traffic at several monitoring nodes distributed across the conference floor and breakout areas. In addition, we gathered traces on the wired switch to which the wireless access points connect.

network configuration: The collection map ([AP-map.png] linked in this page) shows the locations of the APs and sniffers and the channels on which they were operating. There were five APs (AP8, AP9, AP10, AP11, and AP13), each set to one of three channels (1, 6, and 11). We used nine sniffers (S1 - S9) to gather the wireless traffic, each equipped with either one or two 802.11 NICs (labeled A and B) set up for sniffing. Each sniffer and the channel(s) it was set to sniff on are shown in blue in the figure. For instance, S1:6;11 means S1 is sniffing channels 6 and 11 simultaneously. Note that we also had a wired sniffer (S10, not shown in the figure) to gather traffic between the wireless subnet and the wide-area network.

data collection methodology: Each monitor captures all of the 802.11 frames it sees, including:
1. Data frames
2. Management frames (e.g., association, authentication)
3. Control frames (e.g., RTS, CTS, ACK)

For each wireless frame captured at a monitor, we record the following information:
1. Per-frame PHY information, including:
   a. Channel frequency
   b. RSSI
   c. Modulation rate
2. Entire MAC header, with only the source and destination MAC addresses being anonymized as follows:
   a. In real time, the first 3 bytes of the MAC address are copied over as is. The last 3 bytes are replaced with a one-way hash.
   b. Offline, we replace all the 3-byte MAC prefixes that occur fewer than 10 times with a common prefix.
      This ensures k-anonymity, for k=10.
3. The entire IP and TCP/UDP header, with the source and destination IP addresses anonymized as follows:
   a. The IP address is replaced with a one-way hash.
   b. In addition, we record which of the following categories the IP address belongs to:
      i. Auto conf (169.254/16).
      ii. Locally allocated.
      iii. Other.
4. The entire DHCP payload, with the following anonymization:
   a. All IP addresses (e.g., client IP address (ciaddr), your IP address (yiaddr)) are anonymized as in 3.
   b. All MAC addresses (e.g., client hardware address (chaddr)) are anonymized as in 2.
   c. All names (e.g., server name (sname)) are replaced with a one-way hash.
   d. All identifying options (e.g., client identifier) are replaced with a one-way hash.
5. The DNS request/response payload, with the following anonymization/deletion:
   a. The domain name in the question section is replaced with a one-way hash.
   b. The resource records are deleted.

sanitization: We have taken reasonable measures to secure the machines used for tracing: kept them up to date on patches, turned off unnecessary services, protected access with a strong password, etc. We throw away the secret key used for the keyed one-way hash once the trace collection is concluded, to make a dictionary attack on the one-way hash difficult. Packet payload is recorded for DHCP and DNS requests and responses. However, information such as DNS names and IP addresses contained in the payload is anonymized before being stored. Given that the traces are being anonymized, we believe that it would be extremely difficult for anyone to identify users or learn which Internet services or hosts they have communicated with. That said, we are not in a position to prove that no such information can be gleaned from the anonymized traces.

The traces are anonymized on the fly before they are stored on disk. However, certain information, such as the first 3 bytes of the MAC address, may turn out to violate the principle of k-anonymity (described below).
If so, we further anonymize the trace offline before anyone else sees it; this kind of anonymization cannot be done online. Much of the anonymization is performed on the fly, so no one should have access to the non-anonymized data, given that we intend to keep the tracing system as secure as possible. However, some of the anonymization can only be done offline, so the data authors have access to the partially anonymized data during the time it takes to perform the offline anonymization (no more than a few days after the trace collection is concluded).

It may be possible to identify users using a side-channel attack, for instance, by exploiting information such as packet sizes and packet timing; we do not plan to protect the data against such attacks.

Also, we would like to permit the identification of the manufacturer of a wireless NIC (which could be useful when analyzing the traces), so the first 3 bytes of the MAC address are left unanonymized. However, this could violate the principle of k-anonymity, i.e., that it should not be possible to identify any user as being a member of a group with fewer than k members. If a group size is smaller than 10, our offline anonymization replaces this MAC-address prefix with another value so as to create a group of at least 10 nodes (i.e., we set k to 10). So it would be possible to identify the 3-byte prefix of a node's MAC address provided that there are at least 10 nodes that share the same prefix.

limitation: Despite the anonymization, it may be possible for some information to leak. For example, it may be possible to infer which website was visited based on the size of the response received. We are unable to obfuscate such information without damaging the data significantly.
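
The two-stage MAC anonymization and the IP categorization described above can be illustrated with a minimal Python sketch. It is not the tooling used to produce the traces. Several details are assumptions, since the description does not specify them: HMAC-SHA256 as the keyed one-way hash, hashing the full address and truncating the digest to 3 bytes, counting distinct addresses per prefix, the placeholder prefix 00:00:00, and treating "locally allocated" as membership in a caller-supplied subnet. All function and constant names are hypothetical.

```python
import hashlib
import hmac
import ipaddress
import os
from collections import Counter

K = 10                      # group-size threshold stated in the description
COMMON_PREFIX = "00:00:00"  # assumption: placeholder for rare 3-byte prefixes
AUTOCONF_NET = ipaddress.ip_network("169.254.0.0/16")


def anonymize_mac_online(mac: str, key: bytes) -> str:
    """Keep the first 3 bytes; replace the last 3 with a keyed hash.

    Assumption: HMAC-SHA256 over the full address, truncated to 3 bytes.
    """
    octets = mac.lower().split(":")
    if len(octets) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    raw = bytes(int(o, 16) for o in octets)
    digest = hmac.new(key, raw, hashlib.sha256).digest()[:3]
    return ":".join(octets[:3] + [f"{b:02x}" for b in digest])


def enforce_k_anonymity(macs, k=K, common_prefix=COMMON_PREFIX):
    """Offline pass: fold prefixes seen on fewer than k distinct addresses
    into one common prefix. Input addresses are already hashed online."""
    counts = Counter(m[:8] for m in set(macs))  # "aa:bb:cc" is 8 characters
    return [m if counts[m[:8]] >= k else common_prefix + m[8:] for m in macs]


def categorize_ip(ip: str, local_net) -> str:
    """Return the category recorded alongside the hashed IP address.

    Assumption: "locally allocated" means the address lies in local_net.
    """
    addr = ipaddress.ip_address(ip)
    if addr in AUTOCONF_NET:
        return "autoconf"
    if addr in local_net:
        return "local"
    return "other"


if __name__ == "__main__":
    key = os.urandom(32)  # secret key, discarded once collection ends
    hashed = [anonymize_mac_online(f"00:11:22:33:44:{i:02x}", key) for i in range(12)]
    hashed.append(anonymize_mac_online("de:ad:be:ef:00:01", key))
    del key  # without the key, a dictionary attack on the hash is impractical
    for mac in enforce_k_anonymity(hashed):
        print(mac)
    print(categorize_ip("169.254.10.20", ipaddress.ip_network("10.0.0.0/8")))
```

In this sketch the merged group of rare prefixes could itself end up with fewer than 10 members; a real pipeline would need to check that case separately.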
Provider:
IEEE DataPort
Created:
2022-11-10