CRAWDAD umd/sigcomm2008
Source: DataCite Commons. Updated 2022-12-09; indexed 2025-04-16.
Download link:
https://ieee-dataport.org/open-access/crawdad-umdsigcomm2008
Description:
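We collected a trace of wireless network activity at SIGCOMM 2008. The subjects of the traced network chose to participate by joining the traced SSID. The release contains three types of anonymized traces: 802.11a, Ethernet, and Syslog from the access point (AP). We anonymized the trace data using a modified version (http://www.cs.umd.edu/projects/wifidelity/sigcomm08_traces/sigcomm08-tcpmkpub.tar.gz) of the tcpmkpub tool (http://www.icir.org/enterprise-tracing/tcpmkpub.html). The packet traces include anonymized DHCP and DNS headers.

Last modified: 2009-03-25
Release date: 2009-03-02
Measurement start: 2008-08-17
Measurement end: 2008-08-21

Collection environment: Our goal was to gather a detailed trace of network activity at SIGCOMM 2008, both to improve 802.11 tracing techniques as part of the Wifidelity project and to enable analysis of the behavior of a wireless LAN that is (presumably) heavily used. Participants opted in by joining the traced SSID.

Network configuration: A Xirrus Wi-Fi Array (http://www.xirrus.com/products/arrays-80211abg.php) provided the traced 802.11a network (SSID: SIGCOMM-ONLY-Traced). The array broadcast four BSSIDs on four 802.11a channels, with one NAT (Network Address Translation) router. To collect the traces, we deployed eight 802.11a monitors, two per channel. After anonymization, the DHCP-assigned client IP addresses fall in the subnets 26.12.0.0/16 and 26.2.0.0/16.

Data collection methodology: We recorded network protocol information from all wired and wireless packets sent on the wireless network with SSID SIGCOMM-ONLY-Traced. Each packet includes physical-layer information (in the Prism header), such as the wireless signal strength, as well as the 802.11, IP, TCP, UDP, and ICMP headers, depending on the packet type. We did not record packet payloads above the transport layer except for DHCP and DNS payloads, and we anonymized or deleted potentially sensitive information such as MAC and IP addresses and DHCP and DNS headers.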
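Because several fields in these traces are truncated by the capture snaplen, it can help to check a file's snaplen and link-layer type before analysis. The following is a minimal sketch, assuming the released files are classic (microsecond-resolution) pcap files; the function name `read_pcap_header` is hypothetical, and the link-type codes come from the public tcpdump/libpcap registry rather than from this dataset's documentation.

```python
import struct

# Link-layer type codes from the public tcpdump/libpcap registry.
LINKTYPES = {
    1: "Ethernet",
    105: "IEEE 802.11",
    119: "IEEE 802.11 + Prism header",
}


def read_pcap_header(raw: bytes) -> dict:
    """Decode the 24-byte global header of a classic pcap file."""
    if len(raw) < 24:
        raise ValueError("a pcap global header is 24 bytes long")
    magic = raw[:4]
    if magic == b"\xd4\xc3\xb2\xa1":      # written by a little-endian host
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":    # written by a big-endian host
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    # version major/minor, timezone offset, sigfigs, snaplen, link type
    major, minor, _zone, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", raw[4:24]
    )
    return {
        "version": (major, minor),
        "snaplen": snaplen,
        "linktype": LINKTYPES.get(linktype, f"unknown ({linktype})"),
    }
```

In use, one would read the first 24 bytes of a trace file in binary mode and pass them to `read_pcap_header`.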
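Sanitization: Users chose to participate in the trace by associating with the SIGCOMM-ONLY-Traced SSID; all other users joined the "Untraced" SSID, SIGCOMM-ONLY-Untraced. The traces contain no data from the "Untraced" SSID. We anonymized the traces to protect the identity and activity of users who opted to be traced during SIGCOMM 2008.
- Filtering 802.11a traces: Each packet in the wireless traces meets one or both of the following criteria: 1. Its BSSID address matches the "traced" BSSID. 2. It is a probe request for the "SIGCOMM-ONLY-Traced" SSID.
- Filtering Ethernet traces: The AP was set up with a monitor VLAN for the "SIGCOMM-ONLY-Traced" network.
- Filtering Syslog traces: The syslog trace contains information only about users associated with the "traced" network. To filter out messages about "Untraced" users, we included all syslog messages logged while a client was associated with the "traced" network. The syslog messages themselves indicate when a client associates with, and disassociates from, the "traced" network.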
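The syslog rule above amounts to tracking an association window per client. The sketch below illustrates that idea only; the `LogEvent` record, its `kind` labels, and the choice to keep the association and disassociation messages themselves are all assumptions, since the actual Array syslog format and filter code are not described here.

```python
from typing import Iterable, Iterator, NamedTuple


class LogEvent(NamedTuple):
    time: float
    client: str   # client identifier, e.g. a MAC address
    kind: str     # hypothetical labels: "assoc_traced", "disassoc_traced", "other"
    text: str


def keep_traced(events: Iterable[LogEvent]) -> Iterator[LogEvent]:
    """Yield only the messages logged while their client is on the traced network."""
    associated = set()
    for ev in events:
        if ev.kind == "assoc_traced":
            associated.add(ev.client)
            yield ev                      # assumption: the join message is kept
        elif ev.kind == "disassoc_traced":
            if ev.client in associated:
                associated.discard(ev.client)
                yield ev                  # assumption: the leave message is kept
        elif ev.client in associated:
            yield ev
```

Messages about a client before it joins, or after it leaves, the traced network are dropped, which matches the stated intent of excluding "Untraced" activity.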
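### umd/sigcomm2008/pcap
PCAP traceset of wireless network measurements at the SIGCOMM 2008 conference.
Description: We collected pcap traces of wireless network activity at SIGCOMM 2008. The subjects of the traced network chose to participate by joining the traced SSID.
Measurement purpose: Network Diagnosis
Methodology:
1. 802.11a: During most of the conference, approximately two 802.11a monitors were placed at each of the four corners of the main conference hall. We did not record the exact location of each monitor, but we tried to capture each channel with two monitors placed at opposite corners of the room.
2. Ethernet: Packets sent from the NAT to the AP and from the AP to the NAT were captured by an Ethernet trace collector attached to the packet dump port on the Wi-Fi Array.
Sanitization: The packets are anonymized using a modified version of the tcpmkpub tool, available from the download link for [sigcomm08-tcpmkpub.tar.gz]. Metadata about the trace anonymization is provided in the file tcpmkpub.log.export.
In the description below, [new] marks functionality added to tcpmkpub, and [tcpmkpub] marks functionality of the original tcpmkpub tool, described in: R. Pang, M. Allman, V. Paxson, and J. Lee. The Devil and Packet Trace Anonymization. SIGCOMM Computer Communication Review, 2006. [Crypto-PAn] marks the prefix-preserving address anonymization scheme used by the original tcpmkpub tool, described in: J. Xu, J. Fan, M. H. Ammar, and S. B. Moon. Prefix-preserving IP address anonymization: measurement-based security evaluation and a new cryptography-based scheme. In Proceedings of the IEEE International Conference on Network Protocols (ICNP), pages 280–289, Nov. 2002.
1. Checksums (IP/UDP/TCP) [tcpmkpub]: The anonymization code recomputes checksums. The anonymization metadata (tcpmkpub.log.export) records which packets in the traces had bad checksums. A bad checksum is indicated in the anonymized traces by a 1 in the checksum field, or by a 2 if the checksum was 1. A UDP checksum of 0 is not changed.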
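The checksum policy above can be illustrated with a short sketch. `internet_checksum` is the standard one's-complement Internet checksum (RFC 1071). `exported_checksum` is a hypothetical helper, not tcpmkpub's code, and it rests on one loudly stated assumption: it reads "2 if the checksum was 1" as "use 2 when the recomputed, correct checksum is itself 1", so that the marker can never be mistaken for a valid value.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by IP, UDP, and TCP."""
    if len(data) % 2:
        data += b"\x00"                        # pad odd-length input
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:                         # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def exported_checksum(original_field: int, original_valid: bool,
                      recomputed: int, is_udp: bool = False) -> int:
    """Value written to the checksum field of the anonymized packet."""
    if is_udp and original_field == 0:
        return 0                               # UDP checksum of 0 is left alone
    if original_valid:
        return recomputed                      # valid before, so valid after
    # Assumption: marker 1 for a bad checksum, or 2 if 1 would look correct.
    return 2 if recomputed == 1 else 1
```

Under this reading, a checksum field of 1 (or 2) that does not verify suggests the original packet carried a bad checksum; the authoritative record is the list in tcpmkpub.log.export.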
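2. Link Layer
A. Ethernet [tcpmkpub]: MAC addresses: the high-order 3 bytes and the low-order 3 bytes are hashed separately; hashing the high-order 3 bytes on their own retains vendor information; addresses consisting of all 1s or all 0s are not changed; the multicast bit is retained.
B. VLAN [new]: The VLAN header did not need to be anonymized.
C. 802.11 [new]: MAC addresses are anonymized using the same method as Ethernet MAC addresses; if the packet is fragmented (fragment bit == 1 or fragment number > 0), the rest of the packet is skipped.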
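The MAC address rules above can be sketched as follows. This is an illustration only: the hash function actually used by tcpmkpub is not stated here, so HMAC-SHA256 and the key `KEY` are stand-in assumptions, and `anonymize_mac` is a hypothetical name.

```python
import hashlib
import hmac

KEY = b"hypothetical-secret-key"   # assumption: any secret kept by the publisher


def _hash3(part: bytes, label: bytes) -> bytes:
    """Map a 3-byte half of the address to 3 pseudorandom bytes."""
    return hmac.new(KEY, label + part, hashlib.sha256).digest()[:3]


def anonymize_mac(mac: bytes) -> bytes:
    if len(mac) != 6:
        raise ValueError("a MAC address is 6 bytes long")
    if mac in (b"\x00" * 6, b"\xff" * 6):
        return mac                              # all 0s / all 1s are unchanged
    high = bytearray(_hash3(mac[:3], b"oui"))   # vendor half, hashed on its own
    low = _hash3(mac[3:], b"nic")               # device half, hashed on its own
    # The multicast bit is the least significant bit of the first byte.
    high[0] = (high[0] & 0xFE) | (mac[0] & 0x01)
    return bytes(high) + low
```

Because the vendor half is hashed independently, two devices from the same vendor still share the same (anonymized) 3-byte prefix, which is what "retains vendor information" means in practice.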
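3. Network Layer
A. IP [tcpmkpub]: External addresses are hashed using a prefix-preserving scheme [Crypto-PAn]; internal addresses are hashed to a prefix unused by the external addresses, and the subnet and host portions of the address are transformed; multicast addresses are not anonymized. The [tcpmkpub] paper recommends removing packets from network scanners; we did not consider this a threat to our network because the identity tied to a local address was dynamic.
B. ARP [tcpmkpub]: If the ARP packet contains a partial IP packet, the IP anonymization above is applied to it; IP addresses are anonymized using the IP anonymization procedure above.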
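To make "prefix-preserving" concrete: two addresses that share their first k bits before anonymization share exactly their first k bits afterwards. The sketch below follows the general construction of Crypto-PAn (each output bit is the input bit XORed with a keyed function of the preceding bits) but substitutes HMAC-SHA256 for the cipher used by the real scheme, so it will not reproduce the addresses in these traces. The key and function names are hypothetical, and the separate remapping of internal addresses is not modeled.

```python
import hashlib
import hmac
import ipaddress

KEY = b"hypothetical-secret-key"   # assumption: stand-in for the real key


def _flip_bit(prefix: int, length: int) -> int:
    """Keyed pseudorandom bit that depends only on the first `length` bits."""
    msg = bytes([length]) + prefix.to_bytes(4, "big")
    return hmac.new(KEY, msg, hashlib.sha256).digest()[0] & 1


def anonymize_ipv4(text: str) -> str:
    ip = ipaddress.IPv4Address(text)
    if ip.is_multicast:
        return text                        # multicast addresses are left as-is
    addr, out = int(ip), 0
    for i in range(32):                    # walk from the most significant bit
        prefix = addr >> (32 - i)          # the i bits already processed
        bit = (addr >> (31 - i)) & 1
        out = (out << 1) | (bit ^ _flip_bit(prefix, i))
    return str(ipaddress.IPv4Address(out))


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()
```

For example, 192.0.2.3 and 192.0.2.200 share 24 leading bits, and so do their anonymized forms under this sketch, which is why subnet structure remains analyzable in the published traces.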
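4. Transport Layer
A. TCP [tcpmkpub]: The TCP timestamp options are transformed into separate monotonically increasing counters, one for each IP address in the anonymized trace, with no relationship to time; a timestamp of 0 is not modified; every other timestamp is replaced with a unique number incremented in the order of the trace.
B. UDP [tcpmkpub]: The checksum is recomputed according to the checksum policy above.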
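The TCP timestamp transformation can be pictured as one counter per IP address. The sketch below is one plausible reading, not tcpmkpub's code: it assumes that a repeated timestamp value from the same address maps to the same counter value (so that echoed timestamps remain matchable), and the class name `TimestampRewriter` is hypothetical.

```python
from collections import defaultdict


class TimestampRewriter:
    """Replace TCP timestamp values with per-address counters."""

    def __init__(self) -> None:
        # For each IP address: original timestamp value -> counter value.
        self._seen = defaultdict(dict)

    def rewrite(self, ip: str, timestamp: int) -> int:
        if timestamp == 0:
            return 0                       # a timestamp of 0 is not modified
        mapping = self._seen[ip]
        if timestamp not in mapping:
            # Next counter value, assigned in the order the trace is read.
            mapping[timestamp] = len(mapping) + 1
        return mapping[timestamp]
```

The ordering of timestamps from one host survives, but the clock rate and absolute time, which could help fingerprint a host, do not.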
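5. Application Layer
A. DNS [new]: Each DNS label is anonymized individually by taking the keyed HMAC of the label; the low-order 8 bytes of the hash digest are kept as the label; the digest is converted to ASCII by hex encoding; the new length of the DNS packet is stored in the following fields: [IP/UDP/DNS, PCAP Captured, PCAP On Wire]; the data of any type 'A' resource record is anonymized using the IP anonymization scheme above. DNS packets may be cut off because of the snaplen at capture.
B. DHCP [new]: The client IP address is anonymized; the client hardware address is anonymized; "your" IP address (yiaddr) is anonymized. The rest of each DHCP packet was cut off by the snaplen at capture.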
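The DNS label steps above can be sketched in a few lines. The hash underlying the HMAC is not stated here, so SHA-256 is an assumption, as are the key and the reading of "low-order 8 bytes" as the last 8 bytes of the digest; `anonymize_label` and `anonymize_name` are hypothetical names.

```python
import hashlib
import hmac

KEY = b"hypothetical-secret-key"   # assumption: stand-in for the real key


def anonymize_label(label: bytes) -> bytes:
    """Keyed HMAC of one label, truncated to 8 bytes and hex-encoded."""
    digest = hmac.new(KEY, label, hashlib.sha256).digest()
    return digest[-8:].hex().encode("ascii")   # 8 bytes -> 16 ASCII characters


def anonymize_name(name: str) -> str:
    """Anonymize each label of a dotted domain name independently."""
    labels = [part for part in name.split(".") if part]
    return ".".join(anonymize_label(p.encode("ascii")).decode("ascii")
                    for p in labels)
```

Every anonymized label is 16 characters long regardless of the original, which is why the IP, UDP, DNS, and pcap length fields have to be rewritten. Because labels are hashed one at a time, names that share a suffix (for example two hosts under the same domain) still share that suffix after anonymization.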
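### umd/sigcomm2008/pcap Traces
802.11a: PCAP traces of wireless network measurements collected from the wireless side at the SIGCOMM 2008 conference.
Configuration: During most of the conference, approximately two 802.11a monitors were placed at each of the four corners of the main conference hall. We did not record the exact location of each monitor, but we tried to capture each channel with two monitors placed at opposite corners of the room. The network addresses are laid out as follows: Users: 26.12.*.*, 26.2.*.*; Network Management: 26.6.*.*.
Format: sigcomm08_wl_(monitor #)_(first packet time)_(last packet time)_(bssid)_(channel).pcap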
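When working with many trace files, the naming pattern above can be split into its fields. The sketch below assumes that no individual field contains an underscore and makes no assumption about how times or BSSIDs are written; the function name is hypothetical, and any file name used to try it out should be treated as made up unless it comes from the dataset itself.

```python
import re

# One named group per field of the documented pattern.
WL_NAME = re.compile(
    r"^sigcomm08_wl_(?P<monitor>[^_]+)_(?P<first_packet_time>[^_]+)"
    r"_(?P<last_packet_time>[^_]+)_(?P<bssid>[^_]+)_(?P<channel>[^_]+)\.pcap$"
)


def parse_wireless_name(name: str) -> dict:
    """Split an 802.11a trace file name into its five fields (as strings)."""
    match = WL_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return match.groupdict()
```

Grouping the parsed records by `channel` and `monitor` gives a quick view of which monitor covered which channel over time.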
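Ethernet: PCAP traces of wireless network measurements collected from the Ethernet side at the SIGCOMM 2008 conference.
Configuration: Packets sent from the NAT to the AP and from the AP to the NAT were captured by an Ethernet trace collector attached to the packet dump port on the Wi-Fi Array. The network addresses are laid out as follows: Users: 26.12.*.*, 26.2.*.*; Network Management: 26.6.*.*.
Format: sigcomm08_eth_(first packet time)_(last packet time).pcap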
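The anonymized address plan listed in the configuration can be used to label each address seen in the traces. A minimal sketch, assuming the wildcard notation 26.x.*.* corresponds to /16 networks; the role names and the function `classify` are hypothetical.

```python
import ipaddress

# Anonymized address plan from the trace description.
ROLES = {
    "user": [ipaddress.ip_network("26.12.0.0/16"),
             ipaddress.ip_network("26.2.0.0/16")],
    "management": [ipaddress.ip_network("26.6.0.0/16")],
}


def classify(address: str) -> str:
    """Return 'user', 'management', or 'other' for an anonymized IPv4 address."""
    ip = ipaddress.ip_address(address)
    for role, networks in ROLES.items():
        if any(ip in net for net in networks):
            return role
    return "other"          # e.g. anonymized external hosts
```

Addresses outside these ranges are, under this reading, anonymized external hosts or unanonymized multicast addresses.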
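Anonymization log: The anonymization log of the wireless network traces from the SIGCOMM 2008 conference.
Configuration: tcpmkpub anonymization log for the traces 'umd/sigcomm2008/pcap/802.11a' and 'umd/sigcomm2008/pcap/Ethernet', plus MD5 checksums for the trace files.
Format: The anonymization log file is named 'tcpmkpub.log.export'.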
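Since the log carries MD5 checksums for the trace files, a downloaded file can be checked against its published value. The layout of the checksum entries inside tcpmkpub.log.export is not described here, so this sketch only computes and compares a digest; how the expected value is extracted from the log is left to the reader, and both function names are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex MD5 digest of a file, read in chunks to keep memory use small."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path: str, expected_hex: str) -> bool:
    """Compare a file's MD5 with a value copied from the anonymization log."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a guarantee against deliberate tampering.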
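### umd/sigcomm2008/syslog
Syslog traceset of wireless network measurements at the SIGCOMM 2008 conference.
Description: We collected syslog traces of wireless network activity at SIGCOMM 2008. The subjects of the traced network chose to participate by joining the traced SSID.
Measurement purpose: Network Diagnosis
Methodology: A tracing box connected to the Array's management port collected syslog traces. Unfortunately, after the conference we noticed that these traces were corrupted. However, we were able to salvage one of the syslog traces because we had also collected it with the Ethernet tracing box.
Sanitization: macmkpub, a MAC address anonymizer based on the tcpmkpub anonymization code, anonymized the MAC addresses in the syslog traces. Metadata about the trace anonymization is provided in the file 'tcpmkpub.log.export'.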
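### umd/sigcomm2008/syslog Traces
Ethernet: Syslog traces of wireless network measurements at the SIGCOMM 2008 conference.
Configuration: We collected syslog traces with the Ethernet tracing box. The network addresses are laid out as follows: Users: 26.12.*.*, 26.2.*.*; Network Management: 26.6.*.*.
Format: sigcomm08_syslog_(first log time)_(last log time)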
Provider:
IEEE DataPort
Created:
2022-12-09



