Host Network Traffic 2019
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/3799931
Description:
Dataset Summary
Timespan: 2019-01-01 : 2019-12-31
Granularity: 1-hour disjoint time windows
# of characteristics observed: 9
Hosts observed: 65536
Labels: included
Unzipped volume: approx. 10 GB
Dataset Origins
The dataset was collected over the whole of 2019. The observation points for collecting IP flows were located at the borders of the university campus network. The campus network has a /16 IPv4 CIDR range at its disposal and contains a variety of segments, ranging from segments connecting dormitories, through server segments, to a segment containing the workstations of university administrative staff. A host in the dataset is identified by its source IPv4 address.
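The figures in the summary can be cross-checked with a little arithmetic. The sketch below uses only the Python standard library; the 10.0.0.0/16 range is a hypothetical stand-in, since the real addresses are anonymized.

```python
import ipaddress
from datetime import datetime, timedelta

# Hypothetical /16 range; the dataset's real addresses are anonymized.
network = ipaddress.ip_network("10.0.0.0/16")
n_hosts = network.num_addresses  # 2 ** (32 - 16)

# Number of disjoint 1-hour windows in 2019 (not a leap year).
n_windows = (datetime(2020, 1, 1) - datetime(2019, 1, 1)) // timedelta(hours=1)

print(n_hosts, n_windows)
```

A /16 range holds 65536 addresses, which matches the number of hosts observed, and a non-leap year holds 8760 one-hour windows.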
Variables
The dataset contains the following variables:
Aggregations - sums of the individual variables over a one-hour interval (flow duration is averaged rather than summed):
# of flows - number of flows for a given source IP
# of packets - number of packets for a given source IP
# of bytes - number of bytes for a given source IP
flow duration - average flow duration in seconds
Distinct Counts - count of distinct values for each variable over a one-hour window
# of peers - number of distinct communication peers for a given source IP
# of ports - number of distinct destination ports for a given source IP
# of protocols - number of distinct communication protocols for a given source IP
# of AS numbers - number of distinct destination AS numbers for a given source IP
# of countries - number of distinct destination countries for a given source IP
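As an illustration of what these variables mean, the following sketch derives several of them from a handful of made-up flow records using pandas. It is not the authors' pipeline: the flow table, its column names, and the choice to assign a flow to a window by its start time are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical flow records; column names and values are illustrative only.
flows = pd.DataFrame({
    "start": pd.to_datetime([
        "2019-01-01 00:05", "2019-01-01 00:40", "2019-01-01 01:10",
    ]),
    "src_ip": ["a", "a", "a"],
    "dst_ip": ["x", "y", "x"],
    "dst_port": [443, 443, 53],
    "packets": [10, 4, 2],
    "bytes": [1500, 600, 120],
    "duration": [2.0, 4.0, 1.0],
})

# Assumption: a flow belongs to the one-hour window containing its start time.
flows["window"] = flows["start"].dt.floor("60min")

hourly = flows.groupby(["window", "src_ip"]).agg(
    n_flows=("dst_ip", "size"),          # of flows
    n_packets=("packets", "sum"),        # of packets
    n_bytes=("bytes", "sum"),            # of bytes
    avg_duration=("duration", "mean"),   # average flow duration in seconds
    n_peers=("dst_ip", "nunique"),       # distinct communication peers
    n_ports=("dst_port", "nunique"),     # distinct destination ports
)

# One table per variable: rows are windows, columns are source IPs.
n_flows_table = hourly["n_flows"].unstack("src_ip")
print(n_flows_table)
```

Unstacking one variable at a time yields the same layout as the dataset files: time windows as rows and source IPs as columns.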
Dataset Structure
Dataset Files - each variable is contained in one comma-separated values (.csv) file
Row index - timestamp of the observation window (8760 rows)
Columns index - anonymized IP addresses (65536 columns)
Label File - contains labels of the individual IP addresses from the Dataset Files
Row index - anonymized IP addresses (65536 rows)
Columns index - labels for the IP addresses
Subnet - ID of a subnet; hosts belonging to the same subnet have the same ID
Subnet_range - CIDR range of a subnet
Unit - ID of the administrative unit owning the network range
Sub-unit - ID of the administrative sub-unit owning the network range
Subnet_label - subnet label
Servers - selected subnets containing mostly servers (133.250.178.0/24, 133.250.163.0/24)
Workstations - selected subnets containing mostly workstations (133.250.146.0/24, 133.250.157.128/25)
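Because the Label File is indexed by the same anonymized IP addresses that form the columns of the Dataset Files, the two can be combined directly. The sketch below uses tiny synthetic frames in place of the real files; the host names are made up, and the exact spelling of the label columns in the real file is assumed to match the names listed above.

```python
import pandas as pd

# Tiny stand-in for one Dataset File: rows are windows, columns are hosts.
data = pd.DataFrame(
    [[5.0, 1.0, 7.0], [2.0, None, 3.0]],
    index=pd.to_datetime(["2019-01-01 00:00", "2019-01-01 01:00"]),
    columns=["ip_a", "ip_b", "ip_c"],
)

# Tiny stand-in for the Label File: rows are hosts, columns are labels.
labels = pd.DataFrame(
    {"Subnet": [1, 2, 1], "Subnet_label": ["Servers", "Workstations", "Servers"]},
    index=["ip_a", "ip_b", "ip_c"],
)

# Select the columns of hosts whose subnet is labelled "Servers".
server_ips = labels.index[labels["Subnet_label"] == "Servers"]
servers = data[server_ips]

# Per-subnet totals: group hosts (columns) by their Subnet ID.
per_subnet = data.T.groupby(labels["Subnet"]).sum().T
print(per_subnet)
```

Transposing before grouping avoids column-wise `groupby`, which recent pandas versions deprecate.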
Further notes
N/A values
Variables - the host did not communicate in the given observation window
Labels - no additional information on this IP is available
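Since an N/A value in a variable file marks a window without communication, missingness itself carries information. The following sketch, on a small synthetic frame, shows one way to turn it into an activity mask. Filling with zero is only an assumption that may suit count-like variables; it is not something the dataset description prescribes.

```python
import pandas as pd

# Synthetic stand-in for one Dataset File; host names are made up.
data = pd.DataFrame(
    [[5.0, None], [None, None], [3.0, 2.0]],
    index=pd.to_datetime(
        ["2019-01-01 00:00", "2019-01-01 01:00", "2019-01-01 02:00"]
    ),
    columns=["ip_a", "ip_b"],
)

# True where the host communicated in that window.
active = data.notna()

# Fraction of windows in which each host was active.
activity_ratio = active.mean()

# Assumption: for count-like variables, silence can be treated as zero.
counts_filled = data.fillna(0)
print(activity_ratio)
```

For the averaged flow duration, leaving the N/A in place is likely the safer choice, because a zero would read as a real duration.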
Dataset load
import pandas as pd
df = pd.read_csv(filename, header=[0], index_col=[0])  # filename: path to one variable's .csv file
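To see what that call returns, the sketch below applies it to a small in-memory CSV shaped like a Dataset File. The file contents, host names, and timestamp format are invented for the example; the real files may format their timestamps differently, in which case the `pd.to_datetime` step would need adjusting.

```python
import io
import pandas as pd

# Synthetic stand-in for one Dataset File; real files have 8760 rows
# and 65536 columns. The timestamp format here is an assumption.
csv_text = """,ip_a,ip_b
2019-01-01 00:00:00,5,
2019-01-01 01:00:00,2,3
"""

df = pd.read_csv(io.StringIO(csv_text), header=[0], index_col=[0])

# Convert the row index from strings to timestamps for time-based slicing.
df.index = pd.to_datetime(df.index)
print(df)
```

The empty field in the first data row is read as N/A, which is how a window without communication appears after loading.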
Created:
2020-05-13



