DoH -- Real-World
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/5956043
Description:
Please refer to the original data article for a further description of the data: Jeřábek & Hynek et al., Collection of datasets with DNS over HTTPS traffic. In: Data in Brief, DOI: 10.1016/j.dib.2022.108310
The collection of datasets contains DoH and HTTPS traffic captured in a real, large ISP network. The data are provided as PCAP files. However, because the real captures had to be anonymized, we also provide TLS-enriched flow data generated with the open-source ipfixprobe flow exporter. Information other than the TLS-related fields is not relevant, since the dataset comprises only encrypted TLS traffic. The TLS-enriched flow data are provided as CSV files with the following columns:
Column Name   | Column Description
DST_IP        | Destination IP address
SRC_IP        | Source IP address
BYTES         | Number of bytes transmitted from source to destination
BYTES_REV     | Number of bytes transmitted from destination to source
TIME_FIRST    | Timestamp of the first packet in the flow, in the format YYYY-MM-DDTHH-MM-SS
TIME_LAST     | Timestamp of the last packet in the flow, in the format YYYY-MM-DDTHH-MM-SS
PACKETS       | Number of packets transmitted from source to destination
PACKETS_REV   | Number of packets transmitted from destination to source
DST_PORT      | Destination port
SRC_PORT      | Source port
PROTOCOL      | Transport protocol number
TCP_FLAGS     | Logical OR across all TCP flags in the packets transmitted from source to destination
TCP_FLAGS_REV | Logical OR across all TCP flags in the packets transmitted from destination to source
TLS_ALPN      | Value of the Application-Layer Protocol Negotiation (ALPN) extension sent by the server
TLS_JA3       | The JA3 fingerprint
TLS_SNI       | Value of the Server Name Indication (SNI) extension sent by the client
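To make the TCP_FLAGS and timestamp columns concrete, the following is a minimal sketch of how a single flow record might be interpreted. It assumes that the flag columns are stored as integers using the standard TCP header bit values and that timestamps match the format quoted in the table exactly; the helper names `decode_tcp_flags` and `flow_duration_seconds` are hypothetical and are not part of the dataset or of ipfixprobe.

```python
from datetime import datetime

# Standard TCP header flag bits, lowest bit first.
TCP_FLAG_BITS = [
    (0x01, "FIN"),
    (0x02, "SYN"),
    (0x04, "RST"),
    (0x08, "PSH"),
    (0x10, "ACK"),
    (0x20, "URG"),
    (0x40, "ECE"),
    (0x80, "CWR"),
]

# Assumption: timestamps follow the format quoted in the column table.
TIME_FORMAT = "%Y-%m-%dT%H-%M-%S"


def decode_tcp_flags(value):
    """Return the names of all flags set in an OR-ed TCP flag value."""
    return [name for bit, name in TCP_FLAG_BITS if value & bit]


def flow_duration_seconds(time_first, time_last):
    """Return TIME_LAST minus TIME_FIRST in seconds."""
    first = datetime.strptime(time_first, TIME_FORMAT)
    last = datetime.strptime(time_last, TIME_FORMAT)
    return (last - first).total_seconds()
```

Because the column holds an OR across the whole flow, a value such as 27 (0x1B) only says that FIN, SYN, PSH and ACK each appeared at least once in that direction; it does not preserve their order or count. If the CSV files store timestamps with a different separator or with fractional seconds, `TIME_FORMAT` would need to be adjusted.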
The DoH resolvers in the dataset can be identified by the IP addresses listed in the doh_resolver_ip.csv file.
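One way to use the resolver list is to label each flow as DoH or regular HTTPS by looking up its destination address. The sketch below is illustrative only: it assumes that doh_resolver_ip.csv holds one IP address in the first field of each row and that the flow CSV is comma-separated with a header row using the column names above. The function names and the sample rows (documentation-range addresses) are made up for the example.

```python
import csv
import io


def load_resolver_ips(lines):
    """Collect resolver IPs, assuming one address in the first CSV field per row."""
    ips = set()
    for row in csv.reader(lines):
        if row and row[0].strip():
            ips.add(row[0].strip())
    return ips


def label_flows(flow_lines, resolver_ips):
    """Yield (row, is_doh) pairs; a flow counts as DoH when DST_IP is a resolver."""
    for row in csv.DictReader(flow_lines):
        yield row, row["DST_IP"] in resolver_ips


# Synthetic stand-ins for the real files, for illustration only.
resolvers = io.StringIO("192.0.2.53\n")
flows = io.StringIO(
    "DST_IP,SRC_IP,DST_PORT\n"
    "192.0.2.53,198.51.100.7,443\n"
    "203.0.113.9,198.51.100.7,443\n"
)
labels = [is_doh for _, is_doh in label_flows(flows, load_resolver_ips(resolvers))]
```

With real files, the `io.StringIO` objects would be replaced by open file handles. A header row in the resolver file, if present, would simply end up in the set as a string that never matches an address.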
The main part of the dataset is located in DoH-Real-World.tar.gz and has the following structure:
.
└── data                     | - Main directory with data
    └── captured             | - Directory with data captured on ISP backbone lines
        ├── pcap             | - ISP backbone PCAPs
        └── tls-flow-csv     | - ISP backbone CSV flow data
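After extracting the archive, the layout above can be walked programmatically. The following sketch assumes the archive unpacks to a `data/captured` directory directly under the chosen root, as the tree suggests; the function name `list_capture_files` is hypothetical, and the file names and extensions inside each directory are not specified here, so every regular file is collected.

```python
from pathlib import Path


def list_capture_files(root):
    """Return the files found under data/captured/pcap and data/captured/tls-flow-csv."""
    captured = Path(root) / "data" / "captured"
    found = {}
    for name in ("pcap", "tls-flow-csv"):
        directory = captured / name
        if directory.is_dir():
            found[name] = sorted(p for p in directory.rglob("*") if p.is_file())
        else:
            # Missing directory: report an empty list instead of failing.
            found[name] = []
    return found
```

Calling `list_capture_files("/path/to/extracted")` returns a dictionary with one sorted list of paths per subdirectory, which can then be fed to a PCAP reader or a CSV parser.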
Dataset collection statistics:
Name                                        | Value
Total Data Size                             | 179 GB
Total Time                                  | ~10 Days
Connections                                 | ~420 M
Number of unique Client IP addresses        | 116,263
Number of unique Server IP addresses        | 9,343
Number of unique DoH Resolver IP addresses  | 142
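For a rough sense of scale, the figures above can be turned into average rates. This is back-of-the-envelope arithmetic on the approximate values in the table ("~420 M" connections, "~10 Days"), so the results are order-of-magnitude estimates rather than measured quantities; the variable names are chosen for this example.

```python
# Approximate figures copied from the statistics table.
CONNECTIONS = 420_000_000   # "~420 M"
CAPTURE_DAYS = 10           # "~10 Days"
UNIQUE_CLIENTS = 116_263

SECONDS_PER_DAY = 24 * 60 * 60

# Average number of new connections per second over the whole capture.
connections_per_second = CONNECTIONS / (CAPTURE_DAYS * SECONDS_PER_DAY)

# Average number of connections per unique client IP address.
connections_per_client = CONNECTIONS / UNIQUE_CLIENTS

print(f"{connections_per_second:.0f} connections/s on average")
print(f"{connections_per_client:.0f} connections per client on average")
```

Both numbers are averages over the whole capture and say nothing about peaks or the distribution across clients.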
Please cite the original article:
@article{Jerabek2022,
title = {Collection of datasets with DNS over HTTPS traffic},
journal = {Data in Brief},
volume = {42},
pages = {108310},
year = {2022},
issn = {2352-3409},
doi = {https://doi.org/10.1016/j.dib.2022.108310},
url = {https://www.sciencedirect.com/science/article/pii/S2352340922005121},
author = {Kamil Jeřábek and Karel Hynek and Tomáš Čejka and Ondřej Ryšavý}
}
Created:
2022-06-03



