Dataset of Short-Term, High-Concurrency Web Access Traffic Logs for a Public Website
National Basic Science Data Center, indexed 2026-01-30
Download link:
https://nbsdc.cn/general/dataDetail?id=674b692b195d2661e1ba41fd&type=1
Resource description:
This dataset is derived from publicly available raw observational data from the Internet. It contains all 1,352,804,107 requests made to the 1998 World Cup website between April 30, 1998 and July 26, 1998. The main data fields are: 1) a unique integer identifier of the client that issued the request; 2) an epoch timestamp in seconds; 3) the requested URL and the result of the request; 4) the server that handled the request. The raw data covers a range of traffic load patterns: slow increase, slow decrease, sharp increase, sharp decrease, and sustained fluctuation. These patterns are scattered widely across the trace, however. To let experiments show how well an algorithm handles every one of them, the dataset extracts the traffic segments in which these patterns occur, then scales and concatenates them into one new continuous traffic trace that covers both the typical and the extreme cases in a single experimental run. The final synthesized data file is random-100max.
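The extract, scale, and concatenate steps described above can be illustrated with a minimal Python sketch. It is not the provider's actual procedure: the `Request` record layout, the per-segment linear scaling rule, the window indices, and the `target_max=100` default are all assumptions made for illustration, and the source does not specify how segments were chosen or scaled.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Request:
    """Hypothetical in-memory form of one log record (field names are assumed)."""
    client_id: int  # unique integer identifier of the client
    timestamp: int  # epoch time in seconds
    url: str        # requested URL
    status: int     # result of the request (assumed here to be a status code)
    server: str     # server that handled the request


def requests_per_second(requests: Iterable[Request]) -> List[int]:
    """Count requests in each one-second bucket, filling idle seconds with 0."""
    counts = Counter(r.timestamp for r in requests)
    if not counts:
        return []
    first, last = min(counts), max(counts)
    return [counts[t] for t in range(first, last + 1)]


def scale_segment(counts: Sequence[int], target_max: int) -> List[int]:
    """Linearly rescale a segment so that its peak equals target_max (assumed rule)."""
    if not counts:
        return []
    peak = max(counts)
    if peak == 0:
        return [0] * len(counts)
    return [round(c * target_max / peak) for c in counts]


def build_composite(
    series: Sequence[int],
    windows: Sequence[Tuple[int, int]],
    target_max: int = 100,
) -> List[int]:
    """Cut [start, end) windows out of a load series, scale each, and join them."""
    composite: List[int] = []
    for start, end in windows:
        composite.extend(scale_segment(series[start:end], target_max))
    return composite
```

For example, `build_composite(load, [(1, 4), (6, 8)])` would splice two hand-picked windows of a per-second load series into one continuous trace. Scaling each segment to a common peak is only one possible choice; scaling by a single global factor would preserve the relative heights of the segments instead.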
Provider:
Sun Yat-sen University
Collected summary
Dataset introduction

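Background and challenges
Background overview
This dataset contains logs of about 1.35 billion requests made to the 1998 World Cup website from April 30 to July 26, 1998, covering key fields such as client identifier, timestamp, URL, and server. The raw data was extracted, scaled, and concatenated into a continuous traffic dataset (random-100max) that is intended to reflect a range of typical and extreme traffic load patterns in support of algorithm experiments.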
The above content was collected and summarized by 遇见数据集.



