Detecting Degradation of Web Browsing Quality of Experience
Source: DataCite Commons. Updated 2020-11-02; indexed 2024-07-28.
Download link:
https://figshare.com/articles/dataset/Detecting_Degradation_of_Web_Browsing_Quality_of_Experience/13089854
Description:
This dataset contains 222k samples of web browsing session measurements collected over 2.5 months using the Web View platform (https://webview.orange.com) [1]. Web View allows different probes to automatically execute multiple web sessions in a real end-user environment. In our test campaign, we use 17 machines spread across three locations worldwide (Lannion, Paris and Mauritius), with different ISPs and access technologies (ADSL, WiFi and fiber) for a total of 9 combinations, and up to 12 browser versions, including various versions of Chrome and Firefox. Each machine can request a different browser viewport, can enable or disable the AdBlock plugin to emulate different user preferences, and can request a specific network protocol (HTTP/1, HTTP/2 or QUIC).

We leverage this dataset to frame QoE degradation detection as a change point detection problem in [2]. Our results, beyond showing feasibility, warn against relying exclusively on QoE indicators that are very close to content, as changes in the content space can lead to false alarms that are not tied to network-related problems.

If you use this dataset in your research, please reference the appropriate papers:

[1] A. Saverimoutou, B. Mathieu, and S. Vaton, “Web View: A measurement platform for depicting web browsing performance and delivery,” IEEE Communications Magazine, vol. 58, no. 3, pp. 33–39, 2020.
[2] A. Huet, Z. Ben Houidi, B. Mathieu, and D. Rossi, “Detecting degradation of web browsing quality of experience,” 16th International Conference on Network and Service Management (CNSM), 2020.

Each row represents one experiment, and the columns are as follows:
- wwwName: Target page
- timestamp: Timestamp with format YYYY-MM-DD hh:mm:ss
- browserUsed: Internet browser and version
- requestedProtocol: Requested L7 protocol
- adBlocker: Whether adBlocker is used or not
- networkIface: Network interface
- winSize: Window size
- visiblePortion: Visible portion of the page that is above the fold (percent)
- h1Share: Share of the traffic coming from HTTP/1 (percent)
- h2Share: Share of the traffic coming from HTTP/2 (percent)
- hqShare: Share of the traffic coming from QUIC (percent)
- pushShare: Share of the traffic coming from HTTP/2 Server Push (percent)
- nbRes: Number of objects of the page
- nbResNA: Number of objects coming from North America
- nbResSA: Number of objects coming from South America
- nbResEU: Number of objects coming from Europe
- nbResAS: Number of objects coming from Asia
- nbResAF: Number of objects coming from Africa
- nbResOC: Number of objects coming from Oceania
- nbResUKN: Number of objects of unknown provenance
- nbHTTPS: Number of objects coming from an HTTPS connection
- nbHTTP: Number of objects coming from an HTTP connection
- nbDomNA: Number of different domain names coming from North America
- nbDomSA: Number of different domain names coming from South America
- nbDomEU: Number of different domain names coming from Europe
- nbDomAS: Number of different domain names coming from Asia
- nbDomAF: Number of different domain names coming from Africa
- nbDomOC: Number of different domain names coming from Oceania
- firstPaint: First paint time (ms)
- tfvr: Time for Full Visual Rendering (ms)
- dom: DOM time (ms)
- plt: Page Load Time (ms)
- machine: Machine name (containing location information)
- categoryType: Category of the web page
- pageSize: Total web page size (bytes)
- receiveTime: Total receive time from HAR (ms)
- transferRate: Transfer rate (bps)
- id: Unique identification of the current experiment
- config: Identification for the tuple (browserUsed, requestedProtocol, adBlocker, networkIface, winSize, machine, wwwName), i.e. the probe configuration with target wwwName
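
As an illustration of how the columns above fit together, the following minimal sketch parses rows, converts the timestamp, and groups measurements by the probe configuration tuple. It assumes the data is available as comma-separated text with a header row, which this description does not state. The sample rows, their values, and the helper names `load_rows` and `group_by_config` are invented for the example.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: values and file layout are assumptions,
# only the column names come from the dataset description.
SAMPLE = """wwwName,timestamp,browserUsed,requestedProtocol,adBlocker,networkIface,winSize,machine,plt
example.org,2019-05-01 10:00:00,Chrome 75,HTTP/2,False,wifi,1920x1080,probe-paris-1,1830
example.org,2019-05-01 11:00:00,Chrome 75,HTTP/2,False,wifi,1920x1080,probe-paris-1,2010
"""

# The tuple that the `config` column identifies, as listed in the description.
CONFIG_FIELDS = (
    "browserUsed", "requestedProtocol", "adBlocker",
    "networkIface", "winSize", "machine", "wwwName",
)

def load_rows(text):
    """Parse CSV text, converting the timestamp and plt columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Documented format: YYYY-MM-DD hh:mm:ss
        row["timestamp"] = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
        row["plt"] = float(row["plt"])  # Page Load Time in ms
        row["config_key"] = tuple(row[f] for f in CONFIG_FIELDS)
        rows.append(row)
    return rows

def group_by_config(rows):
    """Return {config_key: [(timestamp, plt), ...]} sorted by time."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["config_key"]].append((row["timestamp"], row["plt"]))
    for series in groups.values():
        series.sort()
    return dict(groups)

rows = load_rows(SAMPLE)
groups = group_by_config(rows)
```

Grouping by the configuration tuple keeps each time series tied to one probe setup and one target page, so measurements from different browsers or access technologies are not mixed.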
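
To make the change point framing mentioned in the description concrete, here is a minimal sketch of single change point detection on one metric series such as plt. It uses a least-squares mean-shift split, a generic textbook technique chosen for illustration; it is not claimed to be the method used in [2]. The function name `best_mean_shift`, the `min_size` parameter, and the synthetic series are all assumptions.

```python
from statistics import fmean

def sse(xs):
    """Sum of squared deviations from the mean of xs."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs)

def best_mean_shift(series, min_size=5):
    """Find the split index that most reduces the squared error.

    Returns (index, gain), where series[:index] and series[index:] are
    the two segments, or (None, 0.0) if the series is too short or no
    split improves on a single mean.
    """
    n = len(series)
    if n < 2 * min_size:
        return None, 0.0
    total = sse(series)
    best_idx, best_gain = None, 0.0
    for k in range(min_size, n - min_size + 1):
        gain = total - sse(series[:k]) - sse(series[k:])
        if gain > best_gain:
            best_idx, best_gain = k, gain
    return best_idx, best_gain

# Synthetic plt-like series (ms): a level shift from about 1000 to about 1500
# after 20 samples, with a small deterministic jitter. Not real data.
jitter = [(i % 3 - 1) * 10 for i in range(40)]
plt_series = [(1000 if i < 20 else 1500) + jitter[i] for i in range(40)]

idx, gain = best_mean_shift(plt_series)
```

In line with the warning in the description, a shift found in plt alone is ambiguous: running the same check on content-related columns such as pageSize or nbRes for the same config could help tell a content change apart from a network-related degradation.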
Provider: figshare
Created: 2020-11-02



