Detecting Degradation of Web Browsing Quality of Experience

DataCite Commons | Updated 2025-06-01 | Indexed 2024-07-28
Download link:
https://figshare.com/articles/dataset/Detecting_Degradation_of_Web_Browsing_Quality_of_Experience/13089854/1
Description:
This dataset contains 222k samples of web browsing session measurements collected over 2.5 months using the Web View platform (https://webview.orange.com) [1]. Web View allows different probes to automatically execute multiple web sessions in a real end-user environment. In our test campaign, we use 17 machines spread across three locations worldwide (Lannion, Paris and Mauritius), with different ISPs and access technologies (ADSL, WiFi and fiber) for a total of 9 combinations, and up to 12 browser versions, which include various versions of Chrome and Firefox. Each machine can request a different browser viewport, can enable or disable the AdBlock plugin to emulate different user preferences, and can request a specific network protocol (HTTP/1, HTTP/2 or QUIC).

We leverage this dataset to phrase the QoE degradation detection issue as a change point detection problem in [2]. Our results, beyond showing feasibility, warn against the exclusive use of QoE indicators that are very close to content, as changes in the content space can lead to false alarms that are not tied to network-related problems.

If you use this dataset in your research, you can reference the appropriate papers:

[1] A. Saverimoutou, B. Mathieu, and S. Vaton, “Web View: A measurement platform for depicting web browsing performance and delivery,” IEEE Communications Magazine, vol. 58, no. 3, pp. 33–39, 2020.
[2] A. Huet, Z. Ben Houidi, B. Mathieu, and D. Rossi, “Detecting degradation of web browsing quality of experience,” 16th International Conference on Network and Service Management (CNSM), 2020.

Each row represents one experiment, and the columns are as follows:
- wwwName: Target page
- timestamp: Timestamp with format YYYY-MM-DD hh:mm:ss
- browserUsed: Internet browser and version
- requestedProtocol: Requested L7 protocol
- adBlocker: Whether adBlocker is used or not
- networkIface: Network interface
- winSize: Window size
- visiblePortion: Visible portion of the page that is above the fold (percent)
- h1Share: Share of the traffic coming from HTTP/1 (percent)
- h2Share: Share of the traffic coming from HTTP/2 (percent)
- hqShare: Share of the traffic coming from QUIC (percent)
- pushShare: Share of the traffic coming from HTTP/2 Server Push (percent)
- nbRes: Number of objects of the page
- nbResNA: Number of objects coming from North America
- nbResSA: Number of objects coming from South America
- nbResEU: Number of objects coming from Europe
- nbResAS: Number of objects coming from Asia
- nbResAF: Number of objects coming from Africa
- nbResOC: Number of objects coming from Oceania
- nbResUKN: Number of objects coming from unknown provenance
- nbHTTPS: Number of objects coming from an HTTPS connection
- nbHTTP: Number of objects coming from an HTTP connection
- nbDomNA: Number of different domain names coming from North America
- nbDomSA: Number of different domain names coming from South America
- nbDomEU: Number of different domain names coming from Europe
- nbDomAS: Number of different domain names coming from Asia
- nbDomAF: Number of different domain names coming from Africa
- nbDomOC: Number of different domain names coming from Oceania
- firstPaint: First paint time (ms)
- tfvr: Time for Full Visual Rendering (ms)
- dom: DOM time (ms)
- plt: Page Load Time (ms)
- machine: Machine name (containing location information)
- categoryType: Category of the web page
- pageSize: Total web page size (bytes)
- receiveTime: Total receive time from HAR (ms)
- transferRate: Transfer rate (bps)
- id: Unique identification of the current experiment
- config: Identification for the tuple (browserUsed, requestedProtocol, adBlocker, networkIface, winSize, machine, wwwName), i.e. the probe configuration with target wwwName
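
To make the column layout and the change point framing more concrete, here is a minimal sketch in Python. It assumes the data is available as CSV text with a header row using the column names above; the actual file name and format of the figshare download are not stated here, so the sample rows, the `cfg-a` identifier and all values are hypothetical. The split search is a naive least-squares mean-shift detector chosen for illustration only; it is not claimed to be the method used in [2].

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: column names follow the dataset description,
# but every value below is made up for illustration.
SAMPLE_CSV = """wwwName,timestamp,config,plt
example.org,2020-01-01 00:00:00,cfg-a,1000
example.org,2020-01-01 01:00:00,cfg-a,1040
example.org,2020-01-01 02:00:00,cfg-a,980
example.org,2020-01-01 03:00:00,cfg-a,1020
example.org,2020-01-01 04:00:00,cfg-a,2000
example.org,2020-01-01 05:00:00,cfg-a,2050
example.org,2020-01-01 06:00:00,cfg-a,1990
example.org,2020-01-01 07:00:00,cfg-a,2010
"""

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # YYYY-MM-DD hh:mm:ss


def load_plt_series(text):
    """Return {config: [(timestamp, plt_ms), ...]} sorted by timestamp."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        ts = datetime.strptime(row["timestamp"], TIMESTAMP_FORMAT)
        series[row["config"]].append((ts, float(row["plt"])))
    for points in series.values():
        points.sort(key=lambda p: p[0])
    return dict(series)


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(values, min_size=2):
    """Index k minimizing SSE(values[:k]) + SSE(values[k:]); None if too short."""
    if len(values) < 2 * min_size:
        return None
    candidates = range(min_size, len(values) - min_size + 1)
    return min(candidates, key=lambda k: sse(values[:k]) + sse(values[k:]))


series = load_plt_series(SAMPLE_CSV)
plt_values = [plt for _, plt in series["cfg-a"]]
split = best_split(plt_values)          # index of the first sample after the shift
change_time = series["cfg-a"][split][0]  # timestamp of that sample
```

Grouping by `config` keeps each probe configuration and target page in its own time series, so a shift in `plt` is not confused with a difference between probes. Note that `best_split` always returns some index for a long enough series, even when nothing changed; a practical detector would add a significance threshold before raising an alarm.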

Provider:
figshare
Created:
2020-11-02