Borg Traces dataset
Source: DataCite Commons. Updated 2024-07-16; indexed 2024-08-19.
Download link:
https://figshare.com/articles/dataset/Borg_Traces_dataset/26308690
Description:
ClusterData 2019 traces. John Wilkes.
The clusterdata-2019 trace dataset provides information about eight different Borg cells for the month of May 2019. It includes the following new information: CPU usage histograms for each 5-minute period, not just a point sample; information about alloc sets (shared resource reservations used by jobs); and job-parent information for master/worker relationships such as MapReduce jobs.
The 2019 traces focus on resource requests and usage, and contain no information about end users, their data, or access patterns to storage systems and other services.
Because of its size (about 2.4 TiB compressed), the trace data is available only via Google BigQuery, so that sophisticated analyses can be performed without requiring local resources.
The clusterdata-2019 traces are described in this document: Google cluster-usage traces v3. You can find the download and access instructions there, as well as many more details about what is in the traces and how to interpret them. For additional background information, please refer to the 2015 Borg paper, Large-scale cluster management at Google with Borg.
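To illustrate why a per-window histogram carries more information than a single point sample, here is a minimal Python sketch. It does not use the trace's actual schema or BigQuery; the function name summarize_windows, the (timestamp, usage) sample format, and the bin edges are all assumptions made for illustration only.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute window, matching the period named in the description


def summarize_windows(samples, bin_edges):
    """Group (timestamp_seconds, cpu_usage) samples into 5-minute windows.

    Hypothetical helper: returns, per window start, the first sample
    (a "point sample") and a histogram of all samples in that window.
    """
    by_window = defaultdict(list)
    for ts, usage in samples:
        window_start = (ts // WINDOW_SECONDS) * WINDOW_SECONDS
        by_window[window_start].append(usage)

    summary = {}
    for window_start, usages in sorted(by_window.items()):
        counts = [0] * (len(bin_edges) - 1)
        for u in usages:
            # Find the last bin whose lower edge is <= u; values outside
            # the edges are clamped into the first or last bin.
            idx = 0
            for i in range(len(bin_edges) - 1):
                if u >= bin_edges[i]:
                    idx = i
            counts[idx] += 1
        summary[window_start] = {"point_sample": usages[0], "histogram": counts}
    return summary


# Made-up samples: (seconds since start, normalized CPU usage)
samples = [(0, 0.10), (60, 0.15), (120, 0.90), (299, 0.20), (300, 0.50), (450, 0.55)]
result = summarize_windows(samples, bin_edges=[0.0, 0.25, 0.5, 0.75, 1.0])
for start, info in result.items():
    print(start, info)
```

In the first window the point sample (0.10) hides a brief spike to 0.90, while the histogram records one sample in the top bin. That is the kind of detail a per-window distribution preserves and a single sample can miss.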
Provider:
figshare
Created:
2024-07-16
Collected summary
Dataset introduction

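Background and challenges
Background overview
The Borg Traces dataset contains resource-usage traces from eight Borg cells for May 2019. It notably provides CPU usage histograms, resource reservation (alloc set) information, and job relationship information, making it suitable for analyzing large-scale cluster management. The data is accessible only through Google BigQuery and excludes user-privacy-related data.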
The summary above was collected and generated by 遇见数据集.



