Spark application traces of the runs of 9 different instances of BigDataBench applications

Mendeley Data | Updated 2024-03-27 | Indexed 2024-06-27

Download link:

https://zenodo.org/record/2555074

Description:

Spark application logs of the runs of 9 different instances of BigDataBench applications. The runs used input sizes of 32, 64, and 128 GB with 4, 8, and 16 executors respectively. There were 2 executors per node, so the runs used 2, 4, and 8 nodes, plus a master node for Hadoop services.

The input data were generated with the data generator provided by BigDataBench, using this procedure: https://gitlab.inria.fr/mmercier/bebida/blob/master/experiments/generate_dataset/journal.md

List of the applications and their parameters:
Grep: parameter: "word"
WordCount: no parameters
Kmean: parameter: "4 3"
Input size in GB: "32, 64, 128"

The BigDataBench implementation can be downloaded here: http://prof.ict.ac.cn/download.html

The experiments were run on Debian 8 with Spark 2.1.0 on top of Hadoop 2.7.1 with YARN and HDFS, using openjdk-7-jre-headless.

All the details of the environment can be found here: https://gitlab.inria.fr/mmercier/bebida/blob/master/environments/bebida-slave.yaml
The experiment itself is described here: https://gitlab.inria.fr/mmercier/bebida/tree/master/experiments/run_big_data_workload
Hardware description of the nodes: https://public-api.grid5000.fr/stable/sites/nancy/clusters/graphene/nodes.json?pretty=1

The logs can be visualized using the Spark History Server.
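
The relation between input size, executor count, and node count in the description can be sketched as a small run matrix. This is only an illustration: the constant and function names are hypothetical, and reading the 9 instances as 3 applications times 3 input sizes is an assumption that the description does not state explicitly.

```python
from itertools import product

EXECUTORS_PER_NODE = 2  # stated in the description
MASTER_NODES = 1        # one master node for Hadoop services

# Input size in GB -> executor count, paired "respectively" in the description.
EXECUTORS_BY_INPUT_GB = {32: 4, 64: 8, 128: 16}

# Assumption: the 9 instances are 3 applications x 3 input sizes.
APPLICATIONS = ["Grep", "WordCount", "Kmean"]


def run_matrix():
    """Return one dict per run, with node counts derived from the executor count."""
    runs = []
    for app, (input_gb, executors) in product(APPLICATIONS, EXECUTORS_BY_INPUT_GB.items()):
        worker_nodes = executors // EXECUTORS_PER_NODE
        runs.append({
            "application": app,
            "input_gb": input_gb,
            "executors": executors,
            "worker_nodes": worker_nodes,
            "total_nodes": worker_nodes + MASTER_NODES,
        })
    return runs


if __name__ == "__main__":
    for run in run_matrix():
        print(run)
```

Under these assumptions, each input size maps to 2, 4, or 8 worker nodes, and the master node adds one more machine to every run.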

Created:

2023-06-28
