VEnron2
Source: DataCite Commons. Updated: 2020-09-02. Indexed: 2024-07-25.
Download link:
https://figshare.com/articles/dataset/VEnron2/4798042
Description:
VEnron2 is an industrial-scale, public spreadsheet corpus with version information, comprising 1,609 evolution groups and 12,254 spreadsheets. (Multiple versions that originate from the same spreadsheet are treated as one evolution group.)
VEnron2 is a major improvement over VEnron1.1: using SpreadCluster, we extract many more evolution groups and spreadsheets from the Enron email archive. VEnron2 was published in May 2017 in association with our MSR 2017 paper.
Liang Xu, Wensheng Dou, Chushu Gao, Jie Wang, Jun Wei, Hua Zhong, Tao Huang. SpreadCluster: Recovering Versioned Spreadsheets through Similarity-Based Clustering. In Proceedings of the 14th International Conference on Mining Software Repositories (MSR 2017), May 2017.
Provider:
figshare
Created:
2017-03-29



