Mascinissa/LOOPerSet
Hugging Face, updated 2026-02-16, indexed 2025-10-25
Download link:
https://hf-mirror.com/datasets/Mascinissa/LOOPerSet
Description:
LOOPerSet is a large-scale public dataset for machine-learning-based compiler optimization. It provides labeled performance data for training and evaluating models that predict the effects of code transformations. The dataset contains over 28 million labeled data points derived from approximately 220,000 unique, synthetically generated loop nests. Each data point consists of a program, a specific sequence of loop transformations applied to it (e.g., fusion, tiling, skewing, parallelization), and the resulting ground-truth performance measurement. Transformation sequences were generated with a polyhedral compilation framework, which ensures that they are legal and semantics-preserving. LOOPerSet was originally created to train the cost model for the [LOOPer autoscheduler](https://tbd) (PACT 25). For a full description of the generation process and a diversity analysis, please see our [companion paper on arXiv](https://arxiv.org/abs/xxxx.xxxxx).
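As a rough illustration of the data-point structure described above, the following Python sketch models one record as a program, a transformation sequence, and a measured execution time, then ranks candidate schedules by speedup. The class name `DataPoint`, its field names, the helper functions, and all timing values are hypothetical and chosen only for illustration; they are not the dataset's actual schema, which should be checked against the dataset card.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataPoint:
    """Hypothetical record layout; the real LOOPerSet schema may differ."""
    program_id: str             # identifier of the loop-nest program
    transformations: List[str]  # ordered sequence of loop transformations
    exec_time_s: float          # measured execution time in seconds


def speedup(baseline: DataPoint, candidate: DataPoint) -> float:
    """Speedup of a transformed program relative to the untransformed one."""
    return baseline.exec_time_s / candidate.exec_time_s


def best_schedule(baseline: DataPoint, candidates: List[DataPoint]) -> DataPoint:
    """Return the candidate with the highest speedup over the baseline."""
    return max(candidates, key=lambda c: speedup(baseline, c))


# Made-up timings for a single program, for illustration only.
baseline = DataPoint("prog_0", [], 2.0)
candidates = [
    DataPoint("prog_0", ["tiling"], 1.0),
    DataPoint("prog_0", ["fusion", "parallelization"], 0.5),
]
best = best_schedule(baseline, candidates)

# Figures quoted in the description: about 28 million data points
# over about 220,000 programs, i.e. roughly 127 schedules per program.
avg_points_per_program = 28_000_000 / 220_000
```

A cost model trained on such records would take the program and the transformation sequence as input and predict the measured time or the speedup, so that candidate schedules can be ranked without running them.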
Provider:
Mascinissa



