LLVM IR Dataset
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/Panhaolin2001/Compiler-R1
Description:
This is a consolidated LLVM IR dataset used to train and evaluate the Compiler-R1 model. It has been filtered to include only programs with fewer than 10,000 IR instructions. Training uses six CompilerGym datasets, and evaluation covers seven test suites: BLAS, cbench, CHStone, MiBench, NPB, OpenCV, and TensorFlow. The target task is compiler auto-tuning.
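The size filter described above can be illustrated with a short sketch. This is not the Compiler-R1 authors' filtering code: the function names `count_ir_instructions` and `keep_program` are hypothetical, and the count is a line-based heuristic over textual LLVM IR (it assumes one instruction per line inside a `define` body, so multi-line instructions such as a `switch` with a case table are overcounted).

```python
# Assumption: the 10,000-instruction threshold is taken from the description;
# everything else here is an illustrative heuristic.
MAX_INSTRUCTIONS = 10_000


def count_ir_instructions(ir_text: str) -> int:
    """Approximate the number of instructions in textual LLVM IR."""
    count = 0
    in_function = False
    for raw in ir_text.splitlines():
        # Drop trailing comments, which start with ';' in LLVM IR.
        line = raw.split(";", 1)[0].strip()
        if not line:
            continue
        if line.startswith("define "):
            in_function = True  # function body begins
            continue
        if in_function and line == "}":
            in_function = False  # function body ends
            continue
        # Inside a body, skip basic-block labels such as "entry:".
        if in_function and not line.endswith(":"):
            count += 1
    return count


def keep_program(ir_text: str, limit: int = MAX_INSTRUCTIONS) -> bool:
    """Return True if the program has fewer than `limit` instructions."""
    return count_ir_instructions(ir_text) < limit


SAMPLE = """
define i32 @add(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b   ; add the two arguments
  ret i32 %sum
}
"""

print(count_ir_instructions(SAMPLE))  # prints 2
```

For an exact count, parsing the module with LLVM itself would be more reliable than scanning text; the heuristic is only meant to show the shape of the filter.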



