SmartEval: A Benchmark for Evaluating LLM-Generated Smart Contracts from Natural Language Specifications
Source: DataCite Commons | Updated: 2026-05-06 | Indexed: 2026-05-07
Download link:
https://zenodo.org/doi/10.5281/zenodo.20046036
Description:
SmartEval is a benchmark dataset for evaluating the quality of Solidity smart contracts generated by large language models (LLMs) from natural language specifications. It contains 9,000 LLM-generated contracts paired with expert-written ground-truth implementations drawn from the FSM-SCG dataset, along with five-dimensional quality evaluations, security audit reports, ABI artifacts, and compilation results for each contract.
The five-dimensional rubric covers functional completeness (25%), variable fidelity (15%), state machine correctness (15%), business logic fidelity (35%), and code quality (10%). All composite scores are deterministically recomputed from raw metric values to ensure reproducibility. Across the generated contracts, the average composite score is 81.54 and the compilation success rate is 86.54%.
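As a rough illustration of how a composite score could be recomputed from the rubric weights, the sketch below takes a weighted sum of the five dimension scores. The weights come from the description above; the dictionary keys, the function name `composite_score`, and the assumption that each dimension is scored on a 0-100 scale are hypothetical and may not match the dataset's actual field names or formula.

```python
# Rubric weights in percent, as stated in the dataset description.
# The key names are hypothetical labels, not the dataset's real column names.
WEIGHTS = {
    "functional_completeness": 25,
    "variable_fidelity": 15,
    "state_machine_correctness": 15,
    "business_logic_fidelity": 35,
    "code_quality": 10,
}


def composite_score(scores: dict) -> float:
    """Weighted sum of per-dimension scores (assumed to be on a 0-100 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    # Weights sum to 100, so dividing by 100 keeps the result on the input scale.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Made-up example values, not taken from the dataset.
    example = {
        "functional_completeness": 90,
        "variable_fidelity": 80,
        "state_machine_correctness": 70,
        "business_logic_fidelity": 85,
        "code_quality": 60,
    }
    print(composite_score(example))
```

Because the calculation depends only on the raw dimension values and fixed weights, rerunning it on the same inputs always yields the same composite, which is the reproducibility property the description refers to.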
The full dataset is hosted on Kaggle at: https://www.kaggle.com/datasets/neurips4242/smarteval-llm-generated-smart-contract-benchmark
This Zenodo record provides a persistent DOI and houses the dataset README and Croissant metadata file. The benchmark was validated via three independent empirical studies: a five-condition ablation study (N=300 per condition), human expert evaluation by three PhD researchers (automated scores within 0.34 points of expert judgment), and external security validation via the Slither static analyzer (79.4% category agreement).
Provider:
Zenodo
Created:
2026-05-06



