SmartEval: A Benchmark for Evaluating LLM-Generated Smart Contracts from Natural Language Specifications

DataCite Commons. Updated 2026-05-06. Indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20046037
Description:
SmartEval is a benchmark dataset for evaluating the quality of Solidity smart contracts generated by large language models (LLMs) from natural language specifications. It contains 9,000 LLM-generated contracts paired with expert-written ground-truth implementations drawn from the FSM-SCG dataset, along with five-dimensional quality evaluations, security audit reports, ABI artifacts, and compilation results for each contract.

The five-dimensional rubric covers functional completeness (25%), variable fidelity (15%), state machine correctness (15%), business logic fidelity (35%), and code quality (10%). All composite scores are deterministically recomputed from raw metric values to ensure reproducibility. The average composite score across generated contracts is 81.54, with an 86.54% compilation success rate.

The full dataset is hosted on Kaggle at https://www.kaggle.com/datasets/neurips4242/smarteval-llm-generated-smart-contract-benchmark. This Zenodo record provides a persistent DOI and houses the dataset README and Croissant metadata file.

The benchmark was validated via three independent empirical studies: a five-condition ablation study (N=300 per condition), human expert evaluation by three PhD researchers (automated scores within 0.34 points of expert judgment), and external security validation via the Slither static analyzer (79.4% category agreement).
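
The weighted rubric in the description can be sketched as a simple weighted sum. This is a minimal illustration, not the dataset's actual scoring code: the weights come from the description, but the dictionary keys, the `composite_score` function name, the sample values, and the assumption that each dimension is scored on a 0 to 100 scale are all hypothetical.

```python
# Rubric weights as stated in the dataset description (they sum to 1.0).
# The key names are illustrative, not the dataset's actual field names.
WEIGHTS = {
    "functional_completeness": 0.25,
    "variable_fidelity": 0.15,
    "state_machine_correctness": 0.15,
    "business_logic_fidelity": 0.35,
    "code_quality": 0.10,
}


def composite_score(scores: dict[str, float]) -> float:
    """Return the weighted sum of per-dimension scores.

    Assumes each dimension score is on a 0-100 scale (an assumption;
    the description does not state the raw metric scale).
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up per-dimension scores for a single contract.
example = {
    "functional_completeness": 90.0,
    "variable_fidelity": 80.0,
    "state_machine_correctness": 70.0,
    "business_logic_fidelity": 85.0,
    "code_quality": 60.0,
}

print(round(composite_score(example), 2))
```

Because the function depends only on the fixed weights and the raw per-dimension values, recomputing it always yields the same composite, which is the sense in which the description calls the scores deterministic.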
Provider:
Zenodo
Created:
2026-05-06