fm-universe/FM-bench
Source: Hugging Face. Updated: 2025-06-11. Indexed: 2025-07-05.
Download link:
https://hf-mirror.com/datasets/fm-universe/FM-bench
Description:
FM-Bench is a benchmark for evaluating how well LLMs turn natural-language requirements into verifiable formal proofs. It covers six formal-verification tasks: Requirement Analysis, Proof/Model Generation, Proof Segment Generation, Proof Completion, Proof Infilling, and Code to Proof. The dataset uses five formal specification languages: ACSL, TLA+, Coq, Dafny, and Lean4.
Provider:
fm-universe



