coq-facts-props-proofs
收藏arXiv2024-04-02 更新2024-06-21 收录
下载链接:
https://huggingface.co/datasets/florath/coq-facts-props-proofs-gen0-v1
下载链接
链接失效反馈官方服务:
资源简介:
本数据集名为‘coq-facts-props-proofs’,由德国亚琛的弗洛纳特尔有限公司 & 公司创建,旨在提升大型语言模型(LLMs)在Coq代码理解和生成方面的能力。该数据集包含超过10,000个Coq源文件,涵盖广泛的命题、证明和定义,并附带元数据,如来源引用和许可信息。数据集的创建过程涉及从多个在线源精心收集Coq源文件,并使用定制的OCaml解析器进行预处理,以分离Coq句子、移除注释和指令。该数据集主要应用于促进LLMs在自动定理证明领域的发展,特别是在生成语法正确和语义有意义的Coq构造方面。
This dataset named "coq-facts-props-proofs" was created by Florenatal GmbH & Co. KG based in Aachen, Germany, with the goal of enhancing the capabilities of Large Language Models (LLMs) in understanding and generating Coq code. This dataset includes over 10,000 Coq source files, covering a broad spectrum of propositions, proofs and definitions, and is accompanied by metadata such as source citations and licensing information. The dataset creation process entails meticulously gathering Coq source files from multiple online sources, followed by preprocessing using a custom OCaml parser to separate Coq sentences, remove comments and directives. This dataset is primarily utilized to promote the development of LLMs in the field of automated theorem proving, particularly for generating syntactically correct and semantically meaningful Coq constructs.
提供机构:
弗洛纳特尔有限公司 & 公司, 亚琛, 德国
创建时间:
2024-03-19



