five

piercemaloney/coqgym_coq_projects_v2

收藏
Hugging Face2024-03-21 更新2024-06-11 收录
下载链接:
https://hf-mirror.com/datasets/piercemaloney/coqgym_coq_projects_v2
下载链接
链接失效反馈
官方服务:
资源简介:
--- dataset_info: features: - name: text dtype: string splits: - name: finmap num_bytes: 745110 num_examples: 3 - name: GeometricAlgebra num_bytes: 2180457 num_examples: 8 - name: bdds num_bytes: 11537326 num_examples: 15 - name: concat num_bytes: 1876052 num_examples: 90 - name: topology num_bytes: 1998914 num_examples: 32 - name: euler_formula num_bytes: 2157257 num_examples: 3 - name: ruler_compass_geometry num_bytes: 531800 num_examples: 39 - name: fcsl_pcm num_bytes: 1911756 num_examples: 12 - name: twoSquare num_bytes: 219009 num_examples: 2 - name: zfc num_bytes: 605621 num_examples: 11 - name: shuffle num_bytes: 87431 num_examples: 6 - name: metalib num_bytes: 190046 num_examples: 11 - name: hardware num_bytes: 185840 num_examples: 25 - name: three_gap num_bytes: 315458 num_examples: 7 - name: coq_ext_lib num_bytes: 311782 num_examples: 47 - name: cheerios num_bytes: 140210 num_examples: 6 - name: regexp num_bytes: 116907 num_examples: 7 - name: coq_library_undecidability num_bytes: 3089099 num_examples: 93 - name: automata num_bytes: 1075636 num_examples: 25 - name: coquelicot num_bytes: 6616788 num_examples: 23 - name: izf num_bytes: 146423 num_examples: 8 - name: lemma_overloading num_bytes: 1075228 num_examples: 27 - name: lin_alg num_bytes: 6760336 num_examples: 68 - name: railroad_crossing num_bytes: 202543 num_examples: 1 - name: idxassoc num_bytes: 139339 num_examples: 2 - name: hoare_tut num_bytes: 39069 num_examples: 3 - name: lesniewski_mereology num_bytes: 123636 num_examples: 2 - name: verdi num_bytes: 12479953 num_examples: 28 - name: additions num_bytes: 147048 num_examples: 20 - name: checker num_bytes: 6689 num_examples: 2 - name: VST num_bytes: 38787573 num_examples: 292 - name: domain_theory num_bytes: 67545 num_examples: 4 - name: propcalc num_bytes: 54422 num_examples: 5 - name: circuits num_bytes: 399265 num_examples: 19 - name: CompCert num_bytes: 14680669 num_examples: 142 - name: area_method num_bytes: 1578799 num_examples: 38 - name: bbv num_bytes: 705745 num_examples: 13 - name: ails num_bytes: 521926 num_examples: 11 - name: dep_map num_bytes: 24951 num_examples: 2 - name: ChargeCore num_bytes: 273913 num_examples: 20 - name: markov num_bytes: 102887 num_examples: 1 - name: rsa num_bytes: 180155 num_examples: 5 - name: verdi_raft num_bytes: 21320585 num_examples: 107 - name: goedel num_bytes: 9110441 num_examples: 44 - name: bigenough num_bytes: 5031 num_examples: 1 - name: generic_environments num_bytes: 142529 num_examples: 2 - name: disel num_bytes: 4787193 num_examples: 37 - name: ctltctl num_bytes: 61038 num_examples: 3 - name: coq_list_string num_bytes: 6409 num_examples: 3 - name: QuickChick num_bytes: 718709 num_examples: 17 - name: schroeder num_bytes: 24958 num_examples: 4 - name: lazy_pcf num_bytes: 497224 num_examples: 13 - name: weak_up_to num_bytes: 175365 num_examples: 10 - name: groups num_bytes: 11115 num_examples: 1 - name: pocklington num_bytes: 617483 num_examples: 13 - name: mini_compiler num_bytes: 14459 num_examples: 1 - name: StructTact num_bytes: 293887 num_examples: 17 - name: exceptions num_bytes: 8962 num_examples: 1 - name: coqrel num_bytes: 196290 num_examples: 12 - name: higman_s num_bytes: 198793 num_examples: 5 - name: bellantonicook num_bytes: 3852234 num_examples: 16 - name: rem num_bytes: 14504 num_examples: 1 - name: tree_automata num_bytes: 1972575 num_examples: 17 - name: coq_procrastination num_bytes: 26763 num_examples: 1 - name: higman_cf num_bytes: 49720 num_examples: 2 - name: GeoCoq num_bytes: 12090435 num_examples: 328 - name: coqoban num_bytes: 38285 num_examples: 1 - name: search_trees num_bytes: 59332 num_examples: 5 - name: system num_bytes: 2053 num_examples: 1 - name: ieee754 num_bytes: 35416 num_examples: 3 - name: jordan_curve_theorem num_bytes: 12530912 num_examples: 10 - name: huffman num_bytes: 1634130 num_examples: 25 - name: zf num_bytes: 933118 num_examples: 10 - name: hedges num_bytes: 360884 num_examples: 1 - name: zorns_lemma num_bytes: 608752 num_examples: 19 - name: tortoise_hare_algorithm num_bytes: 10084 num_examples: 1 - name: mod_red num_bytes: 636981 num_examples: 5 - name: UnifySL num_bytes: 1716822 num_examples: 128 - name: traversable_fincontainer num_bytes: 429019 num_examples: 1 - name: buchberger num_bytes: 2422607 num_examples: 29 - name: constructive_geometry num_bytes: 80179 num_examples: 7 - name: tarski_geometry num_bytes: 112419 num_examples: 8 - name: int_map num_bytes: 817835 num_examples: 13 - name: float num_bytes: 2994074 num_examples: 31 - name: InfSeqExt num_bytes: 106656 num_examples: 5 - name: zchinese num_bytes: 62626 num_examples: 6 - name: smc num_bytes: 6045540 num_examples: 15 - name: pts num_bytes: 72482 num_examples: 8 - name: param_pi num_bytes: 2596347 num_examples: 11 - name: axiomatic_abp num_bytes: 1204713 num_examples: 7 - name: lambda num_bytes: 181085 num_examples: 10 - name: maths num_bytes: 37685 num_examples: 3 - name: quicksort_complexity num_bytes: 489164 num_examples: 28 - name: fssec_model num_bytes: 1569135 num_examples: 25 - name: ipc num_bytes: 3108901 num_examples: 31 - name: chinese num_bytes: 208365 num_examples: 13 - name: cours_de_coq num_bytes: 71295 num_examples: 11 - name: graphs num_bytes: 644609 num_examples: 2 - name: dictionaries num_bytes: 67746 num_examples: 1 - name: dblib num_bytes: 195344 num_examples: 6 - name: cecoa num_bytes: 2449356 num_examples: 14 - name: corespec num_bytes: 1222339 num_examples: 28 - name: free_groups num_bytes: 63973 num_examples: 1 - name: ramsey num_bytes: 11734 num_examples: 1 - name: qarith num_bytes: 51068 num_examples: 2 - name: math_comp num_bytes: 44708646 num_examples: 76 - name: amm11262 num_bytes: 365079 num_examples: 5 - name: angles num_bytes: 322579 num_examples: 5 - name: orb_stab num_bytes: 204783 num_examples: 1 - name: qarith_stern_brocot num_bytes: 10873352 num_examples: 35 - name: Categories num_bytes: 4945 num_examples: 1 - name: group_theory num_bytes: 104940 num_examples: 10 - name: demos num_bytes: 62208 num_examples: 5 - name: distributed_reference_counting num_bytes: 8888400 num_examples: 74 - name: subst num_bytes: 362195 num_examples: 17 - name: miniml num_bytes: 114099 num_examples: 1 - name: algebra num_bytes: 3275753 num_examples: 65 - name: fermat4 num_bytes: 172156 num_examples: 5 - name: otway_rees num_bytes: 226052 num_examples: 19 - name: SCEV_coq num_bytes: 89902 num_examples: 1 - name: PolTac num_bytes: 157370 num_examples: 13 - name: fundamental_arithmetics num_bytes: 308733 num_examples: 8 download_size: 37361889 dataset_size: 286711572 configs: - config_name: default data_files: - split: finmap path: data/finmap-* - split: GeometricAlgebra path: data/GeometricAlgebra-* - split: bdds path: data/bdds-* - split: concat path: data/concat-* - split: topology path: data/topology-* - split: euler_formula path: data/euler_formula-* - split: ruler_compass_geometry path: data/ruler_compass_geometry-* - split: fcsl_pcm path: data/fcsl_pcm-* - split: twoSquare path: data/twoSquare-* - split: zfc path: data/zfc-* - split: shuffle path: data/shuffle-* - split: metalib path: data/metalib-* - split: hardware path: data/hardware-* - split: three_gap path: data/three_gap-* - split: coq_ext_lib path: data/coq_ext_lib-* - split: cheerios path: data/cheerios-* - split: regexp path: data/regexp-* - split: coq_library_undecidability path: data/coq_library_undecidability-* - split: automata path: data/automata-* - split: coquelicot path: data/coquelicot-* - split: izf path: data/izf-* - split: lemma_overloading path: data/lemma_overloading-* - split: lin_alg path: data/lin_alg-* - split: railroad_crossing path: data/railroad_crossing-* - split: idxassoc path: data/idxassoc-* - split: hoare_tut path: data/hoare_tut-* - split: lesniewski_mereology path: data/lesniewski_mereology-* - split: verdi path: data/verdi-* - split: additions path: data/additions-* - split: checker path: data/checker-* - split: VST path: data/VST-* - split: domain_theory path: data/domain_theory-* - split: propcalc path: data/propcalc-* - split: circuits path: data/circuits-* - split: CompCert path: data/CompCert-* - split: area_method path: data/area_method-* - split: bbv path: data/bbv-* - split: ails path: data/ails-* - split: dep_map path: data/dep_map-* - split: ChargeCore path: data/ChargeCore-* - split: markov path: data/markov-* - split: rsa path: data/rsa-* - split: verdi_raft path: data/verdi_raft-* - split: goedel path: data/goedel-* - split: bigenough path: data/bigenough-* - split: generic_environments path: data/generic_environments-* - split: disel path: data/disel-* - split: ctltctl path: data/ctltctl-* - split: coq_list_string path: data/coq_list_string-* - split: QuickChick path: data/QuickChick-* - split: schroeder path: data/schroeder-* - split: lazy_pcf path: data/lazy_pcf-* - split: weak_up_to path: data/weak_up_to-* - split: groups path: data/groups-* - split: pocklington path: data/pocklington-* - split: mini_compiler path: data/mini_compiler-* - split: StructTact path: data/StructTact-* - split: exceptions path: data/exceptions-* - split: coqrel path: data/coqrel-* - split: higman_s path: data/higman_s-* - split: bellantonicook path: data/bellantonicook-* - split: rem path: data/rem-* - split: tree_automata path: data/tree_automata-* - split: coq_procrastination path: data/coq_procrastination-* - split: higman_cf path: data/higman_cf-* - split: GeoCoq path: data/GeoCoq-* - split: coqoban path: data/coqoban-* - split: search_trees path: data/search_trees-* - split: system path: data/system-* - split: ieee754 path: data/ieee754-* - split: jordan_curve_theorem path: data/jordan_curve_theorem-* - split: huffman path: data/huffman-* - split: zf path: data/zf-* - split: hedges path: data/hedges-* - split: zorns_lemma path: data/zorns_lemma-* - split: tortoise_hare_algorithm path: data/tortoise_hare_algorithm-* - split: mod_red path: data/mod_red-* - split: UnifySL path: data/UnifySL-* - split: traversable_fincontainer path: data/traversable_fincontainer-* - split: buchberger path: data/buchberger-* - split: constructive_geometry path: data/constructive_geometry-* - split: tarski_geometry path: data/tarski_geometry-* - split: int_map path: data/int_map-* - split: float path: data/float-* - split: InfSeqExt path: data/InfSeqExt-* - split: zchinese path: data/zchinese-* - split: smc path: data/smc-* - split: pts path: data/pts-* - split: param_pi path: data/param_pi-* - split: axiomatic_abp path: data/axiomatic_abp-* - split: lambda path: data/lambda-* - split: maths path: data/maths-* - split: quicksort_complexity path: data/quicksort_complexity-* - split: fssec_model path: data/fssec_model-* - split: ipc path: data/ipc-* - split: chinese path: data/chinese-* - split: cours_de_coq path: data/cours_de_coq-* - split: graphs path: data/graphs-* - split: dictionaries path: data/dictionaries-* - split: dblib path: data/dblib-* - split: cecoa path: data/cecoa-* - split: corespec path: data/corespec-* - split: free_groups path: data/free_groups-* - split: ramsey path: data/ramsey-* - split: qarith path: data/qarith-* - split: math_comp path: data/math_comp-* - split: amm11262 path: data/amm11262-* - split: angles path: data/angles-* - split: orb_stab path: data/orb_stab-* - split: qarith_stern_brocot path: data/qarith_stern_brocot-* - split: Categories path: data/Categories-* - split: group_theory path: data/group_theory-* - split: demos path: data/demos-* - split: distributed_reference_counting path: data/distributed_reference_counting-* - split: subst path: data/subst-* - split: miniml path: data/miniml-* - split: algebra path: data/algebra-* - split: fermat4 path: data/fermat4-* - split: otway_rees path: data/otway_rees-* - split: SCEV_coq path: data/SCEV_coq-* - split: PolTac path: data/PolTac-* - split: fundamental_arithmetics path: data/fundamental_arithmetics-* ---
提供机构:
piercemaloney
原始信息汇总

数据集概述

数据集特征

  • 名称: text
  • 数据类型: string

数据集拆分

拆分名称 字节数 示例数
finmap 745110 3
GeometricAlgebra 2180457 8
bdds 11537326 15
concat 1876052 90
topology 1998914 32
euler_formula 2157257 3
ruler_compass_geometry 531800 39
fcsl_pcm 1911756 12
twoSquare 219009 2
zfc 605621 11
shuffle 87431 6
metalib 190046 11
hardware 185840 25
three_gap 315458 7
coq_ext_lib 311782 47
cheerios 140210 6
regexp 116907 7
coq_library_undecidability 3089099 93
automata 1075636 25
coquelicot 6616788 23
izf 146423 8
lemma_overloading 1075228 27
lin_alg 6760336 68
railroad_crossing 202543 1
idxassoc 139339 2
hoare_tut 39069 3
lesniewski_mereology 123636 2
verdi 12479953 28
additions 147048 20
checker 6689 2
VST 38787573 292
domain_theory 67545 4
propcalc 54422 5
circuits 399265 19
CompCert 14680669 142
area_method 1578799 38
bbv 705745 13
ails 521926 11
dep_map 24951 2
ChargeCore 273913 20
markov 102887 1
rsa 180155 5
verdi_raft 21320585 107
goedel 9110441 44
bigenough 5031 1
generic_environments 142529 2
disel 4787193 37
ctltctl 61038 3
coq_list_string 6409 3
QuickChick 718709 17
schroeder 24958 4
lazy_pcf 497224 13
weak_up_to 175365 10
groups 11115 1
pocklington 617483 13
mini_compiler 14459 1
StructTact 293887 17
exceptions 8962 1
coqrel 196290 12
higman_s 198793 5
bellantonicook 3852234 16
rem 14504 1
tree_automata 1972575 17
coq_procrastination 26763 1
higman_cf 49720 2
GeoCoq 12090435 328
coqoban 38285 1
search_trees 59332 5
system 2053 1
ieee754 35416 3
jordan_curve_theorem 12530912 10
huffman 1634130 25
zf 933118 10
hedges 360884 1
zorns_lemma 608752 19
tortoise_hare_algorithm 10084 1
mod_red 636981 5
UnifySL 1716822 128
traversable_fincontainer 429019 1
buchberger 2422607 29
constructive_geometry 80179 7
tarski_geometry 112419 8
int_map 817835 13
float 2994074 31
InfSeqExt 106656 5
zchinese 62626 6
smc 6045540 15
pts 72482 8
param_pi 2596347 11
axiomatic_abp 1204713 7
lambda 181085 10
maths 37685 3
quicksort_complexity 489164 28
fssec_model 1569135 25
ipc 3108901 31
chinese 208365 13
cours_de_coq 71295 11
graphs 644609 2
dictionaries 67746 1
dblib 195344 6
cecoa 2449356 14
corespec 1222339 28
free_groups 63973 1
ramsey 11734 1
qarith 51068 2
math_comp 44708646 76
amm11262 365079 5
angles 322579 5
orb_stab 204783 1
qarith_stern_brocot 10873352 35
Categories 4945 1
group_theory 104940 10
demos 62208 5
distributed_reference_counting 8888400 74
subst 362195 17
miniml 114099 1
algebra 3275753 65
fermat4 172156 5
otway_rees 226052 19
SCEV_coq 89902 1
PolTac 157370 13
fundamental_arithmetics 308733 8

数据集大小

  • 下载大小: 37361889字节
  • 数据集总大小: 286711572字节
搜集汇总
数据集介绍
main_image_url
构建方式
在形式化验证领域,Coq证明助手积累了丰富的开源项目资源。该数据集通过系统性地收集和整理Coq社区中的公开项目代码构建而成,涵盖了从基础数学理论到复杂算法验证的广泛主题。构建过程中,项目被依据其所属库或专题进行划分,每个子集对应一个独立的Coq开发项目,确保了数据来源的结构化与可追溯性。原始代码经过统一提取和格式化,保留了完整的证明脚本与定义,形成了以文本形式存储的标准化语料库。
使用方法
研究人员可通过HuggingFace数据集库直接加载该数据集,利用其标准接口访问各个项目子集。每个子集以纯文本格式提供,可直接用于训练或评估与Coq代码相关的机器学习模型,如证明自动补全、策略预测或代码摘要生成。在实际应用中,用户可以根据`configs`中定义的路径按分割名称加载特定领域的证明代码。该数据集适用于数据驱动的形式化方法研究,能够支持从代码语义理解到交互式证明自动化等一系列前沿探索。
背景与挑战
背景概述
在形式化验证领域,Coq交互式定理证明器作为关键工具,其应用广泛覆盖数学定理证明与软件正确性验证。数据集piercemaloney/coqgym_coq_projects_v2由研究人员Pierce Maloney等人构建,旨在为自动化定理证明研究提供丰富的Coq代码资源。该数据集汇集了多个Coq项目,涵盖几何、代数、逻辑及计算机科学等主题,通过结构化呈现证明脚本,支持机器学习模型学习定理证明策略,推动智能辅助证明系统的发展。
当前挑战
该数据集致力于解决自动化定理证明中的核心挑战,即如何让机器理解并生成复杂的数学证明。其构建过程面临多重困难:数据源自分散的Coq项目,需统一处理多样化的代码风格与依赖结构;证明步骤往往隐含深层逻辑,标注与解析要求极高的领域专业知识;此外,数据规模与质量平衡亦需谨慎考量,以确保模型训练的泛化能力与可靠性。
常用场景
经典使用场景
在形式化验证领域,Coq语言作为交互式定理证明器的核心工具,其项目数据集常被用于构建和评估自动化证明辅助系统。该数据集汇集了涵盖几何、代数、数论及分布式系统等多个数学与计算机科学分支的形式化证明代码,为研究人员提供了丰富的结构化语料库。经典使用场景包括训练机器学习模型以预测证明步骤或生成证明策略,从而加速定理证明过程,提升形式化验证的效率与可扩展性。
解决学术问题
该数据集有效应对了形式化方法中证明构造自动化程度不足的学术挑战。通过提供大规模、多样化的Coq证明代码,它支持研究如何利用数据驱动方法减少人工干预,解决证明搜索空间爆炸问题。其意义在于推动了定理证明与人工智能的交叉融合,为构建智能证明助手奠定了数据基础,显著促进了程序正确性验证与数学定理机械化证明的进展。
实际应用
在实际应用中,该数据集被广泛用于开发集成开发环境中的智能证明插件,辅助软件工程师编写高可靠性代码。例如,在操作系统内核、编译器及加密协议等安全关键系统的验证中,基于此数据集训练的模型能够建议证明策略或检测证明漏洞,增强代码的形式化保障。这些应用不仅降低了形式化验证的门槛,也为工业级软件的安全认证提供了技术支撑。
数据集最近研究
最新研究方向
在形式化验证与定理证明领域,Coq语言作为交互式证明助手的核心工具,其数据集piercemaloney/coqgym_coq_projects_v2汇聚了涵盖几何、代数、硬件验证及分布式系统等多元主题的Coq项目代码。当前研究前沿聚焦于利用此类结构化数据训练机器学习模型,以自动化生成证明策略或补全证明步骤,从而提升形式化验证的效率与可扩展性。随着人工智能在程序合成与符号推理中的深度融合,该数据集为探索神经定理证明、代码生成与形式化数学的交叉研究提供了丰富资源,推动了验证技术在安全关键系统与数学基础理论中的创新应用。
以上内容由遇见数据集搜集并总结生成
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作