
"SAGE: Specification-Aware Grammar Extraction for Automated Test Case Generation with LLMs"

DataCite Commons. Updated 2026-03-04; indexed 2026-05-03.
Download link:
https://ieee-dataport.org/documents/sage-specification-aware-grammar-extraction-automated-test-case-generation-llms
Description:
"SAGE (Specification-Aware Grammar Extraction) is a novel framework that fundamentally advances automated test case generation (ATCG) for competitive programming. While prior state-of-the-art methods, such as LogiCase, introduced Context-Free Grammars with Counters (CCFGs) to formalize input constraints, they relied heavily on supervised fine-tuning (SFT), which is often hampered by the scarcity of high-quality, ground-truth labeled grammars. SAGE overcomes these limitations through three primary technical contributions. First, we propose a verifiable reward-guided reinforcement learning strategy utilizing Group Relative Policy Optimization (GRPO). By introducing the metric of \"well-formedness,\" SAGE enables the induction of robust syntactic patterns from unlabeled specifications, effectively mitigating data scarcity. Second, unlike conventional methods that rely on fixed-distribution random sampling, SAGE implements an LLM-driven adversarial strategy to dynamically select test case complexity based on the target solution\u2019s logic. This enables the generation of strategically adversarial test cases tailored to expose latent vulnerabilities in specific code implementations. Experimental results demonstrate that SAGE significantly outperforms 18 baseline LLMs, achieving a 15.92%p improvement in grammar validity and a 10.37%p gain in test effectiveness over existing methods. These contributions provide a more reliable, interpretable, and scalable foundation for software testing in complex algorithmic domains."
Provider:
IEEE DataPort
Created:
2026-03-04