RTLCoder Dataset
Source: arXiv; indexed 2025-09-30
Download link:
https://huggingface.co/collections/AI4EDA-CASE
Description:
This dataset is derived from the RTLCoder dataset, filtered for syntactic validity and prompt matching. It contains generated reasoning steps and testbench outputs. The testbenches include at least 100 test cases each for broad coverage, and the dataset is used together with the GRPO reinforcement learning framework. In total it has 1,892 samples: 1,149 harder samples and 743 easier ones. The target task is RTL code generation.
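As a rough illustration of the structure described above, the following is a minimal sketch of how one sample might be represented and partitioned by difficulty. The field names (`prompt`, `reasoning`, `rtl_code`, `testbench_cases`, `difficulty`) and the helper functions are hypothetical; the listing does not document the actual schema. Only the counts (1,892 total, 1,149 harder, 743 easier) and the 100-test-case minimum come from the description.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Figures taken from the dataset description.
REPORTED_TOTAL = 1892
REPORTED_SPLIT = {"hard": 1149, "easy": 743}
MIN_TESTBENCH_CASES = 100


@dataclass
class Sample:
    """Hypothetical record layout; the real field names are not documented here."""
    prompt: str           # natural-language design specification
    reasoning: str        # generated reasoning steps
    rtl_code: str         # reference RTL (e.g. Verilog) implementation
    testbench_cases: int  # number of test cases in the sample's testbench
    difficulty: str       # "hard" or "easy" (assumed labels)


def has_enough_cases(sample: Sample) -> bool:
    """Check the stated minimum of 100 test cases per testbench."""
    return sample.testbench_cases >= MIN_TESTBENCH_CASES


def split_by_difficulty(samples: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Partition samples into (hard, easy) lists by their difficulty label."""
    hard = [s for s in samples if s.difficulty == "hard"]
    easy = [s for s in samples if s.difficulty == "easy"]
    return hard, easy


def split_is_consistent() -> bool:
    """The reported subset sizes should add up to the reported total."""
    return sum(REPORTED_SPLIT.values()) == REPORTED_TOTAL
```

With the reported figures, `split_is_consistent()` holds, since 1,149 + 743 = 1,892.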



