Lyra
arXiv, indexed 2025-09-30
Download link:
https://github.com/liangqingyuan/lyra
Resource description:
The dataset contains 2,000 carefully annotated Python programs for database operations. Each program embeds SQL statements and is paired with one Chinese comment and one English comment. The snippets reflect real development practice and focus mainly on SQL operations, and they went through a rigorous annotation process to ensure quality and relevance. The task is to generate this "Turducken"-style code, in which SQL is nested inside Python, from the natural-language comments.
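To make the "Turducken" idea concrete, the sketch below shows what a comment-plus-program pair of this kind might look like. It is an illustration only, not an entry from the dataset: the function `get_user_names`, the `users` table, and the use of the standard-library `sqlite3` module are all assumptions chosen so that the example runs on its own.

```python
import sqlite3


# Hypothetical natural-language comment (the model input):
#   "Query the users table for the names of all users whose age is at
#    least min_age, and return the names sorted alphabetically."
def get_user_names(conn, min_age):
    # The SQL statement is embedded as a string inside the Python code,
    # which is the nesting that the "Turducken" label refers to.
    cursor = conn.execute(
        "SELECT name FROM users WHERE age >= ? ORDER BY name",
        (min_age,),
    )
    return [row[0] for row in cursor.fetchall()]


# Small in-memory database so the sketch can be run directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", 30), ("bob", 17), ("carol", 45)],
)
names = get_user_names(conn, 18)
print(names)
```

A generation model for this task would receive only the comment and would have to produce both the Python control flow and the SQL string inside it.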
Provided by:
Research authors



