birdsql/DSCT
Hugging Face | Updated 2025-12-29 | Indexed 2026-01-03
Download link:
https://hf-mirror.com/datasets/birdsql/DSCT
Description:
Data Science Code Translation (DSCT) is a benchmark dataset for translating code between data science libraries while preserving functional equivalence. Its tasks fall under two complementary settings, grounding-level and project-level, and cover three core stages of the data science workflow: Data Querying, Data Manipulation, and Deep Learning. Using bidirectional translation tasks, the dataset is intended to evaluate and improve how well large language models translate data science code. The complete dataset and paper will be released in the next few weeks.
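To make "functional equivalence" concrete, the following is a minimal sketch of how a translated snippet might be checked against its source. It is not taken from DSCT: the toy rows, the function names, and the choice of `sqlite3` and plain Python as stand-ins for a source and target library are all assumptions made for illustration, and the benchmark's actual task format and evaluation method may differ.

```python
import sqlite3

# Hypothetical toy data; not drawn from the DSCT dataset.
ROWS = [("alice", "eng", 120), ("bob", "eng", 100), ("carol", "ops", 90)]


def total_by_dept_sql(rows):
    """Source version: aggregate salaries per department with SQL (sqlite3)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO staff VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT dept, SUM(salary) FROM staff GROUP BY dept ORDER BY dept"
    ).fetchall()
    conn.close()
    return result


def total_by_dept_python(rows):
    """Translated version: the same aggregation written in plain Python."""
    totals = {}
    for _name, dept, salary in rows:
        totals[dept] = totals.get(dept, 0) + salary
    return sorted(totals.items())


def functionally_equivalent(rows):
    """Treat the translation as equivalent if both versions agree on the input."""
    return total_by_dept_sql(rows) == total_by_dept_python(rows)


if __name__ == "__main__":
    print(functionally_equivalent(ROWS))
```

Agreement on a single input is only evidence of equivalence, not proof; a realistic check would compare outputs over many inputs, including edge cases such as an empty table.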
Provided by:
birdsql



