Not explicitly provided
Source: arXiv | Updated: 2020-07-25 | Indexed: 2024-08-06
Download link:
http://arxiv.org/abs/2003.10485v2
Resource description:
This synthetic dataset was created for neural program synthesis and is used to train and test program synthesis models. It contains 300,000 programming-by-example (PBE) problems generated with an evolutionary algorithm, with the goal of improving model generalization across data distributions. During construction, the evolutionary algorithm searches for data distributions on which the model performs poorly, and problems from those distributions are added to the training set. The application domain is program synthesis, specifically PBE, and the dataset is designed to address poor model generalization to non-randomly generated data.
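The construction loop described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the four-operation toy DSL, the representation of a "data distribution" as a probability vector over operations, the `stub_model_accuracy` function standing in for evaluation of a trained synthesizer, and all population and mutation parameters.

```python
import random

OPS = ["add", "sub", "mul", "neg"]  # hypothetical toy DSL, not the paper's


def normalize(weights):
    """Scale positive weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def mutate(dist, rng, scale=0.2):
    """Perturb each weight, keep it positive, and renormalize."""
    return normalize([max(1e-6, w + rng.uniform(-scale, scale)) for w in dist])


def sample_problem(dist, rng, length=3):
    """Draw a toy 'program' (a tuple of op names) from the distribution."""
    return tuple(rng.choices(OPS, weights=dist, k=length))


def stub_model_accuracy(dist):
    """Stand-in for evaluating a trained model on problems drawn from dist.

    Assumption: this fake model is weak on 'mul'-heavy distributions.
    """
    return 1.0 - dist[OPS.index("mul")]


def evolve_hard_distributions(accuracy_fn, rng, pop_size=8, generations=10, keep=2):
    """Evolve distributions toward low model accuracy (lower is 'fitter')."""
    population = [normalize([rng.random() + 1e-6 for _ in OPS])
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=accuracy_fn)      # hardest distributions first
        survivors = population[:keep]         # elitism: keep the hardest
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - keep)]
        population = survivors + children
    population.sort(key=accuracy_fn)
    return population[:keep]


def augment_training_set(train_set, hard_dists, rng, per_dist=5):
    """Add problems sampled from the hard distributions to the training set."""
    new = [sample_problem(d, rng) for d in hard_dists for _ in range(per_dist)]
    return train_set + new


rng = random.Random(0)
hard = evolve_hard_distributions(stub_model_accuracy, rng)
train = augment_training_set([], hard, rng)
```

In a real pipeline the stub would be replaced by actual model evaluation, and the model would presumably be retrained on the augmented set before the next round of search.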
Provider:
Wayfair Research; Northeastern University, College of Computer Science
Created:
2020-03-24



