Planetarium

Source: arXiv. Indexed 2025-09-30.

Download link:

https://huggingface.co/datasets/batsresearch/planetarium

Description:

Planetarium is a benchmark for evaluating how well language models generate PDDL (Planning Domain Definition Language) code from planning tasks described in natural language. It contains 132,037 text-to-PDDL pairs drawn from 13 distinct tasks. The benchmark is designed to systematically assess model performance on the challenging task of translating natural-language descriptions into semantically correct PDDL code.
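
To make the idea of a text-to-PDDL pair concrete, here is a minimal Python sketch. The pair below is invented for illustration and is not a row from the dataset; the field names `natural_language` and `problem_pddl` and the helper `parens_balanced` are assumptions as well. The helper only checks that parentheses are balanced, which is a much weaker property than the semantic correctness the benchmark targets.

```python
# Hypothetical text-to-PDDL pair, invented for illustration (not a dataset row).
# The field names are assumptions, not the dataset's actual schema.
example_pair = {
    "natural_language": (
        "Block a is on the table and block b is on top of a. "
        "The goal is to have a on top of b."
    ),
    "problem_pddl": """
(define (problem swap-two-blocks)
  (:domain blocksworld)
  (:objects a b)
  (:init (on-table a) (on b a) (clear b) (arm-empty))
  (:goal (and (on a b)))
)
""",
}


def parens_balanced(pddl: str) -> bool:
    """Return True if every '(' has a matching ')'.

    This is a syntactic sanity check only; it ignores PDDL comments and
    says nothing about whether the problem matches the description.
    """
    depth = 0
    for ch in pddl:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its matching '('
                return False
    return depth == 0


if __name__ == "__main__":
    print(parens_balanced(example_pair["problem_pddl"]))
```

A model output can pass this check and still describe the wrong initial state or goal, which is why a benchmark of this kind needs to compare meaning and not just syntax.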
