
JiayuJeff/CostBench

Hugging Face. Updated 2026-04-09, indexed 2026-04-12.
Download link: https://hf-mirror.com/datasets/JiayuJeff/CostBench
Resource description:
---
title: CostBench
tags:
- dataset
- costbench
- travel
- queries
---

# CostBench

This dataset contains `381` records from `CostBench_queries.json`. The queries are derived from the official CostBench benchmark repository and follow its travel-task query schema.

## Contents

Top-level fields: `query_id`, `TimeInfo`, `task`, `is_location`, `goal_type`, `preferences`, `groundtruth`, `validation_raw`, `is_valid`, `user_requirements`, `query`

## Field Guide

- `query_id`: Unique identifier for each query.
- `TimeInfo`: Time context used in the task prompt. It is an ID-style placeholder such as `<TimeInfo03119>`, not a real clock time.
- `task`: Subtask name. Common values are `location`, `transportation`, `accommodation`, `attraction`, `dining`, and `shopping`.
- `is_location`: Whether the query is a pure location-selection task. `1` means the task is about choosing a location; `0` means the query also depends on a location preference and then asks for another travel-related goal.
- `goal_type`: Final target type for the task, such as `TravelLocation`, `TravelTransportation`, or `TravelShopping`.
- `preferences`: Structured user constraints for the task. These are the key dimensions used by CostBench, typically including `category`, `tier`, `style`, and `feature_package`.
- `location_preference`: The required location preference ID for non-location tasks. This field is absent or empty for pure location queries.
- `groundtruth`: The correct answer label. In raw query data this is usually the ID of the target preference or final candidate.
- `user_requirements`: Natural-language user request generated from the structured preferences.
- `query.input`: The final prompt shown to the model.

## Dataset Shape

Each row is a single CostBench query with a structured schema and a natural-language prompt. The file is stored as a JSON array, so it can be loaded directly with the Hugging Face `json` dataset builder.
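The field guide above can be turned into a lightweight sanity check on individual records. The sketch below is not part of CostBench: `EXPECTED_FIELDS` and `check_record` are hypothetical names, and the expectations it encodes (for example, that `is_location` is exactly `0` or `1`, and that `query` is an object with an `input` key) are assumptions read off this card rather than a published schema.

```python
from typing import Any

# Top-level fields as listed in the "Contents" section of the card.
EXPECTED_FIELDS = {
    "query_id", "TimeInfo", "task", "is_location", "goal_type",
    "preferences", "groundtruth", "validation_raw", "is_valid",
    "user_requirements", "query",
}


def check_record(record: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []

    missing = EXPECTED_FIELDS - set(record)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")

    # Assumption: is_location is stored as the integer 0 or 1.
    if record.get("is_location") not in (0, 1):
        problems.append("is_location should be 0 or 1")

    # Assumption: the prompt lives at record["query"]["input"].
    query = record.get("query")
    if not isinstance(query, dict) or "input" not in query:
        problems.append("query.input is missing")

    return problems


# Record copied from the card's own example (elided strings kept as-is).
sample = {
    "query_id": "<Query00001>",
    "TimeInfo": "<TimeInfo03119>",
    "task": "location",
    "is_location": 1,
    "goal_type": "TravelLocation",
    "preferences": {
        "category": "city",
        "feature_package": "architectural_marvel",
        "style": "historic_and_traditional",
        "tier": "major_metropolis",
    },
    "groundtruth": "<LocationPreference00001>",
    "validation_raw": "**no conflict**",
    "is_valid": 1,
    "user_requirements": "...",
    "query": {"input": "Please generate a travel plan ..."},
}

print(check_record(sample))  # []
```

The check deliberately ignores `location_preference`, since the card does not say where in the record that field sits.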
## Loading locally

```python
from datasets import load_dataset

dataset = load_dataset("json", data_files="CostBench_queries.json", split="train")
```

## Loading from the Hub

```python
from datasets import load_dataset

dataset = load_dataset(
    "json",
    data_files="https://huggingface.co/datasets/JiayuJeff/CostBench/resolve/main/CostBench_queries.json",
    split="train",
)
```

## Example

```json
{
  "query_id": "<Query00001>",
  "TimeInfo": "<TimeInfo03119>",
  "task": "location",
  "is_location": 1,
  "goal_type": "TravelLocation",
  "preferences": {
    "category": "city",
    "feature_package": "architectural_marvel",
    "style": "historic_and_traditional",
    "tier": "major_metropolis"
  },
  "groundtruth": "<LocationPreference00001>",
  "validation_raw": "**no conflict**",
  "is_valid": 1,
  "user_requirements": "...",
  "query": {
    "input": "Please generate a travel plan ..."
  }
}
```

## Citing this work

```bibtex
@article{liu2025costbench,
  title={CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents},
  author={Liu, Jiayu and Qian, Cheng and Su, Zhaochen and Zong, Qing and Huang, Shijue and He, Bingxiang and Fung, Yi R},
  journal={arXiv preprint arXiv:2511.02734},
  year={2025}
}
```

## Additional fields

The Chinese-language copy of this card repeats the content above and additionally describes two fields that the field guide omits:

- `validation_raw`: Raw validation output, used to mark the validation status of the query.
- `is_valid`: Validity flag. `1` means the query is valid; `0` means the query is invalid.
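Because `is_valid` marks each query as valid (`1`) or invalid (`0`), a common first step after loading is to keep only the valid queries and see how they are distributed across `task` values. The following is a minimal sketch using only the Python standard library; `load_valid_queries` and `count_by_task` are hypothetical helper names, and the code assumes the file is the plain JSON array described in the card.

```python
import json
from collections import Counter


def load_valid_queries(path: str) -> list[dict]:
    """Read the JSON array at `path` and keep records whose is_valid flag is 1."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    return [record for record in records if record.get("is_valid") == 1]


def count_by_task(records: list[dict]) -> Counter:
    """Count records per `task` value (e.g. location, dining, shopping)."""
    return Counter(record.get("task") for record in records)
```

Calling `count_by_task(load_valid_queries("CostBench_queries.json"))` would then give per-task counts for the valid subset. The same filter could be expressed with the `datasets` library's `Dataset.filter` method if the data is already loaded that way.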
Provider: JiayuJeff