JiayuJeff/CostBench

Name: JiayuJeff/CostBench
Creator: JiayuJeff
Published: 2026-04-09 16:47:13
License: not specified

Source: Hugging Face. Updated 2026-04-09; indexed 2026-04-12.

Download link:

https://hf-mirror.com/datasets/JiayuJeff/CostBench

Description:

---
title: CostBench
tags:
- dataset
- costbench
- travel
- queries
---

# CostBench

This dataset contains `381` records from `CostBench_queries.json`. The queries are derived from the official CostBench benchmark repository and follow its travel-task query schema.

## Contents

Top-level fields: `query_id`, `TimeInfo`, `task`, `is_location`, `goal_type`, `preferences`, `groundtruth`, `validation_raw`, `is_valid`, `user_requirements`, `query`

## Field Guide

- `query_id`: Unique identifier for each query.
- `TimeInfo`: Time context used in the task prompt. It is an ID-style placeholder such as `<TimeInfo03119>`, not a real clock time.
- `task`: Subtask name. Common values are `location`, `transportation`, `accommodation`, `attraction`, `dining`, and `shopping`.
- `is_location`: Whether the query is a pure location-selection task. `1` means the task is only about choosing a location; `0` means the query depends on a location preference and then asks for another travel-related goal.
- `goal_type`: Final target type for the task, such as `TravelLocation`, `TravelTransportation`, or `TravelShopping`.
- `preferences`: Structured user constraints for the task. These are the key dimensions used by CostBench, typically including `category`, `tier`, `style`, and `feature_package`.
- `location_preference`: The required location preference ID for non-location tasks. This field is absent or empty for pure location queries.
- `groundtruth`: The correct answer label. In raw query data this is usually the ID of the target preference or final candidate.
- `validation_raw`: Raw validation output recording the query's validation status.
- `is_valid`: Validity flag. `1` marks the query as valid; `0` marks it as invalid.
- `user_requirements`: Natural-language user request generated from the structured preferences.
- `query.input`: The final prompt shown to the model.

## Dataset Shape

Each row is a single CostBench query with a structured schema and a natural-language prompt. The file is stored as a JSON array, so it can be loaded directly with the Hugging Face `json` dataset builder.

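The field guide suggests a simple way to slice the data: `is_location` separates pure location queries from the rest, and `task` names the subtask. The following is a minimal sketch using only the standard library, assuming the records have already been parsed into a list of dicts. The helper name `summarize_queries` and the two sample records are hypothetical and exist only for illustration.

```python
from collections import Counter


def summarize_queries(records):
    """Count queries per task and split query IDs by the `is_location` flag."""
    task_counts = Counter(r["task"] for r in records)
    # `is_location == 1` marks pure location-selection queries.
    location_ids = [r["query_id"] for r in records if r["is_location"] == 1]
    # Everything else targets another travel-related goal.
    other_ids = [r["query_id"] for r in records if r["is_location"] == 0]
    return task_counts, location_ids, other_ids


# Hypothetical records shaped like the documented schema (values are made up,
# and only the fields used by the helper are included).
sample = [
    {"query_id": "<Query00001>", "task": "location", "is_location": 1},
    {"query_id": "<QueryExampleB>", "task": "dining", "is_location": 0},
]

task_counts, location_ids, other_ids = summarize_queries(sample)
print(dict(task_counts))
print(location_ids, other_ids)
```

Because the card states that the file is a JSON array, the real `records` list could presumably be obtained with `json.load` on `CostBench_queries.json` instead of the inline sample.
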
## Loading locally

```python
from datasets import load_dataset

# Load the JSON array from a local copy of the file.
dataset = load_dataset("json", data_files="CostBench_queries.json", split="train")
```

## Loading from the Hub

```python
from datasets import load_dataset

# Load the same file directly from its Hub URL.
dataset = load_dataset(
    "json",
    data_files="https://huggingface.co/datasets/JiayuJeff/CostBench/resolve/main/CostBench_queries.json",
    split="train",
)
```

## Example

```json
{
  "query_id": "<Query00001>",
  "TimeInfo": "<TimeInfo03119>",
  "task": "location",
  "is_location": 1,
  "goal_type": "TravelLocation",
  "preferences": {
    "category": "city",
    "feature_package": "architectural_marvel",
    "style": "historic_and_traditional",
    "tier": "major_metropolis"
  },
  "groundtruth": "<LocationPreference00001>",
  "validation_raw": "**no conflict**",
  "is_valid": 1,
  "user_requirements": "...",
  "query": {
    "input": "Please generate a travel plan ..."
  }
}
```

## Citing this work

```bibtex
@article{liu2025costbench,
  title={CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents},
  author={Liu, Jiayu and Qian, Cheng and Su, Zhaochen and Zong, Qing and Huang, Shijue and He, Bingxiang and Fung, Yi R},
  journal={arXiv preprint arXiv:2511.02734},
  year={2025}
}
```
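
As a quick sanity check on the schema, the sketch below compares a record against the top-level fields listed under Contents and reads the prompt from `query.input`. It is an illustration only: the helper name `missing_fields` is hypothetical, and the record is the example from this card with its elided values left as they are.

```python
import json

# Top-level fields as listed in the Contents section of this card.
TOP_LEVEL_FIELDS = [
    "query_id", "TimeInfo", "task", "is_location", "goal_type",
    "preferences", "groundtruth", "validation_raw", "is_valid",
    "user_requirements", "query",
]


def missing_fields(record):
    """Return the documented top-level fields that the record lacks."""
    return [name for name in TOP_LEVEL_FIELDS if name not in record]


# The example record from this card; "..." values are elided in the source.
example = json.loads("""
{
  "query_id": "<Query00001>",
  "TimeInfo": "<TimeInfo03119>",
  "task": "location",
  "is_location": 1,
  "goal_type": "TravelLocation",
  "preferences": {
    "category": "city",
    "feature_package": "architectural_marvel",
    "style": "historic_and_traditional",
    "tier": "major_metropolis"
  },
  "groundtruth": "<LocationPreference00001>",
  "validation_raw": "**no conflict**",
  "is_valid": 1,
  "user_requirements": "...",
  "query": {"input": "Please generate a travel plan ..."}
}
""")

print(missing_fields(example))      # fields absent from the record, if any
print(example["query"]["input"])    # the prompt shown to the model
```

The same check could be run over every row after loading, to flag records that deviate from the documented schema.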


Provider:

JiayuJeff
