JiayuJeff/CostBench
Source: Hugging Face. Updated 2026-04-09; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/JiayuJeff/CostBench
Resource overview:
---
title: CostBench
tags:
- dataset
- costbench
- travel
- queries
---
# CostBench
This dataset contains `381` records from `CostBench_queries.json`.
The queries are derived from the official CostBench benchmark repository and follow its travel-task query schema.
## Contents
Top-level fields:
`query_id`, `TimeInfo`, `task`, `is_location`, `goal_type`, `preferences`, `groundtruth`, `validation_raw`, `is_valid`, `user_requirements`, `query`
## Field Guide
- `query_id`: Unique identifier for each query.
- `TimeInfo`: Time context used in the task prompt. It is an ID-style placeholder such as `<TimeInfo03119>`, not a real clock time.
- `task`: Subtask name. Common values are `location`, `transportation`, `accommodation`, `attraction`, `dining`, and `shopping`.
- `is_location`: Whether the query is a pure location-selection task. `1` means the task is to choose a location; `0` means the query depends on a location preference and targets another travel-related goal.
- `goal_type`: Final target type for the task, such as `TravelLocation`, `TravelTransportation`, or `TravelShopping`.
- `preferences`: Structured user constraints for the task. These are the key dimensions used by CostBench, typically including `category`, `tier`, `style`, and `feature_package`.
- `location_preference`: The required location preference ID for non-location tasks. This field is absent or empty for pure location queries.
- `groundtruth`: The correct answer label. In raw query data this is usually the ID of the target preference or final candidate.
- `validation_raw`: Raw validation output that records the query's validation status, for example `**no conflict**`.
- `is_valid`: Validity flag. `1` marks a valid query; `0` marks an invalid one.
- `user_requirements`: Natural-language user request generated from the structured preferences.
- `query.input`: The final prompt shown to the model.
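The field list above can be turned into a quick sanity check on a single record. The following is a minimal sketch, not part of the CostBench tooling: the names `TOP_LEVEL_FIELDS`, `missing_fields`, and `parse_placeholder` are illustrative, and the placeholder pattern (angle brackets, letters, then digits) is an assumption inferred from examples such as `<TimeInfo03119>`.

```python
import re

# Top-level fields as listed in the Contents section.
TOP_LEVEL_FIELDS = [
    "query_id", "TimeInfo", "task", "is_location", "goal_type",
    "preferences", "groundtruth", "validation_raw", "is_valid",
    "user_requirements", "query",
]

# Assumed ID shape: "<" + letters + digits + ">", e.g. "<TimeInfo03119>".
PLACEHOLDER_RE = re.compile(r"^<([A-Za-z]+)(\d+)>$")


def missing_fields(record):
    """Return the top-level fields that are absent from a record."""
    return [name for name in TOP_LEVEL_FIELDS if name not in record]


def parse_placeholder(value):
    """Split an ID-style placeholder into (kind, number), or return None."""
    match = PLACEHOLDER_RE.match(value)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

For instance, `parse_placeholder("<TimeInfo03119>")` yields the kind `TimeInfo` and the integer `3119`, which makes it explicit that the value is an opaque ID rather than a clock time.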
## Dataset Shape
Each row is a single CostBench query with a structured schema and a natural-language prompt. The file is stored as a JSON array, so it can be loaded directly with the Hugging Face `json` dataset builder.
## Loading locally
```python
from datasets import load_dataset
dataset = load_dataset("json", data_files="CostBench_queries.json", split="train")
```
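Because the file is a plain JSON array, it can likely also be read without the `datasets` library. The sketch below uses only the Python standard library; `load_queries` is a hypothetical helper name, and the path is assumed to point at a local copy of `CostBench_queries.json`.

```python
import json
from pathlib import Path


def load_queries(path):
    """Read a JSON array of query records and return it as a list of dicts."""
    with Path(path).open(encoding="utf-8") as handle:
        records = json.load(handle)
    # The dataset card describes the file as a JSON array; fail loudly otherwise.
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    return records
```

Calling `load_queries("CostBench_queries.json")` returns ordinary dictionaries, which can be convenient for quick inspection before moving to a `datasets.Dataset`.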
## Loading from the Hub
```python
from datasets import load_dataset
dataset = load_dataset(
    "json",
    data_files="https://huggingface.co/datasets/JiayuJeff/CostBench/resolve/main/CostBench_queries.json",
    split="train",
)
```
## Example
```json
{
  "query_id": "<Query00001>",
  "TimeInfo": "<TimeInfo03119>",
  "task": "location",
  "is_location": 1,
  "goal_type": "TravelLocation",
  "preferences": {
    "category": "city",
    "feature_package": "architectural_marvel",
    "style": "historic_and_traditional",
    "tier": "major_metropolis"
  },
  "groundtruth": "<LocationPreference00001>",
  "validation_raw": "**no conflict**",
  "is_valid": 1,
  "user_requirements": "...",
  "query": {
    "input": "Please generate a travel plan ..."
  }
}
```
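Given records shaped like the example above, a common next step is to select a subset and extract the prompt and label for each query. The following is a rough sketch over plain dictionaries; `select_queries`, `prompt_and_label`, and `task_counts` are hypothetical helpers, and dropping records whose `is_valid` is not `1` is an assumption about how the flag is meant to be used.

```python
from collections import Counter


def select_queries(records, task=None, valid_only=True):
    """Filter records by subtask name and, optionally, by the is_valid flag."""
    selected = []
    for record in records:
        # Assumption: only is_valid == 1 marks a usable query.
        if valid_only and record.get("is_valid") != 1:
            continue
        if task is not None and record.get("task") != task:
            continue
        selected.append(record)
    return selected


def prompt_and_label(record):
    """Return the model prompt (query.input) and the groundtruth ID."""
    return record["query"]["input"], record["groundtruth"]


def task_counts(records):
    """Count records per subtask name."""
    return Counter(record.get("task") for record in records)
```

With the records loaded as a list, `select_queries(records, task="location")` would keep only valid location queries, and `prompt_and_label` then gives the text to send to a model together with the expected ID.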
## Citing this work
```bibtex
@article{liu2025costbench,
  title={CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents},
  author={Liu, Jiayu and Qian, Cheng and Su, Zhaochen and Zong, Qing and Huang, Shijue and He, Bingxiang and Fung, Yi R},
  journal={arXiv preprint arXiv:2511.02734},
  year={2025}
}
```
Provider:
JiayuJeff



