GD-ML/MobilityBench

Source: Hugging Face | Updated: 2026-03-05 | Indexed: 2026-04-05
Download link:
https://hf-mirror.com/datasets/GD-ML/MobilityBench
Description:
---
task_categories:
- question-answering
tags:
- Agent
- Benchmark
- Route-Planning
size_categories:
- 50K<n<100K
configs:
- config_name: default
  data_files:
  - split: query
    path: datasets/all_data_benchmark_50000.csv
---

> **Note:** This work is currently under review. The full dataset will be released progressively.

# MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

[Paper](https://huggingface.co/papers/2602.22638) | [GitHub](https://github.com/AMAP-ML/MobilityBench)

**MobilityBench** is a scalable benchmark for evaluating route-planning agents in real-world mobility scenarios. It is built from large-scale, anonymized mobility queries from **Amap**, organized with a comprehensive task taxonomy, and provides **structured ground truth** (required tool calls + verifiable evidence). All tool calls are executed in a **deterministic replay sandbox** for reproducible, multi-dimensional evaluation.

**Scale & Coverage:** 100,000 episodes across **22** countries and **350+** cities (including metropolitan areas), with a **long-tailed** geographic distribution.

### Scenario Distribution (11 intents)

- **36.6%** Basic Information Retrieval
- **9.6%** Route-Dependent Information Retrieval
- **42.5%** Basic Route Planning
- **11.3%** Preference-Constrained Route Planning

#### Data Format

| Field | Description |
|-------|-------------|
| `query` | User query text |
| `context` | Context information (JSON, e.g., current location, city) |
| `task_scenario` | Fine-grained task category |
| `intent_family` | Coarse-grained intent category for evaluation aggregation |
| `tool_list` | Expected tool calls (JSON array) |
| `route_ans` | Ground truth route answer (JSON) |

#### Sample Data (5 Examples)

| Query | Task Scenario | Intent Family |
|-------|---------------|---------------|
| 去大石桥不走高速<br>Go to Dashiqiao without taking the highway. | Option-Constrained Route Planning | Preference-Constrained Route Planning |
| 现在成都大道会堵车吗?看一下地图,会不会堵<br>Is Chengdu Avenue congested now? Looking at the map, is it likely to be congested? | Traffic Info Query | Basic Route Planning |
| 我在哪<br>Where am I? | Geolocation Query | Basic Information Retrieval |
| 知道离滇池会展中心有多远<br>How far it is from Dianchi Convention and Exhibition Center? | Route Property Query | Route-Dependent Information Retrieval |
| 到寨河收费站入口不走高速<br>To reach the Zhaihe toll station entrance without taking the highway. | Option-Constrained Route Planning | Preference-Constrained Route Planning |

## Citation

If you use this dataset in your research, please cite the following paper:

```bibtex
@article{song2026mobilitybench,
  title={MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios},
  author={Song, Zhiheng and Zhang, Jingshuai and Qin, Chuan and Wang, Chao and Chen, Chao and Xu, Longfei and Liu, Kaikui and Chu, Xiangxiang and Zhu, Hengshu},
  journal={arXiv preprint arXiv:2602.22638},
  year={2026}
}
```
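As a rough illustration of how the fields in the Data Format table might be consumed, the sketch below parses CSV rows, decodes the JSON-valued columns (`context`, `tool_list`, `route_ans`), and computes the share of each `intent_family`. It assumes the CSV has a header row using exactly the field names from the table and that the JSON columns are stored as JSON-encoded strings; neither is confirmed by the card. The helper names (`parse_rows`, `intent_family_shares`, `make_sample_csv`) and all sample values, including the tool names, are hypothetical and not taken from the dataset.

```python
import csv
import io
import json
from collections import Counter

# Columns the dataset card describes as JSON (assumed to be JSON-encoded strings).
JSON_FIELDS = ("context", "tool_list", "route_ans")


def parse_rows(csv_text):
    """Parse CSV text into dicts, decoding the JSON-valued columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for field in JSON_FIELDS:
            raw = row.get(field) or ""
            try:
                # Empty cells become None; everything else is decoded as JSON.
                row[field] = json.loads(raw) if raw.strip() else None
            except json.JSONDecodeError:
                # Keep the raw string if the cell is not valid JSON.
                row[field] = raw
        rows.append(row)
    return rows


def intent_family_shares(rows):
    """Return the fraction of rows belonging to each intent family."""
    counts = Counter(row["intent_family"] for row in rows)
    total = sum(counts.values())
    return {family: count / total for family, count in counts.items()}


def make_sample_csv():
    """Build a tiny in-memory CSV with hypothetical values for demonstration."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(
        ["query", "context", "task_scenario", "intent_family", "tool_list", "route_ans"]
    )
    writer.writerow([
        "Where am I?",
        json.dumps({"city": "ExampleCity"}),      # hypothetical context
        "Geolocation Query",
        "Basic Information Retrieval",
        json.dumps([{"name": "locate"}]),         # hypothetical tool name
        "",                                       # no route answer for this row
    ])
    writer.writerow([
        "Go to Dashiqiao without taking the highway.",
        json.dumps({"city": "ExampleCity"}),
        "Option-Constrained Route Planning",
        "Preference-Constrained Route Planning",
        json.dumps([{"name": "plan_route"}]),     # hypothetical tool name
        json.dumps({"distance_m": 1200}),         # hypothetical route answer
    ])
    return buffer.getvalue()


if __name__ == "__main__":
    rows = parse_rows(make_sample_csv())
    for family, share in intent_family_shares(rows).items():
        print(f"{family}: {share:.1%}")
```

To try this on the released file, the text of `datasets/all_data_benchmark_50000.csv` could be read with `open(..., encoding="utf-8", newline="")` and passed to `parse_rows`, provided the column layout matches the assumptions above.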
Provider:
GD-ML