TransportationGames
Source: arXiv | Updated: 2024-01-09 | Indexed: 2024-06-21
Download link:
http://transportation.games
Overview:
TransportationGames is an evaluation benchmark designed at Beijing Jiaotong University to assess how well Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) perform in the transportation domain. It comprises 10 tasks, including transportation concept question answering, traffic regulation question answering, and traffic sign question answering, and evaluates a model's command of transportation knowledge at three levels: memorization, understanding, and application. The data were collected from the Internet and then processed. The benchmark is intended for assessing how models apply this knowledge in real-world transportation scenarios, particularly when solving transportation-related problems.
Provider:
Beijing Jiaotong University
Created:
2024-01-09



