LwbXc/STBench
Source: Hugging Face | Updated 2024-07-01 | Indexed 2024-06-12
Download link:
https://hf-mirror.com/datasets/LwbXc/STBench
Description:
STBench is a benchmark for evaluating the ability of large language models in spatio-temporal analysis. It consists of 13 distinct tasks and over 60,000 question-answer pairs, covering four dimensions: knowledge comprehension, spatio-temporal reasoning, accurate computation, and downstream applications. All data samples in STBench take the form of text completion, where the model is expected to generate an option number to complete the text.
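Because each sample is completed with an option number, scoring can in principle reduce to parsing that number from the model output and comparing it with the reference. The following is a minimal sketch of that idea, not the official STBench evaluation code. It assumes that options are numbered with integers and that the first integer in the completion is the model's choice; `extract_option` and `accuracy` are hypothetical helper names, and the sample completions are made up for illustration.

```python
import re


def extract_option(completion: str):
    """Return the first integer found in the completion, or None if there is none.

    Assumption: the model's chosen option is the first integer it emits.
    """
    match = re.search(r"\d+", completion)
    return int(match.group()) if match else None


def accuracy(completions, answers):
    """Fraction of completions whose extracted option equals the reference answer."""
    if not answers:
        return 0.0
    correct = sum(extract_option(c) == a for c, a in zip(completions, answers))
    return correct / len(answers)


# Illustrative (made-up) model outputs and reference option numbers.
completions = [" 2", "Option 3 is correct", "unsure"]
answers = [2, 1, 4]
score = accuracy(completions, answers)
print(f"accuracy = {score:.3f}")
```

A completion with no digits is counted as wrong rather than skipped, so unparseable outputs lower the score instead of silently shrinking the denominator.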
Provider:
LwbXc
Summary of original information
Dataset overview
License information
- License type: MIT



