UGPhysics
Source: ModelScope community. Updated 2026-04-29; indexed 2025-06-07.
Download link: https://modelscope.cn/datasets/xinxu02/UGPhysics
Description:
# UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models
**UGPhysics** is a large-scale, comprehensive benchmark for evaluating the physics problem-solving abilities of LLMs across multiple **U**nder**G**raduate-level **Physics** (UGPhysics) disciplines. It comprises 5,520 distinct problems spanning three main domains, 13 core subjects, and 59 key topics.
# Example: Loading the Data
```python
from datasets import load_dataset

# Load the English split of the AtomicPhysics subject.
dataset = load_dataset("UGPhysics/ugphysics", "AtomicPhysics", split="en")
# Inspect the first problem record.
print(dataset[0])
```
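The example above loads a single subject. As a minimal sketch of how the same call might be repeated across subjects, the snippet below counts problems per subject. The helper names (`count_problems`, `hub_loader`, `summarize_benchmark`) are hypothetical and not part of the UGPhysics code; it also assumes that every subject configuration exposes an `en` split, which the card only shows for `AtomicPhysics`. Subject names are discovered with `get_dataset_config_names` from the `datasets` library rather than hard-coded.

```python
def count_problems(subject_names, load_fn, split="en"):
    """Return {subject: number of problems}, using the supplied loader."""
    return {name: len(load_fn(name, split)) for name in subject_names}


def hub_loader(name, split):
    """Load one subject from the Hub (hypothetical helper; needs network access)."""
    from datasets import load_dataset  # imported lazily so the helpers stay lightweight

    return load_dataset("UGPhysics/ugphysics", name, split=split)


def summarize_benchmark(split="en"):
    """Print per-subject problem counts (assumes each subject has this split)."""
    from datasets import get_dataset_config_names

    subjects = get_dataset_config_names("UGPhysics/ugphysics")
    counts = count_problems(subjects, hub_loader, split=split)
    for name, n in sorted(counts.items()):
        print(f"{name}: {n}")
    print("total:", sum(counts.values()))
```

Calling `summarize_benchmark()` downloads every subject, so it may take a while on first use. Passing the loader as an argument keeps `count_problems` easy to check offline with a stand-in loader.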
More details on loading and using the data are on our [GitHub page](https://github.com/YangLabHKUST/UGPhysics.git).
If you find our code helpful or use our benchmark dataset, please cite our paper:
```bibtex
@article{xu2025ugphysics,
title={UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models},
author={Xu, Xin and Xu, Qiyun and Xiao, Tong and Chen, Tianhao and Yan, Yuchen and Zhang, Jiaxin and Diao, Shizhe and Yang, Can and Wang, Yang},
journal={arXiv preprint arXiv:2502.00334},
year={2025}
}
```
Provider: maas
Created: 2025-06-03



