agieval-gaokao-chinese

Name: agieval-gaokao-chinese
Creator: maas
Published: 2025-11-27 16:14:53
License: No description provided

ModelScope community: updated 2025-11-27, indexed 2024-05-15

Download link:

https://modelscope.cn/datasets/AI-ModelScope/agieval-gaokao-chinese
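As a minimal sketch of how the download link above might be used, the snippet below derives the dataset ID from the URL and passes it to the ModelScope SDK. The helper names `dataset_id_from_url` and `load_gaokao_chinese` are hypothetical, and the `test` split name is an assumption that should be checked against the dataset page. Loading requires the `modelscope` package and network access.

```python
from urllib.parse import urlparse

# Download link as given on this page.
DATASET_URL = "https://modelscope.cn/datasets/AI-ModelScope/agieval-gaokao-chinese"


def dataset_id_from_url(url: str) -> str:
    """Extract '<namespace>/<name>' from a ModelScope dataset URL (hypothetical helper)."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a ModelScope dataset URL: {url}")
    return "/".join(parts[1:3])


def load_gaokao_chinese(split: str = "test"):
    """Load the dataset through the ModelScope SDK.

    Assumption: the data is exposed under a 'test' split; adjust if the
    dataset page lists different splits.
    """
    # Imported lazily so the URL helper works without the SDK installed.
    from modelscope.msdatasets import MsDataset

    return MsDataset.load(dataset_id_from_url(DATASET_URL), split=split)
```

Calling `load_gaokao_chinese()` would then return the dataset object, provided the assumed split exists.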

Resource description:

# Dataset Card for "agieval-gaokao-chinese"

Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo, following dmayhem93/agieval-* datasets on the HF hub.

This dataset contains the contents of the Gaokao Chinese subtask of AGIEval, as accessed in https://github.com/ruixiangcui/AGIEval/commit/5c77d073fda993f1652eaae3cf5d04cc5fd21d40 .

Citation:

```
@misc{zhong2023agieval,
  title={AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models},
  author={Wanjun Zhong and Ruixiang Cui and Yiduo Guo and Yaobo Liang and Shuai Lu and Yanlin Wang and Amin Saied and Weizhu Chen and Nan Duan},
  year={2023},
  eprint={2304.06364},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

Please make sure to cite all the individual datasets in your paper when you use them. We provide the relevant citation information below:

```
@inproceedings{ling-etal-2017-program,
  title = "Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems",
  author = "Ling, Wang and Yogatama, Dani and Dyer, Chris and Blunsom, Phil",
  booktitle = "Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
  month = jul,
  year = "2017",
  address = "Vancouver, Canada",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/P17-1015",
  doi = "10.18653/v1/P17-1015",
  pages = "158--167",
  abstract = "Solving algebraic word problems requires executing a series of arithmetic operations{---}a program{---}to obtain a final answer. However, since programs can be arbitrarily complicated, inducing them directly from question-answer pairs is a formidable challenge. To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. Although rationales do not explicitly specify programs, they provide a scaffolding for their structure via intermediate milestones. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. Experimental results show that indirect supervision of program learning via answer rationales is a promising strategy for inducing arithmetic programs.",
}

@inproceedings{hendrycksmath2021,
  title={Measuring Mathematical Problem Solving With the MATH Dataset},
  author={Dan Hendrycks and Collin Burns and Saurav Kadavath and Akul Arora and Steven Basart and Eric Tang and Dawn Song and Jacob Steinhardt},
  journal={NeurIPS},
  year={2021}
}

@inproceedings{Liu2020LogiQAAC,
  title={LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning},
  author={Jian Liu and Leyang Cui and Hanmeng Liu and Dandan Huang and Yile Wang and Yue Zhang},
  booktitle={International Joint Conference on Artificial Intelligence},
  year={2020}
}

@inproceedings{zhong2019jec,
  title={JEC-QA: A Legal-Domain Question Answering Dataset},
  author={Zhong, Haoxi and Xiao, Chaojun and Tu, Cunchao and Zhang, Tianyang and Liu, Zhiyuan and Sun, Maosong},
  booktitle={Proceedings of AAAI},
  year={2020},
}

@article{Wang2021FromLT,
  title={From LSAT: The Progress and Challenges of Complex Reasoning},
  author={Siyuan Wang and Zhongkun Liu and Wanjun Zhong and Ming Zhou and Zhongyu Wei and Zhumin Chen and Nan Duan},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  year={2021},
  volume={30},
  pages={2201-2216}
}
```

Provider:

maas

Created:

2024-05-08

Collected summary

Dataset introduction