agieval-gaokao-chinese
Source: ModelScope community (魔搭社区) · Updated 2025-11-27 · Indexed 2024-05-15
Download link:
https://modelscope.cn/datasets/AI-ModelScope/agieval-gaokao-chinese
Description:
# Dataset Card for "agieval-gaokao-chinese"
Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo, following dmayhem93/agieval-* datasets on the HF hub.
This dataset contains the contents of the Gaokao Chinese subtask of AGIEval, as accessed in https://github.com/ruixiangcui/AGIEval/commit/5c77d073fda993f1652eaae3cf5d04cc5fd21d40 .
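The following is a minimal sketch of how the data might be loaded from the ModelScope link above and scored, not an official loader. It assumes the `modelscope` package is installed, that a split named `test` exists, and that each record follows a `query` / `choices` / `gold` layout in which `gold` is a list of correct choice indices. The split name, the field names, and the helper names `load_records` and `accuracy` are assumptions for illustration and should be checked against the actual files.

```python
from typing import Any, Iterable, Mapping, Sequence


def load_records(split: str = "test"):
    """Fetch the dataset from ModelScope (needs network access).

    The split name "test" is an assumption; verify it on the dataset page.
    """
    # Imported lazily so the scoring helper below works without modelscope.
    from modelscope.msdatasets import MsDataset

    return MsDataset.load("AI-ModelScope/agieval-gaokao-chinese", split=split)


def accuracy(records: Iterable[Mapping[str, Any]],
             predictions: Sequence[int]) -> float:
    """Fraction of records whose predicted choice index appears in `gold`.

    Assumes each record has a `gold` field holding a list of correct
    choice indices (hypothetical schema, not confirmed by the card).
    """
    records = list(records)
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")
    if not records:
        return 0.0
    hits = sum(1 for rec, pred in zip(records, predictions)
               if pred in rec["gold"])
    return hits / len(records)
```

With model outputs already mapped to choice indices, `accuracy(load_records(), predicted_indices)` would give a multiple-choice accuracy under these assumptions.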
Citation:
```
@misc{zhong2023agieval,
title={AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models},
author={Wanjun Zhong and Ruixiang Cui and Yiduo Guo and Yaobo Liang and Shuai Lu and Yanlin Wang and Amin Saied and Weizhu Chen and Nan Duan},
year={2023},
eprint={2304.06364},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Please make sure to cite all the individual datasets in your paper when you use them. We provide the relevant citation information below:
```
@inproceedings{ling-etal-2017-program,
title = "Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems",
author = "Ling, Wang and
Yogatama, Dani and
Dyer, Chris and
Blunsom, Phil",
booktitle = "Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2017",
address = "Vancouver, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/P17-1015",
doi = "10.18653/v1/P17-1015",
pages = "158--167",
abstract = "Solving algebraic word problems requires executing a series of arithmetic operations{---}a program{---}to obtain a final answer. However, since programs can be arbitrarily complicated, inducing them directly from question-answer pairs is a formidable challenge. To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. Although rationales do not explicitly specify programs, they provide a scaffolding for their structure via intermediate milestones. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. Experimental results show that indirect supervision of program learning via answer rationales is a promising strategy for inducing arithmetic programs.",
}
@inproceedings{hendrycksmath2021,
title={Measuring Mathematical Problem Solving With the MATH Dataset},
author={Dan Hendrycks and Collin Burns and Saurav Kadavath and Akul Arora and Steven Basart and Eric Tang and Dawn Song and Jacob Steinhardt},
journal={NeurIPS},
year={2021}
}
@inproceedings{Liu2020LogiQAAC,
title={LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning},
author={Jian Liu and Leyang Cui and Hanmeng Liu and Dandan Huang and Yile Wang and Yue Zhang},
booktitle={International Joint Conference on Artificial Intelligence},
year={2020}
}
@inproceedings{zhong2019jec,
title={JEC-QA: A Legal-Domain Question Answering Dataset},
author={Zhong, Haoxi and Xiao, Chaojun and Tu, Cunchao and Zhang, Tianyang and Liu, Zhiyuan and Sun, Maosong},
booktitle={Proceedings of AAAI},
year={2020},
}
@article{Wang2021FromLT,
title={From LSAT: The Progress and Challenges of Complex Reasoning},
author={Siyuan Wang and Zhongkun Liu and Wanjun Zhong and Ming Zhou and Zhongyu Wei and Zhumin Chen and Nan Duan},
journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
year={2021},
volume={30},
pages={2201-2216}
}
```
Provider:
maas
Created:
2024-05-08
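Collected summary
Dataset introduction

Background and challenges
Background overview
This dataset is the Gaokao Chinese subtask of the AGIEval benchmark. It is used to evaluate how foundation models perform on tasks drawn from the Chinese-language portion of the Gaokao, China's national college entrance examination. It is processed from an open-source project, is released under the Apache License 2.0, and includes the relevant academic citation information.
The content above was collected and summarized by 遇见数据集.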



