hails/agieval-gaokao-english

Name: hails/agieval-gaokao-english
Creator: hails
Published: 2024-01-26 18:38:04
License: 暂无描述

Hugging Face2024-01-26 更新2024-03-04 收录

下载链接：

https://hf-mirror.com/datasets/hails/agieval-gaokao-english

下载链接

链接失效反馈

官方服务：

资源简介：

--- dataset_info: features: - name: query dtype: string - name: choices sequence: string - name: gold sequence: int64 splits: - name: test num_bytes: 688986 num_examples: 306 download_size: 200861 dataset_size: 688986 configs: - config_name: default data_files: - split: test path: data/test-* --- # Dataset Card for "agieval-gaokao-english" Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo, following dmayhem93/agieval-* datasets on the HF hub. This dataset contains the contents of the Gaokao-English subtask of AGIEval, as accessed in https://github.com/ruixiangcui/AGIEval/commit/5c77d073fda993f1652eaae3cf5d04cc5fd21d40 . Citation: ``` @misc{zhong2023agieval, title={AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models}, author={Wanjun Zhong and Ruixiang Cui and Yiduo Guo and Yaobo Liang and Shuai Lu and Yanlin Wang and Amin Saied and Weizhu Chen and Nan Duan}, year={2023}, eprint={2304.06364}, archivePrefix={arXiv}, primaryClass={cs.CL} } ``` Please make sure to cite all the individual datasets in your paper when you use them. We provide the relevant citation information below: ``` @inproceedings{ling-etal-2017-program, title = "Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems", author = "Ling, Wang and Yogatama, Dani and Dyer, Chris and Blunsom, Phil", booktitle = "Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)", month = jul, year = "2017", address = "Vancouver, Canada", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/P17-1015", doi = "10.18653/v1/P17-1015", pages = "158--167", abstract = "Solving algebraic word problems requires executing a series of arithmetic operations{---}a program{---}to obtain a final answer. However, since programs can be arbitrarily complicated, inducing them directly from question-answer pairs is a formidable challenge. To make this task more feasible, we solve these problems by generating answer rationales, sequences of natural language and human-readable mathematical expressions that derive the final answer through a series of small steps. Although rationales do not explicitly specify programs, they provide a scaffolding for their structure via intermediate milestones. To evaluate our approach, we have created a new 100,000-sample dataset of questions, answers and rationales. Experimental results show that indirect supervision of program learning via answer rationales is a promising strategy for inducing arithmetic programs.", } @inproceedings{hendrycksmath2021, title={Measuring Mathematical Problem Solving With the MATH Dataset}, author={Dan Hendrycks and Collin Burns and Saurav Kadavath and Akul Arora and Steven Basart and Eric Tang and Dawn Song and Jacob Steinhardt}, journal={NeurIPS}, year={2021} } @inproceedings{Liu2020LogiQAAC, title={LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning}, author={Jian Liu and Leyang Cui and Hanmeng Liu and Dandan Huang and Yile Wang and Yue Zhang}, booktitle={International Joint Conference on Artificial Intelligence}, year={2020} } @inproceedings{zhong2019jec, title={JEC-QA: A Legal-Domain Question Answering Dataset}, author={Zhong, Haoxi and Xiao, Chaojun and Tu, Cunchao and Zhang, Tianyang and Liu, Zhiyuan and Sun, Maosong}, booktitle={Proceedings of AAAI}, year={2020}, } @article{Wang2021FromLT, title={From LSAT: The Progress and Challenges of Complex Reasoning}, author={Siyuan Wang and Zhongkun Liu and Wanjun Zhong and Ming Zhou and Zhongyu Wei and Zhumin Chen and Nan Duan}, journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing}, year={2021}, volume={30}, pages={2201-2216} } ```

提供机构：

hails

原始信息汇总

数据集概述

数据集信息

特征:
- query: 类型为字符串。
- choices: 类型为字符串序列。
- gold: 类型为整数序列。
分割:
- test: 包含306个样本，总字节数为688986。
大小:
- 下载大小: 200861字节。
- 数据集大小: 688986字节。

配置

默认配置:
- 数据文件:
  - test分割的数据文件路径为data/test-*。

5,000+

优质数据集

54 个

任务类型

进入经典数据集