
MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/13294002
Description:
1. follow_up.jsonl
   This file contains entries that facilitate follow-up questioning. Each line consists of three keys:
   question: Sourced from the GSM8k testing set.
   answer: Corresponding answer from the GSM8k testing set.
   followup: Includes two rounds of follow-up questions and reference answers, formatted as a conversation between a user (A:) and an assistant (B:).

2. error_correction.jsonl
   This file is designed for error correction tasks. Each line consists of three keys:
   question: Sourced from the GSM8k testing set.
   answer: Corresponding answer from the GSM8k testing set.
   error_correction: Contains a conversation between a user (A:) and an assistant (B:), which includes the original question, an incorrect answer, and the process of correcting the error.

3. error_analysis.jsonl
   This file also focuses on error correction but employs a different prompt strategy. Each line consists of three keys:
   question: Sourced from the GSM8k testing set.
   answer: Corresponding answer from the GSM8k testing set.
   error_analysis: Includes a conversation between a user (A:) and an assistant (B:), where the model is prompted to independently determine the correctness of the answer without being explicitly told.

4. p2p_generation.jsonl
   This file contains entries for problem generation tasks. Each line consists of three keys:
   question: Sourced from the GSM8k testing set.
   answer: Corresponding answer from the GSM8k testing set.
   new_problem: A new problem generated by GPT-4 to serve as a reference answer.
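
As a rough illustration of how these files might be read, the sketch below parses JSONL records and splits a conversation string into speaker turns. The key names (question, answer, followup) come from the description above. The exact layout of the conversation strings is not documented here, so the assumption that each turn starts a line with "A:" or "B:" is a guess, and the helper names iter_records and split_turns are hypothetical. The sample record is made up for demonstration and is not taken from the dataset.

```python
import json
import re
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Assumption: every turn begins at the start of a line with "A:" or "B:".
_TURN_MARKER = re.compile(r"^(A|B):\s*", flags=re.MULTILINE)


def split_turns(conversation: str) -> list[tuple[str, str]]:
    """Split a conversation string into (speaker, text) pairs."""
    # With one capture group, re.split returns
    # [preamble, speaker1, text1, speaker2, text2, ...].
    parts = _TURN_MARKER.split(conversation)
    return [
        (parts[i], parts[i + 1].strip())
        for i in range(1, len(parts) - 1, 2)
    ]


if __name__ == "__main__":
    # Made-up record that only mimics the documented key layout.
    sample_line = json.dumps({
        "question": "Tom has 3 apples and buys 2 more. How many now?",
        "answer": "5",
        "followup": "A: What if he eats one?\nB: Then he has 4.",
    })
    for record in iter_records([sample_line]):
        for speaker, text in split_turns(record["followup"]):
            print(speaker, "->", text)
```

To read a downloaded file instead of the in-memory sample, an open file handle can be passed directly, for example iter_records(open("follow_up.jsonl", encoding="utf-8")). The same splitter could be tried on the error_correction and error_analysis keys if their conversations follow the same layout.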
Created:
2024-08-11