TAT-QA-Arithmetic-CoT
Source: ModelScope community. Updated 2025-12-26; indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/cerebras/TAT-QA-Arithmetic-CoT
Description:
# Dataset Information
A Chain of Thought (CoT) version of the TAT-QA arithmetic dataset (hosted at https://huggingface.co/datasets/nvidia/ChatQA-Training-Data). The dataset was synthetically generated by prompting Llama 3 70B Instruct. It was created as part of our work on Cerebras DocChat, a document-based conversational Q&A model. We observed that early iterations of our model frequently made errors on arithmetic tasks (such as ConvFinQA) because they were trained on datasets such as TAT-QA, where the model must produce a final equation in a single shot. Adding this dataset led to a substantial boost in accuracy (+10 on ConvFinQA).
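The contrast between a single-shot equation and a chain-of-thought answer can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the revenue figures, the function names `single_shot` and `chain_of_thought`, and the step wording are hypothetical and are not taken from the dataset or its actual record format.

```python
from typing import List, Tuple

# Hypothetical figures for illustration only; not drawn from the dataset.
REVENUE_2019 = 1200.0
REVENUE_2018 = 1000.0


def single_shot(new: float, old: float) -> float:
    """Percentage change written as one final equation."""
    return (new - old) / old * 100


def chain_of_thought(new: float, old: float) -> Tuple[List[str], float]:
    """Same calculation, but each intermediate result is written out first."""
    steps: List[str] = []

    difference = new - old
    steps.append(f"Step 1: difference = {new} - {old} = {difference}")

    ratio = difference / old
    steps.append(f"Step 2: ratio = {difference} / {old} = {ratio}")

    percent = ratio * 100
    steps.append(f"Step 3: percent change = {ratio} * 100 = {percent}")

    return steps, percent


if __name__ == "__main__":
    steps, answer = chain_of_thought(REVENUE_2019, REVENUE_2018)
    for line in steps:
        print(line)
    print(f"Answer: {answer}")
```

Both functions compute the same value; the second only makes the intermediate quantities explicit, which is the property a CoT-style training target adds over a single final equation.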
# Acknowledgement
This dataset is a variation of the TAT-QA dataset, and was synthetically generated using Llama 3 70B Instruct.
```
@inproceedings{zhu2021tat,
title={TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance},
author={Zhu, Fengbin and Lei, Wenqiang and Huang, Youcheng and Wang, Chao and Zhang, Shuo and Lv, Jiancheng and Feng, Fuli and Chua, Tat-Seng},
booktitle={Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
year={2021}
}
@article{llama3modelcard,
title={Llama 3 Model Card},
author={AI@Meta},
year={2024},
url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```
Provider:
maas
Created:
2025-10-23



