
TAT-QA-Arithmetic-CoT

Source: ModelScope community. Updated 2025-12-26. Indexed 2025-12-06.
Download link:
https://modelscope.cn/datasets/cerebras/TAT-QA-Arithmetic-CoT
Resource description:
# Dataset Information

A Chain of Thought (CoT) version of the TAT-QA arithmetic dataset (hosted at https://huggingface.co/datasets/nvidia/ChatQA-Training-Data). The dataset was synthetically generated by prompting Llama 3 70B Instruct.

The dataset was created as part of our work on Cerebras DocChat, a document-based conversational Q&A model. We observed that initial iterations of our model frequently made errors on arithmetic tasks (such as ConvFinQA) because it was trained on datasets such as TAT-QA, where the model must create a final equation in a single shot. We found that the addition of this dataset led to a substantial boost in accuracy (+10 on ConvFinQA).

# Acknowledgement

This dataset is a variation of the TAT-QA dataset and was synthetically generated using Llama 3 70B Instruct.

```
@inproceedings{zhu2021tat,
  title={TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance},
  author={Zhu, Fengbin and Lei, Wenqiang and Huang, Youcheng and Wang, Chao and Zhang, Shuo and Lv, Jiancheng and Feng, Fuli and Chua, Tat-Seng},
  booktitle={Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics},
  year={2021}
}

@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```
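To illustrate the contrast drawn under Dataset Information between producing a final equation in a single shot and reasoning step by step, here is a minimal Python sketch. The figures, the variable names, and the `percent_change_steps` helper are all hypothetical and chosen only for illustration; they are not taken from the dataset and do not reflect its actual record format.

```python
# Hypothetical figures for illustration only; not taken from the dataset.
revenue_2019 = 120.0
revenue_2018 = 100.0

# Single-shot style: the whole answer is one final equation,
# with no intermediate quantities written down.
single_shot = (revenue_2019 - revenue_2018) / revenue_2018 * 100


def percent_change_steps(new, old):
    """Compute a percent change while recording each intermediate step.

    Hypothetical helper: it mimics a chain-of-thought style answer,
    where every intermediate value is stated before it is used.
    """
    steps = []
    difference = new - old
    steps.append(f"Step 1: difference = {new} - {old} = {difference}")
    ratio = difference / old
    steps.append(f"Step 2: ratio = {difference} / {old} = {ratio}")
    percent = ratio * 100
    steps.append(f"Step 3: percent change = {ratio} * 100 = {percent}")
    return steps, percent


steps, cot_answer = percent_change_steps(revenue_2019, revenue_2018)

if __name__ == "__main__":
    for line in steps:
        print(line)
```

Both styles reach the same number; the step-by-step form simply exposes each intermediate value, so a mistake in any one step is visible rather than hidden inside a single expression.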

Provider:
maas
Created:
2025-10-23