vidore/tatdqa_test
Dataset Overview
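Dataset Information
- Features:
  - query: string
  - image_filename: string
  - image: image
  - answer: string
  - answer_type: string
  - page: string
  - model: string
  - prompt: string
  - source: string
- Splits:
  - test: 1663 examples, 774039186.125 bytes
- Download size: 136066416 bytes
- Dataset size: 774039186.125 bytes
- Configs:
  - default: data files at data/test-*
- License: CC BY 4.0
- Task categories:
  - Visual question answering
  - Question answering
- Language: English
- Tags:
  - Document retrieval
  - Visual question answering
  - Question answering
- Size category: 1K<n<10K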
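The metadata above is enough to sketch how the split might be loaded and sanity-checked. The following is a minimal sketch, assuming the third-party `datasets` package is installed and the Hugging Face Hub is reachable; `missing_columns` and `load_tatdqa_test` are hypothetical helper names introduced here for illustration, not part of the dataset or of any library.

```python
# Column names as listed under the features entry of this card.
EXPECTED_COLUMNS = (
    "query", "image_filename", "image", "answer", "answer_type",
    "page", "model", "prompt", "source",
)


def missing_columns(record, expected=EXPECTED_COLUMNS):
    """Return the expected column names absent from a record (hypothetical helper)."""
    return [name for name in expected if name not in record]


def load_tatdqa_test():
    """Load the test split named in this card from the Hugging Face Hub.

    Assumption: the `datasets` package is installed. It is imported lazily
    so that the schema helper above works without it.
    """
    from datasets import load_dataset

    # "vidore/tatdqa_test" is the repository id given at the top of this card;
    # the only split listed is "test".
    return load_dataset("vidore/tatdqa_test", split="test")
```

Calling `load_tatdqa_test()` triggers the download listed above (136066416 bytes), after which `missing_columns(ds[0])` on the returned dataset `ds` should be an empty list if the hosted schema matches this card.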
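Dataset Description
- Source: The test set of the TAT-DQA dataset, which is built from publicly available real-world financial reports.
- Focus: Rich tabular and textual content that requires numerical reasoning.
- Annotation: Questions and answers were manually annotated by experts in finance.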
Dataset Structure
- Example:
  - questionId: string
  - query: string
  - question_types: null
  - image: image
  - docId: integer
  - image_filename: string
  - page: string
  - answer: null
  - data_split: string
  - source: string
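The fields in this example record do not match the feature list given under the dataset information exactly. A small sketch, using only the field names that appear in this card, makes the differences explicit; `compare_fields` is a hypothetical helper written for this illustration.

```python
# Field names from the features entry of this card.
FEATURE_COLUMNS = {
    "query", "image_filename", "image", "answer", "answer_type",
    "page", "model", "prompt", "source",
}

# Field names from the example record in the dataset structure entry.
EXAMPLE_FIELDS = {
    "questionId", "query", "question_types", "image", "docId",
    "image_filename", "page", "answer", "data_split", "source",
}


def compare_fields(features, example):
    """Group field names by where they appear (hypothetical helper)."""
    return {
        "shared": sorted(features & example),
        "only_in_features": sorted(features - example),
        "only_in_example": sorted(example - features),
    }


report = compare_fields(FEATURE_COLUMNS, EXAMPLE_FIELDS)
for group, names in report.items():
    print(f"{group}: {', '.join(names)}")
```

Six fields are shared, `answer_type`, `model`, and `prompt` appear only in the feature list, and `questionId`, `question_types`, `docId`, and `data_split` appear only in the example. The card does not say which schema the hosted files follow, so it is worth checking the loaded columns before relying on either list.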
Citation Information
- Citation format (BibTeX):

@inproceedings{zhu-etal-2021-tat,
    title = "{TAT}-{QA}: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance",
    author = "Zhu, Fengbin and Lei, Wenqiang and Huang, Youcheng and Wang, Chao and Zhang, Shuo and Lv, Jiancheng and Feng, Fuli and Chua, Tat-Seng",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-long.254",
    doi = "10.18653/v1/2021.acl-long.254",
    pages = "3277--3287"
}

@inproceedings{zhu2022towards,
    title={Towards complex document understanding by discrete reasoning},
    author={Zhu, Fengbin and Lei, Wenqiang and Feng, Fuli and Wang, Chao and Zhang, Haozhou and Chua, Tat-Seng},
    booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
    pages={4857--4866},
    year={2022}
}




