next-tat/tat-llm-instructions
TAT-LLM-Instructions Dataset Overview
Dataset Information
- Features:
  - resp: string
  - id: string
  - user_prompt: string
- Splits:
  - train: 165,619,445 bytes, 32,555 examples
  - validation: 21,180,081 bytes, 4,136 examples
- Download size: 37,315,773 bytes
- Dataset size: 186,799,526 bytes
Configurations
- Default configuration:
  - train: data/train-*
  - validation: data/validation-*
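As a quick sanity check on the figures above, the following sketch models one record with the three string features and recomputes the totals from the per-split numbers. The `Record` type and the `SPLITS` dictionary are illustrative names introduced here, not part of the dataset's own tooling.

```python
from typing import TypedDict


class Record(TypedDict):
    """One row of the dataset; all three features are strings."""
    resp: str
    id: str
    user_prompt: str


# Split statistics copied from the dataset information above.
SPLITS = {
    "train": {"num_bytes": 165_619_445, "num_examples": 32_555},
    "validation": {"num_bytes": 21_180_081, "num_examples": 4_136},
}

total_bytes = sum(split["num_bytes"] for split in SPLITS.values())
total_examples = sum(split["num_examples"] for split in SPLITS.values())
train_share = SPLITS["train"]["num_examples"] / total_examples

print(total_bytes)     # 186799526, which matches the reported dataset size
print(total_examples)  # 36691
print(f"train share of examples: {train_share:.1%}")
```

The per-split byte counts add up exactly to the reported dataset size, so the two splits listed here account for the whole dataset.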
Task Categories
- Text generation
- Question answering
- Table question answering
Tags
- Finance
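Dataset Description
TAT-LLM-Instructions is a curated financial dataset structured as instructions. It aggregates three publicly available tabular and textual question answering datasets: FinQA, TAT-QA, and TAT-DQA. Using dedicated templates, it converts the original datasets into prompts optimized for large language models (LLMs) and external executors, with the aim of significantly improving their performance.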
Templates
FinQA Instruction Template
Below is an instruction that describes a question answering task in the finance domain, paired with an input table and its relevant text that provide further context. The given question is relevant to the table and text. Generate an appropriate answer to the given question.
Instruction:
Given a table and a list of texts in the following, what is the answer to the question? Please complete the task in three steps:
- In the first step, extract the relevant numerical values from the provided table or texts. Store these in the variable ‘{evidence}‘. If there are multiple values, separate them using the ’#’ symbol.
- In the second step, generate an equation using the extracted numerical values. Store this equation in the variable ‘{equation}‘.
- In the third step, calculate the answer based on the equation and store it in the variable ‘{answer}‘.
Please organize the results in the following table:
| step | output |
| 1 | {evidence} |
| 2 | {equation} |
| 3 | {answer} |
Finally, present the calculated answer in the format: "The answer is: {answer}"
Table {table}
Text {text}
Question {question}
Response
|step | output|
|1 | {gold_evidence} |
|2 | {gold_equation} |
|3 | {gold_answer} |
The answer is: {gold_answer}
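A minimal sketch of how the FinQA template might be filled and how the final answer could be read back from a response. The helper names (`fill_inputs`, `final_answer`), the abbreviated template string, and the sample table, question, and response are all made up for illustration; they are not taken from the dataset or its tooling.

```python
import re
from typing import Optional

# Abbreviated stand-in for the FinQA template; the full instruction text is omitted.
TEMPLATE_TAIL = (
    "Store these in the variable {evidence}.\n"
    "Table {table}\n"
    "Text {text}\n"
    "Question {question}\n"
    "Response\n"
)

_INPUT = re.compile(r"\{(table|text|question)\}")
_ANSWER = re.compile(r"The answer is:\s*(.+)")


def fill_inputs(template: str, **values: str) -> str:
    """Fill only the three input placeholders in a single pass.

    A targeted regex is used instead of str.format so that literal
    placeholders such as {evidence} in the instruction are left untouched.
    """
    return _INPUT.sub(lambda match: values[match.group(1)], template)


def final_answer(response: str) -> Optional[str]:
    """Return the text after the last 'The answer is:' marker, if any."""
    matches = _ANSWER.findall(response)
    return matches[-1].strip() if matches else None


# Made-up example inputs and a made-up response in the template's layout.
prompt = fill_inputs(
    TEMPLATE_TAIL,
    table="| year | revenue |\n| 2019 | 120 |\n| 2018 | 100 |",
    text="Revenue is reported in millions.",
    question="What was the change in revenue from 2018 to 2019?",
)
response = (
    "|step | output|\n"
    "|1 | 120#100 |\n"
    "|2 | 120 - 100 |\n"
    "|3 | 20 |\n"
    "The answer is: 20"
)

print(prompt)
print(final_answer(response))  # 20
```

Substituting in one pass also avoids re-scanning inserted content, so a table cell that happens to contain the text `{question}` would not be replaced.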
TAT-QA Instruction Template
Below is an instruction that describes a question answering task in the finance domain, paired with an input table and its relevant text that provide further context. The given question is relevant to the table and text. Generate an appropriate answer to the given question.
Instruction
Given a table and a list of texts in the following, answer the question posed using the following five-step process:
- Step 1: Predict the type of question being asked. Store this prediction in the variable ‘{question_type}‘. The value of ‘{question_type}‘ can be one of the following:‘Single span‘, ‘Multiple spans‘, ‘Count‘, or ‘Arithmetic‘.
- Step 2: Extract the relevant strings or numerical values from the provided table or texts. Store these pieces of evidence in the variable ‘{evidence}‘. If there are multiple pieces of evidence, separate them using the ’#’ symbol.
- Step 3: if the ‘{question_type}‘ is ‘Arithmetic‘, formulate an equation using values stored in ‘{evidence}‘. Store this equation in the variable ‘{equation}‘. For all other question types, set the value of {equation} to ’N.A.’.
- Step 4: Predict or calculate the answer based on the question type, evidence and equation. Store it in the variable ‘{answer}‘. If there are multiple values, separate them using the ’#’ symbol.
- Step 5: If the value of the ‘{answer}‘ is numerical, predict its scale and store it in a variable named ‘{scale}‘. The value of ‘{scale}‘ can be one of the following: ‘none‘, ‘percent‘, ‘thousand‘, ‘million‘, or ‘billion‘. For non-numerical values, set the value of ‘{scale}‘ to ’none’.
Please organize the results in the following table:
| step | output |
| 1 | {question_type} |
| 2 | {evidence} |
| 3 | {equation} |
| 4 | {answer} |
| 5 | {scale} |
Finally, present the final answer in the format: "The answer is: {answer} #### and its corresponding scale is: {scale}"
Table {table}
Text {text}
Question {question}
Response
| step | output |
| 1 | {gold_question_type} |
| 2 | {gold_evidence} |
| 3 | {gold_equation} |
| 4 | {gold_answer} |
| 5 | {gold_scale} |
The answer is: {gold_answer} #### and its corresponding scale is: {gold_scale}
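The five-step response layout can be parsed back into named fields. The sketch below assumes each table row sits on its own line and that the closing sentence follows the template wording exactly; `parse_response`, `STEP_NAMES`, and the sample response are hypothetical and not drawn from the dataset.

```python
import re
from typing import Dict, Optional

# Row number -> variable name, following the five steps of the template.
STEP_NAMES = {1: "question_type", 2: "evidence", 3: "equation", 4: "answer", 5: "scale"}

_ROW = re.compile(r"^\|\s*(\d+)\s*\|\s*(.*?)\s*\|\s*$")
_FINAL = re.compile(
    r"The answer is:\s*(.*?)\s*#### and its corresponding scale is:\s*(\S+)"
)


def parse_response(response: str) -> Dict[str, object]:
    """Split a five-step response into its table cells and final answer."""
    steps: Dict[str, str] = {}
    for line in response.splitlines():
        row = _ROW.match(line.strip())
        # The header row "| step | output |" has no digit and is skipped.
        if row and int(row.group(1)) in STEP_NAMES:
            steps[STEP_NAMES[int(row.group(1))]] = row.group(2)
    final = _FINAL.search(response)
    answer: Optional[str] = final.group(1) if final else None
    scale: Optional[str] = final.group(2) if final else None
    return {"steps": steps, "answer": answer, "scale": scale}


# Made-up response in the template's layout.
sample = (
    "| step | output |\n"
    "| 1 | Arithmetic |\n"
    "| 2 | 120#100 |\n"
    "| 3 | 120 - 100 |\n"
    "| 4 | 20 |\n"
    "| 5 | million |\n"
    "The answer is: 20 #### and its corresponding scale is: million"
)

parsed = parse_response(sample)
print(parsed["steps"]["question_type"])        # Arithmetic
print(parsed["steps"]["evidence"].split("#"))  # ['120', '100']
print(parsed["answer"], parsed["scale"])       # 20 million
```

Multiple pieces of evidence or multiple answer values are recovered by splitting the corresponding cell on the `#` separator that the template prescribes.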
TAT-DQA Instruction Template
Below is an instruction that describes a question answering task in the finance domain, paired with an input document that has one or multiple pages that provide further context. The given question is relevant to the document. Generate an appropriate answer to the given question.
Instruction
Given a document that has one or multiple pages in the following, answer the question posed using the following five-step process:
- Step 1: Predict the type of question being asked. Store this prediction in the variable ‘{question_type}‘. The value of ‘{question_type}‘ can be one of the following:‘Single span‘, ‘Multiple spans‘, ‘Count‘, or ‘Arithmetic‘.
- Step 2: Extract the relevant strings or numerical values from the provided document. Store these pieces of evidence in the variable ‘{evidence}‘. If there are multiple pieces of evidence, separate them using the ’#’ symbol.
- Step 3: if the ‘{question_type}‘ is ‘Arithmetic‘, formulate an equation using values stored in ‘{evidence}‘. Store this equation in the variable ‘{equation}‘. For all other question types, set the value of {equation} to ’N.A.’.
- Step 4: Predict or calculate the answer based on the question type, evidence and equation. Store it in the variable ‘{answer}‘. If there are multiple values, separate them using the ’#’ symbol.
- Step 5: If the value of the ‘{answer}‘ is numerical, predict its scale and store it in a variable named ‘{scale}‘. The value of ‘{scale}‘ can be one of the following: ‘none‘, ‘percent‘, ‘thousand‘, ‘million‘, or ‘billion‘. For non-numerical values, set the value of ‘{scale}‘ to ’none’.
Please organize the results in the following table:
| step | output |
| 1 | {question_type} |
| 2 | {evidence} |
| 3 | {equation} |
| 4 | {answer} |
| 5 | {scale} |
Finally, present the final answer in the format: "The answer is: {answer} #### and its corresponding scale is: {scale}"
Text {pages}
Question {question}
Response
| step | output |
| 1 | {gold_question_type} |
| 2 | {gold_evidence} |
| 3 | {gold_equation} |
| 4 | {gold_answer} |
| 5 | {gold_scale} |
The answer is: {gold_answer} #### and its corresponding scale is: {gold_scale}
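Because the templates keep the equation in its own step, an external executor can recompute the answer instead of trusting the model's arithmetic. The sketch below is one possible executor, not the one used by the dataset authors. It assumes equations use only numbers, parentheses, and the operators `+`, `-`, `*`, `/`; the actual equation syntax in the data may differ. `execute_equation` is a hypothetical name.

```python
import ast
import operator
from typing import Optional

# Whitelisted operators; anything else in the equation is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _evaluate(node: ast.AST) -> float:
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def execute_equation(equation: str) -> Optional[float]:
    """Evaluate a step-3 equation, or return None for the 'N.A.' marker."""
    equation = equation.strip()
    if equation == "N.A.":
        return None
    return _evaluate(ast.parse(equation, mode="eval"))


print(execute_equation("(120 - 100) / 100"))  # 0.2
print(execute_equation("N.A."))               # None
```

Walking the syntax tree with a whitelist, rather than calling `eval`, means that names, function calls, and attribute access in a model-generated equation raise an error instead of being executed.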



