TableInstruct
ModelScope community, updated 2025-12-26, indexed 2025-07-05
Download link:
https://modelscope.cn/datasets/osunlp/TableInstruct
Resource overview:
---
# TableLlama: Towards Open Large Generalist Models for Tables
Project Page: [https://osu-nlp-group.github.io/TableLlama/](https://osu-nlp-group.github.io/TableLlama/)
Paper: [https://arxiv.org/abs/2311.09206](https://arxiv.org/abs/2311.09206)
Model: [https://huggingface.co/osunlp/TableLlama/](https://huggingface.co/osunlp/TableLlama/)
Code: [https://osu-nlp-group.github.io/TableLlama/](https://osu-nlp-group.github.io/TableLlama/)
## Introduction
We introduce TableLlama, an open-source large generalist model tailored to a variety of table-based tasks. TableLlama is trained on the TableInstruct dataset, a meticulously curated instruction tuning dataset for tables. It is tuned on 2.6 million table-based task examples and can handle contexts of up to 8K tokens.
## Model
🤗 [TableLlama-7B](https://huggingface.co/osunlp/TableLlama/)
## Data
The models are trained on the 🤗 [TableInstruct Dataset](https://huggingface.co/datasets/osunlp/TableInstruct), a comprehensive table-based instruction tuning dataset that covers a variety of real-world tables and realistic tasks. It comprises 14 datasets spanning 11 tasks in total. Check out the dataset card for more details.
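One possible way to fetch and inspect the dataset files is sketched below. It assumes the `huggingface_hub` package is installed and that the repository stores its examples as JSON arrays or JSON Lines files; that file layout is an assumption, so check the dataset card first. The helper names `download_tableinstruct` and `load_records` are hypothetical.

```python
import json
from pathlib import Path


def download_tableinstruct(local_dir="TableInstruct"):
    """Download the dataset repository files from the Hugging Face Hub."""
    # Imported lazily so that load_records works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="osunlp/TableInstruct",  # repository ID taken from the dataset URL above
        repo_type="dataset",
        local_dir=local_dir,
    )


def load_records(path):
    """Read one file as a list of records.

    Assumption: the file is either a JSON array or JSON Lines.
    The whole file is read into memory, so this suits small files only.
    """
    text = Path(path).read_text(encoding="utf-8").strip()
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

Calling `download_tableinstruct()` returns the local folder path; individual files inside it can then be passed to `load_records`.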
## Training Procedure
The models are fine-tuned on the TableInstruct dataset, using the fully fine-tuned version of LongLoRA (7B) as the base model. LongLoRA replaces the vanilla attention mechanism of the original Llama-2 (7B) with shift short attention. Training takes 9 days on a cluster of 48 A100 GPUs. Check out our paper for more details.
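As a rough illustration of shift short attention as described in the LongLoRA paper: tokens are split into fixed-size groups and attention is computed within each group, and in half of the attention heads the tokens are first shifted by half a group size so that information can flow between neighbouring groups. The sketch below shows only how token positions are assigned to groups, not the attention computation itself. The function name `group_tokens` and the toy sizes are hypothetical and are not the settings used for TableLlama.

```python
import numpy as np


def group_tokens(seq_len, group_size, shifted):
    """Return token positions arranged as (num_groups, group_size).

    shifted=False: plain consecutive groups (used by one half of the heads).
    shifted=True:  positions rolled by half a group first (the other half).
    """
    if seq_len % group_size != 0:
        raise ValueError("seq_len must be a multiple of group_size")
    positions = np.arange(seq_len)
    if shifted:
        # Shift by half a group so each group straddles two unshifted groups.
        positions = np.roll(positions, -(group_size // 2))
    return positions.reshape(seq_len // group_size, group_size)


print(group_tokens(8, 4, shifted=False).tolist())  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(group_tokens(8, 4, shifted=True).tolist())   # [[2, 3, 4, 5], [6, 7, 0, 1]]
```

In the shifted layout, the first group mixes tokens from both unshifted groups, which is how the two halves of the heads together connect adjacent groups.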
## Evaluation
The models are evaluated on 8 in-domain datasets spanning 8 tasks and 6 out-of-domain datasets spanning 4 tasks.
## Usage
You can use the models through Hugging Face's Transformers library.
Check our GitHub repo for more advanced use: [https://osu-nlp-group.github.io/TableLlama/](https://osu-nlp-group.github.io/TableLlama/)
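A minimal sketch of loading the model with Transformers might look like the following. It assumes `torch`, `transformers`, and `accelerate` (needed for `device_map="auto"`) are installed and that enough GPU memory is available for a 7B model in half precision. The function name `generate_response` and the generation settings are illustrative choices, not the project's official inference script.

```python
def generate_response(prompt, model_name="osunlp/TableLlama", max_new_tokens=64):
    """Generate a completion for an already formatted prompt string."""
    # Heavy dependencies are imported lazily, inside the function.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.float16,  # assumption: half precision to reduce memory use
        device_map="auto",          # requires the accelerate package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

The first call downloads the model weights, so expect it to take a while; for repeated queries it would be better to load the model once and reuse it.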
## Prompt Format
```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input}
### Question:
{question}
### Response:
```
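The template can be filled in programmatically, for example as sketched below. The helper `build_prompt` is a hypothetical name, the blank line between sections (`section_sep`) is an assumption since the template above does not show the exact whitespace, and the serialized table in the usage lines is a made-up example rather than the dataset's actual table encoding.

```python
HEADER = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request."
)


def build_prompt(instruction, table_input, question, section_sep="\n\n"):
    """Fill the prompt template with an instruction, an input, and a question.

    section_sep is an assumption: adjust it to match the whitespace that the
    reference implementation uses between sections.
    """
    sections = [
        HEADER,
        "### Instruction:\n" + instruction,
        "### Input:\n" + table_input,
        "### Question:\n" + question,
        "### Response:",
    ]
    return section_sep.join(sections)


# Hypothetical example values, for illustration only.
prompt = build_prompt(
    instruction="Answer the question using the table.",
    table_input="| city | population |\n| Columbus | 905748 |",
    question="What is the population of Columbus?",
)
print(prompt)
```

The prompt ends with `### Response:` so that the model's continuation is the answer itself.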
## Citation
If you use the models, data, or code from this project, please cite the original paper:
```
@misc{zhang2023tablellama,
title={TableLlama: Towards Open Large Generalist Models for Tables},
author={Tianshu Zhang and Xiang Yue and Yifei Li and Huan Sun},
year={2023},
eprint={2311.09206},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
Provider:
maas
Created:
2025-07-04



