LongAlign-10k
Source: ModelScope community · Updated 2025-12-05 · Indexed 2024-08-31
Download link:
https://modelscope.cn/datasets/ZhipuAI/LongAlign-10k
Resource description:
# LongAlign-10k
<p align="center">
🤗 <a href="https://huggingface.co/datasets/THUDM/LongAlign-10k" target="_blank">[LongAlign Dataset] </a> • 💻 <a href="https://github.com/THUDM/LongAlign" target="_blank">[Github Repo]</a> • 📃 <a href="https://arxiv.org/abs/2401.18058" target="_blank">[LongAlign Paper]</a>
</p>
**LongAlign** is the first full recipe for LLM alignment on long context. We propose the **LongAlign-10k** dataset, containing 10,000 long instruction-following examples of 8k-64k in length. We investigate training strategies, namely **packing (with loss weighting) and sorted batching**, both of which are implemented in our code. For real-world long context evaluation, we introduce **LongBench-Chat**, which evaluates instruction-following capability on queries of 10k-100k in length.
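The two training strategies named above can be illustrated with a small, dependency-free sketch. This is not the code from the linked repository: the function names, the first-fit packing heuristic, and the toy per-token losses are all assumptions made for illustration. The weighting shown scales each sequence's token losses by `K / (N_i * M)` (with `K` packs, `M` sequences, and `N_i` tokens in sequence `i`), which is one way to make a packed batch reproduce the per-sequence average that unpacked batching would give.

```python
from typing import List, Sequence


def sorted_batches(lengths: Sequence[int], batch_size: int) -> List[List[int]]:
    """Sorted batching: group example indices so each batch holds similar lengths."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]


def pack_first_fit(lengths: Sequence[int], max_len: int) -> List[List[int]]:
    """Packing: place each sequence into the first pack with enough room left.

    First-fit is an assumption here; any bin-packing heuristic would do.
    """
    packs: List[List[int]] = []
    room: List[int] = []  # remaining capacity of each pack
    for i, n in enumerate(lengths):
        if n > max_len:
            raise ValueError(f"sequence {i} is longer than max_len")
        for p, free in enumerate(room):
            if n <= free:
                packs[p].append(i)
                room[p] -= n
                break
        else:
            packs.append([i])
            room.append(max_len - n)
    return packs


def sequence_mean_loss(token_losses: List[List[float]]) -> float:
    """Reference value: mean over sequences of each sequence's mean token loss."""
    return sum(sum(t) / len(t) for t in token_losses) / len(token_losses)


def weighted_packed_loss(token_losses: List[List[float]],
                         packs: List[List[int]]) -> float:
    """Packed loss with every token of sequence i weighted by K / (N_i * M)."""
    num_seqs = len(token_losses)
    num_packs = len(packs)
    total = 0.0
    for pack in packs:
        for i in pack:
            weight = num_packs / (len(token_losses[i]) * num_seqs)
            total += weight * sum(token_losses[i])
    return total / num_packs  # average over packs


if __name__ == "__main__":
    lengths = [5, 1, 3, 2]
    # Toy per-token losses, one list per sequence (hypothetical values).
    losses = [[1.0] * 5, [4.0], [2.0] * 3, [3.0, 3.0]]
    packs = pack_first_fit(lengths, max_len=6)
    print(sorted_batches(lengths, batch_size=2))
    print(packs)
    print(sequence_mean_loss(losses), weighted_packed_loss(losses, packs))
```

Without the weight, averaging token losses inside a pack would let sequences with more tokens dominate the gradient; with it, the packed loss matches the unpacked per-sequence average regardless of how sequences are grouped.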
## All Models
We have open-sourced the following models:
|Model|Huggingface Repo|Description|
|---|---|---|
|**LongAlign-6B-64k-base**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-6B-64k-base) | **ChatGLM3-6B** with an extended 64k context window |
|**LongAlign-6B-64k**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-6B-64k) | Chat model obtained by LongAlign training on LongAlign-6B-64k-base|
|**LongAlign-7B-64k-base**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-7B-64k-base) | **Llama-2-7B** with an extended 64k context window |
|**LongAlign-7B-64k**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-7B-64k) | Chat model obtained by LongAlign training on LongAlign-7B-64k-base|
|**LongAlign-13B-64k-base**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-13B-64k-base) | **Llama-2-13B** with an extended 64k context window |
|**LongAlign-13B-64k**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-13B-64k) | Chat model obtained by LongAlign training on LongAlign-13B-64k-base|
|**ChatGLM3-6B-128k**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/chatglm3-6b-128k) | **ChatGLM3-6B** with a 128k context window|
Provider:
maas
Created:
2024-08-19



