LongAlign-10k

Name: LongAlign-10k
Creator: maas
Published: 2025-12-05 16:16:50
License: not specified

ModelScope community. Updated: 2025-12-05. Indexed: 2024-08-31.

Download link:

https://modelscope.cn/datasets/ZhipuAI/LongAlign-10k


Description:

# LongAlign-10k

<p align="center">
🤗 <a href="https://huggingface.co/datasets/THUDM/LongAlign-10k" target="_blank">[LongAlign Dataset]</a> • 💻 <a href="https://github.com/THUDM/LongAlign" target="_blank">[Github Repo]</a> • 📃 <a href="https://arxiv.org/abs/2401.18058" target="_blank">[LongAlign Paper]</a>
</p>

**LongAlign** is the first full recipe for LLM alignment on long context. We propose the **LongAlign-10k** dataset, containing 10,000 long instruction data samples of 8k-64k in length. We investigate training strategies, namely **packing (with loss weighting) and sorted batching**, which are all implemented in our code. For real-world long context evaluation, we introduce **LongBench-Chat**, which evaluates instruction-following capability on queries of 10k-100k in length.

## All Models

We open-sourced the following list of models:

|Model|Huggingface Repo|Description|
|---|---|---|
|**LongAlign-6B-64k-base**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-6B-64k-base) | **ChatGLM3-6B** with an extended 64k context window |
|**LongAlign-6B-64k**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-6B-64k) | Chat model by LongAlign training on LongAlign-6B-64k-base |
|**LongAlign-7B-64k-base**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-7B-64k-base) | **Llama-2-7B** with an extended 64k context window |
|**LongAlign-7B-64k**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-7B-64k) | Chat model by LongAlign training on LongAlign-7B-64k-base |
|**LongAlign-13B-64k-base**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-13B-64k-base) | **Llama-2-13B** with an extended 64k context window |
|**LongAlign-13B-64k**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/LongAlign-13B-64k) | Chat model by LongAlign training on LongAlign-13B-64k-base |
|**ChatGLM3-6B-128k**| [🤗 Huggingface Repo](https://huggingface.co/THUDM/chatglm3-6b-128k) | **ChatGLM3-6B** with a 128k context window |
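The sorted batching strategy named in the description can be illustrated with a minimal sketch. This is not the code from the LongAlign repository: the function names `sorted_batches` and `padded_tokens` and the example lengths are hypothetical, and the sketch only assumes the general idea of grouping sequences of similar length into the same batch so that less computation is spent on padding.

```python
import random
from typing import List, Sequence


def sorted_batches(lengths: Sequence[int], batch_size: int, seed: int = 0) -> List[List[int]]:
    """Group example indices into batches of similar sequence length (hypothetical helper)."""
    # Sort indices by sequence length so that neighbours have similar lengths.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    # Cut the sorted order into consecutive batches.
    batches = [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
    # Shuffle the batch order so training does not proceed strictly short-to-long.
    random.Random(seed).shuffle(batches)
    return batches


def padded_tokens(lengths: Sequence[int], batches: List[List[int]]) -> int:
    """Total tokens processed if every batch is padded to its longest sequence."""
    return sum(max(lengths[i] for i in batch) * len(batch) for batch in batches)


if __name__ == "__main__":
    # Made-up sequence lengths in the 8k-64k range, for illustration only.
    lengths = [8_000, 64_000, 9_000, 60_000, 16_000, 15_000]
    naive = [[0, 1], [2, 3], [4, 5]]  # batches taken in dataset order
    grouped = sorted_batches(lengths, batch_size=2)
    print("naive padded tokens: ", padded_tokens(lengths, naive))
    print("sorted padded tokens:", padded_tokens(lengths, grouped))
```

With these made-up lengths, batching in dataset order pads short sequences up to 60k-64k, while the length-sorted grouping keeps each batch close to its own maximum. One trade-off of this kind of grouping is that batches are no longer drawn uniformly at random from the data, which is why the sketch shuffles the batch order.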


Provider:

maas

Created:

2024-08-19
