LIMI

Name: LIMI
Creator: maas
Published: 2025-12-05 16:51:01
License: No description available

ModelScope Community. Updated: 2025-12-05. Indexed: 2025-09-27.

Download link:

https://modelscope.cn/datasets/GAIR/LIMI


Resource description:

<div style="display: flex; justify-content: center; align-items: center; gap: 20px;">
  <img src="assets/sii.jpg" alt="SII" width="100px">
  <img src="assets/asi.png" alt="ASI" width="100px">
</div>

<div align="center">
  <a href="https://github.com/GAIR-NLP/LIMI" target="_blank" style="margin: 2px;">
    <img alt="Chat" src="assets/teaser.jpg" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>

# LIMI: Less is More for Agency

[![arXiv](https://img.shields.io/badge/arXiv-2509.17567-b31b1b.svg)](https://arxiv.org/pdf/2509.17567) [![GitHub](https://img.shields.io/badge/GitHub-Repository-green)](https://github.com/GAIR-NLP/LIMI) [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue)](https://huggingface.co/GAIR/LIMI)

---

To learn more about LIMI, explore our documentation and resources. This dataset release includes the following sections:

- **Data Fields**: Message schema, tool definitions, and tool-call format.
- **Splits**: Available partitions and counts.
- **Examples**: Representative JSON samples.

# News

- **2025.10.08**: 📝 Released training scripts for Qwen3 dense models (4B/8B/32B). See the [training scripts](https://github.com/GAIR-NLP/LIMI/tree/main/scripts/train) to reproduce the results.
- **2025.10.08**: 📊 Training on the LIMI dataset improves dense models on **AgencyBench**: Qwen3-4B (4.6% → 8.6%), Qwen3-8B (7.3% → 10.6%), Qwen3-32B (8.4% → 20.5%).
- **2025.10.08**: 🎯 Generalizes to **out-of-domain benchmarks** while maintaining performance: Qwen3-4B (28.3% → 28.9%), Qwen3-8B (31.2% → 32.0%), Qwen3-32B (35.2% → 37.1%).
- **2025.09.23**: 🚀 The LIMI paper is now available on arXiv. See the [paper](https://arxiv.org/pdf/2509.17567) for detailed methodology and experimental results.
- **2025.09.23**: 🤗 Released LIMI models on Hugging Face. Both [LIMI](https://huggingface.co/GAIR/LIMI) (355B) and [LIMI-Air](https://huggingface.co/GAIR/LIMI-Air) (106B) are now available.
- **2025.09.23**: 📊 Released the LIMI training dataset with 78 carefully curated samples on [Hugging Face](https://huggingface.co/datasets/GAIR/LIMI).

## Dataset Summary

Curated agentic training data with OpenAI‑style multi‑turn dialogs and tool calls. The data focuses on functional completeness, correction over rounds, and spec adherence. For more details, see the [GAIR-NLP/LIMI](https://github.com/GAIR-NLP/LIMI) repository.

## Data Fields

- `messages` (list): chat messages with roles `system` | `user` | `assistant` | `tool`
- `tools` (optional list): OpenAI function‑call schemas
- `assistant.tool_calls` (optional, list): entries like `{ "type": "function", "function": { "name": str, "arguments": object } }`

## Splits

- `train`: 78 samples (current release)

## Examples

```json
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
    {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."}
  ],
  "tools": [
    {"type": "function", "function": {"name": "run_tests", "parameters": {"type": "object", "properties": {"path": {"type": "string"}}}}}
  ]
}
```

## License

- Provided for research; verify the final license policy before redistribution.

## Citation

```bibtex
@misc{xiao2025limiagency,
  title={LIMI: Less is More for Agency},
  author={Yang Xiao and Mohan Jiang and Jie Sun and Keyu Li and Jifan Lin and Yumin Zhuang and Ji Zeng and Shijie Xia and Qishuo Hua and Xuefeng Li and Xiaojie Cai and Tongyu Wang and Yue Zhang and Liming Liu and Xia Wu and Jinlong Hou and Yuan Cheng and Wenjie Li and Xiang Wang and Dequan Wang and Pengfei Liu},
  year={2025},
  eprint={2509.17567},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2509.17567},
}
```
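
The field layout listed under Data Fields can be checked mechanically before training. The following is a minimal sketch and not part of the LIMI release: `validate_record`, `_check_function_entry`, and `ALLOWED_ROLES` are hypothetical names, reading `assistant.tool_calls` as a `tool_calls` key on assistant messages is an assumption, and `arguments` is expected to be a dict only because the card lists it as `object`.

```python
# Assumption: the four roles named in the Data Fields list are the only valid ones.
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def _check_function_entry(entry, where, need_arguments):
    """Check one {"type": "function", "function": {...}} entry; return a list of problems."""
    if not isinstance(entry, dict) or entry.get("type") != "function":
        return [f"{where}: expected an object with type 'function'"]
    fn = entry.get("function")
    if not isinstance(fn, dict) or not isinstance(fn.get("name"), str):
        return [f"{where}: 'function.name' must be a string"]
    if need_arguments and not isinstance(fn.get("arguments"), dict):
        return [f"{where}: 'function.arguments' must be an object"]
    return []


def validate_record(record):
    """Return a list of problems found in one record; an empty list means it passed."""
    messages = record.get("messages")
    if not isinstance(messages, list):
        return ["'messages' must be a list"]

    problems = []
    for i, msg in enumerate(messages):
        role = msg.get("role") if isinstance(msg, dict) else None
        if role not in ALLOWED_ROLES:
            problems.append(f"messages[{i}]: unexpected role {role!r}")
            continue
        # Assumption: tool calls only appear on assistant messages.
        if role == "assistant":
            for j, call in enumerate(msg.get("tool_calls") or []):
                problems += _check_function_entry(
                    call, f"messages[{i}].tool_calls[{j}]", need_arguments=True
                )

    tools = record.get("tools")
    if tools is not None and not isinstance(tools, list):
        problems.append("'tools' must be a list when present")
    elif tools:
        for k, tool in enumerate(tools):
            # Tool definitions carry 'parameters', not 'arguments'.
            problems += _check_function_entry(tool, f"tools[{k}]", need_arguments=False)
    return problems


# The record shown under Examples, used here as a smoke test.
sample = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant tasked with discovering mathematical function structures for scientific systems."},
        {"role": "user", "content": "Modify the equation.py function, considering the physical meaning and relationships of the inputs."},
    ],
    "tools": [
        {"type": "function", "function": {"name": "run_tests", "parameters": {"type": "object", "properties": {"path": {"type": "string"}}}}}
    ],
}

print(validate_record(sample))  # prints []
```

If the released files store `arguments` as a JSON-encoded string, as OpenAI API responses do, the `need_arguments` check would need to parse the string first; the card does not say which form is used.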


Provider:

maas

Created:

2025-09-23
