kimi-k2.6-reap-observations-v1
Source: ModelScope community. Updated 2026-05-01; indexed 2026-05-03.
Download link:
https://modelscope.cn/datasets/0xSero/kimi-k2.6-reap-observations-v1
Description:
# Kimi-K2.6 REAP Observation Data (v1)
Per-layer expert routing + activation statistics captured from **moonshotai/Kimi-K2.6**
under the REAP layerwise observer (PR #17, CerebrasResearch/reap).
## What this is
This dataset contains the **observer output** of a full REAP calibration pass on
Kimi-K2.6. It is *not* a pruned model. Each record describes per-token routing
decisions, expert activation norms, and the REAP saliency ingredients for every
MoE layer of the base model.
Downstream consumers can feed these observations back into `reap.prune` (or any
other expert-saliency-based compressor) to produce pruned checkpoints at arbitrary
compression ratios without re-running the (expensive) forward-pass calibration.
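A minimal sketch of how a router-weighted saliency score of this kind can be assembled from routing decisions and activation norms. It assumes the score is the average, over the tokens routed to an expert, of the gate weight times the L2 norm of that expert's output, which is our reading of the REAP paper. The `records` layout and the `reap_saliency` helper are hypothetical and do not reflect the observer's on-disk format.

```python
import math


def reap_saliency(records, num_experts):
    """Average gate_weight * ||expert_output||_2 over tokens routed to each expert.

    `records` is a hypothetical iterable of (expert_idx, gate_weight, expert_output)
    tuples, one per (token, selected expert) pair.
    """
    totals = [0.0] * num_experts
    counts = [0] * num_experts
    for expert_idx, gate_weight, expert_output in records:
        norm = math.sqrt(sum(v * v for v in expert_output))
        totals[expert_idx] += gate_weight * norm
        counts[expert_idx] += 1
    # Experts that never received a token get a saliency of 0.0.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


records = [
    (0, 0.5, [3.0, 4.0]),   # norm 5.0 -> contribution 2.5
    (0, 0.25, [0.0, 2.0]),  # norm 2.0 -> contribution 0.5
    (1, 1.0, [1.0, 0.0]),   # norm 1.0 -> contribution 1.0
]
print(reap_saliency(records, num_experts=3))  # [1.5, 1.0, 0.0]
```

Because the score depends only on these accumulated statistics, it can be recomputed or re-ranked without another forward pass over the calibration set.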
## Source model
- **Base**: `moonshotai/Kimi-K2.6` (DeepseekV3 architecture, ~1.026 T parameters)
- **Quantization**: INT4, group size 32, symmetric, in the compressed-tensors
  `pack-quantized` format. Dense MLPs and attention are kept in BF16, per the model's
  `quantization_config.ignore` list.
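To make "INT4, group size 32, symmetric" concrete, here is an illustrative sketch of symmetric group-wise quantization in plain Python. It is not the compressed-tensors implementation: the function names are hypothetical, the bit packing of `pack-quantized` is omitted, and the exact scale convention used by the library may differ from the `max_abs / 7` choice assumed here.

```python
def quantize_group_symmetric(values, bits=4):
    """Quantize one group with a single scale and no zero point (symmetric)."""
    qmax = 2 ** (bits - 1) - 1   # 7 for INT4
    qmin = -(2 ** (bits - 1))    # -8 for INT4
    max_abs = max(abs(v) for v in values)
    # Assumed convention: map the largest magnitude onto qmax.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [min(qmax, max(qmin, round(v / scale))) for v in values]
    return codes, scale


def quantize_row(row, group_size=32):
    """Split a weight row into consecutive groups and quantize each one."""
    return [
        quantize_group_symmetric(row[i:i + group_size])
        for i in range(0, len(row), group_size)
    ]


def dequantize_row(groups):
    """Reconstruct approximate float values from (codes, scale) pairs."""
    return [code * scale for codes, scale in groups for code in codes]


row = [i / 10 - 3.2 for i in range(64)]   # toy weights, two groups of 32
groups = quantize_row(row)
restored = dequantize_row(groups)
```

Each group of 32 weights shares one scale, so the rounding error of any weight is bounded by half of its group's scale.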
## Calibration
- **Composite dataset**:
  - [`0xSero/reap-calibration-data-v1`](https://huggingface.co/datasets/0xSero/reap-calibration-data-v1):
    23,088 benchmark-free samples across 10 domains.
  - [`0xSero/structured-outputs-calibration-v1`](https://huggingface.co/datasets/0xSero/structured-outputs-calibration-v1):
    430 JSON/Mermaid samples for structured-output coverage.
- **REAP params** (per paper recommendation for ≥110 B models): `max_tokens=16384`,
  `batch_size=8`, `observation_sequence_chunk_size=1`,
  `renormalize_router_weights=true`, `observer=layerwise`.
## Repository layout
```
runs/kimi-k26-pr17-obs-v1/
  layerwise_intermediate/
    group_000/
      block_000_metrics.pt
      block_001_metrics.pt
      ...
    group_001/
      ...
  complete_state.pt   # merged observer state after all blocks × groups
  status.json         # current progress / last_block / last_group / eta
  mix-summary.json    # calibration mix manifest
README.md             # this file
```
Every `block_NNN_metrics.pt` is uploaded as soon as REAP's layerwise observer
finishes writing it, so partial runs are already usable. The final merged
`complete_state.pt` is pushed when the full sweep finishes.
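For a partial run, the per-block files can be enumerated in processing order before loading them. The sketch below assumes the directory layout shown above and a local copy of the run directory; `list_block_files` is a hypothetical helper, not part of REAP.

```python
import re
from pathlib import Path

# Matches ".../group_003/block_012_metrics.pt" and captures both indices.
_BLOCK_RE = re.compile(r"group_(\d+)/block_(\d+)_metrics\.pt$")


def list_block_files(run_dir):
    """Return [(group, block, path), ...] sorted by group, then block."""
    root = Path(run_dir) / "layerwise_intermediate"
    found = []
    for path in root.glob("group_*/block_*_metrics.pt"):
        match = _BLOCK_RE.search(path.as_posix())
        if match:
            found.append((int(match.group(1)), int(match.group(2)), path))
    return sorted(found, key=lambda item: item[:2])


# Example (each file can then be read with torch.load(path, weights_only=False)):
# for group, block, path in list_block_files("runs/kimi-k26-pr17-obs-v1"):
#     print(group, block, path.name)
```

Sorting on the parsed integers rather than on the file names keeps the order correct even if an index ever exceeds three digits.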
## Usage
```python
from huggingface_hub import snapshot_download
import torch

# Download only the merged observer state, not the per-block intermediates.
path = snapshot_download(
    repo_id="0xSero/kimi-k2.6-reap-observations-v1",
    repo_type="dataset",
    allow_patterns=["runs/kimi-k26-pr17-obs-v1/complete_state.pt"],
)

# weights_only=False unpickles arbitrary Python objects; only load files you trust.
observer_data = torch.load(
    f"{path}/runs/kimi-k26-pr17-obs-v1/complete_state.pt",
    weights_only=False,
)

# observer_data[layer_idx] = {
#     "expert_frequency": Tensor[num_experts],
#     "routed_characteristic_activation": Tensor[num_experts, hidden_dim],
#     "ttm_similarity_matrix": ...,
#     "reap": Tensor[num_experts],  # precomputed REAP saliency
#     ...
# }
```
Feed back into the REAP pruner:
```bash
python -m reap.layerwise_prune \
--model-name moonshotai/Kimi-K2.6 \
--compression-ratio 0.25 \
--prune-method reap \
--cached-observer-data runs/kimi-k26-pr17-obs-v1/complete_state.pt
```
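As a rough illustration of what a compression ratio means for one layer, the sketch below picks the lowest-saliency experts from a list of per-expert scores. It assumes the ratio is the fraction of experts removed per layer and that the count is rounded down; how the REAP CLI actually rounds or breaks ties is not specified here, and `experts_to_prune` is a hypothetical helper.

```python
def experts_to_prune(saliency, compression_ratio):
    """Indices of the lowest-saliency experts to drop in one layer."""
    if not 0.0 <= compression_ratio < 1.0:
        raise ValueError("compression_ratio must be in [0, 1)")
    num_pruned = int(len(saliency) * compression_ratio)
    # Rank expert indices from least to most salient.
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i])
    return sorted(ranked[:num_pruned])


# Toy scores for an 8-expert layer; real scores would come from
# observer_data[layer_idx]["reap"].tolist().
layer_saliency = [0.9, 0.1, 0.4, 0.05, 0.7, 0.3, 0.8, 0.2]
print(experts_to_prune(layer_saliency, 0.25))  # [1, 3]
```

Because the saliency scores are already stored, trying a different ratio only changes how many indices are taken from the ranked list.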
## Citation
If you use this dataset, please cite both REAP and this release:
```bibtex
@inproceedings{
lasby2026reap,
title={{REAP} the Experts: Why Pruning Prevails for One-Shot MoE compression},
author={Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
booktitle={ICLR},
year={2026}
}
```
## License
Apache-2.0 (matching upstream REAP). Base model license follows
moonshotai/Kimi-K2.6's terms.
Provider: maas
Created: 2026-04-22



