MichaelR207/rephraser_kimi_v1_0403
收藏Hugging Face2026-04-04 更新2026-04-12 收录
下载链接:
https://hf-mirror.com/datasets/MichaelR207/rephraser_kimi_v1_0403
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
features:
- name: messages
list:
- name: role
dtype: string
- name: content
dtype: string
- name: warc_file
dtype: string
- name: doc_id
dtype: string
- name: spec_id
dtype: string
- name: spec
dtype: string
- name: model
dtype: string
splits:
- name: train
num_examples: 77691
- name: validation
num_examples: 100
license: apache-2.0
---
# rephraser_kimi_v1_0403
Web extraction distillation dataset reprocessed with **moonshotai/Kimi-K2.5**.
Covers the first **10%** of each extraction spec (77,691 train + 100 val rows).
Source: [MichaelR207/rephraser_late_check_0225](https://huggingface.co/datasets/MichaelR207/rephraser_late_check_0225)
## Schema
Each row contains:
- messages: list of 3 dicts (system, user, assistant) -- compatible with chat fine-tuning
- warc_file, doc_id, spec_id, spec, model
提供机构:
MichaelR207



