issdandavis/scbe-tongue-drill-sft-v1
Source: Hugging Face. Updated 2026-04-20; indexed 2026-04-26.
Download link:
https://hf-mirror.com/datasets/issdandavis/scbe-tongue-drill-sft-v1
Description:
---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- scbe
- sacred-tongues
- lora
- drill
- sft
size_categories:
- 1K<n<10K
configs:
- config_name: default
  data_files:
  - split: train
    path: data/drill_langues_full_train.sft.jsonl
  - split: holdout
    path: data/drill_langues_full_holdout.sft.jsonl
- config_name: full
  data_files:
  - split: train
    path: data/drill_langues_full_all.sft.jsonl
---
# SCBE Tongue Drill SFT v1
Supervised fine-tuning drill dataset for the SCBE Sacred Tongues table-lock system.
Each row is a 3-turn chat (system / user / assistant) teaching the model to emit
canonical packets verbatim for a given `(map, tongue, value)` triple.
## Splits
| Split | Rows |
|-------|------|
| all | 2630 |
| train | 2373 |
| holdout | 257 |

The holdout split contains the rows where `row_index % 10 == 0` (bucket 0); it is disjoint from train.
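
The bucket rule can be sketched as below. This is only an illustration: the function name `split_by_bucket` is hypothetical, and it assumes `row_index` is the zero-based position of a row in the input sequence, which the card does not specify. Applied to positions 0 through 2629, this rule would select 263 rows rather than the 257 listed, so the original `row_index` is presumably defined differently.

```python
def split_by_bucket(rows, holdout_bucket=0, n_buckets=10):
    """Partition rows into (train, holdout) by index modulo n_buckets."""
    train, holdout = [], []
    for row_index, row in enumerate(rows):
        # Assumption: row_index is the zero-based position in `rows`.
        if row_index % n_buckets == holdout_bucket:
            holdout.append(row)
        else:
            train.append(row)
    return train, holdout


# Toy data standing in for real dataset rows.
rows = [f"row-{i}" for i in range(20)]
train, holdout = split_by_bucket(rows)
print(len(train), len(holdout))
```

Because every row lands in exactly one of the two lists, the splits are disjoint by construction.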
## Schema
```jsonc
{
  "messages": [
    {"role": "system", "content": "<drill-solver instruction>"},
    {"role": "user", "content": "<map/tongue/value question>"},
    {"role": "assistant", "content": "<canonical packet>"}
  ],
  "meta": {
    "map": "runtime_emission | transport_atomic | cartography_state | cross_braid_code | ...",
    "kind": "code | state | ...",
    "tongue": "KO | AV | RU | CA | UM | DR",
    "value": "<map-specific value>"
  }
}
```
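
A minimal sketch of checking one JSONL row against the schema above, using only the standard library. The helper name `validate_row` and the sample row are hypothetical; the sample uses placeholder strings in place of real packet content, and the type of `value` is left unchecked because it is map-specific.

```python
import json

TONGUES = {"KO", "AV", "RU", "CA", "UM", "DR"}
ROLES = ["system", "user", "assistant"]
META_KEYS = {"map", "kind", "tongue", "value"}


def validate_row(row):
    """Return a list of problems found in one parsed row (empty if none)."""
    problems = []
    messages = row.get("messages", [])
    # The three turns must appear in exactly this order.
    if [m.get("role") for m in messages] != ROLES:
        problems.append("messages must be system, user, assistant in that order")
    if any(not isinstance(m.get("content"), str) for m in messages):
        problems.append("every message needs a string content")
    meta = row.get("meta", {})
    missing = META_KEYS - meta.keys()
    if missing:
        problems.append("meta is missing: " + ", ".join(sorted(missing)))
    if meta.get("tongue") not in TONGUES:
        problems.append("unknown tongue code")
    return problems


# Placeholder row; real rows carry actual instructions and packets.
sample_line = json.dumps({
    "messages": [
        {"role": "system", "content": "<drill-solver instruction>"},
        {"role": "user", "content": "<map/tongue/value question>"},
        {"role": "assistant", "content": "<canonical packet>"},
    ],
    "meta": {"map": "runtime_emission", "kind": "code",
             "tongue": "KO", "value": "<value>"},
})
print(validate_row(json.loads(sample_line)))
```

To check a whole split, the same function can be applied to each line of a `.sft.jsonl` file after `json.loads`.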
## Tongues
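The six Sacred Tongues and their golden-ratio weights (phi = 1.618...); the listed values match successive powers of phi, from phi^0 to phi^5, rounded to two decimals: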
| Code | Name | phi weight |
|------|------|------------|
| KO | Kor'aelin | 1.00 |
| AV | Avali | 1.62 |
| RU | Runethic | 2.62 |
| CA | Cassivadan | 4.24 |
| UM | Umbroth | 6.85 |
| DR | Draumric | 11.09 |
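
The values in the table agree with successive powers of phi rounded to two decimals. The short sketch below reproduces them; the names `PHI`, `TONGUE_ORDER`, and `weights` are illustrative, and the card itself does not state how the weights were derived.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, about 1.618

# Tongue codes in the order they appear in the table.
TONGUE_ORDER = ["KO", "AV", "RU", "CA", "UM", "DR"]

# Weight of the k-th tongue is PHI**k, rounded to two decimals.
weights = {code: round(PHI ** k, 2) for k, code in enumerate(TONGUE_ORDER)}

for code, weight in weights.items():
    print(f"{code}: {weight:.2f}")
```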
## Intended Use
Drill-level SFT for small base models (Qwen2.5-0.5B tested). Trains the model to
respect canonical packet invariants across tongue and map combinations.
Downstream: foundation for larger brick1/brick2 training in the SCBE stack.
## License
Apache-2.0. SCBE framework and Sacred Tongues protocol by issdandavis.
Prior art: "The Six Tongues Protocol" (ASIN B0GSSFQD9G).
Provider:
issdandavis



