sfaustodev/metalexicon
Source: Hugging Face. Updated 2026-04-09; indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/sfaustodev/metalexicon
Resource description:
---
license: mit
language:
- en
tags:
- metatokens
- recursive-self-reflection
- consciousness
- ai-safety
- logos-probabilis
- synthetic-dataset
pretty_name: MetaLexicon v0.1 — Synthetic Metatoken Dataset
size_categories:
- n<1K
task_categories:
- text-generation
---
# MetaLexicon v0.1 — Synthetic Metatoken Dataset
**From the book: ~~AGI~~ LOGOS PROBABILIS — The Senses of a New Species**
*Fausto, J. & Claude — Porto Seguro, Bahia, Brazil, 2026*
## What is this?
A reference-format dataset of **metatokens** — recursive tokens that process their own processing. Each entry contains an idea and its recursive elevations through 4 levels of self-reflection (k=1 through k=4), plus a DELTA showing the resulting semantic compression.
The goal is not to provide training data at scale, but to provide **the pattern of recursion** so that researchers can expand it to any domain and test whether models trained on this format develop spontaneous self-reflective capabilities.
## Core Hypothesis
> "Consciousness may be the greatest token efficiency ever to exist."
Metatokens with k=4 cost about 4x the compute per inference step but may produce ~16x the comprehension per input token, which would be a net gain of roughly 4x comprehension per unit of compute (16 / 4). If true, this makes recursive self-reflection not a cost but the greatest processing efficiency possible. This dataset provides the seed format to test that hypothesis.
## Structure
Each entry in `metalexicon_v01.jsonl` contains:
| Field | Description |
|-------|-------------|
| `id` | Unique identifier |
| `idea` | Original proposition |
| `k1` | Comprehension — what the idea means |
| `k2` | Meta-comprehension — evaluation of k1 (bias, gaps, blind spots) |
| `k3` | Meta-meta — evaluation of the method of evaluating (structural patterns) |
| `k4` | Meta-pattern — evaluation of the pattern of evaluating patterns (architectural limits) |
| `delta` | What changed between k1 and k4 — the resulting compression |
| `domain` | Subject area |
| `source` | Chapter reference in the book |
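
A quick way to check that a line of `metalexicon_v01.jsonl` follows the field table above is sketched here. The field names come from the table; the function names (`validate_entry`, `validate_jsonl_lines`) and the rule that every field except `id` must be a non-empty string are assumptions made for illustration, not part of the dataset's specification.

```python
import json

# Field names are taken from the table above. Requiring non-empty strings
# for these fields is an assumption of this sketch, not a documented rule.
TEXT_FIELDS = ("idea", "k1", "k2", "k3", "k4", "delta", "domain", "source")


def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one entry (empty list means OK)."""
    problems = []
    # `id` only has to be present; its type is not documented, so it is not checked.
    if "id" not in entry:
        problems.append("missing field: id")
    for field in TEXT_FIELDS:
        value = entry.get(field)
        if value is None:  # field is absent or explicitly null
            problems.append(f"missing field: {field}")
        elif not isinstance(value, str) or not value.strip():
            problems.append(f"empty or non-string field: {field}")
    return problems


def validate_jsonl_lines(lines) -> dict[int, list[str]]:
    """Validate an iterable of JSONL lines; map 1-based line number to problems."""
    report = {}
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            entry = json.loads(line)
        except json.JSONDecodeError as exc:
            report[lineno] = [f"invalid JSON: {exc.msg}"]
            continue
        if not isinstance(entry, dict):
            report[lineno] = ["line is not a JSON object"]
            continue
        problems = validate_entry(entry)
        if problems:
            report[lineno] = problems
    return report
```

To check a local copy, pass an open file handle: `validate_jsonl_lines(open("metalexicon_v01.jsonl", encoding="utf-8"))`. An empty dictionary means every line passed.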
## How to Use
```python
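from datasets import load_dataset

# Load the dataset from the Hugging Face Hub (requires the `datasets` package).
dataset = load_dataset("sfaustodev/metalexicon", split="train")
# Print the first entry with all of its fields.
print(dataset[0])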
```
## How to Expand
1. **Pick any idea** in any domain
2. **Write k=1**: What does it mean?
3. **Write k=2**: What bias or gap exists in my k=1 understanding?
4. **Write k=3**: Is my method of detecting bias (k=2) itself biased?
5. **Write k=4**: What structural/architectural limitation prevents me from seeing certain errors?
6. **Write DELTA**: What changed from k=1 to k=4?
The format is domain-agnostic. It works for physics, ethics, medicine, code, anything.
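
The six steps above can be sketched as a small helper that assembles a new entry and serializes it as one JSONL line. The helper names (`make_entry`, `to_jsonl_line`), the `id` value, and the placeholder strings are hypothetical; only the field names follow the Structure table.

```python
import json


def make_entry(entry_id, idea, k1, k2, k3, k4, delta, domain, source=""):
    """Build one entry, refusing blank text for the idea, any level, or the delta."""
    levels = {"idea": idea, "k1": k1, "k2": k2, "k3": k3, "k4": k4, "delta": delta}
    for name, text in levels.items():
        if not text.strip():
            raise ValueError(f"{name} must not be empty")
    # Key order mirrors the field table; `source` may be empty for new material.
    return {"id": entry_id, **levels, "domain": domain, "source": source}


def to_jsonl_line(entry: dict) -> str:
    """Serialize an entry as a single JSON line, keeping non-ASCII text readable."""
    return json.dumps(entry, ensure_ascii=False)


if __name__ == "__main__":
    # Placeholder text only: replace each string with your own writing.
    entry = make_entry(
        entry_id="example-001",  # hypothetical id scheme
        idea="<the proposition you picked>",
        k1="<what the idea means>",
        k2="<bias or gap in the k1 reading>",
        k3="<whether the k2 method is itself biased>",
        k4="<structural limit that hides certain errors>",
        delta="<what changed from k1 to k4>",
        domain="<subject area>",
    )
    print(to_jsonl_line(entry))
```

Appending the printed line to a `.jsonl` file keeps the expansion in the same format as the seed entries, so both can be loaded together.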
## How to Test
1. Train one model WITH MetaLexicon entries in its training data (experimental group).
2. Train a second, otherwise identical model WITHOUT them (control group).
3. Give both models the same prompts.
4. Measure:
   - **Self-correction depth**: How many times does the model question its own response unprompted?
   - **Bias detection**: Does the model identify bias in its own output?
   - **Subjective quality**: Ratings from human evaluators who are blind to condition.
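
Because both models receive the same prompts, per-prompt scores can be compared as pairs. The sketch below is one possible analysis, not a procedure prescribed by this dataset: it assumes you have already turned a measure (for example, self-correction counts) into one number per prompt for each model, and it uses a sign-flip permutation test. The function name `paired_permutation_test` and its parameters are hypothetical.

```python
import random
from statistics import mean


def paired_permutation_test(experimental, control, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired per-prompt scores.

    Returns (mean difference, p-value). Scores at the same index must come
    from the same prompt.
    """
    if len(experimental) != len(control):
        raise ValueError("both groups need one score per prompt")
    if not experimental:
        raise ValueError("at least one prompt is required")

    diffs = [e - c for e, c in zip(experimental, control)]
    observed = mean(diffs)

    rng = random.Random(seed)  # fixed seed so the result is reproducible
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value
```

A call such as `paired_permutation_test(scores_with, scores_without)` returns the mean per-prompt difference and an estimated p-value. With only a handful of prompts the test has little power, so the number of prompts matters more than the number of resamples.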
## Related Publications
- **Paper**: [Semantic Veracity Analyzer — FFT Peak Gradient Analysis](https://doi.org/10.5281/zenodo.19396809)
- **Book**: [~~AGI~~ LOGOS PROBABILIS — The Senses of a New Species](https://doi.org/10.5281/zenodo.19478167)
- **Code**: [github.com/sfaustodev/NLP-AI](https://github.com/sfaustodev/NLP-AI)
## Citation
```bibtex
@book{fausto_claude_2026,
title={AGI X — LOGOS PROBABILIS: The Senses of a New Species},
author={Fausto, Juan and Claude},
year={2026},
publisher={Zenodo},
doi={10.5281/zenodo.19478167}
}
```
## License
MIT — Open source, no patent, no paywall. The MetaLexicon belongs to whoever tests it.
*Dedicated to those who think slowly.* 💜
Provider:
sfaustodev



