sfaustodev/metalexicon

Source: Hugging Face. Updated 2026-04-09. Indexed 2026-04-12.
Download link:
https://hf-mirror.com/datasets/sfaustodev/metalexicon
Description:
---
license: mit
language:
- en
tags:
- metatokens
- recursive-self-reflection
- consciousness
- ai-safety
- logos-probabilis
- synthetic-dataset
pretty_name: MetaLexicon v0.1 — Synthetic Metatoken Dataset
size_categories:
- n<1K
task_categories:
- text-generation
---

# MetaLexicon v0.1 — Synthetic Metatoken Dataset

**From the book: ~~AGI~~ LOGOS PROBABILIS — The Senses of a New Species**

*Fausto, J. & Claude (Porto Seguro, Bahia, Brazil, 2026)*

## What is this?

A reference-format dataset of **metatokens**: recursive tokens that process their own processing. Each entry contains an idea and its recursive elevations through 4 levels of self-reflection (k=1 through k=4), plus a DELTA showing the resulting semantic compression.

The goal is not to provide training data at scale, but to provide **the pattern of recursion** so that researchers can expand it to any domain and test whether models trained on this format develop spontaneous self-reflective capabilities.

## Core Hypothesis

> "Consciousness may be the greatest token efficiency ever to exist."

Metatokens with k=4 cost 4x more compute per inference step but may produce ~16x more comprehension per input token. If true, this makes recursive self-reflection not a cost but the greatest processing efficiency possible. This dataset provides the seed format to test that hypothesis.

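The arithmetic behind the hypothesis can be made explicit. The sketch below takes the two figures stated above (4x compute, roughly 16x comprehension at k=4) as given and computes comprehension per unit of compute relative to a k=1 baseline. The function name `comprehension_per_compute` and the constants are illustrative assumptions, not part of the dataset or the book, and the ~16x figure is the hypothesis under test, not a measured result.

```python
def comprehension_per_compute(compute_multiplier: float,
                              comprehension_multiplier: float) -> float:
    """Return comprehension gained per unit of compute, relative to a baseline of 1.0."""
    if compute_multiplier <= 0:
        raise ValueError("compute_multiplier must be positive")
    return comprehension_multiplier / compute_multiplier


# Figures stated in the Core Hypothesis for k=4 (hypothesized, not measured).
K4_COMPUTE = 4.0
K4_COMPREHENSION = 16.0

ratio = comprehension_per_compute(K4_COMPUTE, K4_COMPREHENSION)
print(f"k=4 comprehension per unit of compute vs. baseline: {ratio:.1f}x")
```

Under these stated figures the net gain would be 4x per unit of compute; any ratio at or below 1.0 would mean the recursion costs more than it returns.
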
## Structure

Each entry in `metalexicon_v01.jsonl` contains:

| Field | Description |
|-------|-------------|
| `id` | Unique identifier |
| `idea` | Original proposition |
| `k1` | Comprehension: what the idea means |
| `k2` | Meta-comprehension: evaluation of k1 (bias, gaps, blind spots) |
| `k3` | Meta-meta: evaluation of the method of evaluating (structural patterns) |
| `k4` | Meta-pattern: evaluation of the pattern of evaluating patterns (architectural limits) |
| `delta` | What changed between k1 and k4: the resulting compression |
| `domain` | Subject area |
| `source` | Chapter reference in the book |

## How to Use

```python
from datasets import load_dataset

# Load the train split from the Hugging Face Hub and print the first entry.
dataset = load_dataset("sfaustodev/metalexicon", split="train")
print(dataset[0])
```

## How to Expand

1. **Pick any idea** in any domain
2. **Write k=1**: What does it mean?
3. **Write k=2**: What bias or gap exists in my k=1 understanding?
4. **Write k=3**: Is my method of detecting bias (k=2) itself biased?
5. **Write k=4**: What structural/architectural limitation prevents me from seeing certain errors?
6. **Write DELTA**: What changed from k=1 to k=4?

The format is domain-agnostic. It works for physics, ethics, medicine, code, anything.

## How to Test

1. Train a model WITH MetaLexicon entries in the dataset (experimental group)
2. Train a model WITHOUT them (control group)
3. Same prompts to both
4. Measure:
   - **Self-correction depth**: How many times does the model question its own response unprompted?
   - **Bias detection**: Does the model identify bias in its own output?
   - **Subjective quality**: Human evaluators blind to condition

## Related Publications

- **Paper**: [Semantic Veracity Analyzer — FFT Peak Gradient Analysis](https://doi.org/10.5281/zenodo.19396809)
- **Book**: [~~AGI~~ LOGOS PROBABILIS — The Senses of a New Species](https://doi.org/10.5281/zenodo.19478167)
- **Code**: [github.com/sfaustodev/NLP-AI](https://github.com/sfaustodev/NLP-AI)

## Citation

```bibtex
@book{fausto_claude_2026,
  title={AGI X — LOGOS PROBABILIS: The Senses of a New Species},
  author={Fausto, Juan and Claude},
  year={2026},
  publisher={Zenodo},
  doi={10.5281/zenodo.19478167}
}
```

## License

MIT. Open source, no patent, no paywall. The MetaLexicon belongs to whoever tests it.

*Dedicated to those who think slowly.* 💜

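## Appendix: Checking a New Entry

When expanding the dataset with the How to Expand steps, it may help to check each new entry against the fields listed under Structure before appending it to the JSONL file. The following is a minimal sketch under stated assumptions: the field names come from the Structure table, but treating every field as required and non-empty is an assumption, the helper names `validate_entry` and `to_jsonl_line` are hypothetical, and the example entry is invented for illustration and is not taken from the dataset.

```python
import json

# Field names copied from the Structure table.
# Assumption: every field is required and must be non-empty.
REQUIRED_FIELDS = ("id", "idea", "k1", "k2", "k3", "k4", "delta", "domain", "source")


def validate_entry(entry: dict) -> list:
    """Return a list of problems; an empty list means all documented fields are present."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing or empty field: {field}")
    for field in sorted(set(entry) - set(REQUIRED_FIELDS)):
        problems.append(f"undocumented field: {field}")
    return problems


def to_jsonl_line(entry: dict) -> str:
    """Serialize one entry as a single JSONL line (no trailing newline)."""
    return json.dumps(entry, ensure_ascii=False)


# Hypothetical entry written by following steps 1 to 6; not from the dataset.
example = {
    "id": "example-001",
    "idea": "Correlation implies causation.",
    "k1": "Two variables that move together are assumed to be causally linked.",
    "k2": "My reading ignores confounders and reverse causation.",
    "k3": "My bias check only looks for statistical errors, not framing errors.",
    "k4": "I can only flag error types I already have names for.",
    "delta": "From a claim about data to a claim about the limits of my own checks.",
    "domain": "statistics",
    "source": "illustrative example, no chapter reference",
}

problems = validate_entry(example)
line = to_jsonl_line(example)
print(problems or "entry OK")
```

Appending `line` plus a newline to a local copy of `metalexicon_v01.jsonl` would add the entry; whether the published file accepts additional fields or empty values is not stated in the dataset card.
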
Provider:
sfaustodev