MMJBDS/ouroboros-papers
Source: Hugging Face. Updated: 2026-04-22. Indexed: 2026-04-26.
Download link:
https://hf-mirror.com/datasets/MMJBDS/ouroboros-papers
Description:
The Ouroboros Research Papers dataset is a collection of six research papers introducing the concept of Reflexive Intelligence in large language models (LLMs). The dataset focuses on cognitive architectures, reinforcement learning, and phase transitions in LLMs. Key contributions include the first formalization of Reflexive Intelligence, the Observer Depth (OD) quantitative metric, the ReflexBench v1.0 benchmark, and the nine-tier SCRGNDWMT cognitive reward topology. It also covers the Reward Interaction Problem (RIP) and phase transition dynamics in multi-reward GRPO training. All research was conducted using a single 35B-parameter Mixture-of-Experts model across 20+ iterative GRPO training rounds. The dataset is in English, categorized under text-generation tasks, and licensed under CC BY 4.0.
Provider:
MMJBDS



