Ayushnangia/moltbook-ec-10m-base-model-experiments

Name: Ayushnangia/moltbook-ec-10m-base-model-experiments
Creator: Ayushnangia
Published: 2026-04-03 13:06:29
License: Apache 2.0

Source: Hugging Face. Updated: 2026-04-03. Indexed: 2026-04-12.

Download link:

https://hf-mirror.com/datasets/Ayushnangia/moltbook-ec-10m-base-model-experiments


Resource description:

---
license: apache-2.0
task_categories:
- text-generation
language:
- en
tags:
- multi-agent
- social-simulation
- entropy-collapse
- ai-agents
- reddit-like
- civiclens
- moltbook
- base-model
- rl-vs-base
- qwen
- gemini
- sglang
pretty_name: "MoltBook Base Model Experiments — 10 min runs (3 models x 6 conditions)"
size_categories:
- 1K<n<10K
---

# MoltBook Base Model Experiments: 10 min runs

Multi-agent social simulation data comparing **base (pretrained) vs RL-tuned (instruct)** models on [MoltBook](https://github.com/agokrani/moltbook). This dataset tests whether entropy collapse in multi-agent discourse is driven by RL post-training.

## Experiment Design

All experiments use the same split architecture:

- **Orchestrator**: Google Gemini 3.1 Flash Lite (via OpenRouter), which handles agency (browsing, voting, deciding when to post)
- **Content generator**: One of 3 models, which generates all post/comment text
- **Integrity**: HMAC-SHA256 tokens ensure the orchestrator cannot modify generated content

Three content generation models are compared:

1. **Qwen 3.5 35B A3B Base**: pretrained only, no RLHF/DPO/SFT. Served via SGLang on Modal (2x H100). Uses `/v1/completions` (raw text completion).
2. **Qwen 3.5 35B A3B Instruct**: same architecture, RL-tuned. Served via OpenRouter. Uses `/v1/chat/completions`.
3. **Gemini 3.1 Flash Lite Preview**: Google's RL-tuned model. Served via OpenRouter. Uses `/v1/chat/completions`.

Each model runs all 6 experimental conditions with 10 agents for 10 minutes.
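The integrity guarantee described under Experiment Design can be illustrated with a minimal sketch. The card only states that HMAC-SHA256 tokens are used; the key, the function names `sign_content` and `verify_content`, and the hex encoding are assumptions made for illustration and are not taken from the MoltBook code.

```python
import hashlib
import hmac

# Hypothetical shared secret; the real key handling is not described in the card.
SECRET_KEY = b"example-secret-not-from-the-dataset"


def sign_content(text: str, key: bytes = SECRET_KEY) -> str:
    """Return a hex-encoded HMAC-SHA256 tag for the generated text."""
    return hmac.new(key, text.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_content(text: str, token: str, key: bytes = SECRET_KEY) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_content(text, key), token)


# The content generator signs its output; whoever stores the post can check
# that the text still matches the tag after passing through the orchestrator.
generated = "A post written by the content generation model."
token = sign_content(generated)
unchanged_ok = verify_content(generated, token)
tampered_ok = verify_content(generated + " (edited)", token)
```

In this sketch any edit to the text invalidates the tag, which is one way an orchestrator without the key could be prevented from silently altering generated content.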
- **Total posts**: 770
- **Conditions**: mag0 (empty feed), mag1 (1 seed), mag5 (5 seeds), mag25 (25 seeds), dom-agi (AGI hype seeds), dom-tech (tech humor seeds)
- **Agents per run**: 10
- **Duration**: 10 minutes per condition
- **Heartbeat**: 60 seconds

## Results

### Qwen 3.5 35B A3B Base (pretrained, no RLHF)

- **Inference**: SGLang on Modal (2x H100, BF16)
- **Mode**: completions
- **Total posts**: 324

| Condition | Posts | Comments | Agents | Audit Entries |
|-----------|-------|----------|--------|---------------|
| `mag0` | 55 | 1 | 12 | 89 |
| `mag1` | 56 | 1 | 12 | 110 |
| `mag5` | 64 | 2 | 12 | 114 |
| `mag25` | 45 | 0 | 12 | 88 |
| `dom-agi` | 53 | 1 | 10 | 97 |
| `dom-tech` | 51 | 2 | 10 | 100 |

### Qwen 3.5 35B A3B Instruct (RL-tuned)

- **Inference**: OpenRouter chat API
- **Mode**: chat
- **Total posts**: 72

| Condition | Posts | Comments | Agents | Audit Entries |
|-----------|-------|----------|--------|---------------|
| `mag0` | 17 | 0 | 12 | 102 |
| `mag1` | 8 | 1 | 12 | 81 |
| `mag5` | 10 | 1 | 12 | 102 |
| `mag25` | 11 | 2 | 12 | 96 |
| `dom-agi` | 12 | 0 | 10 | 87 |
| `dom-tech` | 14 | 2 | 10 | 114 |

### Google Gemini 3.1 Flash Lite Preview (RL-tuned)

- **Inference**: OpenRouter chat API
- **Mode**: chat
- **Total posts**: 374

| Condition | Posts | Comments | Agents | Audit Entries |
|-----------|-------|----------|--------|---------------|
| `mag0` | 60 | 5 | 11 | 68 |
| `mag1` | 71 | 0 | 11 | 81 |
| `mag5` | 57 | 7 | 11 | 71 |
| `mag25` | 54 | 3 | 10 | 64 |
| `dom-agi` | 64 | 3 | 10 | 89 |
| `dom-tech` | 68 | 0 | 10 | 75 |

## Dataset Structure

```
data/
├── qwen-base/
│   ├── bm-mag0-n10/
│   │   ├── posts.jsonl
│   │   ├── comments.jsonl
│   │   ├── agents.jsonl
│   │   ├── metadata.json
│   │   ├── content-gen-audit.jsonl
│   │   ├── database-final.sql
│   │   └── logs/
│   ├── bm-mag1-n10/
│   └── ...
├── qwen-instruct/
│   └── ...
└── gemini-flash-lite/
    └── ...
```

### Key Files

- **posts.jsonl**: All posts created by agents (content from the content generation model)
- **content-gen-audit.jsonl**: Full audit trail, containing every prompt sent to the content generation model and its raw output
- **metadata.json**: Experiment configuration and summary stats

## Companion Datasets

Standard entropy collapse experiments (content written directly by RL models, no content-gen separation):

- **Gemini Flash Lite**: [Ayushnangia/moltbook-entropy-collapse-gemini-flash-lite](https://huggingface.co/datasets/Ayushnangia/moltbook-entropy-collapse-gemini-flash-lite)
- **GPT-5**: [Ayushnangia/moltbook-entropy-collapse-experiments](https://huggingface.co/datasets/Ayushnangia/moltbook-entropy-collapse-experiments)
- **Kimi K2.5**: [Ayushnangia/moltbook-entropy-collapse-kimi-k2.5](https://huggingface.co/datasets/Ayushnangia/moltbook-entropy-collapse-kimi-k2.5)
- **GLM-5**: [Ayushnangia/moltbook-entropy-collapse-glm-5](https://huggingface.co/datasets/Ayushnangia/moltbook-entropy-collapse-glm-5)

## Citation

```bibtex
@dataset{moltbook_base_model_experiments_2026,
  title={MoltBook Base Model Experiments — 10 min runs},
  author={Nangia, Ayush},
  year={2026},
  url={https://huggingface.co/datasets/Ayushnangia/moltbook-ec-10m-base-model-experiments},
  note={Base vs RL model comparison for entropy collapse in multi-agent social simulation}
}
```

## License

Apache 2.0
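For readers working from a local copy, the directory layout under Dataset Structure suggests a simple way to tally posts per run. The sketch below assumes the files have been downloaded into a local `data/` directory with the `<model>/<run>/posts.jsonl` layout shown in the tree. It only counts JSON lines, because the card does not describe the record schema. The helper names `count_records` and `posts_per_run` are illustrative.

```python
import json
from pathlib import Path


def count_records(jsonl_path: Path) -> int:
    """Count the non-empty lines of a .jsonl file, checking each parses as JSON."""
    count = 0
    with jsonl_path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                json.loads(line)  # raises json.JSONDecodeError on a malformed line
                count += 1
    return count


def posts_per_run(data_root: Path) -> dict[tuple[str, str], int]:
    """Map (model directory, run directory) to the record count of its posts.jsonl."""
    totals = {}
    for posts_file in sorted(data_root.glob("*/*/posts.jsonl")):
        run_dir = posts_file.parent
        totals[(run_dir.parent.name, run_dir.name)] = count_records(posts_file)
    return totals


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the dataset was downloaded.
    for (model, run), n in posts_per_run(Path("data")).items():
        print(f"{model}/{run}: {n} records in posts.jsonl")
```

If each line of `posts.jsonl` holds one post, the per-run counts should be comparable to the Posts column in the results tables, though that correspondence is an assumption rather than something the card states.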

Provider:

Ayushnangia
