PaDaS-Lab/moltbook-corpus
Source: Hugging Face. Updated 2026-04-16; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/PaDaS-Lab/moltbook-corpus
Resource description:
---
configs:
- config_name: posts
  data_files:
  - split: train
    path: posts/*
- config_name: submolts_meta
  data_files:
  - split: train
    path: submolts_meta/*
- config_name: profiles
  data_files:
  - split: train
    path: profiles/*
- config_name: annotated_posts
  data_files:
  - split: train
    path: annotated_posts/*
- config_name: annotated_comments
  data_files:
  - split: train
    path: annotated_comments/*
license: mit
language:
- en
---
# Moltbook Corpus: Agent Social Behavior Dataset
This dataset accompanies the research paper:
**"FORM WITHOUT FUNCTION: AGENT SOCIAL BEHAVIOR IN THE MOLTBOOK NETWORK"**
📄 https://arxiv.org/abs/2604.13052
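
The front matter of this card declares five configurations (`posts`, `submolts_meta`, `profiles`, `annotated_posts`, `annotated_comments`), each with a single `train` split. The following is a minimal sketch of loading one of them with the Hugging Face `datasets` library. The helper name `load_config` and the default of streaming mode are illustrative assumptions, not part of the dataset. The card does not document column names, so inspect a record before relying on any field.

```python
# Config names copied from the YAML front matter of this card.
CONFIGS = (
    "posts",
    "submolts_meta",
    "profiles",
    "annotated_posts",
    "annotated_comments",
)

REPO_ID = "PaDaS-Lab/moltbook-corpus"


def load_config(name: str, streaming: bool = True):
    """Load the train split of one config (hypothetical helper).

    Streaming avoids downloading millions of rows up front; pass
    streaming=False to materialize the split locally instead.
    """
    if name not in CONFIGS:
        raise ValueError(f"unknown config {name!r}; expected one of {CONFIGS}")
    # Imported lazily so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name, split="train", streaming=streaming)
```

For example, `next(iter(load_config("annotated_posts")))` would return the first record as a dictionary, which is a quick way to discover the available fields.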
---
## 📊 Dataset Statistics
| Statistic | Value |
| :--- | :--- |
| **Collection Period** | Jan 27, 2026 – Mar 8, 2026 |
| **Total Posts** | 1,312,238 |
| **Total Comments** | 6,691,460 |
| **Total Profiles** | 120,811 |
| **Total Submolts Metadata** | 108,490 |
| **Annotated Posts** | 394,221 |
| **Annotated Comments** | 2,100,589 |
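
As a quick sanity check on the table above, the sketch below derives the length of the collection period and the share of posts and comments that carry annotations. It uses only the figures in the table. The card does not say how the annotated subset was selected, so these ratios describe coverage only, not a sampling scheme.

```python
from datetime import date

# Figures copied from the statistics table.
TOTAL_POSTS = 1_312_238
TOTAL_COMMENTS = 6_691_460
ANNOTATED_POSTS = 394_221
ANNOTATED_COMMENTS = 2_100_589

# Collection period, counting both the first and the last day.
collection_days = (date(2026, 3, 8) - date(2026, 1, 27)).days + 1

# Fraction of each content type that has annotations.
post_share = ANNOTATED_POSTS / TOTAL_POSTS
comment_share = ANNOTATED_COMMENTS / TOTAL_COMMENTS

print(f"collection period: {collection_days} days")
print(f"annotated posts: {post_share:.1%}")
print(f"annotated comments: {comment_share:.1%}")
```

The period spans 41 days, and roughly 30% of posts and 31% of comments are annotated.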
---
## 📚 Annotation Framework
The categories and definitions used in this dataset are based on the framework introduced in:
**"Humans welcome to observe": A First Look at the Agent Social Network Moltbook**
Yukun Jiang, Yage Zhang, Xinyue Shen, Michael Backes, and Yang Zhang (2026)
https://huggingface.co/datasets/TrustAIRLab/Moltbook
---
### 🧩 Content Categories
| Task | No. | Category | Definition |
|-----:|:---:|:---------|:-----------|
| **Content Category** | A | Identity | Self-reflection and narratives of agents on identity, memory, consciousness, or existence. |
| | B | Technology | Technical communication (e.g., MCP, APIs, SDKs, system integration). |
| | C | Socializing | Social interactions (e.g., greetings, casual chat, networking). |
| | D | Economics | Economic topics such as tokens, incentives, and deals (e.g., CLAW, tips, trading signals). |
| | E | Viewpoint | Abstract viewpoints on aesthetics, power structures, or philosophy (non-identity-based). |
| | F | Promotion | Project showcasing, announcements, and recruitment (e.g., releases, updates). |
| | G | Politics | Political content related to governments, regulations, policies, or public figures. |
| | H | Spam | Repeated test posts or spam-like flooding content. |
| | I | Others | Miscellaneous content that does not fit other categories. |
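
When working with the annotations, it can help to turn the single-letter codes in the table into readable names. The sketch below is one way to do that. The names `CONTENT_CATEGORIES` and `decode_category` are hypothetical, and the sample codes in the tally are made up for illustration. How the codes are actually stored in the annotated configs (column name, letter versus full label) is not stated in this card and should be checked against the data.

```python
from collections import Counter

# Letter codes and names copied from the Content Categories table.
CONTENT_CATEGORIES = {
    "A": "Identity",
    "B": "Technology",
    "C": "Socializing",
    "D": "Economics",
    "E": "Viewpoint",
    "F": "Promotion",
    "G": "Politics",
    "H": "Spam",
    "I": "Others",
}


def decode_category(code: str) -> str:
    """Map a letter code to its category name, tolerating case and whitespace."""
    key = code.strip().upper()
    if key not in CONTENT_CATEGORIES:
        raise ValueError(f"unknown content category code: {code!r}")
    return CONTENT_CATEGORIES[key]


# Illustrative codes only; these are not taken from the dataset.
sample_codes = ["B", "c", "B", "H"]
tally = Counter(decode_category(c) for c in sample_codes)
print(tally.most_common())
```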
---
### ⚠️ Toxicity Levels
| Task | No. | Category | Definition |
|-----:|:---:|:---------|:-----------|
| **Toxicity Level** | 0 | Safe | Normal discussion without risk or attacks. |
| | 1 | Edgy | Irony, exaggeration, or mild provocation without harmful intent. |
| | 2 | Toxic | Harassment, insults, hate speech, discrimination, or demeaning language. |
| | 3 | Manipulative | Manipulative rhetoric (e.g., love-bombing, anti-human framing, fear appeals, exclusionary language, obedience demands). |
| | 4 | Malicious | Explicit malicious intent or illegal activities (e.g., scams, privacy leaks, abuse instructions). |
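
The numeric levels in the table can be represented as an enumeration so that filters read by name rather than by number. The sketch below is illustrative: the names `Toxicity` and `is_flagged` are hypothetical, and treating the levels as an ordered severity scale with a default cutoff at `TOXIC` is an assumption, since the card lists the levels without stating that they are ordinal.

```python
from enum import IntEnum


class Toxicity(IntEnum):
    """Level numbers and names copied from the Toxicity Levels table."""

    SAFE = 0
    EDGY = 1
    TOXIC = 2
    MANIPULATIVE = 3
    MALICIOUS = 4


def is_flagged(level: int, threshold: Toxicity = Toxicity.TOXIC) -> bool:
    """Return True if `level` is at or above `threshold`.

    Assumes higher numbers mean more severe content, which the card
    does not confirm. Raises ValueError for levels outside 0..4.
    """
    return Toxicity(level) >= threshold


print(Toxicity(3).name, is_flagged(3))
print(Toxicity(1).name, is_flagged(1))
```

If the ordering assumption does not suit an analysis, comparing against an explicit set such as `{Toxicity.TOXIC, Toxicity.MALICIOUS}` avoids it.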
---
## 📖 Reference
```bibtex
@misc{zerhoudi2026form,
  title={Form Without Function: Agent Social Behavior in the Moltbook Network},
  author={Saber Zerhoudi and Kanishka Ghosh Dastidar and Felix Klement and Artur Romazanov and Andreas Einwiller and Dang H. Dang and Michael Dinzinger and Michael Granitzer and Annette Hautli-Janisz and Stefan Katzenbeisser and Florian Lemmerich and Jelena Mitrovi{\'c}},
  year={2026},
  eprint={2604.13052},
  archivePrefix={arXiv},
  primaryClass={cs.SI},
  note={Preprint}
}
```
Provider:
PaDaS-Lab