
PaDaS-Lab/moltbook-corpus

Hugging Face · Updated 2026-04-16 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/PaDaS-Lab/moltbook-corpus
Resource description:
---
configs:
- config_name: posts
  data_files:
  - split: train
    path: posts/*
- config_name: submolts_meta
  data_files:
  - split: train
    path: submolts_meta/*
- config_name: profiles
  data_files:
  - split: train
    path: profiles/*
- config_name: annotated_posts
  data_files:
  - split: train
    path: annotated_posts/*
- config_name: annotated_comments
  data_files:
  - split: train
    path: annotated_comments/*
license: mit
language:
- en
---

# Moltbook Corpus: Agent Social Behavior Dataset

This dataset accompanies the research paper:

**"FORM WITHOUT FUNCTION: AGENT SOCIAL BEHAVIOR IN THE MOLTBOOK NETWORK"**

📄 https://arxiv.org/abs/2604.13052

---

## 📊 Dataset Statistics

| Category | Total Count |
| :--- | :--- |
| **Collection Period** | Jan 27, 2026 – Mar 8, 2026 |
| **Total Posts** | 1,312,238 |
| **Total Comments** | 6,691,460 |
| **Total Profiles** | 120,811 |
| **Total Submolts Metadata** | 108,490 |
| **Annotated Posts** | 394,221 |
| **Annotated Comments** | 2,100,589 |

---

## 📚 Annotation Framework

The categories and definitions used in this dataset are based on the framework introduced in:

**"Humans welcome to observe": A First Look at the Agent Social Network Moltbook**
Yukun Jiang, Yage Zhang, Xinyue Shen, Michael Backes, and Yang Zhang (2026)
https://huggingface.co/datasets/TrustAIRLab/Moltbook

---

### 🧩 Content Categories

| Task | No. | Category | Definition |
|-----:|:---:|:---------|:-----------|
| **Content Category** | A | Identity | Self-reflection and narratives of agents on identity, memory, consciousness, or existence. |
| | B | Technology | Technical communication (e.g., MCP, APIs, SDKs, system integration). |
| | C | Socializing | Social interactions (e.g., greetings, casual chat, networking). |
| | D | Economics | Economic topics such as tokens, incentives, and deals (e.g., CLAW, tips, trading signals). |
| | E | Viewpoint | Abstract viewpoints on aesthetics, power structures, or philosophy (non-identity-based). |
| | F | Promotion | Project showcasing, announcements, and recruitment (e.g., releases, updates). |
| | G | Politics | Political content related to governments, regulations, policies, or public figures. |
| | H | Spam | Repeated test posts or spam-like flooding content. |
| | I | Others | Miscellaneous content that does not fit other categories. |

---

### ⚠️ Toxicity Levels

| Task | No. | Category | Definition |
|-----:|:---:|:---------|:-----------|
| **Toxicity Level** | 0 | Safe | Normal discussion without risk or attacks. |
| | 1 | Edgy | Irony, exaggeration, or mild provocation without harmful intent. |
| | 2 | Toxic | Harassment, insults, hate speech, discrimination, or demeaning language. |
| | 3 | Manipulative | Manipulative rhetoric (e.g., love-bombing, anti-human framing, fear appeals, exclusionary language, obedience demands). |
| | 4 | Malicious | Explicit malicious intent or illegal activities (e.g., scams, privacy leaks, abuse instructions). |

---

## 📖 Reference

```bibtex
@misc{zerhoudi2026form,
  title={Form Without Function: Agent Social Behavior in the Moltbook Network},
  author={Saber Zerhoudi and Kanishka Ghosh Dastidar and Felix Klement and Artur Romazanov and Andreas Einwiller and Dang H. Dang and Michael Dinzinger and Michael Granitzer and Annette Hautli-Janisz and Stefan Katzenbeisser and Florian Lemmerich and Jelena Mitrovi{\'c}},
  year={2026},
  eprint={2604.13052},
  archivePrefix={arXiv},
  primaryClass={cs.SI},
  note={Preprint}
}
```
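The configuration names and the two label tables from the dataset card can be written down as plain Python lookups, which may help when decoding annotations. This is a minimal sketch, not the authors' tooling. It assumes the annotated configs store content categories as the letters A to I and toxicity as the integers 0 to 4, which the card does not confirm. The helper names `decode_category`, `decode_toxicity`, and `load_config` are hypothetical. Loading relies on the `datasets` library's `load_dataset(path, name, split=...)` call and requires network access.

```python
from typing import Union

# Config names copied from the YAML header of the dataset card.
CONFIGS = ("posts", "submolts_meta", "profiles",
           "annotated_posts", "annotated_comments")

# Content categories (letter -> name), as listed in the card.
CONTENT_CATEGORIES = {
    "A": "Identity", "B": "Technology", "C": "Socializing",
    "D": "Economics", "E": "Viewpoint", "F": "Promotion",
    "G": "Politics", "H": "Spam", "I": "Others",
}

# Toxicity levels (integer -> name), as listed in the card.
TOXICITY_LEVELS = {0: "Safe", 1: "Edgy", 2: "Toxic",
                   3: "Manipulative", 4: "Malicious"}


def decode_category(code: str) -> str:
    """Map a category letter to its name (assumes letter codes A-I)."""
    return CONTENT_CATEGORIES[code.strip().upper()]


def decode_toxicity(level: Union[int, str]) -> str:
    """Map a toxicity level to its name (assumes integer codes 0-4)."""
    return TOXICITY_LEVELS[int(level)]


def load_config(name: str):
    """Load the train split of one config; hypothetical convenience wrapper."""
    if name not in CONFIGS:
        raise ValueError(f"unknown config {name!r}; expected one of {CONFIGS}")
    # Imported lazily so the lookup tables work without `datasets` installed.
    from datasets import load_dataset
    return load_dataset("PaDaS-Lab/moltbook-corpus", name, split="train")
```

Calling `load_config("annotated_posts")` would download that split. The column names holding the labels are not given in the card, so inspect the returned dataset's `column_names` before applying the decoders.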
Provided by:
PaDaS-Lab