
YSGao/ReaMent

Source: Hugging Face. Updated 2026-04-14. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/YSGao/ReaMent
Resource description:
---
license: cc-by-nc-nd-4.0
task_categories:
- text-classification
language:
- en
tags:
- agent
pretty_name: ReaMent
size_categories:
- 1M<n<10M
---

<h1>Boosting Large Language Models for Mental Manipulation Detection via Data Augmentation and Distillation</h1>

[![Paper](https://img.shields.io/badge/arXiv-2512.01282-b31b1b.svg)](https://arxiv.org/abs/2505.15255)
![GitHub Repo stars](https://img.shields.io/github/stars/Yuansheng-Gao/MentalMAD?style=social)

✨ Like ReaMent? Give us a ⭐ Star on GitHub! Your support keeps us going!

[**Yuansheng-Gao/MentalMAD**](https://github.com/Yuansheng-Gao/MentalMAD)

# 🌿 ReaMent Dataset Card

A multi-round, real-world conversation-based mental manipulation detection dataset.

# 🧠 Dataset Summary

The ReaMent dataset was created to address the lack of real-world data in the field of mental manipulation detection.

- **Source**: The dataset is built from the YTD-18M corpus, which contains over 18 million dialogue-like segments extracted from unscripted interactions in web videos. These dialogues cover a wide range of everyday scenarios, such as interviews, group discussions, and situational conversations.
- **Size**: The final dataset consists of 5,000 high-quality annotated dialogues.
- **Diversity**: ReaMent captures a broader range of conversational contexts compared to scripted data, providing more natural and spontaneous interaction patterns.
- **Statistics**: Around 68.3% of dialogues in ReaMent were labeled as containing mental manipulation, while 31.7% were labeled as non-manipulative. The dataset has an average of 4 dialogue turns and 80 words per dialogue.

# 🤗 Key Contributions

- **Real-World Representation**: Unlike scripted or domain-specific datasets (e.g., MentalManip and LegalCon), ReaMent captures natural dialogues, making it valuable for detecting real-world mental manipulation.
- **Scalability**: It complements smaller datasets, offering richer and more representative data for training models that aim to detect manipulative behaviors in social interactions.

# 💻 Usage

```python
from datasets import load_dataset

ds = load_dataset("YSGao/ReaMent")
```

# 📝 Citation

```bibtex
@inproceedings{gao2026boosting,
  title={Boosting Large Language Models for Mental Manipulation Detection via Data Augmentation and Distillation},
  author={Gao, Yuansheng and Gao, Peng and Bao, Han and Li, Bin and Luo, Jixiang and Wang, Zonghui and Chen, Wenzhi},
  booktitle={Proceedings of the ACM Web Conference 2026},
  pages={9033--9043},
  year={2026}
}
```
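
As a quick sanity check on the figures in the Dataset Summary, the reported class shares can be converted into approximate dialogue counts and a majority-class baseline. This is a minimal sketch that uses only the numbers stated in the card (5,000 dialogues, 68.3% and 31.7%). The label names `manipulative` and `non_manipulative` and the helper `approximate_counts` are illustrative assumptions, not the dataset's actual schema.

```python
# Figures as reported in the dataset card.
TOTAL_DIALOGUES = 5_000
SHARES = {
    "manipulative": 0.683,      # hypothetical label name
    "non_manipulative": 0.317,  # hypothetical label name
}


def approximate_counts(total: int, shares: dict[str, float]) -> dict[str, int]:
    """Convert reported class shares into rounded dialogue counts."""
    return {label: round(total * share) for label, share in shares.items()}


counts = approximate_counts(TOTAL_DIALOGUES, SHARES)

# Accuracy of a trivial classifier that always predicts the majority class.
majority_baseline = max(SHARES.values())

print(counts)
print(f"majority-class baseline accuracy: {majority_baseline:.1%}")
```

Under these figures, roughly 3,415 dialogues are labeled manipulative and 1,585 non-manipulative, so a detector should be judged against a 68.3% always-manipulative baseline rather than 50%.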
Provider:
YSGao