hobby data set
Source: arXiv (indexed 2025-09-30)
Download link:
https://github.com/CodingPerson/PEARL
Description:
This dataset is extracted from publicly available Reddit submissions and comments and contains posts from users whose personal attributes have been annotated. All posts containing explicit personal attribute assertions, that is, the posts used for annotation, have been removed from the dataset. It covers approximately 6,000 users and 149 hobby attribute values. The dataset supports the task of predicting personal attributes from conversations.
Provided by:
Authors of the study



