five

The Blog Authorship Corpus

收藏
SSH Open MarketPlace2021-07-22 更新2024-08-03 收录
下载链接:
https://marketplace.sshopencloud.eu/dataset/1UXld8
下载链接
链接失效反馈
官方服务:
资源简介:
The Blog Authorship Corpus consists of the collected posts of 19,320 bloggers gathered from blogger.com in August 2004. The corpus incorporates a total of 681,288 posts and over 140 million words - or approximately 35 posts and 7250 words per person. Each blog is presented as a separate file, the name of which indicates a blogger id# and the blogger’s self-provided gender, age, industry and astrological sign.Cite as: J. Schler, M. Koppel, S. Argamon and J. Pennebaker (2006). Effects of Age and Gender on Blogging in _Proceedings of 2006 AAAI Spring Symposium on Computational Approaches for Analyzing Weblogs_.
创建时间:
2021-07-22
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作