
BlogCatalog dataset

Source: DataCite Commons. Updated 2020-08-25. Indexed 2024-07-28.
Download link:
https://figshare.com/articles/BlogCatalog_dataset/11923611
Resource description:
Abstract: BlogCatalog is a social blog directory that manages bloggers and their blogs.

Number of Nodes: 10,312
Number of Edges: 333,983
Missing Values? no

Source: Nitin Agarwal+, Xufei Wang*, Huan Liu*

+ Department of Information Science, University of Arkansas at Little Rock. E-mail: nxagarwal@ualr.edu

* School of Computing, Informatics and Decision Systems Engineering, Arizona State University. E-mail: huan.liu@asu.edu, xufei.wang@asu.edu

Data Set Information: 2 files are included:

1. nodes.csv
-- This file lists all the users. It serves as a dictionary of every user in the data set, is useful for fast reference, and contains all the node ids used in the dataset.

2. edges.csv
-- This is the friendship network among the bloggers. A blogger's friends are represented as edges. Here is an example.

1,2

This means the blogger with id "1" is a friend of the blogger with id "2".

Attribute Information: This data set was crawled in July 2009 from BlogCatalog ( http://www.blogcatalog.com ). BlogCatalog is a social blog directory website. The data set contains the crawled friendship network. For easier understanding, all the contents are organized in CSV file format.

-. Basic statistics

Number of bloggers: 88,784

Number of friendship pairs: 4,186,390

Relevant Papers:

Nitin Agarwal and Huan Liu. "Modeling and Data Mining in Blogosphere", Synthesis Lectures on Data Mining and Knowledge Discovery #1, Morgan & Claypool Publishers, Robert Grossman (Editor), August 2009. ISBN: 9781598299083 (paperback), ISBN: 9781598299090 (ebook).

Nitin Agarwal, Magdiel Galan, Huan Liu, and Shankar Subramanya. "WisColl: Collective Wisdom based Blog Clustering", Journal of Information Science, 180(1): 39-61, January 2010.

Nitin Agarwal, Huan Liu, Sudheendra Murthy, Arunabha Sen, and Xufei Wang. "A Social Identity Approach to Identify Familiar Strangers in a Social Network", 3rd International AAAI Conference on Weblogs and Social Media (ICWSM09), pp. 2-9, May 17-20, 2009. San Jose, California.
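
As a rough illustration of the edges.csv layout described above (one `id,id` pair per line), the following Python sketch builds an adjacency map from such rows. The function name `load_friendships`, the integer conversion of ids, and the treatment of each pair as a symmetric friendship are assumptions made for this example, not details confirmed by the dataset authors. The sample rows are made up and are not taken from the real file.

```python
import csv
import io
from collections import defaultdict


def load_friendships(lines):
    """Build an undirected adjacency map from rows of the form 'id,id'."""
    neighbors = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        # Assumption: node ids are integers, as in the "1,2" example.
        a, b = int(row[0]), int(row[1])
        neighbors[a].add(b)
        neighbors[b].add(a)  # assumption: friendship is symmetric
    return neighbors


# Tiny made-up sample in the edges.csv format (not real data).
sample = io.StringIO("1,2\n1,3\n2,3\n")
graph = load_friendships(sample)

# Each undirected edge is stored twice, once per endpoint.
edge_count = sum(len(friends) for friends in graph.values()) // 2
print(len(graph), "nodes,", edge_count, "edges")
```

To try this on the downloaded data, the same function could presumably be given an open file handle, for example `open("edges.csv", newline="")`, assuming the file has no header row. The resulting node and edge counts can then be compared against the statistics quoted in the description.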

Provider:
figshare
Created:
2020-04-20