BlogCatalog dataset
DataCite Commons | Updated 2025-06-01 | Indexed 2024-07-28
Download link:
https://figshare.com/articles/BlogCatalog_dataset/11923611/3
Description:
Abstract: BlogCatalog is a social blog directory that manages bloggers and their blogs.

Number of Nodes: 10,312
Number of Edges: 333,983
Missing Values? No

Source: Nitin Agarwal+, Xufei Wang*, Huan Liu*

+ Department of Information Science, University of Arkansas at Little Rock. E-mail: nxagarwal@ualr.edu
* School of Computing, Informatics and Decision Systems Engineering, Arizona State University. E-mail: huan.liu@asu.edu, xufei.wang@asu.edu

Data Set Information: 2 files are included:

1. nodes.csv
-- The file of all users. It serves as a dictionary of all users in this data set, is useful for fast reference, and contains all the node ids used in the dataset.

2. edges.csv
-- The friendship network among the bloggers. Each blogger's friends are represented as edges. For example:

1,2

This means the blogger with id "1" is a friend of the blogger with id "2".

Attribute Information: This data set was crawled in July 2009 from BlogCatalog ( http://www.blogcatalog.com ), a social blog directory website. It contains the crawled friendship network. For easier understanding, all contents are organized in CSV file format.

-. Basic statistics

Number of bloggers: 88,784
Number of friendship pairs: 4,186,390

Relevant Papers:

Nitin Agarwal and Huan Liu. "Modeling and Data Mining in Blogosphere", Synthesis Lectures on Data Mining and Knowledge Discovery #1, Morgan & Claypool Publishers, Robert Grossman (Editor), August 2009. ISBN: 9781598299083 (paperback), ISBN: 9781598299090 (ebook).

Nitin Agarwal, Magdiel Galan, Huan Liu, and Shankar Subramanya. "WisColl: Collective Wisdom based Blog Clustering", Journal of Information Science, 180(1): 39-61, January 2010.

Nitin Agarwal, Huan Liu, Sudheendra Murthy, Arunabha Sen, and Xufei Wang. "A Social Identity Approach to Identify Familiar Strangers in a Social Network", In Proceedings of the Third International AAAI Conference on Weblogs and Social Media (ICWSM09), pp. 2-9, May 17-20, 2009, San Jose, California.
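The two-file layout described above (a list of node ids plus "id,id" friendship pairs) can be read with the Python standard library alone. The following is a minimal sketch, not the dataset authors' own tooling. It assumes that neither file has a header row, that nodes.csv holds one id per row, and that friendship is symmetric. The function names `read_nodes`, `read_edges`, and `summarize` are hypothetical, and the sample rows are made-up stand-ins rather than real data.

```python
import csv
import io
from collections import defaultdict


def read_nodes(handle):
    """Return the set of node ids, assuming one id per row and no header."""
    return {row[0].strip() for row in csv.reader(handle) if row and row[0].strip()}


def read_edges(handle):
    """Build an undirected adjacency map from 'id,id' rows (no header assumed)."""
    adjacency = defaultdict(set)
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        a, b = row[0].strip(), row[1].strip()
        if a == b:
            continue  # assumption: self-loops are not meaningful friendships
        adjacency[a].add(b)
        adjacency[b].add(a)  # friendship is treated as symmetric here
    return adjacency


def summarize(nodes, adjacency):
    """Count nodes, distinct undirected edges, and edge ids absent from nodes."""
    # Each undirected edge is stored twice, once per endpoint.
    edge_count = sum(len(friends) for friends in adjacency.values()) // 2
    unknown = set(adjacency) - nodes
    return {"nodes": len(nodes), "edges": edge_count, "unknown_ids": sorted(unknown)}


# Tiny in-memory stand-ins for nodes.csv and edges.csv (illustrative only).
sample_nodes = io.StringIO("1\n2\n3\n")
sample_edges = io.StringIO("1,2\n2,3\n")
stats = summarize(read_nodes(sample_nodes), read_edges(sample_edges))
print(stats)
```

For the real files, the `io.StringIO` objects would be replaced with `open("nodes.csv", newline="")` and `open("edges.csv", newline="")`. Comparing the resulting counts against the figures quoted in the description is a quick sanity check on the download.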
Provider:
figshare
Created:
2020-04-20
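Collected summary
Dataset introduction

Background and challenges
Background overview
The BlogCatalog dataset is a network dataset from a social blog directory. It contains 10,312 user nodes and 333,983 friendship edges, is organized in CSV format, and is suited to social network analysis and data mining research.
The content above was collected and summarized by 遇见数据集.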



