
Facebook Comment Dataset | Social Network Analysis Dataset | Machine Learning Predictive Modeling Dataset

Indexed by Payititi (帕依提提) on 2024-03-04
Social network analysis
Machine learning predictive modeling
Download link:
https://www.payititi.com/opendatasets/show-26065.html
Resource description:
Data Set Information: The dataset is uploaded in ZIP format and contains 5 variants. For details about the variants and a detailed analysis, read and cite the research paper:

@INPROCEEDINGS{Sing1503:Comment,
  AUTHOR='Kamaljot Singh and Ranjeet Kaur Sandhu and Dinesh Kumar',
  TITLE='Comment Volume Prediction Using Neural Networks and Decision Trees',
  BOOKTITLE='IEEE UKSim-AMSS 17th International Conference on Computer Modelling and Simulation, UKSim2015 (UKSim2015)',
  ADDRESS='Cambridge, United Kingdom',
  DAYS=25,
  MONTH=mar,
  YEAR=2015,
  KEYWORDS='Neural Networks; RBF Network; Prediction; Facebook; Comments; Data Mining; REP Tree; M5P Trees.',
  ABSTRACT='The leading treads towards social networking services had drawn massive public attention from last one and half decade. The amount of data that is uploaded to these social networking services is increasing day by day. So, there is massive requirement to study the highly dynamic behavior of users towards these services. This is a preliminary work to model the user patterns and to study the effectiveness of machine learning predictive modeling approaches on leading social networking service Facebook. We modeled the user comment patters, over the posts on Facebook Pages and predicted that how many comments a post is expected to receive in next H hrs. In order to automate the process, we developed a software prototype consisting of the crawler, Information extractor, information processor and knowledge discovery module. We used Neural Networks and Decision Trees, predictive modeling techniques on different dataset variants and evaluated them under Hits(at)10 (custom measure), Area Under Curve, evaluation Time and Mean Absolute error evaluation metrics. We concluded that the Decision trees performed better than the Neural Networks under light of all evaluation metrics.'
}

The research paper is also available at the conference website: uksim.info/uksim2015/

An extended paper, to be published soon, is:

@ARTICLE{Sing1601:Facebook,
  AUTHOR='Kamaljot Singh',
  TITLE='Facebook Comment Volume Prediction',
  JOURNAL='International Journal of Simulation- Systems, Science and Technology- IJSSST V16',
  ADDRESS='Cambridge, United Kingdom',
  DAYS=30,
  MONTH=jan,
  YEAR=2016,
  KEYWORDS='Neural Networks; RBF Network; Prediction; Facebook; Comments; Data Mining; REP Tree; M5P Trees.',
  ABSTRACT='The amount of data that is uploaded to social networking services is increasing day by day. So, their is massive requirement to study the highly dynamic behavior of users towards these services. This work is to model the user patterns and to study the effectiveness of machine learning predictive modeling approaches on leading social networking service Facebook. We modeled the user comment patters, over the posts on Facebook Pages and predicted that how many comments a post is expected to receive in next H hrs. To automate the process, we developed a software prototype consisting of the crawler, Information extractor, information processor and knowledge discovery module. We used Neural Networks and Decision Trees, predictive modeling techniques on different data-set variants and evaluated them under Hits(at)10, Area Under Curve, evaluation Time and M.A.E metrics. We concluded that the Decision trees performed better than the Neural Networks under light of all metrics.'
}

This paper will be freely available after publication at www.ijssst.info

Attribute Information:

1. Page Popularity/likes (Decimal encoding; Page feature): Defines the popularity of, or support for, the source of the document.
2. Page Checkins (Decimal encoding; Page feature): Describes how many individuals have visited this place so far. This feature applies only to places, e.g. an institution, place, or theater.
3. Page talking about (Decimal encoding; Page feature): Defines the daily interest of individuals in the source of the document/post, i.e. the people who actually come back to the page after liking it. This includes activities such as comments, likes on a post, and shares by visitors to the page.
4. Page Category (Value encoding; Page feature): Defines the category of the source of the document, e.g. place, institution, brand.
5-29. Derived (Decimal encoding; Derived feature): These features are aggregated by page by calculating the min, max, average, median and standard deviation of the essential features.
30. CC1 (Decimal encoding; Essential feature): The total number of comments before the selected base date/time.
31. CC2 (Decimal encoding; Essential feature): The number of comments in the last 24 hours, relative to the base date/time.
32. CC3 (Decimal encoding; Essential feature): The number of comments in the period from the last 48 hours to the last 24 hours, relative to the base date/time.
33. CC4 (Decimal encoding; Essential feature): The number of comments in the first 24 hours after publication of the post but before the base date/time.
34. CC5 (Decimal encoding; Essential feature): The difference between CC2 and CC3.
35. Base time (Decimal (0-71) encoding; Other feature): Selected time in order to simulate the scenario.
36. Post length (Decimal encoding; Other feature): Character count of the post.
37. Post Share Count (Decimal encoding; Other feature): The number of shares of the post, i.e. how many people have shared the post on their timeline.
38. Post Promotion Status (Binary encoding; Other feature): To reach more people with posts in News Feed, individuals promote their posts; this feature indicates whether the post is promoted (1) or not (0).
39. H Local (Decimal (0-23) encoding; Other feature): The H hours for which the target variable (comments received) is given.
40-46. Post published weekday (Binary encoding; Weekdays feature): The day (Sunday...Saturday) on which the post was published.
47-53. Base DateTime weekday (Binary encoding; Weekdays feature): The day (Sunday...Saturday) of the selected base date/time.
54. Target Variable (Decimal; Target): The number of comments in the next H hours (H is given in feature 39).

Relevant Papers: Kamaljot Singh, Ranjeet Kaur Sandhu and Dinesh Kumar, 'Comment Volume Prediction Using Neural Networks and Decision Trees' (Sing1503:Comment, cited in full above).

Kamaljot Singh, Assistant Professor, Lovely Professional University, Jalandhar. Kamaljotsingh2009 '@' gmail.com
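The 54-attribute layout described above can be mapped to column names before any modeling. The following is a minimal sketch in Python, not the authors' tooling. It assumes each record is one comma-separated line of 54 numeric values with no header row, which the description does not state. All column names and helper functions are hypothetical labels chosen for illustration, and the order of the 25 derived features is not given in the description, so they get generic names.

```python
import csv
import io

# Weekday order follows the description: Sunday...Saturday.
WEEKDAYS = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]


def build_column_names():
    """Return 54 illustrative column names matching attributes 1-54."""
    names = ["page_likes", "page_checkins", "page_talking_about", "page_category"]
    # Attributes 5-29: 25 derived features (min, max, average, median, std of
    # the essential features). Their exact order is not documented here,
    # so generic names are used.
    names += [f"derived_{i}" for i in range(5, 30)]
    # Attributes 30-34: essential comment-count features.
    names += ["cc1", "cc2", "cc3", "cc4", "cc5"]
    # Attributes 35-39: other features.
    names += ["base_time", "post_length", "post_share_count",
              "post_promoted", "h_local"]
    # Attributes 40-46 and 47-53: binary weekday indicators.
    names += [f"post_published_{d}" for d in WEEKDAYS]
    names += [f"base_datetime_{d}" for d in WEEKDAYS]
    # Attribute 54: number of comments in the next H hours.
    names.append("target")
    return names


def parse_row(line, names):
    """Parse one comma-separated record (assumed format) into a dict of floats."""
    values = next(csv.reader(io.StringIO(line)))
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    return {name: float(value) for name, value in zip(names, values)}


def cc5_is_consistent(row, tol=1e-9):
    """Check the documented relation CC5 = CC2 - CC3 for one record."""
    return abs(row["cc5"] - (row["cc2"] - row["cc3"])) <= tol


def published_weekday(row):
    """Decode the binary 'post published weekday' indicators, or None if unset."""
    for day in WEEKDAYS:
        if row[f"post_published_{day}"] == 1.0:
            return day
    return None


if __name__ == "__main__":
    names = build_column_names()
    # Synthetic record for demonstration only; not taken from the dataset.
    values = [0] * len(names)
    values[names.index("cc2")] = 5
    values[names.index("cc3")] = 2
    values[names.index("cc5")] = 3
    values[names.index("post_published_wed")] = 1
    row = parse_row(",".join(str(v) for v in values), names)
    print(len(names), cc5_is_consistent(row), published_weekday(row))
```

Checking the CC5 relation on real records is a cheap way to confirm that the column order assumed here matches the file actually downloaded.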
Provided by:
Payititi (帕依提提)
Popular datasets

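Student Classroom Behavior Dataset (SCB-dataset3)

The Student Classroom Behavior Dataset (SCB-dataset3) was created by Chengdu Neusoft University and contains 5,686 images and 45,578 labels, focusing on six behaviors: raising a hand, reading, writing, using a phone, bowing the head, and leaning over the desk. It covers scenes from kindergarten to university and was evaluated with the YOLOv5, YOLOv7 and YOLOv8 algorithms, reaching an average precision of 80.3%. The dataset aims to provide a solid foundation for student behavior detection research and to address the lack of student behavior datasets in education.

Indexed from arXiv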

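HazyDet

HazyDet is a large-scale dataset created by the Army Engineering University of PLA and other institutions, built specifically for object detection from a drone's viewpoint in hazy scenes. It contains 383,000 real-world instances, collected from naturally hazy environments and from normal scenes with artificially added haze to simulate adverse weather conditions. Its construction combines depth estimation with an atmospheric scattering model to ensure the realism and diversity of the data. HazyDet is mainly used for drone object detection in adverse weather and aims to improve drone perception in complex environments.

Indexed from arXiv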

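China Rural Finance Statistics

This dataset contains statistics on rural finance in China, covering key indicators such as the number of rural financial institutions, loan balances, deposit balances, and financial service coverage. The data is organized by year and region and gives a detailed picture of rural financial development.

Indexed from www.pbc.gov.cn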

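TCIA

TCIA (The Cancer Imaging Archive) is a public cancer imaging dataset containing medical imaging data, such as CT, MRI and PET, for many cancer types. The data is usually combined with clinical and pathological information and is used in cancer research and clinical trials.

Indexed from www.cancerimagingarchive.net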

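WeChat Social Network Dataset

This dataset contains user relationship data from the WeChat social network, including follow relationships and interaction behavior between users. It is intended to support research on the structure and dynamics of social networks.

Indexed from www.aminer.cn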