TweetPap
arXiv | updated 2021-06-14 | indexed 2024-06-21
Download link:
https://github.com/lingo-iitgn/TweetPap
Resource description:
TweetPap is a large-scale dataset created by the Department of Computer Science and Engineering at the Indian Institute of Technology Gandhinagar to study how scientific papers are discussed on social media. It contains 367,124 tweets from 2010 to 2019 that relate to arXiv papers, each mapped to a paper via its arXiv identifier. The tweets were collected using keyword search and link expansion, and yearly citation information was extracted from the Semantic Scholar Corpus. Beyond overall tweet and citation counts, TweetPap provides yearly citations, retweets, likes, and links. It is intended for quantifying the effect of social media activity on scholarly impact and for analyzing the relationship between social media and academic literature.
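The mapping from tweets to papers via arXiv identifiers might look roughly like the sketch below. It is not the authors' pipeline: the record layout (a dict with a `urls` list of already-expanded links) and the function names are hypothetical, and only new-style identifiers such as `1706.03762` are handled.

```python
import re
from collections import Counter

# New-style arXiv identifiers (YYMM.NNNN or YYMM.NNNNN); a trailing
# version suffix such as "v5" is matched but not captured.
ARXIV_URL = re.compile(
    r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE
)


def extract_arxiv_id(url):
    """Return the arXiv identifier in an already-expanded URL, or None."""
    match = ARXIV_URL.search(url)
    return match.group(1) if match else None


def tweets_per_paper(tweets):
    """Count tweets per arXiv identifier.

    `tweets` is an iterable of dicts with a hypothetical "urls" field
    (an assumption, not the dataset's actual schema). A tweet counts
    once per distinct paper it links to.
    """
    counts = Counter()
    for tweet in tweets:
        ids = {extract_arxiv_id(u) for u in tweet.get("urls", [])}
        ids.discard(None)
        counts.update(ids)
    return counts
```

Shortened links (for example `t.co`) would need to be expanded before this step, which is presumably what the link expansion stage in the description covers.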
Provided by:
Department of Computer Science and Engineering, Indian Institute of Technology Gandhinagar
Created:
2021-06-14



