
Insight4news Irish news related hashtagged tweet collection 15.07.2015-24.05.2017

Figshare | Updated 2019-09-13 | Indexed 2026-04-08
Download link:
https://figshare.com/articles/Insight4news_Irish_news_related_tweet_collection_15_07_2015-24_05_2017/7932422/4
Description:
The 1.3GB .tar.gz file contains a 3.6GB (uncompressed) .txt file with 198'725'860 rows, each of which is a tweet ID.

These tweets were collected in the period 15.07.2015-24.05.2017 with the Hashtagger platform (presented in https://doi.org/10.1145/2872427.2882982 by Shi et al.), which considered them relevant to the monitored stream of news from Irish sources (The Irish Times, Irish Examiner, etc.).

All 198'725'860 tweets are in English (with 'en' in the 'lang' field of the JSON objects provided by GNIP) and contain at least one hashtag.

Hydrate the tweet IDs with Twarc (https://github.com/edsu/twarc) and write them to a file. You will need to provide Twarc with a set of Twitter API keys.

    twarc.py --hydrate tweet_ids.txt > tweets.json

Hydrating all the tweets in one go is probably not a good idea; it may be better to split the file into chunks and hydrate them chunk by chunk.

When using the dataset, please cite the following paper, for which this dataset was generated.

SocialTree: Socially Augmented Structured Summaries of News Stories
Gevorg Poghosyan, Georgiana Ifrim
Proceedings of the 30th ACM Conference on Hypertext & Social Media (HT ’19), 2019
https://doi.org/10.1145/3342220.3343668
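
The chunk-by-chunk approach suggested in the description could be prepared with a small script like the one below. It is only a sketch: the function name `split_id_file`, the output file naming scheme, and the default chunk size of one million IDs are assumptions made for illustration and are not part of the dataset or of Twarc. It assumes the archive has already been extracted to a plain text file with one tweet ID per line.

```python
from itertools import islice
from pathlib import Path


def split_id_file(src, out_dir, ids_per_chunk=1_000_000):
    """Split a one-ID-per-line file into numbered chunk files.

    Returns the list of chunk file paths in the order they were written.
    The chunk size and file names are illustrative choices (assumptions).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with open(src, "r", encoding="utf-8") as f:
        # Lazily strip whitespace and skip blank lines, so the 3.6GB file
        # is never loaded into memory as a whole.
        ids = (line.strip() for line in f if line.strip())
        index = 0
        while True:
            chunk = list(islice(ids, ids_per_chunk))
            if not chunk:
                break
            path = out_dir / f"tweet_ids_{index:05d}.txt"
            path.write_text("\n".join(chunk) + "\n", encoding="utf-8")
            chunk_paths.append(path)
            index += 1
    return chunk_paths
```

Calling `split_id_file("tweet_ids.txt", "chunks")` would write files such as `chunks/tweet_ids_00000.txt`, each of which can then be passed to the hydrate command shown above in place of `tweet_ids.txt`. Keeping the chunks as separate files makes it easy to resume after an interruption without hydrating earlier IDs again.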
Created:
2019-04-04