2016 United States Presidential Election Tweet Ids
Source: DataCite Commons. Updated: 2025-05-12. Indexed: 2025-05-17.
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/PDI7IN
Description:
<p>This dataset contains the tweet ids of approximately 280 million tweets related to the 2016 United States presidential election. They were collected between July 13, 2016 and November 10, 2016 from the Twitter API using <a href="http://gwu-libraries.github.io/sfm-ui/">Social Feed Manager</a>.</p>
<p>These tweet ids are divided into 12 collections. Each collection was gathered from either the <a href="https://dev.twitter.com/rest/reference/get/statuses/user_timeline">GET statuses/user_timeline method</a> of the Twitter REST API or the <a href="https://dev.twitter.com/streaming/reference/post/statuses/filter">POST statuses/filter method</a> of the Twitter Streaming API. The collections are:
<ul>
<li>Candidates and key election hashtags (Twitter filter): election-filter[1-6].txt</li>
<li>Democratic candidates (Twitter user timeline): democratic-candidate-timelines.txt</li>
<li>Democratic Convention (Twitter filter): democratic-convention-filter.txt</li>
<li>Democratic Party (Twitter user timeline): democratic-party-timelines.txt</li>
<li>Election Day (Twitter filter): election-day.txt</li>
<li>First presidential debate (Twitter filter): first-debate.txt</li>
<li>GOP Convention (Twitter filter): republican-convention-filter.txt</li>
<li>Republican candidates (Twitter user timeline): republican-candidate-timelines.txt</li>
<li>Republican Party (Twitter user timeline): republican-party-timelines.txt</li>
<li>Second presidential debate (Twitter filter): second-debate.txt</li>
<li>Third presidential debate (Twitter filter): third-debate.txt</li>
<li>Vice Presidential debate (Twitter filter): vp-debate.txt</li>
</ul>
There is also a README.txt file for each collection containing additional documentation on how it was collected.</p>
<p>The <a href="https://dev.twitter.com/rest/reference/get/statuses/lookup">GET statuses/lookup method</a> supports retrieving the complete tweet for a tweet id (known as hydrating). Tools such as <a href="https://github.com/DocNow/twarc">Twarc</a> or <a href="https://github.com/DocNow/hydrator">Hydrator</a> can be used to hydrate tweets. When hydrating, be aware that:
<ul>
<li>Twitter limits hydration to 900 requests of 100 tweet ids each per 15-minute window per set of user credentials. This works out to 8,640,000 tweets per day, so hydrating the entire dataset will take about 32 days.</li>
<li>The Twitter API will not return tweets that have been deleted or belong to accounts that have been suspended, deleted, or made private. You should expect a large number of these tweets to be unavailable.</li>
</ul></p>
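<p>The rate-limit arithmetic above can be checked with a short sketch. The constants come from the description; the function names and the <code>batches</code> helper are illustrative only and are not part of Twarc, Hydrator, or Social Feed Manager. The helper groups ids into chunks of 100, the per-request size quoted above.</p>

```python
from itertools import islice
from typing import Iterable, Iterator, List

# Figures taken from the dataset description.
REQUESTS_PER_WINDOW = 900
IDS_PER_REQUEST = 100
WINDOW_MINUTES = 15
DATASET_SIZE = 280_000_000  # approximate


def ids_per_day() -> int:
    """Maximum ids one set of credentials can hydrate in 24 hours."""
    windows_per_day = (24 * 60) // WINDOW_MINUTES  # 96 windows
    return REQUESTS_PER_WINDOW * IDS_PER_REQUEST * windows_per_day


def days_to_hydrate(total_ids: int = DATASET_SIZE) -> float:
    """Days needed to hydrate total_ids at the maximum rate."""
    return total_ids / ids_per_day()


def batches(ids: Iterable[str], size: int = IDS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` ids (hypothetical helper)."""
    it = iter(ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    print(ids_per_day())
    print(round(days_to_hydrate(), 1))
```

<p>This is a best case: it assumes a single set of credentials running continuously at the full rate for the whole period.</p>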
<p>There may be duplicate tweets across collections. Also, according to the Twitter documentation, duplicate tweets are possible for tweets collected from the Twitter filter stream.</p>
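<p>Duplicate ids can be dropped before hydrating so that no request quota is spent on repeats. The following is a minimal sketch that assumes each .txt file holds one tweet id per line (an assumption; check the README.txt for each collection). The function name <code>unique_ids</code> is hypothetical.</p>

```python
from typing import Iterable, Iterator, Set


def unique_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield each tweet id once, in first-seen order.

    Assumes one id per line; blank lines are skipped.
    """
    seen: Set[str] = set()
    for line in lines:
        tweet_id = line.strip()
        if not tweet_id or tweet_id in seen:
            continue
        seen.add(tweet_id)
        yield tweet_id


if __name__ == "__main__":
    sample = ["1\n", "2\n", "1\n", "\n", "3"]
    print(list(unique_ids(sample)))
```

<p>An in-memory set is adequate for a single collection. For the full set of roughly 280 million ids it may need several gigabytes of memory, so an on-disk sort and de-duplication step may be the more practical choice.</p>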
<p>For tweets collected from the Twitter filter stream, this is not a complete set of tweets that match the filter. Gaps may exist because:
<ul>
<li>Twitter limits the number of tweets returned by the filter at any point in time.</li>
<li>Social Feed Manager stops and starts the Twitter filter stream every 30 minutes.</li>
<li>In Social Feed Manager, collecting is turned off while a user is making changes to the collection criteria.</li>
<li>There were some operational issues, e.g., network interruptions, during the collection period.</li>
</ul></p>
<p>Because some of the terms used to collect from the Twitter filter stream were broad (e.g., “election”), the dataset may contain tweets about elections other than the U.S. presidential election, including state elections, local elections, or elections in other countries.</p>
<p>Per Twitter’s <a href="https://dev.twitter.com/overview/terms/policy.html#id8">Developer Policy</a>, tweet ids may be publicly shared; tweets may not.</p>
<p>Questions about this dataset can be sent to sfm@gwu.edu. George Washington University researchers should contact us for access to the tweets.</p>
<p>This work is supported by grant #NARDI-14-50017-14 from the National Historical Publications and Records Commission.</p>
Provider:
Harvard Dataverse
Created:
2016-11-29



