
Digital Humanities Summer Institute 2014: A #dhsi2014 Archive

Source: figshare.com. Updated 2023-06-01. Indexed 2025-03-22.
Download link:
https://figshare.com/articles/dataset/Digital_Humanities_Summer_Institute_2014_A_dhsi2014_Archive/1050020/3
Resource description:
The Digital Humanities Summer Institute (DHSI) took place 2-6 June 2014. DHSI takes place on the University of Victoria campus and is offered by UVic's Electronic Textual Cultures Lab. http://www.dhsi.org/index.php

This .XLS file contains tweets tagged with #dhsi2014 (not case-sensitive). If you use or refer to this data in any way please cite and link back using the following citation information:

Priego, Ernesto (2014): Digital Humanities Summer Institute 2014: A #dhsi2014 Archive. figshare. http://dx.doi.org/10.6084/m9.figshare.1050020

This file was created and shared by Ernesto Priego (Centre for Information Science, City University London) with a Creative Commons Attribution license (CC-BY) for academic research and educational use.

The complete archive contains 10,686 tweets, the first one dated 26/05/2014 12:32:00 and the last one dated 07/06/2014 00:46:08 (Vancouver Pacific Time).

The tweets contained in this file were collected using Martin Hawksey's TAGS 5.1. Due to the volume of tweets, nine Google Spreadsheets were created during the period of the event, which were subsequently reduced to four. The data was then manually refined into various sheets, which have been included here.

Sheet 0. This 'Cite Me' sheet, including the provenance of this file, citation information, information about its contents, the methods employed and some context.
Sheet 1. "All" includes all 10,686 tweets archived between 26/05/2014 12:32:00 and 07/06/2014 00:46:08 (Vancouver Pacific Time). (Note that this sheet includes some tweets with line breaks, so the number of rows is higher than the number of actual tweets.) This should be cleaned.
Sheet 2. "All DHSI dates" includes 10,056 archived tweets posted throughout the duration of the event, between 02/06/2014 09:16:00 and 07/06/2014 00:46:08. (The event ended on 06/06/2014, but the sheet includes a few tweets published after midnight that night.)
Sheet 3. Covers the period between 26/05/2014 and 31/05/2014, with 290 archived tweets. This is prior to the actual event.
Sheet 4. Covers 01/06/2014, with 335 archived tweets. This is also prior to the actual event.
Sheet 5. Covers 02/06/2014, with 2,829 archived tweets. This was the first day of the event.
Sheet 6. Covers 03/06/2014, with 1,726 archived tweets.
Sheet 7. Covers 04/06/2014, with 1,882 archived tweets.
Sheet 8. Covers 05/06/2014, with 1,970 archived tweets.
Sheet 9. Covers 06/06/2014, with 1,649 archived tweets. This was the last day of the event.
Sheet 10. Covers the early hours of 07/06/2014, with 5 archived tweets (until 07/06/2014 00:46:08 only).
Sheet 11. Includes an archive of tweets tagged with #dhsi14 (the 'official' hashtag was #dhsi2014). It contains 58 tweets archived between 30/05/2014 18:05:28 and 06/06/2014 14:23:17.

To avoid spam, only users with at least 2 followers were included in the archive. Retweets have been included. Column D refers to the date and time at which each tweet was archived (in GMT); Column E refers to the date of publication (in the event's local time; Vancouver Pacific Time).

Please note that both research and experience show that the Twitter search API isn't 100% reliable. Large tweet volumes affect the search collection process. The API might "over-represent the more central users", not offering "an accurate picture of peripheral activity" (González-Bailón, Sandra, et al. 2012). Therefore, it cannot be guaranteed that this file contains each and every tweet tagged with #dhsi2014 during the indicated period. Please take into account that other hashtags for specific sessions might have been used, and that participants might have tweeted from or about the conference (remotely or locally) without the hashtag collected here, so this archive does not represent an authoritative, complete view of all the Twitter activity associated with the event.

Some deduplication and refining has been performed to avoid spam tweets and duplication. Some work was done to ensure the chronology was complete; I have highlighted an apparent gap in the tweets between 05/06/2014 23:41 and 06/06/2014 10:19. This could be due to a later start on the morning of the last day of activities, though one could have expected other tweets from other time zones coming in at that time, so it's possible I have missed them. Some characters in some of the tweets' text might not have been decoded correctly.

Please note the data in this file is likely to require further refining and even deduplication. The data is shared as is. If you use or refer to this data in any way please cite and link back using the citation information above.
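
The counts and timestamps quoted in the description can be cross-checked with a few lines of Python. The sketch below is illustrative only and is not part of the archive: the names DAILY_COUNTS, parse_day_first and gmt_to_pacific are hypothetical, the numbers are copied from the sheet list, and the fixed UTC-7 offset is an assumption (Pacific Daylight Time, which applied in Vancouver on the dates covered). It does not read the .XLS file itself.

```python
from datetime import datetime, timedelta, timezone

# Tweet counts per sheet, copied from the description (Sheets 3 to 10).
DAILY_COUNTS = {3: 290, 4: 335, 5: 2829, 6: 1726, 7: 1882, 8: 1970, 9: 1649, 10: 5}

STATED_TOTAL = 10686        # Sheet 1, "All"
STATED_EVENT_TOTAL = 10056  # Sheet 2, "All DHSI dates"

# Assumption: a fixed UTC-7 offset (PDT) is adequate for late May / early June 2014.
PACIFIC_DAYLIGHT = timezone(timedelta(hours=-7), "PDT")


def parse_day_first(stamp: str) -> datetime:
    """Parse the day-first 'DD/MM/YYYY HH:MM[:SS]' stamps used in the description."""
    for fmt in ("%d/%m/%Y %H:%M:%S", "%d/%m/%Y %H:%M"):
        try:
            return datetime.strptime(stamp, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {stamp!r}")


def gmt_to_pacific(stamp: str) -> datetime:
    """Convert a Column D style GMT stamp to Pacific time for comparison with Column E."""
    gmt = parse_day_first(stamp).replace(tzinfo=timezone.utc)
    return gmt.astimezone(PACIFIC_DAYLIGHT)


# Sum of all per-period sheets, to compare with the Sheet 1 figure.
all_sheets_total = sum(DAILY_COUNTS.values())

# Sum of the five event days only (Sheets 5 to 9), to compare with the Sheet 2 figure.
event_days_total = sum(DAILY_COUNTS[sheet] for sheet in range(5, 10))

# Length of the gap the author highlights on the night of 5 to 6 June.
gap = parse_day_first("06/06/2014 10:19") - parse_day_first("05/06/2014 23:41")

if __name__ == "__main__":
    print("All sheets:", all_sheets_total, "stated:", STATED_TOTAL)
    print("Event days:", event_days_total, "stated:", STATED_EVENT_TOTAL)
    print("Highlighted gap:", gap)
    print("Last tweet in Pacific time:", gmt_to_pacific("07/06/2014 07:46:08"))
```

On these figures, the per-period sheets add up to the stated 10,686, while the stated 10,056 for Sheet 2 equals the five event days without the 5 tweets from the early hours of 07/06/2014. Anyone reusing the data may want to confirm against the spreadsheet which range Sheet 2 actually covers.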

Provider:
figshare.com