
Twitter Big Data as A Resource For Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions

NIAID Data Ecosystem, indexed 2026-03-13
Download link:
https://doi.org/10.7910/DVN/VPPTRF
Resource description:
Please cite the following paper when using this dataset: N. Thakur, “Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions,” Preprints, 2022, DOI: 10.20944/preprints202206.0383.v1

Abstract: Exoskeleton technology has advanced rapidly in recent years owing to its many applications and use cases in assisted living, military, healthcare, firefighting, and industries. With the uses of exoskeletons projected to grow over the next few years in these application domains and beyond, it is crucial to study, interpret, and analyze user perspectives, public opinion, reviews, and feedback related to exoskeletons, for which a dataset is necessary. Today's Internet of Everything era, in which people spend more time on the Internet than ever before, offers a way to develop such a dataset by mining relevant web behavior data from social media communications, which have increased exponentially in the last few years. Twitter, one such social media platform, is highly popular amongst all age groups, who communicate via tweets on diverse topics including but not limited to news, current events, politics, emerging technologies, family, relationships, and career opportunities, while sharing their views, opinions, perspectives, and feedback on those topics. Therefore, this work presents a dataset of about 140,000 Tweets related to exoskeletons that were mined over a 5-year period, from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations that convey user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons.

Instructions: This dataset contains about 140,000 Tweets related to exoskeletons that were mined over a 5-year period, from May 21, 2017, to May 21, 2022.
The tweets contain diverse forms of communications and conversations that convey user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons. In keeping with Twitter's terms and conditions, which permit re-distribution of Twitter data only for research purposes, the dataset contains only tweet identifiers (Tweet IDs). The IDs must be hydrated before they can be used. Hydration of a Tweet ID is the process of retrieving a tweet's complete information (such as the text of the tweet, username, user ID, date and time, etc.) using its ID. The Hydrator application (link to download the application: https://github.com/DocNow/hydrator/releases and link to a step-by-step tutorial: https://towardsdatascience.com/learn-how-to-easily-hydrate-tweets-a0f393ed340e#:~:text=Hydrating%20Tweets) or any similar application may be used for hydrating this dataset.

Data Description: This dataset consists of 7 .txt files. The following shows the number of Tweet IDs and the date range (of the associated tweets) in each of these files.
Filename: Exoskeleton_TweetIDs_Set1.txt (Number of Tweet IDs: 22945; Date Range of Tweets: July 20, 2021 to May 21, 2022)
Filename: Exoskeleton_TweetIDs_Set2.txt (Number of Tweet IDs: 19416; Date Range of Tweets: Dec 1, 2020 to July 19, 2021)
Filename: Exoskeleton_TweetIDs_Set3.txt (Number of Tweet IDs: 16673; Date Range of Tweets: April 29, 2020 to Nov 30, 2020)
Filename: Exoskeleton_TweetIDs_Set4.txt (Number of Tweet IDs: 16208; Date Range of Tweets: Oct 5, 2019 to Apr 28, 2020)
Filename: Exoskeleton_TweetIDs_Set5.txt (Number of Tweet IDs: 17983; Date Range of Tweets: Feb 13, 2019 to Oct 4, 2019)
Filename: Exoskeleton_TweetIDs_Set6.txt (Number of Tweet IDs: 34009; Date Range of Tweets: Nov 9, 2017 to Feb 12, 2019)
Filename: Exoskeleton_TweetIDs_Set7.txt (Number of Tweet IDs: 11351; Date Range of Tweets: May 21, 2017 to Nov 8, 2017)

The final date in the dataset is May 21, 2022, because that was the most recent date at the time of data collection. The dataset will be updated soon to incorporate more recent tweets.
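Before hydrating, it can help to confirm that the downloaded files match the counts listed above and to split the IDs into manageable batches. The following is a minimal sketch, not part of the dataset's own tooling. It assumes each .txt file holds one Tweet ID per line (the description does not state the file layout). The helper names `read_tweet_ids`, `check_counts`, and `batched` are hypothetical, and the batch size of 100 is only an illustrative default that should be adjusted to whatever the chosen hydration tool accepts.

```python
from pathlib import Path
from typing import Dict, Iterator, List, Tuple

# Tweet ID counts copied from the file listing in the dataset description.
# They sum to 138,585, consistent with "about 140,000".
EXPECTED_COUNTS: Dict[str, int] = {
    "Exoskeleton_TweetIDs_Set1.txt": 22945,
    "Exoskeleton_TweetIDs_Set2.txt": 19416,
    "Exoskeleton_TweetIDs_Set3.txt": 16673,
    "Exoskeleton_TweetIDs_Set4.txt": 16208,
    "Exoskeleton_TweetIDs_Set5.txt": 17983,
    "Exoskeleton_TweetIDs_Set6.txt": 34009,
    "Exoskeleton_TweetIDs_Set7.txt": 11351,
}


def read_tweet_ids(path) -> List[str]:
    """Read Tweet IDs from one file.

    Assumption: one ID per line. Blank lines are skipped. IDs are kept
    as strings so that large values are never rounded or reformatted.
    """
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


def check_counts(data_dir) -> Dict[str, Tuple[int, int]]:
    """Return {filename: (expected, found)} for every file whose ID count
    differs from the published count. An empty dict means all files match."""
    mismatches: Dict[str, Tuple[int, int]] = {}
    for name, expected in EXPECTED_COUNTS.items():
        found = len(read_tweet_ids(Path(data_dir) / name))
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


def batched(ids: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs (size is illustrative)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

For example, calling `check_counts` on the directory holding the seven downloaded files returns an empty dictionary when every file matches the listing, and each list yielded by `batched(read_tweet_ids(path))` can then be passed to the Hydrator application or a similar tool.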

Created:
2022-07-09