Dataset of Mastodon toots using the hashtag #Fairdata
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/8252442
Description:
This dataset contains Mastodon toots that use the hashtag #Fairdata. It is intended as a foundation for further network analysis of the FAIR data topic.
Data Collection
Data was harvested using the Mastodon API. For each Mastodon server listed in the dataset, the API was employed to retrieve posts tagged with "fairdata". Up to 5,000 posts were collected per request, utilizing the API's pagination feature to obtain all available posts for that hashtag. Each Mastodon server was queried separately, and the results were stored in distinct CSV files.
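The collection loop described above can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' script: it assumes the public hashtag timeline endpoint (`/api/v1/timelines/tag/:hashtag`) with `max_id` pagination, and it reads the 5,000-post figure as an overall cap per server. The function names `fetch_page` and `collect_hashtag`, the `max_posts` parameter, and the page size of 40 are assumptions made for this example.

```python
import json
import urllib.parse
import urllib.request


def fetch_page(server, hashtag, max_id=None, limit=40):
    """Fetch one page of the public hashtag timeline from one Mastodon server."""
    params = {"limit": limit}
    if max_id is not None:
        # Ask only for posts older than the last one already seen.
        params["max_id"] = max_id
    url = "https://{}/api/v1/timelines/tag/{}?{}".format(
        server, urllib.parse.quote(hashtag), urllib.parse.urlencode(params)
    )
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def collect_hashtag(server, hashtag, fetch=fetch_page, max_posts=5000):
    """Page backwards through the timeline until it is exhausted or capped."""
    posts = []
    max_id = None
    while len(posts) < max_posts:
        page = fetch(server, hashtag, max_id=max_id)
        if not page:
            break  # no older posts are available on this server
        posts.extend(page)
        # Timelines are returned newest first, so the last item is the oldest.
        max_id = page[-1]["id"]
    return posts[:max_posts]
```

Calling `collect_hashtag("example.social", "fairdata")` once per server (the hostname here is a placeholder) and writing each result to its own CSV file would mirror the per-server layout described above. The `fetch` argument is injectable so the loop can be exercised without network access.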
Potential Duplicates
Given that Mastodon operates as a federated network, a post made on one instance can be replicated across different instances. This implies that the same post might appear in the data from multiple Mastodon servers, likely accounting for the duplicates observed in the dataset.
Analysis of Federation Dynamics: Duplicates can reveal which content is shared between servers and which servers are most active in the federation. In this context, duplicates could provide valuable insights for the network analysis.
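One way to surface such duplicates is to group rows from all servers by a shared key and record which servers hold each post. The sketch below is only an illustration: it assumes that the `id` column may differ between servers for the same federated post, and therefore matches on `created_at` plus `content` instead. The function name `group_duplicates` and the choice of key fields are assumptions, not part of the dataset documentation.

```python
from collections import defaultdict


def group_duplicates(rows_by_server, key_fields=("created_at", "content")):
    """Map each post key to the sorted list of servers on which it appears.

    rows_by_server: dict of server name -> list of row dicts (one per post).
    Only keys seen on more than one server are returned.
    """
    servers_by_key = defaultdict(set)
    for server, rows in rows_by_server.items():
        for row in rows:
            key = tuple(row[field] for field in key_fields)
            servers_by_key[key].add(server)
    return {
        key: sorted(servers)
        for key, servers in servers_by_key.items()
        if len(servers) > 1
    }
```

The resulting mapping can serve as an edge list for the network analysis: each duplicated post links the servers that share it.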
The columns are:
id - A unique identifier for each post.
created_at - Timestamp indicating when the post was created.
content - The content or message of the post.
account - Detailed information about the account that created the post (ID, username, URL).
replies_count - Number of replies to the post.
reblogs_count - Number of reblogs or shares of the post.
favourites_count - Number of favorites or likes the post has received.
language - Language in which the post is written.
mentions - Any mentions in the post (for example, other users).
tags - Tags associated with the post.
emojis - Emojis used in the post.
category - Category of the post.
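The columns listed above can be read with the standard `csv` module. The following is a minimal sketch under stated assumptions: the CSV files have a header row using exactly these column names, and the three count columns hold plain integers. The helper name `read_toots` is hypothetical, and nested columns such as `account`, `mentions`, and `tags` are left as raw strings because their serialization is not documented here.

```python
import csv
import io

# Columns assumed to contain plain integer counts.
COUNT_COLUMNS = ("replies_count", "reblogs_count", "favourites_count")


def read_toots(csv_file):
    """Read posts from an open CSV file and convert count columns to int."""
    rows = []
    for row in csv.DictReader(csv_file):
        for col in COUNT_COLUMNS:
            # Treat a missing or empty count as zero.
            row[col] = int(row[col]) if row.get(col) else 0
        rows.append(row)
    return rows


# Usage with an in-memory sample instead of a real file.
sample = io.StringIO(
    "id,created_at,content,replies_count,reblogs_count,favourites_count\n"
    "1,2023-08-01,hello,0,2,5\n"
)
toots = read_toots(sample)
```

For a real file, passing `open(path, newline="", encoding="utf-8")` in place of the `StringIO` object should work the same way.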
Created:
2023-08-16



