Sectarian Hate Speech Detection
Indexed from NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/14325726
Description:
These datasets were collected through keyword-specific and account-specific queries on Google Search and on other dedicated platforms (Twitter/X). The keywords relate to several major crises in Lebanon, and the resulting data could help reveal the spread of hate and sectarian speech in the country.
Folder Navigation:
Beirut-Port-Timeline: (Crisis name)
EYYYY-MM-DD: (folder named after a time point of interest, around which data was scraped (+1 week or +2 weeks)). For example, a folder named E2020-12-21 means the data was scraped starting from 2020-12-21 and extending one or two weeks ahead.
Twitter-advanced.xlsx: Data scraped through Twitter/X advanced search bar. The dataset contains the following fields:
link: the link of the tweet.
text: the actual tweet text.
time: the time the tweet was posted.
mentions: the set of accounts mentioned in the tweet, if any.
keywords: the query that was used to retrieve this tweet.
account: the Twitter/X account of the tweet author.
generated_by: either `KD` or `KAD`. `KD` stands for `Keyword-Date`, indicating that the query used to retrieve the tweet contains only date and keyword attributes. `KAD` stands for `Keyword-Account-Date`, indicating that the query used to retrieve the tweet contains date, account, and keyword attributes.
Twitter-Google.xlsx: Data scraped through Google search. The dataset contains the following fields:
title: The title (heading) of the query search result on Google.
link: the link of the tweet.
keywords: the query that was used to retrieve this tweet.
account: the Twitter/X account of the tweet author.
generated_by: either `KD` or `KAD`. `KD` stands for `Keyword-Date`, indicating that the query used to retrieve the tweet contains only date and keyword attributes. `KAD` stands for `Keyword-Account-Date`, indicating that the query used to retrieve the tweet contains date, account, and keyword attributes.
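The folder naming convention above can be turned into an explicit date range. The following is a minimal sketch, not part of the dataset's own tooling: the function name `scrape_window` and the `weeks` parameter are hypothetical, and it assumes the window starts on the date in the folder name and runs forward by one or two weeks, as the description suggests.

```python
import re
from datetime import date, timedelta

# Folder names follow the pattern E<YYYY-MM-DD>, e.g. "E2020-12-21".
FOLDER_PATTERN = re.compile(r"^E(\d{4})-(\d{2})-(\d{2})$")


def scrape_window(folder_name, weeks=1):
    """Return (start, end) dates for a timepoint folder.

    Assumption: the window begins on the folder date and extends
    `weeks` weeks ahead (the description mentions +1 or +2 weeks).
    """
    match = FOLDER_PATTERN.match(folder_name)
    if match is None:
        raise ValueError(f"not a timepoint folder name: {folder_name!r}")
    if weeks not in (1, 2):
        raise ValueError("weeks must be 1 or 2")
    year, month, day = (int(part) for part in match.groups())
    start = date(year, month, day)
    return start, start + timedelta(weeks=weeks)


start, end = scrape_window("E2020-12-21", weeks=2)
print(start, end)
```

Which of the two window lengths applies to a given folder is not stated in the description, so the caller has to supply it.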
Israel-Lebanon-War: (Crisis name)
EYYYY-MM-DD: (folder named after a time point of interest, around which data was scraped (+1 week or +2 weeks)). For example, a folder named E2024-09-20 means the data was scraped starting from 2024-09-20 and extending one or two weeks ahead.
Twitter-advanced.xlsx: Data scraped through Twitter/X advanced search bar. The dataset contains the following fields:
link: the link of the tweet.
text: the actual tweet text.
time: the time the tweet was posted.
text_mentions: the set of accounts mentioned in the tweet, if any.
keywords: the query that was used to retrieve this tweet.
account: the Twitter/X account of the tweet author.
generated_by: either `KD` or `KAD`. `KD` stands for `Keyword-Date`, indicating that the query used to retrieve the tweet contains only date and keyword attributes. `KAD` stands for `Keyword-Account-Date`, indicating that the query used to retrieve the tweet contains date, account, and keyword attributes.
reply: a reply to the actual tweet, if any.
reply_mention: the set of accounts mentioned in the reply, if any.
Twitter-Google.xlsx: Data scraped through Google search. The dataset contains the following fields:
link: the link of the tweet.
text: the actual tweet text.
keywords: the query that was used to retrieve this tweet.
account: the Twitter/X account of the tweet author.
generated_by: either `KD` or `KAD`. `KD` stands for `Keyword-Date`, indicating that the query used to retrieve the tweet contains only date and keyword attributes. `KAD` stands for `Keyword-Account-Date`, indicating that the query used to retrieve the tweet contains date, account, and keyword attributes.
reply: a reply to the actual tweet, if any.
reply_mention: the set of accounts mentioned in the reply, if any.
Instagram.xlsx: Data scraped through Google search. The dataset contains the following fields:
title: The title (heading) of the query search result on Google.
link: the link to the Instagram post.
text: the description inserted by the author associated with the Instagram post, if any.
keywords: the query that was used to retrieve this Instagram post.
account: the Instagram account of the post author. (TO BE REMOVED)
channel_name: the Instagram account of the post author.
description: the description inserted by the author associated with the Instagram post, if any. (TO BE REMOVED)
post_time: the time the Instagram post was uploaded.
generated_by: either `KD` or `KAD`. `KD` stands for `Keyword-Date`, indicating that the query used to retrieve the Instagram post contains only date and keyword attributes. `KAD` stands for `Keyword-Account-Date`, indicating that the query used to retrieve the Instagram post contains date, account, and keyword attributes.
reply: a reply to the actual Instagram post, if any.
reply_time: the time the reply to the post was uploaded.
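Field names differ slightly between files: the mentions column is `mentions` in one Twitter-advanced.xlsx and `text_mentions` in the other, and Instagram.xlsx carries both `account`/`channel_name` and `text`/`description`. The sketch below shows one possible way to harmonize rows before counting them by `generated_by`. It is illustrative only: the alias mapping, the function names, and the sample rows are made up for this example, and rows are assumed to be plain dictionaries as a spreadsheet reader would typically produce.

```python
from collections import Counter

# Hypothetical mapping from variant column names to one canonical name.
COLUMN_ALIASES = {
    "text_mentions": "mentions",
    "channel_name": "account",
    "description": "text",
}


def harmonize(row):
    """Rename variant columns; keep the first non-empty value per name."""
    unified = {}
    for key, value in row.items():
        canonical = COLUMN_ALIASES.get(key, key)
        # Only fill the slot if nothing useful has been stored there yet.
        if unified.get(canonical) in (None, ""):
            unified[canonical] = value
    return unified


def count_by_query_type(rows):
    """Count rows per `generated_by` value (`KD` or `KAD`)."""
    return Counter(row.get("generated_by") for row in rows)


# Made-up sample rows, for illustration only.
rows = [
    {"link": "https://example.org/1", "text_mentions": "@a", "generated_by": "KD"},
    {"link": "https://example.org/2", "account": "", "channel_name": "some_channel",
     "generated_by": "KAD"},
]
unified = [harmonize(row) for row in rows]
print(count_by_query_type(unified))
```

Keeping the first non-empty value means a populated `account` is never overwritten by `channel_name`; a different precedence may be more appropriate once the fields marked "TO BE REMOVED" are actually dropped.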
Created:
2024-12-09



