Messages from Telegram on Palestine
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14710656
Description:
The dataset is a collection of messages fetched from the specified Telegram channels ("PalestineSolidarityBelgium", "Eyeonpalestine2", "haqqintel", "samidounnetwork", "resistancechain", "PalestinianResistance", "PalestineHealth", "PalestineUpdates", "GazaNow", "Palestine2024", "FreePalestine2023", "StopGazaGenocide", "AlQassamBrigades9", "palestineresistance", "pal_Online9", "gazaalanpa") and saved in individual JSON files. Each file is named after the channel and the date (e.g., PalestineSolidarityBelgium_2025-01-21.json). A detailed description of the dataset follows:
Dataset Structure:
File Name: Each JSON file is named using the format {channel_name}_{date}.json. The channel_name represents the Telegram channel from which messages were fetched, and the date is the date when the messages were collected.
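The naming scheme can be reversed to recover the channel and date from a file name. The following is a minimal sketch, not part of the original collection script; the helper name parse_file_name is hypothetical. It splits on the last underscore because some channel names in this dataset (for example pal_Online9) contain underscores themselves.

```python
from datetime import date
from pathlib import Path


def parse_file_name(file_name: str) -> tuple[str, date]:
    """Split '{channel_name}_{date}.json' into (channel_name, date)."""
    stem = Path(file_name).stem  # drop the '.json' suffix
    # Split on the LAST underscore: channel names may contain underscores.
    channel_name, _, date_part = stem.rpartition("_")
    if not channel_name:
        raise ValueError(f"unexpected file name: {file_name!r}")
    # Assumes the date part is written as YYYY-MM-DD, as in the examples.
    return channel_name, date.fromisoformat(date_part)


print(parse_file_name("PalestineSolidarityBelgium_2025-01-21.json"))
```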
Data Fields:
Each message is stored as a dictionary with the following fields:
id: The unique identifier for the message in the Telegram channel.
message: The actual content of the message posted in the channel. This field could contain text, links, or any other content shared within the channel.
timestamp: The date and time when the message was posted, in ISO 8601 format. This is important for tracking when each message was shared in the channel.
Additional fields (optional): Depending on future expansions or modifications, other message properties (e.g., sender, reply-to message, etc.) could be included.
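The fields above map directly onto Python dictionaries once a file is decoded. The sketch below is illustrative only: the function name parse_messages is hypothetical, and it assumes every message carries a timestamp string that datetime.fromisoformat accepts (timestamps ending in "Z" need Python 3.11 or later).

```python
import json
from datetime import datetime


def parse_messages(raw_json: str) -> list[dict]:
    """Decode a JSON array of messages and parse each timestamp."""
    messages = json.loads(raw_json)
    for msg in messages:
        # 'timestamp' is ISO 8601 per the description, e.g. "2025-01-21T14:30:00".
        msg["timestamp"] = datetime.fromisoformat(msg["timestamp"])
    return messages


sample = '[{"id": 12345, "message": "sample text", "timestamp": "2025-01-21T14:30:00"}]'
parsed = parse_messages(sample)
print(parsed[0]["id"], parsed[0]["timestamp"].date())
```

To read a file from disk, pass the result of Path(...).read_text(encoding="utf-8") to the same function.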
Dataset Characteristics:
Data Integrity: The script ensures that duplicate messages, identified by their unique id, are not saved again in the same file. If a message with the same id already exists in the corresponding file, it will be skipped.
Temporal Coverage: The dataset contains messages posted over time, and each file corresponds to messages from a specific date. This allows for temporal analysis and tracking the evolution of content over time in each channel.
Data Volume: The dataset is incrementally built over time as messages are fetched periodically, with each run of the script collecting up to MESSAGES_PER_REQUEST messages per channel. The dataset grows as new messages are added without overwriting existing data, preserving historical content.
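The deduplication and incremental growth described above could look roughly like the following. This is a sketch under stated assumptions rather than the original script, which is not included in this listing; merge_messages is a hypothetical name, and the only behaviour taken from the description is that a message whose id is already stored gets skipped while existing messages are never overwritten.

```python
def merge_messages(existing: list[dict], fetched: list[dict]) -> list[dict]:
    """Return existing messages plus any fetched message with a new id."""
    seen_ids = {msg["id"] for msg in existing}
    merged = list(existing)  # keep historical content untouched
    for msg in fetched:
        if msg["id"] in seen_ids:
            continue  # duplicate id: skip it
        merged.append(msg)
        seen_ids.add(msg["id"])
    return merged


stored = [{"id": 12345, "message": "first", "timestamp": "2025-01-21T14:30:00"}]
fetched = [
    {"id": 12345, "message": "first", "timestamp": "2025-01-21T14:30:00"},
    {"id": 12346, "message": "second", "timestamp": "2025-01-21T14:45:00"},
]
print([msg["id"] for msg in merge_messages(stored, fetched)])
```

Keeping the ids in a set makes each duplicate check constant time, which matters for the larger files with tens of thousands of messages.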
Example:
For example, the file PalestineSolidarityBelgium_2025-01-21.json might contain an array of message objects, each structured like this:
[
  {
    "id": 12345,
    "message": "This is a sample message about the ongoing situation in Palestine...",
    "timestamp": "2025-01-21T14:30:00"
  },
  {
    "id": 12346,
    "message": "Another message regarding the political landscape in the region...",
    "timestamp": "2025-01-21T14:45:00"
  }
]
The total number of messages in each file is:
Total number of messages in 'AlQassamBrigades9_2025-01-20.json': 2326
Total number of messages in 'Aqsatvsat_2025-01-20.json': 58
Total number of messages in 'Eyeonpalestine2_2025-01-20.json': 7075
Total number of messages in 'FreePalestine2023_2025-01-20.json': 1
Total number of messages in 'GazaNow_2025-01-20.json': 70
Total number of messages in 'PalestineSolidarityBelgium_2025-01-20.json': 147
Total number of messages in 'PalestineUpdates_2025-01-20.json': 465
Total number of messages in 'PalestinianResistance_2025-01-20.json': 19
Total number of messages in 'StopGazaGenocide_2025-01-20.json': 119
Total number of messages in 'TIMESOFGAZA_2025-01-20.json': 1745
Total number of messages in 'The_Jerusalem_Post_2025-01-20.json': 7907
Total number of messages in 'bigolivr_2025-01-20.json': 42
Total number of messages in 'gazaalanpa_2025-01-20.json': 6855
Total number of messages in 'gazaenglishupdates_2025-01-20.json': 34688
Total number of messages in 'haqqintel_2025-01-20.json': 4351
Total number of messages in 'pal_Online9_2025-01-20.json': 49178
Total number of messages in 'palestineonline_2025-01-20.json': 1778
Total number of messages in 'palestineresistance_2025-01-20.json': 2455
Total number of messages in 'resistancechain_2025-01-20.json': 5775
Total number of messages across all files: 125054
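Counts like these can be recomputed from a local copy of the files. The sketch below is one possible way to do it, assuming all JSON files sit in a single directory and each holds a JSON array; count_messages and the directory layout are assumptions, not part of the dataset.

```python
import json
from pathlib import Path


def count_messages(data_dir: Path) -> dict[str, int]:
    """Map each JSON file name in data_dir to its number of messages."""
    counts = {}
    for path in sorted(data_dir.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            counts[path.name] = len(json.load(handle))  # array length
    return counts


def print_report(counts: dict[str, int]) -> None:
    """Print per-file counts followed by the grand total."""
    for name, n in counts.items():
        print(f"Total number of messages in '{name}': {n}")
    print(f"Total number of messages across all files: {sum(counts.values())}")
```

Calling print_report(count_messages(Path("data"))), where "data" is a placeholder for the download directory, prints a listing in the same shape as the one above.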
Created:
2025-01-21



