
Messages from Telegram on Palaistine

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14710656
Resource description:
The dataset is a collection of messages fetched from specified Telegram channels ("PalestineSolidarityBelgium", "Eyeonpalestine2", "haqqintel", "samidounnetwork", "resistancechain", "PalestinianResistance", "PalestineHealth", "PalestineUpdates", "GazaNow", "Palestine2024", "FreePalestine2023", "StopGazaGenocide", "AlQassamBrigades9", "palestineresistance", "pal_Online9", "gazaalanpa") and saved in individual JSON files. Each file is named after the channel and the date (e.g., PalestineSolidarityBelgium_2025-01-21.json). A detailed description of the dataset follows.

Dataset Structure:

File Name: Each JSON file is named using the format {channel_name}_{date}.json. The channel_name is the Telegram channel from which messages were fetched, and the date is the date on which the messages were collected.

Data Fields: Each message is stored as a dictionary with the following fields:
  id: The unique identifier for the message in the Telegram channel.
  message: The content of the message posted in the channel. This field may contain text, links, or any other content shared within the channel.
  timestamp: The date and time when the message was posted, in ISO 8601 format, which makes it possible to track when each message was shared in the channel.
  Additional fields (optional): Depending on future expansions or modifications, other message properties (e.g., sender, reply-to message) could be included.

Dataset Characteristics:

Data Integrity: The script ensures that duplicate messages, identified by their unique id, are not saved again in the same file. If a message with the same id already exists in the corresponding file, it is skipped.

Temporal Coverage: The dataset contains messages posted over time, and each file corresponds to messages from a specific date. This allows for temporal analysis and for tracking the evolution of content over time in each channel.
Data Volume: The dataset is built incrementally as messages are fetched periodically, with each run of the script collecting up to MESSAGES_PER_REQUEST messages per channel. New messages are added without overwriting existing data, so historical content is preserved.

Example: The file PalestineSolidarityBelgium_2025-01-21.json might contain an array of message objects, each structured like this:

  [
    {
      "id": 12345,
      "message": "This is a sample message about the ongoing situation in Palestine...",
      "timestamp": "2025-01-21T14:30:00"
    },
    {
      "id": 12346,
      "message": "Another message regarding the political landscape in the region...",
      "timestamp": "2025-01-21T14:45:00"
    }
  ]

Total number of messages in each file:

  AlQassamBrigades9_2025-01-20.json: 2326
  Aqsatvsat_2025-01-20.json: 58
  Eyeonpalestine2_2025-01-20.json: 7075
  FreePalestine2023_2025-01-20.json: 1
  GazaNow_2025-01-20.json: 70
  PalestineSolidarityBelgium_2025-01-20.json: 147
  PalestineUpdates_2025-01-20.json: 465
  PalestinianResistance_2025-01-20.json: 19
  StopGazaGenocide_2025-01-20.json: 119
  TIMESOFGAZA_2025-01-20.json: 1745
  The_Jerusalem_Post_2025-01-20.json: 7907
  bigolivr_2025-01-20.json: 42
  gazaalanpa_2025-01-20.json: 6855
  gazaenglishupdates_2025-01-20.json: 34688
  haqqintel_2025-01-20.json: 4351
  pal_Online9_2025-01-20.json: 49178
  palestineonline_2025-01-20.json: 1778
  palestineresistance_2025-01-20.json: 2455
  resistancechain_2025-01-20.json: 5775

Total number of messages across all files: 125054
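The file naming, the skip-duplicates-by-id rule, and the per-file totals described above can be illustrated with a short Python sketch. This is not the collection script itself, which is not part of this listing. The function names (output_path, append_new_messages, count_messages) are hypothetical, and the sketch assumes each file holds a single JSON array of objects with the id, message, and timestamp fields.

```python
import json
from pathlib import Path


def output_path(directory: Path, channel_name: str, date: str) -> Path:
    """Build the {channel_name}_{date}.json path (hypothetical helper)."""
    return directory / f"{channel_name}_{date}.json"


def append_new_messages(path: Path, fetched: list[dict]) -> int:
    """Merge fetched messages into the file, skipping ids already stored.

    Existing content is kept, so the file only grows.
    Returns the number of messages actually added.
    """
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    seen_ids = {m["id"] for m in existing}
    added = 0
    for msg in fetched:
        if msg["id"] in seen_ids:
            continue  # duplicate id: already saved in this file
        existing.append(
            {
                "id": msg["id"],
                "message": msg["message"],
                "timestamp": msg["timestamp"],  # ISO 8601 string
            }
        )
        seen_ids.add(msg["id"])
        added += 1
    path.write_text(
        json.dumps(existing, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return added


def count_messages(directory: Path) -> dict[str, int]:
    """Return {file name: number of messages} for every JSON file in directory."""
    return {
        p.name: len(json.loads(p.read_text(encoding="utf-8")))
        for p in sorted(directory.glob("*.json"))
    }
```

Calling count_messages on a directory holding the downloaded files and summing the values is one way to check the per-file totals listed above against a local copy.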
Created:
2025-01-21