Dataset corresponding to the paper "A privacy-preserving approach to identify riot-related footage on social media".

Name: Dataset corresponding to the paper "A privacy-preserving approach to identify riot-related footage on social media".
Creator: El Khatibi, Naufal; van Galen, Maurits; van Horik, Bryan
Published: 2025-12-29 00:00:00
License: No description available

4TU.ResearchData (updated 2025-12-29, indexed 2026-04-23)

Download link:

https://data.4tu.nl/datasets/1d26c310-5a5b-48e7-b72d-5540bd6d0b6e/1

Description:

107,674 geolocated visual posts were collected from a social media platform during and after the 'Nahel Merzouk' riots of summer 2023 in 7 French cities. The posts were fed to an image-to-text model (BLIP2-OPT-2.7B) to produce textual descriptions of the visual content. This dataset contains those textual descriptions along with the metadata (date, time, and location). A subset of the posts was also annotated as riot-related or not riot-related to train a BERT model; this subset is also provided in this database (see the paper for more details).

Tables:
1. videos: Contains metadata about each video, including location and timestamp information.
2. captions: Contains all captions extracted from the videos, with frame-level information.
3. annotated_captions: Contains a subset of captions that have been manually annotated for riot-related content.
4. annotated_videos: Contains manually annotated video-level labels for riot detection.
5. split_annotated_videos: Defines the train/test split of the annotated videos used in model training and evaluation.
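
The relationship between the tables can be illustrated with a minimal sketch. Everything beyond the table names is an assumption here: the listing does not state the storage format or the column names, so the example builds a toy in-memory SQLite database with hypothetical columns (`video_id`, `frame`, `caption`, `label`, `split`) and invented placeholder rows, then joins captions, video-level labels, and the split definition into per-video training and test examples.

```python
import sqlite3

# ASSUMPTION: all column names (video_id, city, posted_at, frame, caption,
# label, split) are hypothetical. The dataset description names the tables
# but not their columns or the storage format.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE videos (video_id TEXT PRIMARY KEY, city TEXT, posted_at TEXT);
CREATE TABLE captions (video_id TEXT, frame INTEGER, caption TEXT);
CREATE TABLE annotated_videos (video_id TEXT, label INTEGER);
CREATE TABLE split_annotated_videos (video_id TEXT, split TEXT);
""")

# Placeholder rows, invented purely for illustration (not real dataset content).
conn.executemany("INSERT INTO videos VALUES (?, ?, ?)", [
    ("v1", "city_a", "2023-06-30T22:15:00"),
    ("v2", "city_b", "2023-07-01T14:00:00"),
])
conn.executemany("INSERT INTO captions VALUES (?, ?, ?)", [
    ("v1", 0, "placeholder caption a"),
    ("v1", 1, "placeholder caption b"),
    ("v2", 0, "placeholder caption c"),
])
conn.executemany("INSERT INTO annotated_videos VALUES (?, ?)",
                 [("v1", 1), ("v2", 0)])
conn.executemany("INSERT INTO split_annotated_videos VALUES (?, ?)",
                 [("v1", "train"), ("v2", "test")])


def load_split(conn, split):
    """Return (video_id, concatenated captions, label) tuples for one split."""
    rows = conn.execute(
        """
        SELECT c.video_id, c.caption, a.label
        FROM captions AS c
        JOIN annotated_videos AS a ON a.video_id = c.video_id
        JOIN split_annotated_videos AS s ON s.video_id = c.video_id
        WHERE s.split = ?
        ORDER BY c.video_id, c.frame
        """,
        (split,),
    ).fetchall()
    grouped = {}
    for video_id, caption, label in rows:
        # Append each frame caption to the running text for its video.
        text, _ = grouped.get(video_id, ("", label))
        grouped[video_id] = ((text + " " + caption).strip(), label)
    return [(vid, text, label) for vid, (text, label) in grouped.items()]


train = load_split(conn, "train")
test = load_split(conn, "test")
```

Concatenating frame-level captions per video is only one possible way to build text input for a classifier; the paper should be consulted for the procedure the authors actually used.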

Provided by:

El Khatibi, Naufal; van Galen, Maurits; van Horik, Bryan

Created:

2025-12-29