TamperedNews & News400 Datasets (IJMIR'21 Update)
Source: DataCite Commons | Updated: 2022-05-17 | Indexed: 2025-04-15
Download link:
https://data.uni-hannover.de/dataset/3724ff7e-ca73-490e-8916-1130be809373
Description:
# Multimodal Analytics for Real-world News using Measures of Cross-modal Entity Consistency

This repository contains the *TamperedNews* and *News400* datasets introduced in the paper:

> Eric Müller-Budack, Jonas Theiner, Sebastian Diering, Maximilian Idahl, Sherzod Hakimov, and Ralph Ewerth. "Multimodal news analytics using measures of cross-modal entity and context consistency". In: _International Journal of Multimedia Information Retrieval_ 10.2 (2021), Springer, pp. 111–125. DOI: https://doi.org/10.1007/s13735-021-00207-4

## Content

For both datasets, *TamperedNews* and *News400*, we provide:

- ```*dataset*.tar.gz``` containing the ```*dataset*.jsonl``` with
  - Web links to the news texts
  - Web links to the news image
  - Outputs of the named entity recognition and disambiguation (NERD) approach
  - Untampered and tampered entities
- ```*dataset*_vise_features.tar.gz``` with visual features for events extracted from our event classification approach VisE presented at WACV'21 ([paper](https://openaccess.thecvf.com/content/WACV2021/html/Muller-Budack_Ontology-Driven_Event_Type_Classification_in_Images_WACV_2021_paper.html), [GitHub](https://github.com/TIBHannover/VisE))

Please note that the remaining visual features (```*dataset*_features.tar.gz```) and word embeddings (```*dataset*_wordembeddings.tar.gz```) were already provided in the first version of both datasets ([News400](https://data.uni-hannover.de/dataset/news400), [TamperedNews](https://data.uni-hannover.de/dataset/tamperednews)).

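
As a minimal sketch of how the archives described above might be read without unpacking them to disk, the following Python helper streams records from every `.jsonl` member of a `.tar.gz` file. The function name `iter_jsonl_from_tar` is hypothetical and not part of the authors' code. No field names are assumed, since the record schema is not spelled out here; the sketch only assumes UTF-8 text with one JSON object per line.

```python
import io
import json
import tarfile


def iter_jsonl_from_tar(archive_path, member_suffix=".jsonl"):
    """Yield (member_name, record) pairs from JSON Lines files in a .tar.gz.

    Hypothetical helper: it assumes each matching member is UTF-8 text
    holding one JSON object per line, and skips blank lines.
    """
    with tarfile.open(archive_path, mode="r:gz") as tar:
        for member in tar:
            # Ignore directories and files that are not JSON Lines.
            if not (member.isfile() and member.name.endswith(member_suffix)):
                continue
            raw = tar.extractfile(member)
            for line in io.TextIOWrapper(raw, encoding="utf-8"):
                line = line.strip()
                if line:
                    yield member.name, json.loads(line)
```

Taking the first pair yielded for a downloaded archive and printing `sorted(record.keys())` is one way to discover which fields are actually present before writing further processing code.
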
For all entities detected in both datasets, we provide:

- ```entities.tar.gz``` containing an ```*entity_type*.jsonl``` for all entity types (events, locations, and persons) with:
  - Wikidata ID
  - Wikidata label
  - Meta information used for tampering
  - Web links to all reference images crawled from Google, Bing, and Wikidata
- ```entities_features.tar.gz``` containing the visual features of the reference images for all entities

## Source Code

The source code to reproduce our results, as well as download scripts to crawl news texts and images, can be found on our GitHub page: https://github.com/TIBHannover/cross-modal_entity_consistency
Provider:
LUIS
Created:
2022-05-16



