MM_Claims Dataset

Name: MM_Claims Dataset
Creator: LUIS
Published: 2023-07-10 09:52:21
License: No description available

Source: DataCite Commons. Updated 2023-07-10; indexed 2024-07-13.

Download link:

https://data.uni-hannover.de/dataset/99d876e0-3ab3-4a93-8f8d-101abea40034


Description:

This dataset is introduced by the paper "MM-Claims: A Dataset for Multimodal Claim Detection in Social Media". If you use this dataset in your work, please cite:

@inproceedings{cheema-etal-2022-mm,
    title = "{MM}-Claims: A Dataset for Multimodal Claim Detection in Social Media",
    author = {Cheema, Gullal Singh and Hakimov, Sherzod and Sittar, Abdul and M{\"u}ller-Budack, Eric and Otto, Christian and Ewerth, Ralph},
    booktitle = "Findings of the Association for Computational Linguistics: NAACL 2022",
    month = jul,
    year = "2022",
    address = "Seattle, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-naacl.72",
    pages = "962--979"
}

Information about columns in the files:

1. claim_binary: {0: 'Not a claim', 1: 'claim'}
2. claim_three: {0: 'Not a claim', '1': 'claim but not check-worthy', 2: 'check-worthy claim'}
3. claim_vis: {0: 'Not a claim', '1': 'visually-irrelevant claim', 2: 'visually-relevant claim'}

Official code repository: https://github.com/TIBHannover/MM_Claims

**All files were updated on 5th May 2023. Some obscene images that were not automatically detected in the first phase were removed.**

**If you are interested in the binary task on check-worthiness estimation in multimodal claims, you can find the refined dataset with new test data released as part of the CLEF Checkthat! 2023 challenge: https://gitlab.com/checkthat_lab/clef2023-checkthat-lab/-/tree/main**
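The column description above can be turned into lookup tables for decoding label values. The following is a minimal sketch, not code from the official repository: the `decode_labels` helper and the `tweet_id` field in the example record are hypothetical, and the assumption that label values can be coerced with `int()` comes only from the mixed quoting of keys in the description.

```python
# Label mappings copied from the column description.
# Keys are normalized to integers (an assumption; the description quotes '1').
CLAIM_BINARY = {0: "Not a claim", 1: "claim"}
CLAIM_THREE = {
    0: "Not a claim",
    1: "claim but not check-worthy",
    2: "check-worthy claim",
}
CLAIM_VIS = {
    0: "Not a claim",
    1: "visually-irrelevant claim",
    2: "visually-relevant claim",
}

LABEL_MAPS = {
    "claim_binary": CLAIM_BINARY,
    "claim_three": CLAIM_THREE,
    "claim_vis": CLAIM_VIS,
}


def decode_labels(row):
    """Return a copy of `row` with known label columns replaced by their names.

    Values are passed through int() so that both 1 and "1" are accepted.
    Columns that are not label columns are left unchanged.
    """
    decoded = dict(row)
    for column, mapping in LABEL_MAPS.items():
        if column in decoded:
            decoded[column] = mapping[int(decoded[column])]
    return decoded


if __name__ == "__main__":
    # Hypothetical record; the real files may use different non-label columns.
    example = {"tweet_id": "123", "claim_binary": "1", "claim_three": 2, "claim_vis": 1}
    print(decode_labels(example))
```

A value outside the documented range raises a `KeyError`, which makes unexpected labels visible instead of silently passing them through.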


Provider:

LUIS

Created:

2022-07-13

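Summary

Background and Challenges

Background Overview

The MM_Claims Dataset targets multimodal claim detection in social media and provides three classification annotations: binary, three-class, and visual relevance. The dataset has been updated to remove inappropriate content, and a refined version is associated with the CLEF Checkthat! 2023 challenge.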

The summary above was collected and generated by 遇见数据集.
