
phosseini/multimodal_satire

Source: Hugging Face. Updated 2023-10-19; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/phosseini/multimodal_satire
Resource description:
---
dataset_info:
  features:
  - name: url
    dtype: string
  - name: headline
    dtype: string
  - name: image_link
    dtype: string
  - name: is_satire
    dtype: int32
  splits:
  - name: train
    num_bytes: 2841764
    num_examples: 10000
  download_size: 1268537
  dataset_size: 2841764
task_categories:
- image-classification
language:
- en
size_categories:
- 1K<n<10K
---

# Dataset card for "multimodal_satire"

This is the dataset for the paper [A Multi-Modal Method for Satire Detection using Textual and Visual Cues](https://aclanthology.org/2020.nlp4if-1.4/). To obtain the full-text body of the articles, you need to scrape websites using the provided links in the dataset.

* GitHub repository: [https://github.com/lilyli2004/satire](https://github.com/lilyli2004/satire)

## Reference

If you use this dataset, please cite the following paper:

```
@inproceedings{li-etal-2020-multi-modal,
    title = "A Multi-Modal Method for Satire Detection using Textual and Visual Cues",
    author = "Li, Lily and Levi, Or and Hosseini, Pedram and Broniatowski, David",
    booktitle = "Proceedings of the 3rd NLP4IF Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda",
    month = dec,
    year = "2020",
    address = "Barcelona, Spain (Online)",
    publisher = "International Committee on Computational Linguistics (ICCL)",
    url = "https://aclanthology.org/2020.nlp4if-1.4",
    pages = "33--38",
    abstract = "Satire is a form of humorous critique, but it is sometimes misinterpreted by readers as legitimate news, which can lead to harmful consequences. We observe that the images used in satirical news articles often contain absurd or ridiculous content and that image manipulation is used to create fictional scenarios. While previous work have studied text-based methods, in this work we propose a multi-modal approach based on state-of-the-art visiolinguistic model ViLBERT. To this end, we create a new dataset consisting of images and headlines of regular and satirical news for the task of satire detection. We fine-tune ViLBERT on the dataset and train a convolutional neural network that uses an image forensics technique. Evaluation on the dataset shows that our proposed multi-modal approach outperforms image-only, text-only, and simple fusion baselines.",
}
```
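
The card says the article bodies are not included and must be scraped from the `url` field. A minimal sketch of that step follows, using only the Python standard library. The names `extract_text` and `fetch_article_text` are hypothetical helpers, not part of the dataset or the authors' repository, and the extraction is deliberately naive: it keeps all visible text, so real news sites would likely need site-specific cleanup.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as one space-joined string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_article_text(url: str, timeout: float = 10.0) -> str:
    """Download a page (assumption: the URL is still reachable) and extract its text."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_text(html)
```

In practice `fetch_article_text` would be called once per row on the `url` value. Some links may have expired since the dataset was collected, so callers should expect and handle `urllib.error.URLError`.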
Provided by:
phosseini
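Summary of original information

Dataset overview

Dataset information

  • Features

    • url: string
    • headline: string
    • image_link: string
    • is_satire: 32-bit integer
  • Splits

    • train: 10,000 examples, 2,841,764 bytes in total
  • Download size: 1,268,537 bytes

  • Dataset size: 2,841,764 bytes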
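
The schema above can be checked and summarized with a few lines of Python. This is a sketch under stated assumptions: `validate_row`, `count_labels`, and `load_train_split` are hypothetical helper names, and the meaning of the `is_satire` values (presumably 1 for satire and 0 for regular news) is an assumption not stated on this page. `load_train_split` needs the third-party `datasets` package and network access, so it imports that package only when called.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping

# Field names and types taken from the feature list on this page.
EXPECTED_FEATURES = {"url": str, "headline": str, "image_link": str, "is_satire": int}


def validate_row(row: Mapping[str, object]) -> None:
    """Raise if a row is missing a listed feature or has an unexpected type."""
    for name, expected_type in EXPECTED_FEATURES.items():
        if name not in row:
            raise KeyError(f"missing feature: {name}")
        if not isinstance(row[name], expected_type):
            raise TypeError(f"{name} should be {expected_type.__name__}")


def count_labels(rows: Iterable[Mapping[str, object]]) -> Dict[int, int]:
    """Count how many rows carry each is_satire value."""
    counts = Counter()
    for row in rows:
        validate_row(row)
        counts[row["is_satire"]] += 1
    return dict(counts)


def load_train_split():
    """Load the train split from the Hugging Face Hub (requires `datasets`)."""
    from datasets import load_dataset

    return load_dataset("phosseini/multimodal_satire", split="train")
```

With the `datasets` package installed, `count_labels(load_train_split())` would report the class balance of the 10,000 training examples.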

Task categories

  • Image classification

Language

  • English

Dataset size category

  • 1K<n<10K