Face Mask Perception Scale Attitudes in U.S. Covid-19 News
Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-28.
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/X1KKAB
Description:
This data set aggregates mainstream U.S. media news articles and opinion show transcripts concerning Covid-19 mask-wearing, published between April 6 and June 8, 2020. For a subset of paragraphs from the news articles, it also includes crowd-sourced annotations of the statements against 14 mask-wearing attitude questions taken from the Face Mask Perception Scale (FMPS) of Howard (2020). Each annotated paragraph thus carries 14 labels, each with a confidence score ranging from 0 to 6. For example, for the question "Does the text presented convey the idea that it is difficult to breathe while wearing a face mask?", the options are 0 (it is difficult), 1 (it is not difficult), and 2 (does not mention).
In total, the data set contains 2,361 news articles from eight sources (Daily Kos, Vox, New York Times, Fox, Breitbart, Tucker Carlson, Laura Ingraham, Sean Hannity), with the article title, publication date, source, and raw text for each. A second file lists the 8,473 paragraphs contained in all the articles, each with a unique paragraph ID. A separate file holds the crowd-sourced annotations, keyed by paragraph ID: 7,559 annotations in total, covering 297 paragraphs from 202 articles. Instructions for loading the data, and for filtering the annotations down to high-quality ones (those with high confidence or high inter-annotator agreement), can be found at https://github.com/ricknabb/media-ideology-coding.
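The filtering idea described above (keep annotations with high confidence, or those on which annotators agree) might be sketched as follows. This is an illustrative sketch only, not the procedure from the linked repository: the record layout, the function name `aggregate`, and the thresholds `min_confidence`, `min_votes`, and `min_agreement` are all assumptions. Only the label values (0, 1, 2) and the 0 to 6 confidence range come from the description.

```python
from collections import Counter, defaultdict

# Hypothetical records: (paragraph_id, question_id, label, confidence).
# The real annotation file may use different fields and identifiers.
SAMPLE = [
    ("p1", 1, 0, 6),
    ("p1", 1, 0, 5),
    ("p1", 1, 2, 1),
    ("p2", 1, 2, 3),
    ("p2", 1, 1, 4),
]


def aggregate(annotations, min_confidence=4, min_votes=2, min_agreement=0.6):
    """Return {(paragraph_id, question_id): (majority_label, agreement)}.

    Annotations below `min_confidence` are dropped first. A pair is kept
    only if at least `min_votes` annotations remain and the share of the
    majority label reaches `min_agreement`. All thresholds are assumed.
    """
    grouped = defaultdict(list)
    for paragraph_id, question_id, label, confidence in annotations:
        if confidence >= min_confidence:
            grouped[(paragraph_id, question_id)].append(label)

    result = {}
    for key, labels in grouped.items():
        if len(labels) < min_votes:
            continue
        label, count = Counter(labels).most_common(1)[0]
        agreement = count / len(labels)
        if agreement >= min_agreement:
            result[key] = (label, agreement)
    return result


if __name__ == "__main__":
    for key, (label, agreement) in aggregate(SAMPLE).items():
        print(key, label, round(agreement, 2))
```

Setting `min_confidence=0` filters on agreement alone, while `min_agreement=0` with `min_votes=1` filters on confidence alone, which makes it easy to compare the two criteria on the same annotations.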
Created:
2024-03-06



