DIVERSE
Source: arXiv | Updated: 2024-03-06 | Indexed: 2024-06-21
Download link:
https://doi.org/10.5281/zenodo.10493803
Resource overview:
The DIVERSE dataset was created jointly by the Army Cyber Institute and Carnegie Mellon University. It contains over 173,000 YouTube video comments, each annotated with its stance toward U.S. military video content. Annotation follows a human-guided, machine-assisted approach that draws on weak signals such as hate speech, sarcasm, specific keywords, and text sentiment. The dataset is intended to help researchers understand how military recruitment messaging is received on social media and how it shapes public attitudes toward enlisting. Its applications also extend to evaluating the impact of social media content on a target audience, which matters for any marketing campaign.
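To make the idea of labeling from weak signals concrete, the following is a minimal sketch of how such signals might be combined into a provisional stance label that a human annotator then reviews. It is not the dataset authors' pipeline. The keyword lists, the `WeakSignals` fields, the scoring rule, the threshold, and the label names `support`, `oppose`, and `neutral` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical keyword lists, chosen only for illustration.
SUPPORT_KEYWORDS = {"proud", "respect", "thank"}
OPPOSE_KEYWORDS = {"propaganda", "never", "waste"}


@dataclass
class WeakSignals:
    """Weak signals for one comment; every field is an assumed input."""
    text: str
    sentiment: float      # assumed range [-1.0, 1.0], from any sentiment model
    is_hate_speech: bool  # assumed output of a hate speech classifier
    is_sarcastic: bool    # assumed output of a sarcasm classifier


def keyword_vote(text: str) -> int:
    """Return +1, -1, or 0 depending on which keyword list matches more."""
    words = set(text.lower().split())
    support = len(words & SUPPORT_KEYWORDS)
    oppose = len(words & OPPOSE_KEYWORDS)
    return (support > oppose) - (support < oppose)


def provisional_stance(s: WeakSignals, threshold: float = 0.5) -> str:
    """Combine weak signals into a provisional label for human review."""
    score = s.sentiment + keyword_vote(s.text)
    if s.is_sarcastic:
        score = -score    # assumption: sarcasm inverts the surface polarity
    if s.is_hate_speech:
        score -= 1.0      # assumption: hate speech pushes toward opposition
    if score >= threshold:
        return "support"
    if score <= -threshold:
        return "oppose"
    return "neutral"
```

For example, `provisional_stance(WeakSignals("so proud of them", 0.8, False, False))` yields a supportive label under these assumptions, while setting `is_sarcastic=True` on the same comment flips it. In a human-guided workflow, labels like these would serve as suggestions to be confirmed or corrected, not as final annotations.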
Provider:
Army Cyber Institute
Created:
2024-03-06
Collected summary
Dataset introduction

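Background and challenges
Background overview
DIVERSE is a new benchmark dataset focused on stance classification, designed to interpret internet users' views of the U.S. military by analyzing YouTube video comments. It provides multi-dimensional annotations covering stance, sentiment, hate speech, and sarcasm, and is suited to large language model research. It was released in 2024 under an open license that facilitates academic use.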
The content above was collected and summarized by 遇见数据集.



