Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier?

NIAID Data Ecosystem, indexed 2026-05-02

Download link:

https://doi.org/10.7910/DVN/E2BV85


Description:

The increasing multimodality of social media data (e.g., images, videos, links) presents both opportunities and challenges. Yet text-as-data methods continue to dominate classification, because multimodal social media data are costly to collect and label. Researchers facing a budget constraint must decide whether to collect and label only the textual content of social media posts or their full multimodal content. In this article, we develop five performance metrics and an experimental framework to inform that decision. The metrics capture the costs and benefits of multimodal labeling: average time per post, average time per valid response, valid response rate, inter-coder agreement, and the classifier's predictive power. To estimate them, the experimental framework evaluates coders' performance under text-only and multimodal labeling conditions. We illustrate the method with a tweet labeling experiment.
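As a rough illustration of how the first four metrics named above might be computed from a log of coder responses, here is a minimal Python sketch. The description does not give exact formulas, so everything here is an assumption: the `Response` record and its field names are hypothetical, "average time per post" is taken as total time divided by all responses, an invalid response is marked by a missing label, and inter-coder agreement is approximated by simple pairwise percent agreement rather than whatever statistic the authors use. The fifth metric, the classifier's predictive power, requires training a model and is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """One coder's response to one post (hypothetical record layout)."""
    coder: str
    post_id: str
    seconds: float          # time the coder spent on this post
    label: Optional[str]    # None marks an invalid or skipped response (assumption)


def labeling_metrics(responses):
    """Return the three time and validity metrics under the assumptions above."""
    n = len(responses)
    valid = [r for r in responses if r.label is not None]
    total_time = sum(r.seconds for r in responses)
    return {
        # Total labeling time spread over every response, valid or not.
        "avg_time_per_post": total_time / n,
        # Same total time, but charged only to the usable responses.
        "avg_time_per_valid_response": (
            total_time / len(valid) if valid else float("nan")
        ),
        "valid_response_rate": len(valid) / n,
    }


def percent_agreement(responses):
    """Share of same-post coder pairs whose valid labels match."""
    by_post = {}
    for r in responses:
        if r.label is not None:
            by_post.setdefault(r.post_id, []).append(r.label)
    pairs = 0
    agreeing = 0
    for labels in by_post.values():
        for i in range(len(labels)):
            for j in range(i + 1, len(labels)):
                pairs += 1
                agreeing += labels[i] == labels[j]
    return agreeing / pairs if pairs else float("nan")


# Toy data: two coders, two posts, one invalid response.
responses = [
    Response("a", "p1", 10.0, "pos"),
    Response("b", "p1", 20.0, "pos"),
    Response("a", "p2", 30.0, "neg"),
    Response("b", "p2", 40.0, None),
]
metrics = labeling_metrics(responses)
agreement = percent_agreement(responses)
```

Computing the same quantities separately for the text-only and the multimodal condition would give the kind of side-by-side comparison the description refers to; a chance-corrected statistic could replace percent agreement where that matters.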

Created:

2025-04-28
