Replication Data for: ChatGPT outperforms crowd-workers for text-annotation tasks

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://doi.org/10.7910/DVN/PQYF6M

Description:

Many NLP applications require manual text annotations for a variety of tasks, notably to train classifiers or evaluate the performance of unsupervised models. Depending on the size and degree of complexity, the tasks may be conducted by crowd-workers on platforms such as MTurk as well as trained annotators, such as research assistants. Using four samples of tweets and news articles (n = 6,183), we show that ChatGPT outperforms crowd-workers for several annotation tasks, including relevance, stance, topics, and frame detection. Across the four datasets, the zero-shot accuracy of ChatGPT exceeds that of crowd-workers by about 25 percentage points on average, while ChatGPT's intercoder agreement exceeds that of both crowd-workers and trained annotators for all tasks. Moreover, the per-annotation cost of ChatGPT is less than $0.003, about thirty times cheaper than MTurk. These results demonstrate the potential of large language models to drastically increase the efficiency of text classification.
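
The two headline metrics in the description, accuracy and intercoder agreement, can be illustrated with a minimal sketch. It assumes accuracy is the share of labels matching a gold-standard label and intercoder agreement is the share of items on which two annotation runs assign the same label; the function names and the toy labels are hypothetical and are not taken from the replication files.

```python
from typing import Sequence


def _match_rate(a: Sequence[str], b: Sequence[str]) -> float:
    """Share of positions at which two equally long label sequences agree."""
    if len(a) != len(b):
        raise ValueError("label sequences must have the same length")
    if len(a) == 0:
        raise ValueError("label sequences must not be empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Assumed definition: fraction of predictions equal to the gold label."""
    return _match_rate(predicted, gold)


def percent_agreement(run_a: Sequence[str], run_b: Sequence[str]) -> float:
    """Assumed definition: fraction of items labelled identically by two runs."""
    return _match_rate(run_a, run_b)


# Toy, made-up labels for a relevance task (not data from the study).
gold = ["relevant", "irrelevant", "relevant", "relevant"]
run_1 = ["relevant", "irrelevant", "relevant", "irrelevant"]
run_2 = ["relevant", "irrelevant", "relevant", "irrelevant"]

acc_run_1 = accuracy(run_1, gold)
agreement = percent_agreement(run_1, run_2)
print(f"accuracy of run 1: {acc_run_1:.2f}")
print(f"agreement between runs: {agreement:.2f}")
```

The toy runs agree with each other on every item while still missing one gold label, which shows why the description reports the two metrics separately: high agreement indicates consistency, not correctness.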

Created:

2023-07-19
