Replication Data for: ChatGPT outperforms crowd-workers for text-annotation tasks

DataONE. Updated 2023-07-19; indexed 2024-06-08.

Download link:

https://search.dataone.org/view/sha256:60be7614b50cf47808403c6e1e1818489f16a909456ec65d15b3aea9e08af2fa

Description:

Many NLP applications require manual text annotations for a variety of tasks, notably to train classifiers or evaluate the performance of unsupervised models. Depending on the size and degree of complexity, the tasks may be conducted by crowd-workers on platforms such as MTurk as well as trained annotators, such as research assistants. Using four samples of tweets and news articles (n = 6,183), we show that ChatGPT outperforms crowd-workers for several annotation tasks, including relevance, stance, topics, and frame detection. Across the four datasets, the zero-shot accuracy of ChatGPT exceeds that of crowd-workers by about 25 percentage points on average, while ChatGPT's intercoder agreement exceeds that of both crowd-workers and trained annotators for all tasks. Moreover, the per-annotation cost of ChatGPT is less than $0.003, about thirty times cheaper than MTurk. These results demonstrate the potential of large language models to drastically increase the efficiency of text classification.

Created:

2023-11-08
