Replication Data for: ChatGPT outperforms crowd-workers for text-annotation tasks
Source: DataONE. Updated 2023-07-19; indexed 2024-06-08.
Download link:
https://search.dataone.org/view/sha256:60be7614b50cf47808403c6e1e1818489f16a909456ec65d15b3aea9e08af2fa
Description:
Many NLP applications require manual text annotations for a variety of tasks, notably to train classifiers or evaluate the performance of unsupervised models. Depending on the size and degree of complexity, the tasks may be conducted by crowd-workers on platforms such as MTurk as well as trained annotators, such as research assistants. Using four samples of tweets and news articles (n = 6,183), we show that ChatGPT outperforms crowd-workers for several annotation tasks, including relevance, stance, topics, and frame detection. Across the four datasets, the zero-shot accuracy of ChatGPT exceeds that of crowd-workers by about 25 percentage points on average, while ChatGPT's intercoder agreement exceeds that of both crowd-workers and trained annotators for all tasks. Moreover, the per-annotation cost of ChatGPT is less than $0.003, about thirty times cheaper than MTurk. These results demonstrate the potential of large language models to drastically increase the efficiency of text classification.
Created:
2023-11-08



