Humans vs ChatGPT texts on TOEFL questions and HC3 dataset
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/11199029
Description
Human-ChatGPT (GPT-4o) comparison corpus. It extends the HC3 dataset and the ChatGPT Generated Text Detection corpus. The original datasets contain the questions together with the human and ChatGPT (GPT-3.5) answers; this corpus adds GPT-4o answers to the same questions. Each file holds one GPT-4o answer per line, one line for each question.
HC3 dataset:
finance
medicine
computing
open questions
ChatGPT Generated Text Detection corpus:
toefl
Program.py: Python script that lemmatizes, POS-tags, and cleans the human and ChatGPT texts
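
Because each file stores one GPT-4o answer per line, the answers can be aligned with the original questions by line position. The following is a minimal sketch under stated assumptions, not the authors' Program.py: the file name in the usage comment is hypothetical, the helper names (load_answers, clean_text, pair_with_questions) are illustrative, and clean_text is only a simple whitespace and case normalization that does no lemmatization or POS tagging.

```python
import re


def load_answers(path):
    """Read one answer per line; line i is assumed to answer question i."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\r\n") for line in handle]


def clean_text(text):
    """Collapse whitespace runs and lowercase the text.

    Illustrative stand-in only; it does not reproduce Program.py.
    """
    return re.sub(r"\s+", " ", text).strip().lower()


def pair_with_questions(questions, answers):
    """Zip questions with answers, refusing silently misaligned inputs."""
    if len(questions) != len(answers):
        raise ValueError(
            f"expected {len(questions)} answers, found {len(answers)}"
        )
    return list(zip(questions, answers))


# Hypothetical usage (the file name is an assumption, not taken from the record):
# answers = load_answers("finance_gpt4o.txt")
# cleaned = [clean_text(answer) for answer in answers]
```

Checking the line count before pairing guards against a truncated download or a stray blank line shifting every answer by one position.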
Paper
Paper: Playing with Words: Comparing the Vocabulary and Lexical Richness of ChatGPT and Humans
Cite:
@misc{reviriego2023playing,
  title={Playing with Words: Comparing the Vocabulary and Lexical Richness of ChatGPT and Humans},
  author={Pedro Reviriego and Javier Conde and Elena Merino-Gómez and Gonzalo Martínez and José Alberto Hernández},
  year={2023},
  eprint={2308.07462},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
Created:
2024-05-15



