GPT Capabilities for Extracting Tasks From Textual Process Descriptions
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/7783506
Description:
This dataset provides multiple tables that evaluate the capabilities of GPT 1, GPT 2, GPT 3, GPT 3.5, and GPT 4 in extracting tasks from the https://doi.org/10.5281/zenodo.7783492 and PET datasets. The tables are contained in model_completeness.zip, model_correctness.xslx and control_evaluation.xlsx.
The performance of the LLMs is measured by calculating a range of similarity metrics:
Number of tasks extracted from the text vs. number of tasks extracted from the model
Semantic Text Similarity: Contextual and Non-Contextual between extracted sets of tasks
Semantic Text Similarity: Contextual and Non-Contextual between extracted individual tasks
Similarities and Prevalence for length-restricted extracted labels
Similarities and Prevalence for augmented texts (each text has been paraphrased by 9 different paraphrasing methods)
Recall and Precision for the above-mentioned datasets
Jaccard Index for the above-mentioned datasets
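The set-based metrics in the list above (Recall, Precision, Jaccard Index) can be illustrated with a minimal Python sketch. It assumes that an extracted task counts as correct only when its label exactly matches a reference label after lowercasing and whitespace normalization; the dataset itself may match tasks differently, for example via semantic similarity. The function names `normalize` and `set_metrics` and the sample task labels are hypothetical and are not taken from the dataset.

```python
def normalize(label: str) -> str:
    """Lowercase a task label and collapse runs of whitespace (assumed matching rule)."""
    return " ".join(label.lower().split())


def set_metrics(extracted, reference):
    """Compare two collections of task labels as sets.

    extracted: labels produced by the LLM (hypothetical input)
    reference: labels from the gold-standard model (hypothetical input)
    Returns precision, recall, and Jaccard index as floats in [0, 1].
    """
    ext = {normalize(t) for t in extracted}
    ref = {normalize(t) for t in reference}
    overlap = ext & ref
    union = ext | ref
    # Guard against empty sets to avoid division by zero.
    precision = len(overlap) / len(ext) if ext else 0.0
    recall = len(overlap) / len(ref) if ref else 0.0
    jaccard = len(overlap) / len(union) if union else 0.0
    return {"precision": precision, "recall": recall, "jaccard": jaccard}


if __name__ == "__main__":
    # Illustrative labels only; they do not come from the dataset.
    llm_tasks = ["Check invoice", "approve  payment", "Send receipt"]
    gold_tasks = ["check invoice", "Approve payment", "archive invoice", "notify customer"]
    print(set_metrics(llm_tasks, gold_tasks))
```

In this example two of the three extracted labels match the four reference labels, so precision is 2/3, recall is 2/4, and the Jaccard index is 2/5. Exact matching is the strictest choice and penalizes paraphrased labels, which is presumably why the dataset also reports semantic similarity.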
This dataset also includes all data collected from a survey (survey_data.ods) in which participants (currently n=40) rate the quality of LLM-created models: https://forms.office.com/e/Y55jyNuPi2?origin=lprLink
Created:
2023-11-01



