GPT Capabilities for Extracting Tasks From Textual Process Descriptions

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://zenodo.org/record/7783506

Description:

This dataset provides multiple tables that evaluate the capabilities of GPT 1, GPT 2, GPT 3, GPT 3.5, and GPT 4 regarding the extraction of tasks from the https://doi.org/10.5281/zenodo.7783492 and PET datasets in model_completeness.zip, model_correctness.xslx and control_evaluation.xlsx. The performance of the LLMs is measured by calculating a range of similarity metrics:

- Extracted number of tasks from text vs. extracted number of tasks from model
- Semantic text similarity (contextual and non-contextual) between extracted sets of tasks
- Semantic text similarity (contextual and non-contextual) between extracted individual tasks
- Similarities and prevalence for length-restricted extracted labels
- Similarities and prevalence for augmented texts (each text has been paraphrased by 9 different paraphrasing methods)
- Recall and precision for the above-mentioned datasets
- Jaccard index for the above-mentioned datasets

This dataset also includes all data collected from a survey (survey_data.ods) in which users (currently n=40) rate the quality of LLM-created models: https://forms.office.com/e/Y55jyNuPi2?origin=lprLink
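As an illustration of the set-based metrics named in the description (recall, precision and Jaccard index), the following is a minimal sketch of how they could be computed for one text. It assumes that extracted and reference tasks are compared as sets of normalised label strings; the dataset's actual matching procedure may differ. The function name `set_metrics` and the example labels are hypothetical and are not taken from the dataset.

```python
def set_metrics(extracted, reference):
    """Return (precision, recall, jaccard) for two collections of task labels.

    Assumption: two labels match only if they are identical after
    lower-casing and collapsing whitespace. This is an illustrative
    choice, not the dataset's documented procedure.
    """
    ext = {" ".join(label.lower().split()) for label in extracted}
    ref = {" ".join(label.lower().split()) for label in reference}

    overlap = ext & ref  # tasks found by the model that are also in the reference
    union = ext | ref    # all distinct tasks from either side

    precision = len(overlap) / len(ext) if ext else 0.0
    recall = len(overlap) / len(ref) if ref else 0.0
    jaccard = len(overlap) / len(union) if union else 0.0
    return precision, recall, jaccard


# Hypothetical labels, for illustration only.
extracted = ["Check invoice", "approve payment", "Archive documents"]
reference = ["check invoice", "Approve payment", "send confirmation", "notify customer"]

precision, recall, jaccard = set_metrics(extracted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} jaccard={jaccard:.3f}")
```

Exact string matching is the strictest possible comparison. The semantic text similarity metrics listed in the description presumably exist to capture paraphrased labels that such a comparison would miss.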

Created:

2023-11-01
