Performance of GPT-4o mini and GPT-4o for medical text mining tasks at different temperature settings

Source: DataCite Commons. Updated: 2025-05-01. Indexed: 2025-05-10.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.crjdfn3dt
Description:
The application of natural language processing (NLP) for extracting data from biomedical research has gained momentum with the advent of large language models (LLMs). However, the effect of LLM parameters such as temperature on biomedical text mining remains underexplored, and there is no consensus on which settings can be considered “safe”. This study evaluates the impact of temperature settings on LLM performance for a named entity recognition task and a classification task in clinical trial publications. Two datasets that the author group had annotated in previous projects were used to create tasks for evaluating two LLMs, namely Generative Pretrained Transformer 4 Omni (GPT-4o, OpenAI, San Francisco, United States) and GPT-4o mini, at nine different temperature settings. In the first task, the LLMs were asked to extract the number of people who underwent randomization from the abstract of a publication reporting on a randomized clinical trial (RCT). In the second task, they were asked to classify whether an abstract reported on an RCT and/or an oncology topic. The answers of the LLMs as well as the ground truth are provided in the dataset.
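Because the dataset pairs each LLM answer with its ground truth, per-temperature accuracy can be computed directly. The following is a minimal sketch, not the authors' evaluation code: the record layout (temperature, answer, ground truth), the function name accuracy_by_temperature, and the toy values are all assumptions made for illustration, and the actual file structure on Dryad may differ.

```python
from collections import defaultdict


def accuracy_by_temperature(records):
    """Return {temperature: fraction of exact matches}.

    `records` is an iterable of (temperature, model_answer, ground_truth)
    tuples. This layout is hypothetical, not the dataset's documented schema.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for temperature, answer, truth in records:
        total[temperature] += 1
        # Compare as trimmed strings so "120" and " 120 " count as equal.
        if str(answer).strip() == str(truth).strip():
            correct[temperature] += 1
    return {t: correct[t] / total[t] for t in sorted(total)}


# Toy records invented for illustration only; they are not from the dataset.
toy_records = [
    (0.0, "120", "120"),
    (0.0, "45", "45"),
    (1.0, "120", "120"),
    (1.0, "54", "45"),
]
scores = accuracy_by_temperature(toy_records)
```

Exact string matching suits the extraction task, where the answer is a single number. For the classification task, the same function could be applied separately to the RCT label and the oncology label.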
Provider:
Dryad
Created:
2025-01-13