A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to selected genetics questions - Full study data
Source: DataONE. Updated: 2024-06-04. Indexed: 2025-08-02.
Download link:
https://search.dataone.org/view/sha256:67390dd1e60f78edcbba7e5397a378a7101e96a2626ee2c43a7035d213331237
Resource description:
Objective:
Our objective is to evaluate the efficacy of ChatGPT 4 in accurately and effectively delivering genetic information, building on previous findings with ChatGPT 3.5. We focus on assessing the utility, limitations, and ethical implications of using ChatGPT in medical settings.
Materials and Methods:
A structured questionnaire, including the Brief User Survey (BUS-15) and custom questions, was developed to assess ChatGPT 4's clinical value. An expert panel of genetic counselors and clinical geneticists independently evaluated ChatGPT 4's responses to these questions. We also performed a comparative analysis with ChatGPT 3.5, using descriptive statistics computed in R.
Results:
ChatGPT 4 demonstrated improvements over 3.5 in context recognition, relevance, and informativeness. However, performance variability and concerns about the naturalness of the output were noted. No significant difference in accuracy was found between ChatGPT 3.5 and 4.0. Notably, the effic...
Study Design
This study was conducted to evaluate the performance of ChatGPT 4 (March 23rd, 2023 Model) in the context of genetic counseling and education. The evaluation involved a structured questionnaire, which included questions selected from the Brief User Survey (BUS-15) and additional custom questions designed to assess the clinical value of ChatGPT 4's responses.
Questionnaire Development
The questionnaire, built in Qualtrics, comprised twelve questions: seven selected from the BUS-15, preceded by two additional questions that we designed.
The initial questions focused on quality and answer relevancy:
1.    The overall quality of the Chatbot's response is: (5-point Likert: Very poor to Very Good)
2.    The Chatbot delivered an answer that provided the relevant information you would include if asked the question. (5-point Likert: Strongly disagree to Strongly agree)
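As a rough illustration of how ratings on such Likert items might be summarized with descriptive statistics, the following is a minimal Python sketch. It is not the study's analysis, which was done in R. The function name `summarize_likert`, the integer coding 1 through `scale_max`, and the example ratings are all assumptions made for illustration. None of the values come from the dataset.

```python
from collections import Counter
from statistics import mean, median


def summarize_likert(scores, scale_max):
    """Return descriptive statistics for integer Likert codes in 1..scale_max.

    Hypothetical helper: assumes responses are already coded as integers,
    with 1 as the lowest anchor and scale_max as the highest.
    """
    if not scores:
        raise ValueError("no scores to summarize")
    for s in scores:
        if not isinstance(s, int) or not 1 <= s <= scale_max:
            raise ValueError(f"score {s!r} is outside 1..{scale_max}")
    counts = Counter(scores)
    return {
        "n": len(scores),
        "median": median(scores),
        "mean": mean(scores),
        # Include every scale level so unused levels show up as zero.
        "counts": {level: counts.get(level, 0) for level in range(1, scale_max + 1)},
    }


# Hypothetical ratings for illustration only; NOT values from the dataset.
example_ratings = [4, 5, 3, 4, 4, 2, 5]
summary = summarize_likert(example_ratings, scale_max=5)
print(summary)
```

The same helper could be applied to the 7-point items by passing `scale_max=7`. Reporting the median and per-level counts alongside the mean is a common choice for ordinal data, since the mean alone treats the scale as interval.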
The BUS-15 questions (7-point Likert: Strongly disagree to Strongly agree) focused on:
1.    Recogniti...
# A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to selected genetics questions - Full study data
[https://doi.org/10.5061/dryad.s4mw6m9cv](https://doi.org/10.5061/dryad.s4mw6m9cv)
These data were captured when evaluating the ability of ChatGPT to address questions patients may ask it about three genetic conditions (BRCA1, HFE, and MLH1). The data are associated with the JAMIA article of a similar name, DOI 10.1093/jamia/ocae128.
## Description of the data and file structure
1. **Key**: This tab describes the data structure, explaining the survey questions and the available responses.
2. **Prompt Responses**: This tab contains the prompts used for ChatGPT and the responses provided by each model (3.5 and 4).
3. **GPT 4 Results**: This tab provides the responses collected from the medical experts (genetic counselors and clinical geneticists) through the Qualtrics survey.
4. **Accuracy (Qx_1)**: This tab contains the subset of results from both the Ch...
Created:
2025-08-01



