ChatGPT Performs Worse on USMLE-Style Ethics Questions Compared to Medical Knowledge Questions.
The main objective of this study is to evaluate the ability of the large language model Chat Generative Pre-Trained Transformer (ChatGPT) to accurately answer United States Medical Licensing Examination (USMLE) board-style medical ethics questions compared with medical knowledge-based questions. Secondary objectives are to compare the overall accuracy of GPT-3.5 and GPT-4 and to assess the variability of responses given by each version.
Author(s): Danehy, Tessa, Hecht, Jessica, Kentis, Sabrina, Schechter, Clyde B, Jariwala, Sunit P
DOI: 10.1055/a-2405-0138