¹School of Computer Science, University of Leeds
²Leeds Institute of Medical Education, School of Medicine, University of Leeds
Abstract
Background: Simulated medical scenarios are useful for evaluating and developing clinical competencies, but scheduling them is expensive and time-consuming. Large language models (LLMs) show promise in role-playing tasks. We investigated the fidelity with which ChatGPT can mimic participants in clinical settings.
Objective: To determine the realism with which ChatGPT can portray patient, doctor and examiner roles, and the utility of these agents in clinical education.
Method: We selected four paediatric scenarios from mock objective structured clinical examinations (OSCEs) and set up separate patient, doctor and examiner ChatGPT agents for each. The patient and doctor agents conversed with each other in written form, and the examiner agent then marked the doctor agent's performance on the basis of this conversation. Patients and clinicians familiar with OSCEs assessed the dialogues.
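The abstract does not describe how the agents were implemented. As a purely illustrative sketch under assumed details (the OpenAI Python client, a hypothetical "gpt-4" model name, invented system prompts and a fixed turn count), one way to drive such a written patient-doctor exchange is:

# Illustrative sketch only: not the authors' code. Prompts, model name and
# turn count are assumptions; the paper states only that separate patient,
# doctor and examiner ChatGPT agents were set up and conversed in writing.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PATIENT_PROMPT = "Role-play the parent in this paediatric OSCE station: ..."  # hypothetical
DOCTOR_PROMPT = "Role-play the doctor taking a history in this station: ..."  # hypothetical


def next_turn(system_prompt, history):
    """Ask one agent for its next utterance, given the dialogue so far."""
    reply = client.chat.completions.create(
        model="gpt-4",  # assumed model
        messages=[{"role": "system", "content": system_prompt}] + history,
    )
    return reply.choices[0].message.content


# The transcript is kept from the patient agent's point of view:
# doctor utterances are "user" turns, patient utterances are "assistant" turns.
transcript = []
doctor_says = "Hello, I'm one of the doctors. What has brought you in today?"
for _ in range(5):  # arbitrary number of exchanges, for illustration only
    transcript.append({"role": "user", "content": doctor_says})
    patient_says = next_turn(PATIENT_PROMPT, transcript)
    transcript.append({"role": "assistant", "content": patient_says})
    # Flip roles so the doctor agent sees the patient's words as the user.
    doctor_view = [
        {"role": "user" if m["role"] == "assistant" else "assistant",
         "content": m["content"]}
        for m in transcript
    ]
    doctor_says = next_turn(DOCTOR_PROMPT, doctor_view)

# The finished transcript could then be given to a third, examiner agent
# along with a mark scheme; that marking step is omitted from this sketch.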
Results: The patient agent was judged to be true to character most of the time and good at expressing emotion. The doctor agent was reported to be an effective communicator but occasionally used jargon. Both agents tended to produce repetitive responses, which undermined realism. The examiner agent's marks correlated strongly with those of human clinicians. There was moderate support for using the simulated interactions for educational purposes.
Conclusion: Although the realism of the agents could be improved, ChatGPT can generate plausible proxies for participants in medical scenarios and could usefully complement standardised patient-based training.