ISSN 1016-5169 | E-ISSN 1308-4488
Comparative Evaluation of Chatbot Responses on Coronary Artery Disease [Turk Kardiyol Dern Ars]
Turk Kardiyol Dern Ars. Ahead of Print: TKDA-78131 | DOI: 10.5543/tkda.2024.78131

Comparative Evaluation of Chatbot Responses on Coronary Artery Disease

Levent Pay1, Ahmet Çağdaş Yumurtaş2, Tuğba Çetin3, Tufan Çınar4, Mert İlker Hayıroğlu3
1Department of Cardiology, Istanbul Haseki Training and Research Hospital, Istanbul, Türkiye
2Department of Cardiology, Kars Harakani State Hospital, Kars, Türkiye
3Department of Cardiology, Dr Siyami Ersek Thoracic and Cardiovascular Surgery Training Hospital, İstanbul, Türkiye
4Department of Medicine, University of Maryland Medical Center Midtown Campus, Maryland, USA


BACKGROUND
Coronary artery disease (CAD) is the leading cause of morbidity and mortality globally. Growing interest in natural language processing chatbots (NLPCs) has made their widespread adoption in healthcare all but inevitable. This study aimed to assess the accuracy and reproducibility of the answers given by NLPCs such as ChatGPT, Gemini, and Bing to frequently asked questions about CAD.

METHODS
Fifty frequently asked questions about CAD were posed twice, one week apart, to ChatGPT, Gemini, and Bing. Two cardiologists independently graded each answer on a four-category scale: comprehensive/correct (1), incomplete/partially correct (2), a mix of accurate and inaccurate/misleading (3), and completely inaccurate/irrelevant (4). The accuracy and reproducibility of each NLPC's answers were then evaluated.
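
The scoring arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis code: the abstract does not define reproducibility, so the sketch assumes it means the share of questions that received the same grade in both rounds. The function names and the four-question grade lists are hypothetical.

```python
from collections import Counter

# Grade categories as listed in the abstract.
GRADE_LABELS = {
    1: "comprehensive/correct",
    2: "incomplete/partially correct",
    3: "mix of accurate and inaccurate/misleading",
    4: "completely inaccurate/irrelevant",
}


def grade_distribution(grades):
    """Return the percentage of answers falling into each grade (1-4)."""
    if not grades:
        raise ValueError("grades must not be empty")
    counts = Counter(grades)
    total = len(grades)
    return {g: 100.0 * counts.get(g, 0) / total for g in GRADE_LABELS}


def reproducibility(first_round, second_round):
    """Percentage of questions graded identically in both rounds.

    Assumption: 'reproducible' means the same grade one week apart.
    """
    if len(first_round) != len(second_round):
        raise ValueError("both rounds must cover the same questions")
    if not first_round:
        raise ValueError("rounds must not be empty")
    same = sum(a == b for a, b in zip(first_round, second_round))
    return 100.0 * same / len(first_round)


# Hypothetical grades for four questions (not data from the study).
round_1 = [1, 1, 2, 1]
round_2 = [1, 2, 2, 1]

for grade, pct in grade_distribution(round_1).items():
    print(f"{GRADE_LABELS[grade]}: {pct:.0f}%")
print(f"reproducibility: {reproducibility(round_1, round_2):.0f}%")
```

With 50 questions per chatbot, each question contributes 2 percentage points, which matches the even-numbered percentages reported in the results.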

RESULTS
ChatGPT's answers were graded as comprehensive/correct in 86% of cases and incomplete/partially correct in 14%. Gemini provided 68% comprehensive/correct answers, 30% incomplete/partially correct answers, and 2% answers that were a mix of accurate and inaccurate/misleading information. Bing delivered 60% comprehensive/correct answers, 26% incomplete/partially correct answers, and 8% answers that were a mix of accurate and inaccurate/misleading information. Reproducibility was 88% for ChatGPT, 84% for Gemini, and 70% for Bing.

CONCLUSIONS
ChatGPT has significant potential to enhance patient education about coronary artery disease, as it provided more sensitive and accurate answers than Bing and Gemini.

Keywords: Artificial intelligence, Bing Chat, ChatGPT, coronary artery disease, digital health, Gemini, natural language processing chatbots

Corresponding Author: Levent Pay
Manuscript Language: English




Copyright © 2024 Archives of the Turkish Society of Cardiology


