Publication:
How accurate are ChatGPT-4 responses in chronic urticaria? A critical analysis with information quality metrics

dc.contributor.authorChérrez, Ivan
dc.contributor.authorFaytong-Haro, Marco
dc.contributor.authorÁlvarez-Muñoz, Patricio Rigoberto
dc.contributor.authorLarco-Sousa, José Ignacio
dc.contributor.authorde Arruda-Chaves, Erika
dc.contributor.authorRojo, Isabel
dc.contributor.authorMoncayo, Carol Vivian
dc.contributor.authorRamón, Germán Darío
dc.contributor.authorRodas-Valero, Gabriela
dc.contributor.authorKocaturk Goncu, Emek Ozgur
dc.contributor.institutionChérrez, Ivan, Universidad Espíritu Santo, Samborondon, Ecuador, Respiralab Research Group, Guayaquil, Ecuador, Institute of Allergology, Charité – Universitätsmedizin Berlin, Berlin, Germany, Allergology and Immunology, Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Frankfurt am Main, Germany
dc.contributor.institutionFaytong-Haro, Marco, Respiralab Research Group, Guayaquil, Ecuador, Universidad Estatal de Milagro, Milagro, Ecuador, Laboratorio para Investigación para el Desarrollo del Ecuador, Guayaquil, Ecuador
dc.contributor.institutionÁlvarez-Muñoz, Patricio Rigoberto, Universidad Estatal de Milagro, Milagro, Ecuador
dc.contributor.institutionLarco-Sousa, José Ignacio, Allergy Department, Clínica San Felipe, Lima, Peru
dc.contributor.institutionde Arruda-Chaves, Erika, Clínica Anglo Americana, Lima, Peru
dc.contributor.institutionRojo, Isabel, Allergy Service, Juarez Hospital, Mexico, Mexico
dc.contributor.institutionMoncayo, Carol Vivian, Allergy Service, Hospital Juárez de México, Mexico, Mexico
dc.contributor.institutionRamón, Germán Darío, Instituto de Alergia e Inmunología del Sur, Bahia Blanca, Argentina
dc.contributor.institutionRodas-Valero, Gabriela, Universidad Espíritu Santo, Samborondon, Ecuador, Respiralab Research Group, Guayaquil, Ecuador
dc.contributor.institutionKocaturk Goncu, Emek Ozgur, Institute of Allergology, Charité – Universitätsmedizin Berlin, Berlin, Germany, Allergology and Immunology, Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Frankfurt am Main, Germany, Department of Dermatology, Bahçeşehir Üniversitesi, Istanbul, Turkey
dc.date.accessioned2025-10-05T14:28:46Z
dc.date.issued2025
dc.description.abstractBackground: The increasing use of artificial intelligence (AI) in healthcare, especially in delivering medical information, prompts concerns over the reliability and accuracy of AI-generated responses. This study evaluates the quality, reliability, and readability of ChatGPT-4 responses for chronic urticaria (CU) care, considering the potential implications of inaccurate medical information. Objective: The goal of the study was to assess the quality, reliability, and readability of ChatGPT-4 responses to inquiries on CU management in accordance with international guidelines, utilizing validated metrics to evaluate the effectiveness of ChatGPT-4 as a resource for medical information acquisition. Methods: Twenty-four questions were derived from the EAACI/GA2LEN/EuroGuiDerm/APAAACI recommendations and utilized as prompts for ChatGPT-4 to obtain responses in individual chats for each question. The inquiries were categorized into 3 groups: A.) Classification and Diagnosis, B.) Assessment and Monitoring, and C.) Treatment and Management Recommendations. The responses were separately evaluated by allergy specialists utilizing the DISCERN instrument for quality assessment, Journal of the American Medical Association (JAMA) benchmark criteria for reliability evaluation, and Flesch scores for readability analysis. The scores were further examined by median calculations and Intraclass Correlation Coefficient assessments. Results: Categories A and C exhibited insufficient reliability according to JAMA, with median scores of 1 and 0, respectively. Category B exhibited a low reliability score (median 2, interquartile range 2). The information quality from category C questions was satisfactory (median 51.5, IQR 12.5). All 3 groups exhibited confusing readability levels according to the Flesch assessment. 
Limitations: The study's limitations include its focus on CU, possible bias in question selection, the use of particular instruments such as DISCERN, JAMA, and Flesch, and reliance on expert opinion for assessment. Conclusion: ChatGPT-4 demonstrates potential for producing medical content; nonetheless, its reliability is shaky, underscoring the necessity for caution and confirmation when employing AI-generated medical information, especially in the management of CU. © 2025 Elsevier B.V., All rights reserved.
dc.identifier.doi10.1016/j.waojou.2025.101071
dc.identifier.issn19394551
dc.identifier.issue7
dc.identifier.scopus2-s2.0-105008012709
dc.identifier.urihttps://doi.org/10.1016/j.waojou.2025.101071
dc.identifier.urihttps://hdl.handle.net/20.500.14719/6259
dc.identifier.volume18
dc.language.isoen
dc.publisherElsevier Inc.
dc.relation.oastatusAll Open Access
dc.relation.oastatusGold Open Access
dc.relation.oastatusGreen Accepted Open Access
dc.relation.oastatusGreen Open Access
dc.relation.sourceWorld Allergy Organization Journal
dc.subject.authorkeywordsArtificial Intelligence
dc.subject.authorkeywordsChronic Urticaria
dc.subject.authorkeywordsGenerative Artificial Intelligence
dc.subject.authorkeywordsArticle
dc.subject.authorkeywordsChatgpt
dc.subject.authorkeywordsChatgpt 4
dc.subject.authorkeywordsChronic Spontaneous Urticaria
dc.subject.authorkeywordsControlled Study
dc.subject.authorkeywordsCorrelation Coefficient
dc.subject.authorkeywordsDiscern Score
dc.subject.authorkeywordsDisease Control
dc.subject.authorkeywordsFlesch Kincaid Grade Level
dc.subject.authorkeywordsFlesch Kincaid Reading Ease Score
dc.subject.authorkeywordsHuman
dc.subject.authorkeywordsJama Benchmark
dc.subject.authorkeywordsQuality Of Life
dc.subject.authorkeywordsScoring System
dc.subject.indexkeywordsArticle
dc.subject.indexkeywordsChatGPT
dc.subject.indexkeywordsChatGPT 4
dc.subject.indexkeywordschronic spontaneous urticaria
dc.subject.indexkeywordschronic urticaria
dc.subject.indexkeywordscontrolled study
dc.subject.indexkeywordscorrelation coefficient
dc.subject.indexkeywordsDISCERN score
dc.subject.indexkeywordsdisease control
dc.subject.indexkeywordsFlesch Kincaid grade level
dc.subject.indexkeywordsFlesch Kincaid reading ease score
dc.subject.indexkeywordshuman
dc.subject.indexkeywordsJAMA benchmark
dc.subject.indexkeywordsquality of life
dc.subject.indexkeywordsscoring system
dc.titleHow accurate are ChatGPT-4 responses in chronic urticaria? A critical analysis with information quality metrics
dc.typeArticle
dcterms.referencesChoudhury, Avishek, Investigating the Impact of User Trust on the Adoption and Use of ChatGPT: Survey Analysis, Journal of Medical Internet Research, 25, (2023); Guardian, (2023); Generative Pre Trained Transformer A Comprehensive Review on Enabling Technologies Potential Applications Emerging Challenges and Future Directions, (2023); J Allergy Clin Immunol, (2024); Zuberbier, Thorsten, The international EAACI/GA²LEN/EuroGuiDerm/APAAACI guideline for the definition, classification, diagnosis, and management of urticaria, Allergy: European Journal of Allergy and Clinical Immunology, 77, 3, pp. 734-766, (2022); Charnock, Deborah, DISCERN: An instrument for judging the quality of written consumer health information on treatment choices, Journal of Epidemiology and Community Health, 53, 2, pp. 105-111, (1999); Silberg, William M., Assessing, controlling, and assuring the quality of medical information on the Internet: Caveant lector et viewor - Let the reader and viewer beware, JAMA, 277, 15, pp. 1244-1245, (1997)
dspace.entity.typePublication
local.indexed.atScopus
person.identifier.scopus-author-id26636865300
person.identifier.scopus-author-id57844685500
person.identifier.scopus-author-id56922180700
person.identifier.scopus-author-id15725387700
person.identifier.scopus-author-id7801394949
person.identifier.scopus-author-id59943884900
person.identifier.scopus-author-id59943323800
person.identifier.scopus-author-id55617014100
person.identifier.scopus-author-id59152645100
person.identifier.scopus-author-id18437291300
