Publication: How accurate are ChatGPT-4 responses in chronic urticaria? A critical analysis with information quality metrics
| dc.contributor.author | Chérrez, Ivan | |
| dc.contributor.author | Faytong-Haro, Marco | |
| dc.contributor.author | Álvarez-Muñoz, Patricio Rigoberto | |
| dc.contributor.author | Larco-Sousa, José Ignacio | |
| dc.contributor.author | de Arruda-Chaves, Erika | |
| dc.contributor.author | Rojo, Isabel | |
| dc.contributor.author | Moncayo, Carol Vivian | |
| dc.contributor.author | Ramón, Germán Darío | |
| dc.contributor.author | Rodas-Valero, Gabriela | |
| dc.contributor.author | Kocaturk Goncu, Emek Ozgur | |
| dc.contributor.institution | Chérrez, Ivan, Universidad Espíritu Santo, Samborondon, Ecuador, Respiralab Research Group, Guayaquil, Ecuador, Institute of Allergology, Charité – Universitätsmedizin Berlin, Berlin, Germany, Allergology and Immunology, Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Frankfurt am Main, Germany | |
| dc.contributor.institution | Faytong-Haro, Marco, Respiralab Research Group, Guayaquil, Ecuador, Universidad Estatal de Milagro, Milagro, Ecuador, Laboratorio para Investigación para el Desarrollo del Ecuador, Guayaquil, Ecuador | |
| dc.contributor.institution | Álvarez-Muñoz, Patricio Rigoberto, Universidad Estatal de Milagro, Milagro, Ecuador | |
| dc.contributor.institution | Larco-Sousa, José Ignacio, Allergy Department, Clínica San Felipe, Lima, Peru | |
| dc.contributor.institution | de Arruda-Chaves, Erika, Clínica Anglo Americana, Lima, Peru | |
| dc.contributor.institution | Rojo, Isabel, Allergy Service, Juarez Hospital, Mexico, Mexico | |
| dc.contributor.institution | Moncayo, Carol Vivian, Allergy Service, Hospital Juárez de México, Mexico, Mexico | |
| dc.contributor.institution | Ramón, Germán Darío, Instituto de Alergia e Inmunología del Sur, Bahia Blanca, Argentina | |
| dc.contributor.institution | Rodas-Valero, Gabriela, Universidad Espíritu Santo, Samborondon, Ecuador, Respiralab Research Group, Guayaquil, Ecuador | |
| dc.contributor.institution | Kocaturk Goncu, Emek Ozgur, Institute of Allergology, Charité – Universitätsmedizin Berlin, Berlin, Germany, Allergology and Immunology, Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, Frankfurt am Main, Germany, Department of Dermatology, Bahçeşehir Üniversitesi, Istanbul, Turkey | |
| dc.date.accessioned | 2025-10-05T14:28:46Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Background: The increasing use of artificial intelligence (AI) in healthcare, especially in delivering medical information, prompts concerns over the reliability and accuracy of AI-generated responses. This study evaluates the quality, reliability, and readability of ChatGPT-4 responses for chronic urticaria (CU) care, considering the potential implications of inaccurate medical information. Objective: The goal of the study was to assess the quality, reliability, and readability of ChatGPT-4 responses to inquiries on CU management in accordance with international guidelines, utilizing validated metrics to evaluate the effectiveness of ChatGPT-4 as a resource for medical information acquisition. Methods: Twenty-four questions were derived from the EAACI/GA²LEN/EuroGuiDerm/APAAACI recommendations and utilized as prompts for ChatGPT-4 to obtain responses in individual chats for each question. The inquiries were categorized into 3 groups: (A) Classification and Diagnosis, (B) Assessment and Monitoring, and (C) Treatment and Management Recommendations. The responses were separately evaluated by allergy specialists utilizing the DISCERN instrument for quality assessment, Journal of the American Medical Association (JAMA) benchmark criteria for reliability evaluation, and Flesch scores for readability analysis. The scores were further examined by median calculations and Intraclass Correlation Coefficient assessments. Results: Categories A and C exhibited insufficient reliability according to JAMA, with median scores of 1 and 0, respectively. Category B exhibited a low reliability score (median 2, interquartile range [IQR] 2). The information quality from category C questions was satisfactory (median 51.5, IQR 12.5). All 3 groups exhibited confusing readability levels according to the Flesch assessment. Limitations: The study's limitations encompass the emphasis on CU, possible bias in question selection, the use of particular instruments such as DISCERN, JAMA, and Flesch, as well as reliance on expert opinion for assessment. Conclusion: ChatGPT-4 demonstrates potential for producing medical content; nonetheless, its reliability is shaky, underscoring the necessity for caution and confirmation when employing AI-generated medical information, especially in the management of CU. © 2025 Elsevier B.V., All rights reserved. | |
| dc.identifier.doi | 10.1016/j.waojou.2025.101071 | |
| dc.identifier.issn | 19394551 | |
| dc.identifier.issue | 7 | |
| dc.identifier.scopus | 2-s2.0-105008012709 | |
| dc.identifier.uri | https://doi.org/10.1016/j.waojou.2025.101071 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14719/6259 | |
| dc.identifier.volume | 18 | |
| dc.language.iso | en | |
| dc.publisher | Elsevier Inc. | |
| dc.relation.oastatus | All Open Access | |
| dc.relation.oastatus | Gold Open Access | |
| dc.relation.oastatus | Green Accepted Open Access | |
| dc.relation.oastatus | Green Open Access | |
| dc.relation.source | World Allergy Organization Journal | |
| dc.subject.authorkeywords | Artificial Intelligence | |
| dc.subject.authorkeywords | Chronic Urticaria | |
| dc.subject.authorkeywords | Generative Artificial Intelligence | |
| dc.subject.authorkeywords | Article | |
| dc.subject.authorkeywords | ChatGPT | |
| dc.subject.authorkeywords | ChatGPT 4 | |
| dc.subject.authorkeywords | Chronic Spontaneous Urticaria | |
| dc.subject.authorkeywords | Controlled Study | |
| dc.subject.authorkeywords | Correlation Coefficient | |
| dc.subject.authorkeywords | DISCERN Score | |
| dc.subject.authorkeywords | Disease Control | |
| dc.subject.authorkeywords | Flesch Kincaid Grade Level | |
| dc.subject.authorkeywords | Flesch Kincaid Reading Ease Score | |
| dc.subject.authorkeywords | Human | |
| dc.subject.authorkeywords | JAMA Benchmark | |
| dc.subject.authorkeywords | Quality Of Life | |
| dc.subject.authorkeywords | Scoring System | |
| dc.subject.indexkeywords | Article | |
| dc.subject.indexkeywords | ChatGPT | |
| dc.subject.indexkeywords | ChatGPT 4 | |
| dc.subject.indexkeywords | chronic spontaneous urticaria | |
| dc.subject.indexkeywords | chronic urticaria | |
| dc.subject.indexkeywords | controlled study | |
| dc.subject.indexkeywords | correlation coefficient | |
| dc.subject.indexkeywords | DISCERN score | |
| dc.subject.indexkeywords | disease control | |
| dc.subject.indexkeywords | Flesch Kincaid grade level | |
| dc.subject.indexkeywords | Flesch Kincaid reading ease score | |
| dc.subject.indexkeywords | human | |
| dc.subject.indexkeywords | JAMA benchmark | |
| dc.subject.indexkeywords | quality of life | |
| dc.subject.indexkeywords | scoring system | |
| dc.title | How accurate are ChatGPT-4 responses in chronic urticaria? A critical analysis with information quality metrics | |
| dc.type | Article | |
| dcterms.references | Choudhury, Avishek, Investigating the Impact of User Trust on the Adoption and Use of ChatGPT: Survey Analysis, Journal of Medical Internet Research, 25, (2023); Guardian, (2023); Generative Pre-trained Transformer: A Comprehensive Review on Enabling Technologies, Potential Applications, Emerging Challenges, and Future Directions, (2023); J Allergy Clin Immunol, (2024); Zuberbier, Thorsten, The international EAACI/GA²LEN/EuroGuiDerm/APAAACI guideline for the definition, classification, diagnosis, and management of urticaria, Allergy: European Journal of Allergy and Clinical Immunology, 77, 3, pp. 734-766, (2022); Charnock, Deborah, DISCERN: An instrument for judging the quality of written consumer health information on treatment choices, Journal of Epidemiology and Community Health, 53, 2, pp. 105-111, (1999); Silberg, William M., Assessing, controlling, and assuring the quality of medical information on the Internet: Caveant lector et viewor - Let the reader and viewer beware, JAMA, 277, 15, pp. 1244-1245, (1997) | |
| dspace.entity.type | Publication | |
| local.indexed.at | Scopus | |
| person.identifier.scopus-author-id | 26636865300 | |
| person.identifier.scopus-author-id | 57844685500 | |
| person.identifier.scopus-author-id | 56922180700 | |
| person.identifier.scopus-author-id | 15725387700 | |
| person.identifier.scopus-author-id | 7801394949 | |
| person.identifier.scopus-author-id | 59943884900 | |
| person.identifier.scopus-author-id | 59943323800 | |
| person.identifier.scopus-author-id | 55617014100 | |
| person.identifier.scopus-author-id | 59152645100 | |
| person.identifier.scopus-author-id | 18437291300 |
