Publication:
Hybrid CNN+Transformer for Diabetic Retinopathy Recognition and Grading

Date

2023

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Diabetic retinopathy (DR) can cause blindness if it is not treated in time, so automatic DR detection and grading systems play a significant role in early diagnosis and treatment. However, the accuracy of existing computer-aided systems is still insufficient for clinical use, and they need large-scale training datasets to perform well. This paper proposes a hybrid CNN+Transformer DR recognition and grading system that improves performance and remains competitive even when trained directly on small datasets. First, a deep CNN-based EfficientNet-B0 backbone is used as the feature extractor. Then, global dependencies between the input and output are modeled with a Transformer encoder-decoder (TE-TD), interleaved with Multi-Head Self-Attention (MHSA) layers for feature encoding. This is followed by a Residual Spatial Module (RSM), which further improves the model's performance while stabilizing training. A prediction feed-forward network (PFFN) serves as the classifier. Comprehensive ablation studies investigate the contribution of each module and the superiority of the combined CNN and Transformer over the plain individual architectures. The approach generalizes well, obtaining state-of-the-art performance in both recognition and grading on five benchmark datasets: EyePACS, APTOS, DDR, Messidor-1, and Messidor-2. © 2023 Elsevier B.V., All rights reserved.
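
The hand-off from CNN features to self-attention described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the backbone output is tokenized, so flattening the feature map in row-major order is an assumption, and the learned query/key/value projections of a real MHSA layer are omitted (each head simply attends over its own slice of the channels). The names `feature_map_to_tokens`, `attention`, and `multi_head_self_attention` are hypothetical and use only the Python standard library.

```python
import math


def feature_map_to_tokens(fmap):
    """Flatten an H x W x C feature map (nested lists) into H*W tokens.

    Assumption: row-major flattening, one token per spatial location.
    """
    return [list(fmap[y][x]) for y in range(len(fmap)) for x in range(len(fmap[0]))]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) for c in range(len(v[0]))])
    return out


def multi_head_self_attention(tokens, num_heads):
    """Self-attention with the channels split evenly across heads.

    Simplification: no learned projections, so queries, keys, and values
    are all the same channel slice of the input tokens.
    """
    d_model = len(tokens[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    heads = []
    for h in range(num_heads):
        part = [t[h * d_head:(h + 1) * d_head] for t in tokens]
        heads.append(attention(part, part, part))
    # Concatenate the per-head outputs back into d_model-sized vectors.
    return [sum((heads[h][i] for h in range(num_heads)), []) for i in range(len(tokens))]
```

Calling `multi_head_self_attention(feature_map_to_tokens(fmap), num_heads)` returns one vector per spatial location with the same width as the input channels. Every output token mixes information from all locations, which is the sense in which attention captures global dependencies that a convolution's local receptive field does not.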
