Publication: Hybrid CNN+Transformer for Diabetic Retinopathy Recognition and Grading
Date
2023
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
Diabetic retinopathy (DR) can cause blindness if it is not treated in time. Automatic DR detection and grading systems therefore play a significant role in early diagnosis and treatment. However, the accuracy of existing computer-aided systems is still insufficient for clinical use, and they need large-scale training datasets to perform well. This paper proposes a hybrid CNN+Transformer DR recognition and grading system that improves performance and remains competitive even when trained directly on small datasets. First, a deep CNN-based EfficientNet-B0 backbone is used as the feature extractor. Then, global dependencies are drawn between the input and output by a Transformer encoder-decoder (TE-TD) interleaved with Multi-Head Self-Attention (MHSA) for feature encoding. This is followed by a Residual Spatial Module (RSM), which further improves model performance while stabilizing training. A prediction feed-forward network (PFFN) serves as the classifier. Comprehensive ablation studies investigate the contribution of each module and the superiority of the combined CNN and Transformer over the plain individual architectures. The approach generalizes well, obtaining state-of-the-art performance in both recognition and grading on five benchmark datasets: EyePACS, APTOS, DDR, Messidor-1, and Messidor-2. © 2023 Elsevier B.V., All rights reserved.
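To make the "CNN features into self-attention" step of the abstract concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a toy 2 x 2 feature map with 3 channels standing in for backbone output, a single attention head, and identity query/key/value projections; an actual MHSA layer would use several heads with learned projections. The names `flatten_feature_map`, `self_attention`, and `toy_fmap` are hypothetical and chosen only for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def flatten_feature_map(fmap):
    """Turn an H x W x C nested list into a list of H*W tokens of length C."""
    return [pixel for row in fmap for pixel in row]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: queries, keys, and values are the tokens
    themselves (identity projections), so there are no learned weights.
    Returns (attended tokens, attention weight matrix).
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in tokens:
        # Similarity of this query token to every key token.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    # Each output token is a weighted average of all value tokens.
    outputs = [
        [sum(w * v[j] for w, v in zip(w_row, tokens)) for j in range(d)]
        for w_row in weights
    ]
    return outputs, weights


# Toy stand-in for a CNN feature map (hypothetical values): 2 x 2 spatial, 3 channels.
toy_fmap = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]],
]
tokens = flatten_feature_map(toy_fmap)
attended, attn = self_attention(tokens)
```

Because every output token mixes information from all spatial positions, this is the sense in which attention captures global dependencies that a convolution with a small receptive field does not.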
