Research Outputs | WoS | Scopus | TR-Dizin | PubMed
Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741
8 results
Search Results
Publication Metadata only
ARVA: An Augmented Reality-Based Visual Aid for Mobility Enhancement Through Real-Time Video Stream Transformation (IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 2024)
Sadeghzadeh, Arezoo; Islam, Md Baharul; Uddin, Md Nur; Aydin, Tarkan; Bahcesehir University; State University System of Florida; Florida Gulf Coast University; Daffodil International University
Visual field loss (VFL) is a persistent visual impairment characterized by limited vision spots (scotoma) within the normal visual field, significantly impacting daily activities for affected individuals. Current Virtual Reality (VR) and Augmented Reality (AR)-based visual aids suffer from low video quality, content loss, high levels of contradiction, and limited mobility assessment. To address these issues, we propose an innovative vision aid utilizing an AR headset and integrating advanced video processing techniques to elevate the visual perception of individuals with moderate to severe VFL to levels comparable to those with unimpaired vision. Our approach introduces a pioneering optimal video remapping function tailored to the characteristics of AR glasses. This function strategically maps the content of live video captures to the largest intact region of the visual field map, preserving quality while minimizing blurriness and content distortion. To evaluate the performance of our proposed method, a comprehensive empirical user study is conducted, including object counting and multi-tasking walking track tests and involving 15 subjects with artificially induced scotomas in their normal visual fields. The proposed vision aid achieves a 41.56 percentage-point enhancement (from 57.31% to 98.87%) in the mean value of the average object recognition rates for all subjects in the object counting test.
In the walking track test, the average mean scores for obstacle avoidance, detected signs, recognized signs, and grasped objects are significantly enhanced after applying the remapping function, with improvements of 7.56 (91.10% to 98.66%), 51.81 (44.85% to 96.66%), 49.31 (43.18% to 92.49%), and 77.77 (13.33% to 91.10%) percentage points, respectively. Statistical analysis of data before and after applying the remapping function demonstrates the promising performance of our method in enhancing visual awareness and mobility for individuals with VFL.

Publication Metadata only
Assistive Visual Tool: Enhancing Safe Navigation with Video Remapping in AR Headsets (SPRINGER INTERNATIONAL PUBLISHING AG, 2025)
Sadeghzadeh, Arezoo; Islam, Md Baharul; Uddin, Md Nur; Aydin, Tarkan; DelBue, A; Canton, C; Pont-Tuset, J; Tommasi, T; Bahcesehir University; State University System of Florida; Florida Gulf Coast University
Visual Field Loss (VFL) is characterized by blind spots, or scotomas, that have a detrimental impact on individuals' fundamental movement activities. Addressing the challenges (e.g., low video quality, content loss, high levels of contradiction, and limited mobility assessment) faced by existing Extended Reality (XR) systems as vision aids, we introduce a groundbreaking method that enriches real-time navigation using Augmented Reality (AR) glasses. Our novel vision aid employs advanced video processing techniques to enhance visual perception in individuals with moderate to severe VFL, bridging the gap to healthy vision. A unique optimal video remapping function, tailored to the characteristics of our selected AR glasses, dynamically maps live video content to the largest intact region of the Visual Field (VF) map. Our method preserves video quality, minimizing blurriness and distortion.
Through a comprehensive empirical user study involving 29 subjects with artificially induced scotomas, statistical analyses of object counting and multi-tasking walking track tests demonstrate the promising performance of our method in enhancing visual awareness and navigation capability in real time.

Publication Metadata only
HiMODE: A Hybrid Monocular Omnidirectional Depth Estimation Model (IEEE, 2022)
Junayed, Masum Shah; Sadeghzadeh, Arezoo; Islam, Md Baharul; Wong, Lai-Kuan; Aydin, Tarkan; Bahcesehir University; Multimedia University
Monocular omnidirectional depth estimation is receiving considerable research attention due to its broad applications for sensing 360-degree surroundings. Existing approaches in this field suffer from limitations in recovering small object details and data lost during the ground-truth depth map acquisition. In this paper, a novel monocular omnidirectional depth estimation model, namely HiMODE, is proposed based on a hybrid CNN+Transformer (encoder-decoder) architecture whose modules are efficiently designed to mitigate distortion and computational cost without performance degradation. Firstly, we design a feature pyramid network based on the HNet block to extract high-resolution features near the edges. The performance is further improved by a self- and cross-attention layer and spatial/temporal patches in the Transformer encoder and decoder, respectively. Besides, a spatial residual block is employed to reduce the number of parameters. By jointly passing the deep features extracted from an input image at each backbone block, along with the raw depth maps predicted by the transformer encoder-decoder, through a context adjustment layer, our model can produce resulting depth maps with better visual quality than the ground-truth. Comprehensive ablation studies demonstrate the significance of each individual module.
Extensive experiments conducted on three datasets, Stanford3D, Matterport3D, and SunCG, demonstrate that HiMODE can achieve state-of-the-art performance for 360-degree monocular depth estimation. Complete project code and supplementary materials are available at https://github.com/himode5008/HiMODE.

Publication Metadata only
Real-Time YOLO-based Heterogeneous front vehicles detection (Institute of Electrical and Electronics Engineers Inc., 2021)
Junayed, Masum Shah; Islam, Md Baharul; Sadeghzadeh, Arezoo; Aydin, Tarkan; Kilimci, Z.H.; Yildirim, T.; Piuri, V.; Czarnowski, I.; Camacho, D.; Manolopoulos, Y.; Solak, S.; Junayed, Masum Shah, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey, Daffodil International University, Dhaka, Bangladesh; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey, College of Data Science and Engineering, American University of Malta, Cospicua, Malta; Sadeghzadeh, Arezoo, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Aydin, Tarkan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
The perception of the complex road environment is a critical factor in autonomous driving, which has become the research focus in intelligent vehicles. In this paper, a real-time front vehicle detection system is proposed to ensure safe driving in a complex environment, particularly in congested megacities. This system is based on the YOLO model, which effectively detects and classifies various vehicles from both images and videos. It improves detection accuracy by modifying a feature extraction-based backbone. To the authors' best knowledge, this is the first time that vehicle detection is implemented on the recently published DhakaAI dataset. Compared to the other available datasets for object detection, such as KITTI, the DhakaAI dataset has a complex environment with numerous vehicles (21 different types).
Experimental results demonstrate that the proposed system outperforms the state-of-the-art object detectors. With this method, the mAP (mean average precision) and the FPS (frames per second) are increased by 2.97% and 1.47, 4.64% and 5.57, and 4.75% and 3.02 compared to RetinaNet, SSD, and Faster RCNN on this dataset, respectively. © 2021 Elsevier B.V., All rights reserved.

Publication Metadata only
HiMODE: A Hybrid Monocular Omnidirectional Depth Estimation Model (IEEE Computer Society, 2022)
Junayed, Masum Shah; Sadeghzadeh, Arezoo; Islam, Md Baharul; Lai-Kuan, Wong; Aydin, Tarkan; Junayed, Masum Shah, Bahçeşehir Üniversitesi, Istanbul, Turkey; Sadeghzadeh, Arezoo, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey, American University of Malta, Cospicua, Malta; Lai-Kuan, Wong, Multimedia University, Cyberjaya, Malaysia; Aydin, Tarkan, Bahçeşehir Üniversitesi, Istanbul, Turkey
Monocular omnidirectional depth estimation is receiving considerable research attention due to its broad applications for sensing 360° surroundings. Existing approaches in this field suffer from limitations in recovering small object details and data lost during the ground-truth depth map acquisition. In this paper, a novel monocular omnidirectional depth estimation model, namely HiMODE, is proposed based on a hybrid CNN+Transformer (encoder-decoder) architecture whose modules are efficiently designed to mitigate distortion and computational cost without performance degradation. Firstly, we design a feature pyramid network based on the HNet block to extract high-resolution features near the edges. The performance is further improved by a self- and cross-attention layer and spatial/temporal patches in the Transformer encoder and decoder, respectively. Besides, a spatial residual block is employed to reduce the number of parameters.
By jointly passing the deep features extracted from an input image at each backbone block, along with the raw depth maps predicted by the transformer encoder-decoder, through a context adjustment layer, our model can produce resulting depth maps with better visual quality than the ground-truth. Comprehensive ablation studies demonstrate the significance of each individual module. Extensive experiments conducted on three datasets, Stanford3D, Matterport3D, and SunCG, demonstrate that HiMODE can achieve state-of-the-art performance for 360° monocular depth estimation. Complete project code and supplementary materials are available at https://github.com/himode5008/HiMODE. © 2025 Elsevier B.V., All rights reserved.

Publication Metadata only
Hybrid CNN+Transformer for Diabetic Retinopathy Recognition and Grading (Institute of Electrical and Electronics Engineers Inc., 2023)
Sadeghzadeh, Arezoo; Junayed, Masum Shah; Aydin, Tarkan; Islam, Md Baharul; Sadeghzadeh, Arezoo, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Junayed, Masum Shah, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Aydin, Tarkan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey, Florida Gulf Coast University, Fort Myers, United States
Diabetic retinopathy (DR) is a cause of blindness when it is not treated in time. Therefore, automatic DR detection and grading systems play a significant role in early diagnosis and treatment. However, the accuracy of the existing computer-aided systems is still insufficient for clinical applications, and they need large-scale training datasets to perform well. This paper proposes a hybrid CNN+Transformer DR recognition and grading system to competitively improve performance even when directly trained on small datasets. Firstly, a deep CNN-based EfficientNet-B0 backbone is used as the feature extractor.
Then, global dependencies are drawn between the input and output by employing a Transformer encoder-decoder (TE-TD), interleaved with Multi-Head Self Attentions (MHSA) for feature encoding. It is followed by a Residual Spatial Module (RSM) to improve the performance of the model further while stabilizing the training. A prediction feed-forward network (PFFN) is used as a classifier. The effectiveness of different modules on the performance of the system and the superiority of the combined CNN and Transformer over plain individual architectures are all investigated through comprehensive ablation studies. Our approach attains high generalization by obtaining state-of-the-art performance in both recognition and grading on five different benchmark datasets, i.e., EyePACS, APTOS, DDR, Messidor-1, and Messidor-2. © 2023 Elsevier B.V., All rights reserved.

Publication Open Access
ARVA: An Augmented Reality-Based Visual Aid for Mobility Enhancement Through Real-Time Video Stream Transformation (Institute of Electrical and Electronics Engineers Inc., 2024)
Sadeghzadeh, Arezoo; Islam, Md Baharul; Uddin, Md Nur; Aydin, Tarkan; Sadeghzadeh, Arezoo, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey, Department of Computer Science and Software Engineering, Florida Gulf Coast University, Fort Myers, United States, Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh; Uddin, Md Nur, BJIT Group, Engineering Division, Dhaka, Bangladesh; Aydin, Tarkan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
Visual field loss (VFL) is a persistent visual impairment characterized by limited vision spots (scotoma) within the normal visual field, significantly impacting daily activities for affected individuals.
Current Virtual Reality (VR) and Augmented Reality (AR)-based visual aids suffer from low video quality, content loss, high levels of contradiction, and limited mobility assessment. To address these issues, we propose an innovative vision aid utilizing an AR headset and integrating advanced video processing techniques to elevate the visual perception of individuals with moderate to severe VFL to levels comparable to those with unimpaired vision. Our approach introduces a pioneering optimal video remapping function tailored to the characteristics of AR glasses. This function strategically maps the content of live video captures to the largest intact region of the visual field map, preserving quality while minimizing blurriness and content distortion. To evaluate the performance of our proposed method, a comprehensive empirical user study is conducted, including object counting and multi-tasking walking track tests and involving 15 subjects with artificially induced scotomas in their normal visual fields. The proposed vision aid achieves a 41.56 percentage-point enhancement (from 57.31% to 98.87%) in the mean value of the average object recognition rates for all subjects in the object counting test. In the walking track test, the average mean scores for obstacle avoidance, detected signs, recognized signs, and grasped objects are significantly enhanced after applying the remapping function, with improvements of 7.56 (91.10% to 98.66%), 51.81 (44.85% to 96.66%), 49.31 (43.18% to 92.49%), and 77.77 (13.33% to 91.10%) percentage points, respectively. Statistical analysis of data before and after applying the remapping function demonstrates the promising performance of our method in enhancing visual awareness and mobility for individuals with VFL.
© 2024 Elsevier B.V., All rights reserved.

Publication Metadata only
Assistive Visual Tool: Enhancing Safe Navigation with Video Remapping in AR Headsets (Springer Science and Business Media Deutschland GmbH, 2025)
Sadeghzadeh, Arezoo; Islam, Md Baharul; Uddin, Md Nur; Aydin, Tarkan; Del Bue, A.; Canton, C.; Pont-Tuset, J.; Tommasi, T.; Sadeghzadeh, Arezoo, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey, Florida Gulf Coast University, Fort Myers, United States; Uddin, Md Nur, BJIT Group, Engineering Division, Dhaka, Bangladesh; Aydin, Tarkan, Bahçeşehir Üniversitesi, Istanbul, Turkey
Visual Field Loss (VFL) is characterized by blind spots, or scotomas, that have a detrimental impact on individuals' fundamental movement activities. Addressing the challenges (e.g., low video quality, content loss, high levels of contradiction, and limited mobility assessment) faced by existing Extended Reality (XR) systems as vision aids, we introduce a groundbreaking method that enriches real-time navigation using Augmented Reality (AR) glasses. Our novel vision aid employs advanced video processing techniques to enhance visual perception in individuals with moderate to severe VFL, bridging the gap to healthy vision. A unique optimal video remapping function, tailored to the characteristics of our selected AR glasses, dynamically maps live video content to the largest intact region of the Visual Field (VF) map. Our method preserves video quality, minimizing blurriness and distortion. Through a comprehensive empirical user study involving 29 subjects with artificially induced scotomas, statistical analyses of object counting and multi-tasking walking track tests demonstrate the promising performance of our method in enhancing visual awareness and navigation capability in real time. © 2025 Elsevier B.V., All rights reserved.
