Research Outputs | WoS | Scopus | TR-Dizin | PubMed
Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741
Search Results
5 results
Publication (Metadata only)
An Effective Multi-Camera Dataset and Hybrid Feature Matcher for Real-Time Video Stitching (IEEE Computer Society, 2021)
Authors: Hosen, Md Imran; Islam, Md Baharul; Sadeghzadeh, Arezoo (Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey); Cree, M.J.
Multi-camera video stitching combines several videos captured by different cameras into a single video with a wide field of view (FOV). In this paper, a novel dataset is developed for video stitching, consisting of 30 video sets captured by four static cameras in various environmental scenarios. A new video stitching method is then proposed, based on a hybrid matcher, for stitching four videos with over 200° FOV. Keypoints and descriptors are obtained with the scale-invariant feature transform (SIFT) and Root-SIFT, respectively. These keypoint descriptors are then matched with a hybrid matcher, a combination of the Brute-Force (BF) and Fast Library for Approximate Nearest Neighbors (FLANN) matchers. After geometric verification and elimination of outlier matches, a one-time homography is estimated with Random Sample Consensus (RANSAC). The proposed method is implemented and evaluated in different indoor and outdoor video settings. Experimental results demonstrate the capability, high accuracy, and robustness of the proposed method.
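The matching pipeline described above (SIFT keypoints, Root-SIFT descriptors, BF/FLANN matching, RANSAC homography) can be illustrated with a minimal numpy sketch of two of its steps. This is not the authors' implementation: the ratio threshold is an assumed conventional value, and the full pipeline would run on real SIFT descriptors and feed the surviving matches to a RANSAC homography estimator.

```python
import numpy as np

def root_sift(descriptors, eps=1e-7):
    """Convert SIFT descriptors to Root-SIFT: L1-normalize each
    descriptor, then take the element-wise square root."""
    d = descriptors / (np.abs(descriptors).sum(axis=1, keepdims=True) + eps)
    return np.sqrt(d)

def ratio_test_match(desc_a, desc_b, ratio=0.75):
    """Brute-force nearest-neighbour matching with a ratio test.
    Returns index pairs (i, j) of putative matches a[i] <-> b[j]."""
    # pairwise Euclidean distances between the two descriptor sets
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        nn = np.argsort(row)[:2]  # two nearest neighbours in desc_b
        # keep the match only if the best distance is clearly smaller
        # than the second best (Lowe's ratio test)
        if row[nn[0]] < ratio * row[nn[1]]:
            matches.append((i, int(nn[0])))
    return matches
```

In the paper's pipeline the surviving matches would then be geometrically verified and passed to a RANSAC-based homography estimation (e.g. OpenCV's `cv2.findHomography` with the `cv2.RANSAC` flag); FLANN would replace the exhaustive distance matrix for speed on large descriptor sets.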
Publication (Metadata only)
Masked Face Inpainting Through Residual Attention UNet (Institute of Electrical and Electronics Engineers Inc., 2022)
Authors: Hosen, Md Imran (Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey); Islam, Md Baharul (Bahçeşehir Üniversitesi, Istanbul, Turkey)
Realistic restoration of images with high-texture areas, such as removing face masks, is challenging. State-of-the-art deep learning-based methods fail to guarantee high fidelity, suffer training instability due to vanishing gradients (weights in the initial layers are updated only slightly), and lose spatial information. They also depend on an intermediary stage such as segmentation, meaning they require an external mask. This paper proposes a blind masked-face inpainting method using a residual attention UNet that removes the face mask and restores the face with fine details, while minimizing the gap with the ground-truth face structure. A residual block feeds information both to the next layer and directly into layers about two hops away, mitigating the vanishing-gradient problem. Besides, the attention unit helps the model focus on the relevant mask region, reducing resource usage and making the model faster. Extensive experiments on the publicly available CelebA dataset show the feasibility and robustness of the proposed model.

Publication (Metadata only)
Single Image Super-Resolution Using Inverted Residual and Channel-Wise Attention (Institute of Electrical and Electronics Engineers Inc., 2022)
Authors: Hosen, Md Imran (Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey); Islam, Md Baharul (Bahçeşehir Üniversitesi, Istanbul, Turkey; American University of Malta, Cospicua, Malta)
Single-image super-resolution (SISR) is the task of reconstructing a high-resolution image from a low-resolution image.
Convolutional neural network (CNN)-based SISR techniques have demonstrated promising results. However, most CNN-based models cannot discriminate between different forms of information and treat them identically, which limits their ability to represent information. Moreover, as a network's depth increases, long-term information from earlier layers is more likely to degrade in later layers, leading to poor SR performance. This research presents a single-image super-resolution strategy employing an inverted residual connection with channel-wise attention (IRCA) to preserve meaningful information and retain long-term features while balancing performance and computational cost. The inverted residual block achieves long-term information persistence with fewer parameters than traditional residual networks. Meanwhile, by explicitly modeling inter-dependencies between channels, the attention block progressively adjusts channel-wise feature responses, enhancing essential information and suppressing unnecessary information. The efficacy of the proposed approach is demonstrated on three publicly available datasets. Code is available at https://github.com/mdhosen/SISR_IResBlock

Publication (Metadata only)
Emotion, Age and Gender Prediction Through Masked Face Inpainting (Springer Science and Business Media Deutschland GmbH, 2023)
Authors: Islam, Md Baharul (Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; College of Data Science and Engineering, American University of Malta, Cospicua, Malta); Hosen, Md Imran (Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey); Rousseau, J.-J.; Kapralos, B.
Predicting gesture and demographic information from the face is complex and challenging, particularly for masked faces.
This paper proposes a deep learning-based integrated approach to predict emotion and demographic information for unmasked and masked faces, consisting of four sub-tasks: masked face detection; masked face inpainting; and emotion, age, and gender prediction. The masked face detector module provides a binary decision on whether a face mask is present by applying a pre-trained MobileNetV3. An inpainting module based on a U-Net embedding with ImageNet weights removes the face mask and restores the face. Convolutional neural networks predict emotion (e.g., happy, angry), while VGGFace-based transfer learning predicts demographic information (e.g., age, gender). Extensive experiments on five publicly available datasets (AffectNet, UTKFace, FER-2013, CelebA, and MAFA) show the effectiveness of the proposed method for predicting emotion and demographic information through masked face reconstruction.

Publication (Metadata only)
Efficient Object Detection Model for Edge Devices (Springer Science and Business Media Deutschland GmbH, 2024)
Authors: Imani, Hassan (Cozum Makina, Istanbul, Turkey); Hosen, Md Imran (Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey); Feryad, Vahit (Cozum Makina, Istanbul, Turkey); Akyol, Ali (Cozum Makina, Istanbul, Turkey); Ortis, A.; Hameed, A.A.; Jamil, A.
Deep learning-based object detection methods have demonstrated promising results. In practice, however, most methods struggle on edge devices due to their extensive network architectures and low inference speed. Additionally, existing person, helmet, and head detection datasets lack industrial scenarios. This research presents an efficient tiny network (ETN) for object detection that can run on edge devices with high inference speed, taking the YOLOv5s model as the base.
The YOLOv5s object detection model is compressed to minimize computational redundancy, and two lightweight C3 modules (MC3 and SC3) are proposed. Additionally, two novel datasets are constructed: H2 (safety helmets and heads) and Person104K (persons), filling gaps in earlier datasets with various industrial scenarios. The method is implemented and tested on the Person104K and H2 datasets, achieving about 50.6% higher inference speed than the original YOLOv5s without compromising accuracy. On the Nvidia Jetson AGX edge device, ETN achieves 42% higher FPS than the original YOLOv5s. Code is available at https://github.com/mdhosen/ETN.
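The internals of the MC3 and SC3 modules are not given in this listing. As a generic illustration of why lightweight convolution modules shrink edge-device detectors, the sketch below compares the parameter count of a standard convolution against a depthwise-separable replacement, a common building block of such modules; the channel sizes are hypothetical, not taken from ETN.

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise
    convolution: c_in*k*k depthwise weights + c_in*c_out pointwise."""
    return c_in * k * k + c_in * c_out

# Hypothetical layer: 3x3 convolution, 128 -> 128 channels
std = conv_params(128, 128, 3)          # 128*128*9  = 147456
dws = dw_separable_params(128, 128, 3)  # 1152 + 16384 = 17536
reduction = std / dws                   # roughly 8.4x fewer parameters
```

Real module designs trade some of this saving back for accuracy (extra branches, attention, activations), which is why papers like this one report measured FPS on target hardware rather than parameter counts alone.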
