Research Outputs | WoS | Scopus | TR-Dizin | PubMed

Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741

Search Results

Now showing 1 - 3 of 3
  • Publication
    Masked Face Inpainting Through Residual Attention UNet
    (Institute of Electrical and Electronics Engineers Inc., 2022) Hosen, Md Imran; Islam, Md Baharul; Hosen, Md Imran, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Realistic restoration of images with high-texture areas, such as removing face masks, is challenging. State-of-the-art deep learning-based methods fail to guarantee high fidelity, suffer from training instability due to the vanishing-gradient problem (i.e., weights in the initial layers are updated only slightly), and lose spatial information. They also depend on an intermediary stage such as segmentation, meaning they require an external mask. This paper proposes a blind masked-face inpainting method using a residual attention UNet to remove the face mask and restore the face with fine details while minimizing the gap from the ground-truth face structure. A residual block feeds information both to the next layer and directly into layers about two hops away, mitigating the vanishing-gradient problem. Additionally, the attention unit helps the model focus on the relevant mask region, reducing resource usage and making the model faster. Extensive experiments on the publicly available CelebA dataset show the feasibility and robustness of the proposed model. © 2022 Elsevier B.V., All rights reserved.
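The two architectural ideas in this abstract, an identity shortcut that skips about two layers and a soft attention gate over the feature map, can be sketched in a few lines. This is a minimal NumPy illustration of the general technique, not the paper's implementation: `conv_like`, the weight matrices, and the sigmoid gate are stand-ins chosen here for brevity.

```python
import numpy as np

def conv_like(x, w):
    # Stand-in for a convolutional layer with ReLU activation
    # (the paper uses real convolutions; this is only illustrative).
    return np.maximum(0.0, x @ w)

def residual_block(x, w1, w2):
    # Two "conv" layers; the input skips over both and is added to the
    # output, so gradients also flow directly back to earlier layers,
    # mitigating the vanishing-gradient problem the abstract mentions.
    h = conv_like(x, w1)
    h = h @ w2
    return h + x  # identity shortcut about two hops away

def attention_gate(features, score_w):
    # Soft attention: a sigmoid produces per-element weights in (0, 1)
    # that let the model emphasize the relevant (masked) region.
    weights = 1.0 / (1.0 + np.exp(-(features @ score_w)))
    return features * weights

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))          # a tiny fake feature map
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
sw = rng.standard_normal((8, 8)) * 0.1

out = attention_gate(residual_block(x, w1, w2), sw)
print(out.shape)  # (4, 8)
```

Note how the shortcut guarantees that if the learned branch contributes nothing (all-zero weights), the block still passes its input through unchanged, which is what keeps gradients alive in early layers.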
  • Publication
    Emotion, Age and Gender Prediction Through Masked Face Inpainting
    (Springer Science and Business Media Deutschland GmbH, 2023) Islam, Md Baharul; Hosen, Md Imran; Rousseau, J.-J.; Kapralos, B.; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey, College of Data Science and Engineering, American University of Malta, Cospicua, Malta; Hosen, Md Imran, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Predicting gestures and demographic information from the face is complex and challenging, particularly for masked faces. This paper proposes a deep learning-based integrated approach to predict emotion and demographic information for both unmasked and masked faces, consisting of four sub-tasks: masked face detection, masked face inpainting, emotion prediction, and age and gender prediction. The masked face detector module makes a binary decision on whether a face mask is present by applying a pre-trained MobileNetV3. We use an inpainting module based on a U-Net with ImageNet-pretrained weights to remove the face mask and restore the face. We use convolutional neural networks to predict emotion (e.g., happy, angry). Additionally, VGGFace-based transfer learning is used to predict demographic information (e.g., age, gender). Extensive experiments on five publicly available datasets (AffectNet, UTKFace, FER-2013, CelebA, and MAFA) show the effectiveness of the proposed method in predicting emotion and demographic information through masked face reconstruction. © 2023 Elsevier B.V., All rights reserved.
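The four sub-tasks form a simple conditional pipeline: detect the mask, inpaint only if one is present, then run the emotion and age/gender predictors on the (restored) face. The sketch below shows that control flow with stand-in stubs; the function names and their fixed return values are hypothetical placeholders, not the paper's API (in the paper each stage is a trained model: MobileNetV3, a U-Net, a CNN, and VGGFace transfer learning).

```python
def detect_mask(image):
    # Stand-in for the pre-trained MobileNetV3 binary classifier.
    return image.get("masked", False)

def inpaint(image):
    # Stand-in for the U-Net-based mask removal and face restoration.
    restored = dict(image)
    restored["masked"] = False
    return restored

def predict_emotion(image):
    return "happy"  # stand-in for the CNN emotion classifier

def predict_age_gender(image):
    return 30, "female"  # stand-in for VGGFace-based transfer learning

def analyze_face(image):
    # Four-stage pipeline: detection -> (optional) inpainting ->
    # emotion prediction -> age/gender prediction.
    if detect_mask(image):
        image = inpaint(image)
    emotion = predict_emotion(image)
    age, gender = predict_age_gender(image)
    return emotion, age, gender

print(analyze_face({"masked": True}))  # ('happy', 30, 'female')
```

The key design point is that inpainting is gated by the detector, so unmasked faces skip the restoration stage entirely and go straight to prediction.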
  • Publication
    Efficient Object Detection Model for Edge Devices
    (Springer Science and Business Media Deutschland GmbH, 2024) Imani, Hassan; Hosen, Md Imran; Feryad, Vahit; Akyol, Ali; Ortis, A.; Hameed, A.A.; Jamil, A.; Imani, Hassan, Cozum Makina, Istanbul, Turkey; Hosen, Md Imran, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Feryad, Vahit, Cozum Makina, Istanbul, Turkey; Akyol, Ali, Cozum Makina, Istanbul, Turkey
    Deep learning-based object detection methods have demonstrated promising results. In practice, however, most methods struggle on edge devices due to their extensive network architectures and low inference speed. Additionally, existing person, helmet, and head detection datasets lack industrial scenarios. This research presents an efficient tiny network (ETN) for object detection that runs on edge devices with high inference speed. We take the YOLOv5s model as our base model, compress it to minimize computational redundancy, and propose two lightweight C3 modules (MC3 and SC3). Additionally, we construct two novel datasets, H2 (safety helmets and heads) and Person104K (persons), that fill the gaps in earlier datasets with various industrial scenarios. We implemented and tested our method on the Person104K and H2 datasets and achieved about 50.6% higher inference speed than the original YOLOv5s without compromising accuracy. On the Nvidia Jetson AGX edge device, ETN achieves 42% higher FPS compared to the original YOLOv5s. Code is available at https://github.com/mdhosen/ETN. © 2024 Elsevier B.V., All rights reserved.
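The reported speedups ("50.6% higher inference speed", "42% higher FPS") are relative improvements over the YOLOv5s baseline. A small helper makes the metric explicit; the concrete FPS numbers below are hypothetical, chosen only so the ratio matches the paper's 42% Jetson AGX figure.

```python
def relative_speedup(fps_base, fps_new):
    # Percentage improvement in inference speed (frames per second)
    # of a compressed model over its baseline.
    return 100.0 * (fps_new - fps_base) / fps_base

# Hypothetical illustration: if the baseline YOLOv5s ran at 50 FPS,
# a 42% improvement would mean the ETN runs at 71 FPS.
print(relative_speedup(50.0, 71.0))  # 42.0
```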