Araştırma Çıktıları | WoS | Scopus | TR-Dizin | PubMed

Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741


Search Results

Now showing 1 - 10 of 29
  • Publication
    A case study on transfer learning in convolutional neural networks, Evrişimli sinir ağlarında eğitim transferi için örnek çalışma
    (Institute of Electrical and Electronics Engineers Inc., 2018) Gürkaynak, Cahit Deniz; Arica, Nafiz; Gürkaynak, Cahit Deniz, Bahçeşehir Üniversitesi, Istanbul, Turkey; Arica, Nafiz, Bahçeşehir Üniversitesi, Istanbul, Turkey
    In this work, a case study is performed on the transfer learning approach in convolutional neural networks. Transfer learning parameters are examined on the AlexNet, VGGNet, and ResNet architectures for the marine vessel classification task on the MARVEL dataset. The results confirm that transferring the parameter values of the first layers and fine-tuning the remaining layers, whose weights are initialized from pre-trained weights, performs better than training the network from scratch. It is also observed that preprocessing and regularization improve the overall scores significantly.
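The scheme this abstract describes, transferring early-layer weights and fine-tuning only later layers, can be sketched with a toy two-layer network. This is a minimal illustration, not the paper's method; all shapes, names, and the single-layer freeze choice are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" parameters for a tiny 2-layer network.
W1_pretrained = rng.standard_normal((4, 8))   # early layer (to be frozen)
W2_pretrained = rng.standard_normal((8, 3))   # later layer (to be fine-tuned)

# Transfer: initialize the target network from the pre-trained weights.
W1 = W1_pretrained.copy()
W2 = W2_pretrained.copy()

x = rng.standard_normal((5, 4))               # a small batch of inputs
y = rng.standard_normal((5, 3))               # regression targets

lr = 0.01
for _ in range(10):
    h = np.maximum(x @ W1, 0.0)               # frozen feature extractor
    pred = h @ W2
    grad_W2 = h.T @ (pred - y) / len(x)       # gradient for the trainable layer only
    W2 -= lr * grad_W2                        # fine-tune; W1 is never updated
```

After training, the transferred layer is bit-identical to its pre-trained source while the fine-tuned layer has moved, which is exactly the contrast with training from scratch that the study evaluates.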
  • Publication
    Deep and Wide Convolutional Neural Network Model for Highly Dense Crowd
    (Institute of Electrical and Electronics Engineers Inc., 2019) Kizrak, Merve Ayyuce; Bolat, Bülent; Kizrak, Merve Ayyuce, Department of Electrical and Electronic Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Bolat, Bülent, Department of Electrical and Electronic Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
    In this study, a novel and efficient deep learning model is proposed to estimate the number of people in highly dense crowd images. We present a convolutional neural network model consisting of two parallel modules, each focusing on specific features of the images. While the general density map is derived from the lower-level features of the first module, the higher-level features of the deeper second module make it possible to identify regions of the human body, such as the head and upper body. These two modules are then concatenated with a fully connected neural network. The proposed model was tested on the ShanghaiTech Part-A dataset, with mean squared error and mean absolute error as performance metrics. Compared with recent studies on these metrics, the proposed method obtains more successful results.
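The two evaluation metrics named above are standard for crowd counting. A minimal sketch of both, on hypothetical per-image count values:

```python
import numpy as np

# Hypothetical ground-truth and predicted head counts for four images.
true_counts = np.array([512.0, 1203.0, 87.0, 2450.0])
pred_counts = np.array([530.0, 1150.0, 95.0, 2400.0])

mae = np.mean(np.abs(pred_counts - true_counts))   # mean absolute error
mse = np.mean((pred_counts - true_counts) ** 2)    # mean squared error

print(mae)  # → 32.25
print(mse)  # → 1424.25
```

MAE reflects average counting error per image, while MSE penalizes large misses on very dense images more heavily.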
  • Publication
    Turkish Movie Genre Classification from Poster Images using Convolutional Neural Networks
    (Institute of Electrical and Electronics Engineers Inc., 2019) Gözüacik, Necip; Sakar, C. Okan; Gözüacik, Necip, NETAS, Istanbul, Turkey; Sakar, C. Okan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Accessing the best-matching multimedia data is a trending topic due to the enormous demand for movies, online TV series, videos, etc. The advertising image of such multimedia content is important for conveying key information to the audience, and a movie poster may play an important role in presenting the movie genre correctly. In recent years, Convolutional Neural Networks (CNNs), as a deep learning architecture, have achieved state-of-the-art performance in many image processing and recognition applications. In this paper, we implement transfer learning and fine-tuning methods on top of the Google Inception-v3 architecture, one of the most popular CNN architectures in this domain, and present comparative results of these methods in classifying movie genre on a dataset of Turkish movie posters. The obtained results show that the fine-tuning method performs better than the pure CNN and transfer learning models on the movie genre classification task.
  • Publication
    Effects of Network Depths on Semantic Image Segmentation by Weakly Supervised Learning, Zayıf Denetimli Öğrenmeyle Semantik İmge Bölütlemede Ağ Derinliğinin Etkileri
    (Institute of Electrical and Electronics Engineers Inc., 2020) Bircanoglu, Cenk; Arica, Nafiz; Bircanoglu, Cenk, Adevinta, Paris, France; Arica, Nafiz, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Weakly supervised learning is one of the most interesting approaches, in which more complex labels are predicted by using related, simpler labels. In this study, we focus on the segmentation problem, providing only image class tags in the learning stage. We examine how the number of layers in a Convolutional Neural Network, and the usage of their outputs, affect the segmentation results. It is found that increasing the number of layers in the network has a positive effect on segmentation performance. After ResNet152 is determined to be the most successful deep architecture on the Pascal VOC2012 dataset, we construct a new architecture based on ResNet152. Experimental results show that the proposed architecture outperforms the available studies tested on this particular dataset. In addition, we observe that the early layers capture more general attributes of the object classes than the last layers, and that these attributes can better identify object boundaries.
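Deriving segmentation from image-level class tags, as this abstract describes, is commonly realized by weighting the last convolutional feature maps with the classifier weights of the tagged class (a class activation map). This is a generic sketch of that idea, not the paper's specific architecture; all shapes and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Feature maps from a network's last convolutional layer: (channels, H, W).
feature_maps = rng.standard_normal((64, 7, 7))

# Classifier weights for one image-level class tag (one weight per channel).
class_weights = rng.standard_normal(64)

# Weighted sum over channels yields a coarse heat map that localizes
# the tagged class without any pixel-level supervision.
cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))

print(cam.shape)  # → (7, 7)
```

Upsampling and thresholding such a map is one standard route from class tags to a segmentation mask.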
  • Publication
    SkNet: A Convolutional Neural Networks Based Classification Approach for Skin Cancer Classes
    (Institute of Electrical and Electronics Engineers Inc., 2020) Jeny, Afsana Ahsan; Sakib, Abu Noman Md; Junayed, Masum Shah; Lima, Khadija Akter; Ahmed, Ikhtiar; Islam, Md Baharul; Jeny, Afsana Ahsan, Department of CSE, Daffodil International University, Dhaka, Bangladesh; Sakib, Abu Noman Md, Department of Cse, Khulna University of Engineering and Technology, Khulna, Bangladesh; Junayed, Masum Shah, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Lima, Khadija Akter, Department of CSE, Daffodil International University, Dhaka, Bangladesh; Ahmed, Ikhtiar, Department of CSE, Daffodil International University, Dhaka, Bangladesh; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey, American University of Malta, Cospicua, Malta
    Skin cancer is one of the most common types of cancer, and a solution for this globally recognized health problem is much needed. Machine learning techniques have brought revolutionary changes to the field of biomedical research. Previously, detecting skin cancers took a significant amount of time and effort; in recent years, deep learning has made the process much faster and more accurate. In this paper, we propose a novel Convolutional Neural Network (CNN) based approach that can classify four different types of skin cancer. Our model, SkNet, consists of 19 convolutional layers. In previous works, the highest accuracy reported on 1,000 images was 80.52%. Our proposed model exceeds that performance, achieving an accuracy of 95.26% on a dataset of 4,800 images, the highest accuracy acquired so far.
  • Publication
    Towards Stereoscopic Video Deblurring Using Deep Convolutional Networks
    (Springer Science and Business Media Deutschland GmbH, 2021) Imani, Hassan; Islam, Md Baharul; Bebis, G.; Athitsos, V.; Yan, T.; Lau, M.; Li, F.; Shi, C.; Yuan, X.; Mousas, C.; Bruder, G.; Imani, Hassan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Stereoscopic cameras are now commonly used in daily life, for example in new smartphones and emerging technologies. The quality of stereo video can be affected by various factors (e.g., blur artifacts due to camera or object motion). Several methods have been proposed for monocular deblurring, but only limited work exists on deblurring stereo content. This paper presents a novel stereoscopic video deblurring model that considers consecutive left and right video frames. To compensate for motion in the stereoscopic video, we feed the previous and next frames to 3D CNNs, which aids further deblurring; our model also exploits information from the other stereoscopic view. Specifically, to deblur the stereo frames, the model takes the left and right stereoscopic frames and several neighboring left and right frames as inputs. After compensating for the transformation between consecutive frames, a 3D Convolutional Neural Network (CNN) is applied to the left and right batches of frames to extract their features. The model consists of modified 3D U-Net networks. To aggregate the left and right features, the Parallax Attention Module (PAM) is modified to fuse them and produce the output deblurred frames. Experimental results on the recently proposed Stereo Blur dataset show that the proposed method can effectively deblur blurry stereoscopic videos.
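Feeding "consecutive frames from the previous and next frames" to a 3D CNN, as described above, amounts to stacking a temporal window of frames into the (channels, time, height, width) layout a 3D convolution expects. A minimal sketch of that input assembly, with hypothetical shapes and a hypothetical one-frame radius:

```python
import numpy as np

rng = np.random.default_rng(2)

# One view of a stereo video: (frames, channels, H, W).
video = rng.random((10, 3, 64, 64))

def temporal_stack(frames, t, radius=1):
    """Gather frames t-radius .. t+radius and move time after channels,
    producing the (C, T, H, W) layout expected by a 3D convolution."""
    window = frames[t - radius : t + radius + 1]   # (T, C, H, W)
    return np.transpose(window, (1, 0, 2, 3))      # (C, T, H, W)

left_input = temporal_stack(video, t=5)
print(left_input.shape)  # → (3, 3, 64, 64): channels, 3 time steps, H, W
```

The same stacking would be applied to the right view, with both batches then passed to their respective 3D CNN branches.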
  • Publication
    DeepPyNet: A Deep Feature Pyramid Network for Optical Flow Estimation
    (IEEE Computer Society, 2021) Jeny, Afsana Ahsan; Islam, Md Baharul; Aydin, Tarkan; Cree, M.J.; Jeny, Afsana Ahsan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Aydin, Tarkan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Recent advances in optical flow prediction have been made possible by feature pyramids and iterative refinement. However, downsampling in feature pyramids can cause foreground items to merge with the background, and the iterative processing can introduce errors; in particular, the motion of narrow and tiny objects can become invisible in the flow scene. We introduce a novel method for optical flow estimation, DeepPyNet, which includes a feature extractor, a multi-channel cost volume, and a flow decoder. In this method, we propose a deep recurrent feature pyramid-based network for end-to-end optical flow estimation. Feature extraction from each pixel of the feature map preserves essential information without modifying the feature receptive field. A multi-scale 4D correlation volume is then built from the visual similarity of each pair of pixels. Finally, we utilize the multi-scale correlation volumes to continuously update the flow field through an iterative recurrent method. Experimental results demonstrate that DeepPyNet significantly reduces flow errors and provides state-of-the-art performance on various datasets. Moreover, DeepPyNet is less complex, using only 6.1M parameters, 81% and 35% fewer than the popular FlowNet and PWC-Net+, respectively.
  • Publication
    Deep Covariance Feature and CNN-based End-to-End Masked Face Recognition
    (Institute of Electrical and Electronics Engineers Inc., 2021) Junayed, Masum Shah; Sadeghzadeh, Arezoo; Islam, Md Baharul; Struc, V.; Ivanovska, M.; Junayed, Masum Shah, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Sadeghzadeh, Arezoo, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey, College of Data Science and Engineering, American University of Malta, Cospicua, Malta
    With the emergence of the global COVID-19 epidemic, face recognition systems have attracted much attention as contactless identity verification methods. However, covering a considerable part of the face with a mask poses severe challenges for conventional face recognition systems. This paper proposes an automated Masked Face Recognition (MFR) system based on the combination of a mask-occlusion-discarding technique and a deep learning model. Initially, a pre-processing step is carried out in which the images pass through three filters. Then, a Convolutional Neural Network (CNN) model is proposed to extract features from the unoccluded regions of the faces (i.e., eyes and forehead). These feature maps are employed to obtain covariance-based features. Two extra layers, i.e., Bitmap and Eigenvalue, are designed to reduce the dimension of and concatenate these covariance feature matrices. The deep covariance features are quantized into codebooks, which are combined based on the Bag-of-Features (BoF) paradigm. Finally, a global histogram is created from these codebooks and used to train an SVM classifier. The proposed method is trained and evaluated on the Real-World Masked Face Recognition Dataset (RMFRD) and the Simulated Masked Face Recognition Dataset (SMFRD), achieving accuracies of 95.07% and 92.32%, respectively, and showing competitive performance compared to the state-of-the-art. Experimental results prove that our system is highly robust against noisy data and illumination variations.
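The quantize-then-histogram step of the Bag-of-Features paradigm mentioned above can be sketched generically: assign each local feature vector to its nearest codeword and count assignments into a normalized global histogram. The codebook here is random and purely illustrative; the paper builds its codebooks from deep covariance features.

```python
import numpy as np

rng = np.random.default_rng(3)

features = rng.standard_normal((200, 16))   # 200 local feature vectors
codebook = rng.standard_normal((32, 16))    # 32 codewords (hypothetical)

# Assign each feature to its nearest codeword (Euclidean distance).
dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
assignments = dists.argmin(axis=1)

# Global histogram over codewords, normalized to sum to 1.
hist = np.bincount(assignments, minlength=len(codebook)).astype(float)
hist /= hist.sum()

print(hist.shape)  # → (32,)
```

A fixed-length histogram like this is what makes a variable number of local features usable as input to an SVM.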
  • Publication
    Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module
    (Springer Science and Business Media Deutschland GmbH, 2022) Imani, Hassan; Zaim, Selim; Islam, Md Baharul; Junayed, Masum Shah; Durakbasa, N.M.; Gençyılmaz, M.G.; Imani, Hassan, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey; Zaim, Selim, Faculty of Engineering and Natural Sciences, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey; Junayed, Masum Shah, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey
    Deep learning techniques are utilized for most computer vision tasks; in particular, Convolutional Neural Networks (CNNs) have shown great performance in detection and classification tasks. Recently, in the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have been used to extract spatial and temporal features from stereoscopic videos, but the disparity information, which is very important, has not been well considered. Most recently proposed deep learning-based methods use cost volumes to establish stereo correspondence for large disparities. Because disparities can differ considerably between stereo cameras with different configurations, the Parallax Attention Mechanism (PAM) was recently proposed to capture stereo correspondence regardless of disparity changes. In this paper, we propose a new SVQA model using a base 3D CNN network and a modified PAM-based left-right feature fusion model. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM model to fuse the left and right features while considering the disparity information, and we calculate the quality score of a stereoscopic video with several fully connected layers. We divide the input videos into cube patches for data augmentation and remove from the training dataset some cubes that confuse our model. Two standard stereoscopic video quality assessment benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test our model. Experimental results indicate that the proposed model is very competitive with the state-of-the-art on the NAMA3DS1-COSPAD1 dataset and is the state-of-the-art method on the LFOVIAS3DPh2 dataset.
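The core idea behind parallax attention, as used in the two PAM-based entries above, is that stereo correspondence lies along horizontal lines: each position in the left view attends over all columns of the same row in the right view, regardless of the actual disparity. A minimal generic sketch with hypothetical feature shapes (not the papers' modified module):

```python
import numpy as np

rng = np.random.default_rng(4)
H, W, C = 8, 16, 32
left = rng.standard_normal((H, W, C))    # left-view features
right = rng.standard_normal((H, W, C))   # right-view features

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Row-wise attention: for every left position (h, w), a score against
# every right-view column v in the same row h.
scores = np.einsum('hwc,hvc->hwv', left, right) / np.sqrt(C)
attention = softmax(scores, axis=-1)          # (H, W_left, W_right)

# Fuse: each left position gathers right-view features it attends to.
fused = np.einsum('hwv,hvc->hwc', attention, right)

print(fused.shape)  # → (8, 16, 32)
```

Because the attention spans the full row width, no maximum disparity needs to be fixed in advance, which is the property the abstract highlights.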
  • Publication
    A Scoring Method for Interpretability of Concepts in Convolutional Neural Networks, Evrişimsel Sinir Ağlarında Kavram Yorumlama için bir Puanlama Yöntemi
    (Institute of Electrical and Electronics Engineers Inc., 2022) Gurkan, Mustafa Kağan; Arica, Nafiz; Vural, Fatos T.Yarman; Gurkan, Mustafa Kağan, Bahçeşehir Üniversitesi, Istanbul, Turkey; Arica, Nafiz, Bahçeşehir Üniversitesi, Istanbul, Turkey; Vural, Fatos T.Yarman, Middle East Technical University (METU), Ankara, Turkey
    In this paper, we propose a scoring algorithm for measuring the interpretability of CNN models, focusing on the feature extraction performed at the convolutional layers. The proposed approach is based on the principle of concept analysis for a predefined list of concepts. A map of the network is created based on its responsiveness to each concept. Once this map is ready, various images can be applied as inputs and matched with the concepts whose hidden nodes are highly activated. Finally, the evaluation algorithm uses these descriptions during the final prediction and provides human-understandable explanations.
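The map-then-match procedure described above can be sketched generically: record how hidden units respond to each predefined concept, then describe a new input by the concepts whose response profile best matches its activations. This is an illustrative simplification, not the paper's scoring algorithm; the concept names, unit count, and cosine-similarity matching are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(5)
concepts = ['stripes', 'fur', 'wheel', 'sky']

# "Concept map": mean activation of 10 hidden units on probe images of
# each concept (built once, offline).
concept_map = {name: rng.random(10) for name in concepts}

def top_concepts(activations, concept_map, k=2):
    """Rank concepts by cosine similarity with the input's activations."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(concept_map,
                    key=lambda name: cos(activations, concept_map[name]),
                    reverse=True)
    return ranked[:k]

# A new input's hidden activations are matched against the map.
new_activations = rng.random(10)
best = top_concepts(new_activations, concept_map)
print(best)  # the two concepts whose profiles most resemble the input
```

The returned concept names are the kind of human-understandable description the evaluation step could attach to a prediction.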