Research Outputs | WoS | Scopus | TR-Dizin | PubMed
Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741
Search Results
31 results
Publication (Metadata only)
Towards Stereoscopic Video Deblurring Using Deep Convolutional Networks (Springer Science and Business Media Deutschland GmbH, 2021)
Imani, Hassan; Islam, Md Baharul; Bebis, G.; Athitsos, V.; Yan, T.; Lau, M.; Li, F.; Shi, C.; Yuan, X.; Mousas, C.; Bruder, G.
Imani, Hassan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
Stereoscopic cameras are now common in daily life, from new smartphones to other emerging technologies. The quality of stereo video can be degraded by various factors, such as blur artifacts caused by camera or object motion. Several methods have been proposed for monocular deblurring, but only a limited number address stereo content. This paper presents a novel stereoscopic video deblurring model that considers consecutive left and right video frames. To compensate for motion in stereoscopic video, consecutive previous and next frames are fed to 3D CNNs, which aids further deblurring; the model also exploits information from the other stereoscopic view. Specifically, to deblur the stereo frames, the model takes the left and right stereoscopic frames and several neighboring left and right frames as inputs. After compensating for the transformation between consecutive frames, a 3D Convolutional Neural Network (CNN) is applied to the left and right batches of frames to extract their features. The model consists of modified 3D U-Net networks. To aggregate the left and right features, the Parallax Attention Module (PAM) is modified to fuse them and produce the deblurred output frames. Experimental results on the recently proposed Stereo Blur dataset show that the method effectively deblurs blurry stereoscopic videos.
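A minimal sketch of the kind of 3D-CNN block this abstract describes, applied to a temporal window of frames from each view. All shapes, channel counts, and layer depths here are illustrative assumptions, not the authors' architecture:

```python
# Sketch only: a 3D CNN over a stack of neighboring frames, as the abstract
# describes for motion-aware feature extraction. Sizes are invented.
import torch
import torch.nn as nn

class TemporalFeatureExtractor3D(nn.Module):
    def __init__(self, in_ch=3, feat_ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(in_ch, feat_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(feat_ch, feat_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, frames):
        # frames: (batch, channels, time, height, width), e.g. the center
        # frame plus its previous/next neighbors stacked along the time axis.
        return self.net(frames)

left = torch.randn(1, 3, 5, 64, 64)   # 5 consecutive left-view frames
right = torch.randn(1, 3, 5, 64, 64)  # 5 consecutive right-view frames
extractor = TemporalFeatureExtractor3D()
f_left, f_right = extractor(left), extractor(right)
print(f_left.shape)  # torch.Size([1, 32, 5, 64, 64])
```

The point of the 3D kernels is that convolution mixes information across the time axis as well as space, so motion between neighboring frames informs the features used for deblurring.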
Publication (Metadata only)
DeepPyNet: A Deep Feature Pyramid Network for Optical Flow Estimation (IEEE Computer Society, 2021)
Jeny, Afsana Ahsan; Islam, Md Baharul; Aydin, Tarkan; Cree, M.J.
Jeny, Afsana Ahsan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Aydin, Tarkan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey
Recent advances in optical flow prediction have been driven by feature pyramids and iterative refinement. However, downsampling in feature pyramids can cause foreground objects to merge with the background, and iterative processing can propagate errors; in particular, the motion of narrow and tiny objects can become invisible in the estimated flow. We introduce DeepPyNet, a novel method for optical flow estimation comprising a feature extractor, a multi-channel cost volume, and a flow decoder. We propose a deep recurrent feature pyramid-based network for end-to-end optical flow estimation. Feature extraction from each pixel of the feature map preserves essential information without modifying the feature receptive field. A multi-scale 4D correlation volume is then built from the visual similarity of each pair of pixels. Finally, the multi-scale correlation volumes are used to continuously update the flow field through an iterative recurrent method. Experimental results demonstrate that DeepPyNet substantially reduces flow errors and achieves state-of-the-art performance on various datasets. Moreover, DeepPyNet is less complex, using only 6.1M parameters: 81% and 35% fewer than the popular FlowNet and PWC-Net+, respectively.
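The all-pairs 4D correlation volume is the most reusable idea here. Below is a generic construction in the spirit the abstract describes; DeepPyNet's exact volume may differ, and a multi-scale pyramid is typically obtained afterwards by average-pooling the last two dimensions:

```python
# Generic all-pairs correlation volume: the dot-product similarity of every
# pixel in frame 1 against every pixel in frame 2. Not the authors' code.
import torch

def correlation_volume(feat1, feat2):
    # feat1, feat2: (batch, channels, H, W) feature maps of two frames.
    b, c, h, w = feat1.shape
    f1 = feat1.view(b, c, h * w)                 # (b, c, HW)
    f2 = feat2.view(b, c, h * w)                 # (b, c, HW)
    corr = torch.einsum('bci,bcj->bij', f1, f2)  # similarity of every pixel pair
    return corr.view(b, h, w, h, w) / c ** 0.5   # 4D volume, scaled for stability

f1, f2 = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
vol = correlation_volume(f1, f2)
print(vol.shape)  # torch.Size([1, 32, 32, 32, 32])
```

An iterative decoder then repeatedly looks up this volume around the current flow estimate and emits a flow update, which is how the recurrent refinement loop avoids recomputing matching costs.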
Publication (Metadata only)
Deep Covariance Feature and CNN-based End-to-End Masked Face Recognition (Institute of Electrical and Electronics Engineers Inc., 2021)
Junayed, Masum Shah; Sadeghzadeh, Arezoo; Islam, Md Baharul; Struc, V.; Ivanovska, M.
Junayed, Masum Shah, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Sadeghzadeh, Arezoo, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey, and College of Data Science and Engineering, American University of Malta, Cospicua, Malta
With the emergence of the global COVID-19 epidemic, face recognition systems have attracted much attention as contactless identity verification methods. However, masks covering a considerable part of the face pose severe challenges for conventional face recognition systems. This paper proposes an automated Masked Face Recognition (MFR) system that combines a mask-occlusion discarding technique with a deep learning model. Initially, a pre-processing step passes the images through three filters. A Convolutional Neural Network (CNN) model is then proposed to extract features from the unoccluded regions of the face (i.e., the eyes and forehead). These feature maps are used to obtain covariance-based features. Two extra layers, Bitmap and Eigenvalue, are designed to reduce the dimension of these covariance feature matrices and concatenate them. The deep covariance features are quantized into codebooks that are combined under the Bag-of-Features (BoF) paradigm. Finally, a global histogram is created from these codebooks and used to train an SVM classifier. The proposed method is trained and evaluated on the Real-World Masked Face Recognition Dataset (RMFRD) and the Simulated Masked Face Recognition Dataset (SMFRD), achieving accuracies of 95.07% and 92.32%, respectively, which is competitive with the state-of-the-art. Experimental results show that the system is highly robust to noisy data and illumination variations.
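The covariance-feature step can be sketched generically: channel covariance of the CNN feature maps, then an eigenvalue reduction loosely mirroring the paper's Eigenvalue layer. This is our reading under stated assumptions, not the published code:

```python
# Illustrative covariance descriptor from CNN feature maps, followed by an
# eigenvalue-based dimension reduction. Shapes are placeholder choices.
import torch

def covariance_descriptor(feat):
    # feat: (channels, H, W) feature maps from the unoccluded face regions.
    c, h, w = feat.shape
    x = feat.view(c, h * w)
    x = x - x.mean(dim=1, keepdim=True)      # center each channel
    return x @ x.t() / (h * w - 1)           # (c, c) channel covariance matrix

feat = torch.randn(64, 28, 28)
cov = covariance_descriptor(feat)
eigvals = torch.linalg.eigvalsh(cov)         # compact spectrum of the descriptor
print(cov.shape, eigvals.shape)              # torch.Size([64, 64]) torch.Size([64])
```

The appeal of a covariance descriptor is that it summarizes second-order statistics of the features in a fixed-size matrix regardless of spatial resolution, which suits the subsequent codebook quantization.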
Publication (Metadata only)
Physical layer authentication for extending battery life (Elsevier B.V., 2021)
Ayyildiz, Cem; Cetin, Ramazan; Khodzhaev, Zulfidin; Kocak, Taskin; Gelal, Ece; Güngör, Vehbi Çağrı; Kurt, Güneş Karabulut
Ayyildiz, Cem, Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey, and GOHM Electronics, Istanbul, Turkey; Cetin, Ramazan, GOHM Electronics, Istanbul, Turkey; Khodzhaev, Zulfidin, Department of Physics, Oklahoma State University, Stillwater, United States; Kocak, Taskin, Department of Electrical Engineering, University of New Orleans Dr. Robert A. Savoie College of Engineering, New Orleans, United States; Gelal, Ece, Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Güngör, Vehbi Çağrı, Department of Computer Engineering, Abdullah Gül Üniversitesi, Kayseri, Turkey; Kurt, Güneş Karabulut, Centre de Recherche Poly-Grames, Montreal, Canada
Increasing population density in cities and the growing demand for efficient resource usage call for architectures that enable smart cities, such as the Internet of Things (IoT). In most such scenarios, the data generated by IoT sensors is not confidential, but its integrity is critical. Data integrity can be achieved through certification mechanisms that provide cryptographic message authentication protocols. However, this requires relatively expensive components for storing and processing the encryption key on the sensor, and it consumes more power during processing and transmission, which leads to security being forgone in cost-sensitive deployments. In this paper, we propose a security solution that provides data integrity without draining the batteries of IoT sensors. Our solution consists of (i) distinguishing legitimate sensors by exploiting the impurities formed in their transceiver components during manufacturing, and (ii) eliminating the complex components that carry out cryptography, along with redundant packet header fields, thereby saving power. A testbed implementation of the proposed solution yields power measurements indicating an estimated 2.52x improvement in battery life without compromising the integrity of communications, in addition to increased spectral efficiency and a lower overall IoT device cost.
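As a toy illustration of the fingerprinting idea, assuming hardware impairments (for example, carrier frequency offset and I/Q imbalance) have already been estimated from received packets, a plain classifier can separate devices by their manufacturing fingerprints. The features, noise levels, and classifier below are invented for the example and are not the paper's system:

```python
# Toy physical-layer authentication: each device has a stable impairment
# fingerprint (mean), blurred by channel noise; a classifier identifies it.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
fingerprints = rng.normal(size=(3, 2))     # [cfo, iq_imbalance] per device (hypothetical)
X = np.vstack([fp + 0.05 * rng.normal(size=(200, 2)) for fp in fingerprints])
y = np.repeat(np.arange(3), 200)           # device labels for 200 packets each

clf = LogisticRegression(max_iter=1000).fit(X, y)
probe = fingerprints[1] + 0.05 * rng.normal(size=2)   # packet from device 1
print(clf.predict(probe.reshape(1, -1)))              # expected: [1]
```

Because the fingerprint comes from the radio hardware itself, no key storage or cryptographic computation is needed on the sensor, which is where the battery savings come from.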
Publication (Metadata only)
Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module (Springer Science and Business Media Deutschland GmbH, 2022)
Imani, Hassan; Zaim, Selim; Islam, Md Baharul; Junayed, Masum Shah; Durakbasa, N.M.; Gençyılmaz, M.G.
Imani, Hassan, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey; Zaim, Selim, Faculty of Engineering and Natural Sciences, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey; Junayed, Masum Shah, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey
Deep learning techniques are used for most computer vision tasks; in particular, Convolutional Neural Networks (CNNs) have shown strong performance in detection and classification. In Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have recently been used to extract spatial and temporal features from stereoscopic videos, but disparity information, despite its importance, has not been well considered. Most recently proposed deep learning-based methods use cost volumes to establish stereo correspondence over large disparities. Because disparities can vary considerably across stereo camera configurations, the Parallax Attention Mechanism (PAM) was recently proposed to capture stereo correspondence regardless of disparity changes. In this paper, we propose a new SVQA model built on a base 3D CNN network and a modified PAM-based fusion of left and right features. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM to fuse the left and right features while accounting for disparity information, and fully connected layers compute the quality score of the stereoscopic video. We divide the input videos into cube patches for data augmentation and remove cubes that confuse the model from the training dataset. Two standard SVQA benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test the model. Experimental results indicate that the proposed model is highly competitive with the state-of-the-art on NAMA3DS1-COSPAD1 and sets the state-of-the-art on LFOVIAS3DPh2.

Publication (Open Access)
Three-Stream 3D deep CNN for no-Reference stereoscopic video quality assessment (Elsevier B.V., 2022)
Imani, Hassan; Islam, Md Baharul; Arica, Nafiz
Imani, Hassan, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey; Arica, Nafiz, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey
Convolutional Neural Networks (CNNs) have achieved great success in computer vision tasks, particularly 3D CNNs for extracting spatio-temporal features from videos. However, 3D CNNs have not been well examined for Stereoscopic Video Quality Assessment (SVQA). To the best of our knowledge, most state-of-the-art SVQA methods rely on traditional hand-crafted feature extraction; the few that use deep learning consider only spatial information, ignoring disparity and motion. In this paper, we propose a No-Reference (NR) deep 3D CNN architecture that jointly exploits spatial, disparity, and temporal information between consecutive frames. A 3-Stream 3D CNN, shortly 3S-3DCNN, extracts features from the spatial, motion, and depth channels to estimate a stereo video's quality, capturing quality degradations along multiple dimensions. First, the scene flow, the joint prediction of optical flow and stereo disparity, is calculated. The spatial information, optical flow, and disparity map of a given video are then fed to the 3S-3DCNN model. The extracted features are concatenated and passed to fully connected layers for regression. We split the input videos into cube patches for data augmentation and remove cubes that confuse the model from the training and testing sets. Two standard SVQA benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, were used to evaluate the method. Experimental results show that the objective scores of 3S-3DCNN correlate strongly with subjective SVQ scores across multiple video datasets. The RMSE on the NAMA3DS1-COSPAD1 dataset is 0.2757, outperforming other methods by a large margin, and the SROCC for the blur distortion of the LFOVIAS3DPh2 dataset exceeds 0.98, indicating that 3S-3DCNN is consistent with human visual perception.
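A minimal three-stream sketch under our own assumptions about channels and layer sizes: one 3D-CNN stream each for RGB frames, optical flow, and disparity, concatenated and regressed to a single score. This mirrors the structure the abstract names, not the authors' exact network:

```python
# Three parallel 3D-CNN streams (spatial, motion, depth), fused for regression.
import torch
import torch.nn as nn

def stream(in_ch):
    return nn.Sequential(
        nn.Conv3d(in_ch, 16, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool3d(1),   # global spatio-temporal pooling
        nn.Flatten(),
    )

class ThreeStreamSVQA(nn.Module):
    def __init__(self):
        super().__init__()
        self.spatial = stream(3)   # RGB frames
        self.motion = stream(2)    # optical flow (u, v)
        self.depth = stream(1)     # disparity map
        self.head = nn.Sequential(nn.Linear(48, 32), nn.ReLU(inplace=True), nn.Linear(32, 1))

    def forward(self, rgb, flow, disp):
        feats = torch.cat([self.spatial(rgb), self.motion(flow), self.depth(disp)], dim=1)
        return self.head(feats)    # predicted quality score

model = ThreeStreamSVQA()
score = model(torch.randn(1, 3, 8, 64, 64),
              torch.randn(1, 2, 8, 64, 64),
              torch.randn(1, 1, 8, 64, 64))
print(score.shape)  # torch.Size([1, 1])
```

Keeping the streams separate until the final concatenation lets each branch specialize in one degradation type (spatial artifacts, motion judder, depth errors) before the regression head combines them.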
Publication (Metadata only)
A Scoring Method for Interpretability of Concepts in Convolutional Neural Networks [Evrişimsel Sinir Ağlarında Kavram Yorumlama için bir Puanlama Yöntemi] (Institute of Electrical and Electronics Engineers Inc., 2022)
Gurkan, Mustafa Kağan; Arica, Nafiz; Vural, Fatos T. Yarman
Gurkan, Mustafa Kağan, Bahçeşehir Üniversitesi, Istanbul, Turkey; Arica, Nafiz, Bahçeşehir Üniversitesi, Istanbul, Turkey; Vural, Fatos T. Yarman, Middle East Technical University (METU), Ankara, Turkey
In this paper, we propose a scoring algorithm for measuring the interpretability of CNN models, focusing on the feature extraction performed at the convolutional layers. The approach is based on the principle of concept analysis over a predefined list of concepts. A map of the network is created from its responsiveness to each concept. Once this map is ready, input images are matched with the concepts whose hidden nodes they most strongly activate. Finally, an evaluation algorithm uses these descriptions during the final prediction to provide human-understandable explanations.

Publication (Metadata only)
Performance Assessment of Physiotherapy and Rehabilitation Exercises with Deep Learning [Derin Öğrenme ile Fizyoterapi ve Rehabilitasyon Egzersizleri için Performans Değerlendirme] (Institute of Electrical and Electronics Engineers Inc., 2022)
Aytutuldu, Ilhan; Aydin, Tarkan
Aytutuldu, Ilhan, Bilgisayar Mühendisliği Bölümü, Gebze Teknik Üniversitesi, Gebze, Turkey; Aydin, Tarkan, Bilgisayar Mühendisliği Bölümü, Bahçeşehir Üniversitesi, Istanbul, Turkey
The physiotherapy and rehabilitation process is critical for patients recovering from surgery or being treated for a wide variety of musculoskeletal disorders. However, providing access to a clinician for every rehabilitation session places a heavy burden and high cost on individuals. It is also important to check remotely whether exercises are performed correctly, especially during pandemic lockdowns, both to keep patients motivated during rehabilitation and to ensure that the recommended exercises contribute to treatment. In this study, we propose deep learning-based performance assessment of rehabilitation exercises using RGB videos captured with low-cost off-the-shelf cameras instead of expensive, hard-to-obtain depth cameras or wearable contact sensors. The proposed deep learning (DL) network models, PtConvNet, PtHybNet, and PtBiLSTM, operate on three-dimensional (3D) skeletal joint positions extracted from the exercise videos. Performance scores assigned by physiotherapists serve as the ground truth for training. We show that the models' performance estimates reliably track the actual scores, confirming that DL models can evaluate rehabilitation exercises.

Publication (Metadata only)
Triplet Loss-based Convolutional Neural Network for Static Sign Language Recognition (Institute of Electrical and Electronics Engineers Inc., 2022)
Sadeghzadeh, Arezoo; Islam, Md Baharul
Sadeghzadeh, Arezoo, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey
Sign language (SL) is a non-verbal visual language used as a primary communication tool by the deaf and hearing-impaired community. Owing to the large number and wide variety of SLs, mastering their interpretation requires an effort that is not feasible for the general public. Despite recent advances in automatic sign language recognition (SLR) systems, their performance degrades severely on low-resolution images with large intra-class and slight inter-class variations. To address these issues, a novel end-to-end Convolutional Neural Network (CNN) is proposed to extract features from the low-resolution input images. The feature extractor is trained with a semi-hard triplet loss so that images of the same class are placed close together in a lower-dimensional embedding space while the distance between samples of different classes is maximized. Alongside this efficient loss function, careful selection of filter and kernel sizes, activation functions, and regularization methods yields effective feature vectors from the small images while reducing the number of parameters. The embedded features, of fixed small vector length, are used to train a Support Vector Machine (SVM) classifier for final recognition. Experimental results on two datasets from the American (MNIST) and Arabic (ArSL2018) sign languages, with accuracies of 100% and 97.54%, respectively, demonstrate that the proposed model outperforms existing approaches without needing to enlarge the datasets through augmentation, proving its feasibility.
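The semi-hard triplet objective in the sign-language entry above can be illustrated as follows. The mining rule d(a,p) < d(a,n) < d(a,p) + margin is the standard definition of a semi-hard negative, but the batch size, embedding width, and margin are placeholder choices, not the paper's settings:

```python
# Semi-hard triplet mining plus PyTorch's TripletMarginLoss on stand-in embeddings.
import torch
import torch.nn as nn

def semi_hard_triplets(emb, labels, margin=0.2):
    """Indices (a, p, n) where the negative is farther than the positive
    but still inside the margin: d(a,p) < d(a,n) < d(a,p) + margin."""
    d = torch.cdist(emb, emb)
    triplets = []
    for a in range(len(emb)):
        for p in (labels == labels[a]).nonzero().flatten():
            if p == a:
                continue
            negs = (labels != labels[a]).nonzero().flatten()
            mask = (d[a, negs] > d[a, p]) & (d[a, negs] < d[a, p] + margin)
            triplets += [(a, p.item(), n.item()) for n in negs[mask]]
    return triplets

emb = torch.randn(32, 64, requires_grad=True)   # stand-in CNN embeddings
labels = torch.randint(0, 4, (32,))             # 4 hypothetical sign classes
trips = semi_hard_triplets(emb.detach(), labels)
if trips:
    a, p, n = (torch.tensor(idx) for idx in zip(*trips))
    loss = nn.TripletMarginLoss(margin=0.2)(emb[a], emb[p], emb[n])
    loss.backward()   # pulls same-class pairs together, pushes negatives past the margin
    print(len(trips), loss.item())
```

Semi-hard negatives are preferred over the hardest ones because they give informative gradients without collapsing the embedding early in training.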
Publication (Metadata only)
Cross-View Integration for Stereoscopic Video Deblurring (Institute of Electrical and Electronics Engineers Inc., 2022)
Imani, Hassan; Islam, Md Baharul
Imani, Hassan, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey, and American University of Malta, Cospicua, Malta
Stereoscopic cameras are now common in modern technology, including new cellphones. Numerous factors, such as blur artifacts from camera or object motion, can degrade a stereo video's quality. Various deblurring techniques exist for monocular content, yet few works address stereo content. This work presents a novel encoder-decoder-based stereoscopic video deblurring model that considers consecutive left and right video frames and employs cross-view stereoscopic information to aid deblurring. The proposed model takes the left and right stereoscopic frames, along with some neighboring left and right frames, as inputs to deblur the middle stereo frames. The stereo batch of frames is first passed to the encoder to extract features. The left and right features are then aggregated and fused using the Parallax Attention Module (PAM), and the decoder reconstructs the deblurred stereo video frames from the PAM output features. Experimental findings on the recently proposed Stereo Blur dataset show that the approach effectively deblurs stereoscopic video frames.
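Several entries in this list (the two deblurring papers and the PAM-based SVQA paper) fuse left and right view features with parallax attention. The sketch below is a generic reading of such a module, with invented shapes: attention is computed along the horizontal axis of each scanline, which suits rectified stereo pairs where correspondence lies along epipolar lines. The papers' modified PAM variants will differ:

```python
# Generic parallax-attention-style fusion: per-scanline attention from the
# left view's features over the right view's, with no fixed disparity range.
import torch
import torch.nn.functional as F

def parallax_attention_fuse(f_left, f_right):
    # f_left, f_right: (batch, channels, H, W) features from the two views.
    b, c, h, w = f_left.shape
    q = f_left.permute(0, 2, 3, 1).reshape(b * h, w, c)   # queries per scanline
    k = f_right.permute(0, 2, 1, 3).reshape(b * h, c, w)  # keys per scanline
    attn = F.softmax(torch.bmm(q, k) / c ** 0.5, dim=-1)  # (b*h, w, w) parallax attention
    v = f_right.permute(0, 2, 3, 1).reshape(b * h, w, c)
    warped = torch.bmm(attn, v).reshape(b, h, w, c).permute(0, 3, 1, 2)
    return torch.cat([f_left, warped], dim=1)             # left + warped-right features

fused = parallax_attention_fuse(torch.randn(1, 32, 16, 16), torch.randn(1, 32, 16, 16))
print(fused.shape)  # torch.Size([1, 64, 16, 16])
```

Because the attention matrix is built per scanline, the module adapts to whatever disparity the scene exhibits, which is the property these abstracts cite as the motivation for choosing PAM over fixed-range cost volumes.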
