Research Outputs | WoS | Scopus | TR-Dizin | PubMed

Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741

Search Results

  • Publication
    SkNet: A Convolutional Neural Networks Based Classification Approach for Skin Cancer Classes
    (IEEE, 2020) Jeny, Afsana Ahsan; Sakib, Abu Noman Md; Junayed, Masum Shah; Lima, Khadija Akter; Ahmed, Ikhtiar; Islam, Md Baharul; Daffodil International University; Khulna University of Engineering & Technology (KUET); Bahcesehir University; Daffodil International University
    Skin cancer is one of the most common types of cancer, and a solution to this globally recognized health problem is much needed. Machine learning techniques have brought revolutionary changes to biomedical research. Previously, detecting skin cancer took a significant amount of time and effort. In recent years, much work has been done with deep learning, which has made the process much faster and more accurate. In this paper, we propose a novel Convolutional Neural Network (CNN) based approach that can classify four different types of skin cancer. Our model, SkNet, consists of 19 convolution layers. In previous works, the highest accuracy obtained on 1000 images was 80.52%. Our proposed model exceeds that performance, achieving an accuracy of 95.26% on a dataset of 4800 images, which is the highest accuracy achieved.
  • Publication
    Machine Vision-Based Expert System for Automated Skin Cancer Detection
    (SPRINGER INTERNATIONAL PUBLISHING AG, 2022) Junayed, Masum Shah; Jeny, Afsana Ahsan; Rada, Lavdie; Islam, Md Baharul; BritoLoeza, C; MartinGonzalez, A; CastanedaZeman, V; Safi, A; Bahcesehir University; Daffodil International University
    Skin cancer is the most frequently occurring kind of cancer, accounting for about one-third of all cases. Automatic early detection that does not require visual inspection by an expert would be of great help to society. Image processing and machine learning methods have contributed significantly to medical and biomedical research, resulting in fast and exact inspection for a range of problems. One such problem is accurate cancer detection and classification. In this study, we introduce an expert system based on image processing and machine learning for skin cancer detection and classification. The proposed approach consists of three significant steps: pre-processing, feature extraction, and classification. The pre-processing step uses grayscale conversion, a Gaussian filter, segmentation, and a morphological operation to better represent skin lesion images. We employ two feature extractors, i.e., the ABCD scoring method (asymmetry, border, color, diameter) and the gray level co-occurrence matrix (GLCM), to extract cancer-affected areas. Finally, five machine learning classifiers, namely logistic regression (LR), decision tree (DT), k-nearest neighbors (KNN), support vector machine (SVM), and random forest (RF), are used to detect and classify skin cancer. Experimental results show that random forest outperforms all other classifiers, achieving an accuracy of 97.62% and an Area Under Curve (AUC) of 0.97, which is state-of-the-art on the open-source PH2 dataset used in the experiments.
  • Publication
    HiMODE: A Hybrid Monocular Omnidirectional Depth Estimation Model
    (IEEE, 2022) Junayed, Masum Shah; Sadeghzadeh, Arezoo; Islam, Md Baharul; Wong, Lai-Kuan; Aydin, Tarkan; Bahcesehir University; Multimedia University
    Monocular omnidirectional depth estimation is receiving considerable research attention due to its broad applications in sensing 360-degree surroundings. Existing approaches in this field suffer from limitations in recovering small object details and from data loss during ground-truth depth map acquisition. In this paper, a novel monocular omnidirectional depth estimation model, namely HiMODE, is proposed based on a hybrid CNN+Transformer (encoder-decoder) architecture whose modules are efficiently designed to mitigate distortion and computational cost without performance degradation. First, we design a feature pyramid network based on the HNet block to extract high-resolution features near the edges. The performance is further improved by a self- and cross-attention layer and spatial/temporal patches in the Transformer encoder and decoder, respectively. In addition, a spatial residual block is employed to reduce the number of parameters. By jointly passing the deep features extracted from an input image at each backbone block, along with the raw depth maps predicted by the transformer encoder-decoder, through a context adjustment layer, our model can produce depth maps with better visual quality than the ground truth. Comprehensive ablation studies demonstrate the significance of each individual module. Extensive experiments conducted on three datasets, Stanford3D, Matterport3D, and SunCG, demonstrate that HiMODE can achieve state-of-the-art performance for 360-degree monocular depth estimation. Complete project code and supplementary materials are available at https://github.com/himode5008/HiMODE.
  • Publication
    Deep Covariance Feature and CNN-based End-to-End Masked Face Recognition
    (IEEE, 2021) Junayed, Masum Shah; Sadeghzadeh, Arezoo; Islam, Md Baharul; Struc, V; Ivanovska, M; Bahcesehir University
    With the emergence of the global COVID-19 epidemic, face recognition systems have attracted much attention as contactless identity verification methods. However, a mask covering a considerable part of the face poses severe challenges for conventional face recognition systems. This paper proposes an automated Masked Face Recognition (MFR) system based on the combination of a mask occlusion discarding technique and a deep-learning model. Initially, a pre-processing step is carried out in which the images pass through three filters. Then, a Convolutional Neural Network (CNN) model is proposed to extract features from the unoccluded regions of the faces (i.e., eyes and forehead). These feature maps are employed to obtain covariance-based features. Two extra layers, i.e., Bitmap and Eigenvalue, are designed to reduce the dimension of these covariance feature matrices and concatenate them. The deep covariance features are quantized into codebooks, which are combined based on the Bag-of-Features (BoF) paradigm. Finally, a global histogram is created from these codebooks and used to train an SVM classifier. The proposed method is trained and evaluated on the Real-World-Masked-Face-Recognition-Dataset (RMFRD) and the Simulated-Masked-Face-Recognition-Dataset (SMFRD), achieving accuracies of 95.07% and 92.32%, respectively, which shows its competitive performance compared to the state-of-the-art. Experimental results show that our system is highly robust against noisy data and illumination variations.
  • Publication
    AN EFFICIENT END-TO-END IMAGE COMPRESSION TRANSFORMER
    (IEEE, 2022) Jeny, Afsana Ahsan; Junayed, Masum Shah; Islam, Md Baharul; Bahcesehir University
    Image and video compression have received significant research attention, and their applications have expanded. Existing entropy estimation-based methods combine hyperprior and local context, which limits their efficacy. This paper introduces an efficient end-to-end transformer-based image compression model, which generates a global receptive field to tackle long-range correlation issues. A hyper encoder-decoder-based transformer block employs a multi-head spatial reduction self-attention (MHSRSA) layer to minimize the computational cost of the self-attention layer and enable rapid learning of multi-scale and high-resolution features. A Casual Global Anticipation Module (CGAM) is designed to construct highly informative adjacent contexts utilizing channel-wise linkages and to identify global reference points in the latent space for end-to-end rate-distortion optimization (RDO). Experimental results demonstrate the effectiveness and competitive performance of the proposed model on the KODAK dataset.
  • Publication
    PoseTED: A Novel Regression-Based Technique for Recognizing Multiple Pose Instances
    (SPRINGER INTERNATIONAL PUBLISHING AG, 2021) Jeny, Afsana Ahsan; Junayed, Masum Shah; Islam, Md Baharul; Bebis, G; Athitsos, V; Yan, T; Lau, M; Li, F; Shi, C; Yuan, X; Mousas, C; Bruder, G; Bahcesehir University
    Pose estimation for multiple people can be viewed as a hierarchical set prediction problem. Algorithms are needed to appropriately classify all persons according to their body parts. Pose estimation methods are divided into two categories: (1) heatmap-based and (2) regression-based. Heatmap-based techniques are susceptible to various heuristic designs and are not end-to-end trainable, while regression-based methods involve fewer intermediary non-differentiable stages. This paper presents a novel regression-based multi-instance human pose recognition network called PoseTED. It utilizes the well-known object detector YOLOv4 for person detection and a spatial transformer network (STN) as a cropping filter. Next, a CNN-based backbone extracts deep features, and positional encoding with an encoder-decoder transformer is applied for keypoint detection, which addresses the heuristic design problem of earlier regression-based techniques and improves overall performance. A prediction-based feed-forward network (FFN) is used to predict the posture of several keypoints as a group and to display the body parts as output. The model is evaluated on two public datasets, COCO and MPII, achieving an average precision (AP) of 73.7% on the COCO val dataset, 72.7% on the COCO test-dev dataset, and 89.7% on the MPII dataset. These results are comparable to the state-of-the-art methods.
  • Publication
    Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module
    (SPRINGER-VERLAG SINGAPORE PTE LTD, 2022) Imani, Hassan; Zaim, Selim; Islam, Md Baharul; Junayed, Masum Shah; Durakbasa, NM; Gencyilmaz, MG; Bahcesehir University; Bahcesehir University
    Deep learning techniques are utilized for most computer vision tasks. In particular, Convolutional Neural Networks (CNNs) have shown great performance in detection and classification tasks. Recently, in the field of Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have been used to extract spatial and temporal features from stereoscopic videos, but the disparity information, which is very important, has not been well considered. Most recently proposed deep learning-based methods use cost volume methods to produce the stereo correspondence for large disparities. Because disparities can differ considerably for stereo cameras with different configurations, the Parallax Attention Mechanism (PAM) was recently proposed to capture the stereo correspondence regardless of disparity changes. In this paper, we propose a new SVQA model using a base 3D CNN-based network and a modified PAM-based left and right feature fusion model. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM model to fuse the left and right features while considering the disparity information and, using some fully connected layers, calculate the quality score of a stereoscopic video. We divide the input videos into cube patches for data augmentation and remove from the training dataset some cubes that confuse our model. Two standard stereoscopic video quality assessment benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test our model. Experimental results indicate that our proposed model is very competitive with the state-of-the-art methods on the NAMA3DS1-COSPAD1 dataset and is the state-of-the-art method on the LFOVIAS3DPh2 dataset.
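
Several of the abstracts above name standard building blocks without spelling them out. As one illustration, the gray level co-occurrence matrix (GLCM) mentioned in the "Machine Vision-Based Expert System for Automated Skin Cancer Detection" entry can be sketched in a few lines of plain Python. This is only a minimal sketch under stated assumptions, not the authors' implementation: the function names (`glcm`, `contrast`, `energy`), the single horizontal offset, the toy 4-level image, and the choice of these two texture statistics are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for one pixel offset.

    Entry [i][j] is the fraction of pixel pairs in which a pixel of gray
    level i has a neighbour of gray level j at offset (dx, dy).
    Assumes the image contains at least one valid pixel pair.
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]


def contrast(p):
    """Texture contrast: co-occurrences weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Texture energy (angular second moment): sum of squared entries."""
    return sum(v * v for row in p for v in row)


# Hypothetical 4x4 image already quantized to 4 gray levels (0..3).
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy, levels=4)
features = [contrast(p), energy(p)]
print(features)
```

In a pipeline like the one the abstract describes, a feature vector of this kind (typically computed over several offsets and with more statistics) would be passed to a classifier; the abstract does not state which offsets or statistics were used, so those details are left open here.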