Research Outputs | WoS | Scopus | TR-Dizin | PubMed
Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741
3 results
Search Results
Publication (metadata only)
Triplet Loss-based Convolutional Neural Network for Static Sign Language Recognition (Institute of Electrical and Electronics Engineers Inc., 2022)
Sadeghzadeh, Arezoo; Islam, Md Baharul
Sadeghzadeh, Arezoo, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey
Sign language (SL) is a non-verbal visual language used as a primary communication tool by the deaf or hearing-impaired community. Owing to the large number and wide variety of SLs, mastering their interpretation would require great effort from the general public, which is not feasible. Despite recent advances in automatic sign language recognition (SLR) systems, their performance degrades severely on low-resolution images with large intra-class and slight inter-class variations. To deal with these issues, a novel end-to-end Convolutional Neural Network (CNN) is proposed to extract features from low-resolution input images. This feature extractor is trained with the semi-hard triplet loss function so that images belonging to the same class are placed close to one another in a lower-dimensional embedding space while the distance between samples from separate classes is maximized. In addition to the efficient loss function, proper selection of the filter and kernel sizes, activation functions, and regularization methods in the proposed CNN yields effective feature vectors from the small-sized images while reducing the number of parameters. The embedded features, with a fixed small vector length, are used to train a Support Vector Machine (SVM) classifier for final recognition.
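As a rough illustration of the semi-hard triplet loss named in this abstract, the sketch below computes the loss over a small batch of embeddings. It is not the authors' implementation: the function names, the Euclidean distance, the default margin of 1.0, and the choice to skip anchor-positive pairs that have no semi-hard negative are all assumptions made for the example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def semi_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Mean triplet loss over (anchor, positive, semi-hard negative) triplets.

    A negative is semi-hard when it is farther from the anchor than the
    positive but still inside the margin: d_ap < d_an < d_ap + margin.
    Pairs without such a negative are skipped (an assumption of this sketch).
    """
    losses = []
    n = len(embeddings)
    for a in range(n):
        for p in range(n):
            if a == p or labels[a] != labels[p]:
                continue
            d_ap = euclidean(embeddings[a], embeddings[p])
            # Distances from the anchor to every sample of another class.
            negatives = [
                euclidean(embeddings[a], embeddings[k])
                for k in range(n)
                if labels[k] != labels[a]
            ]
            semi_hard = [d for d in negatives if d_ap < d < d_ap + margin]
            if semi_hard:
                d_an = min(semi_hard)  # closest semi-hard negative
                # Always positive here, because d_an < d_ap + margin.
                losses.append(d_ap - d_an + margin)
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    batch = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0], [5.0, 0.0]]
    print(semi_hard_triplet_loss(batch, [0, 0, 1, 1], margin=1.0))
```

Minimizing this quantity pulls same-class embeddings together and pushes other-class embeddings beyond the margin; the resulting fixed-length vectors could then be passed to a separate classifier such as an SVM.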
Experimental results on two datasets from two SLs, American (MNIST) and Arabic (ArSL2018), with accuracies of 100% and 97.54%, respectively, demonstrate that the proposed model outperforms existing approaches without any need to enlarge the dataset through augmentation, which proves its feasibility. © 2022 Elsevier B.V., All rights reserved.

Publication (metadata only)
BiSign-Net: Fine-grained Static Sign Language Recognition based on Bilinear CNN (Institute of Electrical and Electronics Engineers Inc., 2022)
Sadeghzadeh, Arezoo; Islam, Md Baharul
Sadeghzadeh, Arezoo, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Department of Computer Engineering, Bahçeşehir Üniversitesi, Istanbul, Turkey, American University of Malta, Cospicua, Malta
Sign language (SL) is a type of communication language used by deaf and hard-of-hearing people. The large variety of SLs and the general public's lack of knowledge for interpreting them make it necessary to break down communication barriers with automatic sign language recognition (SLR) systems. Despite the existence of numerous approaches with satisfactory performance, they still suffer from severe challenges in dealing with large intra-class and slight inter-class variations, which make them infeasible for real-world applications. To address this issue, a novel end-to-end fine-grained static SLR (SSLR) system, namely BiSign-Net, is proposed based on a Bilinear Convolutional Neural Network (Bi-CNN) to efficiently model variations in both the location and appearance of the hands in the images, enhancing accuracy, speed, and robustness against translation. To this end, fine-grained orderless bilinear features are generated by the pooled outer product of the features extracted by two identical novel CNN-based feature extractors.
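The pooled outer product, and the signed square root and l2 normalization that this abstract applies to the bilinear features, might look roughly like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, sum pooling over spatial locations is assumed, and each feature extractor's output is represented as a plain list of per-location feature vectors rather than a real CNN tensor.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Sum the outer products of per-location features from two extractors.

    feats_a: list of L vectors of length C1; feats_b: list of L vectors of
    length C2. Returns a flat vector of length C1 * C2. Summing over the
    locations discards their order, which is what makes the feature orderless.
    """
    c1, c2 = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (c1 * c2)
    for fa, fb in zip(feats_a, feats_b):
        for i in range(c1):
            for j in range(c2):
                pooled[i * c2 + j] += fa[i] * fb[j]
    return pooled


def signed_sqrt(vec):
    """Element-wise sign(x) * sqrt(|x|)."""
    return [math.copysign(math.sqrt(abs(x)), x) for x in vec]


def l2_normalize(vec, eps=1e-12):
    """Scale the vector to unit Euclidean norm (eps guards against zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]   # two locations, two channels
    b = [[1.0, 0.0], [0.0, -1.0]]
    print(l2_normalize(signed_sqrt(bilinear_pool(a, b))))
```

Because the pooling sums over locations, shuffling the locations of both inputs in the same way leaves the output unchanged.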
The bilinear features pass through a normalization module comprising signed square root and l2 normalization, which further improves the accuracy of the model. A dropout layer is deployed in the classification module to help the model deal with small-scale datasets by preventing overfitting. The number of layers, hyper-parameters, and optimization technique of the proposed CNN are adjusted to achieve high performance and faster convergence with a low number of parameters. Experimental results on four datasets, Static ASL, NUS I, Massey, and ArASL, from two SLs (i.e., American and Arabic), with accuracies of 100%, 100%, 99.20%, and 99.35%, respectively, demonstrate that the proposed model surpasses existing approaches with high robustness and generalization ability. © 2023 Elsevier B.V., All rights reserved.

Publication (metadata only)
T-SignSys: An Efficient CNN-Based Turkish Sign Language Recognition System (Springer Science and Business Media Deutschland GmbH, 2024)
Colak, Sevval; Sadeghzadeh, Arezoo; Islam, Md Baharul; Ortis, A.; Hameed, A.A.; Jamil, A.
Colak, Sevval, Bahçeşehir Üniversitesi, Istanbul, Turkey; Sadeghzadeh, Arezoo, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Bahçeşehir Üniversitesi, Istanbul, Turkey, Florida Gulf Coast University, Fort Myers, United States
Sign language (SL) is a communication tool that plays a crucial role in facilitating the daily life of deaf or hearing-impaired people. The large variety of existing SLs and the lack of interpretation knowledge among the general public lead to a communication barrier between the deaf and hearing communities. This issue has been addressed by automated sign language recognition (SLR) systems, mostly proposed for American Sign Language (ASL), with a limited number of research studies on other SLs. Consequently, this paper focuses on static Turkish Sign Language (TSL) recognition for its alphabet and digits by proposing an efficient novel Convolutional Neural Network (CNN) model.
Our proposed CNN model comprises 9 layers, of which 6 are employed for feature extraction and the remaining 3 for classification. The model is prevented from overfitting on small-scale datasets by two regularization techniques: 1) ignoring a specified portion of neurons during training by applying a dropout layer, and 2) applying penalties during loss function optimization by employing an L2 kernel regularizer in the convolution layers. The arrangement of the layers, learning rate, optimization technique, model hyper-parameters, and dropout layers are carefully adjusted so that the proposed CNN model can recognize both TSL alphabets and digits quickly and accurately. The feasibility of our proposed T-SignSys is investigated through a comprehensive ablation study. Our model is evaluated on two datasets of TSL alphabets and digits with accuracies of 97.85% and 99.52%, respectively, demonstrating its competitive performance despite its straightforward implementation. © 2024 Elsevier B.V., All rights reserved.
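The two regularization techniques listed in this abstract, dropout and an L2 kernel penalty, can be sketched in a few lines. This is a minimal illustration and not the T-SignSys code: the function names are hypothetical, the dropout shown is the common "inverted" variant (an assumption), and the dropout rate and L2 coefficient are placeholder values, since the abstract gives neither.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero roughly a fraction `rate` of the units during
    training and rescale the survivors by 1 / (1 - rate).

    Requires 0 <= rate < 1. At inference time the input is returned unchanged.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def l2_penalty(kernel_weights, coeff):
    """L2 kernel regularization term: coeff * sum of squared weights."""
    return coeff * sum(w * w for w in kernel_weights)


def regularized_loss(data_loss, conv_kernels, coeff):
    """Data loss plus the L2 penalty of every convolution kernel."""
    return data_loss + sum(l2_penalty(k, coeff) for k in conv_kernels)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    print(dropout([1.0] * 8, rate=0.5, rng=rng))           # placeholder rate
    print(regularized_loss(1.0, [[1.0, -2.0], [3.0]], coeff=0.5))
```

Dropout acts on the activations and only during training, while the L2 term is added to the loss being optimized, so the two can be combined freely.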
