Research Outputs | WoS | Scopus | TR-Dizin | PubMed
Permanent URI for this community: https://hdl.handle.net/20.500.14719/1741
Publication (metadata only)
Stereoscopic Video Quality Assessment Using Modified Parallax Attention Module
(Springer Science and Business Media Deutschland GmbH, 2022)
Imani, Hassan; Zaim, Selim; Islam, Md Baharul; Junayed, Masum Shah; Durakbasa, N.M.; Gençyılmaz, M.G.
Imani, Hassan, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey; Zaim, Selim, Faculty of Engineering and Natural Sciences, Bahçeşehir Üniversitesi, Istanbul, Turkey; Islam, Md Baharul, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey; Junayed, Masum Shah, Computer Vision Lab, Bahçeşehir Üniversitesi, Istanbul, Turkey

Deep learning techniques are used for most computer vision tasks. Convolutional Neural Networks (CNNs) in particular have shown strong performance in detection and classification. In Stereoscopic Video Quality Assessment (SVQA), 3D CNNs have recently been used to extract spatial and temporal features from stereoscopic videos, but disparity information, which is very important, has not been adequately considered. Most recently proposed deep learning-based methods use cost volumes to produce the stereo correspondence for large disparities. Because disparities can differ considerably between stereo cameras with different configurations, the Parallax Attention Mechanism (PAM) was recently proposed to capture stereo correspondence regardless of disparity variations. In this paper, we propose a new SVQA model built from a base 3D CNN network and a modified PAM-based model for fusing left and right features. First, we use 3D CNNs and residual blocks to extract features from the left and right views of a stereo video patch. Then, we modify the PAM to fuse the left and right features while taking the disparity information into account, and we compute the quality score of the stereoscopic video with fully connected layers. We divide the input videos into cube patches for data augmentation and remove from the training dataset some cubes that confuse our model. Two standard SVQA benchmarks, LFOVIAS3DPh2 and NAMA3DS1-COSPAD1, are used to train and test the model. Experimental results indicate that the proposed model is very competitive with state-of-the-art methods on the NAMA3DS1-COSPAD1 dataset and achieves state-of-the-art results on the LFOVIAS3DPh2 dataset. © 2022 Elsevier B.V., All rights reserved.
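The cube-patch step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no cube size, stride, or data layout, so the `cube` and `stride` parameters, the helper names `cube_origins` and `extract_cube`, and the nested-list video layout `video[t][y][x]` are all assumptions made for illustration. For a stereoscopic video, the same origins would presumably be applied to the left and right views so that the two cubes stay aligned.

```python
from itertools import product


def cube_origins(frames, height, width, cube, stride):
    """Return the (t, y, x) origin of every full cube that fits in the video.

    `cube` and `stride` are (depth, height, width) triples. Both are
    hypothetical parameters; the abstract does not state the sizes used.
    """
    dims = (frames, height, width)
    # One range of valid start positions per axis; empty if the cube is too big.
    axes = [range(0, d - c + 1, s) for d, c, s in zip(dims, cube, stride)]
    return list(product(*axes))


def extract_cube(video, origin, cube):
    """Slice one cube out of a video stored as nested lists video[t][y][x]."""
    t0, y0, x0 = origin
    ct, ch, cw = cube
    return [
        [row[x0:x0 + cw] for row in frame[y0:y0 + ch]]
        for frame in video[t0:t0 + ct]
    ]
```

With a stride equal to the cube size the cubes do not overlap. A smaller stride yields overlapping cubes and therefore more training samples, which is one common way to use patching as data augmentation.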
