Publication:
A New Dataset and Transformer for Stereoscopic Video Super-Resolution

dc.contributor.author: Imani, Hassan
dc.contributor.author: Islam, Md Baharul
dc.contributor.author: Wong, Lai-Kuan
dc.contributor.institution: Bahcesehir University
dc.contributor.institution: Multimedia University
dc.date.accessioned: 2025-10-09T11:10:17Z
dc.date.issued: 2022
dc.description.abstract: Stereo video super-resolution (SVSR) aims to enhance the spatial resolution of low-resolution videos by reconstructing their high-resolution counterparts. The key challenges in SVSR are preserving stereo-consistency and temporal-consistency, without which viewers may experience 3D fatigue. There are several notable works on stereoscopic image super-resolution, but there is little research on stereo video super-resolution. In this paper, we propose a novel Transformer-based model for SVSR, namely Trans-SVSR. Trans-SVSR comprises two key novel components: a spatio-temporal convolutional self-attention layer and an optical flow-based feed-forward layer that discovers correlations across different video frames and aligns the features. A parallax attention mechanism (PAM), which uses cross-view information to account for significant disparities, is used to fuse the stereo views. Due to the lack of a benchmark dataset suitable for the SVSR task, we collected a new stereoscopic video dataset, SVSR-Set, containing 71 full high-definition (HD) stereo videos captured using a professional stereo camera. Extensive experiments on the collected dataset, along with two other datasets, demonstrate that Trans-SVSR can achieve competitive performance compared to state-of-the-art methods. Project code and additional results are available at https://github.com/H-deep/Trans-SVSR/.
dc.identifier.conferenceDate: JUN 18-24, 2022
dc.identifier.conferenceName: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
dc.identifier.conferencePlace: New Orleans, LA
dc.identifier.conferenceSponsor: IEEE, CVF, IEEE Comp Soc
dc.identifier.doi: 10.1109/CVPRW56347.2022.00086
dc.identifier.endpage: 714
dc.identifier.isbn: 978-1-6654-8739-9
dc.identifier.issn: 2160-7508
dc.identifier.startpage: 705
dc.identifier.uri: http://dx.doi.org/10.1109/CVPRW56347.2022.00086
dc.identifier.uri: https://hdl.handle.net/20.500.14719/16440
dc.identifier.wos: WOS:000861612700077
dc.identifier.woscitationindex: Conference Proceedings Citation Index - Science (CPCI-S)
dc.language.iso: en
dc.publisher: IEEE
dc.relation.fundingName: Scientific and Technological Research Council of Turkey (TUBITAK) 2232 Leading Researchers Program
dc.relation.fundingOrg: Scientific and Technological Research Council of Turkey (TUBITAK) 2232 Leading Researchers Program [118C301]
dc.relation.fundingText: This work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) 2232 Leading Researchers Program, Project No. 118C301.
dc.relation.source: 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022
dc.relation.source: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
dc.subject.indexkeywords: ENHANCEMENT
dc.subject.indexkeywords: ATTENTION
dc.subject.wos: Computer Science, Theory & Methods
dc.title: A New Dataset and Transformer for Stereoscopic Video Super-Resolution
dc.type: Proceedings Paper
dspace.entity.type: Publication
local.indexed.at: WOS
person.identifier.orcid: Islam, Md Baharul/0000-0002-9928-5776
person.identifier.rid: Islam, Md Baharul/R-3751-2019
person.identifier.rid: Imani, Hassan/KSL-4309-2024
person.identifier.rid: Wong, Lai/AAO-7014-2021