Publication: A New Dataset and Transformer for Stereoscopic Video Super-Resolution
| dc.contributor.author | Imani, Hassan | |
| dc.contributor.author | Islam, Md Baharul | |
| dc.contributor.author | Wong, Lai-Kuan | |
| dc.contributor.institution | Bahcesehir University | |
| dc.contributor.institution | Multimedia University | |
| dc.date.accessioned | 2025-10-09T11:10:17Z | |
| dc.date.issued | 2022 | |
| dc.description.abstract | Stereo video super-resolution (SVSR) aims to enhance the spatial resolution of the low-resolution video by reconstructing the high-resolution video. The key challenges in SVSR are preserving the stereo-consistency and temporal-consistency, without which viewers may experience 3D fatigue. There are several notable works on stereoscopic image super-resolution, but there is little research on stereo video super-resolution. In this paper, we propose a novel Transformer-based model for SVSR, namely Trans-SVSR. Trans-SVSR comprises two key novel components: a spatio-temporal convolutional self-attention layer and an optical flow-based feed-forward layer that discovers the correlation across different video frames and aligns the features. The parallax attention mechanism (PAM) that uses the cross-view information to consider the significant disparities is used to fuse the stereo views. Due to the lack of a benchmark dataset suitable for the SVSR task, we collected a new stereoscopic video dataset, SVSR-Set, containing 71 full high-definition (HD) stereo videos captured using a professional stereo camera. Extensive experiments on the collected dataset, along with two other datasets, demonstrate that the Trans-SVSR can achieve competitive performance compared to the state-of-the-art methods. Project code and additional results are available at https://github.com/H-deep/Trans-SVSR/. | |
| dc.identifier.conferenceDate | JUN 18-24, 2022 | |
| dc.identifier.conferenceName | IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | |
| dc.identifier.conferencePlace | New Orleans, LA | |
| dc.identifier.conferenceSponsor | IEEE, CVF, IEEE Comp Soc | |
| dc.identifier.doi | 10.1109/CVPRW56347.2022.00086 | |
| dc.identifier.endpage | 714 | |
| dc.identifier.isbn | 978-1-6654-8739-9 | |
| dc.identifier.issn | 2160-7508 | |
| dc.identifier.startpage | 705 | |
| dc.identifier.uri | http://dx.doi.org/10.1109/CVPRW56347.2022.00086 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14719/16440 | |
| dc.identifier.wos | WOS:000861612700077 | |
| dc.identifier.woscitationindex | Conference Proceedings Citation Index - Science (CPCI-S) | |
| dc.language.iso | en | |
| dc.publisher | IEEE | |
| dc.relation.fundingName | Scientific and Technological Research Council of Turkey (TUBITAK) 2232 Leading Researchers Program | |
| dc.relation.fundingOrg | Scientific and Technological Research Council of Turkey (TUBITAK) 2232 Leading Researchers Program [118C301] | |
| dc.relation.fundingText | This work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) 2232 Leading Researchers Program, Project No. 118C301. | |
| dc.relation.source | 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022 | |
| dc.relation.source | IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops | |
| dc.subject.indexkeywords | ENHANCEMENT | |
| dc.subject.indexkeywords | ATTENTION | |
| dc.subject.wos | Computer Science, Theory & Methods | |
| dc.title | A New Dataset and Transformer for Stereoscopic Video Super-Resolution | |
| dc.type | Proceedings Paper | |
| dspace.entity.type | Publication | |
| local.indexed.at | WOS | |
| person.identifier.orcid | Islam, Md Baharul/0000-0002-9928-5776 | |
| person.identifier.rid | Islam, Md Baharul/R-3751-2019 | |
| person.identifier.rid | Imani, Hassan/KSL-4309-2024 | |
| person.identifier.rid | Wong, Lai/AAO-7014-2021 |
