Publication: Spatial–Temporal Coherence in Extreme Video Retargeting for Consumer Screening Devices
Date: 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Abstract
The proliferation of display devices with diverse aspect ratios has drawn considerable research attention to video retargeting. Inconsistent retargeting can significantly degrade a video’s spatial and temporal quality, particularly in extreme cases. Because no perfectly annotated datasets exist for video retargeting, deep learning-based techniques are rarely applied to it. This paper proposes a method that learns to retarget videos by detecting salient regions and shifting them to appropriate locations. First, we segment the salient objects with a unified Transformer model. Then, using convolutional layers and a shifting strategy, we warp the objects to the appropriate size and position in the frame; 1D convolutions move the salient objects within the scene. Additionally, we employ a frame interpolation technique to preserve temporal information. To train the network, we feed the retargeted frames to a variational auto-encoder that maps them back to the input frames, and we design perceptual and wavelet-based loss functions for the model; the network is thus trained in an unsupervised manner. Extensive qualitative and quantitative experiments on the DAVIS dataset show the superiority of the proposed method over existing image- and video-based methods. © 2025 Elsevier B.V., All rights reserved.
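The idea of moving salient content with a 1D convolution can be sketched as follows. This is a minimal illustration, not the paper's implementation: here the shift kernel is built by hand as a delta with an off-center peak, whereas the proposed network learns the kernel; the helper names are hypothetical.

```python
import numpy as np

def delta_kernel(length, shift):
    """Hypothetical helper: a delta kernel whose off-center peak
    makes 1D convolution translate a signal by `shift` pixels."""
    k = np.zeros(length)
    k[length // 2 + shift] = 1.0  # offset from center encodes the shift
    return k

def shift_rows(frame, shift, klen=5):
    """Shift every row of a 2D frame horizontally via 1D convolution.
    Positive `shift` moves content right; vacated pixels become 0."""
    k = delta_kernel(klen, shift)
    return np.stack([np.convolve(row, k, mode="same") for row in frame])

frame = np.arange(8.0).reshape(1, 8)  # a one-row "frame": 0..7
print(shift_rows(frame, 1))           # row content moved right by one pixel
```

In the proposed method, a learned version of such a kernel would decide per-region shift amounts, relocating segmented salient objects within the retargeted frame.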
