TY - JOUR
T1 - Fine-grained video super-resolution via spatial-temporal learning and image detail enhancement
AU - Yeh, Chia Hung
AU - Yang, Hsin Fu
AU - Lin, Yu Yang
AU - Huang, Wan Jen
AU - Tsai, Feng Hsu
AU - Kang, Li Wei
N1 - Publisher Copyright: © 2023 Elsevier Ltd
PY - 2024/5
Y1 - 2024/5
AB - This paper addresses the problem of fine-grained video super-resolution (FGVSR), which aims to suppress the temporal flickering caused by processing consecutive frames separately and to enhance the quality of restored frame details when upscaling videos. Some existing video SR methods fail to sufficiently utilize spatial-temporal information from input low-resolution (LR) videos, while others may generate undesirable artifacts or fail to reconstruct image details well. To overcome these problems, we present a novel deep learning framework for FGVSR, which takes a set of consecutive LR video frames and generates the corresponding super-resolved frames. Our deep FGVSR framework focuses on reconstructing missing information from the LR sources based on the proposed multi-frame alignment and refinement strategies. More specifically, we propose an alignment module, where multiple frames are aligned at the feature level, to prevent the output videos from flickering. Then, we introduce a feature fusion module, where the aligned features generated by our alignment module are fused and refined in a multi-scale manner. Finally, the proposed refinement module is used to reconstruct missing information based on the fused features. In addition, we embed an image enhancement module on the skip connection from the input layer to the output layer of our network to further enhance the SR results. Experimental results show that the proposed deep FGVSR, compared with existing deep learning-based VSR methods, achieves state-of-the-art performance on three well-known benchmarks: REDS, Vid4, and Vimeo90k. More specifically, compared with the state-of-the-art VSR methods in our experiments, our FGVSR achieves PSNR improvements ranging from 0.70 dB to 9.54 dB. Our method has also been shown to be effective for other image restoration tasks, such as image inpainting.
KW - Convolutional neural networks
KW - Deep learning
KW - Video enhancement
KW - Video frame alignment
KW - Video reconstruction
KW - Video super-resolution
UR - http://www.scopus.com/inward/record.url?scp=85181755449&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85181755449&partnerID=8YFLogxK
U2 - 10.1016/j.engappai.2023.107789
DO - 10.1016/j.engappai.2023.107789
M3 - Article
AN - SCOPUS:85181755449
SN - 0952-1976
VL - 131
JO - Engineering Applications of Artificial Intelligence
JF - Engineering Applications of Artificial Intelligence
M1 - 107789
ER -