Self-Supervised Multi-Scale Deep Learning Framework for Unpaired Image Super-Resolution
Keywords:
High-Fidelity Image Recovery, Hierarchical Feature Extraction, Autonomous Representation Learning, Visual Enhancement, Perceptual Optimization, Unlabeled Training Data, Structural Restoration, Cycle Consistency, Unsupervised Image Reconstruction, Visual Quality Metrics

Abstract
Image Super-Resolution (SR) aims to reconstruct high-resolution (HR) images from their low-resolution (LR) counterparts, a task traditionally reliant on large-scale paired datasets. However, acquiring perfectly aligned LR-HR image pairs is often impractical, particularly in real-world and domain-specific applications. This paper proposes a novel self-supervised multi-scale deep learning framework that addresses the SR challenge using unpaired data. The architecture incorporates a multi-scale feature extraction module that effectively captures hierarchical contextual information at different resolutions, enhancing the model's capacity to reconstruct fine image details. To eliminate the dependency on paired data, a self-supervised learning mechanism is introduced, leveraging pseudo-pair generation, cycle-consistency constraints, and perceptual similarity losses. We evaluate our approach on benchmark datasets such as DIV2K and Flickr2K using only HR images for training. Experimental results demonstrate that our method achieves competitive performance in terms of PSNR and SSIM while significantly outperforming conventional models in texture preservation and visual fidelity. The framework proves effective and generalizable, particularly for real-world deployment where paired data is limited or unavailable.
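To make the training objective described in the abstract concrete, the sketch below shows how its three ingredients (pseudo-pair generation, cycle consistency, and a perceptual similarity loss) could combine into a single loss. This is a minimal PyTorch illustration under stated assumptions, not the paper's implementation: `sr_net`, `degrade_net`, the bicubic downsampler, the VGG feature depth, and the loss weights are all placeholders introduced here for exposition.

```python
# Minimal sketch of a self-supervised SR objective trained from HR images
# alone. All model names and loss weights below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

class PerceptualLoss(nn.Module):
    """L1 distance between frozen early-layer VGG-19 feature maps."""
    def __init__(self):
        super().__init__()
        feats = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features[:16]
        for p in feats.parameters():
            p.requires_grad_(False)  # fixed feature extractor, not trained
        self.feats = feats.eval()

    def forward(self, x, y):
        return F.l1_loss(self.feats(x), self.feats(y))

def self_supervised_losses(sr_net, degrade_net, perceptual, hr_batch, scale=4):
    """Losses for one training step, computed from unpaired HR images.

    sr_net:      hypothetical SR model mapping LR -> HR (x`scale` upscaling)
    degrade_net: hypothetical degradation model mapping HR -> LR
    hr_batch:    HR images, shape (N, 3, H, W)
    """
    # Pseudo-pair generation: synthesize an LR counterpart for each HR image
    # (bicubic downsampling stands in for the degradation model used here).
    lr_pseudo = F.interpolate(hr_batch, scale_factor=1.0 / scale,
                              mode="bicubic", align_corners=False)

    # Pixel-level reconstruction on the synthesized pseudo-pair.
    sr = sr_net(lr_pseudo)
    loss_pix = F.l1_loss(sr, hr_batch)

    # Cycle consistency: degrading the SR output should recover the LR input.
    loss_cycle = F.l1_loss(degrade_net(sr), lr_pseudo)

    # Perceptual similarity between the SR output and the original HR image.
    loss_perc = perceptual(sr, hr_batch)

    # Weights are placeholders, not values reported in the paper.
    return loss_pix + 0.1 * loss_cycle + 0.05 * loss_perc
```

In a full training loop, this combined loss would typically be minimized jointly over `sr_net` and `degrade_net` with a standard optimizer such as Adam, with only HR images supplied to the data loader.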
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IJISAE open access articles are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. This license lets readers share and adapt the material provided they give appropriate credit, provide a link to the license, and indicate if changes were made; if they remix, transform, or build upon the material, they must distribute their contributions under the same license as the original.