Multipose Attire Fit-In Using Deep Neural Network Frameworks
Keywords:
Deep Learning, Virtual Try-on, Apparel Try-on, Pose Estimation

Abstract
Creating an image-based virtual try-on system that seamlessly fits in-shop clothing onto a reference person across various poses presents a significant challenge. Previous efforts have concentrated on preserving intricate clothing details such as textures, logos, and patterns while transferring a desired garment onto a target person in a fixed pose; however, when these methods are extended to multi-pose scenarios, their performance declines notably. This paper introduces an end-to-end solution, the Multi-Pose Virtual Try-On Network, designed to fit desired clothing onto a reference person in arbitrary poses. The virtual try-on process comprises three sub-modules. First, a Semantic Prediction Module (SPM) generates a semantic map of the person wearing the desired clothing; this predicted map guides the network in locating the target clothing region and yields a coarse try-on image. Second, a warping module deforms the clothing to the desired shape based on the predicted semantic map and the target pose, and a conductible cycle consistency loss is incorporated to address misalignment during warping. Finally, a Try-On Module fuses the coarse result with the warped clothes to produce the final virtual try-on image, preserving intricate details and aligning them with the desired pose. Additionally, a face identity loss refines facial appearance while maintaining the identity of the person in the try-on result. The proposed method is evaluated on a large multi-pose dataset and outperforms state-of-the-art methods. Qualitative and quantitative experiments show that the virtual try-on system is robust to data noise, including changes in background and accessories such as hats and handbags, demonstrating its scalability and effectiveness in real-world scenarios.
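To make the three-stage pipeline concrete, the following is a minimal PyTorch-style sketch of the architecture described above (semantic prediction, clothing warping, and try-on fusion). All module names, channel counts, and the 18-channel pose representation are illustrative assumptions rather than the authors' implementation; the training losses (conductible cycle consistency, face identity) and the GAN components are omitted.

```python
# Hypothetical sketch of the SPM -> warping -> try-on pipeline; not the authors' code.
# Assumes inputs are float tensors with H and W divisible by 4.
import torch
import torch.nn as nn
import torch.nn.functional as F

class UNetBlock(nn.Module):
    """Tiny encoder-decoder used as a stand-in for each sub-module's backbone."""
    def __init__(self, in_ch, out_ch, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, hidden, 3, 2, 1), nn.ReLU(),
                                 nn.Conv2d(hidden, hidden, 3, 2, 1), nn.ReLU())
        self.dec = nn.Sequential(nn.ConvTranspose2d(hidden, hidden, 4, 2, 1), nn.ReLU(),
                                 nn.ConvTranspose2d(hidden, out_ch, 4, 2, 1))
    def forward(self, x):
        return self.dec(self.enc(x))

class SemanticPredictionModule(nn.Module):
    """Stage 1 (SPM): predict the target semantic map from person, clothing, and pose."""
    def __init__(self, num_classes=20):
        super().__init__()
        self.net = UNetBlock(3 + 3 + 18, num_classes)  # person RGB + cloth RGB + 18 pose maps
    def forward(self, person, cloth, pose):
        return self.net(torch.cat([person, cloth, pose], dim=1))  # per-pixel class logits

class ClothWarpingModule(nn.Module):
    """Stage 2: regress a dense sampling offset and warp the in-shop clothing."""
    def __init__(self, num_classes=20):
        super().__init__()
        self.net = UNetBlock(3 + num_classes + 18, 2)  # 2-channel offset field
    def forward(self, cloth, semantic_map, pose):
        offset = torch.tanh(self.net(torch.cat([cloth, semantic_map, pose], dim=1)))
        n = cloth.shape[0]
        identity = torch.eye(2, 3, device=cloth.device).unsqueeze(0).repeat(n, 1, 1)
        base = F.affine_grid(identity, cloth.shape, align_corners=False)
        grid = base + offset.permute(0, 2, 3, 1)       # displace the identity sampling grid
        return F.grid_sample(cloth, grid, align_corners=False)

class TryOnModule(nn.Module):
    """Stage 3: fuse the coarse try-on result with the warped clothing via a learned mask."""
    def __init__(self):
        super().__init__()
        self.net = UNetBlock(3 + 3, 4)                 # RGB refinement + 1-channel composition mask
    def forward(self, coarse, warped_cloth):
        out = self.net(torch.cat([coarse, warped_cloth], dim=1))
        rgb, mask = torch.tanh(out[:, :3]), torch.sigmoid(out[:, 3:4])
        return mask * warped_cloth + (1 - mask) * rgb  # keep warped details where the mask is high
```

In this reading, the SPM output conditions both the warping and fusion stages, so errors in the predicted semantic map propagate downstream; this is why the abstract emphasizes the semantic map as guidance for locating the clothing region before any pixel synthesis takes place.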
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


