Deep Learning Approach for Combined Indian Sign Language Recognition and Video Generation Model

Authors

  • Prachi Pramod Waghmare, Ashwini Mangesh Deshpande, Siddhi Dubewar, Tanuja Dhaybar

Keywords:

Indian Sign Language, Sign Language Recognition, Transfer learning, Video Generation, GAN

Abstract

To reduce the communication barriers faced by deaf people, this research presents a system that uses deep learning models to recognize hand positions in Indian Sign Language (ISL). Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), VGG-16, and ResNet architectures are evaluated on a dedicated dataset of 73 ISL videos, and the proposed CNN model attains 98% accuracy, a notable step toward more inclusive communication for people with speech and hearing impairments. We also investigate and propose a combination of CNN and Generative Adversarial Networks (GANs) for text-to-video generation. The generated videos reach a PSNR of 31.14 dB and an SSIM of 0.9916, indicating high fidelity and minimal distortion and suggesting that the GAN-CNN model preserves fine video detail.
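The two quality metrics quoted in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' evaluation code: the function names `psnr` and `global_ssim`, the 8-bit peak value of 255, and the random test frames are all assumptions made for illustration. The SSIM here is a simplified single-window variant computed over the whole frame, whereas SSIM is normally averaged over local windows, so its values will not match a standard library implementation exactly.

```python
import numpy as np


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped frames."""
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    mse = np.mean((ref - gen) ** 2)
    if mse == 0:
        return float("inf")  # identical frames have no error
    return 10.0 * np.log10(max_value ** 2 / mse)


def global_ssim(reference, generated, max_value=255.0):
    """Simplified SSIM using one window that spans the whole frame."""
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    # Stabilising constants with the commonly used factors 0.01 and 0.03.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mu_r, mu_g = ref.mean(), gen.mean()
    var_r, var_g = ref.var(), gen.var()
    cov = np.mean((ref - mu_r) * (gen - mu_g))
    numerator = (2 * mu_r * mu_g + c1) * (2 * cov + c2)
    denominator = (mu_r ** 2 + mu_g ** 2 + c1) * (var_r + var_g + c2)
    return numerator / denominator


# Hypothetical frames: a random grayscale "real" frame and a noisy copy
# standing in for a generated frame.
rng = np.random.default_rng(0)
real_frame = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
fake_frame = np.clip(real_frame + rng.normal(0.0, 5.0, size=real_frame.shape), 0, 255)

print(f"PSNR: {psnr(real_frame, fake_frame):.2f} dB")
print(f"SSIM: {global_ssim(real_frame, fake_frame):.4f}")
```

For a video, one plausible approach is to compute both metrics per frame and average them over the clip. Higher PSNR and an SSIM closer to 1 both indicate that the generated frame is closer to the reference.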

Published

12.06.2024

How to Cite

Prachi Pramod Waghmare. (2024). Deep Learning Approach for Combined Indian Sign Language Recognition and Video Generation Model. International Journal of Intelligent Systems and Applications in Engineering, 12(4), 3296–3302. Retrieved from https://ijisae.org/index.php/IJISAE/article/view/6824

Section

Research Article