Multiple Deep CNN models for Indian Sign Language translation for Person with Verbal Impairment



Hand Gesture Recognition, segmentation, Indian Sign Language, convolutional neural networks


Understanding sign language is valuable for people with verbal and hearing impairments. Sign language is a form of nonverbal communication used by people with impaired speech or hearing. Automatic translation of sign gestures into text has gained increasing attention in recent years. In this work, a deep CNN-based approach is proposed for efficiently translating the 35 signs of the Indian Sign Language (ISL) alphabet into text using hand kinematics. A Convolutional Neural Network (CNN) with a custom number of convolution layers and a suitable optimizer is applied. CNNs cope with varying lighting conditions and are effective for computer vision problems; they learn features during training without manual pre-processing. The proposed approach achieves 92.85% accuracy on the ISL dataset, and its accuracy is compared with that of transfer learning models such as DenseNet201 and ResNet50.
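To make the idea of a CNN with a custom number of convolution layers more concrete, the following is a minimal sketch of how a small stack of convolution blocks maps an input image to 35 class scores. It only computes layer output shapes and trainable parameter counts. The input size (64x64 RGB), the three blocks, their channel widths, and the names `ConvBlock` and `describe_cnn` are assumptions chosen for illustration; they are not the architecture reported by the authors.

```python
from dataclasses import dataclass

NUM_CLASSES = 35  # from the abstract: 35 ISL signs


@dataclass
class ConvBlock:
    """One 'same'-padded convolution followed by max pooling (hypothetical block)."""
    out_channels: int
    kernel: int = 3
    pool: int = 2


def describe_cnn(height, width, channels, blocks, num_classes=NUM_CLASSES):
    """Return (per-layer output shapes, total trainable parameters)."""
    shapes, params = [], 0
    for block in blocks:
        # Convolution weights plus one bias per output channel.
        params += block.kernel * block.kernel * channels * block.out_channels
        params += block.out_channels
        # 'same' padding keeps height and width; pooling divides them (floor).
        height, width = height // block.pool, width // block.pool
        channels = block.out_channels
        shapes.append((height, width, channels))
    flat = height * width * channels
    # Final dense layer maps the flattened features to class scores.
    params += flat * num_classes + num_classes
    shapes.append((num_classes,))
    return shapes, params


# Assumed configuration: 64x64 RGB input, three convolution blocks.
blocks = [ConvBlock(32), ConvBlock(64), ConvBlock(128)]
shapes, total_params = describe_cnn(64, 64, 3, blocks)
print(shapes)
print(total_params)
```

Adding or removing entries in `blocks` shows how the depth of the network changes the size of the final feature map and the parameter budget, which is the kind of trade-off a custom architecture has to balance against large pretrained models such as DenseNet201 and ResNet50.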





ISL Dataset Sample Images




How to Cite

K. T and D. Kumar.V, “Multiple Deep CNN models for Indian Sign Language translation for Person with Verbal Impairment”, Int J Intell Syst Appl Eng, vol. 10, no. 3, pp. 382–389, Oct. 2022.



Research Article