
Static and Dynamic Isolated Indian and Russian Sign Language Recognition with Spatial and Temporal Feature Detection Using Hybrid Neural Network

Published: 25 November 2022

Abstract

Sign language recognition systems aim to recognize the sign languages used by people with hearing and speech impairments. Interpreting isolated signs from static and dynamic gestures is a difficult problem in machine vision. Rapid hand movement, facial expression, illumination changes, signer variation, and background complexity are among the most serious challenges. Although deep learning-based models account for the field's state-of-the-art results, these issues have not been fully addressed. To overcome them, we propose a hybrid neural network architecture for recognizing isolated Indian and Russian Sign Language. For static gesture recognition, the proposed framework uses a 3D convolutional network with an atrous convolution mechanism for spatial feature extraction. For dynamic gesture recognition, the framework integrates semantic spatial multi-cue feature detection and extraction with temporal-sequential feature extraction. The semantic spatial multi-cue module generates feature maps for the full frame, pose, face, and hands; Grad-CAM and the CamShift algorithm are used for face and hand detection. The temporal-sequential module consists of a modified autoencoder with a GELU activation function for abstract high-level feature extraction, together with a hybrid attention layer that combines segmentation and spatial attention mechanisms. The work also involves creating a novel multi-signer dataset of single- and double-handed isolated signs for Indian and Russian Sign Language. Experiments were conducted on this dataset. The accuracy obtained was 99.76% for static isolated sign recognition and 99.85% for dynamic isolated sign recognition. We also compared the proposed work with baseline models on benchmark datasets, where it achieved higher accuracy.
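
Two building blocks named in the abstract, atrous (dilated) convolution and the GELU activation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `atrous_conv1d` and `gelu` are hypothetical, the convolution is reduced to one dimension with stride 1 and no padding (the paper describes a 3D network), and the kernel values are arbitrary. The sketch only shows how dilation spaces out the kernel taps and how GELU weights an input by the standard normal CDF.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def atrous_conv1d(signal, kernel, dilation=1):
    """1D atrous (dilated) cross-correlation, stride 1, no padding.

    Illustrative simplification: the abstract describes a 3D network,
    but the dilation idea is the same along each axis.
    """
    # Effective receptive field of a dilated kernel: (k - 1) * d + 1.
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are placed `dilation` samples apart.
        out.append(sum(w * signal[start + j * dilation]
                       for j, w in enumerate(kernel)))
    return out


if __name__ == "__main__":
    # A 2-tap kernel with dilation 2 covers 3 input samples per output.
    features = atrous_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2)
    print(features)                     # [4, 6, 8]
    print([gelu(v - 6.0) for v in features])
```

With dilation 2, a two-tap kernel spans three input samples, so the receptive field grows without adding weights. That is the usual motivation for atrous convolution in spatial feature extraction.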


    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 1
      January 2023
      340 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3572718

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 25 November 2022
      • Online AM: 20 April 2022
      • Accepted: 8 April 2022
      • Revised: 23 March 2022
      • Received: 17 January 2022
      Qualifiers

      • research-article
      • Refereed
