MoViNet-A2 for Vietnamese sign language recognition

Authors

  • Trương Duy Việt
  • Ngô Hữu Gia Huy
  • Phạm Đăng Khôi
  • Nguyễn Trần Thiên Phúc

Keywords:

Abstract

Sign language recognition from video is an essential task for supporting communication in the hearing-impaired community. However, the diversity of gestures, differing camera angles, and varying environmental conditions pose significant challenges for traditional recognition systems. In this study, we propose a Vietnamese sign language recognition method based on MoViNet-A2, a model optimized for video action recognition on mobile devices. The dataset covers 98 words or phrases performed by 18 students from Lam Dong - Da Lat School for the Disabled, for a total of 4,709 videos recorded from three camera angles to diversify the training data. MoViNet-A2, pre-trained on the Kinetics-600 dataset, serves as the backbone. It is combined with preprocessing techniques (class balancing, brightness normalization, and data augmentation) to improve model generalization. Our method achieves a top-1 accuracy of 88.55%. Experimental results show that the proposed method classifies sign language gestures with high accuracy while supporting real-time processing on mobile devices. This research not only improves the accuracy of sign language recognition systems but also opens up practical applications that facilitate communication for the hearing-impaired community.
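
The abstract names class balancing, brightness normalization, and top-1 accuracy without saying how they are computed. The sketch below shows one common way each could be done, in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the inverse-frequency weighting (the same heuristic as scikit-learn's "balanced" class weights), and the mean-scaling brightness rule are all hypothetical choices.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed balancing scheme: weight = total / (num_classes * class_count).

    Rare classes receive larger weights, so they contribute more to the loss.
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {cls: total / (num_classes * n) for cls, n in counts.items()}


def normalize_brightness(frame, target_mean=0.5):
    """Assumed normalization: scale a grayscale frame to a target mean brightness.

    `frame` is a list of rows of floats in [0, 1]; results are clipped to [0, 1].
    """
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    if mean == 0:
        # An all-black frame cannot be rescaled; return a copy unchanged.
        return [list(row) for row in frame]
    gain = target_mean / mean
    return [[min(1.0, max(0.0, p * gain)) for p in row] for row in frame]


def top1_accuracy(scores, targets):
    """Fraction of samples whose highest-scoring class index equals the target."""
    if not targets:
        raise ValueError("targets must be non-empty")
    correct = 0
    for row, target in zip(scores, targets):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == target)
    return correct / len(targets)
```

With per-class scores for each video, `top1_accuracy(scores, targets)` returns a fraction in [0, 1]; the reported 88.55% would correspond to a value of 0.8855 on the authors' test set.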

Author Biographies

  • Trương Duy Việt
    Dalat College, Dalat City
  • Ngô Hữu Gia Huy
    Dalat College, Dalat City
  • Phạm Đăng Khôi
    Dalat College, Dalat City
  • Nguyễn Trần Thiên Phúc
    Dalat College, Dalat City

Published

2025-07-19

Issue

Section

Articles