Journal of Imaging

15 pages, 6555 KB

Open AccessArticle

Video-Based Sign Language Recognition via ResNet and LSTM Network

by Jiayu Huang and Varin Chouvatut

J. Imaging 2024, 10(6), 149; https://doi.org/10.3390/jimaging10060149 - 20 Jun 2024

Cited by 3 | Viewed by 4791

Sign language recognition technology can help people with hearing impairments to communicate with non-hearing-impaired people. At present, with the rapid development of society, deep learning also provides certain technical support for sign language recognition work. In sign language recognition tasks, traditional convolutional neural [...] Read more.

Sign language recognition technology can help people with hearing impairments to communicate with non-hearing-impaired people. At present, with the rapid development of society, deep learning also provides certain technical support for sign language recognition work. In sign language recognition tasks, traditional convolutional neural networks used to extract spatio-temporal features from sign language videos suffer from insufficient feature extraction, resulting in low recognition rates. Nevertheless, a large number of video-based sign language datasets require a significant amount of computing resources for training while ensuring the generalization of the network, which poses a challenge for recognition. In this paper, we present a video-based sign language recognition method based on Residual Network (ResNet) and Long Short-Term Memory (LSTM). As the number of network layers increases, the ResNet network can effectively solve the granularity explosion problem and obtain better time series features. We use the ResNet convolutional network as the backbone model. LSTM utilizes the concept of gates to control unit states and update the output feature values of sequences. ResNet extracts the sign language features. Then, the learned feature space is used as the input of the LSTM network to obtain long sequence features. It can effectively extract the spatio-temporal features in sign language videos and improve the recognition rate of sign language actions. An extensive experimental evaluation demonstrates the effectiveness and superior performance of the proposed method, with an accuracy of 85.26%, F1-score of 84.98%, and precision of 87.77% on Argentine Sign Language (LSA64). Full article

(This article belongs to the Special Issue Recent Trends in Computer Vision with Neural Networks)

► Show Figures

Figure 1

12 pages, 7865 KB

Open AccessArticle

Weakly Supervised SVM-Enhanced SAM Pipeline for Stone-by-Stone Segmentation of the Masonry of the Loire Valley Castles

by Stuardo Lucho, Sylvie Treuillet, Xavier Desquesnes, Remy Leconge and Xavier Brunetaud

J. Imaging 2024, 10(6), 148; https://doi.org/10.3390/jimaging10060148 - 19 Jun 2024

Viewed by 1648

Journal Menu

Journal Browser

J. Imaging, Volume 10, Issue 6 (June 2024) – 25 articles

Further Information

Guidelines

MDPI Initiatives

Follow MDPI