Khmer printed character recognition using attention-based Seq2Seq network

Rina Buoy; Nguonly Taing; Sovisal Chenda; Sokchea Kor

Abstract

This paper presents an end-to-end deep convolutional recurrent neural network solution for the Khmer optical character recognition (OCR) task. The proposed solution uses a sequence-to-sequence (Seq2Seq) architecture with an attention mechanism. The encoder extracts visual features from an input text-line image through a stack of convolutional blocks followed by a layer of gated recurrent units (GRU). The features are encoded as a single context vector and a sequence of hidden states, which are fed to the decoder. The decoder emits one character at a time until it produces a special end-of-sentence (EOS) token. The attention mechanism allows the decoder to adaptively select the relevant parts of the input image when predicting each target character. The Seq2Seq Khmer OCR network is trained on a large collection of computer-generated text-line images rendered in multiple common Khmer fonts. Complex data augmentation is applied to both the training and validation datasets. On a validation set of 6400 augmented images, the proposed model outperforms the state-of-the-art Tesseract OCR engine for the Khmer language, achieving a character error rate (CER) of 0.7% versus 35.9%.
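The character error rate reported above can be illustrated with a minimal sketch. The abstract does not say how the CER is normalized or what counts as one character, so this sketch assumes the common definition: total Levenshtein edit distance divided by total reference length, counted over Unicode code points. The function names `edit_distance` and `character_error_rate` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted per Unicode code point."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed CER: summed edit distance divided by summed reference length."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars


if __name__ == "__main__":
    # Toy example: the recognizer drops the final code point of a Khmer word.
    reference = ["ខ្មែរ"]
    hypothesis = ["ខ្មែ"]
    print(f"CER = {character_error_rate(reference, hypothesis):.3f}")
```

Because Khmer syllables are often built from several code points (a base consonant, subscript markers, vowel signs), counting per code point is only one possible choice. A CER computed over grapheme clusters would give different numbers, so figures are comparable only when the counting unit matches.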

Author biographies

Rina Buoy
Techo Startup Center, Phnom Penh, Cambodia

Nguonly Taing
Techo Startup Center, Phnom Penh, Cambodia

Sovisal Chenda
Techo Startup Center, Phnom Penh, Cambodia

Sokchea Kor
Royal University of Phnom Penh, Phnom Penh, Cambodia

Tạp chí Khoa học Việt Nam Trực tuyến - Vietnam Journals Online