Multi-Classifier Jawi Handwritten Sub-Word Recognition

Anton Heryanto Hasan, Khairuddin Omar, Muhammad Faidzul Nasrudin

Abstract


Jawi handwriting recognition inherits its problems and challenges from Arabic script: cursive writing, a large variety of writing styles arising from its rich morphology, ligatures, overlapping characters, dialects, and the low quality of manuscript images. Word segmentation is difficult because words break into sub-words: spaces appear within a word wherever it contains characters that do not connect. The performance of previous Jawi handwriting recognizers is still considered sub-par. There are three main problems with previous approaches. First, the recognizer consists of multiple independent components, so a performance improvement in one component is not shared across the system. Second, feature extraction based on feature engineering works only on specific subsets of the training data and is less capable of handling broader variants in the test data. Finally, the classifier uses implicit segmentation in which the target class is a sub-word drawn from a limited lexicon. This paper proposes a Deep Learning approach to address the first problem: training is conducted end-to-end from input to class output, which enables an improvement in any component to improve overall performance. Second, a Convolutional Network is used for feature learning, optimizing the data representation through end-to-end training of the parameters from raw input data to target class. Finally, a multi-classifier that implicitly segments the sub-word into a sequence of characters is proposed. It consists of one sub-word length classifier and seven character classifiers. This approach is lexicon-free, which addresses the absence of lexicon data. Experiments conducted on a standard Jawi handwritten dataset showed an accuracy of up to 92.20% and suggest that the approach is superior to state-of-the-art methods of Jawi handwriting recognition.
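As a rough illustration of how the outputs of one length classifier and seven character classifiers could be combined into a character sequence, the following Python sketch decodes a single sub-word. It is a sketch under stated assumptions, not the authors' implementation: the function name `decode_subword`, the three-letter placeholder alphabet, and the convention that length class 0 means a one-character sub-word are all hypothetical, and the scores would in practice come from the trained network.

```python
from typing import List, Sequence

MAX_CHARS = 7  # the abstract states seven character classifiers


def argmax(scores: Sequence[float]) -> int:
    """Return the index of the highest score (first index on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def decode_subword(
    length_scores: Sequence[float],
    char_scores: Sequence[Sequence[float]],
    alphabet: Sequence[str],
) -> List[str]:
    """Combine one length head and seven character heads into a sequence.

    Assumption (not from the abstract): length class k means the
    sub-word has k + 1 characters, so lengths 1..7 are representable.
    """
    if len(char_scores) != MAX_CHARS:
        raise ValueError("expected one score vector per character classifier")
    length = argmax(length_scores) + 1
    # Only the first `length` character heads contribute to the output;
    # the remaining heads are ignored for this sub-word.
    return [alphabet[argmax(char_scores[i])] for i in range(length)]


# Hypothetical placeholder alphabet; a real system would list Jawi characters.
ALPHABET = ["alif", "ba", "ta"]
```

Because every character position has its own classifier, no lexicon of whole sub-words is needed at decoding time; the predicted length simply decides how many character heads are read.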


Keywords


Jawi; handwritten recognition; sub-word; end-to-end learning; learning features; convolutional network.

DOI: http://dx.doi.org/10.18517/ijaseit.8.4-2.6959

Published by INSIGHT - Indonesian Society for Knowledge and Human Development