Research and Development of Feature Extraction from Myanmar Palm Leaf Manuscripts for the Myanmar Character Recognition System

Nwe Nwe Soe, Win Htay

Abstract


This paper proposes an OCR system for handwritten Myanmar palm-leaf manuscripts. Each text area in a manuscript is segmented, and the segmented character images must then be recognized as Myanmar handwritten characters, which record precious and invaluable historical information. The work involves two essential steps: preprocessing and feature extraction. Preprocessing automatically extracts the palm-leaf region of interest from images taken by a camera and supplies enhanced images to the subsequent stages of Myanmar character recognition. A one-dimensional segmentation approach crops the leaf area from the high-resolution image. Line count analysis is also performed to extract a region that contains enough text lines. Line segmentation is then carried out using an Object Frequency Histogram along the horizontal direction, which finds the best cut points between lines. The same technique, applied vertically, isolates each character or smallest group of characters. In total, 18 features are extracted to recognize the Myanmar palm-leaf manuscript characters. Although the experimental results are good, some difficulties related to connected components remain to be addressed.


Keywords


Myanmar palm leaf manuscripts; feature extraction; OCR; line segmentation; character segmentation.

DOI: http://dx.doi.org/10.18517/ijaseit.9.6.9001

Published by INSIGHT - Indonesian Society for Knowledge and Human Development