An Efficient Phase-Based Binarization Method for Degraded Historical Documents

Alaa Sulaiman, Khairuddin Omar, Mohammad F. Nasrudin

Abstract


Document image binarization is the first essential step in digitizing document images and is a core technique in both document image analysis and optical character recognition. Binarization produces a binary image from the original image, and the binary image is the appropriate representation for segmentation, recognition, and restoration; several studies stress that the subsequent stages of document image analysis depend on the binarization result. However, old and historical document images mainly suffer from several types of degradation, such as bleed-through, blur, and uneven illumination, which make binarization a difficult task. Extracting the foreground from a degraded background therefore depends on the kind of degradation, as well as on the type of paper used and the age of the document. Improved binarization methods are needed to reduce the impact of degradation in the document background. To address this difficulty, this paper proposes an effective, enhanced binarization technique for degraded and historical document images. The proposed method enhances an existing binarization method by modifying its parameters and adding a post-processing stage, thus improving the resulting binary images. The proposed technique is also robust, as it requires no parameter tuning. Evaluated on the Document Image Binarization Contest (DIBCO) datasets, the proposed method shows promising performance, producing better results than those obtained by some of the DIBCO winners.
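As a point of reference for what the binarization step produces, the following minimal sketch applies Otsu's global threshold, a classical baseline, to a tiny grayscale image. It is not the method proposed in this paper; the function names `otsu_threshold` and `binarize`, the sample pixel values, and the convention of writing text as 0 and background as 255 are assumptions made for illustration only.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that maximises between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    n_low = 0      # number of pixels with value <= t
    sum_low = 0    # sum of grey levels of those pixels
    for t in range(256):
        n_low += hist[t]
        sum_low += t * hist[t]
        if n_low == 0:
            continue           # no pixels in the dark class yet
        n_high = total - n_low
        if n_high == 0:
            break              # every pixel is already in the dark class
        mean_low = sum_low / n_low
        mean_high = (sum_all - sum_low) / n_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        var = n_low * n_high * (mean_low - mean_high) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Map a grayscale image (list of rows) to 0 (text) / 255 (background)."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[0 if p <= t else 255 for p in row] for row in image]


# Hypothetical 2x3 image: bright paper with two dark ink pixels.
sample = [[200, 210, 40],
          [205, 35, 220]]
binary = binarize(sample)
```

A single global threshold of this kind works when ink and paper are well separated, but it tends to fail under bleed-through and uneven illumination, which is the situation the abstract describes.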


Keywords


document image binarization; document image binarization contest (DIBCO); HOWE binarization.


DOI: http://dx.doi.org/10.18517/ijaseit.9.6.7774


Published by INSIGHT - Indonesian Society for Knowledge and Human Development