Indian Language Benchmark Portal

1 results

Please Login/Register to submit the new Resources

OCR in Bangla: an Indo-Bangladeshi Language
U. Pal ; B.B. Chaudhuri

In this paper a complete OCR system is described for documents of single Bangla (Bengali) font. The character shapes are recognized by a combination of template and feature matching approach. Images digitized by flatbed scanner are subjected to skew correction, line, word and character segmentation, simple and compound character separation, feature extraction and finally character recognition. A feature based tree classifier is used for simple character recognition. Preprocessing like thinning and skeletonization is not necessary in our scheme and hence the system is quite fast. At present, the system has an accuracy of about 96%. Also, some character occurrence statistics have been computed to model an error detection and correction technique in the near future.

Filter by Author
P. D. Gujrati (8)
Manish Shrivastava (7)
Partha Pratim Roy (5)
Umapada Pal (5)
Ayan Kumar Bhunia (4)
Iti Mathur (4)