Printed Kannada Character Segmentation.
OBJECTIVE
The main objective of “Printed Kannada Character Segmentation” is to segment the printed Kannada characters found in textbooks, official documents, files, newspapers and other historical records that are widely used in the state of Karnataka. Manual data entry of printed Kannada text is difficult and time-consuming, and it requires considerable manpower. The idea behind our project is therefore to convert printed Kannada text into an editable file easily by adopting an OCR (Optical Character Recognition) mechanism. Character segmentation is the module that forms the initial stage of printed character recognition.
Kannada is a widely spoken language of South India and is native to Karnataka, where Kannadigas speak, read and write it. Most government offices, such as Municipal, City Corporation, Taluk and Zilla Panchayath offices, struggle to maintain their important old printed documents in files and on racks; converting these into digitized files is our motto. Character segmentation in Kannada text is a difficult task because adjacent characters in a Kannada word overlap in the vertical projection profile, owing to the presence of bottom-extension characters (Vatthus). The usual projection-based segmentation is therefore not effective on its own. This problem is solved by using a connected component algorithm, and the remaining characters are segmented using the vertical projection profile.
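The overlap problem described above can be illustrated with a minimal sketch. It is not the project's implementation: the tiny bitmap WORD, the function names, and the choice of 8-connectivity are all assumptions made for illustration, and the toy Vatthu is assumed not to touch the base characters. The sketch shows that the vertical projection profile finds no empty column between two characters when a bottom extension bridges the gap, whereas connected component labelling still isolates each piece.

```python
from collections import deque

# Hypothetical 6x7 binary bitmap (1 = ink): two base characters in rows 0-2,
# and a Vatthu-like bottom extension in rows 4-5 that spans the gap column.
WORD = [
    [1, 1, 1, 0, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
    [1, 1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
]

def vertical_projection(img):
    """Count ink pixels in each column of a binary image (list of rows)."""
    return [sum(col) for col in zip(*img)]

def split_by_projection(img):
    """Return (start, end) column ranges separated by empty columns (end exclusive)."""
    profile = vertical_projection(img)
    segments, start = [], None
    for x, count in enumerate(profile):
        if count and start is None:
            start = x                      # a run of inked columns begins
        elif not count and start is not None:
            segments.append((start, x))    # an empty column closes the run
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments

def connected_components(img):
    """Return bounding boxes (top, left, bottom, right), inclusive, of 8-connected ink regions."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not img[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited ink pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes

if __name__ == "__main__":
    print(split_by_projection(WORD))
    print(connected_components(WORD))
```

On this toy bitmap the projection split returns a single segment covering all seven columns, because the bottom extension fills the gap column, while the connected component pass returns three separate bounding boxes: the two base characters and the extension. In a real page the boxes would still have to be grouped back into characters, which this sketch does not attempt.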
USERS OF THE SYSTEM
• Government staff
• Confidential Assistants
• Other Personal Users
Note: Madam, we are doing only one part of the project, so how can we specify the end users?
FUNCTIONAL REQUIREMENTS
• A conversion mechanism that turns a scanned copy of a printed document into a format that can be edited in a Kannada-compatible editor.
• The system is developed as a component module of the complete Optical Character Recognition (OCR) system for Kannada.
• Segmentation is a vital component of OCR, since it largely determines the accuracy of character recognition.
• A preprocessor is provided for making the scanned document