Algorithms for off-line recognition of Chinese characters

Ren, M, 1998. Algorithms for off-line recognition of Chinese characters. PhD, Nottingham Trent University.

10183034.pdf - Published version

Abstract

Computer recognition of Chinese characters is a challenging topic and an important research area. It is relevant to documentation, publications, language translation, and handwriting of Chinese and Japanese Kanji in industry, business, diplomacy and daily life. Typical development of the recognition process focuses on printed, on-line and off-line hand-written characters, using techniques including a two-layer hierarchy, four-corner, radical and whole-character recognition. Although existing recognition methods have achieved some success, the lack of fundamental algorithms for representing the structure of Chinese characters has prevented the recognition of characters within a large vocabulary that have a complicated topological structure embedded within the 2-D pictorial format. The current project develops a new structural representation to remedy the lack of an effective recognition process for such characters. The research also investigates methods of dealing with variable size, position, shape, vagueness and ambiguity of a character. A manually operated key input method for characters, called the 'Cang-Jie' method, is applied as an effective tool for verification of a Chinese character.

A novel method is developed to represent the structure of Chinese characters: a three-layer hierarchy of character-radical-stroke and its process: character-radical-code, which is specially suited for 2-D objects with topological features. The character is deconstructed into radicals according to their shape, position and extraction order. Radicals are classified into 26 categories in terms of their shape structure and meanings. Recognition of a radical yields the code of the category to which it belongs. The chain code method is applied to restructure these category codes into a 1-D chain code. The chain code is verified by matching it to a code database. To further enhance the method, a fuzzy neural network system has been designed and implemented to recognise characters in printed and standard writing, using uncertainty and topology analysis, fuzzy possibilistic reasoning, neocognitron and associative memory neural networks, chain code method and error probability method. A software system has been written using the C programming language and X View function. Test results of the system have been obtained. Improvement of the system to deal with vagueness and ambiguity (two separate characteristics) during recognition has been carried out at several stages and the recognition rate has been increased to 96%.
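The character-radical-code process described above can be illustrated with a minimal sketch. This is not the thesis's implementation (which was written in C): the radical table, the letter codes and the code database below are hypothetical placeholders, loosely modelled on Cang-Jie key letters, and the function names are invented for illustration. It assumes the radicals have already been extracted in order, and shows only the last two steps: joining category codes into a 1-D chain code and verifying it against a database.

```python
# Hypothetical table: extracted radical -> category code (one of up to 26 letters).
# Illustrative only; the thesis's actual category assignments are not given here.
RADICAL_CATEGORY = {"日": "A", "月": "B", "木": "D", "火": "F"}

# Hypothetical code database: 1-D chain code -> character.
CODE_DATABASE = {"AB": "明", "DD": "林", "FF": "炎"}


def chain_code(radicals):
    """Join the category codes of the radicals, in extraction order."""
    return "".join(RADICAL_CATEGORY[r] for r in radicals)


def verify(radicals):
    """Return the matching character, or None if any radical or the code is unknown."""
    try:
        code = chain_code(radicals)
    except KeyError:
        return None  # a radical that falls outside the known categories
    return CODE_DATABASE.get(code)


if __name__ == "__main__":
    print(chain_code(["日", "月"]), verify(["日", "月"]))
```

Because the chain code depends on extraction order, the same radicals in a different order yield a different code, which is why the abstract lists extraction order alongside shape and position.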

The main achievements include the structural representation of Chinese characters, the extraction of radicals, the recognition and verification of characters, and the simplification of the recognition process.

Item Type: Thesis
Creators: Ren, M.
Date: 1998
ISBN: 9781369313260
Identifiers: PQ10183034 (Type: Other)
Rights: This copy of the thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with its author and that no quotation from the thesis and no information derived from it may be published without the author’s prior written consent.
Divisions: Schools > School of Science and Technology
Record created by: Linda Sullivan
Date Added: 28 Aug 2020 14:31
Last Modified: 21 Jun 2023 10:58
URI: https://irep.ntu.ac.uk/id/eprint/40590
