Funding: Swiss National Science Foundation
Duration: 2 years (October 1999 - September 2001)
Principal Investigator: Juergen Luettin
Internal Staff: Alessandro Vinciarelli
Keywords: statistical pattern recognition, computer vision,
optical character recognition, cursive script recognition, segmentation,
shape modelling, hidden Markov models, language modelling.
At present, the capabilities of cursive script recognition (CSR) systems are
far behind the performance of human readers, which prevents their use in
practical applications that demand high robustness and low error rates.
The main objective of this work is to investigate, develop, and test
new CSR methods that could improve on the current state of the art.
More specifically, our key goals are to:
Evaluate and compare existing and new feature extraction techniques and
identify those which are best suited for cursive script modelling.
Apply state-of-the-art hidden Markov modelling techniques widely used in
speech recognition to CSR.
Develop and integrate different levels of knowledge such as context dependency
modelling, orthographic constraints, and syntactical constraints.
Develop a complete system for user-independent, large-vocabulary cursive
script recognition.
Perform extensive evaluations and iterative improvements of the resulting
system.
Our approach to CSR is based on the combination of mathematical methods
from a number of different research domains including computer vision,
statistical pattern recognition, speech recognition, and language modelling.
We believe that such a multi-disciplinary approach holds considerable promise
for advancing research in the field of cursive script recognition. We plan
to address the open research issues as follows:
Apply shape modelling techniques used in computer vision and computer graphics
to character representation and feature extraction.
Determine which word/character units (sub-letters, letters, words, etc.)
are most suitable as basic model units for CSR.
Apply training and recognition methods based on hidden Markov models.
Integrate higher level knowledge sources (lexicon, n-gram language model,
document topic) in the recognition process.