System for recognizing speech

A pattern recognition system particularly useful for recognizing speech or handwriting. An input signal is first filtered by a two-stage filter bank in which the output of the first stage is fed forward to the second stage of a significant number of the filters, and the output of the second stage is fed back to the first stage of a significant number of the filters. This feedback enhances the signal-to-noise ratio and resembles the coupling between different sections of the basilar membrane of the cochlea. The output of the filter bank is a two-dimensional frequency-time representation of the original signal. A second set of filters, which takes two-dimensional signals as input, detects the presence of elementary tonotopic features such as the onset, rise, fall, and frequency of any significant tones in a speech signal. A third set of filters detects contrasts in these elementary features at various levels of resolution. After this filtering, a neural network learns the patterns formed by the multi-resolution contrasts in the identified features, so that the system recognizes symbols from an input signal that is continuous in time. In the case of speech, the system recognizes continuous speech in a speaker-independent manner and is tolerant of noise.
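
As an illustration only, the feature-detection and contrast steps described above can be sketched in a few lines of Python. The abstract does not give the filter equations, so everything here is an assumption: the function names `temporal_features` and `multires_contrast`, the fixed energy threshold, and the use of frame-to-frame differences and block averages as simple stand-ins for the second and third sets of filters. The feedback filter bank and the neural network are not modeled.

```python
def temporal_features(tf_map, threshold=0.5):
    """Label each (channel, frame) cell as 'onset', 'rise', 'fall', or None.

    tf_map is a frequency-time map: a list of channels, each a list of
    energies over time. The threshold is a hypothetical audibility level.
    """
    labels = []
    for channel in tf_map:
        row = [None]  # the first frame has no predecessor to compare with
        for prev, cur in zip(channel, channel[1:]):
            if prev < threshold <= cur:
                row.append("onset")  # energy crosses the threshold upward
            elif cur > prev and cur >= threshold:
                row.append("rise")   # tone already present and still growing
            elif cur < prev and prev >= threshold:
                row.append("fall")   # tone that was present is losing energy
            else:
                row.append(None)
        labels.append(row)
    return labels


def multires_contrast(values, scales=(1, 2, 4)):
    """Return, per scale, the differences between neighbouring block means.

    Each scale s averages non-overlapping blocks of s samples, so larger
    scales expose coarser contrasts in the same feature sequence.
    """
    contrasts = {}
    for s in scales:
        means = [sum(values[i:i + s]) / s
                 for i in range(0, len(values) - s + 1, s)]
        contrasts[s] = [b - a for a, b in zip(means, means[1:])]
    return contrasts


if __name__ == "__main__":
    # One channel whose energy appears, grows, holds, then decays.
    print(temporal_features([[0.0, 0.6, 0.9, 0.9, 0.4, 0.1]]))
    # Contrasts of a toy feature sequence at three resolutions.
    print(multires_contrast([0, 0, 4, 4, 0, 0, 2, 2]))
```

In this toy version, the contrast values at each scale are the kind of multi-resolution quantities that would be passed on to a learned classifier; a real system would replace both functions with properly designed two-dimensional filters.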

U.S. Patent 5377302, 1994