TY - GEN
T1 - A novel automatic lip reading method based on polynomial fitting
AU - Li, Meng
AU - CHEUNG, Yiu Ming
N1 - Copyright:
Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2010
Y1 - 2010
N2 - This paper addresses the problem of isolate number recognition using visual information only. We utilize the intensity transformation and spatial filter to estimate the minimum enclosing rectangle of mouth in each frame. For each utterance, we obtain the two vectors composed of width and height of mouth, respectively. Then, we present a method to recognize the speech based on the polynomial fitting. Firstly, both width and height vectors are normalized and arranged into the constant length via interpolation. Secondly, least square method is utilized to produce two 3-order polynomials that can represent the main trend of the two vectors, respectively, and reduce the noise caused by the estimate error. Lastly, the positions of three crucial points (i.e. maximum, minimum, and right boundary point) in each 3-order polynomial curve are formed as a feature vector. For each utterance, we calculate the average of all vectors of training data to make a template, and utilize Euclidean distance between the template and testing data to perform the classification. Experiments show the promising results of the proposed approach in comparison with the existing methods.
AB - This paper addresses the problem of isolate number recognition using visual information only. We utilize the intensity transformation and spatial filter to estimate the minimum enclosing rectangle of mouth in each frame. For each utterance, we obtain the two vectors composed of width and height of mouth, respectively. Then, we present a method to recognize the speech based on the polynomial fitting. Firstly, both width and height vectors are normalized and arranged into the constant length via interpolation. Secondly, least square method is utilized to produce two 3-order polynomials that can represent the main trend of the two vectors, respectively, and reduce the noise caused by the estimate error. Lastly, the positions of three crucial points (i.e. maximum, minimum, and right boundary point) in each 3-order polynomial curve are formed as a feature vector. For each utterance, we calculate the average of all vectors of training data to make a template, and utilize Euclidean distance between the template and testing data to perform the classification. Experiments show the promising results of the proposed approach in comparison with the existing methods.
UR - http://www.scopus.com/inward/record.url?scp=78649851371&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-15470-6_31
DO - 10.1007/978-3-642-15470-6_31
M3 - Conference proceeding
AN - SCOPUS:78649851371
SN - 3642154697
SN - 9783642154690
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 296
EP - 305
BT - Active Media Technology - 6th International Conference, AMT 2010, Proceedings
T2 - 2010 6th International Conference on Active Media Technology, AMT 2010
Y2 - 28 August 2010 through 30 August 2010
ER -