ABSTRACT
Computer-generated speech animation is common in video games and movies. Although high-quality facial motion can be created by the handcrafted work of skilled artists, this approach is not always feasible because of time and cost constraints. Data-driven approaches, such as machine learning that concatenates video segments of speech training data [Taylor et al. 2012], have been used to generate natural speech animation, but they often require a large number of target shapes for synthesis. Smooth mouth motions can be obtained from prepared lip shapes for typical vowels by interpolating the lip shapes with Gaussian mixture models (GMMs) [Yano et al. 2007]. However, the resulting animation is not generated directly from the measured lip motions of someone's actual speech.
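To make the GMM-based interpolation concrete, the sketch below blends a set of prepared vowel lip shapes using posterior responsibilities of an acoustic frame under one Gaussian per vowel. This is only an illustrative assumption about how such interpolation could work, not the implementation of [Yano et al. 2007]; the vowel inventory, feature dimension, and all parameter values are placeholders.

```python
# Minimal sketch (assumed setup, not the authors' method): blend prepared vowel
# lip shapes with weights given by GMM posterior responsibilities.
import numpy as np

VOWELS = ["a", "i", "u", "e", "o"]   # assumed vowel inventory
DIM = 12                             # assumed acoustic feature dimension (e.g., MFCC-like)

rng = np.random.default_rng(0)
# One Gaussian component per vowel: mean and diagonal variance in feature space
# (placeholders; in practice these would be trained from speech data).
means = {v: rng.normal(size=DIM) for v in VOWELS}
variances = {v: np.ones(DIM) for v in VOWELS}
# Prepared lip shapes: one vertex array (N x 3) per vowel, here random placeholders.
lip_shapes = {v: rng.normal(size=(40, 3)) for v in VOWELS}

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def blend_lip_shape(feature_frame):
    """Interpolate vowel lip shapes weighted by posterior responsibilities."""
    log_probs = np.array([log_gaussian(feature_frame, means[v], variances[v])
                          for v in VOWELS])
    weights = np.exp(log_probs - log_probs.max())
    weights /= weights.sum()          # posterior over vowels (uniform prior assumed)
    return sum(w * lip_shapes[v] for w, v in zip(weights, VOWELS))

# Example: produce one blended lip shape for a single acoustic frame.
frame = rng.normal(size=DIM)
blended = blend_lip_shape(frame)
print(blended.shape)                  # (40, 3)
```

In a real pipeline the Gaussian parameters would be estimated from speech features and the vowel lip shapes would be authored or captured in advance; evaluating the posterior weights per frame then yields a smoothly varying mouth shape over time.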
REFERENCES
- Taylor, S. L., et al. 2012. Dynamic Units of Visual Speech. In Proc. ACM SCA 2012, 275--284.
- Yano, A., et al. 2007. Variable Rate Speech Animation Synthesis. In Proc. ACM SIGGRAPH 2007 Posters, no. 18.
- Lee, A., et al. 2009. Recent Development of Open-source Speech Recognition Engine Julius. In Proc. APSIPA ASC 2009, 131--137.