ABSTRACT
We describe an implemented system that automatically generates and animates conversations between multiple human-like agents, with appropriate and synchronized speech, intonation, facial expressions, and hand gestures. The conversation is created by a dialogue planner that produces both the text and the intonation of each utterance. The speaker/listener relationship, the text, and the intonation in turn drive the generators for facial expression, lip motion, eye gaze, head motion, and arm gesture. Coordinated arm, wrist, and hand motions are invoked to create semantically meaningful gestures. Throughout, we use examples from an actual synthesized, fully animated conversation.
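The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the system's actual implementation: the class names (`Word`, `Utterance`, `Event`), the function names, and both rules (a beat gesture on every accented word, a listener glance at the start of the utterance) are hypothetical stand-ins chosen only to show how planner output (text plus intonation) and the speaker/listener relationship might feed several independent generators whose output shares one timeline.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Word:
    text: str
    start: float    # seconds from the start of the utterance (assumed timing)
    accented: bool  # True if the planner placed a pitch accent on this word


@dataclass
class Utterance:
    speaker: str
    listener: str
    words: List[Word]


@dataclass
class Event:
    time: float
    agent: str
    channel: str  # e.g. "gesture" or "gaze"
    action: str


# A generator maps one planned utterance to timed animation events.
Generator = Callable[[Utterance], List[Event]]


def gesture_generator(utt: Utterance) -> List[Event]:
    # Hypothetical rule: the speaker makes a beat gesture on each accented word.
    return [
        Event(w.start, utt.speaker, "gesture", "beat")
        for w in utt.words
        if w.accented
    ]


def gaze_generator(utt: Utterance) -> List[Event]:
    # Hypothetical rule: the listener looks at the speaker as the utterance begins.
    if not utt.words:
        return []
    return [Event(utt.words[0].start, utt.listener, "gaze", "look_at:" + utt.speaker)]


def animate(utt: Utterance, generators: List[Generator]) -> List[Event]:
    # Run every generator on the same utterance and merge the results
    # into a single time-ordered event list, so all channels stay synchronized.
    events = [event for generate in generators for event in generate(utt)]
    return sorted(events, key=lambda event: event.time)


if __name__ == "__main__":
    utterance = Utterance(
        speaker="A",
        listener="B",
        words=[Word("I", 0.0, False), Word("need", 0.2, True), Word("cash", 0.5, True)],
    )
    for event in animate(utterance, [gesture_generator, gaze_generator]):
        print(event)
```

Keeping each generator as a separate function over the same utterance mirrors the abstract's description of one planner output driving several generators; synchronization here comes only from sharing the word timings.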
Animated conversation: rule-based generation of facial expression, gesture & spoken intonation for multiple conversational agents