ABSTRACT
The Behavior Expression Animation Toolkit (BEAT) allows animators to input typed text that they wish to be spoken by an animated human figure, and to obtain as output appropriate and synchronized nonverbal behaviors and synthesized speech in a form that can be sent to a number of different animation systems. The nonverbal behaviors are assigned on the basis of actual linguistic and contextual analysis of the typed text, relying on rules derived from extensive research into human conversational behavior. The toolkit is extensible, so that new rules can be quickly added. It is designed to plug into larger systems that may also assign personality profiles, motion characteristics, scene constraints, or the animation styles of particular animators.
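To make the idea of rule-based behavior assignment concrete, the sketch below shows a toy, hypothetical pass in Python that maps words in a typed utterance to candidate nonverbal behaviors. It is not BEAT's architecture or API; the rule names, patterns, and function are illustrative assumptions only, standing in for rules derived from research on human conversational behavior.

```python
# Hypothetical illustration (not BEAT's actual API): a minimal rule-based
# pass that tags words in typed text with candidate nonverbal behaviors.

import re

# Toy rule set: lexical cues -> suggested behaviors. A real system would
# derive such rules from studies of human conversational behavior.
RULES = [
    (re.compile(r"\b(this|that|here|there)\b", re.I), "deictic_gesture"),
    (re.compile(r"\b(really|very|huge|tiny)\b", re.I), "beat_gesture"),
    (re.compile(r"\?$"), "eyebrow_raise"),
]

def suggest_behaviors(utterance: str):
    """Return (word, [behaviors]) pairs for one typed utterance."""
    tagged = []
    for word in utterance.split():
        behaviors = [name for pattern, name in RULES if pattern.search(word)]
        tagged.append((word, behaviors))
    return tagged

if __name__ == "__main__":
    for word, behaviors in suggest_behaviors("Look at that huge dinosaur over there?"):
        print(f"{word:12s} {behaviors}")
```

In a full toolkit the tagged output would then be synchronized with synthesized speech timing and translated into commands for a target animation system; this sketch only illustrates the rule-matching step.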