ABSTRACT
Natural language is an easy and effective medium for describing visual ideas and mental images. Thus, we foresee the emergence of language-based 3D scene generation systems to let ordinary users quickly create 3D scenes without having to learn special software, acquire artistic skills, or even touch a desktop window-oriented interface. WordsEye is such a system for automatically converting text into representative 3D scenes. WordsEye relies on a large database of 3D models and poses to depict entities and actions. Every 3D model can have associated shape displacements, spatial tags, and functional properties to be used in the depiction process. We describe the linguistic analysis and depiction techniques used by WordsEye along with some general strategies by which more abstract concepts are made depictable.
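As a rough illustration of the model database described above, the following Python sketch shows one way a model record could carry shape displacements, spatial tags, and functional properties, and how a spatial tag might be used to place one object on another. Everything named here (`Model3D`, the field layouts, the `top_surface` tag, `place_on`, and the sample values) is a hypothetical assumption for illustration, not WordsEye's actual representation.

```python
from dataclasses import dataclass, field


@dataclass
class Model3D:
    """Hypothetical record for one entry in a 3D model database."""
    name: str
    # Named regions used for placement, mapped to an (x, y, z) offset
    # from the model origin. Tag names and values are illustrative only.
    spatial_tags: dict = field(default_factory=dict)
    # Named deformations of the shape (placeholder: name -> description).
    shape_displacements: dict = field(default_factory=dict)
    # What the object can be used for.
    functional_properties: set = field(default_factory=set)


# A tiny stand-in for the "large database of 3D models".
MODELS = {
    "table": Model3D(
        "table",
        spatial_tags={"top_surface": (0.0, 0.75, 0.0)},
        functional_properties={"support"},
    ),
    "cat": Model3D(
        "cat",
        shape_displacements={"smile": "raise mouth corners"},
    ),
}


def place_on(figure, ground, ground_position=(0.0, 0.0, 0.0)):
    """Return a position for `figure` resting on `ground`'s top surface."""
    MODELS[figure]  # raises KeyError if the figure has no model
    ground_model = MODELS[ground]
    if "top_surface" not in ground_model.spatial_tags:
        raise ValueError(f"{ground!r} has no surface to support {figure!r}")
    offset = ground_model.spatial_tags["top_surface"]
    # Add the tag's offset to the ground object's position, axis by axis.
    return tuple(p + o for p, o in zip(ground_position, offset))


print(place_on("cat", "table", (2.0, 0.0, 1.0)))  # (2.0, 0.75, 1.0)
```

In this sketch a sentence such as "the cat is on the table" would resolve to a lookup of both nouns followed by a tag-based placement; a model without a suitable tag is rejected rather than guessed at.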
Index Terms
WordsEye: an automatic text-to-scene conversion system