
The conductor interaction method

Published: 12 December 2007

Abstract

Computers have increasingly become part of our everyday lives, with many activities either involving their direct use or being supported by them. This has prompted research into developing methods and mechanisms to assist humans in interacting with computers (human-computer interaction, or HCI). A number of HCI techniques have been developed over the years; some are quite old but continue to be used, while others are more recent and still evolving. Many of these interaction techniques, however, are not natural in their use and typically require the user to learn a new means of interaction. Inconsistencies within these techniques, and the restrictions they impose on user creativity, can also make them difficult to use, especially for novice users.

This article proposes an alternative interaction method, the conductor interaction method (CIM), which aims to provide a more natural and easier-to-learn interaction technique. This novel interaction method extends existing HCI methods by drawing upon techniques found in human-human interaction. It is argued that the use of a two-phased multimodal interaction mechanism, using gaze for selection and gesture for manipulation, incorporated within a metaphor-based environment, can provide a viable alternative for interacting with a computer (especially for novice users). Both the model and an implementation of the CIM within a system are presented in this article. This system formed the basis of a number of user studies that have been performed to assess the effectiveness of the CIM, the findings of which are discussed in this work.
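The two-phased mechanism the abstract describes, gaze for selection followed by gesture for manipulation, can be sketched in code. This is a minimal illustration only: the class, event names, gesture vocabulary, and 0.5-second dwell threshold are assumptions, not the authors' implementation.

```python
# Illustrative sketch of the CIM's two-phase loop: gaze dwell selects an
# object (phase one), then hand gestures manipulate the selection (phase
# two). All names and thresholds here are assumed for illustration.
from dataclasses import dataclass, field
from typing import Dict, Optional

DWELL_THRESHOLD = 0.5  # seconds of sustained gaze before selection (assumed)

@dataclass
class ConductorSession:
    selected: Optional[str] = None                      # object chosen by gaze
    gaze_timer: Dict[str, float] = field(default_factory=dict)

    def on_gaze(self, obj: str, dt: float) -> None:
        """Phase 1: accumulate dwell time on obj; select once threshold met."""
        self.gaze_timer[obj] = self.gaze_timer.get(obj, 0.0) + dt
        if self.gaze_timer[obj] >= DWELL_THRESHOLD:
            self.selected = obj
            self.gaze_timer.clear()

    def on_gesture(self, gesture: str) -> Optional[str]:
        """Phase 2: gestures act only on whatever gaze has already selected."""
        if self.selected is None:
            return None  # no manipulation without a prior gaze selection
        actions = {"point": "activate", "palm_up": "scroll_up",
                   "palm_down": "scroll_down", "twist": "adjust_brightness"}
        return f"{actions.get(gesture, 'ignore')}:{self.selected}"
```

Dwelling on a hypothetical "picturebook" icon and then pointing would yield "activate:picturebook", mirroring the selection-then-manipulation split the article argues for.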

References

  1. 5DT. 2004. Data glove. http://www.5dt.com/.
  2. Argyle, M. 1996. Bodily Communication, 2nd ed. Routledge, London.
  3. Aoki, T. 1999. MONJUnoCHIE system: Videoconference system with eye contact for decision making. In Proceedings of the International Workshop on Advanced Image Technology (IWAIT).
  4. Bauer, B. and Kraiss, K. F. 2002. Towards an automatic sign language recognition system using subunits. In Proceedings of the International Gesture Workshop on Gesture and Sign Language in Human-Computer Interaction. Lecture Notes in Computer Science, vol. 2298. Springer, Heidelberg, Germany.
  5. Bauml, B. J. and Bauml, F. H. 1997. Dictionary of Worldwide Gestures. Scarecrow Press, MD.
  6. Benford, S., Snowdon, D., Greenhalgh, C., Ingram, R., Knox, I., and Brown, C. 1995. VR-VIBE: A virtual environment for co-operative information retrieval. In Proceedings of the Eurographics Conference, Maastricht, The Netherlands.
  7. Bolt, R. 1980. "Put-that-there": Voice and gesture at the graphics interface. In Proceedings of the 7th Annual Conference on Computer Graphics and Interactive Techniques, Seattle, WA.
  8. Bolt, R. and Herranz, E. 1992. Two-handed gesture in multi-modal natural dialog. In Proceedings of the 5th Annual ACM Symposium on User Interface Software and Technology (UIST).
  9. Borchers, J. 1997. WorldBeat: Designing a baton-based interface for an interactive music exhibit. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Atlanta, GA.
  10. Chen, W. C. 2000. Toward a compelling sensation of telepresence: Demonstrating a portal to a distant (static) office. In Proceedings of the IEEE Visualization Conference, Salt Lake City, UT.
  11. Dix, A. J., Finlay, J. E., Abowd, G. D., and Beale, R. 2004. Human-Computer Interaction, 3rd ed. Prentice Hall, Hertfordshire, UK.
  12. Farid, M. M. and Murtagh, F. 2002. Eye-movements and voice as interface modalities to computer systems. In Proceedings of the SPIE Regional Meeting on Optoelectronics, Photonics and Imaging (OPTO), Galway, Ireland.
  13. Farid, M. M., Murtagh, F., and Starck, J. L. 2002. Computer display control and interaction using eye-gaze. J. Soc. Inf. Display.
  14. Gibbs, S. J. 1999. TELEPORT: Towards immersive copresence. Multimedia Syst. 7, 3, 214--221.
  15. Gips, J. and Olivieri, P. 1996. EagleEyes: An eye control system for persons with disabilities. In Proceedings of the 11th International Conference on Technology and Persons with Disabilities, Los Angeles, CA.
  16. Hart, S. G. 1987. Background description and application of the NASA task load index (TLX). In Proceedings of the Department of Defense Human Engineering Technical Advisory Group Workshop on Workload (NUSC6688), Newport, RI, 90--97.
  17. Istance, H. and Howarth, P. 1994. Keeping an eye on your interface: The potential for eye-gaze control of graphical user interfaces. In Proceedings of the Conference on Human-Computer Interaction, People and Computers, Glasgow, 195--209.
  18. Jacob, R. J. K. 1991. The use of eye movements in human-computer interaction techniques: What you look at is what you get. ACM Trans. Inf. Syst. 9, 3 (Apr.).
  19. Kam, K. S. 2002. Definitions: "dwell" vs. "gaze". In Eye-Movement Mailing List, http://www.jiscmail.ac.uk/files/eye-movement/introduction.html.
  20. Kendon, A. 1992. The negotiation of context in face-to-face interaction. In Rethinking Context: Language as an Interactive Phenomenon, Duranti and Goodwin, Eds. Cambridge University Press.
  21. Kauff, P. and Schreer, O. 2002. An immersive 3D video-conferencing system using shared virtual team user environments. In Proceedings of the 4th International Conference on Collaborative Virtual Environments (CVE), Bonn, Germany, 105--112.
  22. LC Technologies. 1986. VOG. http://www.eyegaze.com/.
  23. Marrin, T. and Paradiso, J. 1997. The digital baton: A versatile performance instrument. In Proceedings of the International Computer Music Conference, Thessaloniki, Greece.
  24. Natural Point. 1997. Optical tracking systems. http://www.naturalpoint.com/.
  25. Park, E., Kim, B., Salim, W., and Cheok, A. D. 2006. Magic Asian art. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Montréal, Québec, Canada, 255--258.
  26. Qvarfordt, P. and Zhai, S. 2005. Conversing with the user based on eye-gaze patterns. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Portland, OR, 221--230.
  27. Salvucci, D. D. 1999. Inferring intent in eye-based interfaces: Tracing eye movements with process models. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Pittsburgh, PA.
  28. Shneiderman, B. and Plaisant, C. 2005. Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison-Wesley.
  29. Sibert, L. E. and Jacob, R. J. K. 2000. Evaluation of eye gaze interaction. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, The Hague, The Netherlands.
  30. Sowa, T., Fröhlich, M., and Latoschik, M. E. 1999. Temporal symbolic integration applied to a multimodal system using gestures and speech. In Proceedings of the International Gesture Workshop on Gesture-Based Communication in Human-Computer Interaction, Gif-sur-Yvette, France. Lecture Notes in Computer Science, vol. 1739. Springer, Heidelberg, Germany.
  31. Thorisson, K. R. 1998. Real-time decision-making in multimodal face-to-face communication. In Proceedings of the Autonomous Agents Conference, Minneapolis, MN.
  32. Vertegaal, R. 1999. The GAZE groupware system: Mediating joint attention in multiparty communication and collaboration. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Pittsburgh, PA.
  33. Vertegaal, R., Slagter, R., van der Veer, G., and Nijholt, A. 2001. Eye gaze patterns in conversations: There is more to conversational agents than meets the eyes. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Seattle, WA.
  34. Zhai, S., Morimoto, C., and Ihde, S. 1999. Manual and gaze input cascaded (MAGIC) pointing. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Pittsburgh, PA.


        Reviews

        Bernice T. Glenn

Put-that-there, developed by Richard Bolt in 1980, amazed viewers at a time when most computers did not have a graphical interface: a person using speech and gestures aimed at a computer screen directed almost magical movement of ships at sea. Since then, many papers have described alternatives to traditional computer inputs, including more costly gesture recognition. Fast-forward to the present to read about a current methodology, envisioned by the authors of this paper.

Think, as they do, of the user as the conductor of a grand opera, using only gaze and gestures to direct an environment on a computer-screen stage. Icons representing a database are the metaphorical musicians in this new possibility. Using only gaze and gestures, the user-conductor can call up a picture to serve as a background for a scene on the stage. Other gestures can resize the picture and make it brighter, or the user-conductor can put it away and choose another picture just by staring at it.

Conductor Interaction, a prototype developed by the authors, is an interaction methodology depending purely on gesture and gaze as its interactive mechanisms. The authors developed their prototype because they found that many interaction systems, both past and present, are not natural to use or learn, are creatively restrictive, and may be quite difficult to master. Although the authors agree that computer technology for capturing gestures is costly, they suggest that devices are emerging that are less intrusive than data gloves, and also less expensive.

Conductor Interaction uses two metaphors: the orchestra, representing the application environment, and the conductor, representing user interactions. The orchestra stands for the resources available to the user; it uses a stage as its screen display presentation space.
The stage contains a database of icons for music, sound and sound effects, animation, photos, videos, and saved presentations. The conductor is the application user, who interacts with the system to create a presentation through gaze and gesture. The gaze interface lets the user-conductor select elements from a visual display: the system recognizes the user's focus on a displayed object and sends a message to activate it. The gesture interface lets the user-conductor manipulate a gaze-chosen object through a series of hand positions, captured with optical-fiber gloves. An audio-visual interface displays the choices the user has made as a visual presentation on the stage. Audio can be output to the stage display, but cannot provide any feedback to the gesture and gaze interfaces.

The authors' illustrated example shows how all of these elements are used. At the start, the computer screen displays an empty stage. Jack, our example conductor, chooses a background by staring at the picturebook icon on the stage. It highlights, and as Jack continues to focus his gaze, the program presents a series of available pictures. Jack scrolls through the list by moving his palm up and down. When he sees a picture he likes, he points at it with his index finger to load it. The picture appears on the stage, but is darker than Jack would like. When he gazes at the brightness icon on a media control panel, the icon highlights, and Jack twists his left hand to adjust the brightness. With another gesture, Jack exits the media editor and continues to make other adjustments.

The authors performed user studies to assess their methodology. They wanted to find out whether experience with PowerPoint had any effect on user interactions, so their subjects were drawn from two groups: experienced computer users with extensive PowerPoint knowledge, and inexperienced users with little PowerPoint experience. Overall feedback was mixed.
Most users understood the metaphors and found them appropriate. They found the hand gestures natural, although the system did not always correctly recognize those that were more difficult to make. Many of the experienced users, however, were frustrated by the amount of learning needed. In contrast, inexperienced users saw little difference between the learning curve for Conductor Interaction and that of other applications they had learned. Subject feedback suggested investigating how this application might be useful to people with disabilities who use gestures to communicate in everyday life. Other suggested areas to explore were domains that rely on manipulation, such as mixing music or designing.

While the authors' primary purpose was to develop an application that relied on gaze and gesture rather than speech, they remain unclear about how their application would be used, and by whom. Subject feedback gave them some ideas, but the authors need to give more thought to the proposed users of their system, together with the costs of setting up the application for them. Organizations may be quite willing to pay for an expensive system that will serve a large number of people, but individual users may not be able to afford one.

Online Computing Reviews Service
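The dwell-based gaze selection described above (an icon highlights only once the user's gaze rests on it for a sustained period, rather than at every passing glance) might be implemented along these lines. This is a hypothetical sketch: the 30-pixel radius and 0.5-second dwell time are assumed values, not figures from the paper.

```python
# Illustrative dwell detector for a gaze interface: raw gaze samples are
# noisy, so a dwell is reported only when consecutive samples stay within
# a small radius of the target for long enough. Values are assumptions.
import math

DWELL_RADIUS = 30.0   # pixels the gaze may wander around the icon (assumed)
DWELL_TIME = 0.5      # seconds of sustained fixation required (assumed)

def detect_dwell(samples, icon_pos):
    """samples: time-ordered list of (t, x, y) gaze points.
    Returns the time at which a dwell on icon_pos completes, or None."""
    start = None
    for t, x, y in samples:
        if math.hypot(x - icon_pos[0], y - icon_pos[1]) <= DWELL_RADIUS:
            if start is None:
                start = t                  # fixation on the icon began
            if t - start >= DWELL_TIME:
                return t                   # dwell complete: highlight icon
        else:
            start = None                   # gaze wandered off; reset timer
    return None
```

Distinguishing a deliberate dwell from an incidental glance this way is the standard defense against the "Midas touch" problem in gaze interfaces, where everything the user looks at gets activated.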


        • Published in

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 3, Issue 4
          December 2007
          147 pages
          ISSN:1551-6857
          EISSN:1551-6865
          DOI:10.1145/1314303

          Copyright © 2007 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 12 December 2007
          • Received: 1 August 2007
          • Accepted: 1 August 2007
Published in TOMM Volume 3, Issue 4


          Qualifiers

          • research-article
          • Research
          • Refereed
