ABSTRACT
Over the years, screen readers have been an essential tool for helping blind users access digital information. Yet their sequential nature undermines blind people's ability to find relevant information efficiently, despite the browsing strategies they have developed. We propose taking advantage of the Cocktail Party Effect, which states that people are able to focus on a single speech source among several conversations while still identifying relevant content in the background. We therefore hypothesize that, in contrast to a single sequential speech channel, blind people can leverage concurrent speech channels to quickly get the gist of digital information. In this paper, we present an experiment with 23 participants that aims to understand blind people's ability to search for relevant content while listening to two, three, or four concurrent speech channels. Our results suggest that it is easy to identify the relevant source with two and three concurrent talkers. Moreover, both two and three sources may be used to understand the content of the relevant source, depending on the intelligibility demands of the task and on user characteristics.
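The abstract does not specify how the concurrent channels were rendered; one common way to make simultaneous talkers separable is to place each one at a different spatial position. The sketch below is a minimal, hypothetical illustration of that idea, assuming simple constant-power stereo panning (not the paper's actual rendering pipeline) and sine bursts standing in for text-to-speech output:

```python
import numpy as np


def pan_stereo(mono, azimuth):
    """Constant-power pan of a mono signal.

    azimuth ranges from -1.0 (hard left) to +1.0 (hard right).
    """
    theta = (azimuth + 1.0) * np.pi / 4.0  # map [-1, 1] -> [0, pi/2]
    left = np.cos(theta) * mono
    right = np.sin(theta) * mono
    return np.stack([left, right], axis=-1)


def mix_concurrent(sources, azimuths):
    """Mix several mono sources into one stereo stream, one azimuth each."""
    n = max(len(s) for s in sources)
    out = np.zeros((n, 2))
    for src, az in zip(sources, azimuths):
        src = np.asarray(src, dtype=float)
        out[: len(src)] += pan_stereo(src, az)
    peak = np.abs(out).max()
    return out / peak if peak > 1.0 else out  # avoid clipping


# Two dummy "talkers" (sine bursts in place of real TTS audio),
# rendered hard left and hard right.
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
talker_a = 0.5 * np.sin(2 * np.pi * 220 * t)
talker_b = 0.5 * np.sin(2 * np.pi * 330 * t)
stereo = mix_concurrent([talker_a, talker_b], azimuths=[-1.0, 1.0])
```

With hard-left and hard-right azimuths, each channel of the resulting buffer carries one talker almost exclusively, which is the spatial-separation cue that concurrent-speech interfaces typically rely on; adding a third or fourth talker is just another entry in `sources` with an intermediate azimuth.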