ABSTRACT

We focus on the issue of robustness of conversational interfaces that are flexible enough to allow natural "multithreaded" conversational flow. Our main advance is to use context-sensitive speech recognition in a general way, with a representation of dialogue context that is rich and flexible enough to support conversation about multiple interleaved topics, as well as the interpretation of corrective fragments. We explain, by use of worked examples, the use of our "Conversational Intelligence Architecture" (CIA) to represent conversational threads, and how each thread can be associated with a language model (LM) for more robust speech recognition. The CIA uses fine-grained dynamic representations of dialogue context, which supersede those used in finite-state or form-based dialogue managers. In an evaluation of a dialogue system built using this architecture we found that 87.9% of recognized utterances were recognized using a context-specific language model, resulting in an 11.5% reduction in the overall utterance recognition error rate, and a 13.4% reduction in concept error rate. Thus we show that by using context-sensitive recognition based on the predicted type of the user's next dialogue move, a more flexible dialogue system can also exhibit an improvement in speech recognition performance.
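
The idea of choosing a language model from the predicted type of the user's next dialogue move can be illustrated with a minimal sketch. This is not the CIA's implementation: the `Thread` and `DialogueContext` classes, the move-type labels, and the language model names below are all hypothetical, chosen only to show a per-thread prediction with a fallback to a general model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thread:
    """One conversational thread (hypothetical structure, not the CIA's)."""
    topic: str
    expected_move: Optional[str] = None  # e.g. "yn-answer", "wh-answer"


@dataclass
class DialogueContext:
    """Ordered list of threads; the most recently active thread is last."""
    threads: List[Thread] = field(default_factory=list)

    def push(self, thread: Thread) -> None:
        self.threads.append(thread)

    def predicted_move(self) -> Optional[str]:
        # Walk from the most recent thread back to older, interleaved ones
        # and return the first pending expectation found.
        for thread in reversed(self.threads):
            if thread.expected_move is not None:
                return thread.expected_move
        return None


# Hypothetical mapping from dialogue move type to a language model name.
LM_FOR_MOVE = {
    "yn-answer": "lm_yes_no",
    "wh-answer": "lm_wh_answer",
    "correction": "lm_corrective_fragment",
}
FALLBACK_LM = "lm_full_grammar"  # assumed general-purpose model


def select_language_model(context: DialogueContext) -> str:
    """Pick a context-specific LM, or the general one if nothing is predicted."""
    return LM_FOR_MOVE.get(context.predicted_move(), FALLBACK_LM)
```

In this sketch an older thread's expectation still drives the choice when a newer thread has none pending, which is one simple way to model interleaved topics; a real system would also need to fall back to the general model when context-specific recognition fails.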

AUTHORS



Oliver Lemon

Bibliometrics: publication history
Publication years: 1997-2012
Publication count: 47
Citation count: 374
Available for download: 30
Downloads (6 weeks): 37
Downloads (12 months): 339
Downloads (cumulative): 4,911
Average downloads per article: 163.70
Average citations per article: 7.96


Alexander Gruenstein

Bibliometrics: publication history
Publication years: 2002-2009
Publication count: 9
Citation count: 67
Available for download: 8
Downloads (6 weeks): 7
Downloads (12 months): 89
Downloads (cumulative): 2,890
Average downloads per article: 361.25
Average citations per article: 7.44

CITED BY

16 Citations

PUBLICATION

Title: ACM Transactions on Computer-Human Interaction (TOCHI)
Volume 11 Issue 3, September 2004
Pages: 241-267
Publication Date: 2004-09-01
Publisher: ACM, New York, NY, USA
ISSN: 1073-0516 EISSN: 1557-7325 DOI: 10.1145/1017494.1017496

Table of Contents

ACM Transactions on Computer-Human Interaction (TOCHI), Volume 11 Issue 3, September 2004

Introduction to mobile and adaptive conversational interfaces
Sharon Oviatt, Stephanie Seneff
Pages: 237-240
DOI: 10.1145/1017494.1017495
Multithreaded context for robust conversational interfaces: Context-sensitive speech recognition and interpretation of corrective fragments
Oliver Lemon, Alexander Gruenstein
Pages: 241-267
DOI: 10.1145/1017494.1017496
ISIS: an adaptive, trilingual conversational system with interleaving interaction and delegation dialogs
Helen Meng, P. C. Ching, Shuk Fong Chan, Yee Fong Wong, Cheong Chat Chan
Pages: 268-299
DOI: 10.1145/1017494.1017497

ISIS (Intelligent Speech for Information Systems) is a trilingual spoken dialog system (SDS) for the stocks domain. It handles two dialects of Chinese (Cantonese and Putonghua) as well as English, the predominant languages in our region. The system ...
Toward adaptive conversational interfaces: Modeling speech convergence with animated personas
Sharon Oviatt, Courtney Darves, Rachel Coulston
Pages: 300-328
DOI: 10.1145/1017494.1017498

The design of robust interfaces that process conversational speech is a challenging research direction largely because users' spoken language is so variable. This research explored a new dimension of speaker stylistic variation by examining whether users' ...

The ACM Digital Library is published by the Association for Computing Machinery. Copyright © 2016 ACM, Inc.