research-article

Contextual partitioning for speech recognition

Published: 05 September 2013

Abstract

Many multicore computers are single-user devices, creating the potential to partition by situational usage contexts, similar to how the human brain is organized. Contextual partitioning (CP) permits multiple simplified versions of the same task to exist in parallel, with selection tied to the context in use. We introduce CP for speech recognition, specifically targeted at user interfaces in handheld embedded devices. Contexts are drawn from webpage interactions. CP results in 61% fewer decoding errors, 97% less training for vocabulary changes, near-linear scaling potential with increasing core counts, and up to a potential 90% reduction in power usage.
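As a rough illustration of selecting among simplified recognizers by context, the Python sketch below keys small vocabularies by context and matches an input only against the active one. The context names, the vocabularies, and the matcher are assumptions made for illustration and are not taken from the paper. In particular, `difflib` string similarity stands in for an acoustic decoder.

```python
import difflib

# Hypothetical contexts and vocabularies (illustrative only, not from the paper).
CONTEXT_VOCABULARIES = {
    "news": ["headlines", "sports", "weather", "next", "back"],
    "shopping": ["cart", "checkout", "search", "next", "back"],
}


def recognize(utterance, context):
    """Return the closest word in the active context's vocabulary, or None.

    String similarity is a stand-in for real speech decoding: the point is
    only that the candidate set is restricted to the current context.
    """
    vocabulary = CONTEXT_VOCABULARIES[context]
    matches = difflib.get_close_matches(utterance, vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else None


if __name__ == "__main__":
    # The same noisy input is resolved only where the context supports it.
    print(recognize("chekout", "shopping"))
    print(recognize("chekout", "news"))
```

Because each context carries its own small vocabulary, changing one context's word list in this sketch leaves the others untouched.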


