Abstract
Many multicore computers are single-user devices, which creates the potential to partition them by situational usage context, much as the human brain is organized. Contextual partitioning (CP) permits multiple simplified versions of the same task to exist in parallel, with the version in use selected by the current context. We introduce CP for speech recognition, targeting user interfaces in handheld embedded devices, with contexts drawn from webpage interactions. CP results in 61% fewer decoding errors, 97% less training for vocabulary changes, potential for near-linear scaling with increasing core counts, and a potential reduction in power usage of up to 90%.
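The selection idea in the abstract can be illustrated with a minimal sketch. It assumes each context owns a small vocabulary and its own simplified recognizer, and that the active context (for example, the current webpage) decides which recognizer runs. The class names, context names, and word lists are hypothetical, and the string-similarity matcher is only a toy stand-in for a real acoustic decoder; none of this is taken from the paper's implementation.

```python
from difflib import SequenceMatcher

# Hypothetical per-context vocabularies; the context names and words are
# illustrative assumptions, not taken from the paper.
CONTEXT_VOCABULARIES = {
    "mail": ["compose", "reply", "delete", "inbox"],
    "maps": ["directions", "zoom", "search", "traffic"],
}


class ContextRecognizer:
    """Toy stand-in for a small-vocabulary decoder tied to one context."""

    def __init__(self, vocabulary):
        self.vocabulary = list(vocabulary)

    def decode(self, noisy_token):
        # Pick the vocabulary word most similar to the input string.
        # A real decoder would score acoustic features, not text.
        return max(
            self.vocabulary,
            key=lambda word: SequenceMatcher(None, noisy_token, word).ratio(),
        )


class ContextualPartition:
    """Holds one simplified recognizer per context and selects by context."""

    def __init__(self, vocabularies):
        self.recognizers = {
            context: ContextRecognizer(words)
            for context, words in vocabularies.items()
        }

    def decode(self, active_context, noisy_token):
        # Only the recognizer for the active context does any work.
        return self.recognizers[active_context].decode(noisy_token)

    def update_vocabulary(self, context, words):
        # A vocabulary change rebuilds only the affected context.
        self.recognizers[context] = ContextRecognizer(words)


partition = ContextualPartition(CONTEXT_VOCABULARIES)
result_mail = partition.decode("mail", "replie")
result_maps = partition.decode("maps", "zoom")
```

In this sketch each recognizer searches only its own short word list, and a vocabulary change touches a single context rather than the whole system. That is the intuition behind the smaller search space and reduced retraining described above, though the toy matcher says nothing about the reported figures.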
Index Terms
Contextual partitioning for speech recognition