Online audio background determination for complex audio environments

Published: 01 May 2007

Abstract

We present a method for foreground/background separation of audio using a background modelling technique. The technique models the background in an online, unsupervised, and adaptive fashion, and is designed for long-term surveillance and monitoring problems. The background is determined using a statistical method that models the states of the audio over time. In addition, three methods are used to increase the accuracy of background modelling in complex audio environments, where the statistical model can fail to capture the background states accurately. First, an entropy-based approach unifies background representations that are fragmented over multiple states of the statistical model, resulting in a more robust background model. Second, we adaptively adjust the number of states considered background according to background complexity, resulting in more accurate classification of background models. Finally, we use an auxiliary model cache to retain potential background states in the system. This prevents the deletion of such states due to the rapid influx of observed states that can occur in highly dynamic sections of the audio signal. The separation algorithm was successfully applied to a number of audio environments representing monitoring applications.
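To make the idea of an online, adaptive background model concrete, the following is a minimal sketch only, not the method of the paper. It assumes a single scalar audio feature per frame (for example a log-energy value) and swaps in a generic adaptive mixture-of-Gaussians update in place of the paper's statistical model, whose details the abstract does not give. The class name `OnlineBackgroundModel`, the parameters `alpha`, `match_sigmas`, and `bg_weight`, and all default values are hypothetical. The entropy-based unification and the auxiliary model cache are not implemented here.

```python
import math
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare states by identity, not by value
class State:
    mean: float
    var: float
    weight: float


@dataclass
class OnlineBackgroundModel:
    # All defaults below are illustrative assumptions, not values from the paper.
    max_states: int = 4        # upper bound on tracked states
    alpha: float = 0.05        # learning rate for online updates
    match_sigmas: float = 2.5  # match if within this many standard deviations
    bg_weight: float = 0.7     # cumulative weight treated as background
    init_var: float = 1.0      # variance assigned to a newly created state
    states: list = field(default_factory=list)

    def update(self, x: float) -> bool:
        """Absorb one feature value; return True if it is classified as background."""
        # Find the first existing state that explains the observation.
        matched = None
        for s in self.states:
            if abs(x - s.mean) <= self.match_sigmas * math.sqrt(s.var):
                matched = s
                break

        if matched is None:
            # Unseen sound: create a new state, evicting the weakest if full.
            if len(self.states) >= self.max_states:
                self.states.remove(min(self.states, key=lambda s: s.weight))
            self.states.append(State(mean=x, var=self.init_var, weight=self.alpha))
        else:
            # Move the matched state towards the observation.
            d = x - matched.mean
            matched.mean += self.alpha * d
            matched.var = max((1 - self.alpha) * matched.var + self.alpha * d * d, 1e-6)

        # Reinforce the matched state, decay the others, then renormalise.
        for s in self.states:
            target = 1.0 if s is matched else 0.0
            s.weight += self.alpha * (target - s.weight)
        total = sum(s.weight for s in self.states)
        for s in self.states:
            s.weight /= total

        # Background = the most persistent states covering bg_weight of the mass.
        background = []
        cumulative = 0.0
        for s in sorted(self.states, key=lambda s: s.weight, reverse=True):
            background.append(s)
            cumulative += s.weight
            if cumulative >= self.bg_weight:
                break
        return matched is not None and any(s is matched for s in background)


if __name__ == "__main__":
    model = OnlineBackgroundModel()
    for i in range(200):                 # steady ambient sound
        model.update(0.1 if i % 2 else -0.1)
    print(model.update(10.0))            # sudden loud event
    print(model.update(0.1))             # ambient sound again
```

In this sketch, the number of states counted as background already varies with the data, because the cumulative-weight rule admits more states when no single state dominates. That is only a loose analogue of the adaptive adjustment described in the abstract. A single eviction rule like the one above is also exactly where a burst of novel sounds could push out a genuine background state, which is the failure mode the abstract's auxiliary cache is meant to prevent.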

