Abstract
Linear modal synthesis methods have often been used to generate sounds for rigid bodies. A key obstacle to their wider adoption is the lack of an automatic way to determine material parameters that recreate the realistic audio quality of sounding materials. We introduce a novel method that uses prerecorded audio clips to estimate material parameters capturing the inherent quality of the recorded sounding materials. Our method extracts perceptually salient features from audio examples. Based on psychoacoustic principles, we design a parameter estimation algorithm that uses an optimization framework, guided by these salient features, to search for the best material parameters for modal synthesis. We also present a method that compensates for the differences between the real-world recording and the sound synthesized with linear modal synthesis alone, producing the final synthesized audio. The audio generated by this sound synthesis pipeline preserves the same sense of material as the recorded audio example. Moreover, both the estimated material parameters and the residual compensation naturally transfer to virtual objects of different sizes and shapes, while the synthesized sounds vary accordingly. A perceptual study shows that the results of this system compare well with real-world recordings in terms of material perception.
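To make the role of the "material parameters" concrete, the following is a minimal sketch of linear modal synthesis, not the authors' implementation. It assumes a Rayleigh damping model with two material parameters, here called `alpha` and `beta`; the abstract itself does not name the damping model, and the function name, argument names, and numeric values are all hypothetical. Each vibration mode with eigenvalue `lam` (squared undamped angular frequency) becomes one exponentially damped sinusoid, and the output is their sum. Parameter estimation and residual compensation are not shown.

```python
import math

def synthesize_modes(eigenvalues, gains, alpha, beta, num_samples, sample_rate=44100):
    """Sum exponentially damped sinusoids, one per vibration mode.

    Assumes Rayleigh damping (an illustrative choice): the decay rate of a
    mode with eigenvalue lam is d = (alpha + beta * lam) / 2, and its damped
    angular frequency is sqrt(lam - d^2).
    """
    samples = [0.0] * num_samples
    for lam, gain in zip(eigenvalues, gains):
        decay = 0.5 * (alpha + beta * lam)
        omega_sq = lam - decay * decay
        if omega_sq <= 0.0:
            continue  # overdamped mode does not oscillate; skip it
        omega = math.sqrt(omega_sq)
        for k in range(num_samples):
            t = k / sample_rate
            samples[k] += gain * math.exp(-decay * t) * math.sin(omega * t)
    return samples

# Hypothetical single mode near 440 Hz with made-up material parameters.
lam_440 = (2.0 * math.pi * 440.0) ** 2
clip = synthesize_modes([lam_440], [1.0], alpha=10.0, beta=1e-7, num_samples=4410)
```

In this sketch, changing `alpha` and `beta` alters how quickly each mode decays, while the eigenvalues come from the object's geometry. That separation is one way to picture how a single set of material parameters could be reused on objects of different sizes and shapes.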
Index Terms
Example-guided physically based modal sound synthesis