ABSTRACT
Sound recognition is an important and popular function of smart devices. A sound's location is basic information about its acoustic source, and beyond recognition, whether a device can localize acoustic sources largely determines the capability and quality of its interactive functions. In this work, we study the problem of concurrently localizing multiple acoustic sources with a smart device (e.g., a smart speaker like Amazon Alexa). Existing approaches either can localize only a single source or require a distributed network of microphone arrays. Our proposal, called Symphony, is the first approach to tackle this problem with a single microphone array. The insight behind Symphony is that the geometric layout of the microphones on the array determines a unique relationship among the signals from the same source along the same arriving path, while the source's location determines the directions of arrival (DoAs) of the signals along different arriving paths. Symphony therefore includes a geometry-based filtering module to distinguish signals from different sources along different paths and a coherence-based module to identify signals from the same source. We implement Symphony with different types of commercial off-the-shelf microphone arrays and evaluate its performance under different settings. The results show that Symphony has a median localization error of 0.694 m, which is 68% less than that of the state-of-the-art approach.
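The first half of this insight, that array geometry fixes the relationship among signals arriving along one path, can be illustrated with a textbook far-field model. The sketch below is not Symphony's filtering module; it only shows that, for a known layout, each direction of arrival produces its own pattern of inter-microphone delays. The six-microphone circular layout, the 5 cm radius, the speed of sound, and the function names `circular_array` and `relative_delays` are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def circular_array(num_mics, radius):
    """Return (x, y) positions of microphones evenly spaced on a circle."""
    return [
        (radius * math.cos(2 * math.pi * k / num_mics),
         radius * math.sin(2 * math.pi * k / num_mics))
        for k in range(num_mics)
    ]


def relative_delays(mics, doa_deg):
    """Far-field arrival delays (seconds) of each microphone relative to mic 0.

    A plane wave arriving from azimuth doa_deg reaches a microphone at
    position p earlier by (p . u) / c, where u is the unit vector that
    points from the array toward the source.
    """
    theta = math.radians(doa_deg)
    ux, uy = math.cos(theta), math.sin(theta)
    arrival = [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in mics]
    return [t - arrival[0] for t in arrival]


# Hypothetical 6-microphone circular array with a 5 cm radius.
mics = circular_array(num_mics=6, radius=0.05)
for doa in (0.0, 90.0):
    delays_us = [round(d * 1e6, 1) for d in relative_delays(mics, doa)]
    print(doa, delays_us)
```

Under this model, a set of per-microphone signals whose delays do not match the pattern of any single direction cannot come from one source along one path, which is the kind of consistency check a geometry-based filter can exploit.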
Index Terms
Symphony: localizing multiple acoustic sources with a single microphone array