ABSTRACT
The ability to detect, classify, and locate complex acoustic events can be a powerful tool for helping smart systems build context-awareness, e.g., to make rich inferences about human behaviors in physical spaces. Conventional methods measure acoustic signals with microphones. Because signals from multiple acoustic sources blend as they propagate to a sensor, such methods face a dual challenge: separating the signal of an acoustic event from background noise and from other acoustic events of interest. Recent research has proposed using radio-frequency (RF) signals, e.g., Wi-Fi and millimeter-wave (mmWave), to sense sound directly from source vibrations. Although these works can separate an acoustic event from background noise, they cannot monitor multiple sound sources simultaneously. In this paper, we present UWHear, a system that simultaneously recovers and separates sounds from multiple sources. Unlike previous works that use continuous-wave RF, UWHear employs Impulse Radio Ultra-Wideband (IR-UWB) technology to build an audio sensing system that tackles both challenges. Further, IR-UWB radios can penetrate light building materials, which enables UWHear to operate in some non-line-of-sight (NLOS) conditions. In addition to providing a theoretical guarantee for audio recovery using RF pulses, we implement an audio sensing prototype on a commercial off-the-shelf IR-UWB radar. Our experiments show that UWHear can effectively separate the content of two speakers that are placed only 25 cm apart. UWHear can also capture and separate multiple sounds and vibrations of household appliances while remaining immune to non-target noise coming from other directions.
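As a rough illustration of why a wideband pulse can help tell nearby sources apart, the sketch below uses the textbook pulse-radar range-resolution relation, ΔR = c / (2B). It assumes that sources are distinguished by their distance from the radar, which the abstract does not state explicitly, and the function names are hypothetical. The 0.25 m figure is the speaker spacing reported above; the resulting bandwidth is a back-of-the-envelope lower bound, not a parameter of the UWHear prototype.

```python
# Back-of-the-envelope range-resolution arithmetic for a pulse radar.
# Assumption: two reflectors are separable when their radial distances
# differ by at least c / (2 * B), the standard range-resolution relation.

SPEED_OF_LIGHT_M_S = 299_792_458.0


def range_resolution_m(bandwidth_hz: float) -> float:
    """Smallest radial gap (meters) resolvable with the given bandwidth."""
    if bandwidth_hz <= 0:
        raise ValueError("bandwidth_hz must be positive")
    return SPEED_OF_LIGHT_M_S / (2.0 * bandwidth_hz)


def min_bandwidth_hz(separation_m: float) -> float:
    """Bandwidth (Hz) needed to resolve two reflectors this far apart in range."""
    if separation_m <= 0:
        raise ValueError("separation_m must be positive")
    return SPEED_OF_LIGHT_M_S / (2.0 * separation_m)


if __name__ == "__main__":
    needed = min_bandwidth_hz(0.25)  # 25 cm spacing from the abstract
    print(f"Bandwidth to resolve 0.25 m in range: {needed / 1e6:.0f} MHz")
    print(f"Range resolution at 1 GHz: {range_resolution_m(1e9) * 100:.1f} cm")
```

Under these assumptions, resolving a 25 cm radial gap calls for roughly 600 MHz of bandwidth. Sources that are 25 cm apart but equidistant from the radar would not be separated by range alone.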
Index Terms
UWHear: through-wall extraction and separation of audio vibrations using wireless signals