DOI: 10.1145/3384419.3430713
research-article

Patronus: preventing unauthorized speech recordings with support for selective unscrambling

Published: 16 November 2020

ABSTRACT

The widespread adoption and ubiquity of smart devices equipped with microphones (e.g., cellphones and smartwatches) unfortunately create significant privacy risks. In recent years, there have been several cases of people's conversations being secretly recorded, sometimes initiated by the device itself. Although some manufacturers are trying to protect users' privacy, to the best of our knowledge, no effective technical solution is available. In this work, we present Patronus, a system that prevents unauthorized devices from making secret recordings while still allowing authorized devices to record conversations. Patronus prevents unauthorized speech recording by emitting what we call a scramble, a low-frequency noise generated by inaudible ultrasonic waves. The scramble prevents unauthorized recordings by leveraging the nonlinear effects of commercial off-the-shelf microphones. The frequency components of the scramble are randomly determined and connected with linear chirps, and the frequency period is fine-tuned so that the scramble pattern is hard to attack. Patronus allows authorized speech recording by secretly delivering the scramble pattern to authorized devices, which can use an adaptive filter to cancel out the scramble. We implement a prototype system and conduct comprehensive experiments. Our results show that only 19.7% of words protected by Patronus' scramble can be recognized by unauthorized devices. Furthermore, authorized recordings have a 1.6× higher perceptual evaluation of speech quality (PESQ) score and, on average, 50% lower speech recognition error rates than unauthorized recordings.
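The two mechanisms named in the abstract, a scramble built from random frequencies joined by linear chirps and an adaptive filter that cancels it on authorized devices, can be illustrated with a small simulation. The sketch below is not the authors' implementation. It works directly on the low-frequency signal a microphone would record and does not model the ultrasonic carrier or the microphone nonlinearity. The sample rate, frequency range, segment length, two-tone "speech" stand-in, 3-tap acoustic path, and the choice of normalized LMS (NLMS) as the adaptive filter are all assumptions made for illustration, and the fixed seed is only for reproducibility.

```python
import math
import random


def make_scramble(freqs_hz, seg_s, fs):
    """Phase-continuous tone whose frequency moves linearly (a linear chirp)
    from each entry of freqs_hz to the next over seg_s seconds."""
    n_seg = int(seg_s * fs)
    phase, out = 0.0, []
    for f0, f1 in zip(freqs_hz, freqs_hz[1:]):
        for n in range(n_seg):
            f = f0 + (f1 - f0) * n / n_seg  # instantaneous frequency
            phase += 2.0 * math.pi * f / fs
            out.append(math.sin(phase))
    return out


def nlms_cancel(recorded, reference, taps=16, mu=0.1, eps=1e-6):
    """Normalized LMS: predict the scramble in `recorded` from the known
    `reference` pattern and return the residual (recorded minus prediction)."""
    w = [0.0] * taps    # adaptive filter weights
    buf = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for d, x in zip(recorded, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - y
        norm = eps + sum(b * b for b in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return residual


def mean_power(x):
    return sum(v * v for v in x) / len(x)


fs = 8000                # assumed sample rate (Hz)
rng = random.Random(0)   # fixed seed; a real pattern would be kept secret
freqs = [rng.uniform(100.0, 1000.0) for _ in range(9)]  # 8 chirp segments
scramble = make_scramble(freqs, 0.125, fs)              # 1 second in total

# Hypothetical stand-in for speech: two low-amplitude tones.
speech = [0.3 * math.sin(2 * math.pi * 220 * n / fs)
          + 0.2 * math.sin(2 * math.pi * 450 * n / fs)
          for n in range(len(scramble))]

# Hypothetical 3-tap path from the emitter to the microphone.
path = [0.8, 0.3, -0.1]
recorded = [
    s + sum(h * scramble[n - k] for k, h in enumerate(path) if n >= k)
    for n, s in enumerate(speech)
]

# An authorized device knows the scramble pattern and subtracts its estimate.
cleaned = nlms_cancel(recorded, scramble)

# Compare distortion relative to clean speech, skipping the first half so the
# filter has time to converge.
half = len(speech) // 2
err_unauthorized = mean_power(
    [r - s for r, s in zip(recorded[half:], speech[half:])])
err_authorized = mean_power(
    [c - s for c, s in zip(cleaned[half:], speech[half:])])
print("distortion without the pattern:", err_unauthorized)
print("distortion with the pattern:   ", err_authorized)
```

In this toy setup the device that holds the scramble pattern ends up with much less distortion than the one that does not, which is the asymmetry the abstract describes. The numbers it prints come from this simplified model and say nothing about the PESQ or word-recognition results reported for the prototype.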


Published in

SenSys '20: Proceedings of the 18th Conference on Embedded Networked Sensor Systems
November 2020
852 pages
ISBN: 9781450375900
DOI: 10.1145/3384419

Copyright © 2020 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 174 of 867 submissions, 20%
