SoK: Automatic Deobfuscation of Virtualization-protected Applications

Published: 17 August 2021

Abstract

Malware authors often rely on code obfuscation to hide the malicious functionality of their software, making detection and analysis more difficult. One of the most advanced techniques for binary obfuscation is virtualization-based obfuscation, which converts the functionality of a program into the bytecode of a randomly generated virtual machine which is embedded into the protected program. To enable the automatic detection and analysis of protected malware, new deobfuscation techniques against virtualization-based obfuscation are constantly being developed and proposed in the literature.
In this work, we systematize existing knowledge of automatic deobfuscation of virtualization-protected programs in a novel classification scheme and evaluate where we stand in the arms race between malware authors and code analysts with regard to virtualization-based obfuscation. In addition to a theoretical discussion of different types of deobfuscation methodologies, we present an in-depth practical evaluation that compares state-of-the-art virtualization-based obfuscators with currently available deobfuscation tools. The results clearly indicate that automatic deobfuscation of virtualization-based obfuscation is possible in specific scenarios. However, they also highlight the limitations of existing deobfuscation methods. Multiple challenges still lie ahead on the way towards reliable and resilient automatic deobfuscation of virtualization-based obfuscation.
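
To make the idea of virtualization-based obfuscation from the abstract more concrete, the following is a minimal sketch and not the method of any obfuscator evaluated in the paper. It assumes a toy stack machine with five hypothetical instructions (`PUSH_ARG`, `PUSH_CONST`, `ADD`, `MUL`, `RET`) and a hypothetical protected function `original`. The helper names `generate_vm`, `compile_to_bytecode`, and `run_vm` are illustrative only. Each seed yields a different opcode encoding, so the same function is translated into different bytecode for each generated virtual machine.

```python
import random

# Hypothetical instruction set of the toy virtual machine.
OPS = ["PUSH_ARG", "PUSH_CONST", "ADD", "MUL", "RET"]


def original(x):
    """The unprotected function whose logic is to be hidden."""
    return (x + 3) * 5


def generate_vm(seed):
    """Assign a random, distinct opcode byte to every instruction."""
    rng = random.Random(seed)
    codes = rng.sample(range(256), len(OPS))
    return dict(zip(OPS, codes))


def compile_to_bytecode(opmap):
    """Translate original() by hand into bytecode for the given opcode map."""
    return bytes([
        opmap["PUSH_ARG"],
        opmap["PUSH_CONST"], 3,
        opmap["ADD"],
        opmap["PUSH_CONST"], 5,
        opmap["MUL"],
        opmap["RET"],
    ])


def run_vm(opmap, bytecode, x):
    """Interpreter loop embedded in the protected program: fetch, decode, dispatch."""
    decode = {code: name for name, code in opmap.items()}
    stack, pc = [], 0
    while True:
        op = decode[bytecode[pc]]
        pc += 1
        if op == "PUSH_ARG":
            stack.append(x)
        elif op == "PUSH_CONST":
            stack.append(bytecode[pc])  # operand byte follows the opcode
            pc += 1
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "RET":
            return stack.pop()


if __name__ == "__main__":
    opmap = generate_vm(seed=7)
    bytecode = compile_to_bytecode(opmap)
    print(bytecode.hex(), run_vm(opmap, bytecode, 4) == original(4))
```

An analyst who sees only the interpreter loop and the bytecode must first recover the opcode mapping before the original arithmetic becomes visible. Real obfuscators add many further layers that this sketch omits.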

Published In

ARES '21: Proceedings of the 16th International Conference on Availability, Reliability and Security
August 2021
1447 pages
ISBN:9781450390514
DOI:10.1145/3465481
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Application security
  2. Deobfuscation
  3. Virtualization-based obfuscation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ARES 2021

Acceptance Rates

Overall Acceptance Rate 228 of 451 submissions, 51%
