Abstract
We address the problem of reverse engineering stripped executables, which contain no debug information. This problem is challenging because stripped executables carry little syntactic information and because compiler optimizations produce diverse assembly code patterns. We present a novel approach for predicting procedure names in stripped executables. Our approach combines static analysis with neural models. The main idea is to use static analysis to obtain augmented representations of call sites, encode the structure of these call sites using the control-flow graph (CFG), and finally generate a target name while attending to these call sites. We use our representation to drive graph-based, LSTM-based, and Transformer-based architectures. Our evaluation shows that our models produce predictions that would be difficult and time-consuming for humans to make, while improving on existing methods by 28% and on state-of-the-art neural textual models that do not use any static analysis by 100%. Code and data for this evaluation are available at https://github.com/tech-srl/Nero.
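To make the idea of call sites ordered by the CFG more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy CFG given as an adjacency map and a hand-written table of call sites per basic block; the block names, the callee names, and the argument-kind labels (`CONST`, `RET`, `STK`, `GLOBAL`) are all hypothetical placeholders for whatever a static analysis would produce. The sketch enumerates acyclic entry-to-exit paths and renders each as a sequence of call sites, which is one plausible kind of input for a sequence model.

```python
from typing import Dict, List, Set, Tuple

# A call site: callee name plus abstracted argument kinds (hypothetical labels).
CallSite = Tuple[str, Tuple[str, ...]]

# Hypothetical toy CFG: basic block id -> successor block ids.
CFG: Dict[str, List[str]] = {
    "entry": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

# Hypothetical call sites found in each block, in program order.
CALL_SITES: Dict[str, List[CallSite]] = {
    "entry": [("socket", ("CONST", "CONST", "CONST"))],
    "then": [("connect", ("RET", "STK", "CONST"))],
    "else": [("perror", ("GLOBAL",))],
    "exit": [("close", ("RET",))],
}


def call_site_paths(
    cfg: Dict[str, List[str]],
    call_sites: Dict[str, List[CallSite]],
    entry: str,
) -> List[List[CallSite]]:
    """Collect the call-site sequence along every acyclic entry-to-exit path."""
    results: List[List[CallSite]] = []

    def walk(block: str, visited: Set[str], seq: List[CallSite]) -> None:
        seq = seq + call_sites.get(block, [])
        successors = cfg.get(block, [])
        if not successors:  # a block with no successors ends the path
            results.append(seq)
            return
        for nxt in successors:
            if nxt not in visited:  # paths that would revisit a block are dropped
                walk(nxt, visited | {nxt}, seq)

    walk(entry, {entry}, [])
    return results


def render(seq: List[CallSite]) -> str:
    """Render a call-site sequence as a flat token string."""
    return " ".join(f"{callee}({','.join(args)})" for callee, args in seq)


PATHS = [render(p) for p in call_site_paths(CFG, CALL_SITES, "entry")]
for path in PATHS:
    print(path)
```

In this toy setting each path becomes a short "sentence" of call sites. A graph-based model would instead consume the blocks and edges directly rather than flattened paths.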
Index Terms
Neural reverse engineering of stripped binaries using augmented control flow graphs