Reverse Engineering of Binaries Using Deep Learning Neural Networks

GLOBAL RESEARCH INSTITUTES

Reverse Engineering of Software Binaries using Deep Learning Neural Networks for Vulnerability Detection

Project Summary:

With the sustained growth of software complexity, finding security vulnerabilities in programs has become essential, especially in mobile applications. Given the proliferation of mobile devices and their associated app stores, the volume of new applications is too large for each one to be examined manually for vulnerabilities; applications now ship with thousands of binary executables. Unfortunately, methodologies and tools for testing programs at application scale within a limited time budget are still missing.

Vulnerability analysis has traditionally been based on manually examining an application's behavior and/or decompiled code. This process does not scale easily to large numbers of applications, so there has recently been some work on automatic vulnerability detection using ideas from machine learning. Various methods have been proposed based on examining dynamic application behavior, requested permissions, and the n-grams present in the application byte-code. However, many of these methods rely on expert analysis to design the discriminative features that are passed to the machine learning system making the final classification decision. In addition, most recent techniques rely on supervised learning, which requires many hand-labelled training samples. Manual labelling is simply not a scalable approach going forward.
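As a rough illustration of the hand-designed n-gram features mentioned above, the following sketch counts opcode n-grams in a disassembled instruction sequence. The function name opcode_ngrams and the sample opcode list are hypothetical and chosen only for illustration; they are not taken from any existing tool or from the project itself.

```python
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Count contiguous n-grams in a sequence of opcode mnemonics."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


# Hypothetical disassembly fragment (illustrative mnemonics only).
sample = ["push", "mov", "call", "mov", "call", "ret"]
bigrams = opcode_ngrams(sample, n=2)

# The resulting counts could serve as a feature vector for a classifier.
for gram, count in sorted(bigrams.items()):
    print(" ".join(gram), count)
```

Features of this kind need an expert to choose the n-gram length and to decide which n-grams matter, which is the manual design step the project aims to avoid.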

Recently, deep neural networks have been shown to perform well on a variety of tasks related to natural language processing. In this project, we propose to investigate the application of recurrent neural networks to vulnerability detection by treating the disassembled byte-code of an application as a text to be analyzed. This approach has the advantage that features are learned automatically from raw data, which removes the need for malware signatures to be designed by hand. We will investigate both static and dynamic analysis to predict whether a test case is likely to contain a software vulnerability. Furthermore, we will investigate unsupervised and semi-supervised approaches to learning the features for vulnerability detection.
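To make the idea of treating disassembled byte-code as text more concrete, here is a minimal sketch of a recurrent network reading an opcode sequence one token at a time and producing a single score. It is an untrained toy with random weights, written in plain Python; the names build_vocab and TinyRNN, the hidden size, and the sample opcode sequences are all assumptions made for illustration and do not describe the project's actual model.

```python
import math
import random


def build_vocab(sequences):
    """Map each opcode token to an integer id; id 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


class TinyRNN:
    """Elman-style recurrent network with random (untrained) weights."""

    def __init__(self, vocab_size, hidden_size=8, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.hidden_size = hidden_size
        self.embed = init(vocab_size, hidden_size)  # one input vector per token
        self.w_hh = init(hidden_size, hidden_size)  # hidden-to-hidden weights
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]

    def score(self, token_ids):
        """Return a value in (0, 1) after reading the whole sequence."""
        h = [0.0] * self.hidden_size
        for t in token_ids:
            x = self.embed[t]
            h = [
                math.tanh(x[j] + sum(self.w_hh[j][k] * h[k]
                                     for k in range(self.hidden_size)))
                for j in range(self.hidden_size)
            ]
        logit = sum(w * v for w, v in zip(self.w_out, h))
        return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical opcode sequences standing in for disassembled functions.
corpus = [["push", "mov", "call", "ret"], ["mov", "add", "jmp"]]
vocab = build_vocab(corpus)
model = TinyRNN(len(vocab))
ids = [vocab.get(tok, 0) for tok in corpus[0]]
p = model.score(ids)
print(p)
```

In a real system the weights would be learned from labelled or unlabelled samples, so that the score reflects the likelihood of a vulnerability; here the score only shows how a sequence of opcodes flows through the recurrent state without any hand-designed features.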

How to Apply

Applicants should apply electronically through the Queen’s online application portal at: https://dap.qub.ac.uk/portal/

Contact Details

Dr Paul Miller
Email: p.miller@qub.ac.uk
Telephone: +44 (0)28 9097 4637