Abstract
Despite a considerable number of approaches that have been proposed to protect computer systems, cyber-criminal activities are on the rise and forensic analysis of compromised machines and seized devices is becoming essential in computer security.
This article focuses on memory forensics, a branch of digital forensics that extracts artifacts from volatile memory. In particular, this article looks at a key ingredient required by memory forensics frameworks: a precise model of the OS kernel under analysis, also known as a profile. By using the information stored in the profile, memory forensics tools are able to bridge the semantic gap and interpret raw bytes to extract evidence from a memory dump.
A big problem with profile-based solutions is that custom profiles must be created for each and every system under analysis. This is especially problematic for Linux systems, because profiles are not generic: they are strictly tied to a specific kernel version and to the configuration used to build the kernel. Failing to create a valid profile means that an analyst cannot unleash the true power of memory forensics and is limited to primitive carving strategies.
For this reason, in this article we present a novel approach that combines source code and binary analysis techniques to automatically generate a profile from a memory dump, without relying on any non-public information. Our experiments show that this is a viable solution and that profiles reconstructed by our framework can be used to run many plugins, which are essential for a successful forensics investigation.
1 INTRODUCTION
While traditionally focused on the analysis of the information stored on hard drives, in recent years digital forensics has broadened its scope to cover other components of computer systems. One of these components, the volatile memory, is becoming more and more crucial in many investigations, because it contains a number of artifacts which are not found elsewhere. Moreover, in large organizations, the analysis of volatile memory—better known as memory forensics—is nowadays not only used as part of incident response, but also as a proactive tool to periodically check machines and look for signs of compromise or infection. For example, Microsoft has recently announced Project Freta [32], a cloud-based solution to detect malicious processes and rootkits using memory forensics techniques.
The core idea behind memory forensics is to extract evidences from the data structures used by the operating system kernel. While some of these structures can be located by carving the memory for particular byte patterns, the true power of memory forensics comes from the so-called structured analysis. In most of the cases, this type of analysis starts by finding a set of global symbols inside a memory dump. From these variables, other kernel structures are then discovered by de-referencing pointers [28]. For example, a common task performed in memory forensics consists of listing the processes that were running inside the machine when the memory dump was acquired. Under Linux, a way to retrieve this information is to find the location of the global variable
For Microsoft Windows operating systems, retrieving the correct profile for the system under analysis is not really a problem, because the number of different kernels is limited and well known. Moreover, the layout can be retrieved from the kernel debugging symbols, which are generally available on a public server. On the other hand, memory forensics is increasingly focusing on Linux-based operating systems, both for the analysis of servers and to support a wide range of appliances, mobile phones, and network devices. Unfortunately, when it comes to Linux there is no central symbol server, and the number of combinations of kernel versions and possible configurations is countless.
However, it is important to understand that the main challenge is not to determine the specific version of the kernel under analysis. In other words, what matters is not how much kernel structures change across kernel versions, but how much they change within a single version—because of user configurations or compiler options. Previous research [49] has empirically confirmed this effect and reported that the layout of important forensics structures is affected by the configuration used at compile time. Thus, forensics analysts have to create a profile for each and every system they want to analyze. Currently, this is a manual process that involves the compilation of a special kernel module. While this operation is generally performed on the machine under analysis, it may also be performed offline by cross-compiling the module on the analyst workstation. In both cases, this process has several important requirements. For instance, it requires access to the kernel headers, the kernel configuration file and, in certain cases, the very same compiler toolchain used to build the kernel (as different compilers or compiler versions can result in different data structure offsets and layouts). In some cases, like in the latest development version of Volatility—the de facto standard framework when it comes to memory forensics—the profile generation even requires access to the full debugging symbols or recompiling the entire kernel itself. While the previous constraints might not be an obstacle for a common desktop machine, the required information is rarely available for kernels running on network appliances,
To make things worse, the source code of the kernel module needs to be manually updated every time a new kernel is released [17]. In fact, the definition of several structures used by memory forensics tools is not exported from header files and therefore must be copied into the module source code. For example, the definition of the
Finally, modern kernels include the ability to perform structure layout randomization, which poses a serious “threat” to memory forensics. Originally developed as a protection mechanism by Grsecurity [40] and later studied by other researchers [5, 19, 22], structure layout randomization is nowadays present in the latest versions of the Linux kernel as well. This compile-time option randomizes the layout of sensitive kernel structures, as an effective way to harden the kernel against exploitation. As a side effect, the authors highlight that enabling this option will “prevent the use of forensic tools like Volatility against the system”.
All of the previous limitations are also shared by Rekall [6], another well-known forensics framework. In fact, even if Rekall stores the OS profiles in a different format, the process of extracting the layout of kernel structures is identical to the one implemented by Volatility.
The memory forensics community is well aware of all of these problems, as recently emphasized once again by Case and Richard [4] in their overview of memory forensics open challenges. In the article, the authors urge the community to create a public database of Linux profiles—which nowadays exists only in the form of a commercial solution. Unfortunately, they also note how a “considerable amount of monetary and infrastructure” is needed to create such a database and how, in any case, this approach can only cover settings used by pre-compiled kernels shipped as part of Linux distributions. This is the case of the Volatility community repository [44], which unfortunately has received only a few contributions in the past few years. While this repository contains more than 230 profiles (both for
In the past years, researchers have also proposed partial solutions to the profile generation problem. For example, Case et al. [3] and subsequently Zhang et al. [48] suggested that the layout can be retrieved from the analysis of kernel code. Unfortunately, their manual approach covers only a handful of kernel structure layouts, while nowadays memory forensics requires several hundred of them. On the other hand, approaches such as the one proposed by Socała and Cohen [38] still require the configuration that was used to build the kernel under analysis.
For these reasons, we believe it is time to move away from costly manually-curated profiles and investigate the possibility of designing a holistic and fully automated approach to memory analysis. As a first step in this direction, in this article we propose AutoProfile, a novel approach to automatically create Linux profiles. To the best of our knowledge, this is the first solution to create entire profiles based only on information publicly available or extracted from the memory dump itself. Our experimental results show how the profiles extracted by AutoProfile support several Volatility plugins—such as those that list the running processes and the open files—when targeting a very diverse set of kernels. This set includes a version of a Debian kernel that uses structure layout randomization, an Android kernel, a kernel running on Raspberry Pi devices, a kernel shipped by OpenWrt (a project targeting network devices), and an old version of the Ubuntu kernel released more than a decade ago.
2 RECOVERING OBJECTS LAYOUT FROM BINARY CODE
In this section, we discuss a practical example of how the layout of an object is shaped by the configuration used at compile time, thus making it impossible to deduce the correct offsets of its fields by reasoning only on its definition. We then introduce the core idea behind this article and how it can be generalized to recover the layout of all kernel objects used in memory forensics.
2.1 Problem Statement
The key ingredient that makes memory forensics possible is the availability of the kernel profile: a detailed model of the symbols and data types required to perform the analysis. In the case of Linux memory forensics, a profile contains two separate pieces of information: the addresses of global variables and kernel functions, and the exact layout of kernel objects. The latter is of particular interest for different reasons. First of all, this information is lost during the compilation process and the only way to preserve it is to ask the compiler to emit the debugging symbols. This is often the case for kernels shipped by common Linux distributions that usually provide them in a separate debugging package. Moreover, the Linux kernel is a highly customizable piece of software, designed to run on a large variety of devices and architectures and to suit different needs. This means that the very same kernel version tailored to two different systems can result in dramatic differences between the layout of the kernel objects.
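To make the notion of a profile concrete, the following sketch shows—in the style of Volatility's Python "vtype" dictionaries—the two pieces of information a profile carries. All symbol names, addresses, sizes, and offsets below are illustrative stand-ins, not values taken from a real kernel:

```python
# A profile couples two pieces of information: the addresses of global
# symbols, and the layout (size plus field offsets/types) of kernel objects.
# Every name and number here is illustrative, not from a real kernel build.
profile = {
    "symbols": {
        "init_task": 0xffffffff82414940,   # hypothetical address
    },
    "vtypes": {
        # struct name: [size, {field: [offset, type], ...}]
        "task_struct": [9088, {
            "pid":   [2216, ["int"]],
            "comm":  [2632, ["array", 16, ["char"]]],
            "tasks": [2120, ["list_head"]],
        }],
    },
}

def field_offset(profile, struct, field):
    """Resolve a field offset from the profile, as an analysis tool would."""
    size, fields = profile["vtypes"][struct]
    return fields[field][0]

print(field_offset(profile, "task_struct", "comm"))  # 2632
```

With such a mapping, a tool can turn a raw pointer to a structure into named, typed fields; without it, the same bytes are opaque.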
To illustrate how the customization of the Linux kernel is in fact a problem for memory forensics, we present a practical example in Figure 1. In the left part of this figure, we show a short code snippet responsible for the setup of a task which, in this example, is represented by the
Fig. 1. On the left the C source code we use in our examples, on the right its compiled form.
The first difference between the two versions is present at lines 3 and 1, respectively. The semantics of these two instructions are equivalent: they store the argument
While this is a trivial example, it introduces a very common pattern that is present thousands of times in the kernel codebase. For example, the definition of the
2.2 Data Structure Layout Recovery
The intuition behind this article is that, while the precise structure layout is lost during the compilation process, it affects the code generated by the compiler. More specifically, the displacement used to access the fields of a given object must reflect the layout of the data structures, and therefore can be extracted if we know where each field is used across the entire codebase, and how the code is accessing the field. These two pieces of information allow us to locate the functions that operate on the requested field, and to follow the access pattern that led the code to a particular object. For example, a piece of data can be passed as a parameter, but it can also be referenced by a global variable, reached by traversing another object, or obtained by calling a separate function.
Back to our example, let’s assume we want to recover the offset of the name field. First, by looking at the source code, we can tell that the function
It is important to note that it is very difficult to tell which of the three accesses is the one operating on the field we are interested in. In fact, functions often access dozens of different fields, and compiler optimizations often change the exact order and number of those accesses in the binary code. However, we can leverage the fact that the name field is probably also accessed in other functions, and therefore we can combine and cross-reference multiple candidate locations to narrow down its exact offset. In Section 7.1, we will describe in detail the numerous challenges the layout recovery algorithm needs to face when dealing with complex kernel code and the solutions we adopted to overcome these problems.
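The cross-referencing idea can be sketched as a simple majority vote over the candidate displacements collected from several functions known to access the same field (the candidate lists below are made up for illustration):

```python
from collections import Counter

def vote_offset(candidate_lists):
    """Pick the offset that appears in the most functions' candidate sets.

    Each inner list holds the displacements observed in one function that
    is known (from the source code) to access the target field. The correct
    offset should recur across functions, while spurious candidates
    (accesses to other fields) should not.
    """
    votes = Counter()
    for candidates in candidate_lists:
        # A function votes at most once per offset.
        for off in set(candidates):
            votes[off] += 1
    offset, _ = votes.most_common(1)[0]
    return offset

# Hypothetical displacements observed in three different functions:
observed = [
    [0x10, 0x28, 0x40],   # function A
    [0x28, 0x58],         # function B
    [0x08, 0x28],         # function C
]
print(hex(vote_offset(observed)))  # 0x28
```

Here `0x28` is the only displacement shared by all three functions, so it wins the vote even though each individual function is ambiguous on its own.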
3 PAST APPROACHES
The forensics community is well aware of this problem and over the years has proposed some preliminary solutions, which are summarized in Table 1. The first attempt at solving this problem was published by Case et al. [3] in 2010 and, quite similarly, by Zhang et al. [48] in 2016. The two approaches are quite straightforward: after locating a set of defined functions, the authors extracted the layout of kernel objects by looking at the disassembly of these functions. While we believe this was a step in the right direction, these approaches had several limitations. First of all, both the functions and the corresponding objects were selected manually. This limited the scalability of the solution, and in fact the authors were only able to manually recover a dozen fields in total—while our experiments show that Volatility uses more than two hundred fields. Moreover, to locate the functions in the memory dump, previous solutions rely on the content of
| Approach | Limitations |
|---|---|
| Case and Zhang [3, 48] | Manual approach which requires high knowledge of the kernel internals |
| | Requires dynamic analysis of a running kernel similar to the target one |
| Socała and Cohen [38] | Requires the configuration file used to compile the kernel |
| Type Inference (discussed in Section 9) | Designed to recover the types, and not to distinguish fields in a structure |
| Memory Carving (discussed in Section 9) | Orthogonal to structured memory forensics, does not require a profile |
Table 1. A Review of Previous Attempts at Automated Profile Generation
Case et al. [3] and Zhang et al. [48] also presented another way to find the offset of a field, based on the relationships among global kernel objects. Both authors noted that, for example, the field
Finally, in 2016 Socała and Cohen [38] presented an interesting approach to create a profile on-the-fly, without the need to rely on the compiler toolchain. Their tool,
4 APPROACH OVERVIEW
In this section, we explain our approach to automatically extract a valid memory forensics profile from a memory dump. Our system can be conceptually divided into three independent phases, as illustrated in Figure 2. In the first phase, we find the location of all symbols in the memory dump and we identify the version of the running kernel. During the second phase we use a compiler plugin to analyze the source code of the identified version and emit a set of models—which we call access chains—that describe the way the code operates over a selected set of kernel objects. It is important to note that we only need access to the public source code but not to the exact configuration (kernel options, compiler settings, randomization seed, etc.) that was used to build the kernel captured in the memory dump. The chains extracted in this phase are finally fed into the third component, the exploration engine, which matches them to the actual kernel binary code extracted from the memory dump. The final output of AutoProfile is a working memory forensics profile, which can be used by Volatility to extract evidence from a memory dump.
Fig. 2. AutoProfile overview.
For example, during the second phase AutoProfile would discover that the field
5 PHASE I: KERNEL IDENTIFICATION AND SYMBOLS RECOVERY
The goal of the first phase is to recover two key pieces of information: the version of the kernel and the location of its symbols (functions and global variables).
Locating Kernel Symbols. As we already explained in Section 1, existing memory forensics tools need to know the location of certain global symbols to bootstrap their analysis. On top of that, AutoProfile also requires the location of some kernel functions, which will serve as the basis for our analysis.
The recovery of this information is greatly complicated by two different factors. First of all, unlike other memory forensics tools, we cannot rely on the
The first problem with this solution is that exported symbols constitute only a tiny subset of all the kernel symbols. For this reason, Zhang et al. [48] introduced a way to recover another larger subset of symbols—called the
For these reasons, we designed a novel and generic way to automatically extract the addresses of all kernel functions and global variables. Our approach extends the ideas presented so far, but it relies on automatically finding and executing the
Therefore, to find the physical address of a candidate
Kernel Version Identification. Multiple techniques exist to identify the version of a kernel contained in a memory dump. The straightforward approach consists of grepping for strings that match the format of a Linux kernel banner. However, even though the kernel is generally loaded in the first few megabytes of the physical address space and therefore the correct version should be in the first few matches, this technique can potentially result in several false positives, depending on the content of the memory dump. Because of this, we resort to a more precise identification by extracting the global variable
6 PHASE II: CODE ANALYSIS
At the end of the first phase we identified the version of the running kernel, which we can use to download its corresponding source code. In this second phase we automatically analyze the code to extract three pieces of information: the type definitions, the pre-processor directives, and the access chain models.
The bulk of our analysis is performed by a custom plugin for the Clang compiler, which operates on the Abstract Syntax Tree (AST) of the Linux kernel. While the analysis we need to perform would be much easier and more practical if performed at a later stage of the compilation process—i.e., by working on the compiler intermediate representation—working on the AST provides the advantage of being compatible with all versions of the Linux kernel. In fact, while recent versions of the kernel can be compiled with Clang and a few older versions are supported through a set of manually created patches, for the vast majority of kernel versions Clang is not able to produce an intermediate representation. However, Clang is “fault tolerant” when it builds the AST, and thus it creates one for all versions of the Linux kernel, regardless of whether it can compile the sources.
To recover the aforementioned pieces of information, we compile the kernel configured with
6.1 Pre-processor Directives
The first piece of information we save from the compilation process is the position of
6.2 Type Definitions
Along with the functions’ AST, our plugin also visits the AST representing the definition of kernel objects. When traversing this tree, it saves the type of each object along with the name, type, and definition line of each of its fields. As a special case, when exploring unions, the tool marks the fields they contain accordingly.
The information gathered from parsing a record definition plays an important role in our system. For example, by looking at the order in which the fields are defined, our exploration system can constrain the candidate offsets for a given field. Moreover, the offset of certain fields can be statically deduced (e.g., we safely assume the first field in a structure is always at offset zero).
6.3 Access Chains
To model the way the code accesses kernel objects we introduce the concept of access chain, defined as a triple {Location, Transitions, Source}. In the triple, the Location defines where the access is performed, in terms of a file name, a function name, and a line number. The Transitions element is a list containing the type of the objects and the name of the fields of every data structure involved in the chain. For example, the chain describing the access at line 3 of Figure 3 would contain three elements:
Fig. 3. Example used to explain how the Clang plugin works.
Finally, the third element of an access chain is its Source, which represents how the first variable of the chain is initialized. This information is essential to select, among the memory accesses contained in a function, only those belonging to the target object. In the previous example, since the base variable is
Local variables, which can be legitimately used as base variables for an access, are not valid sources. This is because local variables must be initialized before they can be used and their initialization must fall in one of the previous categories. As we will explain in the next section, a core aspect of the plugin is that it keeps a map from variables to their initialization. This enables the plugin to correctly determine the source for each access chain.
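A minimal sketch of how an access chain and the variable-to-initialization map could be represented follows; the class layout, field names, and example values are our own illustration, not the paper's actual implementation:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AccessChain:
    # Where the access happens: (file, function, line).
    location: Tuple[str, str, int]
    # (object type, field name) for every structure traversed by the chain.
    transitions: List[Tuple[str, str]]
    # How the base variable was initialized, e.g. ("param", 0) for the
    # first parameter, ("global", "init_task"), or ("call", "get_current").
    source: Tuple[str, object]

# Map from local variable name to its initialization: locals are not valid
# sources themselves, so each one is resolved back to a real source.
var_sources = {}
var_sources["tsk"] = ("param", 0)

chain = AccessChain(
    location=("kernel/fork.c", "copy_process", 3),   # hypothetical location
    transitions=[("task_struct", "mm"), ("mm_struct", "pgd")],
    source=var_sources["tsk"],
)
print(chain.source)  # ('param', 0)
```

The lookup through `var_sources` mirrors the rule stated above: a chain rooted at a local variable inherits the source recorded for that variable's initialization.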
The plugin extracts access chains from the kernel source code by parsing three types of nodes in the AST: assignments and declarations, object accesses, and function calls and returns.
Assignments and Declarations are used to maintain the map of all variables and the way they are initialized. For instance, when we encounter the node representing the declaration at line 2
of Figure 3, the plugin first extracts the variable used in the left-hand side (LHS) of the statement. If the type of the variable is a
To simplify the analysis, our plugin only keeps track of one path, and not all possible paths where a variable can be assigned. However, to extract the offset corresponding to a given access, it is sufficient to find one path inside a function that reaches that access, rather than exploring all of them.
Object Accesses (as modeled by
When traversing the objects involved in a chain, the plugin keeps track of how fields are accessed. While the C standard defines the arrow and dot operators as the only ways to access a field, we are also interested in other operators that may affect an access. The first is related to the
Function Calls and Returns are the last two types of nodes explored by the plugin. This information is essential to extract accesses in functions which are inlined by the compiler. When our plugin encounters a function call, we save the name of the called function and its arguments. Similarly to how object accesses are represented, every argument is expressed as an access chain. The only difference is that these chains might have an empty Transitions element. This happens, for example, when one function calls another and it passes as parameter one of its own arguments or a global variable. A similar approach is applied to return statements.
6.4 Non-Unique Functions
Another problem when dealing with projects of the size of the Linux kernel is that function names are not always unique. In fact, the
Finally, for optimization reasons, the compiler can decide to remove a parameter from a function or even split a function in two or more parts. Fortunately, when these optimizations are applied, the compiler also adds a suffix—respectively,
7 PHASE III: PROFILE GENERATION
It is important to point out that a profile includes the layout of only a small subset of all kernel data structures—those that are needed to complete the forensic analysis tasks supported by a given tool. For this reason, our system focuses on recovering only the information actually used by Volatility. However, manually listing the objects used by every Volatility plugin is a tedious and error-prone process, further complicated by the fact that some of these objects vary depending on the kernel version. Therefore, for our tests we decided to instrument Volatility to log every field it traverses, and then we recovered the full list by executing each plugin against a memory dump containing the same kernel version as the one under analysis.
As a result, the actual number of different fields and unique data structures varies among the experiments, ranging from 234 to 239 targets. As we will explain in the next sections, finding the correct offset of a field enables AutoProfile to test other chains that depend on this field. For this reason, we add to the initial set of targets any field that represents a dependency of a field used by Volatility in any access chain. Moreover, to further constrain the offsets extracted for a structure, we expand the set of targets by adding three fields defined before or after any Volatility target.
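The target-expansion step just described can be sketched as follows; the dependency and neighbor maps, and every field name in them, are hypothetical stand-ins for what the source-code analysis would actually produce:

```python
def expand_targets(volatility_fields, chain_deps, neighbors):
    """Grow the initial target set, following the strategy described above.

    volatility_fields: fields logged from instrumented Volatility runs.
    chain_deps: field -> fields it depends on in some access chain.
    neighbors: field -> up to three fields defined just before/after it.
    """
    targets = set(volatility_fields)
    # Add every dependency of a target, transitively.
    work = list(targets)
    while work:
        f = work.pop()
        for dep in chain_deps.get(f, ()):
            if dep not in targets:
                targets.add(dep)
                work.append(dep)
    # Add neighbouring fields to further constrain the structure layout.
    for f in list(targets):
        targets.update(neighbors.get(f, ()))
    return targets

# Hypothetical inputs: one Volatility field with one chain dependency
# and one neighbouring field.
deps = {"task_struct.mm": ["mm_struct.pgd"]}
near = {"task_struct.mm": ["task_struct.active_mm"]}
out = expand_targets({"task_struct.mm"}, deps, near)
print(sorted(out))
```

Even this tiny example shows why the final target count exceeds the number of fields Volatility uses directly: one plugin field pulls in both its chain dependency and its definition neighbors.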
7.1 Binary Analysis
To match the chains extracted during the source code analysis against the functions extracted from an actual memory dump we use
While tracking the memory accesses is independent of the source of a chain, the source dictates how the system is initialized and run. Parameters and function returns are the most straightforward sources to handle. In the first case, a symbolic variable is stored in the corresponding register, while in the second—whenever the function specified in the source is called—we set the
Field Dependencies – AutoProfile often needs to deal with chains spanning multiple objects. For instance, let us consider again our sample chain:
The code reaches the target
In this case, we create multiple symbolic variables and appropriately store them when a memory access belonging to an element is detected. However, since the final assignment of a field offset is obtained by a global majority-voting algorithm, it is possible that a chain cannot be fully analyzed in one pass, but instead requires a recursive approach to first identify all its dependencies.
Nested Structures – A particular type of dependency occurs when the target field is accessed through a nested structure. In C, this may appear, for example, in the form of
This requires our tool to keep track of this displacement, as
7.2 Dealing with Inlined Functions
Since the kernel is always compiled with optimizations turned on, the compiler is quite aggressive when it comes to function inlining. For example, compiling the Linux kernel 5.1 with the default configuration results in the inlining of more than 200,000 call sites. For this reason, being able to cope with function inlining dramatically increases the number of chains our exploration system can test.
When we analyze a memory dump and discover that a given function call has been inlined, we trigger a dedicated routine in charge of merging and inheriting its chains. Our process starts by labeling every chain of the inlined function as forward or backward. Forward chains are those that start from a parameter, while backward ones are those that terminate in return statements. For example, in the following snippet:
the chain at line 2 is a forward chain, while the one at line 4 is both a forward and a backward chain. Our algorithm is divided into two independent parts: in the first one chains are joined, while in the second one they are inherited.
The first part starts by iterating over every pair of caller and callee. If the callee is not inlined, and thus is present in the list of functions extracted from the memory dump, then no action is required. Otherwise, each argument—which is also represented with a chain—is joined with every forward chain of the callee that has the same parameter position as source. Joining is not a commutative operation: the source and the location of the argument chain are left untouched, while the list of objects of the callee chain is appended to that of the argument chain. A similar treatment is reserved for backward chains, but this time in the opposite direction. Every chain of the caller that has an inlined function as source is joined with the backward chains of this function. Since the inlining depth can be greater than 1, i.e., functions called from inlined functions can be inlined as well, we repeat this process in a loop to propagate the presence of freshly joined chains, until no new chain is generated.
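The joining step can be sketched as follows, using a simplified dictionary representation of chains; every file, function, and field name here is hypothetical:

```python
def join(arg_chain, callee_chain):
    """Join an argument chain with a forward chain of an inlined callee.

    Non-commutative: the source and location come from the argument chain,
    while the callee chain's transitions are appended to the argument's.
    """
    return {
        "location": arg_chain["location"],
        "source": arg_chain["source"],
        "transitions": arg_chain["transitions"] + callee_chain["transitions"],
    }

# The caller passes its own first parameter straight through, so the
# argument chain has an empty Transitions element.
arg = {"location": ("fs/open.c", "do_sys_open", 12),
       "source": ("param", 0), "transitions": []}

# A forward chain of the inlined callee, rooted at its first parameter.
fwd = {"location": ("fs/file.c", "fd_install", 4),
       "source": ("param", 0),
       "transitions": [("task_struct", "files"), ("files_struct", "fdt")]}

joined = join(arg, fwd)
print(joined["transitions"])
```

After the join, the caller owns a chain that reaches the callee's field accesses while still being anchored at the caller's own parameter, which is exactly what the binary analysis needs once the call site has disappeared through inlining.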
The second part of the process deals with inheriting from inlined functions all the chains which are neither forward nor backward ones, for example, those that access a global object. In this case, the chain is left unaltered and only added to the set of chains of the caller. In our example, as a result of this process, a function that calls
Once these two steps are finalized, AutoProfile passes over the resulting chains to clean and adjust them. The cleaning process is needed because a target can be present in multiple same-source chains of a function. For this reason, given a target, we delete the chains which are a superset of others, thus ensuring that the target is tested only once. On the other hand, the adjustment deals with chains containing the reference operator or
if
The adjustment of
the compiler may effectively subtract from
7.3 Object Layout Inference
At the end of the binary exploration phase, each target (i.e., each field whose offset we need to extract) has its own list of candidate offsets. Since the lists associated with different fields can overlap, finding the set of offsets that maximizes the number of recovered fields is a global optimization problem. For instance, let us assume that, according to our chain-matching algorithm, three fields of the same data structure can be located, respectively, at offsets
We solve this problem by creating a
A problem with this approach is that if the candidates of a field are wrong and contradict the position constraints, then the model becomes unsatisfiable. To overcome this limitation, when we run into an unsatisfiable model, we explore the solution space by recursively removing soft unsatisfiable constraints.
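A toy version of the layout inference step is sketched below, with brute force standing in for the real constraint solver and made-up field names and vote counts: each field gets at most one offset, offsets must respect the definition order (the hard position constraints), and every matched candidate contributes its votes as a satisfied soft constraint. Dropping a field entirely (offset `None`) plays the role of removing an unsatisfiable soft constraint:

```python
from itertools import product

def infer_layout(fields, votes):
    """Choose one offset per field maximizing total votes, under the hard
    constraint that offsets follow the field definition order.

    fields: names in definition order; votes: field -> {offset: votes}.
    """
    options = [list(votes[f].items()) + [(None, 0)] for f in fields]
    best, best_score = None, -1
    for combo in product(*options):
        offs = [o for o, _ in combo if o is not None]
        if any(a >= b for a, b in zip(offs, offs[1:])):
            continue  # violates the definition order: hard constraint
        score = sum(v for _, v in combo)
        if score > best_score:
            best = dict(zip(fields, (o for o, _ in combo)))
            best_score = score
    return best

# Hypothetical candidates: "flags" at offset 4 has more votes than at 16,
# but offset 4 would contradict "stack" being at 8, so the solver prefers
# the globally consistent assignment.
v = {"state": {0: 5}, "stack": {8: 4, 4: 1}, "flags": {4: 3, 16: 2}}
print(infer_layout(["state", "stack", "flags"], v))
```

The example shows why a per-field majority vote alone is not enough: the winning offset for one field can be ruled out by the hard ordering constraints imposed by its neighbors.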
Finally, the knowledge gained from the previous modeling process is added to the system. This new piece of information will most likely satisfy the dependency or the displacement of other chains that were previously not testable. Hence, we go back and forth between the binary analysis component that resolves the chains and the layout inference component that solves the extracted candidates and constraints, until no other chain is available.
8 EXPERIMENTS
To test AutoProfile we collected a number of memory dumps from systems running different Linux kernels. The list of kernels (summarized in Table 2) was chosen to reflect different major versions (including 2.6, 3.1, 4.4, 4.19, and 5.6) and different configurations. In particular, the first experiment was conducted with the latest version of the kernel shipped by Debian. In the second experiment we reused the same configuration, but this time with structure layout randomization turned on. To study how different randomization seeds can impact our approach, we recompiled the kernel 10 times and report an average value in Table 2. The last four experiments aimed instead at testing AutoProfile in less common memory forensics scenarios, where the traditional approach to create a profile would be difficult to apply. For one test we retrieved the kernel used for Raspberry Pi devices; for another we targeted the kernel used by OpenWrt, a project that targets network devices; in another we recreated a scenario involving a memory dump of an Android device; and for our last test we chose a 10-year-old version of the Linux kernel that does not support Clang. While some of the aforementioned kernels target the embedded and IoT world, the current implementation of AutoProfile supports only
| Version | Release Date | Configuration | Used Fields | Extracted Fields |
|---|---|---|---|---|
| 4.19.37 | 04/2019 | | 234 | 220 (94%) |
| 4.19.37 | 04/2019 | | 234 | 194 (83%) |
| 5.6.19 | 03/2020 | | 227 | 217 (95%) |
| 4.4.71 | 06/2017 | | 236 | 216 (92%) |
| 3.18.94 | 05/2018 | | 239 | 220 (92%) |
| 2.6.38 | 03/2011 | | 226 | 213 (94%) |
Table 2. The Linux Kernels Used in Our Experiments
To run our experiments we downloaded the kernel sources and configurations from the respective repositories. Each kernel was compiled twice: once to be used in our experiments and once to perform our source-code analysis. The first version was configured with the configuration shipped with the distribution and compiled with a supported version of
For this reason, to generate the ground truth required to perform this test, we developed a custom kernel module that, using inline assembly statements, loads into a specific register the offset of a field using the
8.1 Analysis Time
Building the profiles with our automated system took approximately eight hours per experiment. The first phase was the fastest and the only one whose runtime depends on the size of the memory dump. Nevertheless, since the kernel is usually loaded in the lower part of the physical memory, our prototype required only a few seconds to analyze
8.2 Results
The fourth column of Table 2 shows how many unique fields are used by Volatility for the given image. The values range from 227 to 239 but, quite surprisingly, the intersection of these fields counts more than 180 elements. This means that, even though new features frequently land in the kernel tree, a large fraction of the fields used by memory forensics is not affected by kernel development. These fields are mostly related to process management (e.g.,
To answer this question, Table 3 breaks down, for each plugin, the number of fields that were correctly located by AutoProfile and the number of fields for which we extracted a wrong offset. Unfortunately, comparing the list of fields accessed by a plugin is not sufficient to tell whether that plugin is correctly supported by our profile. For example, our instrumented version of Volatility reports that the plugin
| S | C | W | S | C | W | S | C | W | S | C | W | S | C | W | S | C | W | Volatility Plugin |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ● | 11 | 0 | ○ | 8 | 3 | ● | 11 | 0 | ● | 11 | 0 | ● | 12 | 0 | ● | 12 | 0 | |
| ● | 0 | 0 | ● | 0 | 0 | ● | 0 | 0 | ● | 0 | 0 | ● | 0 | 0 | ● | 0 | 0 | |
| ● | 10 | 0 | ○ | 6 | 4 | ● | 10 | 0 | ● | 40 | 0 | ● | 42 | 0 | ○ | 35 | 4 | |
| ● | 4 | 0 | ● | 4 | 0 | ● | 4 | 0 | ● | 4 | 0 | ● | 4 | 0 | ● | 4 | 0 | |
| ○ | 73 | 3 | ○ | 70 | 9 | ○ | 69 | 5 | ○ | 73 | 1 | ● | 73 | 0 | ● | 65 | 0 | |
| ● | 0 | 0 | ● | 0 | 0 | ● | 0 | 0 | ● | 0 | 0 | ● | 0 | 0 | ● | 0 | 0 | |
| ● | 11 | 0 | ○ | 7 | 1 | ● | 11 | 0 | ● | 10 | 0 | ○ | 9 | 1 | ● | 10 | 0 | |
| ○ | 31 | 0 | ○ | 30 | 1 | ○ | 31 | 0 | ○ | 30 | 0 | ● | 30 | 0 | ● | 29 | 0 | |
| ○ | 13 | 1 | ○ | 11 | 3 | ● | 14 | 0 | ○ | 12 | 1 | ● | 13 | 0 | ● | 12 | 1 | |
| ● | 2 | 0 | ◑ | 1 | 1 | ● | 2 | 0 | ● | 2 | 0 | ● | 2 | 0 | ● | 2 | 0 | |
| ○ | 0 | 0 | ○ | 0 | 0 | ○ | 0 | 0 | ○ | 0 | 0 | ○ | 0 | 0 | ● | 0 | 0 | |
| ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | |
| ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 27 | 0 | ● | 26 | 0 | |
| ● | 24 | 0 | ◑ | 23 | 1 | ● | 23 | 0 | ● | 23 | 0 | ● | 24 | 0 | ● | 22 | 0 | |
| ● | 24 | 0 | ● | 24 | 1 | ● | 24 | 0 | ● | 24 | 0 | ● | 24 | 0 | ● | 23 | 0 | |
| ● | 24 | 0 | ● | 24 | 1 | ● | 24 | 0 | ● | 24 | 0 | ● | 24 | 0 | ● | 23 | 0 | |
| ● | 16 | 0 | ● | 16 | 0 | ● | 16 | 0 | ● | 16 | 0 | ● | 16 | 0 | ● | 16 | 0 | |
| ○ | 8 | 0 | ○ | 8 | 0 | ○ | 8 | 0 | ○ | 7 | 0 | ● | 7 | 0 | ● | 7 | 0 | |
| ◑ | 11 | 1 | ◑ | 11 | 1 | ○ | 11 | 1 | ● | 12 | 0 | ● | 12 | 0 | ◑ | 11 | 1 | |
| — | — | — | — | — | — | — | — | — | ● | 11 | 0 | ○ | 9 | 2 | ● | 11 | 0 | |
| ● | 5 | 0 | ● | 5 | 0 | ● | 5 | 0 | ● | 5 | 0 | ● | 5 | 0 | ● | 5 | 0 | |
| ● | 1 | 0 | ● | 1 | 0 | ● | 1 | 0 | ● | 1 | 0 | ● | 1 | 0 | ● | 1 | 0 | |
| ● | 30 | 0 | ◑ | 28 | 2 | ● | 29 | 0 | ○ | 27 | 2 | ● | 30 | 0 | ● | 28 | 0 | |
| ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | ● | 8 | 0 | ● | 9 | 0 | ● | 9 | 0 | |
| ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | ● | 9 | 0 | |
| ○ | 32 | 0 | ○ | 27 | 3 | ○ | 32 | 0 | ○ | 4 | 2 | ○ | 32 | 0 | ○ | 31 | 0 | |
| ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 5 | 0 | ● | 5 | 0 | ● | 5 | 0 | |
| ● | 24 | 0 | ◑ | 22 | 2 | ● | 24 | 0 | ● | 24 | 0 | ● | 24 | 0 | ● | 23 | 0 | |
| ● | 15 | 0 | ○ | 14 | 1 | ● | 15 | 0 | ○ | 13 | 2 | ● | 15 | 0 | ● | 15 | 0 | |
| ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | |
| ○ | 8 | 3 | ○ | 7 | 4 | ○ | 8 | 3 | ○ | 8 | 3 | ○ | 8 | 3 | ○ | 7 | 3 | |
| ● | 20 | 0 | ◑ | 20 | 1 | ● | 20 | 0 | ● | 20 | 0 | ● | 20 | 0 | ● | 19 | 0 | |
| ◑ | 16 | 1 | ◑ | 15 | 2 | ○ | 16 | 1 | ◑ | 15 | 2 | ◑ | 15 | 2 | ◑ | 15 | 2 | |
| ○ | 29 | 1 | ○ | 27 | 3 | ○ | 29 | 1 | ○ | 29 | 1 | ○ | 29 | 1 | ○ | 29 | 1 | |
| ○ | 18 | 4 | ○ | 19 | 3 | ○ | 21 | 1 | ○ | 21 | 3 | ○ | 18 | 6 | ◑ | 20 | 1 | |
| ● | 25 | 0 | ● | 24 | 1 | ● | 24 | 0 | ● | 22 | 0 | ● | 25 | 0 | ● | 22 | 0 | |
| ● | 25 | 0 | ● | 24 | 1 | ● | 24 | 0 | ● | 22 | 0 | ● | 25 | 0 | ● | 22 | 0 | |
| ● | 37 | 0 | ◑ | 34 | 3 | ● | 37 | 0 | ● | 35 | 2 | ● | 37 | 0 | ● | 35 | 0 | |
| ● | 39 | 0 | ◑ | 35 | 4 | ● | 39 | 0 | ● | 37 | 2 | ● | 39 | 0 | ● | 37 | 0 | |
| ● | 7 | 0 | ○ | 6 | 1 | ● | 7 | 0 | ● | 7 | 0 | ● | 7 | 0 | ● | 7 | 0 | |
| ● | 11 | 0 | ● | 11 | 0 | ● | 11 | 0 | ● | 11 | 0 | ● | 11 | 0 | ● | 11 | 0 | |
| ● | 8 | 0 | ● | 7 | 1 | ● | 8 | 0 | ● | 8 | 0 | ● | 8 | 0 | ● | 8 | 0 | |
| ◑ | 13 | 3 | ◑ | 14 | 2 | ● | 16 | 0 | ◑ | 15 | 1 | ◑ | 13 | 3 | ◑ | 12 | 1 | |
| ◑ | 10 | 1 | ◑ | 10 | 1 | ● | 11 | 0 | ● | 11 | 0 | ● | 11 | 0 | ◑ | 12 | 1 | |
| ● | 11 | 0 | ● | 11 | 0 | ○ | 9 | 2 | ○ | 9 | 2 | ○ | 9 | 2 | ● | 11 | 0 | |
| ◑ | 16 | 1 | ◑ | 16 | 1 | ◑ | 16 | 1 | ◑ | 18 | 2 | ◑ | 17 | 3 | ● | 20 | 0 | |
| ○ | 34 | 1 | ○ | 35 | 1 | ○ | 35 | 0 | ○ | 31 | 4 | ○ | 32 | 3 | ○ | 33 | 1 | |
| ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | ● | 6 | 0 | |
| ● | 20 | 0 | ● | 20 | 1 | ● | 20 | 0 | ● | 20 | 0 | ● | 20 | 0 | ● | 19 | 0 | |
| — | — | — | — | — | — | — | — | — | — | — | — | — | — | — | ○ | 30 | 0 | |
| ● | 3 | 0 | ● | 3 | 0 | ● | 3 | 0 | ● | 3 | 0 | ● | 3 | 0 | ● | 3 | 0 | |
| Total Working Plugins | 34 | 22 | 36 | 34 | 40 | 40 | ||||||||||||
Columns C and W report the number of fields used by a plugin that were correctly and wrongly extracted by AutoProfile, respectively.
Table 3. Column S Reports the Status of a Plugin: Symbol ● Denotes a Working Plugin, ◑ a Partially Working One, ○ a Non-working One, and — One not Supported by Volatility
Overall, on the non-randomized memory dumps, between 68% (for OpenWrt) and 78% (for Ubuntu) of the plugins worked correctly with our profile, and between 74% (for OpenWrt) and 88% (for Ubuntu) had at least reduced functionality. In particular, the profile automatically created by AutoProfile was able to support many plugins that are fundamental for a forensics analysis. This includes support for extracting the list of running processes—except their starting time—and much related information, such as their memory mappings, credentials, opened files, and environment variables. Moreover, our profile can be used to successfully list the content of
In other cases, AutoProfile was not able to recover the right offsets for the required fields. For instance, the field
Another interesting observation is that in rare cases plugins are reported as not functional even though all the involved fields were correctly extracted by our framework. By carefully inspecting these cases, we discovered that some of them also require knowing the total size of certain structures. For example, the
Finally, the experiment on the randomized kernel shows that the hard constraints play an important role in our system. More than 140 of the 234 fields used by Volatility are contained in structures affected by layout randomization, and currently AutoProfile is able to correctly extract the offsets of 79% of them.
8.3 Chains Extraction
Table 4 shows detailed statistics about our analysis. Because of space constraints we could not include all 230 fields; we therefore limited the table to the fields belonging to
| Step 1 | | | | | Step 2 | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Total | Expl. | Dep. | Disp. | Model | Total | Expl. | Dep. | Disp. | Model | |
| — | — | — | — | ✓ | — | — | — | — | ✓ | |
| 5 | 3 | 2 | 0 | ✓ | 2 | 2 | 0 | 0 | ✓ | |
| 6 | 3 | 3 | 0 | ✓ | 3 | 3 | 0 | 0 | ✓ | |
| 9 | 2 | 7 | 0 | ✓ | 7 | 7 | 0 | 0 | ✓ | |
| 34 | 2 | 19 | 13 | ✓ | 32 | 0 | 0 | 32 | ✓ | |
| 5 | 3 | 2 | 0 | ✓ | 2 | 2 | 0 | 0 | ✓ | |
| 5 | 3 | 2 | 0 | ✓ | 2 | 2 | 0 | 0 | ✓ | |
| 13 | 7 | 6 | 0 | ✓ | 6 | 6 | 0 | 0 | ✓ | |
| 8 | 3 | 5 | 0 | ✓ | 5 | 5 | 0 | 0 | ✓ | |
| 77 | 64 | 13 | 0 | ✓ | 13 | 11 | 2 | 0 | ✓ | |
| 8 | 1 | 7 | 0 | 4 | 7 | 7 | 0 | 0 | ✓ | |
| 5 | 4 | 1 | 0 | 2 | 1 | 1 | 0 | 0 | 2 | |
| 7 | 1 | 6 | 0 | ✓ | 6 | 6 | 0 | 0 | ✓ | |
| — | — | — | — | ✓ | — | — | — | — | ✓ | |
| 158 | 127 | 31 | 0 | ✓ | 31 | 22 | 9 | 0 | ✓ | |
| 57 | 48 | 9 | 0 | ✓ | 9 | 8 | 1 | 0 | ✓ | |
| 135 | 126 | 9 | 0 | ✓ | 9 | 5 | 4 | 0 | ✓ | |
| 198 | 180 | 18 | 0 | ✓ | 18 | 15 | 3 | 0 | ✓ | |
| 100 | 92 | 8 | 0 | ✓ | 8 | 6 | 2 | 0 | ✓ | |
| 130 | 124 | 6 | 0 | ✓ | 6 | 5 | 1 | 0 | ✓ | |
Table 4. An Excerpt of the Fields Used by Volatility and Some Statistics Associated with Their Exploration
8.4 Comparison with Past Attempts
The first approach to extract a valid profile from a memory dump was presented by Case [3], and subsequently refined by Zhang et al. [48]. Unfortunately, neither of these papers reports how many structure fields they were able to extract. Moreover, both approaches target a restricted number of manually picked kernel functions to extract a field's offset. This design restricts the applicability of these techniques: for instance, it does not deal with cases where a target function was inlined by the compiler, and the list of target functions must be kept in sync with the kernel source code. In comparison, AutoProfile is able to automatically deal with these situations, and once development was completed no changes had to be made to analyze any of the evaluated kernels. Finally, Zhang's approach to extract the kernel symbols does not support kernels randomized with
A second attempt to solve this problem was ORIGEN [11]. While this article reports a precision of 90%, it is difficult to draw conclusions about the effectiveness of this tool, because it was tested on only 6 fields (5 fields of
9 RELATED WORK
Type inference on binary code has been a very active research topic over the past twenty years. In fact, the process of recovering the type information lost during compilation involves several challenges and can be tackled from different angles. The applications that benefit from advances in this field are diverse, including vulnerability detection, decompilation, binary code reuse, and runtime protection mechanisms. Recently, Caballero and Lin [2] systematized the research in this area, highlighting the different applications, the implementation details, and the results of more than 35 different solutions. Among them, some systems are able to recover the layout of records and, in some cases, to associate a primitive type (for example,
An orthogonal approach to structured memory forensics is memory carving, where pattern matching techniques are used to locate kernel structures. The approaches presented in the literature can be roughly divided into two categories. On the one hand, there are solutions that focus on generating constraints at the field level. For example, Dolan-Gavitt et al. [10] proposed a system to find the invariants of a kernel structure—i.e., those fields that cannot be tampered with by rootkits without affecting the stability of the operating system. The authors then used this information to automatically generate signatures for a given structure. Dimsum [20] instead uses a mix of boolean constraints generated from the definition of a data structure and applies probabilistic inference to match data structures in dead memory. On the other hand, there are techniques that rely on points-to relations between kernel objects to generate graph-based signatures [12, 21, 43]. A common problem of all these approaches is that they require building a model for each target OS/kernel the analyst wants to analyze [39]. Again, at least when targeting the Linux kernel, this is only possible if the models were built with a kernel similar to the one under analysis. To overcome this limitation, Song et al. [39] recently presented DeepMem. This approach is divided into two stages. In the first, the training stage, a graph neural network model is trained using several different memory graphs, a labeled representation of a memory dump. Then, in the detection phase, the neural network accepts an unlabeled memory graph and classifies it. Using this machine learning approach, DeepMem is able to automatically learn the features of a kernel object across different operating system versions. Unfortunately, even DeepMem does not solve our problem.
In fact, its memory graph relies on the concept of segments, which represent contiguous chunks of memory between two pointer fields. However, the presence of ifdefs or the use of structure randomization changes the distance between two pointer fields, thus breaking DeepMem's segments.
Recently, the need to recover kernel structure layouts has also emerged in areas other than memory forensics. For example, Pustogarov et al. [29] solved the problem of analyzing Android device drivers of a host kernel by loading them in a second evasion kernel running inside QEMU. In order to correctly load the driver, the layouts of the structures
Finally, the area of Virtual Machine Introspection (VMI) has flourished in the past decade [17]. Systems such as Virtuoso [9], VMST [14], and HyperLink [46] are all able to extract different information from a running virtual machine, but they also require operating from the vantage point of the virtual machine monitor (VMM).
10 DISCUSSION AND FUTURE WORKS
In this section, we discuss the limitations and some potential improvements to our approach.
To assess the impact of the possibly missing information, we ran our experiments twice: once with
Once the page tables had been located, 30 out of 50 plugins worked correctly. The remaining twenty were still malfunctioning because of a missing global variable—with two variables (
Nevertheless, we believe this problem could be tackled by instructing our compiler plugin to record in which functions these global variables are used, and then using this information to extract their addresses from the kernel binary code itself. This is also facilitated by the fact that global variables can be easily identified in the code, as their addresses are typically loaded in a specific way (e.g., in
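On x86-64, for instance, globals are commonly referenced through RIP-relative addressing, so recovering the address reduces to simple arithmetic on the instruction's displacement. The sketch below hand-assembles a `lea` instruction rather than taking one from a real dump:

```python
import struct

def rip_relative_target(insn_addr, insn_bytes):
    """Resolve the address referenced by an x86-64 RIP-relative
    LEA (opcode 48 8D, ModRM with mod=00, rm=101): the target is
    the address of the *next* instruction plus a signed 32-bit
    displacement."""
    assert insn_bytes[:2] == b"\x48\x8d" and (insn_bytes[2] & 0xC7) == 0x05
    disp = struct.unpack("<i", insn_bytes[3:7])[0]
    return insn_addr + 7 + disp  # the instruction is 7 bytes long

# Hand-assembled example: lea rax, [rip + 0x1000] at 0xffffffff81000000
addr = rip_relative_target(0xFFFFFFFF81000000,
                           b"\x48\x8d\x05\x00\x10\x00\x00")
print(hex(addr))  # 0xffffffff81001007
```

A real implementation would of course use a full disassembler and cover the other RIP-relative encodings; this only illustrates why such references are easy to spot and resolve.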
Threat Model – One of the most important applications of memory forensics is investigating attacks and malicious behaviors. Therefore, forensics tools must be resilient against attacks that tamper with their inner workings. We argue that any modification to kernel memory done with the intent of tricking AutoProfile into extracting a wrong profile is highly unlikely. First of all, kernels can be hardened against malicious modifications of their code and data [27]. Moreover, even if these defenses are not deployed, certain modifications might have negative consequences on the stability of the running kernel; something that rootkit authors certainly want to avoid. In particular, the only two pieces of information extracted and used by AutoProfile are the kernel symbols and the kernel code. Tampering with the first can negatively impact any kallsyms user—for example, kernel modules or the
Access Chains Improvements – The operation of accessing structure fields is ubiquitous across the kernel code base, and the number of functions that access only a single field of a structure is rather small. For this reason, a major improvement to AutoProfile would be to save more details about an access chain. For example, our compiler plugin could record the type of access performed on a field, i.e., whether the field is only read or also written. In this way, during the exploration phase, AutoProfile could automatically filter memory accesses belonging to one type or the other. Moreover, when a field is written with a constant defined at compile time, this value could be saved in the access chain and used during the matching process. Finally, another distinctive feature might be the destination of a chain: to know, for example, whether the chain is used as a parameter to another function or as a return value. Overall, we believe that all these new details could drastically reduce the number of candidates extracted from a function and thus improve the layout models.
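As a sketch of this proposed enrichment, an access-chain record could carry the access type and be filtered accordingly; the record layout and names below are hypothetical, not the format emitted by our compiler plugin:

```python
# Hypothetical enriched access-chain record carrying the access type
# and, when available, the compile-time constant that is stored.
from dataclasses import dataclass
from typing import Optional

@dataclass
class AccessChain:
    field: str
    is_write: bool
    constant: Optional[int] = None  # set when a compile-time constant is written

def filter_chains(chains, want_write):
    """Keep only the chains matching the requested access type,
    shrinking the candidate set during the exploration phase."""
    return [c for c in chains if c.is_write == want_write]

chains = [AccessChain("comm", is_write=False),
          AccessChain("pid", is_write=True, constant=0)]
print([c.field for c in filter_chains(chains, want_write=True)])  # ['pid']
```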
Extracting the size of a structure – Despite having a correct model for all the used fields, four plugins also require the size of certain structures to work properly. A first way to extract this information is to find the offset of the last field of a structure and add its size. One problem is that the compiler might have added some padding at the end of the structure, so the computed value might need some adjustment. However, this padding does not depend on the user configuration, but only on the compiler toolchain—and it is often limited to a few values. Moreover, if a global variable has the type of this structure, then the structure size can be deduced from the distance to the following global variable. Also in this case, the compiler might have padded the global variable instance, so minor adjustments may be required. Finally, this value might also be present in the kernel binary, when the
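The first strategy can be sketched as a small computation, assuming the structure's alignment is known (a simplification: the exact tail padding depends on the ABI and toolchain):

```python
def struct_size_estimate(last_offset, last_size, struct_align):
    """Estimate a structure's size as the end of its last field,
    rounded up to the structure's alignment: the compiler pads the
    tail so that every element of an array of the structure stays
    aligned."""
    end = last_offset + last_size
    return (end + struct_align - 1) & ~(struct_align - 1)

# e.g. last field is an 8-byte pointer at offset 20, 8-byte alignment:
print(struct_size_estimate(20, 8, 8))  # 32  (28 rounded up to 32)
```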
Volatility Targets – Instead of trying to reconstruct the layout of each and every structure defined in the kernel codebase, in this article we focused on extracting the offsets of the fields used by a considerable number of Volatility plugins. For this reason, we acknowledge that the list of targeted fields might not be exhaustive or cover every possible forensics analysis that will be developed in the future. On the other hand, we do not believe this is a limitation of AutoProfile itself, because this problem can be solved simply by re-running our phase-three analysis whenever a new field is needed.
11 ARTIFACTS
We will share all the artifacts generated by our study, to foster more research in this field, at the following URL: https://github.com/pagabuc/autoprofile. This includes the prototype tool we developed to generate the profiles, the tool to retrieve the kernel symbols from a memory dump, and all the memory images used in our experiments.
REFERENCES
- [1] 2018. Bug 84052 - Using randomizing structure layout plugin in linux kernel compilation doesn't generate proper debuginfo. Retrieved November 2020 from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84052.
- [2] 2016. Type inference on executables. ACM Computing Surveys 48, 4 (2016), 65.
- [3] 2010. Dynamic recreation of kernel data structures for live forensics. Digital Investigation 7, 1 (2010), S32–S40.
- [4] 2017. Memory forensics: The path forward. Digital Investigation, Special Issue on Volatile Memory Analysis 20 (2017), 23–33.
- [5] 2015. A practical approach for adaptive data structure layout randomization. In Proceedings of the European Symposium on Research in Computer Security. Springer, 69–89.
- [6] 2014. Rekall memory forensics framework. Retrieved October 15, 2021 from www.rekall-forensic.com.
- [7] 2018. Using crash with structure layout randomized kernel. Retrieved November 2020 from https://crash-utility.redhat.narkive.com/WZYTWez6/using-crash-with-structure-layout-randomized-kernel.
- [8] 2008. Z3: An efficient SMT solver. In Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 337–340.
- [9] 2011. Virtuoso: Narrowing the semantic gap in virtual machine introspection. In Proceedings of the 2011 IEEE Symposium on Security and Privacy. IEEE, 297–312.
- [10] 2009. Robust signatures for kernel data structures. In Proceedings of the 16th ACM Conference on Computer and Communications Security. ACM, 566–577.
- [11] 2016. ORIGEN: Automatic extraction of offset-revealing instructions for cross-version memory analysis. In Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security. ACM, 11–22.
- [12] 2014. MACE: High-coverage and robust memory analysis for commodity operating systems. In Proceedings of the 30th Annual Computer Security Applications Conference. ACM, 196–205.
- [13] 2011. SmartDec: Approaching C++ decompilation. In Proceedings of the 2011 18th Working Conference on Reverse Engineering. IEEE, 347–356.
- [14] 2012. Space traveling across VM: Automatically bridging the semantic gap in virtual machine introspection via online kernel data redirection. In Proceedings of the 2012 IEEE Symposium on Security and Privacy. IEEE, 586–600.
- [15] 2016. ksfinder - Retrieve exported kernel symbols from physical memory dumps. Retrieved November 2020 from https://github.com/emdel/ksfinder.
- [16] 2019. Sleak: Automating address space layout derandomization. In Proceedings of the 35th Annual Computer Security Applications Conference. 190–202.
- [17] 2014. SoK: Introspections on trust and the semantic gap. In Proceedings of the 2014 IEEE Symposium on Security and Privacy. IEEE, 605–620.
- [18] 2014. Recovering C++ objects from binaries using inter-procedural data-flow analysis. In Proceedings of the ACM SIGPLAN on Program Protection and Reverse Engineering Workshop 2014. ACM, 1.
- [19] 2019. POLaR: Per-allocation object layout randomization. In Proceedings of the 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. IEEE, 505–516.
- [20] 2012. Dimsum: Discovering semantic data of interest from un-mappable memory with confidence. In Proceedings of the 19th Annual Network and Distributed System Security Symposium.
- [21] 2011. SigGraph: Brute force scanning of kernel data structure instances using graph-based signatures. In Proceedings of the Network and Distributed System Security Symposium.
- [22] 2009. Polymorphing software by randomizing data structure layout. In Proceedings of the International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Springer, 107–126.
- [23] 2010. Automatic reverse engineering of data structures from binary execution. In Proceedings of the 11th Annual Information Security Symposium, 5.
- [24] 2013. System V application binary interface. AMD64 Architecture Processor Supplement, Draft v0.99 (2013), 57.
- [25] 2017. dynStruct: An automatic reverse engineering tool for structure recovery and memory use analysis. In Proceedings of the 2017 IEEE 24th International Conference on Software Analysis, Evolution and Reengineering. IEEE, 497–501.
- [26] 1999. Type-based decompilation (or program reconstruction via type reconstruction). In Proceedings of the European Symposium on Programming. Springer, 208–223.
- [27] 2020. LKRG - Linux Kernel Runtime Guard. Retrieved November 2020 from https://www.openwall.com/lkrg/.
- [28] 2019. Back to the whiteboard: A principled approach for the assessment and design of memory forensic techniques. In Proceedings of the 28th USENIX Security Symposium (USENIX Security 19). 1751–1768.
- [29] 2020. Ex-vivo dynamic analysis framework for Android device drivers. In Proceedings of the Symposium on Network and Distributed System Security.
- [30] 2015. Unicorn - The ultimate CPU emulator.
- [31] 2017. BootStomp: On the security of bootloaders in mobile devices. In Proceedings of the 26th USENIX Security Symposium (USENIX Security 17). 781–798.
- [32] 2020. Toward trusted sensing for the cloud: Introducing Project Freta. Retrieved November 2020 from https://www.microsoft.com/en-us/research/blog/toward-trusted-sensing-for-the-cloud-introducing-project-freta/.
- [33] 2014. Image-based kernel fingerprinting. Digital Investigation 11, Supplement 2 (2014), S13–S21.
- [34] 2010. Locating x86 paging structures in memory images. Digital Investigation 7, 1–2 (2010), 28–37.
- [35] 2016. SoK: (State of) the art of war: Offensive techniques in binary analysis. In Proceedings of the IEEE Symposium on Security and Privacy.
- [36] 2010. DDE: Dynamic data structure excavation. In Proceedings of the 1st ACM Asia-Pacific Workshop on Systems. 13–18.
- [37] 2011. Howard: A dynamic excavator for reverse engineering data structures. In Proceedings of the Symposium on Network and Distributed System Security.
- [38] 2016. Automatic profile generation for live Linux memory analysis. In Proceedings of the Third Annual DFRWS Europe (DFRWS'16), Volume 38.
- [39] 2018. DeepMem: Learning graph neural network models for fast and robust memory forensic analysis. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 606–618.
- [40] 2006. Grsecurity. Retrieved May 27, 2006 from http://grsecurity.net/lsm.php.
- [41] 2016. Universal memory forensic analysis of Android systems. Retrieved November 2020 from https://github.com/psviderski/volatility-android.
- [42] 2010. Reconstruction of composite types for decompilation. In Proceedings of the 2010 10th IEEE Working Conference on Source Code Analysis and Manipulation. IEEE, 179–188.
- [43] 2014. SigPath: A memory graph based approach for program data introspection and modification. In Proceedings of the European Symposium on Research in Computer Security. Springer, 237–256.
- [44] 2021. Volatility profiles for Linux and Mac OS X. Retrieved November 2020 from https://github.com/volatilityfoundation/profiles.
- [45] 2007. The Volatility framework: Volatile memory artifact extraction utility framework. Retrieved March 19, 2015 from https://www.volatilesystems.com/default/volatility.
- [46] 2016. HyperLink: Virtual machine introspection and memory forensic analysis without kernel source code. In Proceedings of the 2016 IEEE International Conference on Autonomic Computing. IEEE, 127–136.
- [47] 2014. Modeling and discovering vulnerabilities with code property graphs. In Proceedings of the 2014 IEEE Symposium on Security and Privacy. IEEE, 590–604.
- [48] 2016. An adaptive approach for Linux memory analysis based on kernel code reconstruction. EURASIP Journal on Information Security 2016, 1 (2016), 14.
- [49] 2017. Research on Linux kernel version diversity for precise memory analysis. In Proceedings of the International Conference of Pioneering Computer Scientists, Engineers and Educators. Springer, 373–385.