Abstract
Security-critical software applications contain confidential information that must be protected from leaking to unauthorized systems. Language-based techniques can enforce the confidentiality of such applications, for example, type systems whose typing rules enforce an information flow policy. The precision of such type systems, especially in object-oriented languages, is an area of active research: an appropriate type system should soundly preserve noninterference without rejecting too many secure programs. In this work, we introduce the language SIFO, which supports information flow control for an object-oriented language with type modifiers. Type modifiers increase the precision of the type system by utilizing immutability and uniqueness properties of objects to detect information leaks. We present SIFO informally with examples that demonstrate the applicability of the language, formalize the type system, prove noninterference, implement SIFO as a pluggable type system in the programming language L42, and evaluate it with a feasibility study and a benchmark.
1 INTRODUCTION
In security-critical software development, it is important to guarantee the confidentiality and integrity of the data. For example, in a client-server application, the client has a lower privilege than the server. If the client reads information from the server in an uncontrolled manner, we may have a violation of confidentiality; this causes the client to release too much information to the user. On the other hand, if the server reads information from the client in an uncontrolled manner, we may have a violation of integrity; this causes the server to accept input that has not been validated.
Language-based techniques such as type systems are used to ensure specific information flow policies for confidentiality or integrity [Sabelfeld and Myers 2003]. A type system assigns an explicit security type to every variable and expression, and typing rules prescribe the allowed information flow in the program and reject programs violating the security policy. For example, we can define a security policy as a lattice of security levels with a highest level high for confidential data and a lowest level low for public data; the policy then allows information to flow only from low to high, never in the opposite direction.
For simple while-languages, type systems that control information flow are widely studied [Hunt and Sands 2006; Li and Zhang 2017; Volpano et al. 1996]. We focus on the less researched area of information flow control for object-oriented languages. Analysis techniques such as fine-grained taint analysis [Enck et al. 2014; Hedin et al. 2014; Arzt et al. 2014; Huang et al. 2012; Milanova and Huang 2013; Huang et al. 2014; Graf et al. 2013] detect insecure flows from sources to sinks by analyzing the flow of data in the program. Coarse-grained dynamic information flow approaches [Xiang and Chong 2021; Nadkarni et al. 2016; Jia et al. 2013; Roy et al. 2009] reduce the annotation effort by tracking information at the granularity of lexically or dynamically scoped sections of code instead of program variables. By writing annotations, users can increase the precision of the information flow results [Xiang and Chong 2021]. Moreover, there are approaches using program logic [Beckert et al. 2013; Amtoft et al. 2006, 2008] to analyze and reason about information flow.
In this work, we focus on security type systems for object-oriented languages [Strecker 2003; Barthe et al. 2007; Banerjee and Naumann 2002; Myers 1999]. Sun, Banerjee, and Naumann [Banerjee and Naumann 2002; Sun et al. 2004] created a Java-like language annotated with security levels for the standard information flow policy with only two security levels. Myers et al. [Myers 1999] created the Jif language, which extends Java with a type system that supports information flow control. The precision of such type systems for object-oriented languages is a major challenge. Neither of these approaches has an alias analysis or an immutability concept, so they conservatively reject secure programs in which confidential and non-confidential references could alias the same object. This important drawback is addressed in our work. Additionally, as done for other type systems, we give a correctness guarantee through a proof of noninterference.
We introduce SIFO, which supports information flow control for an object-oriented language with type modifiers for mutability and alias control [Giannini et al. 2019]. Compared with former work on security type systems for object-oriented languages, SIFO provides a more precise type system that accepts more correct programs. In this work, we show that reasoning about immutability and encapsulation is beneficial for reasoning about information flow. In addition to added expressivity, SIFO allows a natural and compact programming style in which only a small part of the code needs to be annotated with security levels. This result is achieved by building on the concept of promotion/recovery [Giannini et al. 2019; Gordon et al. 2012] and extending it to allow methods and data structures to be implicitly parametric in the security level. For example, with promotion, a data structure can be used with any security level, but security is still enforced by not allowing data structures of different security levels to interfere with each other. This reduces the programming effort of developers and supports reuse of programs and libraries [Giannini et al. 2019].
The contents of this paper are as follows. First, we introduce the language SIFO for information flow control. Second, we formalize the type system by introducing typing and reduction rules. Third, we show that our language is sound by proving the noninterference property: secret data is never observable from the public state. Fourth, we implement SIFO and evaluate it with a feasibility study and a benchmark that compares SIFO with state-of-the-art information flow analysis tools.
2 INFORMAL PRESENTATION OF SIFO
In this section, we explain the challenges of securely checking information flow in object-oriented languages. We then give an informal introduction to SIFO. Finally, we discuss well-typed and ill-typed SIFO expressions in more detail.
2.1 Motivating Example
Consider the following partially annotated code using two security levels
In SIFO, security is an instance based property: the person
When extracting the value of the field
Is this code conceptually correct with respect to information flow? Can we complete the type annotations on this code to make it correct? If the
The corresponding Jif code is also quite easy, but a little more involved and with a different syntax; we report it in Listing 1. Both in SIFO in Line 6, and in Jif in Line 9, the value of
Listing 1. Example in Jif syntax.
What happens if the
As you can see, not much has changed. Of course, we need to define the
On the other hand, Jif [Myers 1999] (the most closely related work) cannot accept this kind of code. In a pure object-oriented setting, everything is an object, and pre-defined types, such as integers, should be treated like any other object. However, Jif treats primitive types in a privileged way: in Jif, it is possible to write more flexible code relying on primitive types than on objects. The difficulty revolves around aliasing and mutation: the local variable
In L42, the default modifier for references is
Since the
To make the same kind of behavior accepted in Jif, the code would have to be modified in the following way:
This code is accepted by both SIFO and Jif. This is a technique called defensive cloning [Bloch 2016]; it is very popular in settings where aliasing and mutability cannot be controlled.
In SIFO, we have mutable and immutable objects; where the reachable object graph (\(\mathtt {ROG}\)) of an immutable object is composed only of other immutable objects (deep immutability), while the \(\mathtt {ROG}\) of a mutable object can contain both mutable and immutable objects [Giannini et al. 2019]. The set of mutable objects in a \(\mathtt {ROG}\) is called \(\mathtt {MROG}\).
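The deep-immutability invariant and the \(\mathtt{MROG}\) can be illustrated with a small checker over a toy heap. The encoding below (Python dictionaries with a modifier and a field map per object) is our own illustration and not part of the SIFO formalization:

```python
# Toy heap checker: every object in the reachable object graph (ROG) of an
# immutable object must itself be immutable (deep immutability), while the
# ROG of a mutable object may mix modifiers. The heap encoding is hypothetical.
def rog(heap, oid):
    """All object ids reachable from oid, including oid itself."""
    seen, stack = set(), [oid]
    while stack:
        o = stack.pop()
        if o in seen:
            continue
        seen.add(o)
        stack.extend(heap[o]["fields"].values())
    return seen

def deeply_immutable(heap, oid):
    """Check the deep-immutability invariant for the object at oid."""
    return all(heap[o]["mdf"] == "imm" for o in rog(heap, oid))

def mrog(heap, oid):
    """The mutable part of the ROG, called MROG in the paper."""
    return {o for o in rog(heap, oid) if heap[o]["mdf"] == "mut"}
```

For an immutable object, `mrog` is always empty; for a mutable object, it contains the object itself and every mutable object it reaches.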
In addition to
Capsule variables are affine, that is, they can be used at most once; thus, if
As these examples show, aliasing and mutability control is a fundamental tool for supporting information flow in the context of an object-oriented language. A typical misunderstanding of type modifiers is that a mutable field always refers to a mutable object. This is not the case: all the fields of immutable objects transitively contain only immutable objects, including all fields originally declared as mutable. The same applies to security labels: a
Note how in our example the class
2.2 SIFO Concepts
Objects and references. As we anticipated above, in SIFO, we have mutable and (deeply) immutable objects. We also have four kinds of references:
The “only used once” restriction is necessary so that no alias to the isolated portion of the heap can be introduced, which would violate the
Finally, a
Types. Types in SIFO are composed of a security level \(\mathit {s}\), a type modifier \(\mathit {mdf}\), and a class name \(\mathit {C}\). The security levels \(\mathit {s}\) are arranged in a lattice that specifies the allowed data flow direction between them. For example, we have a lattice with a
Core Calculus. The syntax of the core calculus of SIFO is shown in Figure 1. It covers classes \(\mathit {C}\), field names f, method names m, and declarations for classes, interfaces, and methods. A class consists of fields and methods. The class itself has no modifier or security level. The modifiers and security levels are associated with references and expressions. A field has a type \(\mathit {T}\) and a name. A method has a return type, a list of input parameters with names and types, and also a security level and a type modifier for the receiver; they are specified in front of the keyword
Fig. 1. Syntax of the core calculus of SIFO.
Method Calls. A method has to be defined in a class with parameter types, a return type and a receiver type. For example, an
We can call such a method if the receiver and the actual input parameter are
Control Flow and Implicit Information Leaks. Information flow control mechanisms [Sabelfeld and Myers 2003; Volpano et al. 1996] are used to enforce an information flow policy that specifies the allowed data flow in programs. A program can leak information directly through a field update; this can be prevented by ensuring that no confidential data is assigned to a less confidential variable. However, information can also flow implicitly through conditionals, loops, and (crucially in OO) dynamic dispatch. For example, the chosen branch of a conditional reveals information about the values in the guard. As shown by Smalltalk [Goldberg and Robson 1983], in a pure OO language, dynamic dispatch can be used to emulate conditional statements and various forms of iteration and control flow. Thus, our core language does not contain explicit conditional statements, but they can be added as discussed in Section 4. Loops can be implemented through recursive method calls.
Therefore, SIFO only needs a secure method call rule to prevent implicit information flow leaks. In a method call, information of the method receiver can flow to the return value and to mutable parameters. Thus, the security levels of the return value and of all mutable parameters have to be equal to or higher than the security level of the receiver. Consider for example the following code:
Listing 2. SIFO examples.
2.3 Examples of Well-Typed and Ill-Typed SIFO Expressions
In Listing 2, we show secure and insecure programming statements to explain the reasoning about information flow in SIFO.
A class
Consider the assignments in Listing 2 starting with Line 5 (line numbers are referenced in parentheses in the following): To ensure confidentiality, the type system prevents the password from being leaked via a
So far, we have explained assignments of immutable Strings, but the most interesting challenge for guaranteeing confidentiality concerns assigning mutable objects instead. For example, how can we update the mutable
A secure assignment without aliases is shown in (20, 21). Here, the
As discussed before, aliases over
Finally, with
For a more compelling example of our system that can promote expressions, consider the following listing:
The method
Any method that takes a single
3 DEFINITIONS FOR THE SIFO TYPE SYSTEM
In this section, we define well-formedness of the type system and useful helper functions, in preparation for the typing rules introduced in the following section.
Well-Formedness. A well-formed program respects the following conditions: All classes and interfaces are uniquely named. All methods in a specific class or interface are uniquely named. All fields in a specific class are uniquely named. All parameters in a method header are uniquely named, and there is no explicit method parameter called
Helper Functions. In Figure 2, we show some helper functions for our type system. The first three notations extract the security level, the type modifier, and the class name from a type. The next two return fields and class declarations. The \(\mathit {lub}\) operator is defined to return the least upper bound of a set of input security levels arranged in a lattice. For example, since
Fig. 2. Helper functions.
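As a small illustration of the \(\mathit{lub}\) helper, the following sketch fixes the two-point lattice used later in the paper (low below high); SIFO itself allows an arbitrary lattice, so this encoding is only an assumption for the example:

```python
# Two-point security lattice low < high (illustrative; SIFO permits any lattice).
LOW, HIGH = "low", "high"
ORDER = {(LOW, LOW), (LOW, HIGH), (HIGH, HIGH)}  # pairs (s, s') with s <= s'

def leq(s1, s2):
    """The lattice order: may information flow from s1 to s2?"""
    return (s1, s2) in ORDER

def lub(levels):
    """Least upper bound of a set of security levels.
    This linear scan is sufficient because the two-point lattice is a chain."""
    result = LOW
    for s in levels:
        if not leq(s, result):
            result = s
    return result
```

For the empty set, `lub` returns the bottom element `low`, matching the usual convention for least upper bounds.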
In Figure 2, the function \(\mathit {s}\ \mathit {mdf}\ \mathit {C}[\mathit {s}^{\prime }]\) returns a new type whose security level is the least upper bound \(\mathit {lub}(\mathit {s}, \mathit {s}^{\prime })\); the modifier and the class remain the same. The last function \(\mathit {mdf}\rhd \mathit {mdf}^{\prime }\) computes the resulting modifier when a field with type modifier \(\mathit {mdf}^{\prime }\) is accessed from a reference with type modifier \(\mathit {mdf}\). For example, if we access a
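The modifier combination can be sketched for the mut/imm fragment. The behavior below follows the stated invariant that anything declared as, or read through, an immutable reference is immutable; the remaining modifier cases are elided because this excerpt does not spell them out:

```python
# Sketch of the modifier-combination operator mdf ▷ mdf' used for field access.
# Grounded cases only: an imm field stays imm regardless of the receiver, and
# reading through an imm receiver yields an imm result (deep immutability);
# mut through mut stays mut. Other modifier cases are an elided assumption.
def combine(mdf_receiver, mdf_field):
    if mdf_receiver == "imm" or mdf_field == "imm":
        return "imm"
    if mdf_receiver == "mut" and mdf_field == "mut":
        return "mut"
    raise NotImplementedError("remaining modifier cases elided in this sketch")
```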
Multiple Method Types. Instead of a single method type as in Featherweight Java [Igarashi et al. 2001], we return a set of method types using \(\mathit {methTypes}(C, m) = \lbrace \overline{\mathit {T}_0} \rightarrow \mathit {T}_0, \dots ,\ \overline{\mathit {T}_n} \rightarrow \mathit {T}_n\rbrace\). The programmer declares a method with a single type, and the others are deduced by applying the transformations shown in Figure 3. Multiple method types reduce the need to implement the same functionality several times when a parameter differs only in its type modifiers or security levels. The base case, as declared by the programmer, can be transformed in various ways: (1) A method working on lower security data can be transparently lifted to work on higher security data (some security level \(s^{\prime }\)). This means that methods that are not concerned with security are usually declared as working on a
Fig. 3. Definition of multiple method types.
(2) The second case swaps all
can be also used as if it was declared as
where all
4 TYPING RULES
The typing rules are presented in Figure 4. We assume a reduction similar to Featherweight Java [Igarashi et al. 2001; Pierce 2002]. We have a typing context \(\Gamma ::= x_1:T_1\ldots x_n:T_n\) which assigns types \(\mathit {T}_i\) to variables \(x_i\).
Fig. 4. Expression typing rules.
Sub and Subsumption. We allow traditional subsumption for modifiers and class names. However, we are invariant on the security level. We assume our interfaces to induce the standard subtyping between class names.
T-Var.
Field Access. The result of the field access has the class of the field f. The security level is the least upper bound of the security levels of \(e_0\) and f. The resulting modifier is the sum of the modifiers of \(e_0\) and f as defined in Figure 2. In this way, if we read a
Field Assign. The reference resulting from \(e_0\) has to be
Call. We allow a method call if there is a method type where all parameters and the return value are typable. The security levels of the return type and all
New. The newly allocated object is created as a mutable object with a specified security level s. This rule checks that the parameter list \(e_1 \dots e_n\) has the same length as the declared fields. The object of class C has a list of fields \(f_1 \dots f_n\), and each parameter \(e_i\) is assigned to a field \(f_i\). This assignment is allowed if the type of parameter \(e_i\) is (a subtype of) \(T_i[s]\). The programmer can choose s to raise the expected security level over the level originally declared for the fields. In order to use an actual parameter with a higher security level s to initialize a field defined with a lower security level, the newly created object needs to have this chosen security level s. By using rule Sec-Prom, we can do the opposite and initialize higher security fields with lower security objects.
Prom. Promotion from
Sec-Prom. Security promotion raises the security level of a
M-Ok. This rule checks that the definition of a method is well typed. Using the receiver type and the parameter types, e must have the same type as the declared return type.
C-Ok. This rule checks that the definition of a class is well typed. The rule uses a helper function \(\mathit {mhs}\) which returns the method headers declared in a set of classes or interfaces, or directly returns the headers of a set of methods. A well typed class C implements all methods that are declared in the interfaces \(\overline{C}\).
I-Ok. A correct interface must contain all method headers of the implemented interfaces \(\overline{C}\).
Implicit Information Flows. The language as presented is minimal but using well-known encodings it can support imperative update of local variables (use box objects with a single field and field updates) and conditionals (use any variation of the Smalltalk [Goldberg and Robson 1983] way to support control structures). However, in Figure 5, for the sake of a more compelling explanation, we show how the
Fig. 5. Extension: expression typing rules for if and declassify.
Listing 3. Ill-typed example of a method call.
Therefore, assignments to less confidential variables or fields in the branches are prohibited to prevent leaks. This means that if the expression in the guard of a conditional statement has a security level that is higher than the lowest security level, only assignments to variables of at least the security level of the guard are allowed. Additionally, only mutable objects of at least the security level of the guard can be mutated. In this way, only data whose security level is at least the one of the guard can be mutated.
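This constraint can be phrased as a minimal sketch over the two-point lattice, with `pc` standing for the security level of the guard and `target` for the level of the variable assigned (or the mutable object mutated) in a branch; the encoding is illustrative, not part of the formal rules:

```python
# Sketch of the implicit-flow constraint for conditionals: inside a branch
# whose guard has level pc, assigning to (or mutating) a target of level
# `target` is legal only if pc flows to target, i.e. pc <= target.
LEQ = {("low", "low"), ("low", "high"), ("high", "high")}

def assignment_allowed(pc, target):
    """May a branch under guard level pc write to a target of this level?"""
    return (pc, target) in LEQ
```

Under a `high` guard, only `high` targets may be written, which is exactly why the chosen branch reveals nothing to a `low` observer.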
Using the Smalltalk-style encoding discussed above, our pre-existing rules handle the encoded code exactly as the explicit If-rule, Booleans, and local variables would. Thus, our system is minimal but does not lack expressiveness. The If-rule has a similar constraint as the Call-rule, where return type and
The following example, in both OO style and with an explicit
Declassification. A
Decl. The
5 PROOF OF NONINTERFERENCE
In this section, we aim to ensure noninterference [Goguen and Meseguer 1982] according to our information flow policy. Noninterference is a central criterion for secure information flow, as we want to ensure that an attacker cannot deduce confidential data by observing data with lower security levels. It is based on the indistinguishability of program states: two program states are indistinguishable (also referred to as observably similar) up to a certain security level if they agree on the memory reachable from references with a security level lower than that level. Using this property, a program satisfies noninterference (Theorem 5.0) if and only if the following holds: if a program is executed in two observably similar memories up to a certain security level, then the resulting memories are also observably similar up to the same security level, although they may differ in higher security levels.
If we have expressions \(e_1\) and \(e_2\) without declassification that are well typed and have the same low values but possibly different high values (\(e_1\ \mathit {lowEqual}\ e_2\), see Definition 5.6), if \(\mathit {M_1}\) and \(\mathit {M_2}\) are well typed memories that are low observably similar, and if \(\mathit {M_1}\vert e_1 \rightarrow ^*\mathit {M^{\prime }_1}\vert v_1\) and \(\mathit {M_2}\vert e_2 \rightarrow ^*\mathit {M^{\prime }_2}\vert v_2\), then \(\mathit {M^{\prime }_1}\) and \(\mathit {M^{\prime }_2}\) are low observably similar, the memories \(\mathit {M^{\prime }_1}\), \(\mathit {M^{\prime }_2}\) and values \(v_1\), \(v_2\) are well typed, and \(v_1\ \mathit {lowEqual}\ v_2\).
In this section, we prove noninterference for a lattice with a low and a high security level.
5.1 Reduction Rules
SIFO is an additional type system layer and does not influence the language semantics. However, for the sake of the noninterference proof, we need to instrument the small-step reduction to keep track of security levels and modifiers during program execution. To this aim, we define in Figure 6 values v as a location o in a store with a security level and a type modifier. The store is some memory \(\mathit {M}\). In \(\mathit {M}\), a location points to some class \(\mathit {C}\) where each field is again a location \(o_i\) in the memory \(\mathit {M}\). With the evaluation context \(\mathcal {E}_v\), we define the order of evaluation. We assume two well-formedness properties. The memory is well-formed if \(\mathit {M}\) is a map from o to \(C(\overline{o})\); thus, the locations in the domain of \(\mathit {M}\) are unique. The reduction arrow (\(\mathit {M}| e \rightarrow \mathit {M^{\prime }}| e^{\prime }\)) is well-formed if \(\mathit {M}\) and \(\mathit {M^{\prime }}\) have no dangling pointers with respect to e and \(e^{\prime }\) (i.e., every pointer points to a valid object in the memory). In Figure 7, the following reduction rules are shown.
Fig. 6. Runtime syntax and values.
Fig. 7. Reduction rules.
Ctx. This is the conventional contextual rule, allowing the execution of subexpressions.
Field Access. A field access \(f_i\) of a value v is reduced to a location \(o_i\) if the location o of v points to the suitable class \(C(o_1 \dots o_n)\). The security level is the least upper bound of the security levels of v and \(f_i\), and the modifier is the sum of the modifiers of v and \(f_i\).
Field Update. The store \(o \mapsto C(\overline{o}\ o_0 \dots o_n)\) is updated with an assignment of \(v_2\) to the field \(f_0\): the location \(o_0\) is replaced with \(o^{\prime }\). The security level of the resulting value \(v^{\prime }_2\) is the least upper bound of the security levels of \(v_0\) and the field \(f_0\) as declared in C. The type modifier of \(v^{\prime }_2\) is equal to the type modifier of the field \(f_0\) as declared in C.
Definition 5.1 (mostSpecMethType).
\(\begin{align*} &\bullet \ {\it mostSpecMethType}(C, m, ss, {\it mdfs}) = {\it openCapsules}(Ts^{\prime }\rightarrow T, {\it mdfs})\\ &\quad s\ {\mathit {mdf}}\ {\color{purple}\text{method}}\ T_0\ m(T_1\ x_1 \dots T_n\ x_n)\ in\ C\\ &\quad {\it raiseFormalSecurity}(s\ {\mathit {mdf}}\ C\ T_1 \dots T_n\rightarrow T_0, ss) = Ts \rightarrow T\\ &\quad {\it raiseActualSecurity}(Ts, ss, {\it mdfs}) = Ts^{\prime }\\ &\\ &\bullet \ {\it raiseFormalSecurity}(T_1 \dots T_n \rightarrow T_0, s^{\prime }_1 \dots s^{\prime }_n) = T^{\prime }_1 \dots T^{\prime }_n \rightarrow T^{\prime }_0\\ &\quad T_i = s_i\ {\mathit {mdf}}_i\ C_i\\ &\quad T^{\prime }_i = \mathit {lub}(s,s_i)\ {\mathit {mdf}}_i\ C_i\\ &\quad s = \mathit {lub}(\lbrace s^{\prime }_i\vert s^{\prime }_i \gt s_i\rbrace)\\ &\\ &\bullet \ {\it raiseActualSecurity}(T_0 \dots T_n, s_0 \dots s_n, {\mathit {mdf}}_0\dots {\mathit {mdf}}_n) = T^{\prime }_0\dots T^{\prime }_n\\ &\quad T_i=s^{\prime \prime }_i\ {\mathit {mdf}}_i\ C_i\\ &\quad T^{\prime }_i=sec(T_i)\ {\mathit {mdf}}_i\ C_i\\ &\quad {\it if}\ s_i \lt sec(T_i)\ then\ {\mathit {mdf}}_i\ \in \lbrace {\color{blue}\text{capsule}},{\color{blue}\text{imm}}\rbrace \\ &\\ &\bullet \ {\it openCapsules}(T_1 \dots T_n \rightarrow T, {\mathit {mdf}}^{\prime }_1 \dots {\mathit {mdf}}^{\prime }_n) = T^{\prime }_1\dots T^{\prime }_n \rightarrow T\\ &\quad T_i = s_i\ {\mathit {mdf}}_i\ C_i\\ &\quad T^{\prime }_i = s_i\ {\mathit {mdf}}^{\prime }_i \rhd \mathit {mdf}\ C_i \end{align*}\)
Call. We reduce a method call to an expression e, where each value \(v_i\) is assigned to a parameter \(x_i\). As we use multiple method types, the actual assigned values \(v^{\prime }_i\) can have an updated security level or type modifier. Additionally, the called method has to be declared in the class \(C_0\) pointed to by the location in \(v_0\). The concrete calculation of the updated types \(T^{\prime }_i\) with mostSpecMethType is shown in Definition 5.1; it calculates the exact method types to support the proof of noninterference. To calculate the types \(T^{\prime }_i\), the function takes as parameters the class name C, the method name m, and the security levels ss and type modifiers \(\mathit {mdfs}\) used to call this method. First, with \(\mathit {raiseFormalSecurity}\), the security level of each formal parameter type \(T_i\) can be raised: we collect all security levels \(s^{\prime }_i\) where a level higher than the one declared in the formal parameter is passed, and the least upper bound of the collected levels is the minimum security level for all actual parameters. Formal parameters may have security levels higher than s if an actual parameter is passed with the same security level higher than s. With \(\mathit {raiseActualSecurity}\), it is checked that the actual parameters have the same security level as the now raised formal parameters; only actual parameters with type modifier \({\mathtt{\color {blue}{imm}}}\) or \(\mathtt{\color {blue}{capsule}}\) can be raised to the needed level. In the last step, with \(\mathit {openCapsules}\), the combined type modifier of the actual and formal parameter is calculated with the \(\rhd\) function of Figure 2.
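The \(\mathit{raiseFormalSecurity}\) step of Definition 5.1 can be transcribed almost literally. The sketch below fixes the two-point lattice and a tuple encoding of types (s, mdf, C); both are illustrative assumptions, not part of the formalization:

```python
# Transcription of raiseFormalSecurity from Definition 5.1: collect every
# call-site level s'_i strictly above the declared level s_i, take their
# least upper bound s, and raise every type in the signature (including the
# return type) to lub(s, s_i). Types are encoded as tuples (s, mdf, C).
RANK = {"low": 0, "high": 1}

def lub(levels):
    """Least upper bound over the two-point chain; bottom is 'low'."""
    return max(levels, key=RANK.get, default="low")

def raise_formal_security(sig, actual_levels):
    """sig = [T_1, ..., T_n, T_0] with T = (s, mdf, C);
    actual_levels = the call-site security levels s'_1 ... s'_n."""
    params, ret = sig[:-1], sig[-1]
    s = lub([sp for sp, (si, _, _) in zip(actual_levels, params)
             if RANK[sp] > RANK[si]])
    return [(lub([s, si]), mdf, c) for (si, mdf, c) in params + [ret]]
```

Calling a method declared on `low` data with a `high` argument thus uniformly lifts the whole signature to `high`, which is exactly the implicit security parametricity described in Section 2.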
New. The newly created object is reduced to a location o in the memory that points to the class C, where each field is again a location \(o_i\). The reduced value has the same security level s as in the expression before the reduction and a
Hrec, Hother. The two rules in Figure 8 are added to condense the reduction of method calls on high receivers (\(\rightarrow _{call}\)). These rules only consider a
Fig. 8. Additional reduction rules for the noninterference proof.
5.2 Definition of Similarity w.r.t. Security Levels
In this subsection, we define the (observable) similarity of two memories for the proof of noninterference. To this end, we need further definitions of reachable object graphs (\(\mathtt {ROG}\)). We define mutable,
\(\mathit {MRog}(M, e)\! = \!\overline{o}\)
\(\bullet \ o \in \mathit {MRog}(M, e)\) if:
\({}_{}\quad e = \mathcal {E}[ s\ {{\mathtt{\color {blue}{mut}}}}\ o]\)
\(\bullet \ o \in \mathit {MRog}(M, e)\) if:
\({}_{}\quad e = \mathcal {E}[ s\ {\mathtt{\color {blue}{capsule}}}\ o]\)
\(\bullet \ o_i \in \mathit {MRog}(M,e)\) if:
\({}_{}\quad o \in \mathit {MRog}(M, e)\)
\({}_{}\quad o \mapsto C(o_1 \dots o_n)\ in\ M\)
\({}_{}\quad \lbrace {\mathtt{\color {purple}{class}}}\ C\ \_\ \lbrace T_1 f_1 \dots T_n f_n \_\rbrace\)
\({}_{}\quad \mathit {mdf}(T_i) = {{\mathtt{\color {blue}{mut}}}}\)
\(\mathit {lowRog}(M, e) = \overline{o}\)
\(\bullet \ o \in \mathit {lowRog}(M, e)\) if:
\({}_{}\quad e = \mathcal {E}[ {{\mathtt{\color {red}{low}}}}\ \_\ o]\)
\(\bullet \ o_i \in \mathit {lowRog}(M, e)\) if:
\({}_{}\quad o \in \mathit {lowRog}(M, e)\)
\({}_{}\quad o \mapsto C(o_1 \dots o_n)\ in\ M\)
\({}_{}\quad {\mathtt{\color {purple}{class}}}\ C\ \_\ \lbrace T_1 f_1 \dots T_n f_n \_\rbrace\)
\({}_{}\quad \text{$\mathit {sec}$}(T_i) = {{\mathtt{\color {red}{low}}}}\)
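The \(\mathit{lowRog}\) computation is a straightforward fixpoint. The following sketch uses a hypothetical encoding: `roots` stands for the locations that occur in e behind a low reference, `heap` maps a location to its class and field values, and `fields` maps a class to its declared field security levels:

```python
# Fixpoint computation of lowRog(M, e): start from the low-annotated locations
# syntactically present in e, then close under fields whose declared security
# level is low. The heap/class encodings are illustrative assumptions.
def low_rog(heap, fields, roots):
    reached, work = set(), list(roots)
    while work:
        o = work.pop()
        if o in reached:
            continue
        reached.add(o)
        cls, vals = heap[o]
        for (sec, _name), oi in zip(fields[cls], vals):
            if sec == "low":
                work.append(oi)  # follow only low-declared fields
    return reached
```

An analogous fixpoint with the conditions of Definition 5.4 yields \(\mathit{highRog}\).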
In Definition 5.4, the
\(\bullet \ o \in \mathit {highRog}(M, e)\ \mathit {if}:\)
\({}_{}\quad e = \mathcal {E}[ {{\mathtt{\color {red}{high}}}}\ {{\mathtt{\color {blue}{mut}}}}\ o]\)
\(\bullet \ o_i \in \mathit {highRog}(M, e)\ \mathit {if}:\)
\({}_{}\quad o^{\prime } \in \mathit {highRog}(M, e)\)
\({}_{}\quad o^{\prime } \mapsto C(o_1 \dots o_n)\ in\ M\)
\({}_{}\quad {\mathtt{\color {purple}{class}}}\ C\ \_\ \lbrace T_1 f_1 \dots T_n f_n \_\rbrace\)
\({}_{}\quad \mathit {mdf}(T_i) = {{\mathtt{\color {blue}{mut}}}}\)
\(\bullet \ o_i \in \mathit {highRog}(M, e)\ \mathit {if}:\)
\({}_{}\quad o^{\prime } \in \mathit {lowRog}(M, e)\)
\({}_{}\quad o^{\prime } \mapsto C(o_1 \dots o_n)\ in\ M\)
\({}_{}\quad {\mathtt{\color {purple}{class}}}\ C\ \_\ \lbrace T_1 f_1 \dots T_n f_n \_\rbrace\)
\({}_{}\quad \text{$\mathit {sec}$}(T_i) = {{\mathtt{\color {red}{high}}}}\)
\({}_{}\quad \mathit {mdf}(T_i) = {{\mathtt{\color {blue}{mut}}}}\)
In Definition 5.5, we define the observable similarity of two memories. Two memories \(\mathit {M_1}\) and \(\mathit {M_2}\) are similar given an expression e, if and only if the
Removing the
\(\mathit {M_1}\ similar(e)\ \mathit {M_2}\) \(\leftrightarrow\)\(\mathit {M_1}[only\ lowRog(\mathit {M_1},e)] = \mathit {M_2}[only\ lowRog(\mathit {M_2},e)]\)
where \(M[only\ \overline{o}] = M^{\prime }\) is defined as:
\(\bullet \ (o_1 \mapsto C_1(\overline{o_1}) \dots o_n \mapsto C_n(\overline{o_n}))[only\ \overline{o}] =\)
\({}_{}\quad o_1 \mapsto C_1(\overline{o_1})[only\ \overline{o}] \dots o_n \mapsto C_n(\overline{o_n})[only\ \overline{o}]\)
\(\bullet \ (o\mapsto C(\_))[only\ \overline{o}] = empty\ \mathit {if}\ o\ is\ not\ in\ \overline{o}\)
\(\bullet \ (o\mapsto C(o_1\dots o_n))[only\ \overline{o}] =o\mapsto C(o^{\prime }_1 \dots o^{\prime }_n)\ \mathit {if}\ o\ in\ \overline{o}\) with:
\({}_{}\quad fields(C)= T_1\ f1\dots T_n\ fn\)
\({}_{}\quad o^{\prime }_i=o_i\ \mathit {if}\ sec(T_i)={{\mathtt{\color {red}{low}}}}\)
\({}_{}\quad o^{\prime }_i=o\ \mathit {if}\ sec(T_i)={{\mathtt{\color {red}{high}}}}\)
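The restriction \(M[only\ \overline{o}]\) can be sketched with the same hypothetical heap encoding: objects outside \(\overline{o}\) are dropped, low fields are kept, and high fields are collapsed to a self-reference, matching the three cases of the definition:

```python
# Sketch of the memory restriction M[only o-bar] used to define similarity:
# only objects in `keep` survive, and in each surviving object every
# high-security field is replaced by a reference to the object itself, so
# that two memories are compared only on their low-reachable parts.
def restrict(heap, fields, keep):
    out = {}
    for o, (cls, vals) in heap.items():
        if o not in keep:
            continue  # case: (o -> C(_))[only o-bar] = empty
        new_vals = [oi if sec == "low" else o
                    for (sec, _name), oi in zip(fields[cls], vals)]
        out[o] = (cls, new_vals)
    return out
```

Two memories are then similar for e exactly when their restrictions to the respective \(\mathit{lowRog}\) sets are equal.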
Definition 5.6 specifies when two expressions are equal if we only consider the low values.
\(e\ \mathit {lowEqual}\ e^{\prime }\)
\(\bullet \ x\ \mathit {lowEqual}\ x\)
\(\bullet \ e.f\ \mathit {lowEqual}\ e^{\prime }.f\ \mathit {iff}\ e\ \mathit {lowEqual}\ e^{\prime }\)
\(\bullet \ e_0.f=e^{\prime }_0\ \mathit {lowEqual}\ e_1.f=e^{\prime }_1\ \mathit {iff}\ e_0\ \mathit {lowEqual}\ e_1\ and\ e^{\prime }_0\ \mathit {lowEqual}\ e^{\prime }_1\)
\(\bullet \ e_0.m(e_1\dots e_n)\ \mathit {lowEqual}\ e^{\prime }_0.m(e^{\prime }_1\dots e^{\prime }_n)\ \mathit {iff}\ e_i\ \mathit {lowEqual}\ e^{\prime }_i\ \mathit {for}\ i\ in\ 0\dots n\)
\(\bullet \ \mathtt{\color {purple}{new}}\ \mathit {s}\ \mathit {C}(e_1\dots e_n)\ \mathit {lowEqual}\ \mathtt{\color {purple}{new}}\ \mathit {s}\ \mathit {C}(e^{\prime }_1\dots e^{\prime }_n)\ \mathit {iff}\ e_i\ \mathit {lowEqual}\ e^{\prime }_i\ \mathit {for}\ i\ in\ 1\dots n\)
\(\bullet \ ({\mathtt{\color {red}{low}}}\ \mathit {mdf}\ o)\ \mathit {lowEqual}\ ({\mathtt{\color {red}{low}}}\ \mathit {mdf}\ o)\)
\(\bullet \ ({\mathtt{\color {red}{high}}}\ \mathit {mdf}\ o)\ \mathit {lowEqual}\ ({\mathtt{\color {red}{high}}}\ \mathit {mdf}^{\prime }\ o^{\prime })\)
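For runtime values, the last two cases of \(\mathit{lowEqual}\) amount to the following sketch (the tuple encoding (s, mdf, o) of values is an illustrative assumption):

```python
# Sketch of lowEqual on runtime values (s, mdf, o): two low values must agree
# on level, modifier, and location, whereas any two high values are lowEqual
# regardless of modifier and location (they are invisible to a low observer).
def low_equal_values(v1, v2):
    s1, _mdf1, _o1 = v1
    s2, _mdf2, _o2 = v2
    if s1 == "high" and s2 == "high":
        return True
    return v1 == v2
```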
Definition 5.7 is essential to constrain reduction: an undesirable alternative reduction could trivially preserve security by adding everything to the \(\mathit{highRog}\).
\(M\ \mathit {preserves(e/e^{\prime })}\ M^{\prime }\) if one of the three holds:
\((1)\ highRog(M,e) = highRog(M^{\prime },e^{\prime })\)
\((2)\ highRog(M,e), o = highRog(M^{\prime },e^{\prime })\ then\)
\({}_{}\quad \ e = ctx_0[new\ high\ C(\overline{v})],\ e^{\prime } = ctx_0[{\mathtt{\color {red}{high}}}\ mdf\ o]\)
\((3)\ highRog(M,e), MROG(M,{\mathtt{\color {red}{high}}}\ mdf^{\prime }\ o) = highRog(M^{\prime },e^{\prime })\ then\)
\({}_{}\quad \ e = ctx_0[low\ capsule\ o],\ e^{\prime } = ctx_1[{\mathtt{\color {red}{high}}}\ mdf^{\prime }\ o]\)
\({}_{}\quad \ and\ e\ \text{is equal to one of the following:}\)
\({}_{}\quad \quad \ ctx[v.f=low\ capsule\ o]\)
\({}_{}\quad \quad \ ctx[(low\ capsule\ o).m(\overline{v})],\)
\({}_{}\quad \quad \ ctx[v.m(\overline{v} (low\ capsule\ o) \overline{v^{\prime }})],\)
\({}_{}\quad \quad \ ctx[new\ C(\overline{v} (low\ capsule\ o) \overline{v^{\prime }})]\)
5.3 Noninterference Theorem and Proof
We prove noninterference according to our information flow policy. In the literature, there are many proposed languages (with proofs) whose type systems are very similar to the one proposed in this work [Giannini et al. 2019]. Here, to avoid repeating proofs that are already presented in those other works, we accept two assumptions. (1) Soundness: the reduction does not get stuck. (2) Immutability and encapsulation: in addition to not getting stuck, the reduction never mutates the \(\mathtt {ROG}\) of an immutable object, and the \(\mathtt {ROG}\) of capsules is always encapsulated (i.e., all mutable objects in it can be reached only through the capsule reference).
To prove noninterference, we first introduce two lemmas that facilitate the proof. We show that the reduction terminates using \(\rightarrow _{call}\) and we show that, given two similar memories, each reduction step results in similar memories.
Call-Reduction Termination. We prove in Lemma 5.8 that the reduction \(\rightarrow _{call}\) does not interfere with termination: if we have a well typed memory \(\mathit {M}\) and an expression e and the program terminates with the normal reduction, then it also terminates with \(\rightarrow _{call}\). The result for both reductions is also the same.
Lemma 5.8 (\(\rightarrow _{call}\) Termination).
If memory \(\mathit {M}\) and expression e are well typed and if the reduction terminates with \(\rightarrow\), then it terminates also with the reduction \(\rightarrow _{call}\) with the same result.
Proof. This can be verified by cases: \(\rightarrow _{call}\) behaves exactly like \(\rightarrow\), but has a different granularity of the steps. Therefore, \(\rightarrow _{call}\) terminates in every case where \(\rightarrow\) terminates.□
Bisimulation. To establish noninterference, Lemma 5.9, the bisimulation core, states that two well typed and similar memories \(\mathit {M_1}\) and \(\mathit {M_2}\) with expressions \(e_1\) and \(e_2\), where \(e_1\ \mathit {lowEqual}\ e_2\) holds, reduce to \(\mathit {M^{\prime }_1}|e^{\prime }_1\) and \(\mathit {M^{\prime }_2}|e^{\prime }_2\) that are also similar, and the reduced expressions \(e^{\prime }_1\) and \(e^{\prime }_2\) are again \(\mathit {lowEqual}\). We need property (2) to state that both memories are observably similar, and property (3) to state that also the expressions are similar regarding the observable values. Both properties together represent observably similar memories as in Theorem 5.10. Furthermore, the preservation property of each memory ensures that the reduction only changes the high part of the memory.
Lemma 5.9 (Bisimulation Core). Given well typed memories and expressions without declassification \(\mathit {M_1}\), \(\mathit {M_2}\), \(e_1\), and \(e_2\)
where \(\mathit {M_1}\vert e_1 \rightarrow ^*\_ \vert v_1\) and \(\mathit {M_2}\vert e_2 \rightarrow ^*\_ \vert v_2\).
If the following holds
(1) \(\mathit {M_1}\vert e_1 \rightarrow _{call}\mathit {M^{\prime }_1}\vert e^{\prime }_1\),
(2) \(\mathit {M_1}\ \mathit {similar(e_1)}\ \mathit {M_2}\),
(3) and \(e_1\ \mathit {lowEqual}\ e_2\),
then:
(A) \(\mathit {M_2}\vert e_2 \rightarrow _{call}\mathit {M^{\prime }_2}\vert e^{\prime }_2\) and \(e^{\prime }_1\ \mathit {lowEqual}\ e^{\prime }_2\),
(B) \(\mathit {M^{\prime }_1}\ \mathit {similar(e^{\prime }_1)}\ \mathit {M^{\prime }_2}\),
(C) \(\mathit {M_1}\ \mathit {preserves(e_1/e^{\prime }_1)}\ \mathit {M^{\prime }_1}\),
(D) \(\mathit {M_2}\ \mathit {preserves(e_2/e^{\prime }_2)}\ \mathit {M^{\prime }_2}\),
(E) \(\mathit {M^{\prime }_1}, \mathit {M^{\prime }_2}, e^{\prime }_1, e^{\prime }_2\) are well typed
We prove that all five conditions A–E are satisfied. For A and B, we prove this theorem by cases on \(\rightarrow _{call}\). We only prove the cases including a context, because the proofs with an empty context imply the correctness of the rules without a context.
Proof of A and B by cases:
Case Ctx + Call :
If \(e_1\) and \(e_2\) are of form \(\mathcal {E}_v[({\mathtt{\color {red}{high}}}\ \mathit {mdf}\ o).m(\overline{o})]\):
Proof of A: The only point of non-determinism in this language is the way new object identities are chosen, and the only way to introduce a new o is with the New rule. From assumption (1) and Lemma 5.8, we know that there is an execution of \(\rightarrow _{call}\), containing an arbitrary number of reduction steps (\(\rightarrow\)). The reduction is of form \(M | \mathcal {E}_v[({\mathtt{\color {red}{high}}}\ \mathit {mdf}\ o).m(\overline{v})] \rightarrow _{call}M^{\prime } | \mathcal {E}_v[({\mathtt{\color {red}{high}}}\ \mathit {mdf}\ o^{\prime \prime })]\), where the result is a high value.
Proof of B: We execute a number of steps on the high part of the memory; by the preservation property, these steps only extend the \(\mathit{highRog}\), which the similarity relation ignores, so similarity is preserved. All the other parameters and the receivers are high.
If \(e_1\) and \(e_2\) are not of form \(\mathcal {E}_v[({\mathtt{\color {red}{high}}}\ \mathit {mdf}\ o).m(\overline{o})]\), the proof is by rule (Hother).
Proof of A: In this case we are doing a single reduction step \(\rightarrow\). The only way to introduce non-determinism is by creating a new object, but this step is a method call. Thus, \(\mathit {M_1}=\mathit {M^{\prime }_1},\ \mathit {M_2}=\mathit {M^{\prime }_2}\) and \(e^{\prime }_1\) \(\mathit {lowEqual}\) \(e^{\prime }_2\).
Proof of B: The expressions \(e_1\) and \(e_2\) must be of form \(\mathcal {E}_v[({\mathtt{\color {red}{low}}}\ \mathit {mdf}\ o).m(\overline{o})]\), and since the memories are unchanged, similarity is preserved.
Case Ctx + Field Update :
Proof of A: By assumption (1), we know the expression reduces. The only way to introduce a new o is with the (new) rule; thus A holds since we use the same process to get \(v^{\prime }_2\) (the rule applies a deterministic procedure).
Proof of B: In this case, if \(v_2\) is low, we know: \(\mathit {M_1}| \mathcal {E}_v[v_1.f_0=v_2]\), \(\mathit {M_2}| \mathcal {E}_v[v_1.f_0=v_2]\), and \(\mathit {M_1}\ \mathit {similar(\mathcal {E}_v[v_1.f_0=v_2])}\ \mathit {M_2}\). Thus, \(\mathit {M^{\prime }_1}| v^{\prime }_2\), \(\mathit {M^{\prime }_2}| v^{\prime }_2\), and \(\mathit {M^{\prime }_1}\ \mathit {similar}\) \(\mathit {(\mathcal {E}_v[v^{\prime }_2])}\ \mathit {M^{\prime }_2}\).
If \(v_2\) is high, the update affects only the high part of the memory, which the similarity relation ignores.
However, we need to inspect all possible field update cases to check that all assignments do not violate our similarity property:
To shorten the writing for all cases, we assume a fixed field update and enumerate the possible combinations of security levels and modifiers for the receiver and the assigned value. In each combination, the update either modifies only the high part of the memory or writes \(\mathit{lowEqual}\) values on both sides, so the similarity property is never violated.
Case Ctx + Field Access :
Proof of A: The only way to introduce non-determinism is the way new objects are created. We are accessing a location o that is either equal in both cases or a high value, which the observer cannot distinguish.
Proof of B: The memory is not changed by accessing a field. \(\mathit {M_1}\ \mathit {similar(e)}\ \mathit {M_2}\) holds before and \(\mathit {M_1}=\mathit {M^{\prime }_1}\) and \(\mathit {M_2}=\mathit {M^{\prime }_2}\), thus B holds.
Case Ctx + New :
In this case, assumption (1) is of form \(\mathit {M_1}| \mathcal {E}_v[\mathtt{\color {purple}{new}}\ s\ C(\overline{v})] \rightarrow _{call}\mathit {M_1}, o\mapsto C(\_) |\) \(\mathcal {E}_v[(s\ {\mathtt{\color {blue}{mut}}}\ o)]\), and (A) is similar with \(\mathit {M_2}\).
Proof of A: This can be obtained by simply choosing a suitable reference ‘o’ for the New rule.
Proof of B: It holds because the new memories grow by adding the same exact object on both sides.
Proof of C and D:
To prove conditions C and D, we have to show that if M and e are well typed and \(M \vert e \rightarrow _{call}\mathit {M^{\prime }}\vert e^{\prime }\), then \(M\ \mathit {preserves(e/e^{\prime })}\ \mathit {M^{\prime }}\). That means each reduction step ensures the preservation property: the high reachable object graph only changes in the ways allowed by Definition 5.7.
Proof of E:
Proving condition E is more complex. From our assumptions we conclude that the well-typedness of the base language is not violated, since the SIFO type system is strictly more restrictive than the regular L42 system. However, we need to prove that the \(\rightarrow _{call}\) reduction preserves the added security typing. This is subtle because of multiple method types (Figure 3): the type system as presented does not respect preservation; that is, when calling a method that has been typed using multiple method types, the inlined body of the method may not respect our provided type system. However, we are using the \(\rightarrow _{call}\) reduction, and the call reduction skips in a single step to the full method evaluation. The method body will evaluate to a value. We have not formally specified typing rules for values and memory; the intuition of our proof sketch is that security is not relevant in the memory. As the grammar in Figure 6 shows, we keep the security level on the value (outside of the memory), so the result of a high method call with \(\rightarrow _{call}\) is well typed, because well-typedness is only concerned with the non-security aspects of the type system, and the security level is tracked during reduction. See how, for example, in rule Call of Figure 4, the security level and modifier of the parameters can be promoted before inlining the method body.□
To prove the noninterference Theorem 5.10, we use the property that the reduction terminates (Lemma 5.8) and the bisimulation core stating that each reduction step meets the requirements of noninterference (Lemma 5.9). We prove that if two memories are similar and both expressions reduce, then the new memories are still similar and the reduced values are \(\mathit{lowEqual}\).
Theorem 5.10 (Noninterference). If we have expressions \(e_1\) and \(e_2\) without declassification that are well typed and \(e_1\) \(\mathit {lowEqual}\) \(e_2\), \(\mathit {M_1}\) and \(\mathit {M_2}\) are well typed memories, \(\mathit {M_1}\) \(\mathit {similar(e_1)}\) \(\mathit {M_2}\), \(\mathit {M_1}\vert e_1 \rightarrow ^*\mathit {M^{\prime }_1}\vert v_1\), and \(\mathit {M_2}\vert e_2 \rightarrow ^*\mathit {M^{\prime }_2}\vert v_2\), then \(\mathit {M^{\prime }_1}\ \mathit {similar(v_1)}\ \mathit {M^{\prime }_2}\), \(v_1\ \mathit {lowEqual}\ v_2\), and \(\mathit {M^{\prime }_1}\), \(\mathit {M^{\prime }_2}\), \(v_1\), and \(v_2\) are well typed.
Proof. From Lemma 5.8 and \(\mathit {M_1}\vert e_1 \rightarrow ^*\mathit {M^{\prime }_1}| v_1\) and \(\mathit {M_2}| e_2 \rightarrow ^*\mathit {M^{\prime }_2}| v_2\), we know that \(\mathit {M_1}| e_1 \rightarrow _{call}\mathit {M^{\prime }_1}| v_1\) and \(\mathit {M_2}| e_2 \rightarrow _{call}\mathit {M^{\prime }_2}| v_2\). Then, by induction on the number of steps of \(\rightarrow _{call}\):
Base: \(e_1\) \(\mathit {lowEqual}\) \(e_2\) \(\mathit {lowEqual}\) \(v_1\) \(\mathit {lowEqual}\) \(v_2,\ \mathit {M_1}=\mathit {M^{\prime }_1},\ \mathit {M_2}=\mathit {M^{\prime }_2}\). Thus, \(\mathit {M^{\prime }_1}\) \(\mathit {similar(v_1)}\ \mathit {M^{\prime }_2}\) holds because \(\mathit {M_1}\ \mathit {similar(e_1)}\ \mathit {M_2}\).
Inductive step: By Lemma 5.9 (bisimulation core) and the inductive hypothesis, each reduction step establishes similar memories \(\mathit {M^{\prime }_1}\) and \(\mathit {M^{\prime }_2}\), computes \(\mathit {lowEqual}\) expressions, and preserves that only the necessary values are in the high reachable object graph.□
6 TOOL SUPPORT AND EVALUATION
In this section, we present tool support for SIFO and evaluate the feasibility of SIFO by implementing five case studies. Additionally, we benchmark the precision and recall of the information flow analysis by adapting the IFSpec benchmark suite [Hamann et al. 2018] to SIFO.
6.1 Tool Support
We implement SIFO as a pluggable type system for L42 [Giannini et al. 2019]. L42 is a pure object-oriented language with a rich type system supporting the type modifiers used by SIFO.
Conveniently, L42 allows pluggable type systems [Andreae et al. 2006; Papi et al. 2008] to add an additional layer of typing. We add rules to support the typing of expressions with security levels. Both Java and L42 support pluggable type systems using annotations: type names preceded by the symbol \(@\). In our SIFO library, these annotations are used to introduce the security levels.5
Some changes to SIFO are necessary to comply with L42: L42 supports the uniform access principle [Meyer 1988]; thus, there is no dedicated syntax for field assignment and field access, but they are modeled by getters and setters. Additionally, the constructor has no dedicated syntax, but is a static method whose return type is the class itself.
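The uniform access principle is not specific to L42. As an illustration only (Python, not L42 syntax; the `Account` class is hypothetical), the following sketch models field access via a getter/setter pair that clients use with field notation, and construction via a static factory method:

```python
# Sketch of the uniform access principle: clients read and write
# `balance` with the same notation whether it is stored or computed,
# and construction goes through a static method instead of dedicated
# constructor syntax.
class Account:
    def __init__(self, cents):
        self._cents = cents

    @staticmethod
    def of(cents):
        # "Constructor" as a static method returning the class itself.
        return Account(cents)

    @property
    def balance(self):
        # Getter: looks like a field access to the client.
        return self._cents / 100

    @balance.setter
    def balance(self, euros):
        # Setter: looks like a field assignment to the client.
        self._cents = int(euros * 100)

a = Account.of(250)
assert a.balance == 2.5   # reads through the getter
a.balance = 3.0           # writes through the setter
assert a._cents == 300
```

Because clients never distinguish stored fields from computed ones, a type system layered on top (as SIFO is on L42) only needs typing rules for method calls to also cover field accesses and updates.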
Additionally, an exception prevents the execution of the code after the point where it is thrown. Thus, after the exception is caught, the program may infer when the execution was interrupted and thereby discover which expression raised the exception. This is another channel for propagating secret information to an observer. Our current extension supporting exceptions is quite conservative, requiring a single security level for all free variables, exceptions, and results of a try-catch block. In future work, we plan to formalize the extension with exceptions more precisely.
6.2 Feasibility Evaluation
To evaluate the feasibility of SIFO, we implemented four case studies from the literature in SIFO: Battleship [Stoughton et al. 2014; Zheng et al. 2003], Email [Hall 2005], Banking [Thüm et al. 2012], and Paycard ( http://spl2go.cs.ovgu.de/projects/57). Additionally, we implemented a novel case study of our own, the Database. The metrics of the case studies are shown in Table 1.
6.2.1 Battleship.
Our evaluation is focused on the Battleship case study because this program is carefully described by Stoughton et al. [2014] as a general benchmark to evaluate information flow control. Moreover, this case study is also implemented in Jif,6 thus allowing us to directly compare their results with our work.
Battleship is the implementation of a two-player board game. At the start, each player places a fixed number of ships of varying length on their private board, a two-dimensional grid. The players only know the placement on their own board and have to guess where the other player placed their ships. During the game, the players take turns shooting cells on the opponent's board to sink ships. The first player to sink all of the opponent's ships wins the game.
Thanks to our flexible SIFO type system, we implemented most of the code without any security annotations. We wrote a generic
Our
When a player shoots, it has to be correctly revealed if it was a miss, a hit, or a hit that sunk a specific ship. The process of one shooting round is shown in Listing 4. The method
Listing 4. Implementation of one shooting in SIFO.
In Jif, a player must trust that the result of a shot is correctly revealed by the opponent. In Jif, it would be easy to implement a player that cheats by misreporting shot results.
We now compare both implementations on a more general level. In Jif, most classes are written parameterized with a security level L. In SIFO, we write classes like
For the creation of players, we have a similar concept as Jif. While Jif used a generic
In summary, we implemented Battleship in SIFO by relying on our promotion rules and the immutability of trustworthy objects. Thanks to preexisting L42 features, we did not need the complex expressiveness of Jif's label expressions, which is also discussed in Section 7.
6.2.2 Database.
The Database is a system where two databases are not allowed to interfere. Through the different security levels, we ensure that a value read from one database cannot be inserted into the other one. This can be obtained just by annotating the
Listing 5. Gui implementation in SIFO.
The other classes are implemented without any security level. This is possible because the class
6.2.3 Further Case Studies.
The Email system ensures that encrypted emails are only decrypted if the public and private key pair used is valid. It also guarantees that private keys are not leaked. The
Banking and Paycard are two systems that represent payment systems where it is crucial that the calculations of new balances are correct and information is not leaked. By setting the balance to
6.2.4 Discussion.
Code following a pure object-oriented style is often directly supported by SIFO without any special adaptation. However, when updates to local variables and statements/conditionals are used, the programmer may have to rely on some simple programming patterns: for example, inside a conditional on a high guard, we cannot directly update a low variable.
Another insight is that major parts of the case studies could be written without any use of security levels, because multiple method types promote the parameters of a called method to the required security levels where necessary. This allowed us to write secure programs relying on libraries and data structures without any security annotations. In our case study, we could reuse a list implementation and securely promote it to any security level as needed. The type system then checks that instantiated lists of different security levels do not interfere. For example, a
Major parts of the case studies were implemented without the use of declassify. We only needed it in Battleship to declassify shot results as intended by the game. Additionally, we declassified console output at the end of program execution in the other case studies to print results for the user.
By explicitly typing references as
6.3 Benchmarking with IFSpec
To evaluate the precision and recall of SIFO, we applied SIFO to the IFSpec benchmark suite [Hamann et al. 2018]. IFSpec contains 80 samples to test information flow analysis tools. In addition to the core samples, 152 samples from the benchmark suite SecuriBench Micro7 are adapted and integrated into IFSpec. The samples are all available in Java and Dalvik. To benchmark SIFO, we translated the 80 core samples where possible. Samples that used Java-specific features were not translated. In total, 40 samples are implemented in SIFO. For these samples, we compare SIFO with Cassandra [Lortz et al. 2014], JOANA [Graf et al. 2013], JoDroid [Mohr et al. 2015], KeY [Ahrendt et al. 2016], and Co-Inflow [Xiang and Chong 2021] (with and without additional security annotations), which were all evaluated before with IFSpec.
Each sample is labeled as secure or insecure. When a sample contains a leak and a tool reports a leak, we categorize it as true positive (TP). When a sample contains no leak and a tool reports no leak, we categorize it as true negative (TN). When a sample contains no leak but a tool reports a leak, we categorize it as false positive (FP). When a sample contains a leak but a tool reports no leak, we categorize it as false negative (FN). From these four categories, we can calculate precision and recall of the tools. The recall is computed as: \(\#TP/(\#TP + \#FN)\). Recall determines the percentage of samples correctly classified as insecure considering all samples containing a leak. The precision is computed as: \(\#TP/(\#TP + \#FP)\). Precision determines the percentage of samples correctly classified as insecure considering all samples classified as insecure by the tool.
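The two formulas above can be checked directly against the counts reported below for SIFO (18 true positives, 13 false positives, no false negatives); a minimal sketch:

```python
# Precision and recall as defined above, evaluated on SIFO's counts
# from the IFSpec benchmarking: 18 true positives, 13 false positives,
# and no false negatives.
def recall(tp, fn):
    # Fraction of leaky samples correctly classified as insecure.
    return tp / (tp + fn)

def precision(tp, fp):
    # Fraction of samples classified as insecure that really leak.
    return tp / (tp + fp)

tp, fp, fn = 18, 13, 0
assert recall(tp, fn) == 1.0                      # 100% recall
assert round(100 * precision(tp, fp), 1) == 58.1  # 58.1% precision
```

With zero false negatives, recall is trivially 100% for every tool; the tools are distinguished only by how many of the 22 secure samples they falsely reject.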
In Table 2, we show the results of the benchmarking. All six tools found the 18 samples containing a leak, resulting in a recall of 100%: no tool produced a false negative. Regarding precision, the tools differ slightly. Cassandra and Co-Inflow without additional annotations have the lowest precision of 54.5%. JOANA has the highest precision of 62.1% among the tools without extra annotations, but Co-Inflow reaches a precision of 81.8% if additional security annotations are given by the programmer. SIFO has a precision of 58.1%.
Discussion of the False Positive Samples.
With SIFO, 13 samples are typed as insecure although they are labeled as secure by the authors of the benchmark. We classify these samples into categories to discuss the results of SIFO. For six samples, the type system of SIFO is not precise enough to recognize that the sample is secure. For example, if both branches of a conditional on a high guard assign the same value to a low variable, the program is actually secure, but SIFO conservatively rejects the assignment under the high guard.
Three samples are constructed to introduce a leak which is overwritten in the end. A simple example is that a secret value is assigned to a public variable and in the next line, the public variable is overwritten. We clearly prohibit the first assignment of the secret value. These examples are constructed for taint analysis tools and are not suitable for type systems.
One sample is labeled as secure because in the provided code there is no way to access the secret values. In sample
For three samples, again the modular reasoning of the type system pessimistically rejects secure programs. In
Most samples that SIFO rejects are constructed by the benchmark developers to contain a security problem that is erased or not accessible in the remaining code of the sample. As our type system prohibits any introduction of security violations, we reject these samples. To support this claim, we rewrote eight of the 13 false positive samples to be semantically similar and accepted by SIFO.
7 RELATED WORK
Static and dynamic program analysis [Austin and Flanagan 2009; Nielson et al. 1999; Russo and Sabelfeld 2010; Zhang et al. 2015], as well as security type systems [Banerjee and Naumann 2002; Ferraiuolo et al. 2017; Hunt and Sands 2006; Li and Zhang 2017; Simonet 2003; Volpano et al. 1996] are used to enforce information flow policies. We refer to Sabelfeld and Myers [2003] for a detailed overview.
Taint Analysis. Taint analysis [Enck et al. 2014; Hedin et al. 2014; Arzt et al. 2014; Huang et al. 2012; Milanova and Huang 2013; Huang et al. 2014; Roy et al. 2009] is a related analysis technique that detects direct information flows from tainted sources to secure sinks by analyzing the assignments of variables and fields. Those taint analysis works do not provide a soundness property, while the SIFO noninterference proof guarantees the security of type checked code. Except for JSFlow [Hedin et al. 2014], Cassandra [Lortz et al. 2014], JOANA [Graf et al. 2013], and JoDroid [Mohr et al. 2015], these related works do not cover implicit information flows through conditionals, loop statements, or dynamic dispatch. SIFO also detects implicit information flows through dynamic dispatch (conditional and loop statements are not in the core language, but included in our implementation). Crucially, the noninterference proof of SIFO relies on detecting implicit information flow.
Coarse-grained dynamic information flow approaches [Xiang and Chong 2021; Nadkarni et al. 2016; Jia et al. 2013] track information at the granularity of lexically or dynamically scoped sections of code. Instead of labeling every value individually, coarse-grained approaches label an entire section with one label; all values produced within that scope implicitly carry the same label. This reduces the annotation effort for developers. To still obtain good analysis results, Xiang and Chong [2021], for example, introduce opaque labeled values to permit labeled values where programmers have not provided a label. If no further annotations are given by the programmer, the precision of the information flow analysis may decrease. As our evaluation shows, Co-Inflow [Xiang and Chong 2021] has better precision when the programmer annotates the program. However, precision is not an inherent limitation of coarse-grained approaches compared to fine-grained ones: type systems for fine- and coarse-grained information flow control are equivalent in terms of precision, as shown by Rajani et al. [Rajani and Garg 2018; Rajani et al. 2017]. For dynamic information flow control mechanisms, Vassena et al. [Vassena et al. 2019] obtain similar results.
The work by Huang, Milanova et al. [Huang et al. 2012; Milanova and Huang 2013; Huang et al. 2014] is closely related to our approach because viewpoint adaption with polymorphic types is similar to our \(\mathit {mdf}\rhd \mathit {mdf}^{\prime }\) operator for type modifiers. For field accesses, the type of the accessed object depends on the reference and the field type. They use read-only references to improve the precision of their static analysis technique by allowing subtyping if the reference is read-only. In SIFO, we also use deep immutable and capsule references, extending the expressiveness of our language.
Comparison to Jif. In this work, we explored the specific area of secure type systems for object-oriented languages [Barthe and Serpette 1999; Sabelfeld and Myers 2003; Banerjee and Naumann 2002; Sun et al. 2004; Myers 1999; Strecker 2003; Barthe et al. 2007]. The most important work to compare against is Jif [Myers 1999] (see Section 2). In this paper, we presented a minimal core of SIFO for the soundness and noninterference proofs. Nonetheless, we compare SIFO with Jif by discussing their common and differing features. A main difference is the handling of aliases: Jif does not use any kind of region or alias analysis to reason about bounded side effects. Therefore, Jif pessimistically discards many programs introducing aliases (see the example in Section 2.1 that is not typable in Jif). On the other hand, SIFO restricts the introduction of insecure aliases and is therefore able to safely type more programs. SIFO's expressiveness relies on the safe promotion of references to higher security levels, exploiting immutability and encapsulation.
The SIFO type system relies on a minimalistic syntax of security annotations, where types contain a security level. Jif offers a much more elaborate syntax: in Jif, a security label is an expression consisting of a set of policies [Myers and Liskov 2000]. Each policy has an owner o and a set of readers r. For example, the label \(\{o_1{:}\ r_1;\ o_2{:}\ r_1, r_2\}\) states that the policy of \(o_1\) allows \(r_1\) to read the value and that of \(o_2\) allows \(r_1\) and \(r_2\) to read the value. Hence, \(r_1\) is the only reader fulfilling both policies. These label expressions get more complicated as more policies are conjoined, but the programmer gains more flexibility to express fine-grained access restrictions.
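The composition rule in this example is that a principal may read a value only if every policy on its label allows it, i.e., the effective reader set is the intersection of all policy reader sets. A minimal sketch of that rule (our own illustration, not Jif's implementation; the dictionary encoding of labels is hypothetical):

```python
# Sketch of reader composition under a Jif-style label: a label maps
# each owner to the set of readers its policy allows, and the effective
# readers are the intersection of all those sets.
def effective_readers(label):
    readers = None
    for allowed in label.values():
        allowed = set(allowed)
        readers = allowed if readers is None else readers & allowed
    return readers if readers is not None else set()

# The example label {o1: r1; o2: r1, r2} from the text:
label = {"o1": {"r1"}, "o2": {"r1", "r2"}}
assert effective_readers(label) == {"r1"}  # r1 fulfills both policies
```

Each added policy can only shrink the effective reader set, which is why conjoining policies makes labels more restrictive.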
SIFO has a similar expressiveness to Jif but does not need to resort to such complex label expressions. To show this, consider the following scenario from Jif [Myers and Liskov 2000]: a person Bob wants to create his tax form using an online service. Bob wants to prevent his information from being leaked to the online service, and the provider of the service does not want its technology and data to be leaked in the process of generating the tax form. This constraint is related to the mutually distrustful players of the Battleship case study. To comply with these constraints, Bob labels his data with \(\{bob{:}\ bob\}\) and sends it to the online service provider. The provider labels its own data with \(\{provider{:}\ provider\}\), so the calculated tax has the label \(\{bob{:}\ bob;\ provider{:}\ provider\}\). This result cannot be read because the policies disagree on their reader sets. To release the information to Bob, the provider declassifies the label by removing the provider policy. As only the final tax form is declassified, the information released by the provider is limited. The final tax form with the label \(\{bob{:}\ bob\}\) is then sent to Bob.
In SIFO, we can handle the same scenario as follows: Bob wants to protect his private information, so he can set the security level of his data to a level that only he may read.
With this example, we discuss the secure transfer of data. In SIFO, by using a capsule reference, data can be transferred without leaving aliases accessible to the sender.
Listing 6. Class Protected in Jif with security parameterization [Myers 1999].
To grasp the difference in the annotation burden, consider the Jif example in Listing 6 of a class that protects data from insecure access. Jif uses a parameterized label system where a class or method has a generic label L. The label L can be initialized with any specific security level. This places a large conceptual burden on the programmer, as the label L is used in every field and method of the class. Additionally, legacy code that is not properly parameterized cannot be used.
SIFO encourages a style where most code is completely clear of any security annotation; in particular, most algorithms and most common data types like collections do not need any kind of security annotations at all. Only code that is explicitly and directly involved in the handling of security-critical data needs to be written with security in mind. Unlabeled classes and methods are implicitly annotated with the lowest security level. Thanks to the flexibility of multiple method types, they can be safely promoted to any higher level.
Jif has additional features that are not present in the core of SIFO. The SIFO core works with a finite lattice of security levels instead of the complex label expressions of Jif [Myers and Liskov 2000] with an infinite set of possible labels. Thanks to the embedding in L42, we get label polymorphism for free by relying on L42 encodings for generics. Thus, on one side, SIFO removes the complexity of having most of the labels generic; on the other side, when generic labels are truly needed (for example, to write code that has to work on unknown labels), we can rely on the L42 generics encoding, as we do in the Battleship example.
Jif has dynamic checks of security labels. See Line 9 in Listing 6, where the security level of the object is checked at run time.
In a more concrete example, a person Bob creates a Protected object containing his data.
Jif has robust declassification [Chong and Myers 2006], which means that an attacker can neither declassify information above the security level the attacker is allowed to read, nor influence what such information the system declassifies. In the full embedding in L42, declassification can be sealed behind the object capability model, as we did in the Battleship case study. The L42 object capability model is flexible and can provide a range of useful guarantees [Miller 2006]. Indeed, the Battleship case study can be seen as an exemplar of robust declassification: even if we replace one of the players with adversarial code, such code will not be able to declassify the opposing board, even while holding a reference to that board.
In future work, we want to extend SIFO to work with any partial order of security levels, as discussed in Section 9. With this feature, we come closer to the expressive power of Jif's label expressions.
Other Information Flow Techniques. Hoare-style program logics are also used to reason about information flow. The work of Andrews and Reitman [Andrews and Reitman 1980] encodes information flow in a logical form for parallel programs. Amtoft et al. [Amtoft et al. 2006; Amtoft and Banerjee 2004] use Hoare-style program logic and abstract interpretation to analyze information flow. This approach is the basis in SPARK Ada for specifying and checking information flow [Amtoft et al. 2008]. For Java, Beckert et al. [Beckert et al. 2013] formalized the information flow property in a programming logic using self-composition of programs and an existing program verification tool to check information flow. Similarly, Barthe et al. [Barthe et al. 2004] and Darvas et al. [Darvas et al. 2005] analyze the information flow by using self-composition of programs and standard software verification systems. Terauchi and Aiken [Terauchi and Aiken 2005] combined a self-composition technique with a type system to profit from both techniques. Küsters et al. [Küsters et al. 2015] propose a hybrid approach by using JOANA [Graf et al. 2013] and verification with KeY [Ahrendt et al. 2016] to check the information flow.
The related IFbC approach by Schaefer et al. [Runge et al. 2020; Schaefer et al. 2018] ensures information flow-by-construction. Here, the information flow policy is ensured by applying a sound set of refinement rules to a starting specification. Instead of checking the security after program creation, the programmer is guided by the rules to never violate the policy. Compared to SIFO, their approach is limited to a while language without objects.
8 CONCLUSION
In this work, we presented a type system of an object-oriented language for secure lattice-based information flow control using type modifiers that detect direct and implicit information flows. This language supports secure software development by enforcing noninterference. We leverage previous work on immutability and encapsulation to greatly increase the expressive power of our language. Additionally, promotion/multiple method types encourage reusability of secure programs without burdening the developer. We formalized the secure type system, proved noninterference, and showed feasibility by implementing SIFO as a pluggable type system for L42, and conducting an evaluation with several case studies. In the future, we want to formalize exceptions in SIFO to extend the expressiveness of the language. We also want to generalize the proof to include declassification. Furthermore, we could reduce the typing effort of programmers by introducing type inference.
9 FUTURE WORK: INTEGRITY AND CONFIDENTIALITY
As noted by Biba [1977], integrity can be seen as a dual of confidentiality, which means that both can be checked with the same information flow analysis techniques. For confidentiality, information must not flow to inappropriate destinations; dually, for integrity, information must not flow from inappropriate sources. In this work, our whole discussion concerned confidentiality. A user of SIFO who is instead interested in integrity can simply use our type system with any lattice of integrity levels. However, it is also possible to track both properties at the same time; the trick is to not rely too heavily on data sources with the lowest or highest security level.
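This duality can be made concrete with a small sketch. In the following Python fragment (our own illustration, not SIFO code; the level names are hypothetical), one and the same flows-to check enforces either confidentiality or integrity, depending only on which ordering of levels it is given:

```python
# The same "information may flow from src to dst iff src <= dst" check
# enforces confidentiality or integrity, depending on the chosen order.

def flows_to(src: str, dst: str, order: set) -> bool:
    """Allowed iff src <= dst in the given (reflexive) order."""
    return src == dst or (src, dst) in order

# Confidentiality: low <= high. Public data may enter secret contexts,
# but secret data must never reach public sinks.
CONF = {("low", "high")}

# Integrity (the dual): trusted <= untrusted. Validated data may flow to
# untrusted sinks, but unvalidated input must never reach trusted sinks.
INTEG = {("trusted", "untrusted")}

print(flows_to("low", "high", CONF))            # True
print(flows_to("high", "low", CONF))            # False: would leak a secret
print(flows_to("trusted", "untrusted", INTEG))  # True
print(flows_to("untrusted", "trusted", INTEG))  # False: unvalidated input
```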
In this work, we assumed a lattice of security levels. However, Logrippo [2018] has proposed that a partially ordered set is already sufficient to model security. If we allowed just a partial order of security levels, SIFO could encode both integrity and confidentiality at the same time instead of using two separate lattices for confidentiality levels and integrity levels. Consider the following example, where Bob and Alice each have both confidential and trusted data; these levels form a partially ordered set.
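Such a partial order can be sketched as follows; the level names below are our own illustration, not SIFO syntax. Unlike in a lattice, Bob's and Alice's confidential levels are incomparable and need not have a least upper bound:

```python
# A security *poset*: flow a -> b is allowed iff a <= b, where <= is the
# reflexive-transitive closure of the edges below. No joins are required.

EDGES = {
    # trusted (validated) data may flow anywhere public data may
    ("bob_trusted", "public"),
    ("alice_trusted", "public"),
    # public data may flow into either party's confidential context
    ("public", "bob_confidential"),
    ("public", "alice_confidential"),
}

def leq(a: str, b: str) -> bool:
    """Reflexive-transitive closure of EDGES."""
    if a == b:
        return True
    return any(e[0] == a and leq(e[1], b) for e in EDGES)

print(leq("bob_trusted", "bob_confidential"))         # True (via public)
print(leq("bob_confidential", "alice_confidential"))  # False: incomparable
```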
Being able to express integrity and confidentiality at the same time within a single ordering would clearly be a great advantage; however, we are still investigating whether supporting partially ordered sets instead of a lattice has subtle consequences that interfere with our noninterference property.
Footnotes
1 SIFO is an acronym for Secure Information Flow in an Object-oriented language.
2 In the examples, we use a rich language including local variables and literals with the usual semantics. These are supported by our artifact, but in the formal model we present a minimal language that keeps only the most crucial OO features.
3 To aid readability, in the rest of the paper we write type modifiers and security levels explicitly; however, a syntactically much lighter style, as shown above, is accepted by our artifact and is the preferred way to code once the programmer is used to those defaults.
4 https://www.cs.cornell.edu/jif/doc/jif-3.3.0/manual.html.
5 A version of L42 with our SIFO library and the case studies is available at https://l42.is/SifoArtifactLinux.zip and https://l42.is/SifoArtifactWin.zip; the readme contains more information about the detailed syntax.
6 See [Zheng et al. 2003], found on the Jif website https://www.cs.cornell.edu/jif/.
7 https://github.com/too4words/securibench-micro.
REFERENCES
- Wolfgang Ahrendt, Bernhard Beckert, Richard Bubel, Reiner Hähnle, Peter H. Schmitt, and Mattias Ulbrich (Eds.). 2016. Deductive Software Verification – The KeY Book: From Theory to Practice. LNCS, Vol. 10001. Springer.
- Torben Amtoft, Sruthi Bandhakavi, and Anindya Banerjee. 2006. A logic for information flow in object-oriented programs. In POPL. 91–102.
- Torben Amtoft and Anindya Banerjee. 2004. Information flow analysis in logical form. In SAS (LNCS, Vol. 3148). Springer, 100–115.
- Torben Amtoft, John Hatcliff, Edwin Rodríguez, Robby, Jonathan Hoag, and David Greve. 2008. Specification and checking of software contracts for conditional information flow. In FM. Springer, 229–245.
- Chris Andreae, James Noble, Shane Markstrum, and Todd Millstein. 2006. A framework for implementing pluggable type systems. In OOPSLA. 57–74.
- Gregory R. Andrews and Richard P. Reitman. 1980. An axiomatic approach to information flow in programs. TOPLAS 2, 1 (1980), 56–76.
- Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. FlowDroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps. In PLDI, Vol. 49. ACM, 259–269.
- Thomas H. Austin and Cormac Flanagan. 2009. Efficient purely-dynamic information flow analysis. In PLAS. ACM, 113–124.
- Anindya Banerjee and David A. Naumann. 2002. Secure information flow and pointer confinement in a Java-like language. In CSFW, Vol. 2. 253.
- Gilles Barthe, Pedro R. D'Argenio, and Tamara Rezk. 2004. Secure information flow by self-composition. In CSF. IEEE, 100–114.
- Gilles Barthe, David Pichardie, and Tamara Rezk. 2007. A certified lightweight non-interference Java bytecode verifier. In European Symposium on Programming. Springer, 125–140.
- Gilles Barthe and Bernard P. Serpette. 1999. Partial evaluation and non-interference for object calculi. In FLOPS (LNCS). Springer, 53–67.
- Bernhard Beckert, Daniel Bruns, Vladimir Klebanov, Christoph Scheben, Peter H. Schmitt, and Mattias Ulbrich. 2013. Information flow in object-oriented software. In LOPSTR (LNCS). Springer, 19–37.
- David E. Bell and Leonard J. LaPadula. 1976. Secure Computer System: Unified Exposition and Multics Interpretation. Technical Report. MITRE Corp., Bedford, MA.
- Kenneth J. Biba. 1977. Integrity Considerations for Secure Computer Systems. Technical Report. MITRE Corp., Bedford, MA.
- Joshua Bloch. 2016. Effective Java. Pearson Education India.
- Stephen Chong and Andrew C. Myers. 2006. Decentralized robustness. In 19th IEEE Computer Security Foundations Workshop (CSFW'06). IEEE.
- Ádám Darvas, Reiner Hähnle, and David Sands. 2005. A theorem proving approach to analysis of secure information flow. In SPC (LNCS). Springer, 193–209.
- Dorothy E. Denning. 1976. A lattice model of secure information flow. CACM 19, 5 (1976), 236–243.
- William Enck, Peter Gilbert, Seungyeop Han, Vasant Tendulkar, Byung-Gon Chun, Landon P. Cox, Jaeyeon Jung, Patrick McDaniel, and Anmol N. Sheth. 2014. TaintDroid: An information-flow tracking system for realtime privacy monitoring on smartphones. TOCS 32, 2, Article 5 (June 2014), 29 pages.
- Andrew Ferraiuolo, Weizhe Hua, Andrew C. Myers, and G. Edward Suh. 2017. Secure information flow verification with mutable dependent types. In DAC. IEEE, 1–6.
- Paola Giannini, Marco Servetto, Elena Zucca, and James Cone. 2019. Flexible recovery of uniqueness and immutability. Theoretical Computer Science 764 (2019), 145–172.
- Joseph A. Goguen and José Meseguer. 1982. Security policies and security models. In S&P. IEEE, 11–20.
- Adele Goldberg and David Robson. 1983. Smalltalk-80: The Language and its Implementation. Addison-Wesley Longman Publishing Co., Inc.
- Colin S. Gordon, Matthew J. Parkinson, Jared Parsons, Aleks Bromfield, and Joe Duffy. 2012. Uniqueness and reference immutability for safe parallelism. SIGPLAN Notices 47, 10 (2012), 21–40.
- Jürgen Graf, Martin Hecker, and Martin Mohr. 2013. Using JOANA for information flow control in Java programs – a practical guide. In Proceedings of the 6th Working Conference on Programming Languages (ATPS'13) (Lecture Notes in Informatics (LNI) 215). 123–138.
- Robert J. Hall. 2005. Fundamental nonmodularity in electronic mail. Automated Software Engineering 12, 1 (2005), 41–79.
- Tobias Hamann, Mihai Herda, Heiko Mantel, Martin Mohr, David Schneider, and Markus Tasch. 2018. A uniform information-flow security benchmark suite for source code and bytecode. In Nordic Conference on Secure IT Systems. Springer, 437–453.
- Daniel Hedin, Arnar Birgisson, Luciano Bello, and Andrei Sabelfeld. 2014. JSFlow: Tracking information flow in JavaScript and its APIs. In SAC. ACM, 1663–1671.
- Wei Huang, Yao Dong, and Ana Milanova. 2014. Type-based taint analysis for Java web applications. In FASE (LNCS, Vol. 8411). Springer, 140–154.
- Wei Huang, Ana Milanova, Werner Dietl, and Michael D. Ernst. 2012. ReIm & ReImInfer: Checking and inference of reference immutability and method purity. SIGPLAN Notices 47, 10 (2012), 879–896.
- Sebastian Hunt and David Sands. 2006. On flow-sensitive security types. SIGPLAN Notices 41, 1 (Jan. 2006), 79–90.
- Atsushi Igarashi, Benjamin C. Pierce, and Philip Wadler. 2001. Featherweight Java: A minimal core calculus for Java and GJ. TOPLAS 23, 3 (2001), 396–450.
- Limin Jia, Jassim Aljuraidan, Elli Fragkaki, Lujo Bauer, Michael Stroucken, Kazuhide Fukushima, Shinsaku Kiyomoto, and Yutaka Miyake. 2013. Run-time enforcement of information-flow properties on Android. In European Symposium on Research in Computer Security. Springer, 775–792.
- Ralf Küsters, Tomasz Truderung, Bernhard Beckert, Daniel Bruns, Michael Kirsten, and Martin Mohr. 2015. A hybrid approach for proving noninterference of Java programs. In 2015 IEEE 28th Computer Security Foundations Symposium. IEEE, 305–319.
- Peixuan Li and Danfeng Zhang. 2017. Towards a flow- and path-sensitive information flow analysis. In CSF. IEEE, 53–67.
- Luigi Logrippo. 2018. Multi-level access control, directed graphs and partial orders in flow control for data secrecy and privacy. In Foundations and Practice of Security. Springer International Publishing, 111–123.
- Steffen Lortz, Heiko Mantel, Artem Starostin, Timo Bähr, David Schneider, and Alexandra Weber. 2014. Cassandra: Towards a certifying app store for Android. In Proceedings of the 4th ACM Workshop on Security and Privacy in Smartphones & Mobile Devices. 93–104.
- Bertrand Meyer. 1988. Eiffel: A language and environment for software engineering. Journal of Systems and Software 8, 3 (1988), 199–246.
- Ana Milanova and Wei Huang. 2013. Composing polymorphic information flow systems with reference immutability. In FTfJP. ACM, Article 5, 7 pages.
- Mark Samuel Miller. 2006. Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control. Ph.D. Dissertation. Johns Hopkins University, Baltimore, Maryland, USA.
- Martin Mohr, Jürgen Graf, and Martin Hecker. 2015. JoDroid: Adding Android support to a static information flow control tool. In Software Engineering (Workshops). 140–145.
- Andrew C. Myers. 1999. JFlow: Practical mostly-static information flow control. In POPL. ACM, 228–241.
- Andrew C. Myers and Barbara Liskov. 2000. Protecting privacy using the decentralized label model. TOSEM 9, 4 (2000), 410–442.
- Adwait Nadkarni, Benjamin Andow, William Enck, and Somesh Jha. 2016. Practical DIFC enforcement on Android. In 25th USENIX Security Symposium (USENIX Security'16). 1119–1136.
- Flemming Nielson, Hanne Riis Nielson, and Chris Hankin. 1999. Principles of Program Analysis. Springer.
- Matthew M. Papi, Mahmood Ali, Telmo Luis Correa Jr., Jeff H. Perkins, and Michael D. Ernst. 2008. Practical pluggable types for Java. In ISSTA. 201–212.
- Benjamin C. Pierce. 2002. Types and Programming Languages. MIT Press.
- Vineet Rajani, Iulia Bastys, Willard Rafnsson, and Deepak Garg. 2017. Type systems for information flow control: The question of granularity. ACM SIGLOG News 4, 1 (2017), 6–21.
- Vineet Rajani and Deepak Garg. 2018. Types for information flow control: Labeling granularity and semantic models. In 2018 IEEE 31st Computer Security Foundations Symposium (CSF'18). IEEE, 233–246.
- Indrajit Roy, Donald E. Porter, Michael D. Bond, Kathryn S. McKinley, and Emmett Witchel. 2009. Laminar: Practical fine-grained decentralized information flow control. In PLDI. ACM, 63–74.
- Tobias Runge, Alexander Knüppel, Thomas Thüm, and Ina Schaefer. 2020. Lattice-based information flow control-by-construction for security-by-design. In FormaliSE. To appear.
- Alejandro Russo and Andrei Sabelfeld. 2010. Dynamic vs. static flow-sensitive security analysis. In CSF. IEEE, 186–199.
- Andrei Sabelfeld and Andrew C. Myers. 2003. Language-based information-flow security. J-SAC 21, 1 (2003), 5–19.
- Andrei Sabelfeld and David Sands. 2009. Declassification: Dimensions and principles. Journal of Computer Security 17, 5 (2009), 517–548.
- Ina Schaefer, Tobias Runge, Alexander Knüppel, Loek Cleophas, Derrick Kourie, and Bruce W. Watson. 2018. Towards confidentiality-by-construction. In ISoLA (LNCS, Vol. 11244). Springer, 502–515.
- Vincent Simonet. 2003. Flow Caml in a nutshell. In APPSEM-II. 152–165.
- Alley Stoughton, Andrew Johnson, Samuel Beller, Karishma Chadha, Dennis Chen, Kenneth Foner, and Michael Zhivich. 2014. You sank my battleship! A case study in secure programming. In PLAS'14. ACM, 2–14.
- Martin Strecker. 2003. Formal analysis of an information flow type system for MicroJava. Technical Report. Technische Universität München.
- Qi Sun, Anindya Banerjee, and David A. Naumann. 2004. Modular and constraint-based information flow inference for an object-oriented language. In SAS (LNCS). Springer, 84–99.
- Tachio Terauchi and Alex Aiken. 2005. Secure information flow as a safety problem. In SAS (LNCS). Springer, 352–367.
- Thomas Thüm, Ina Schaefer, Martin Hentschel, and Sven Apel. 2012. Family-based deductive verification of software product lines. In GPCE. 11–20.
- Marco Vassena, Alejandro Russo, Deepak Garg, Vineet Rajani, and Deian Stefan. 2019. From fine- to coarse-grained dynamic information flow control and back. Proceedings of the ACM on Programming Languages 3, POPL (2019), 1–31.
- Dennis Volpano, Cynthia Irvine, and Geoffrey Smith. 1996. A sound type system for secure flow analysis. JCS 4, 2/3 (1996), 167–188.
- Jian Xiang and Stephen Chong. 2021. Co-Inflow: Coarse-grained information flow control for Java-like languages. In 2021 IEEE Symposium on Security and Privacy (SP'21). IEEE, 18–35.
- Danfeng Zhang, Yao Wang, G. Edward Suh, and Andrew C. Myers. 2015. A hardware design language for timing-sensitive information-flow security. SIGPLAN Notices 50, 4 (2015), 503–516.
- Lantian Zheng, Stephen Chong, Andrew C. Myers, and Steve Zdancewic. 2003. Using replication and partitioning to build secure distributed systems. In 2003 Symposium on Security and Privacy. IEEE, 236–250.
Immutability and Encapsulation for Sound OO Information Flow Control