**Marwa Ziadia <sup>1,∗</sup>, Jaouhar Fattahi <sup>1</sup>, Mohamed Mejri <sup>1</sup> and Emil Pricop <sup>2</sup>**


Received: 31 January 2020; Accepted: 26 February 2020; Published: 27 February 2020

**Abstract:** Today, Android accounts for more than 80% of the global market share. Such a high rate makes Android applications an important topic that raises serious questions about their security, privacy, misbehavior and correctness. Application code analysis is the most appropriate and natural means to address these issues. However, no analysis can be conducted with confidence in the absence of a solid formal foundation. In this paper, we propose a full-fledged formal approach to build the operational semantics of a given Android application by reverse-engineering its assembler-type code, called Smali. We call the new formal language Smali<sup>+</sup>. Its semantics consist of two parts. The first one models a single-threaded program, in which a set of main instructions is presented. The second one presents the semantics of a multi-threaded program, an important feature of Android that has been glossed over in state-of-the-art works. All multi-threading essentials, such as scheduling, thread communication and synchronization, are considered in these semantics. The resulting semantics, forming Smali<sup>+</sup>, are intended to provide a formal basis for developing security enforcement, analysis and misbehavior detection techniques for Android applications.

**Keywords:** Android applications; multi-threading; operational semantics; reverse engineering; Smali<sup>+</sup>

## **1. Introduction**

A few years ago, mobile phones were used to make calls or send messages. Today, they surpass computers as the most commonly used digital device. They manage our agenda, emails, credit cards, itineraries and business documents. Android is the most popular operating system for mobile and embedded devices: it has the largest application market, and 85% of all smartphones sold in 2019 were equipped with an Android OS [1]. Android is an open platform, which means that applications can be downloaded from sources other than the official Google Play store. This important feature has contributed to its unquestionable success, since the breadth of available applications draws people to the platform, but it also makes Android an ideal target for malicious application downloads.

Indeed, users are increasingly exposed to attacks targeting the Android environment via malicious applications. Such applications endanger private information by disclosing sensitive data (FakeNetflix malware [2]) or collecting sensitive banking information, especially with the increasing use of banking applications (Anubis trojan [3]). Furthermore, the installation of apparently legitimate malicious applications can lead to clandestine eavesdropping on telephone conversations, tracking of the GPS position, and exploitation of pay services to cause financial losses to the user for the benefit of the attacker, by calling or sending SMS messages to premium-rate numbers without the user's knowledge (SMS Trojans such as FakePlayer, AsiaHitGroup and GGTracker [4–6]).

To deal with this, automated tools for analyzing, verifying and enforcing the security of Android applications are highly needed [7–10]. Nevertheless, they must be based on a formal specification of the target platform to give solid results. In this paper, we propose formal operational semantics for a subset of the low-level Android code, which we consider particularly relevant for modeling Android applications and which we call Smali<sup>+</sup>. It includes the main bytecode instructions of Dalvik and a few important API methods related to Java concurrency. Smali<sup>+</sup> is derived from Smali, with some essential native methods replaced by macro-instructions for simplification. Smali<sup>+</sup> is intended to serve as a basis for further analysis of Android applications and security implementation techniques.

Android applications are mainly written in Java. The Java source code is first compiled by the standard Java compiler, *javac*, into class files that store Java Virtual Machine (JVM) bytecode. The Java bytecode is then translated into an optimized bytecode called *Dalvik* by a tool called *dx*. At this stage, all the class files are converted and consolidated into a single Dalvik EXecutable file, or simply DEX file, to save memory. An Android Package Kit (APK) is essentially a zip archive of the DEX file accompanied by an *AndroidManifest.xml* file, a set of resources and, potentially, shared libraries. Figure 1 illustrates these steps.

**Figure 1.** Compilation steps of an Android application.
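The packaging step described above can be illustrated with a minimal Python sketch. It is not part of the original toolchain: the archive is built in memory with placeholder contents, the entry names other than `classes.dex` and `AndroidManifest.xml` are hypothetical, and the grouping rule is a simplification (a real APK also contains signing metadata and a binary manifest). The only format fact relied upon is that a DEX file starts with the bytes `dex\n`, followed by a three-digit version and a NUL byte.

```python
import io
import zipfile

# A DEX file begins with "dex\n", then a three-digit version and a NUL byte.
DEX_MAGIC_PREFIX = b"dex\n"


def build_toy_apk() -> bytes:
    """Build an in-memory archive laid out like an APK (placeholder contents only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<placeholder/>")
        apk.writestr("classes.dex", DEX_MAGIC_PREFIX + b"035\x00" + b"\x00" * 16)
        apk.writestr("res/layout/main.xml", b"<placeholder/>")      # hypothetical resource
        apk.writestr("lib/armeabi-v7a/libdemo.so", b"\x7fELF")      # hypothetical library
    return buffer.getvalue()


def summarize_apk(apk_bytes: bytes) -> dict:
    """Group archive entries into the four parts of an APK named in the text."""
    summary = {"dex": [], "manifest": [], "resources": [], "libraries": []}
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        for name in apk.namelist():
            if name.endswith(".dex"):
                # Keep only entries that really start with the DEX magic prefix.
                if apk.read(name).startswith(DEX_MAGIC_PREFIX):
                    summary["dex"].append(name)
            elif name == "AndroidManifest.xml":
                summary["manifest"].append(name)
            elif name.startswith("lib/"):
                summary["libraries"].append(name)
            else:
                # Simplification: everything else is counted as a resource.
                summary["resources"].append(name)
    return summary


summary = summarize_apk(build_toy_apk())
print(summary)
```

Passing the bytes of a real APK to `summarize_apk` should work the same way, although the simple "everything else is a resource" rule would then also sweep up signing metadata.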

In this work, we focus on the DEX format file, which contains the Dalvik binary code used even by the successor of Dalvik (since Android 5.0) called Android Runtime (ART).

Formalizing low-level code, rather than high-level Java source or intermediate-level Java bytecode, is our choice for several reasons. Firstly, Dalvik bytecode is always available and easily obtainable from any Android application. Secondly, Dalvik bytecode is the common executable format for all Android applications and is therefore much closer to the code that is really executed. Even though decompilation from Dalvik back to Java or to Java bytecode is possible using reverse engineering tools (such as dex2jar and ded), there is no guarantee of recovering the original source code since there is no 100% robust and correct Dalvik-to-Java reverse translation tool [11]. Moreover, even when source code or Java bytecode can be retrieved from Dalvik, editing or improving code at this level requires the user to convert it back to Dalvik, and running the application afterward will often fail [9]. Focusing directly on Smali avoids such problems. However, the binary code contained in a DEX file is illegible and must be converted into a more understandable format before being analyzed, improved or edited. Reverse engineering makes it possible to convert such a machine-readable binary file into a human-readable one.

Apktool [12] is a reverse engineering tool that simplifies the entire process of assembling and disassembling Android applications. It includes "*Smali*" and "*bakSmali*", which are equivalent to an "assembler" and a "disassembler", respectively, allowing the passage to and from the DEX format. Apktool allows the user to disassemble applications to nearly original form. It uses *bakSmali* to produce, from an APK, a human-readable format akin to assembly languages called Smali (Smali is both the name of a mnemonic language for the Dalvik bytecode and the name of its assembler). This code is nothing but a translation of the bytecode executed by the Dalvik Virtual Machine (DVM). In other words, it is a readable representation of Dalvik bytecode in an assembly-like code, with mnemonic instructions. *bakSmali* creates a Smali file for each class in the application, preserving the original signature. The structure of such a file is presented in Figure 2. In addition to the code contained in the classes.dex file, Apktool generates the decoded application resources, as well as the *AndroidManifest.xml* file (in a readable version). These reverse engineering analysis techniques are still effective with the newly introduced ART environment [13].

> **.class** *modifiers* Lsome/package/Someclass;
>
> **.super** Lsome/package/Someclass;
>
> **.implements** Lsome/package/Someinterface;
>
> **.source** "*someclass.Java*"
>
> **.field** *modifiers* fieldname : type;
>
> **.method** *modifiers* methodname (type,...)type
>
> **.locals** ...
>
> *instruction* ...
>
> *instruction* ...
>
> *instruction* ...
>
> ...
>
> **.end method**
>
> ...

**Figure 2.** Structure of a Smali file.
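To make the layout of Figure 2 concrete, the following Python sketch reads the directives of a small Smali file into a dictionary. It is only an illustration under stated assumptions: the sample class `Lcom/example/Counter;` is invented, the function `parse_smali` is hypothetical and not part of Apktool or bakSmali, and only the directives shown in Figure 2 are handled.

```python
# Invented sample following the structure of Figure 2.
SAMPLE_SMALI = """\
.class public Lcom/example/Counter;
.super Ljava/lang/Object;
.implements Ljava/lang/Runnable;
.source "Counter.java"

.field private count:I

.method public run()V
    .locals 1
    const/4 v0, 0x1
    return-void
.end method
"""


def parse_smali(text: str) -> dict:
    """Collect class-level directives and, per method, its locals and instructions."""
    info = {"class": None, "super": None, "implements": [], "fields": [], "methods": []}
    current = None  # method being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(".class"):
            info["class"] = line.split()[-1]
        elif line.startswith(".super"):
            info["super"] = line.split()[-1]
        elif line.startswith(".implements"):
            info["implements"].append(line.split()[-1])
        elif line.startswith(".field"):
            info["fields"].append(line.split()[-1])
        elif line.startswith(".method"):
            current = {"signature": line.split()[-1], "locals": 0, "instructions": []}
        elif line.startswith(".locals") and current is not None:
            current["locals"] = int(line.split()[1])
        elif line.startswith(".end method") and current is not None:
            info["methods"].append(current)
            current = None
        elif current is not None and not line.startswith("."):
            # Any non-directive line inside a method body is an instruction.
            current["instructions"].append(line)
    return info


info = parse_smali(SAMPLE_SMALI)
print(info["class"], info["methods"][0]["signature"])
```

The last token of each directive line is taken as its operand, which suffices for this sample but would need refinement for directives that carry initial values or annotations.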

In this paper, we put forward formal semantics for Smali. Smali is an assembly-like language that runs on Android's DVM. It is obtained by 'bakSmaling' the Dalvik executable file (.dex). A syntax and a semantics have been adopted to specify this low-level code. The resulting formal language, called Smali<sup>+</sup>, is a simpler sub-language of Smali. A set of the most used Dalvik instructions has been generalized into 12 semantically different instructions (see [11] for the generalization process), compared to more than 200 original Dalvik instructions in Smali. In addition to this set, our semantics include instructions related to multi-threading. We plan to use Smali<sup>+</sup> in the near future to specify security properties for Android applications, in order to protect the user from security threats that target the Android environment through downloaded applications.

The paper is organized as follows. In Section 2, we present some related work with similar ideas of bytecode formalization and we discuss their advantages as well as their drawbacks and limitations. In Section 3, we give some essential preliminaries related to Smali (registers, some adopted notations, types, etc.). In Section 4, we present the operational syntax and semantics of Smali<sup>+</sup> for a single-threaded application. In Section 5, we present the operational syntax and semantics of Smali<sup>+</sup> for a multi-threaded application. In Section 8, we conclude and we introduce the future avenues of our research.

## **2. Related Work**

Most studies based on formal semantics of Android target a single well-defined goal. This can be an analysis for certification, a detection of potential vulnerabilities or malicious behavior of an application, or a verification of any aspect. It can also be a means to reveal security breaches of Android applications [14]. We will see in the studies presented hereafter that formalization elements differ substantially from one objective to another. That being said, it is practically impossible to evaluate the efficiency of analyses that are not based on a formal specification of the targeted platform.

In [15], Payet et al. define operational semantics for a subset of Dalvik opcodes covering register manipulation, arithmetic operations, object creation and access, and method calls, as well as Android activities. The semantic rules are relatively complex. An Android program is modeled as a graph of blocks, where each block contains one or more of the selected instructions. Blocks are linked so as to express the control flow passing from one block to another. Invoke and return instructions are required to occur only at the beginning and the end of a block, respectively. The block semantics integrate the semantics of the instructions other than calls and returns, while the call instruction semantics allow passing from the caller method block to the callee method block. Activity semantics depend on the activity state, method callbacks, the activity life cycle and external events. These semantics are defined to be the basis of static analyses that take into account the life cycle of activities. Despite the importance of the thread-activity connection in Android semantics, threading was detached from the activity semantics and concurrency was ignored in this work.

In [16,17], the authors propose formal operational semantics for the Dalvik bytecode. The formalization was accompanied by a control flow analysis to detect potentially malicious actions. Although the results highlight threading as the most often used language feature (90.18%), this feature was omitted in both the analysis and the semantics to focus, instead, on reflection, exceptions and dynamic dispatch with 73.00% and 19.53%, respectively, which we find somewhat awkward. This motivates us to pay special attention to modeling the multi-threading aspect of Android.

In [11], the authors present SymDroid, a Dalvik bytecode interpreter for detecting potential security vulnerabilities. It is a symbolic executor for a simplified intermediate language covering a fraction of the Dalvik opcodes, named *μ-Dalvik*. SymDroid receives the Dalvik bytecode (the .dex file) as input. The bytecode is first translated to *μ-Dalvik*, which is based on 16 instructions considered the most relevant for performing code analysis. Then, it is processed by a symbolic execution core using an SMT solver to generate traces as an intermediate result. Finally, the post-analyzer inspects the output traces and determines the final result. Entry points and all possible events affecting the application's behavior are developed according to a client-oriented specification (it is up to the user to model them) to drive the application under test as desired. Although this work models, in addition to bytecode instructions, the system libraries including Bundle and Intent, the Android component life cycle, services and views, it ignores the system's concurrent nature, both in the selected bytecode instructions and at the symbolic execution level, where the program is considered sequential.

In the same vein, Julia presented in [18] is a static analyzer for Java bytecode based on abstract interpretation. It was extended in [19] and adapted to analyze Dalvik bytecode and handle specific features of Android such as event-driven nature, potentially concurrent entry points and dynamic inflation of graphical views. It applies several static analyses for Android applications' classcast, nullness, dead code and termination analysis, but does not track information flow. Multi-threaded applications were not included in this work and event handlers are executed by a single thread.

Gunadi et al. [20,21] propose an operational semantics of DEX bytecode for certifying non-interference properties through a type system. This study includes a translation process from the Java bytecode semantics developed in [22] to Dalvik bytecode, concluding that if the first type system guarantees non-interference then its translation into Dalvik bytecode is also typable. Therefore, existing bytecode verifiers for Java could certify non-interference properties of Dalvik bytecode.

The semantics of multi-threaded programming in applications have lately drawn increasing attention. Some works combine it with event handling [23–25]; others consider the main API methods relating to it [26]. In [24], Kanade proposes semantics for a combined concurrency model of threads and events. That work focuses entirely on the event-driven nature of Android and its relationship with the application's threads. As a consequence, all other states that the semantics could reach, such as those resulting from basic instruction execution (method call, jump, return instruction, etc.), are not treated. The semantics proposed in [26] are the closest to ours. They cover the main Dalvik instructions and handle multi-threading. That work can be seen as an extension of [27], with the major change being the semantics needed for the concurrent setting and exception handling. However, thread scheduling is not discussed and thread spawning is left to the virtual machine to execute at an unpredictable point in time.

In the same stream of thought, in [28], Chaudhuri presents a formal security study on Android using operational semantics and a system of types for specific Android constructs. However, semantics ignore all Java constructs that may appear in Android applications (no class and method modeling), to focus instead on Android components, intents and all Android-specific features related to it (binding a service, sending an intent, etc.). This can be seen as a unified formal understanding of security for users and developers of Android applications to deal with their security concerns.

Some works have focused on other issues of Android such as multi-tasking. For instance, ASM presented in [29] is a formal model that formalizes all Android elements related to multi-tasking, such as activities, back stacks and tasks. An Android application is somewhat seen as a collection of activities with different types that interact with the user through a back stack. ASM has recently been extended in [30] to capture all the core elements of the multi-tasking mechanism used in inter-component communication.

Over time, formalization has included the permission system as well [31–33]. For example, Bagheri et al. propose in [31] a formal specification of the Android permission system written in the Alloy specification language. It aims to formally specify the behavior of Android applications, in particular the mutual interaction between applications based on permissions and its security consequences, which the authors call inter-app permission leakage vulnerabilities. Almost all Android elements related to inter-app permissions are taken into account in the formalism. Every application is modeled as a set of components, permissions, intent filters and vulnerable paths. Similarly, in [33], a formal model of Android's permissions is specified in the syntax of the Coq theorem prover.

Acteve++ [34] is an automated testing tool for Android apps. It is based on Acteve [35] but is improved to support input events and broadcast events in order to achieve higher coverage. The authors use a non-standard operational semantics that describes the concolic execution of the program. The semantics describe program execution in response to a sequence of events generated automatically from an external environment. All other features and instructions that Android handles are neglected in favor of the event-driven paradigm, which we find not expressive enough to model an Android application. Our operational semantics consider, besides concurrency, a variety of instructions that model method invocations, object creation and the whole tree structure of an application (classes, methods and fields).

In [36], the authors focus on the low-level interactions with the operating system by recording the system calls (syscalls) invoked. To benefit from both levels, the analysis uses generic low-level syscall traces to reconstruct the high-level semantics. While syscall analysis offers more security guarantees, it complicates the task, in our opinion. In particular, this information is extracted from internal interfaces between the Android libraries and the kernel, which may change in future versions of Android without notice. In our work, we propose rich semantics that cover all API calls at a high level, and we consider this sufficient to enforce security policies later.

Some studies, such as Stowaway [8] and ComDroid [37], directly analyze the disassembled DEX file of a given application to identify potential component and/or communication vulnerabilities through flow analysis. Despite the promising results of both tools in analyzing Dalvik bytecode and Android's API, proving their soundness and evaluating their efficiency or deficiency is practically impossible in the absence of a formal specification and proof.

Concurrent programming concepts and techniques are widely used in Android in order to manage different tasks and threads. Our formalism Smali<sup>+</sup> considers this important feature, which was previously neglected given its complexity. Overall, none of the aforementioned studies, including those considering multi-threading, offers complete semantics that cover all the states a thread can reach and represent all multi-threading essentials. Most of the studies formalizing Dalvik bytecode and handling multi-threading include only the two Dalvik instructions related to monitor use, *monitor-enter* and *monitor-exit*, since Dalvik opcodes encompass only these two instructions with regard to threading. However, a semantics for an Android program should not be limited to these instructions and must also consider instructions related to thread communication, signaling and scheduling. In this paper, we fill this gap by proposing semantics that incorporate, in addition to Dalvik instructions, a wide range of API methods covering multi-threading essentials, formulated as macro-instructions for the sake of simplicity. In comparison with test-based approaches, Smali<sup>+</sup> is based on formal methods with their foundation in mathematical logic, allowing us to achieve rigorous and unambiguous reasoning in the system specification and proofs and to ensure the system properties, while test-based approaches can only ensure that systems satisfy the requirements for the test cases. In sum, the proposed formal language is expressive enough to enforce security properties and to detect security-critical APIs (i.e., those related to sensitive data access such as camera, SMS, telephony and contact list). Its syntax includes the fully qualified class name for each invoked method, which makes such APIs easier to locate.
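The remark about fully qualified class names can be illustrated on plain Smali text. The Python sketch below scans `invoke-*` instructions and reports those whose target class appears on a watch list. It is only an assumption-laden illustration, not the detection technique of this paper: the watch list `SENSITIVE_CLASSES`, the sample instructions and the function `find_sensitive_calls` are all hypothetical.

```python
import re

# Hypothetical watch list: fully qualified class names mapped to a category.
SENSITIVE_CLASSES = {
    "Landroid/telephony/SmsManager;": "SMS",
    "Landroid/hardware/Camera;": "camera",
    "Landroid/telephony/TelephonyManager;": "telephony",
}

# Matches e.g.  invoke-virtual {v0, v1}, Lsome/Class;->method(...)
INVOKE_PATTERN = re.compile(
    r"^\s*invoke-[\w/]+\s+\{[^}]*\},\s*(L[^;]+;)->([\w<>$]+)\("
)

# Invented sample of disassembled instructions.
SAMPLE = """\
    invoke-static {}, Landroid/telephony/SmsManager;->getDefault()Landroid/telephony/SmsManager;
    move-result-object v0
    invoke-virtual/range {v0 .. v5}, Landroid/telephony/SmsManager;->sendTextMessage(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Landroid/app/PendingIntent;Landroid/app/PendingIntent;)V
    invoke-virtual {v1}, Ljava/lang/Object;->toString()Ljava/lang/String;
"""


def find_sensitive_calls(smali_text: str) -> list:
    """Return (line number, category, class, method) for each watched invocation."""
    hits = []
    for number, line in enumerate(smali_text.splitlines(), start=1):
        match = INVOKE_PATTERN.match(line)
        if match is None:
            continue  # not an invoke instruction
        class_name, method_name = match.groups()
        category = SENSITIVE_CLASSES.get(class_name)
        if category is not None:
            hits.append((number, category, class_name, method_name))
    return hits


calls = find_sensitive_calls(SAMPLE)
for call in calls:
    print(call)
```

Because the class name is part of every invocation, a purely syntactic scan is enough to locate candidate call sites; deciding whether a call is actually harmful is what a formal semantics is needed for.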

## **3. Preliminaries**

In this section, we present the most essential information for Smali. First, we present the DVM architecture and how it affects Smali syntax. Then, we present method invocation and how it affects Smali registers. Finally, we present Smali special notations for types.

#### *3.1. Registers*

Being optimized to run on devices on which resources and processor speed are scarce, the DVM architecture is register-based. Local variables are assigned to any of the 2<sup>16</sup> available registers. A register can hold any data value, except for *double* and *long* values, each of which requires two registers (64 bits). The Dalvik opcodes operate on the contents of registers instead of accessing elements on a program stack, as stack-based virtual machines do. Hence, registers allow the DVM to keep track of program evolution while it executes bytecode [38]. Each method in Smali has its own set of registers for the method's arguments and local variables, and a special register for its return value. We will see later that most of the instructions include source and destination registers. The Smali language denotes each set of registers differently, which allows us to visually distinguish between a method's local and argument registers.
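The rule that *long* and *double* values occupy two registers can be checked with a short Python sketch. It assumes the standard Dalvik single-letter type descriptors (`J` for *long*, `D` for *double*, `I` for *int*) and assigns registers consecutively from *v*0, which is a simplification for illustration rather than the allocation strategy of any particular compiler. The function names are hypothetical.

```python
# Dalvik type descriptors of the two 64-bit primitive types.
WIDE_TYPES = {"J", "D"}  # J = long, D = double


def registers_needed(type_descriptors) -> int:
    """Count registers: two for each wide value, one for anything else."""
    return sum(2 if t in WIDE_TYPES else 1 for t in type_descriptors)


def assign_registers(type_descriptors) -> list:
    """Assign consecutive registers v0, v1, ...; a wide value takes a register pair."""
    layout = []
    next_register = 0
    for descriptor in type_descriptors:
        width = 2 if descriptor in WIDE_TYPES else 1
        registers = [f"v{next_register + i}" for i in range(width)]
        layout.append((descriptor, registers))
        next_register += width
    return layout


# An int, a long, a String reference and a double.
values = ["I", "J", "Ljava/lang/String;", "D"]
print(registers_needed(values))
for descriptor, registers in assign_registers(values):
    print(descriptor, registers)
```

Four values therefore need six registers here, since the *long* and the *double* each span an adjacent pair.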

The *.locals* directive specifies the number of local registers used by the method (non-parameter registers), which is statically known. Local registers in Smali are denoted by *v*<sub>0</sub>, *v*<sub>1</sub>, *v*<sub>2</sub>, ..., *v*<sub>n</sub>, where *v*<sub>0</sub> is the first local register, *v*<sub>1</sub> the second and so on until the last register. This includes a special register for the method's return value, denoted by *ret*, which allows passing return values from the callee back to the caller.
