1. Introduction
We live in a technologically advanced world where the number of mobile devices is increasing rapidly owing to their mobility. However, these devices have limited battery capacity, and users need to operate them for the entire day. Mobile companies are attempting to increase battery capacity; however, it is constrained by factors such as size and material [1]. Therefore, efficient operating systems (OSs) and applications must be run on these devices. Android, iPhone OS, and Windows OS are the most popular OSs on mobile devices, and manufacturers are working to make their OSs more power efficient by providing dark and power-efficiency modes [2]. Similarly, popular applications (apps) such as Facebook (https://play.google.com/store/apps/details?id=com.facebook.katana, accessed on 15 July 2021), WhatsApp (https://play.google.com/store/apps/details?id=com.whatsapp, accessed on 15 July 2021), and Google Chrome (https://play.google.com/store/apps/details?id=com.android.chrome, accessed on 15 July 2021) also provide a dark mode for increased power efficiency [3]. The number of Android devices has increased significantly [4], and Android is the most popular mobile OS because it is open source; moreover, most of its apps are free to download and are available for low-end mobile devices, which has allowed it to capture large mobile user markets [5]. This also implies that a large community of developers is working to build new Android apps and to improve existing ones. On average, 3700 apps are uploaded daily to the Google Play Store [6]. Maintaining the quality of apps, which is measured by the number of downloads, reviews, and ratings provided by users, is therefore necessary [7]. The quality of certain apps suffers because their developers focus on functional requirements and ignore non-functional requirements such as performance [8], power consumption [9], and resource usage [10]. Apps should be power efficient, and developers must make efficient use of the application programming interfaces (APIs) provided by the OS to control power-hungry components. The Android OS provides the WAKE_LOCK API to control the power consumption of apps [11].
These WAKE_LOCK APIs are used when an app needs to work in the background or to prevent the CPU, display, and keyboard from going to sleep (which they otherwise do if not used for some time), for example, when watching a video in a social media app or updating app content in the background while the phone is locked. If a lock acquired by an app is left unattended (i.e., not released after the work is complete), it leads to unwanted power consumption. Using wake locks carefully is important because they control the power-hungry components (CPU, display) of the device [12].
The power consumption of a device or app can be measured using different tools [13]. To uncover unwanted power consumption, tools based on static [14], dynamic [15], or hybrid analysis [16] are available for detecting power leaks. Static analysis tools use the function call graph (FCG) [17], control flow graph (CFG) [18], and data flow graph (DFG) [19] to extract information about the app. Li et al. [20] provided details of state-of-the-art tools that apply static analysis to Android apps. They concluded that Soot, a framework for optimizing Java bytecode [21], and Jimple [22], an intermediate language (IL), are adopted by most tools. Their guide can be used to obtain initial knowledge of static analysis tools. Code smells [23] are also used to detect energy leaks in Android apps; however, the source code of the app is required to determine the code smells because the source code is analyzed at compile time to indicate problems in the code. In contrast, dynamic analysis runs the application to extract the flow of the app and identify the APIs used in that flow [24,25]. The primary disadvantage of dynamic analysis is that it may not cover all paths of the app because the app flow depends on the actions performed on the device. Dynamic analysis methods also suffer from the difficulty of constructing an execution environment and creating input data to inspect different paths [26].
These techniques have both advantages and disadvantages. The OS version is updated every year [27,28] with new features and with API changes and deprecations intended to improve the quality of the OS. App developers must adopt these changes and update their apps to ensure compatibility with the new version. Static and dynamic analysis tools do not adopt such changes automatically and must be updated every time a related API changes, which is difficult. Notably, every time there is a major change in the API, a new tool is required to detect power consumption, as in the case of GreenDroid [29] and Relda [30], which were upgraded to E-GreenDroid [24] and Relda2 [31], respectively, to support the newer version. This problem can be overcome by using a multi-layer perceptron (MLP) because it can be retrained automatically on new data and can subsequently classify the apps.
The MLP is a class of artificial neural network, a technique from artificial intelligence that is currently used in different fields to identify patterns and predict the output of complex problems [32]. The MLP minimizes manual human intervention to the largest extent and ensures that classification decisions depend primarily on the samples, through automatic feature extraction and pattern recognition. It is also used for malware detection [33], user behavior prediction [34], app description analysis [35], and several other applications. In this study, we used an MLP to detect wake lock leaks in Android apps and determined how effectively it can detect these leaks. To the best of our knowledge, our work is the first to use an MLP to detect wake lock leaks, and it can be extended to other resource leaks. The primary advantage of an MLP is that it can automatically learn the representation (features) from the raw data to perform the detection task.
In this study, we first determined how frequently a wake lock is used in the apps based on the permission declared in the apps. To train the MLP, we collected different apps from different studies and manually validated the problem of wake lock leaks from GitHub. The selected apps were preprocessed before training. Then, the MLP was applied to determine its efficiency in detecting wake lock leaks. We divided our study and answered the following questions:
RQ1: Are WAKE_LOCK permissions prevalent in Android apps?
RQ2: Can MLP be used to detect wake lock leaks?
RQ3: Is the performance of the MLP model better than that of traditional machine learning (ML) algorithms?
We answered RQ1 by collecting 800 popular apps from the Google Play Store; we found that 98% of the apps use the WAKE_LOCK permission, which makes it the second most popular permission in the dataset, so the wake lock APIs should be used carefully to prevent energy leakage. To answer RQ2, we applied the MLP and found that it detects wake lock leaks with high accuracy. For RQ3, we compared the accuracy of traditional ML algorithms with that of the MLP and show that ML algorithms can also detect wake lock leaks, albeit with slightly lower accuracy.
The remainder of this paper is organized as follows. Section 2 covers the fundamentals of the Android APK, wake lock leaks with an example, the MLP, and the synthetic minority oversampling technique (SMOTE). In Section 3, different related works are discussed. The wake lock leak detection model utilized in this study is presented in Section 4. The evaluation of the model is discussed in Section 5. In Section 6, we address the limitations of our work, and in Section 7, we conclude our work.
4. Wake Lock Leak Detection Model
Our wake lock leak detection model consists of three stages, as shown in
Figure 1. In the first stage, data is collected from different sources and labeled as “Clean APK” or “Leak APK”. The labeling is performed manually. In the second stage, the labeled data is encoded by extracting useful information from the APK and preprocessing is performed. In the last stage, an MLP model is trained to detect an app having wake lock leaks. We elaborate on each stage as follows.
4.1. Collecting and Labeling Data
Data collection is an important part of training the MLP, and the apps used for training must be carefully identified. Most of the highly downloaded and highly rated apps are available in app stores and can be freely downloaded as APK files; however, their source code is not available online. Other studies in this field face this problem; consequently, not many apps have been identified as having wake lock leaks. To solve this problem, we used apps in the APK format, which covers a large set of apps; therefore, the trained MLP model can be used to analyze apps from different app stores to ensure their quality.
Identifying apps with excessive power consumption on the app store is challenging because only apps with significant battery drainage issues are reported as having energy leaks by users, who then award a low rating for the app [
52]. These identified apps are then analyzed by the developer to reduce the power consumption. Most of the users do not comment on the primary problem and only enter general comments such as “bad app.”
There are also certain apps whose source code is available in the GitHub (
https://github.com/, accessed on 15 July 2021) repository, which can be used by the open-source community to enhance the capability of apps and reuse the code [
63]. As these open-source codes are used by several other developers, bugs are identified and corrected early. When a bug is identified, it is assigned an issue number and is closed when it is resolved [
64]. These issues have comments that define the problem in simple English with an error message for the developer; when the error is resolved, the developer includes comments on how the problem was resolved. This newly committed version of the code at GitHub is indicated with the added code in green, removed code in red; moreover, the files that are changed by the developer are also indicated. In our study, we collected apps from two sources:
We collected open-source apps from different studies and tools (as discussed in
Section 3) that use the source code of the app to detect different resource leaks. We selected only the apps with wake lock leak problems from these studies and verified their GitHub versions manually to ensure that the downloaded apps have a wake lock leak in their code. We collected both clean and leaked versions of the identified apps. The collected GitHub dataset is summarized in
Table 2. It lists the app name, fixed version, which contains the GitHub version of the app in which the wake lock leak has been rectified, leak version, which contains the version of the app where the problem of wake lock exists, and references to studies where the respective apps were used. The names of tools corresponding to the references are listed below the table. After collecting the data related to the clean/fixed and leak versions from GitHub repositories, we built the source code to generate the APK format using Android Studio.
Clean apps were collected from the empirical study [
56], which is a large database of apps with wake locks. This dataset contains 44,736 apps (
http://sccpu2.cse.ust.hk/elite/downloadApks.html, accessed on 15 July 2021). We used 2778 apps that use wake lock in their code but do not have any wake lock leak. The distribution of wake lock permissions for these apps is shown in
Figure 2. For easy readability of the figure, we only indicated the permissions that were requested over 1000 times in the dataset.
Figure 2 illustrates that all the apps that were used from the empirical study dataset require
WAKE_LOCK permission. We can add more apps from Google Play Store or other repositories, but if the app utilizes the wake lock permission, we cannot tell whether it has a wake lock leak. Therefore, we used data from the empirical study to ensure that the app does not have any wake lock leak.
Three authors of this paper labeled the data manually by visiting the GitHub page of each app, finding the specific version, and reviewing the code changes made by the developer. If the changes were related to the wake lock API, we labeled that version of the app as clean. We then located the previous version of the app and labeled it as a leak. A label was finalized if two authors agreed on the same label for the app.
4.2. Features Extraction
After labeling the APK files, we extracted features from the apps for use as input to the MLP because APK files cannot be fed directly to the MLP. The APK contains .dex files, which hold Dalvik bytecode [69] (explained in detail in Section 2). Therefore, it is desirable to convert APK files into ILs. An IL is the lowest-level human-readable programming language; it is created automatically by reversing tools [70], which convert the executable code into its textual representation. Different reverse engineering tools are available for extracting information from APK files, such as Soot (https://github.com/soot-oss/soot, accessed on 15 July 2021), APKtool (https://github.com/iBotPeaches/Apktool, accessed on 15 July 2021), and Androguard (https://github.com/androguard/androguard, accessed on 15 July 2021). These tools convert APK files into ILs such as Jimple, Jasmin, and Smali. For example, Relda [30] and Relda2 [31] use Androguard to convert the APK into Smali to detect wake lock leaks; similarly, APKtool and Soot can convert the APK into Smali and Jimple, respectively. A comparison of these ILs is presented in a previous study [71], which concluded that Smali is the most accurate IL because it preserves the program representation as it was written and is easily readable by humans.
We used Androguard to extract information from the APK files because it converts the APK into Smali, supports the Python language (which we used for implementation), and extracts more user-friendly information. Androguard is used to extract a call graph (CG) from the APK files. A CG contains the information flow of the app, which illustrates how each method interacts with the others. In Androguard, the CG is constructed using an Analysis object that generates a DiGraph (directed graph), that is, a pair $G = (V, E)$, where $V$ is a set of vertices or nodes and $E$ is a set of edges between different nodes. By default, these nodes are labeled with file names and method names. The instruction set of the method is stored in the attributes of the node. The labels of the nodes are important for identifying which node represents which file, class, and method; however, for machines, the label is only a string and does not provide much information; therefore, we updated these labels. The instructions contained in a node and the connections between nodes (edges) are important because they summarize the node and its surroundings. To encode the instructions and neighbor information of each node, we first encoded the labels of the CG using the instruction set contained in each node. In the CG, each method is represented as a node, and the interaction between these nodes is represented by edges, as shown in Figure 3. The figure shows the CG of the simple program that is shown on the left.
We consider the basic Dalvik bytecode instructions [72] that are listed in Table 3. In Dalvik bytecode, there are 256 instructions; however, for simplicity, we only considered the basic instruction classes; for example, the monitor instruction has different variations such as monitor-enter and monitor-exit. The 15 classes, each represented by one bit, were chosen for two reasons: first, they are the most commonly used instructions, and second, a previous study [73] shows that comprehensive features are not suitable because full opcode features yield lower accuracy and consume more time and space. The instruction class and labels can be represented as follows.
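$$C = \{c_1, c_2, \ldots, c_m\} \tag{1}$$

$$\ell : V \rightarrow \{0, 1\}^m \tag{2}$$

$$\ell(v) = [b_1(v), b_2(v), \ldots, b_m(v)], \quad b_c(v) = \begin{cases} 1 & \text{if } v \text{ contains an instruction from class } c \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

Here, $C$ is the instruction class, as represented in Table 3. The label of the node $v$ is represented as $\ell(v)$ and the number of bits is represented by the field $m$. In our case, $m$ is 15 bits. $b_c(v)$ is a function associated with node $v$.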
If we include more instructions, we require more bits ($m$) to represent them, which in turn requires greater memory and processing capability. These 15 instruction classes (based on Equation (1)) are represented using the 15-bit label of the node (according to Equation (2)). If a node contains instructions from these classes, the corresponding bits are set to one (according to Equation (3)).
Table 4 lists a simple code sample, in which the “Instruction” column provides the instruction sequence used in the node, and the “Instruction Class” and “Bit” columns give the equivalent instruction class and bit representation, respectively, as listed in Table 3. From the sample code listed in Table 4, we can obtain the bit representation of the label, as listed in Table 5. Notably, only the bits pertaining to the instructions contained in the node are converted from “0” to “1”. Furthermore, multiple “invoke” instructions in a function do not affect the label bits once they have been set to “1”, and the order of the instructions does not matter, because instructions can appear in a different order in different methods that perform the same functionality. Consequently, if the instruction sets in different nodes are the same, the nodes have the same label, which also reduces the complexity of the graph [17].
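The label encoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the class names and their bit order in `INSTRUCTION_CLASSES` are assumptions made for the example (Table 3 is the authoritative list), and `encode_label` is a hypothetical helper that expects each instruction to be already mapped to its class.

```python
# Assumed class names and bit order, for illustration only; see Table 3.
INSTRUCTION_CLASSES = [
    "nop", "move", "return", "monitor", "test",
    "new", "throw", "jump", "branch", "arrayop",
    "instanceop", "staticop", "invoke", "unop", "binop",
]


def encode_label(classes_in_node):
    """Return the 15-bit label of a node as a string of '0'/'1'.

    Bit i is '1' if at least one instruction of class i occurs in the node.
    Repetitions and ordering of instructions do not change the label.
    """
    present = set(classes_in_node)
    unknown = present - set(INSTRUCTION_CLASSES)
    if unknown:
        raise ValueError(f"unknown instruction classes: {sorted(unknown)}")
    return "".join("1" if c in present else "0" for c in INSTRUCTION_CLASSES)


# Two methods using the same classes in a different order and multiplicity.
label_a = encode_label(["invoke", "move", "invoke", "return"])
label_b = encode_label(["move", "return", "invoke"])
```

Both calls yield the same label, which reflects the point made above: a second "invoke" does not change a bit that is already set, and instruction order is ignored.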
After encoding the instruction information of each node of the CG in the label, we must also provide neighborhood information of the node, which is computed using the neighborhood hash graph kernel (NHGK) [
74]. It is a kernel that acts on the enumerable collection of sub-graphs of the labeled graph. It has low computational complexity and a highly expressive visual structure. The NHGK of each node can be computed by first identifying all its neighbors and then determining the XOR of their labels.
We can compute the hash of a given node $v$ and the set of its adjacent nodes $V_v$ using

$$h(v) = r(\ell(v)) \oplus \Big( \bigoplus_{z \in V_v} \ell(z) \Big),$$

where $r$ is a rotation to the left by a single bit and $\oplus$ represents a bitwise XOR on the binary labels. For each node, this computation can be performed in constant time, more precisely in $\Theta(md)$ time, where $d$ is the maximum out-degree and $m$ is the length of the binary label.
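As a small worked sketch of this hash, the following Python code rotates each node's own label left by one bit and XORs the result with the labels of its adjacent nodes. The helper names (`rotl1`, `neighborhood_hash`), the toy graph, and the choice of callees (out-neighbors) as the adjacent nodes are assumptions for illustration; labels are stored as integers holding the 15-bit pattern.

```python
M = 15  # label length in bits


def rotl1(label, m=M):
    """Rotate an m-bit integer label to the left by a single bit."""
    mask = (1 << m) - 1
    return ((label << 1) & mask) | (label >> (m - 1))


def neighborhood_hash(graph, labels, m=M):
    """Compute the hashed label of every node.

    graph:  dict mapping a node to the list of its adjacent nodes
            (assumed here to be the methods it calls).
    labels: dict mapping a node to its m-bit label as an int.
    """
    hashed = {}
    for v, neighbors in graph.items():
        h = rotl1(labels[v], m)
        for z in neighbors:
            h ^= labels[z]  # bitwise XOR with each neighbor label
        hashed[v] = h
    return hashed


# Toy call graph: main calls a and b, a calls b.
graph = {"main": ["a", "b"], "a": ["b"], "b": []}
labels = {
    "main": 0b100000000000001,
    "a": 0b000000000000110,
    "b": 0b000000000001000,
}
hashed = neighborhood_hash(graph, labels)
```

The hashed value has the same number of bits as the original label, so it can replace the label directly.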
We can obtain greater detail about a neighborhood by also including the neighbors of the initially determined neighbors; however, this increases the complexity. NHGK is used to gather the neighborhood information of a function into a single hash value. The primary advantage of NHGK is that it runs in time linear in the number of nodes and can process graphs with thousands of nodes, such as CGs. The label is replaced with the calculated hash value, which has the same number of bits as the label. Thus, we obtain a hashed node that contains information about both the instructions of the function and its neighborhood. After extracting and embedding the instruction and neighborhood information in the label of each node, we normalized the output into an array with 32,768 items, which was then used as the input to the MLP.
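The array length of 32,768 equals $2^{15}$, the number of distinct 15-bit hash values. One plausible reading of this step, shown below as a sketch under that assumption, is a histogram that counts how often each hash value occurs in the CG and scales the counts to sum to one. The text does not spell out the normalization, so the scaling choice and the helper name `embed` are hypothetical.

```python
from collections import Counter


def embed(hash_values, m=15):
    """Map the hashed node labels of one app to a fixed-length vector.

    Returns a list of length 2**m in which entry h is the fraction of
    nodes whose hash equals h (assumed normalization: unit sum).
    """
    size = 1 << m
    vector = [0.0] * size
    total = len(hash_values)
    if total == 0:
        return vector
    for h, count in Counter(hash_values).items():
        if not 0 <= h < size:
            raise ValueError(f"hash {h} does not fit in {m} bits")
        vector[h] = count / total
    return vector


# Four nodes, two of which share the hash value 4.
vector = embed([13, 4, 16, 4])
```

Because the vector length depends only on the number of label bits, apps with call graphs of any size map to inputs of the same dimension.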
4.3. Classification Using Multi-Layer Perceptron
We chose the MLP, which has three types of layers (input, hidden, and output). The input layer receives the signal to be processed. The output layer completes the required task, such as categorization. The real computational engine of the MLP consists of an arbitrary number of hidden layers positioned between the input and output layers. Because an MLP is a feedforward network, data travel in the forward direction from the input to the output layer [75]. The back propagation learning technique [76] was used to train the neurons in the MLP.
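The following are the computations performed by each neuron in the output and hidden layers.

$$o(x) = G\big(b^{(2)} + W^{(2)} h(x)\big)$$

$$h(x) = s\big(b^{(1)} + W^{(1)} x\big)$$

Here, $o(x)$ is the output layer, $h(x)$ is the hidden layer, $b^{(1)}$ and $b^{(2)}$ are bias vectors, $W^{(1)}$ and $W^{(2)}$ are weight matrices, and $G$ and $s$ are activation functions. The parameters to be learned are the weight matrices $W^{(1)}, W^{(2)}$ and the bias vectors $b^{(1)}, b^{(2)}$.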
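To make the two layer computations concrete, the following pure-Python sketch evaluates a forward pass: the hidden layer applies the activation s to an affine map of the input, and the output layer applies G to an affine map of the hidden values. The choice of tanh for s is an assumption made only for this example (the hidden activation is not stated here); the sigmoid for G follows the output layer described in this section. The helper names and the tiny weights are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, used here as the output activation G."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, x):
    """Compute b + W x for a weight matrix W (list of rows) and bias b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward(x, W1, b1, W2, b2):
    """One forward pass through a single-hidden-layer MLP."""
    hidden = [math.tanh(z) for z in affine(W1, b1, x)]  # s = tanh (assumed)
    return [sigmoid(z) for z in affine(W2, b2, hidden)]  # G = sigmoid


# Two inputs, two hidden units, one output unit.
W1 = [[1.0, 1.0], [1.0, -1.0]]
b1 = [0.0, 0.0]
W2 = [[1.0, 1.0]]
b2 = [0.0]
output = forward([0.0, 0.0], W1, b1, W2, b2)
```

With a sigmoid output, the single value can be read as the predicted probability of one of the two classes, and a threshold such as 0.5 turns it into a class label.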
As shown in
Figure 4, our MLP consists of one input layer, one fully connected layer, and one output layer. We can increase the number of hidden layers, but this does not improve accuracy and may cause overfitting; therefore, we only used one fully connected layer [
77].
For the input, we randomly split the data into training and testing sets of 80% and 20%, respectively. We ensured that the training and testing data were balanced (i.e., the numbers of “leak” and “clean” samples were equal) using a stratified distribution [78]. To avoid overfitting, L1 regularization with a coefficient of 0.001 and a dropout of 0.3 were applied. The Adamax optimizer with a learning rate of 0.1 was used. The sigmoid function was used in the output layer to classify the data. We trained the MLP for 500 epochs and determined the validation accuracy to illustrate the accuracy of the MLP model.
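The stratified 80/20 split can be illustrated with a short standard-library sketch: the indices of each class are shuffled separately and 20% of each class is assigned to the test set, so both sets keep the class proportions of the full dataset. The function name, the fixed seed, and the toy labels are assumptions for illustration; a library routine with a stratification option would serve the same purpose.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test while preserving class ratios."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Balanced toy dataset: 50 "leak" and 50 "clean" samples.
labels = ["leak"] * 50 + ["clean"] * 50
train_idx, test_idx = stratified_split(labels)
```

Splitting per class rather than over the whole dataset guarantees that a balanced dataset produces balanced training and testing sets.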
6. Limitations
Data collection was the most important and most difficult part of this research because most of the apps used in the evaluation of other tools were not available online, or we could not find the appropriate version of the app. Certain apps were obtained, but they did not have any leaks. To mitigate this threat, we only used apps that were obtained from GitHub and manually verified that the wake lock leak was removed in their updated version. We converted these apps to the APK format because we processed only APK files in our experiment.
Imbalanced data were another problem in our research. The number of identified apps with wake lock leaks was significantly lower than the number of clean apps. To mitigate this threat, we used SMOTE, which is the most popular oversampling technique. Other oversampling methods exist, but SMOTE has the best performance [83].
We do not have enough data to validate our approach against other tools and compare it with them. Benchmark apps are required to overcome this problem; they could be used to validate tools and measure their effectiveness in detecting wake lock leaks.
Another limitation of this work is that we only considered 15 Dalvik bytecode instruction classes during feature extraction, which can lead to an inaccurate representation of a method. We could mitigate this threat by including all the basic instructions; however, the memory and processing requirements would then be considerably higher and would increase the processing time.
7. Conclusions and Future Work
Reducing the power consumption of mobile devices is important. The display and CPU of the device consume most of the power. Wake lock APIs are used to control the state of the display, CPU, and keyboard, which affects the power consumption while the app is running. Our answer to RQ1 shows that more than 98% of the apps use the WAKE_LOCK permission and thus control the power state of the device while running. If these APIs (acquire() and release()) are not used properly, they lead to unwanted power consumption.
To detect wake lock leaks in Android apps, we extracted the CG from each APK and encoded the instruction and neighbor information of each node in its label to describe the node more fully. The apps were collected from GitHub and an empirical study and were oversampled using SMOTE. The encoded and oversampled data were then used to train the MLP. After training, we tested the MLP and calculated the accuracy and loss. The results illustrate that the MLP can detect wake lock leaks with a high accuracy of 99%. We also compared the MLP model with other ML algorithms, and the comparison demonstrated that the MLP outperforms them in detecting wake lock leaks.
In future work, we plan to include other resource leaks in the study and create a larger dataset that represents all resource leaks in Android apps; we then plan to evaluate the effectiveness of the MLP in detecting all resource leaks.