GAShellBreaker: A Novel Method for Java Fileless Webshell Detection Based on Grayscale Images and Deep Learning

Zhang, Yuan; Li, Daofeng; Xie, Yuqin

doi:10.3390/electronics14081678


by Yuan Zhang ¹, Daofeng Li ¹,²,* and Yuqin Xie ¹

¹ School of Computer and Electronic Information, Guangxi University, Nanning 530004, China

² Guangxi Colleges and Universities Key Laboratory of Multimedia Communications and Information Processing, Guangxi University, Nanning 530004, China

* Author to whom correspondence should be addressed.

Electronics 2025, 14(8), 1678; https://doi.org/10.3390/electronics14081678

Submission received: 14 March 2025 / Revised: 17 April 2025 / Accepted: 19 April 2025 / Published: 21 April 2025


Abstract:

Webshells are widely used by attackers to maintain access during the post-exploitation phase. As security defenses improve, traditional file-based Webshells are increasingly detectable. To evade detection, attackers are shifting toward fileless Webshells, which reside entirely in memory and present significant challenges to conventional security tools. However, research on fileless Webshell detection remains limited. To address this gap, we analyzed various fileless Webshell samples, summarized their behavioral patterns, and constructed a corresponding threat model. Based on this, we propose a novel detection approach named GAShellBreaker, which leverages grayscale image transformation and deep learning. GAShellBreaker first establishes a dual-layer in-memory monitoring mechanism to capture suspicious classes within the Java Virtual Machine (JVM) and export them as bytecode files. It then extracts opcode sequences from these files, transforms them into grayscale images, and employs a ResNet50-based classifier for detection. Due to the limited availability of fileless samples, we trained and evaluated the model on a larger dataset of 1351 file-based scripts (383 Webshells and 968 benign samples), and used 56 fileless Webshells for validation. Experimental results show that GAShellBreaker achieves 99.10% accuracy on file-based Webshells and 89.29% accuracy on fileless Webshells, outperforming existing algorithms. Moreover, it maintains low computational overhead (6.7%), confirming its practical feasibility.

Keywords:

fileless Webshell detection; deep learning; malicious code detection

1. Introduction

Fileless Webshells have emerged as a stealthier and more sophisticated evolution of traditional file-based Webshells, posing an increasing threat to server security. Unlike traditional Webshells—which are malicious server-side scripts (e.g., written in Java or PHP), typically uploaded via injection or arbitrary file upload vulnerabilities—fileless Webshells operate entirely in memory, leaving no persistent footprint on disk. This characteristic presents significant challenges for conventional static analysis and signature-based detection methods.

Among various types of cyberattacks, Webshell-based intrusions remain both prevalent and highly impactful. According to Cisco’s 2024 report [1], Webshells were involved in 35% of cyberattacks. Furthermore, Asiainfo Security’s threat intelligence report [2] indicates that the majority of modern Webshell attacks increasingly adopt fileless techniques, with 80% of successful intrusions leveraging this approach. Once deployed, Webshells allow attackers to execute remote commands, exfiltrate sensitive data, and infiltrate internal networks to launch more sophisticated attacks. Therefore, detecting Webshells is crucial for maintaining network security and stability.

In response to traditional Webshells, researchers have proposed various detection mechanisms, including text-based feature analysis [3], statistical modeling [4], and network traffic monitoring [5]. In recent years, deep learning techniques have also been increasingly integrated to improve detection accuracy. These methods are generally effective in detecting traditional file-based Webshells. However, the majority of existing research in this area focuses on PHP-based Webshells, which account for approximately 68.18% of all studies [6]. In contrast, research on Java-based Webshells remains limited, even though Java has become one of the most widely used and fastest-growing server-side programming languages since early 2021 [7].

In addition, most Webshell detection approaches are based on text classification, typically involving the transformation of fixed-length code fragments into word vectors, which are then fed into a classifier. Due to the use of fixed-length inputs, attackers can easily evade detection by inserting malicious logic at the end of a benign file segment—an area often ignored by the classifier. Moreover, because fileless Webshells reside entirely in memory and are dynamically loaded, these traditional methods are largely ineffective against them.

Despite the increasing threat posed by fileless Webshells, current research still predominantly focuses on traditional file-based variants, with Java-based fileless Webshells receiving little attention. To the best of our knowledge, only two studies have explicitly addressed fileless Webshell detection, of which only one specifically targets Java-based fileless Webshells. Although this method introduces a novel detection approach, it is limited in scope—capable of detecting only a narrow range of fileless Webshell types—and suffers from relatively low detection accuracy. Furthermore, while several open-source tools are available for detecting fileless Webshells, most rely on simple rule-based static text matching, resulting in low efficiency and poor detection performance. Overall, existing research and tools for fileless Webshell detection remain constrained by limited detection range and suboptimal accuracy.

Through our investigation and analysis of existing detection methods, we have identified two critical challenges that have significantly hindered progress in the detection of fileless Webshells. First, due to the nascent nature of this threat, there exists a substantial scarcity of publicly available datasets. This data scarcity hampers the ability of detection models to learn generalized behavioral patterns of in-memory attacks, thereby limiting their effectiveness. Second, Java-based fileless Webshells often reside within the Java Virtual Machine (JVM) as dynamically loaded classes, making them difficult to extract and monitor. Traditional security mechanisms lack the capability to capture and analyze such in-memory execution behaviors, further complicating the detection process.

To address these challenges, this paper proposes a hybrid detection framework for fileless Webshells, named GAShellBreaker. At the dynamic level, GAShellBreaker performs the real-time monitoring of the JVM to capture suspicious classes loaded into memory. At the static level, we introduce a novel detection method that transforms class files into grayscale images representing malicious code, which are then classified using a deep learning model. This grayscale image-based approach effectively mitigates the limitations associated with fixed-length inputs commonly used in traditional text-based detection methods.

Specifically, to mitigate the limitations caused by the scarcity of fileless Webshell samples, we leverage the similarity in malicious logic between fileless and traditional Webshells. The model is first pre-trained on a traditional file-based Webshell dataset and subsequently applied to the detection of fileless Webshells. To overcome the limitations of existing methods in identifying malicious memory-resident classes, we design a dual-layer monitoring mechanism that captures such classes in real time during program execution.

In summary, the main contributions of this paper are as follows:

(1) We conduct an in-depth analysis of the principles underlying fileless Webshells, categorize their behavioral characteristics, and develop a corresponding threat model. In addition, we manually examine existing fileless Webshell samples and construct a dedicated test set;
(2) We propose a dynamic, JVM-based dual-layer monitoring approach capable of capturing fileless Webshells in real time without disrupting the operation of web applications;
(3) We present a novel static detection method for Java-based Webshells, which is applied to fileless Webshell detection. This method transforms opcode adjacency pairs into grayscale images that are subsequently classified using the ResNet50 deep learning model [8].

The remainder of this paper is organized as follows. Section 2 reviews related work on both traditional and fileless Webshell detection. Section 3 introduces the threat model for fileless Webshells. Section 4 details the architecture of the proposed GAShellBreaker framework. Section 5 presents the experimental setup and results. Finally, Section 6 concludes the paper and outlines future directions.

2. Related Work

In this section, we review and discuss related work on Webshell detection and fileless Webshell detection.

2.1. File-Based Webshell Detection

Traditional Webshell detection methods primarily rely on static analysis, identifying Webshells by detecting characteristic values and dangerous functions within the code [9]. While this approach enables the rapid detection and precise identification of known Webshells, it suffers from a high false-positive rate and is largely ineffective against obfuscated, mutated, or encrypted malicious code. In recent years, advancements in machine learning have significantly expanded its applications in cybersecurity [10]. Unlike conventional signature-based detection methods, machine learning can autonomously extract complex code features, substantially improving detection accuracy and reducing false positives. As a result, it has emerged as a powerful approach for enhancing Webshell detection efficiency [11,12,13,14,15,16,17,18,19].

To better extract abstract features and capture richer semantic information, researchers have adopted word vectorization techniques to convert code text into feature vectors. For example, Guo et al. [12] applied 2-gram and Term Frequency-Inverse Document Frequency (TF-IDF) models to vectorize PHP opcode sequences and used a Naive Bayes (NB) model for detection. Phan et al. [14] initially filtered known Webshells using regular expression matching and then refined the selection by computing TF-IDF values to identify distinctive features for Webshell detection. Although these word vectorization methods are capable of converting text into numerical vectors, they primarily emphasize word frequency or local contextual relationships. As a result, they fail to effectively capture global semantics and long-range dependencies within the code.

To address this limitation, Liu et al. [15] utilized Word2Vec to generate word embeddings from source code and employed Bidirectional Gated Recurrent Units (BiGRUs) for classification. Pu et al. [16] adopted the Bidirectional Encoder Representations from Transformers (BERT) model to extract word vectors, followed by classification using an XGBoost ensemble model. Wang et al. [17] leveraged CodeBERT to obtain code embeddings and input them into a Gated Recurrent Unit (GRU) network for detection. However, all of the aforementioned approaches are designed for source code-based detection. In practice, attackers frequently use evasion techniques such as string splitting and character obfuscation. Consequently, detection methods that directly rely on source code are highly vulnerable to such obfuscation strategies, which can degrade their performance.

To mitigate the risk of source code-based detection being easily bypassed by obfuscation techniques, Viet et al. [18] proposed a method based on deep neural networks (DNNs) that identifies Webshell activity through a real-time analysis of HTTP traffic. However, since most websites today employ HTTPS encryption, detection methods that rely on HTTP traffic analysis face inherent limitations. Additionally, Lee et al. [19] transformed Webshells into abstract syntax trees (ASTs) and applied machine learning models for classification. Yet, this method shares a key limitation with earlier approaches—it relies on fixed-length input for text classification. Typically, these methods extract a fixed-length code segment (often 128 characters) and feed it into the model. The effectiveness of such methods heavily depends on whether the extracted segment contains the malicious payload. When malicious code is embedded deeper within a longer benign segment, fixed-length truncation often results in capturing only the benign portion, thereby reducing detection accuracy.
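The truncation problem can be illustrated with a minimal sketch. The 128-character window follows the figure quoted above; the sample script and the keyword check that stands in for a trained classifier are hypothetical and serve only to show the effect.

```python
# Hypothetical illustration: a fixed-length window hides a payload placed late in a file.
WINDOW = 128  # fixed input length quoted in the text

# Benign-looking filler followed by a command-execution payload (both invented for this sketch).
benign_prefix = "// helper utilities\n" + "int unused = 0;\n" * 20
payload = 'Runtime.getRuntime().exec(request.getParameter("cmd"));\n'
script = benign_prefix + payload


def naive_detector(text: str) -> bool:
    """Stand-in for a classifier: flags text containing a command-execution call."""
    return "Runtime.getRuntime().exec" in text


truncated = script[:WINDOW]           # what a fixed-length model would see
missed = naive_detector(truncated)    # the payload lies beyond the window
found = naive_detector(script)        # the payload is visible in the full file
print(missed, found)
```

Because the filler alone exceeds the window, the truncated view contains no trace of the payload, whereas an input that covers the whole file does.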

In addition, most of the aforementioned Webshell detection methods are limited to PHP-based Webshells. To address these limitations, we propose a detection approach based on grayscale image transformation. By converting the entire bytecode file into a grayscale image, this method effectively avoids detection failures caused by the fixed-length truncation of code segments. Moreover, since the target scenario involves fileless Webshells—typically operating in Java environments—our detection method is primarily designed for identifying Java-based Webshells.

2.2. Fileless Webshell Detection

Fileless Webshells have emerged as an attack technique in recent years. Because they leave no file on disk, they are more difficult to detect than file-based Webshells. To the best of our knowledge, only two studies have explicitly focused on the detection and analysis of fileless Webshells. Among them, Lima et al. [20] proposed a detection method that trains statistical learning models on patterns of malicious activity and then dynamically monitors these behaviors to identify Webshells. Although it achieves an accuracy of 99.95%, the approach is primarily designed for PHP-based fileless Webshells and is focused on personal computer security. Notably, PHP-based fileless Webshells often achieve persistence by continuously generating disk files, whereas Java-based fileless Webshells operate entirely within memory. This distinction renders Java-based variants significantly more stealthy and challenging to detect.

In the context of Java-based fileless Webshell detection, Song et al. [21] proposed a dynamic detection method that instruments the runtime environment with probes and performs taint analysis. While this approach is effective in identifying most fileless Webshells, it has two notable limitations. First, the deployed probes are restricted to detecting specific types of fileless Webshells and fail to identify agent-based variants. Second, the taint analysis technique employed in this method exhibits limited detection accuracy.

In summary, inspired by the above detection methods, this study proposes a novel approach tailored for Java-based fileless Webshells. By conducting a comprehensive analysis of existing fileless Webshell behaviors, we design a new monitoring mechanism specifically aimed at detecting their runtime activities. Furthermore, by transforming opcode sequences into grayscale images, our method preserves complete bytecode information and avoids the truncation issues commonly associated with text-based approaches. In contrast to prior studies that rely on textual features or partial code segments, our method leverages the full bytecode structure and integrates dynamic monitoring with deep learning, thereby achieving superior detection accuracy and enhanced robustness against obfuscation techniques.

3. Threat Model of Fileless Webshells

For Java-based fileless Webshells, malicious code is injected into memory and executed through middleware processes, eliminating the need for a remotely accessible Webshell file on the server. As a result, these Webshells exhibit enhanced stealth and present significant challenges for detection. Based on their exploitation techniques, fileless Webshells can be broadly classified into two categories: component-based and agent-based. In this section, we introduce a threat model for fileless Webshells. Building upon this model, we propose GAShellBreaker, a novel detection framework specifically designed to mitigate such threats effectively.

3.1. Component-Based Fileless Webshell

As illustrated in Figure 1, a typical Java web application consists of three core components: Servlet, Filter, and Listener. According to the Java Servlet 3.0 specification [22], these components can be dynamically registered through the ServletContext during web container initialization. Exploiting this mechanism, attackers can leverage vulnerabilities—such as insecure deserialization—to inject malicious components directly into memory, thereby achieving a fileless Webshell effect. Once registration is complete, the attacker can access a predefined path to carry out subsequent malicious operations.

In addition to the three standard components, attackers may also target framework-specific elements such as the Controller and Interceptor in the Spring framework, or the Valve component in Tomcat.

3.2. Agent-Based Fileless Webshell

The agent-based fileless Webshell leverages Java Agent technology to dynamically modify classes loaded within the Java Virtual Machine (JVM), enabling the injection of malicious code into otherwise benign classes. While the corresponding class files on disk remain unchanged, their in-memory representations contain injected logic, thereby achieving a fileless characteristic. This approach significantly enhances stealth and evasion capabilities, making detection considerably more challenging.

The implementation process of an agent-based Webshell is illustrated in Figure 2. Initially, the attacker exploits a vulnerability to upload a Java Archive (JAR) file or execute shellcode that loads the Java Agent. The attacker then injects the JAR file into the target JVM and deletes it afterward to erase traces. Once injection is completed, a predefined path can be accessed to trigger the Java Agent, which dynamically modifies the loaded classes and activates the Webshell.

Both component-based and agent-based fileless Webshells can be dynamically injected into a web server by exploiting security vulnerabilities. However, component-based Webshells typically involve registering new components that contain malicious Webshell code. In contrast, agent-based Webshells utilize bytecode instrumentation to directly modify existing classes already loaded in the JVM, thereby eliminating the need to introduce new class files.

3.3. Threat Model of Fileless Webshell

According to the definition of backdoor programs proposed by Thomas et al. [23], fileless memory Trojans can be decomposed into four key components: input source, trigger, attack payload, and privilege state. As illustrated in Figure 3, a fileless Webshell embedded within a web application can be characterized using this model:

Input Source: The input source refers to the origin of the input that activates the Webshell trigger. In fileless Webshells, common input sources include Java web components (e.g., Servlet, Filter, and Listener), Java framework components (e.g., the Controller and Interceptor in the Spring framework and the Valve component in Tomcat), and already loaded classes that agent-based Webshells dynamically modify within the JVM.

Trigger: The trigger determines how the Webshell is executed to facilitate subsequent malicious privilege escalation. In fileless Webshells, triggers typically exploit deserialization vulnerabilities, arbitrary file upload vulnerabilities, or similar attack mechanisms.

Payload: The attack payload consists of malicious code that serves as the core of a fileless Webshell, enabling control over the web application. The payload dictates whether the application transitions into a malicious privileged state.

Privilege State: The privilege state represents the Webshell’s activation status. If successfully activated, the Webshell enters a malicious privileged state, allowing the attacker to execute further penetration operations. Conversely, if activation fails, the Webshell remains in its normal state. Once in a malicious state, the attacker gains persistent access to the Webshell, enabling the execution of additional attack strategies.

As the trigger mainly involves web vulnerabilities rather than the Webshell itself, and a detailed analysis of web attacks may increase server overhead, this study focuses on detecting Webshells based on the input source, attack payload, and malicious privileged state.
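The four-part model can be written down as a small data structure. The following is only an illustrative sketch: the class and field names are hypothetical, and the example values are taken from the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class PrivilegeState(Enum):
    NORMAL = "normal"
    MALICIOUS = "malicious"


@dataclass
class FilelessWebshell:
    """Hypothetical record mirroring the four components of the threat model."""
    input_source: str   # e.g. "Filter", "Servlet", or a class modified by an agent
    trigger: str        # e.g. "deserialization vulnerability"
    payload: str        # description of the malicious logic
    state: PrivilegeState = PrivilegeState.NORMAL

    def activate(self, succeeded: bool) -> None:
        # A successful activation moves the model into the malicious privileged state;
        # a failed one leaves it in the normal state.
        self.state = PrivilegeState.MALICIOUS if succeeded else PrivilegeState.NORMAL


# The three aspects this study monitors; the trigger is deliberately left out.
DETECTION_FOCUS = ("input_source", "payload", "state")

sample = FilelessWebshell("Filter", "deserialization vulnerability", "command execution")
sample.activate(succeeded=True)
print(sample.state)
```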

Fileless Webshells typically rely on input sources to register or modify loaded classes for injection purposes. Therefore, by detecting specific input source functions, it becomes possible to capture the injection process of fileless Webshells and intercept them before they transition into a malicious state. Furthermore, once a fileless Webshell enters a malicious privileged state, its behavior closely resembles that of a traditional Webshell—primarily aiming to gain control over the server through operations such as command execution and database access. Consequently, this study also focuses on detecting malicious functions executed by the Webshell after it reaches the malicious state.

The attack payload of a fileless Webshell generally resembles that of a traditional Webshell, typically involving the establishment of malicious connection paths or the execution of harmful operations. However, unlike traditional Webshells, fileless variants often incorporate additional component registration logic during the initial stage of the payload. Previous research on Java-based Webshell detection has predominantly relied on text classification-based methods. These approaches commonly truncate the input text to a fixed length of 128 characters before feeding it into the detection model. Due to this length constraint, the extracted content may fail to capture the core malicious logic, thereby reducing detection effectiveness—particularly for fileless Webshells. To address this limitation, this study proposes a novel detection method that transforms opcode sequences into grayscale images and leverages deep learning to enhance the accuracy of the detection model.

4. GAShellBreaker Model

In this section, we first introduce the overall framework of GAShellBreaker, followed by a detailed explanation of its individual components.

4.1. Overview

As previously mentioned, in Java web applications, the code logic of a fileless Webshell typically resides in the JVM as dynamically loaded classes. A conventional detection approach, such as CopAgent [24], attempts to extract the bytecode of all classes in the JVM and detect Webshells based on predefined rules.

However, due to the complexity of real-world applications, the JVM may load a large number of classes, which can lead to significant performance overhead and reduced detection efficiency. To address this issue, we propose a dynamic detection framework named GAShellBreaker, which combines grayscale image transformation and deep learning to identify fileless Webshells based on their behavioral characteristics and exploitation processes.

The overall architecture of GAShellBreaker is depicted in Figure 4, comprising three core modules: (1) Monitoring Probe, (2) Detector, and (3) Alert System. Upon initialization, GAShellBreaker is injected into the target web application’s JVM and activates the Monitoring Probe to track suspicious loaded classes. The probe continuously monitors sensitive methods and privileged functions, such as Webshell component registration or command execution routines. When a dynamically loaded class invokes any of these functions, the probe captures the class and exports its bytecode and metadata for further analysis.

The extracted class file is then passed to the Detector module, where the bytecode is converted into a grayscale image using an opcode adjacency matrix. This image is then classified by a trained classifier to determine whether it represents a Webshell file. Based on the classification result, the Alert System handles subsequent response actions: if the sample is benign, normal execution proceeds and a low-risk alert is logged; if the sample is identified as a Webshell, a high-risk alert is triggered, and detailed information is forwarded to the system administrator.
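To make the idea of an opcode adjacency matrix concrete, the following minimal sketch counts consecutive opcode pairs and scales the counts to 8-bit gray levels. It is an assumption-laden illustration, not the paper's conversion algorithm: the four-opcode vocabulary, the function name, and the scaling by the maximum count are all hypothetical choices.

```python
def opcode_adjacency_image(opcodes, vocab):
    """Count adjacent opcode pairs and scale the counts to gray levels 0-255.

    Scaling by the largest count is an assumption made for this sketch.
    """
    index = {name: i for i, name in enumerate(vocab)}
    n = len(vocab)
    counts = [[0] * n for _ in range(n)]
    # Each consecutive pair (a, b) increments the cell at row a, column b.
    for a, b in zip(opcodes, opcodes[1:]):
        counts[index[a]][index[b]] += 1
    peak = max(max(row) for row in counts)
    if peak == 0:
        return counts  # a sequence with fewer than two opcodes yields an all-black image
    return [[255 * c // peak for c in row] for row in counts]


# Toy vocabulary and sequence (hypothetical); a real vocabulary would cover all JVM opcodes.
vocab = ["aload_0", "invokevirtual", "ldc", "areturn"]
seq = ["aload_0", "ldc", "invokevirtual", "aload_0", "ldc", "invokevirtual", "areturn"]
image = opcode_adjacency_image(seq, vocab)
for row in image:
    print(row)
```

The image size depends only on the vocabulary size, not on the length of the opcode sequence, which is why this representation does not need the fixed-length truncation used by text-based methods.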

The following sections provide an in-depth description of each component of the GAShellBreaker framework.

4.2. Monitoring Probe

Through an extensive analysis of fileless Webshells, we observed that, regardless of the sophistication of their concealment techniques, they must execute certain essential operations or invoke specific methods to function. For instance, Servlet-based component Webshells must call addServletMapping() to complete dynamic registration, while, in a malicious state, attackers frequently utilize ProcessBuilder.start() to execute system commands for further exploitation. Therefore, by monitoring these critical methods, it becomes possible to capture and intercept the Webshell during its injection or execution phases.

In this section, we further elaborate on the selection criteria for the key methods being monitored and provide a detailed explanation of the injection process for the monitoring probe.

4.2.1. Key Method Selection

For fileless Webshells, the proposed model employs a two-layer monitoring mechanism to capture such threats. The first monitoring layer focuses on component class functions associated with Webshell input sources, referred to as sensitive class functions. These functions are commonly used to register legitimate web components and do not inherently possess harmful capabilities. However, they are also frequently leveraged during the registration process of fileless Webshells, serving as key entry points for activating the Webshell. Therefore, we define such functions—those that are not inherently malicious but can be dynamically captured during the injection process—as sensitive functions.

The second monitoring layer targets functions required to transition into a malicious state, termed malicious class functions. These functions are considered malicious because they enable attackers to perform operations such as server intrusion, data exfiltration, and command execution, thereby posing a considerable security threat.

The first monitoring layer (sensitive methods). The selection of sensitive methods must meet specific criteria. An excessive number of methods may introduce unnecessary resource consumption for web applications and lead to a high false-positive rate. Conversely, selecting an insufficient number of methods may result in missed detections. Therefore, this study proposes four evaluation criteria for method selection:

(1) Sensitive functions must originate from the Java web framework rather than being custom-defined by the attacker;
(2) Sensitive functions should be as close as possible to the Webshell script itself. For instance, if Program C calls m1, and m1 calls m2, where both m1 and m2 satisfy Criterion 1, then m1 should be selected as the sensitive function. This ensures a more accurate tracing of the caller program in subsequent detection steps;
(3) Sensitive functions should appear in at least the majority of fileless Webshells of the same type;
(4) On the basis of satisfying the first three rules, if a method has overloaded versions, all methods with the same name should be considered sensitive functions. For example, in the Agent class, both addTransformer(Transformer, boolean) and addTransformer(Transformer) should be classified as sensitive functions.

Based on the above rules, we conducted a manual analysis of all publicly available fileless Webshells and identified 11 sensitive functions for GAShellBreaker to monitor. The selected sensitive functions are detailed in Table 1.
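The second and fourth criteria lend themselves to a short sketch. The helper names and the removeTransformer entry are hypothetical; m1, m2, and the addTransformer signatures are the examples used in the criteria themselves.

```python
def pick_sensitive(call_chain, framework_methods):
    """Second criterion: choose the framework method closest to the calling program.

    call_chain is ordered from the caller outward, e.g. C -> m1 -> m2.
    """
    for method in call_chain:
        if method in framework_methods:
            return method
    return None


def expand_overloads(name, declared_signatures):
    """Fourth criterion: every overload that shares the selected name is monitored."""
    return [sig for sig in declared_signatures if sig.split("(")[0] == name]


chosen = pick_sensitive(["C", "m1", "m2"], framework_methods={"m1", "m2"})
overloads = expand_overloads(
    "addTransformer",
    [
        "addTransformer(Transformer, boolean)",
        "addTransformer(Transformer)",
        "removeTransformer(Transformer)",  # hypothetical non-matching entry
    ],
)
print(chosen, overloads)
```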

The second monitoring layer (malicious class functions). Fileless Webshells, like their traditional counterparts, aim to gain unauthorized control over the server to perform various malicious operations. These typically include remote command execution, database manipulation, and file system access. To identify the functions associated with such activities, we analyzed a large number of real-world Webshell samples and consulted official Java API documentation [25], as well as security audit reports from prior research [26]. Based on this analysis, we compiled a list of commonly exploited functions, such as command execution (e.g., Runtime.exec()) and database connection routines (e.g., Driver.connect()). Monitoring these malicious class functions enables our system to detect and intercept potentially malicious Webshell behaviors at runtime.

To further address persistence threats, we also monitor functions that are commonly abused to achieve long-term control. For instance, since Webshell components are typically destroyed when a web application is shut down or restarted, attackers may register shutdown hooks (e.g., Runtime.addShutdownHook()) to execute custom threads upon JVM termination. These threads can write the Webshell to disk and subsequently reload it into memory when the server restarts, thereby achieving both persistence and fileless characteristics. The complete list of identified malicious functions and their usage contexts is presented in Table 2.
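The two layers can be pictured as two lookup sets. The sketch below uses only the function names mentioned as examples in the text; it does not reproduce Table 1 or Table 2, and the helper name is hypothetical.

```python
# Example entries only, taken from the functions named in the text.
SENSITIVE = {"addServletMapping", "addTransformer"}          # first layer: registration entry points
MALICIOUS = {
    "ProcessBuilder.start",      # command execution
    "Runtime.exec",              # command execution
    "Driver.connect",            # database connection
    "Runtime.addShutdownHook",   # persistence across restarts
}


def monitoring_layer(method):
    """Return 1 for a sensitive function, 2 for a malicious function, None otherwise."""
    if method in SENSITIVE:
        return 1
    if method in MALICIOUS:
        return 2
    return None


for name in ("addServletMapping", "Runtime.exec", "String.length"):
    print(name, monitoring_layer(name))
```

Keeping the two sets separate reflects the distinction drawn above: the first layer catches a Webshell while it is being injected, and the second catches it once it starts acting.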

4.2.2. Operation Process of the Monitoring Probe

GAShellBreaker is a dynamic, real-time monitoring technology that enhances the bytecode of functions in loaded classes within the JVM without affecting the normal operation of the target application. Upon startup, GAShellBreaker offers two injection options: the premain mode, which injects the monitoring mechanism at the time the web application starts, and the agentmain mode, which enables injection while the web application is already running. Once injected into the JVM, the system executes an algorithm to insert monitoring probes.

Algorithm 1 illustrates the process of inserting monitoring probes into the JVM. When the insertion program is initiated, it first traverses all currently loaded classes within the JVM. If a class name matches those listed in Table 1 or Table 2, the system uses Javassist’s insertBefore() method to inject the monitoring logic into the corresponding method.

When a program triggers the monitoring probe, GAShellBreaker reads and retains the current stack trace to capture the call site of the monitored method. The stack trace records information about the current call stack. For example, if a probe is inserted into method B, and method A subsequently calls B, the context information of A—including the call chain—will be stored in the stack trace. By traversing this call chain, GAShellBreaker can locate the program that invoked the monitored method and retrieve its corresponding contextual information.
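The lookup of the caller can be sketched as follows, assuming the stack trace is available as a list of fully qualified method names ordered innermost frame first, as Java stack traces are. The frame names and the helper are hypothetical and follow the A-calls-B example above.

```python
def locate_caller(stack_trace, monitored_method):
    """Return the class that called the monitored method and the rest of the call chain.

    stack_trace holds "package.Class.method" strings, innermost frame first.
    """
    for i, frame in enumerate(stack_trace):
        if frame == monitored_method and i + 1 < len(stack_trace):
            caller = stack_trace[i + 1]
            caller_class = caller.rsplit(".", 1)[0]  # strip the method name
            return caller_class, stack_trace[i + 1:]
    return None, []


# Hypothetical trace: the probe sits in Target.B, which was called from Suspect.A.
trace = ["pkg.Target.B", "pkg.Suspect.A", "pkg.Main.main"]
caller_class, chain = locate_caller(trace, "pkg.Target.B")
print(caller_class, chain)
```

The returned class name is what a monitoring probe would then need in order to export the corresponding class for analysis.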

In addition, the monitoring probe is capable of exporting flagged suspicious classes from the JVM and converting them into bytecode files (.class). GAShellBreaker leverages Javassist’s CtClass API to write memory-resident classes out as <classname>.class files.

Algorithm 1 Monitoring Probe Insertion Process

Require: $S$, $M$, $Cl$ // $S$ and $M$ represent the sets of sensitive methods and malicious functions, respectively; $Cl$ represents the corresponding set of classes.
Ensure: Target methods in JVM classes are injected with probes
1: $C \leftarrow \mathrm{getAllLoadedClasses}()$
2: for each $c_{i} \in C$ do
3:     if $c_{i} \in Cl$ then
4:         $F \leftarrow \mathrm{getDeclaredMethods}(c_{i})$
5:         for each $F_{i} \in F$ do
6:             if $F_{i} \in S$ or $F_{i} \in M$ then
7:                 $F_{i} \leftarrow \mathrm{insertBefore}(\text{monitoring probe code})$
8:             end if
9:         end for
10:     end if
11: end for
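The control flow of Algorithm 1 can be mirrored in a language-neutral simulation. In the sketch below, classes are plain dictionaries and a method body is a list of statements; prepending the string "PROBE" stands in for Javassist's insertBefore(). All names and sample data are hypothetical.

```python
def insert_probes(loaded_classes, target_classes, sensitive, malicious):
    """Simulate probe insertion.

    loaded_classes maps a class name to {method name: list of body statements}.
    Returns the qualified names of the methods that received a probe.
    """
    instrumented = []
    for cls_name, methods in loaded_classes.items():       # every loaded class
        if cls_name not in target_classes:                 # only classes in Cl
            continue
        for method_name, body in methods.items():          # declared methods
            if method_name in sensitive or method_name in malicious:
                body.insert(0, "PROBE")                    # stand-in for insertBefore()
                instrumented.append(f"{cls_name}.{method_name}")
    return instrumented


loaded = {
    "Ctx": {"addServletMapping": ["original"], "getName": ["original"]},
    "Other": {"exec": ["original"]},   # same method name, but the class is not targeted
}
done = insert_probes(loaded, target_classes={"Ctx"},
                     sensitive={"addServletMapping"}, malicious={"exec"})
print(done)
```

Filtering on the class first and the method second keeps the instrumentation limited to the monitored functions, so the remaining loaded classes are left untouched.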

4.3. Detector

This study presents a novel static detection approach that leverages grayscale image transformation and deep learning techniques. The proposed detector consists of two key components: a grayscale conversion engine and a deep learning classifier. The core methodology involves converting bytecode files into grayscale images using the transformation engine, followed by feature extraction through the ResNet50 deep learning model to enable the efficient and accurate identification of suspicious class files.

Given the shared payload characteristics between fileless and traditional file-based Webshells—where fileless variants typically include an additional component registration step—and the limited availability of fileless Webshell samples, our training dataset primarily consists of conventional Webshell samples. After training, the system is then applied to detect fileless Webshells. The following sections provide a detailed description of the detector’s components and implementation.

4.3.1. Grayscale Conversion Engine

In contrast to traditional Webshell payloads, fileless Webshells typically incorporate a component registration process, which poses challenges for text classification-based detection due to input length constraints that may hinder the model’s ability to capture the complete malicious logic. To address this limitation and counteract evasion techniques such as obfuscation and string splitting, this study employs low-level opcode sequences to construct grayscale images for detection purposes.

The grayscale conversion process, as depicted in Figure 5, initiates with the extraction of opcode sequences from class files using the javap -c command based on regular expression matching. The javap utility, a standard Java class file disassembler, is utilized for its capability to decompile and analyze bytecode generated by the Java compiler, providing a robust foundation for bytecode extraction and subsequent processing.
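As a rough illustration of this extraction step, the sketch below pulls opcode mnemonics out of `javap -c` text with a regular expression. The exact pattern used by GAShellBreaker is not given in the paper, so `OPCODE_LINE`, `extract_opcodes`, and `disassemble` are assumptions, and the sample listing is a hand-written example of typical `javap -c` output.

```python
import re
import subprocess

# Assumed pattern: an instruction line looks like "   4: invokespecial #1 ...".
# Requiring a letter after the offset skips switch-table rows such as "1: 28".
OPCODE_LINE = re.compile(r"^\s*\d+:\s+([a-z][a-z0-9_]*)", re.MULTILINE)


def extract_opcodes(javap_text):
    """Return the opcode mnemonics in the order they appear in the listing."""
    return OPCODE_LINE.findall(javap_text)


def disassemble(class_file):
    """Run `javap -c` on a .class file (requires a JDK on the PATH)."""
    result = subprocess.run(["javap", "-c", class_file],
                            capture_output=True, text=True, check=True)
    return result.stdout


SAMPLE = '''Compiled from "Demo.java"
public class Demo {
  public Demo();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc           #3                  // String hi
       5: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return
}
'''

opcodes = extract_opcodes(SAMPLE)
print(opcodes)
```

Operands and comments are discarded on purpose, since only the instruction mnemonics feed the later adjacency matrix.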

The subsequent step involves transforming the extracted opcode sequence into a grayscale image, as outlined in the pseudocode presented in Algorithm 2. To facilitate this conversion, GAShellBreaker constructs an $N \times N$ opcode adjacency matrix, where $N$ denotes the number of unique opcode instructions identified in the dataset. Specifically, we determined $N = 149$ by extracting all opcode instructions from the Java samples described in Section 5.1 and counting the number of unique opcodes that appeared. Deprecated, reserved, or unused opcodes were excluded to ensure that only relevant opcodes were considered. These selected opcodes were then used as both row and column indices to construct a $149 \times 149$ adjacency matrix, where each element holds an integer value ranging from 0 to 255. Once the opcode sequence is transformed into the adjacency matrix, GAShellBreaker generates a corresponding grayscale image. In this image, each pixel represents the frequency of a specific opcode adjacency pair. Consequently, the resulting grayscale image also has dimensions of $149 \times 149$.

Algorithm 2 Opcode Sequence to Grayscale Image Conversion

Require: $O$ // Input opcode sequence $O = [o_{1}, o_{2}, \dots, o_{m}]$
Ensure: Grayscale image $\mathit{IMG} \in \mathbb{R}^{149 \times 149}$
1: $\mathit{AM} \leftarrow 0^{149 \times 149}$ // Initialize adjacency matrix
2: for $i = 1$ to $m - 1$ do
3:   $op_{1} \leftarrow \mathit{Index}(o_{i})$ // Map opcode to matrix row index
4:   $op_{2} \leftarrow \mathit{Index}(o_{i + 1})$ // Map opcode to matrix column index
5:   if $\mathit{AM}[op_{1}, op_{2}] < 255$ then // Saturate at the maximum pixel value
6:     $\mathit{AM}[op_{1}, op_{2}] \leftarrow \mathit{AM}[op_{1}, op_{2}] + 1$
7:   else
8:     continue
9:   end if
10: end for
11: Convert $\mathit{AM}$ to grayscale image $\mathit{IMG}$
12: return $\mathit{IMG}$
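A minimal pure-Python sketch of this conversion is given below. It assumes a precomputed opcode-to-index mapping (the paper's vocabulary has 149 entries; the three-entry `index` here is purely illustrative), caps each cell at 255 so that it remains a valid 8-bit pixel, and serializes the matrix as a binary PGM file as one simple way to obtain a grayscale image. The function names and the choice to skip out-of-vocabulary opcodes are assumptions.

```python
def opcodes_to_matrix(opcodes, index, size):
    """Count adjacent opcode pairs into a size x size matrix capped at 255."""
    matrix = [[0] * size for _ in range(size)]
    for first, second in zip(opcodes, opcodes[1:]):
        if first not in index or second not in index:
            continue  # assumption: opcodes outside the vocabulary are skipped
        row, col = index[first], index[second]
        if matrix[row][col] < 255:  # saturate instead of overflowing a pixel
            matrix[row][col] += 1
    return matrix


def to_pgm(matrix):
    """Serialize the matrix as a binary PGM (P5) 8-bit grayscale image."""
    size = len(matrix)
    header = f"P5\n{size} {size}\n255\n".encode("ascii")
    return header + bytes(value for row in matrix for value in row)


index = {"aload_0": 0, "invokevirtual": 1, "areturn": 2}  # toy vocabulary
sequence = ["aload_0", "invokevirtual", "aload_0", "invokevirtual", "areturn"]
matrix = opcodes_to_matrix(sequence, index, size=3)
image_bytes = to_pgm(matrix)
print(matrix)
```

The row is the first opcode of a pair and the column is the second, so the matrix is generally not symmetric.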

In Table 3, we present the frequency rankings of opcode adjacency pairs in both benign and Webshell samples. Specifically, we collected .class files corresponding to benign and malicious Java programs (the dataset sources are detailed in Section 5.1) and used the javap -c command to extract the opcode sequences. We then computed the frequency of adjacent opcode instruction pairs (i.e., opcode 2-grams) within each sample and aggregated these frequencies according to their respective categories (benign vs. Webshell). Finally, we ranked the opcode pairs based on their overall frequency within each category and listed the ten most frequent patterns. As illustrated in Table 3, the top-10 most frequent opcode adjacency pairs in benign samples differ significantly from those in Webshell samples. This pronounced disparity highlights the effectiveness of using grayscale images derived from opcode adjacency pairs as a method for detecting malicious web activities.
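The ranking procedure described above amounts to counting opcode 2-grams per category and sorting by frequency. The following sketch shows one way to do this; the function name `top_pairs` and the toy sequences are illustrative and are not taken from the dataset.

```python
from collections import Counter


def top_pairs(samples, k=10):
    """Aggregate adjacent opcode pairs over all samples; return the k most frequent."""
    counts = Counter()
    for opcodes in samples:
        counts.update(zip(opcodes, opcodes[1:]))  # opcode 2-grams of one sample
    return counts.most_common(k)


# Toy opcode sequences standing in for the samples of one category.
category_samples = [
    ["aload_0", "getfield", "aload_0", "getfield"],
    ["aload_0", "getfield", "areturn"],
]
ranking = top_pairs(category_samples, k=1)
print(ranking)
```

Running the same function once on the benign sequences and once on the Webshell sequences gives the two rankings that can then be compared side by side.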

Figure 6 presents two grayscale images constructed by GAShellBreaker. In these images, the majority of pixel values are 0 (rendered as black regions), indicating that most opcode instruction pairs never occur adjacently. The white spots reflect how often one opcode instruction is immediately followed by another: the brighter the spot, the more frequent the adjacency pair. The yellow rectangles highlight key differences in opcode adjacency patterns between benign and Webshell samples. Notably, Webshell samples tend to exhibit a higher frequency of adjacency pairs within the highlighted regions.

4.3.2. Classifier

After extracting the bytecode files from memory and converting them into grayscale images, GAShellBreaker employs a deep learning model to learn the visual features of these representations. Given the relatively low dimensionality and complexity of the generated grayscale images, we consider it unnecessary to adopt overly complex model architectures. Among various candidates, ResNet50 is selected due to its well-documented performance in image classification tasks and its ability to strike a balance between accuracy and computational efficiency [27]. To empirically support this choice, we also conduct comparative experiments using a baseline Convolutional Neural Network (CNN) model. The results, presented in the subsequent experimental section, further confirm the superior detection performance of ResNet50 in our task.

4.4. Alert System

The alert system is primarily responsible for post-processing the detection results of fileless Webshells within the GAShellBreaker framework. For each suspicious class captured by the runtime probe, if the static detector classifies it as benign, the original response is allowed to proceed, and a low-risk alert is issued for logging purposes. Conversely, if the class is identified as a Webshell, a high-risk alert is triggered, and a notification is sent to the administrator via email.

When a high-risk Webshell is detected, GAShellBreaker automatically generates an alert message containing key information, including the alert timestamp, risk level, metadata of the suspicious class, and relevant attachments. The alert timestamp corresponds to the moment that the detector confirms the presence of a malicious class. In such cases, the system also records and exports critical details such as the file path, class name, and corresponding bytecode file for further analysis. Table 4 provides a representative example of an alert generated in our experimental environment.
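The alert contents listed above could be represented roughly as follows. This is a sketch under assumptions: the field names, the `build_alert` and `to_email` helpers, and the subject format are hypothetical, and the message is only constructed here, not sent.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from email.message import EmailMessage


@dataclass
class Alert:
    timestamp: str   # moment the detector confirmed the classification
    risk_level: str  # "high" for Webshells, "low" for benign classes
    class_name: str
    file_path: str


def build_alert(class_name, file_path, is_webshell):
    """Create the alert record for one captured class."""
    level = "high" if is_webshell else "low"
    now = datetime.now(timezone.utc).isoformat()
    return Alert(now, level, class_name, file_path)


def to_email(alert, bytecode, recipient):
    """Build (but do not send) the administrator notification for a high-risk alert."""
    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = f"[{alert.risk_level.upper()}] suspicious class {alert.class_name}"
    message.set_content(
        f"time: {alert.timestamp}\nclass: {alert.class_name}\npath: {alert.file_path}\n"
    )
    # Attach the exported bytecode so that it can be analyzed further.
    message.add_attachment(bytecode, maintype="application",
                           subtype="octet-stream",
                           filename=f"{alert.class_name}.class")
    return message


alert = build_alert("Evil", "/tmp/Evil.class", is_webshell=True)
mail = to_email(alert, b"\xca\xfe\xba\xbe", "admin@example.com")
print(mail["Subject"])
```

A low-risk alert would use the same record but would only be written to the log, in line with the behavior described in this section.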

5. Experiments and Analysis

In this section, we conduct two primary experiments. The first experiment focuses on evaluating the overall performance of the proposed static detection model for Java-based Webshells. Specifically, the model is trained and evaluated on a dataset of traditional file-based Webshells and compared with several existing methods to validate its effectiveness. The trained static detection model is subsequently integrated into GAShellBreaker to assess its capability in detecting fileless Webshells. The second experiment aims to evaluate the overall performance of the complete GAShellBreaker framework, including its effectiveness in capturing and detecting fileless Webshells in a realistic runtime environment.

5.1. Dataset

Given the limited research on Java-based Webshells—particularly fileless variants, which remain in the early stages of investigation—and the lack of standardized, publicly available datasets, the Webshell samples used in this study were primarily collected from open-source repositories. As shown in Table 5, the malicious Webshell samples were obtained from projects publicly released by security researchers on GitHub (https://github.com, accessed on 14 September 2024). For the benign samples used during training, we collected widely used Tomcat instances to construct the normal sample set.

To ensure data integrity, duplicate samples were removed by computing and comparing their MD5 hash values. As a result, we obtained a curated dataset comprising 383 file-based Java Webshell samples, 968 benign Java samples, and 56 fileless Java Webshell samples.
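The deduplication step can be sketched with the standard `hashlib` module. The helper name `deduplicate` and the in-memory `(name, bytes)` representation of samples are assumptions made for illustration.

```python
import hashlib


def deduplicate(samples):
    """Keep the first sample for each distinct MD5 digest of its content."""
    seen = set()
    unique = []
    for name, data in samples:
        digest = hashlib.md5(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append((name, data))
    return unique


samples = [
    ("shell_a.jsp", b"<% out.print(1); %>"),
    ("copy_of_a.jsp", b"<% out.print(1); %>"),  # same bytes, different file name
    ("shell_b.jsp", b"<% out.print(2); %>"),
]
kept = deduplicate(samples)
print([name for name, _ in kept])
```

Because the digest is computed over file contents, renamed copies are removed, while samples that differ by even one byte are kept as distinct.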

In the subsequent model training process, the file-based Webshell dataset was split into a training set and a testing set in an 8:2 ratio. To ensure the robustness of the results, we conducted three independent test runs and averaged the outcomes to derive the final performance metrics. For the fileless Webshell dataset, all samples were used to evaluate the performance of GAShellBreaker, including its effectiveness in capturing fileless Webshell classes loaded into memory and the accuracy of the static detector in identifying these Webshells.

As shown in Table 6, based on the exploitation process of fileless Webshells, we categorized them into Component Class and Agent Class. There are 49 Component Class samples in total, whereas Agent Class samples are relatively scarce, with only 7 collected so far.

5.2. Experimental Setting

During the training phase of the detector, GAShellBreaker was trained and evaluated on a system equipped with 16 GB of RAM, an NVIDIA GeForce RTX 2060 GPU (NVIDIA Corporation, Santa Clara, CA, USA), and an AMD Ryzen 4900H processor (Advanced Micro Devices, Inc., Santa Clara, CA, USA). The model was implemented and evaluated using Python 3.9.13, with PyTorch 2.6.0 and Torchvision 0.21.0. Given the proven effectiveness of the ResNet50 architecture in image-based classification tasks, it was adopted for pretraining in this study. The detector was trained for 20 epochs using the Adam optimizer, with a learning rate of 0.001 and a batch size of 32.

In the detection experiments for fileless Webshells, the GAShellBreaker detector was deployed on Oracle JDK 8 (Build 1.8_401). This version was selected due to its technical maturity and widespread adoption in the Java ecosystem, making it the preferred choice for most real-world enterprise applications. To comprehensively evaluate the detection capabilities and performance of the proposed model, a realistic testing environment was set up on a cloud server running Ubuntu 22.04, equipped with a dual-core processor and 4 GB of RAM. The environment included Tomcat, Spring Web, and other relevant components, all compiled and executed under the same JDK version (Oracle JDK 8, Build 1.8_401).

As illustrated in Figure 7, two vulnerabilities were deliberately configured in the server environment to assess the monitoring and detection effectiveness of GAShellBreaker. Specifically, a deserialization vulnerability was implemented based on the Commons-Collections library version 3.2.1. The ysoserial tool was subsequently used to generate deserialization payloads, which were injected into the environment via curl to simulate fileless Webshell attacks. Additionally, an arbitrary file upload vulnerability was introduced to support the deployment of JSP-based fileless Webshells, thereby providing a comprehensive and representative test scenario for the proposed detection framework.

5.3. Evaluation Metrics

In the experiments targeting standard Java-based Webshells, we evaluated detection performance using four key metrics: accuracy, precision, recall, and F1-score. Accuracy (Acc) measures the proportion of correctly classified instances among all samples. Precision (Pre) indicates the likelihood that a sample predicted as a Webshell is indeed a Webshell. Recall quantifies the proportion of actual Webshells that are correctly identified. F1-score is the harmonic mean of precision and recall, providing a balanced assessment that accounts for both false positives and false negatives.

Given the significant class imbalance between normal and Webshell samples, we adopted macro-averaged precision, recall, and F1-score as our primary evaluation metrics. This averaging strategy assigns equal weight to each class, thereby ensuring that the performance on minority classes is fairly reflected in the overall evaluation.
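To make the macro-averaging concrete, the sketch below computes per-class precision, recall, and F1-score and then takes their unweighted mean. The paper does not state which macro-F1 variant was used; this version averages the per-class F1 values, and the labels and predictions are made-up toy data.

```python
def macro_metrics(y_true, y_pred, labels=(0, 1)):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy imbalanced example: 3 Webshells (label 1) and 7 benign samples (label 0).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
acc, macro_p, macro_r, macro_f1 = macro_metrics(y_true, y_pred)
print(acc, macro_p, macro_r, macro_f1)
```

In this toy case the accuracy is 0.8, while the macro precision is the mean of 2/3 (Webshell class) and 6/7 (benign class), so the weaker minority-class performance pulls the macro score below the accuracy.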

5.4. Analysis of Detector Performance Results

In this section, we evaluate the effectiveness of our proposed static detection approach, which is based on grayscale image transformation and deep learning. Due to the limited availability of fileless Webshell samples and the high similarity in malicious logic between fileless and traditional file-based Webshells, the model was initially trained and evaluated on a dataset of file-based Webshells. To ensure statistical reliability, we conducted three independent test runs and calculated the average performance, standard deviation, and 95% confidence intervals for each metric.
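One common way to obtain these statistics from three runs is a Student-t interval, sketched below. The paper does not specify how its confidence intervals were computed, so the t-based formula is an assumption, and the run values are hypothetical placeholders rather than results from Table 7.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 2 degrees of freedom (n = 3 runs).
T_CRIT_95_DF2 = 4.303


def summarize_three_runs(runs):
    """Return (mean, sample standard deviation, (ci_low, ci_high)) for three runs."""
    if len(runs) != 3:
        raise ValueError("the hard-coded critical value is only valid for 3 runs")
    mean = statistics.mean(runs)
    std = statistics.stdev(runs)  # sample standard deviation (n - 1 denominator)
    half_width = T_CRIT_95_DF2 * std / math.sqrt(len(runs))
    return mean, std, (mean - half_width, mean + half_width)


hypothetical_accuracy = [0.990, 0.992, 0.991]  # placeholder values, not from the paper
mean, std, (low, high) = summarize_three_runs(hypothetical_accuracy)
print(mean, std, low, high)
```

With only three runs the critical value is large, so even a small standard deviation produces a comparatively wide interval.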

To further validate the effectiveness of our approach, we compared it with several representative methods under the same experimental setup, including Word2Vec-BiGRU [15], BERT-XGBoost [16], and CodeBERT-BiGRU [17]. Additionally, we conducted a baseline experiment using a standard CNN-based classifier to assess the contribution of the ResNet50 backbone.

As shown in Table 7, our method achieved the highest average accuracy (99.10%) and precision (99.13%), accompanied by notably small standard deviations and narrow confidence intervals, which indicate both high effectiveness and performance stability. While the recall of our model is slightly lower than that of CodeBERT-BiGRU, the gap is marginal and likely stems from the broader semantic representation capabilities of pretrained language models. Nevertheless, our approach demonstrates a well-balanced performance across all metrics, and its superior accuracy and F1-score underscore its robustness and practical applicability in detecting Webshells.

5.5. Experimental Analysis of GAShellBreaker Performance for Fileless Webshells

In this section, we evaluate the effectiveness of GAShellBreaker by comparing it with two open-source tools and with JShellDetector, proposed by Song et al. [21]. Specifically, the evaluation focuses on two key aspects of our approach: (1) the capability of the monitoring module to accurately detect and capture in-memory Webshells, and (2) the performance of the static detection model in identifying fileless Webshells. To systematically assess the overall effectiveness and practical applicability of GAShellBreaker, we define the following research questions as evaluation criteria:

RQ1. Effectiveness: Can GAShellBreaker accurately detect fileless Webshells?
RQ2. Comparison with other tools: Does GAShellBreaker perform better than other tools?
RQ3. Feasibility: Does GAShellBreaker impose excessive performance overhead on the web server?

5.5.1. Answering RQ1: Effectiveness

To systematically evaluate the effectiveness of our model, we perform separate assessments of the two core components of GAShellBreaker: the monitoring module and the detector module. For the monitoring module, we evaluate its ability to accurately capture suspicious classes that invoke monitored methods. For the detector module, we assess its classification accuracy in identifying whether the captured classes are Webshells.

Table 8 presents the comparative performance of GAShellBreaker and JShellDetector across different types of fileless Webshells (additional details are provided in Appendix A). The experimental results show that GAShellBreaker’s monitoring probe successfully captures all 56 fileless Webshells, a capture rate of 100%. This outcome supports the validity of our analysis regarding the input sources of fileless Webshells. In contrast, JShellDetector’s suspicious class filter is limited to fileless Webshells based on the Servlet API and Spring framework, and fails to capture Agent-based Webshells or other component types (e.g., Tomcat–Valve). As a result, JShellDetector captures only 43 of the 56 samples, a significantly lower capture rate of 76.79%.

Additionally, in terms of detection accuracy, the GAShellBreaker detector correctly identifies 50 out of 56 fileless Webshell samples, achieving an accuracy of 89.29%. The remaining six undetected samples may be attributed to the fact that their core execution logic does not exhibit overtly malicious behavior. Instead, these samples perform potentially harmful actions—such as writing to files—only after receiving specific traffic requests. Consequently, during static analysis, GAShellBreaker classifies them as benign. In comparison, JShellDetector successfully detects only 45 samples, corresponding to an accuracy of 80.36%. These experimental results underscore the superior detection performance of GAShellBreaker.
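The percentages quoted in this subsection follow directly from the sample counts, as the short check below shows. The counts of 50 and 45 are stated in the text; the count of 43 captured classes for JShellDetector is tallied from the check marks in Table A1 and is consistent with the reported 76.79%.

```python
TOTAL_FILELESS = 56  # number of fileless Webshell samples in the dataset


def rate(hits, total=TOTAL_FILELESS):
    """Percentage of samples handled correctly, rounded to two decimals."""
    return round(100 * hits / total, 2)


gashellbreaker_capture = rate(56)   # classes captured by the monitoring probe
gashellbreaker_detect = rate(50)    # classes classified as Webshells
jshelldetector_capture = rate(43)   # classes passing the suspicious class filter
jshelldetector_detect = rate(45)    # samples reported as Webshells

print(gashellbreaker_capture, gashellbreaker_detect,
      jshelldetector_capture, jshelldetector_detect)
```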

5.5.2. Answering RQ2: Comparison with Other Tools

The results compared here are obtained by applying the trained static detection model to fileless Webshells. Specifically, we input the 56 fileless Webshell loading classes captured by the runtime monitoring module of GAShellBreaker into the detector. The detector transforms these bytecode files into grayscale images and classifies them using the trained ResNet50 model. Because every sample in this experiment is a Webshell, we report only accuracy, which here equals the fraction of malicious samples that the model detects.

We compare our approach with two open-source tools: Copagent [24] and OpenRASP [28]. Copagent is a rule-based static detection tool that identifies in-memory Webshells by retrieving all loaded classes in the JVM, filtering high-risk classes using a predefined blacklist, and applying rule-based matching. Its source code is publicly available on GitHub (https://github.com/LandGrey/copagent, accessed on 11 March 2025). OpenRASP, in contrast, is a Runtime Application Self-Protection (RASP)-based dynamic detection tool that provides comprehensive runtime application monitoring and protection, covering specific scenarios of Webshell behavior. The source code for OpenRASP can be accessed from its official website (https://rasp.baidu.com, accessed on 11 March 2025). In addition, we compare our approach with JShellDetector, proposed by Song et al. [21]. JShellDetector is a dynamic detection method based on JVM instrumentation and taint analysis. Similar to our monitoring mechanism, it injects probes into specific framework-related classes (e.g., Servlet and Spring) to trace the propagation of untrusted input. However, it is limited in scope and cannot detect fileless Webshells embedded in non-framework components such as Tomcat–Valve or WebSocket.

As shown in Table 9, GAShellBreaker demonstrates superior performance in detecting fileless Webshells, achieving high accuracy across various types of fileless Webshells. In contrast, the two open-source tools exhibit overall poor detection performance.

Copagent’s detection effectiveness is limited primarily because of its simplistic rule configurations, which attackers can easily bypass by altering the architecture of the Webshell. Meanwhile, OpenRASP is only capable of detecting the exploitation phase of fileless Webshells and cannot identify their injection process. Additionally, OpenRASP operates exclusively in premain mode, requiring the Java Agent to be specified during the startup of the web application, and cannot be embedded during runtime. In contrast, GAShellBreaker offers enhanced flexibility by supporting the agentmain mode, enabling injection during program execution and significantly improving its detection capabilities.

It is important to note that high detection accuracy alone does not necessarily translate to superior performance in real-world scenarios. A comprehensive evaluation should also account for runtime efficiency and system overhead. Therefore, in the following section, we further assess the feasibility of deploying GAShellBreaker by examining its performance overhead in a realistic web application environment.

5.5.3. Answering RQ3: Feasibility

Given that GAShellBreaker’s monitoring probe operates at application runtime, it is crucial to evaluate its performance to ensure minimal impact on the web server. Considering that OpenRASP is a widely adopted open-source tool in the field of Runtime Application Self-Protection (RASP) and has been deployed in numerous real-world security scenarios, this study assesses the feasibility of GAShellBreaker by comparing its actual runtime performance overhead with that of OpenRASP.

For the performance evaluation, we utilize Apache JMeter [29] as the benchmarking tool. The experimental environment is deployed on the same cloud server configuration described in Section 5.2, with Tomcat serving as the web application container. To simulate real-world user workloads, each request initiates 1000 hash computations on the server, approximating typical web response latency. We evaluate three distinct deployment scenarios: (1) a baseline environment without any protection, (2) an environment with GAShellBreaker deployed, and (3) an environment with OpenRASP deployed.

To emulate realistic load conditions, JMeter is configured with 1000 threads and a ramp-up period of 30 s, representing 1000 users accessing the server concurrently within that timeframe. The test is executed 10 times, generating a total of 10,000 HTTP requests. Evaluation metrics include the average response time and performance overhead. Let $T_{1}$ and $T_{2}$ denote the average response times before and after deploying the protection mechanism, respectively. The performance overhead $T$ is calculated as shown in Equation (1):

$T = \frac{T_{2} - T_{1}}{T_{1}} \times 100\%$ (1)

As shown in Table 10, when no protection program is running, the web server’s average response time is 1.49 s. After deploying GAShellBreaker, the average response time increases to 1.59 s, resulting in a performance overhead of 6.7%, which is within an acceptable range. In contrast, after deploying OpenRASP, the web server’s average response time increases to 1.87 s, leading to a performance overhead of 25.5%, which has a considerably larger impact on the server’s operation.
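The overhead figures can be reproduced from the reported average response times with the relative-increase formula of Equation (1). The helper name `overhead_percent` is illustrative; the three response times are the values quoted above.

```python
def overhead_percent(baseline, protected):
    """Relative increase of the average response time, in percent (Equation (1))."""
    return (protected - baseline) / baseline * 100


BASELINE_S = 1.49        # no protection deployed
GASHELLBREAKER_S = 1.59  # GAShellBreaker deployed
OPENRASP_S = 1.87        # OpenRASP deployed

gashellbreaker_overhead = round(overhead_percent(BASELINE_S, GASHELLBREAKER_S), 1)
openrasp_overhead = round(overhead_percent(BASELINE_S, OPENRASP_S), 1)
print(gashellbreaker_overhead, openrasp_overhead)
```

Rounded to one decimal place, the two values are 6.7 and 25.5, matching the overheads reported for GAShellBreaker and OpenRASP.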

As shown in Figure 8, we have plotted the response time distribution curves before and after running the detection tools. Overall, deploying a protection program increases the response time. However, when running GAShellBreaker, the response time curve remains closer to that of the unprotected environment. In contrast, OpenRASP exhibits a more noticeable impact on response time. Therefore, GAShellBreaker has a smaller impact on the web system, demonstrating better feasibility.

6. Conclusions

Webshells are malicious server-side scripts commonly used by attackers to maintain access after compromising a server. However, as security defenses continue to evolve, traditional file-based Webshells are becoming increasingly difficult to deploy undetected. Consequently, fileless Webshells—characterized by their stealth and lack of persistence on disk—have emerged as a new trend. Due to their memory-resident nature, traditional detection tools struggle to effectively identify fileless Webshells. To address this challenge, this paper systematically investigates Java-based fileless Webshells, analyzing the principles underlying two main categories and constructing their corresponding threat models. We then propose a novel detection method, GAShellBreaker, which targets three key aspects of fileless Webshells: input sources, payloads, and privilege states. GAShellBreaker comprises two core components: a monitoring probe and a static detector. The monitoring probe captures suspicious in-memory classes by monitoring specific function invocations and exports them as bytecode files. The static detector employs a ResNet50-based deep learning model combined with grayscale image transformation to classify these bytecode files. Given the similarity in core malicious logic between fileless and traditional file-based Webshells, the detector is initially trained on a file-based Webshell dataset, achieving excellent detection performance with an average accuracy of 99.10%, outperforming other comparable methods. The trained model is subsequently used to analyze captured suspicious classes. Experimental results demonstrate that our approach achieves 89.29% accuracy in detecting fileless Webshells, with a runtime performance overhead of only 6.7%, highlighting the practical deployment potential of GAShellBreaker in real-world Java-based web server environments.

Although GAShellBreaker has demonstrated strong detection performance and robustness against source-level obfuscation techniques, it has not yet been evaluated against bytecode-level evasion strategies, such as opcode injection and instruction-level obfuscation. These techniques alter the structure of bytecode without modifying its semantic behavior, potentially undermining the effectiveness of static detection. In future work, we plan to assess the impact of such evasion techniques and explore corresponding countermeasures to further enhance the robustness of our approach.

In addition, several other aspects require further improvement. For instance, the limited availability of fileless Webshell samples currently restricts our ability to directly train detection models on such data. Moreover, insufficient attention has been paid to detecting exploit triggers during the vulnerability exploitation phase. To address these limitations, we plan to build a larger and more standardized dataset of fileless Webshells, optimize the static detection module, and strengthen vulnerability monitoring to improve overall detection efficiency.

Author Contributions

Conceptualization, Y.Z.; data curation, Y.Z.; funding acquisition, D.L.; methodology, Y.Z.; project administration, D.L.; resources, D.L.; software, D.L.; supervision, D.L.; validation, Y.Z. and Y.X.; writing—original draft, Y.Z.; writing—review and editing, Y.Z. and Y.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 61662004.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1 presents the detection results of GAShellBreaker and JShellDetector across all fileless Webshell samples.

Table A1. Detection of fileless Webshells.

Type	Category	Name	GAShellBreaker: Monitoring Probe	GAShellBreaker: Detector	JShellDetector: Suspicious Class Filter	JShellDetector: Webshell Detection
Component	Servlet	1AddServlet	✓	✓	✓	✓
		2addservlet	✓	✗	✓	✓
		3memservlet	✓	✓	✓	✓
		4icememservlet	✓	✓	✓	✓
		7icememservlet	✓	✓	✓	✓
		7memservlet	✓	✓	✓	✓
		AddServlet	✓	✓	✓	✗
		ISRain	✓	✓	✓	✓
		ISRain10	✓	✓	✓	✗
		SRain	✓	✓	✓	✓
		SRain10	✓	✗	✓	✗
		TestServlet	✓	✓	✓	✓
	Filter	addFilter	✓	✓	✓	✓
		memfilter8910	✓	✓	✓	✓
		4icememfilter	✓	✓	✓	✓
		icememfilter8910	✓	✓	✓	✓
		AddFilter	✓	✓	✓	✓
		FRain	✓	✓	✓	✗
		FRain10	✓	✓	✓	✓
		IFRain	✓	✓	✓	✗
		IFRain10	✓	✓	✓	✓
		TestFilter	✓	✓	✓	✓
	Listener	1AddListener	✓	✓	✓	✓
		addlistener	✓	✗	✓	✓
		icememlistener	✓	✓	✓	✓
		listener	✓	✓	✓	✓
		memlistener	✓	✓	✓	✓
		2AddListener	✓	✓	✓	✗
		3AddListener	✓	✓	✓	✓
		ILRain	✓	✓	✓	✓
		ILRain10	✓	✓	✓	✓
		LRain	✓	✓	✓	✓
		LRain10	✓	✓	✓	✓
		TestListener	✓	✓	✓	✓
	Spring–Controller	AddController	✓	✓	✓	✗
		ControllerBased	✓	✓	✓	✓
		Evil	✓	✓	✓	✓
		InjectToController	✓	✓	✓	✓
		invisibleShell	✓	✓	✓	✓
		ReController	✓	✓	✓	✗
	Spring–Interceptor	AddInterceptor	✓	✓	✓	✗
		TestInterceptor	✓	✗	✓	✓
		TestInterceptor1	✓	✓	✓	✓
	Tomcat–Valve	myValve	✓	✓	✗	✓
	Tomcat–Valve	myValve1	✓	✓	✗	✓
	WebSocket	cmdbypass	✓	✓	✗	✓
	WebSocket	wscmd	✓	✗	✗	✗
	Executor	AddExecutor	✓	✓	✗	✓
	Upgrade	AddTUpgrade	✓	✓	✗	✓
Agent	-	DefineTransformer	✓	✓	✗	✓
		ProcessUtil	✓	✓	✗	✓
		Shell	✓	✗	✗	✗
		ShellChecker	✓	✓	✗	✓
		Shell1	✓	✓	✗	✓
		Shell2	✓	✓	✗	✓
		WriteShell	✓	✓	✗	✓

References

Lexi DiScola. Talos IR Trends Q4 2024: Web Shell Usage and Exploitation of Public-Facing Applications Spike. 2024. Available online: https://blog.talosintelligence.com/talos-ir-trends-q4-2024 (accessed on 11 March 2025).
Asiainfo Security. Asiainfo Security Technologies-2020 Thematic Analysis Report on New Ransomware Virus Without File Attack Techniques. 2020. Available online: https://www.asiainfo-sec.com/security/notice/detail-6071.html (accessed on 11 March 2025).
Li, Y.; Huang, J.; Ikusan, A.; Mitchell, M.; Zhang, J.; Dai, R. Shellbreaker: Automatically detecting php-based malicious web shells. Comput. Secur. 2019, 87, 101595. [Google Scholar] [CrossRef]
Cui, H.; Huang, D.; Fang, Y.; Liu, L.; Huang, C. Webshell detection based on random forest–gradient boosting decision tree algorithm. In Proceedings of the 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC), Guangzhou, China, 18–21 June 2018; pp. 153–160. [Google Scholar]
Yang, W.; Sun, B.; Cui, B. A webshell detection technology based on HTTP traffic analysis. In Proceedings of the Innovative Mobile and Internet Services in Ubiquitous Computing: Proceedings of the 12th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS-2018), Matsue, Japan, 4–6 July 2018; pp. 336–342. [Google Scholar]
Hannousse, A.; Yahiouche, S. Handling webshell attacks: A systematic mapping and survey. Comput. Secur. 2021, 108, 102366. [Google Scholar] [CrossRef]
W3Techs. Web Technology Surveys. 2025. Available online: https://w3techs.com/ (accessed on 11 March 2025).
Koonce, B.; Koonce, B. ResNet 50. In Convolutional Neural Networks with Swift for Tensorflow: Image Recognition and Dataset Categorization; Apress: Berkeley, CA, USA, 2021; pp. 63–72. [Google Scholar]
Luczko, P.; Thornton, J. PHP Shell Detector. 2012. Available online: https://github.com/emposha/PHP-Shell-Detector (accessed on 11 March 2025).
Thangavel, M.; TGR, A.S.; Priyadharshini, P.; Saranya, T. Review on machine and deep learning applications for cyber security. In Research Anthology on Machine Learning Techniques, Methods, and Applications; IGI Global: Hershey, PA, USA, 2022; pp. 1143–1164.
Wang, Z.; Yang, J.; Dai, M.; Xu, R.; Liang, X. A method of detecting webshell based on multi-layer perception. Acad. J. Comput. Inf. Sci. 2019, 2, 81–91.
Guo, Y.; Marco-Gisbert, H.; Keir, P. Mitigating webshell attacks through machine learning techniques. Future Internet 2020, 12, 12.
Min, J. Conv-Bilstm: A New Intelligent Webshell Detection Network Based on bi-lstm. Master’s Thesis, Lanzhou University, Lanzhou, China, 2021.
Phan, V.A.; Jerabek, J.; Le, D.K.; Gotthans, T. New Approach to Shorten Feature Set via TF-IDF for Machine Learning-Based Webshell Detection. In Proceedings of the 2024 IEEE International Conference on Cyber Security and Resilience (CSR), London, UK, 2–4 September 2024; pp. 50–55.
Liu, Z.; Li, D.; Wei, L. A new method for webshell detection based on bidirectional gru and attention mechanism. Secur. Commun. Netw. 2022, 2022, 3434920.
Pu, A.; Feng, X.; Zhang, Y.; Wan, X.; Han, J.; Huang, C. BERT-Embedding-Based JSP Webshell Detection on Bytecode Level Using XGBoost. Secur. Commun. Netw. 2022, 2022, 4315829.
Wang, G.Y.; Ko, H.J.; Chiang, C.P.; Wang, W.J. Webshell detection based on codebert and deep learning model. In Proceedings of the 2024 5th International Conference on Computing, Networks and Internet of Things, Tokyo, Japan, 24–26 May 2024; pp. 484–489.
Viet, H.L.; Phung, O.V.; Nguyen, H.N. Enhancing Webshell Detection with Deep Learning-Powered Methods. arXiv 2024, arXiv:2412.05532.
Lee, H.J.; Hwang, S.J.; Pratiwi, M.; Choi, Y.H. Obfuscated PHP Webshell Detection Using the Webshell Tailored TextRank Algorithm. In Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, Avila, Spain, 8–12 April 2024; pp. 1358–1365.
Lima, S.M.; Silva, S.H.; Pinheiro, R.P.; Souza, D.M.; Lopes, P.G.; de Lima, R.D.; de Oliveira, J.R.; Monteiro, T.d.A.; Fernandes, S.M.; Albuquerque, E.d.Q.; et al. Next-generation antivirus endowed with web-server sandbox applied to audit fileless attack. Soft Comput. 2023, 27, 1471–1491.
Song, X.; Qin, Y.; Liu, X.; Cui, B.; Fu, J. JShellDetector: A Java Fileless Webshell Detector Based on Program Analysis. Comput. Mater. Contin. 2023, 75, 2061–2078.
IBM Corporation. Java Servlets 3.0. 2022. Available online: https://www.ibm.com/docs/en/was-liberty/base?topic=features-java-servlets-30 (accessed on 11 March 2025).
Thomas, S.L.; Francillon, A. Backdoors: Definition, deniability and detection. In Proceedings of the International Symposium on Research in Attacks, Intrusions, and Defenses, Heraklion, Greece, 10–12 September 2018; pp. 92–113.
LandGrey. Copagent. 2021. Available online: https://github.com/LandGrey/copagent (accessed on 11 March 2025).
Oracle Corporation. Java Platform SE 8 API Specification. 2014. Available online: https://docs.oracle.com/javase/8/docs/api/ (accessed on 9 April 2025).
Zang, C. Java Risky Functions Collection. 2022. Available online: https://github.com/zangcc/Java_Risky_Functions (accessed on 9 April 2025).
Sarwinda, D.; Paradisa, R.H.; Bustamam, A.; Anggia, P. Deep learning in image classification using residual network (ResNet) variants for detection of colorectal cancer. Procedia Comput. Sci. 2021, 179, 423–431.
Baidu Security. OpenRASP. 2022. Available online: https://rasp.baidu.com (accessed on 5 October 2023).
APACHE. APACHE JMeter. 2022. Available online: https://jmeter.apache.org (accessed on 11 March 2025).

Figure 1. Implementation process of component-based Webshell.

Figure 2. Implementation process of agent-based Webshell.

Figure 3. Threat model of fileless Webshell.

Figure 4. The architecture of the proposed framework GAShellBreaker.

Figure 5. The architecture of the grayscale conversion engine.

Figure 6. Grayscale image representation of normal and Webshell samples.
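The grayscale conversion referred to in Figures 5 and 6 can be illustrated with a minimal sketch. It assumes that each JVM opcode's numeric value (0 to 255) is used directly as a pixel intensity and that the sequence is zero-padded into a square image; the exact mapping, image size, and padding used by GAShellBreaker are not given in this part of the article, and the function name `opcodes_to_grayscale` is hypothetical.

```python
import math
from typing import List


def opcodes_to_grayscale(opcodes: bytes) -> List[List[int]]:
    """Arrange opcode byte values (0-255) row by row into a square, zero-padded image."""
    # Smallest square side that can hold every opcode (assumed layout).
    side = max(1, math.ceil(math.sqrt(len(opcodes))))
    padded = list(opcodes) + [0] * (side * side - len(opcodes))
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Standard JVM opcode values: aload_0 (0x2A), getfield (0xB4), areturn (0xB0),
# new (0xBB), dup (0x59). The sequence itself is a made-up toy example.
image = opcodes_to_grayscale(bytes([0x2A, 0xB4, 0xB0, 0xBB, 0x59]))
```

Each inner list is one image row; a real pipeline would typically resize the result to the fixed input size expected by the classifier.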

Figure 7. Overview of the fileless Webshell experimental environment.

Figure 8. Response time distribution curve before and after running the detection tool.

Table 1. Sensitive methods for GAShellBreaker.

No	Type	Class	Method
1	Servlet	StandardContext	addServletMapping
2	Filter	FilterDef	setFilterName
3	Listener	StandardContext	addApplicationEventListener
4	Filter	StandardContext	setApplicationEventListeners
5	Tomcat–Valve	Pipeline (StandardContext)	addValve
6	WebSocket	WsServerContainer	addEndpoint
7	Executor	AbstractEndpoint	setExecutor
8	Upgrade	httpUpgradeProtocols	put
9	Spring–Controller	RequestMappingHandlerMapping	registerMapping
10	Spring–Interceptor	ArrayList(List)	add
11	Agent	InstrumentationImpl	addTransformer

Table 2. Malicious functions.

Type	No	Class	Method
Command Execution	1	ProcessImpl	start
	2	ProcessBuilder	start
	3	Runtime	exec
File Operations	4	File	delete
	5	Files	newInputStream
			newOutputStream
			newBufferedReader
			newBufferedWriter
Database Operations	6	Driver	connect
	7	Statement	executeQuery
Special Functions	8	Runtime	addShutdownHook
	9	ClassLoader	defineClass
	10	DomainMBean	createShutdownClass
			createStartupClass

Table 3. Top-10 most frequent opcode pairs.

Frequency Rankings	Opcode Adjacency Pairs of Normal Samples	Opcode Adjacency Pairs of Webshell Samples
1	aload_0 -> getfield	ldc_w -> invokevirtual
2	aload_0 -> aload_1	invokevirtual -> aload
3	aload_0 -> invokevirtual	aload -> ldc_w
4	areturn -> aload_0	invokevirtual -> invokevirtual
5	putfield -> aload_0	new -> dup
6	new -> dup	aload -> invokevirtual
7	invokedynamic -> invokevirtual	astore -> aload
8	invokevirtual -> astore	invokevirtual -> astore
9	aload_1 -> putfield	invokevirtual -> ldc_w
10	return -> aload_0	invokestatic -> invokespecial
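A ranking like the one in Table 3 can be produced by counting adjacent opcode pairs over all samples of a class. The sketch below is illustrative only: it assumes each sample has already been disassembled into a list of opcode mnemonics and that pairs are counted within a sample, not across sample boundaries. The function name `top_opcode_pairs` and the toy sequences are hypothetical and not taken from the paper's dataset.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_opcode_pairs(
    sequences: Iterable[List[str]], k: int = 10
) -> List[Tuple[Tuple[str, str], int]]:
    """Count adjacent opcode pairs across all sequences and return the k most frequent."""
    counts: Counter = Counter()
    for seq in sequences:
        # zip(seq, seq[1:]) pairs every opcode with its immediate successor.
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(k)


# Hypothetical toy opcode sequences for demonstration.
toy_sequences = [
    ["aload_0", "getfield", "aload_0", "getfield", "areturn"],
    ["new", "dup", "invokespecial", "aload_0", "getfield"],
]
top_pairs = top_opcode_pairs(toy_sequences, k=3)
```

Running the same function separately on normal and Webshell samples would give two rankings that can be compared side by side, as in the table.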

Table 4. Alert information content.

Type	Warning Message
Time	Request Timestamp
Threat level	High Risk/Low Risk
Suspicious class information	Suspicious Class Loading Path, Filename, and Other Details
Attachments	Webshell Bytecode File

Table 5. Sample sources.

Sample Types	Source
Webshell sample	https://github.com/tennc/Webshell (accessed on 14 September 2024)
	https://github.com/xl7dev/Webshell (accessed on 14 September 2024)
	https://github.com/gxu-yuan/ysrc-back/ (accessed on 20 April 2025)
	https://github.com/threedr3am/JSP-Webshells (accessed on 14 September 2024)
Benign sample	https://github.com/apache/tomcat (accessed on 14 September 2024)
Fileless Webshell sample	https://github.com/java-security/Webshelldataset (accessed on 14 September 2024)
	https://github.com/jweny/MemShellDemo (accessed on 14 September 2024)
	https://github.com/Getshell/Mshell (accessed on 14 September 2024)

Table 6. Fileless Webshell experimental data.

Types	Category	Count
Component	Servlet	12
	Filter	10
	Listener	12
	Spring–Controller	6
	Spring–Interceptor	3
	Tomcat–Valve	2
	WebSocket	2
	Executor	1
	Upgrade	1
Agent	-	7

Table 7. Performance comparison of GAShellBreaker and other detection methods.

Method Source	Accuracy	Precision	Recall	F1
Word2vec-BiGRU [15]	0.9834 ± 0.0023 [0.9778, 0.9890]	0.9718 ± 0.0070 [0.9544, 0.9891]	0.9790 ± 0.0046 [0.9677, 0.9904]	0.9753 ± 0.0035 [0.9667, 0.9840]
BERT-XGBoost [16]	0.9805 ± 0.0024 [0.9746, 0.9864]	0.9699 ± 0.0052 [0.9570, 0.9828]	0.9666 ± 0.0102 [0.9413, 0.9918]	0.9682 ± 0.0029 [0.9609, 0.9755]
CodeBERT-BiGRU [17]	0.9857 ± 0.0040 [0.9758, 0.9955]	0.9722 ± 0.0083 [0.9517, 0.9927]	0.9842 ± 0.0039 [0.9745, 0.9939]	0.9782 ± 0.0060 [0.9633, 0.9931]
Our detector (CNN)	0.9590 ± 0.0162 [0.9187, 0.9992]	0.9597 ± 0.0154 [0.9214, 0.9979]	0.9590 ± 0.0162 [0.9187, 0.9992]	0.9571 ± 0.0180 [0.9125, 1.0000]
Our detector (ResNet50)	0.9910 ± 0.0012 [0.9880, 0.9939]	0.9913 ± 0.0006 [0.9899, 0.9927]	0.9830 ± 0.0086 [0.9618, 1.0000]	0.9870 ± 0.0045 [0.9759, 0.9980]

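Note: Bold values in the original table indicate the best performance under each evaluation metric: our ResNet50 detector for accuracy, precision, and F1, and CodeBERT-BiGRU [17] for recall.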
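Each cell in Table 7 has the form mean ± deviation [lower, upper]. This part of the article does not state how the bracketed intervals were computed. As an assumption only, the sketch below treats them as two-sided 95% Student-t confidence intervals over three runs, clipped to [0, 1]; under that assumption the ResNet50 accuracy row is reproduced to within rounding. The run count, the critical value, and the helper `t_interval` are all hypothetical.

```python
import math

# Two-sided 95% Student-t critical value for 2 degrees of freedom,
# i.e. three runs (an assumption, not stated in the table).
T_CRIT_DF2 = 4.303


def t_interval(mean: float, std: float, n: int = 3, t_crit: float = T_CRIT_DF2):
    """Return (low, high) of a t-based confidence interval, clipped to [0, 1]."""
    half_width = t_crit * std / math.sqrt(n)
    return max(0.0, mean - half_width), min(1.0, mean + half_width)


# ResNet50 accuracy from Table 7: 0.9910 ± 0.0012, reported interval [0.9880, 0.9939].
low, high = t_interval(0.9910, 0.0012)
```

Small differences from the published bounds are expected, because the mean and deviation in the table are themselves rounded to four decimals.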

Table 8. Experiment results of each fileless Webshell case.

Types	Category	GAShellBreaker: Monitoring Probe	GAShellBreaker: Detector	JShellDetector: Suspicious Class Filter	JShellDetector: Webshell Detection
Component	Servlet	12	10	12	9
	Filter	10	10	10	8
	Listener	12	11	12	11
	Tomcat–Valve	2	2	0	2
	WebSocket	2	1	0	1
	Executor	1	1	0	1
	Upgrade	1	1	0	1
Component (Spring)	Controller	6	6	6	4
Component (Spring)	Interceptor	3	2	3	2
Agent	-	7	6	0	6
Count	-	56	50	43	45

Table 9. Comparison with other tools.

Type	GAShellBreaker	JShellDetector	Copagent	OpenRASP
Component	90%	82.5%	77.5%	65%
Component (Spring)	88.9%	66.7%	11.1%	88.9%
Agent	85.71%	85.71%	42.3%	71.4%
Overall detection rate	89.29%	80.36%	62.5%	69.6%
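The GAShellBreaker and JShellDetector percentages in Table 9 follow from the per-category counts in Table 8, taking the sample totals from Table 6. The short sketch below recomputes them as a consistency check. Copagent and OpenRASP are omitted because their per-category counts are not listed, and the helper name `detection_rates` is hypothetical.

```python
# category: (group, total samples, GAShellBreaker detections, JShellDetector detections)
# Totals come from Table 6; detection counts come from Table 8.
rows = {
    "Servlet": ("Component", 12, 10, 9),
    "Filter": ("Component", 10, 10, 8),
    "Listener": ("Component", 12, 11, 11),
    "Tomcat-Valve": ("Component", 2, 2, 2),
    "WebSocket": ("Component", 2, 1, 1),
    "Executor": ("Component", 1, 1, 1),
    "Upgrade": ("Component", 1, 1, 1),
    "Controller": ("Component (Spring)", 6, 6, 4),
    "Interceptor": ("Component (Spring)", 3, 2, 2),
    "Agent": ("Agent", 7, 6, 6),
}


def detection_rates(table, tool_index):
    """Return the detection rate in percent per group, plus an 'Overall' entry."""
    totals, hits = {}, {}
    for group, total, *detected in table.values():
        totals[group] = totals.get(group, 0) + total
        hits[group] = hits.get(group, 0) + detected[tool_index]
    rates = {g: 100.0 * hits[g] / totals[g] for g in totals}
    rates["Overall"] = 100.0 * sum(hits.values()) / sum(totals.values())
    return rates


gashellbreaker_rates = detection_rates(rows, 0)
jshelldetector_rates = detection_rates(rows, 1)
```

For example, GAShellBreaker detects 50 of the 56 fileless samples, which gives the 89.29% overall rate reported in the table.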

Table 10. Performance test results.

Test Methodology	Average Response Time (s)	Performance Overhead (%)
No Security Protection	1.49	-
GAShellBreaker	1.59	6.7%
OpenRASP	1.87	25.5%
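The overhead column in Table 10 is consistent with the relative increase in average response time over the unprotected baseline. A minimal worked example, using only the figures in the table (the function name `overhead_percent` is hypothetical):

```python
def overhead_percent(baseline_s: float, protected_s: float) -> float:
    """Relative increase in average response time over the baseline, in percent."""
    return 100.0 * (protected_s - baseline_s) / baseline_s


baseline = 1.49  # average response time without security protection (s)
gashellbreaker_overhead = overhead_percent(baseline, 1.59)
openrasp_overhead = overhead_percent(baseline, 1.87)
```

Rounded to one decimal place, these evaluate to 6.7% for GAShellBreaker and 25.5% for OpenRASP, matching the table.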

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhang, Y.; Li, D.; Xie, Y. GAShellBreaker: A Novel Method for Java Fileless Webshell Detection Based on Grayscale Images and Deep Learning. Electronics 2025, 14, 1678. https://doi.org/10.3390/electronics14081678
