#### *6.3. Feature Extraction and Selection*

Machine learning techniques are increasingly used to detect ransomware because they can learn behavior patterns and detect anomalies. This section discusses the features used for machine-learning-based ransomware detection and the techniques used to select them, such as principal component analysis and correlation analysis [18,47].

#### 6.3.1. Features Used for Ransomware Detection

Several features can be used for ransomware detection. The most common include the following:


5. Static analysis is the examination of an executable file's code, without running it, to spot malicious activity. Features such as code size, entropy, and string patterns can be used for this purpose. For example, analysis of code size and entropy may reveal that a file contains obfuscated code, which could indicate ransomware activity [32].

Behavioral analysis and dynamic analysis are similar in that both monitor running processes to identify malicious activity. However, there are some key differences between the two approaches.

Behavioral analysis involves monitoring the behavior of running processes on a system to identify anomalies that indicate malicious activity. This is typically carried out in real time, allowing ransomware to be detected as it executes on a system. In contrast, dynamic analysis involves running an executable file in a controlled environment, such as a sandbox, to observe its behavior and identify any malicious activity. This is typically conducted before the executable file is deployed on a production system.
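As a minimal sketch of what a real-time behavioral feature could look like, the following Python class counts how many file writes each process has performed within a sliding time window. The class name `WriteRateTracker`, the method `record_write`, and the choice of file writes as the monitored event are assumptions made for illustration; they are not taken from the works surveyed here, and a real monitor would receive its events from operating-system instrumentation.

```python
from collections import defaultdict, deque


class WriteRateTracker:
    """Track per-process file-write counts over a sliding time window.

    Hypothetical illustration: the resulting count could serve as one
    behavioral feature for a classifier or an anomaly detector.
    """

    def __init__(self, window_seconds: float = 10.0):
        self.window_seconds = window_seconds
        # Maps a process ID to the timestamps of its recent write events.
        self._events = defaultdict(deque)

    def record_write(self, pid: int, timestamp: float) -> int:
        """Record one write event and return the number of writes in the window."""
        events = self._events[pid]
        events.append(timestamp)
        # Discard events that are older than the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events)
```

A sudden rise in this count for a single process, for example many files rewritten within a few seconds, is the kind of anomaly that real-time behavioral analysis is intended to surface.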

Confusion between static and dynamic analysis may arise because both approaches analyze executable files, but they do so in different ways. Static analysis inspects the executable file's code without running it, while dynamic analysis runs the executable file in a controlled environment to observe its behavior.

Dynamic analysis can be performed in real time, but it can also be conducted in a sandbox before the executable file is deployed on a production system. In a sandbox, the file runs in isolation, so its behavior can be monitored and analyzed without affecting the production system. Once the analysis is complete, the results can be used to determine whether the executable file is malicious or benign.

In the case of ransomware, real-time behavioral analysis is typically the preferred approach for detecting and responding to attacks. However, dynamic analysis can also be useful for identifying new and previously unseen ransomware variants, and the findings can then be used to improve the effectiveness of real-time behavioral analysis.

By using these features, machine-learning-based ransomware-detection methods can achieve high detection rates and low false-positive rates.
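To make the static features mentioned above concrete (code size, entropy, and string patterns), the following Python sketch extracts them from the raw bytes of a file. It is an illustration under stated assumptions, not a method from the cited works: the function names and the keyword list are hypothetical, and entropy is computed as Shannon entropy over byte frequencies.

```python
import math
import re
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of a byte string in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Extract runs of printable ASCII characters of at least min_len bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def static_features(
    data: bytes,
    keywords: tuple[str, ...] = ("encrypt", "bitcoin", ".onion"),  # hypothetical list
) -> dict:
    """Build a small feature dictionary from the raw bytes of a file."""
    strings = [s.lower() for s in printable_strings(data)]
    return {
        "size_bytes": len(data),
        "entropy": shannon_entropy(data),
        "string_count": len(strings),
        # Number of extracted strings containing at least one keyword.
        "keyword_hits": sum(any(k in s for k in keywords) for s in strings),
    }
```

Entropy values close to 8 bits per byte suggest compressed or encrypted content, which is one reason entropy is commonly paired with size when looking for obfuscated code.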

#### 6.3.2. Feature Selection Techniques
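As a hedged illustration of correlation analysis, one of the two selection techniques named in the introduction to Section 6.3, the sketch below drops any feature that is almost perfectly correlated with a feature already kept. The function names, the greedy ordering, and the 0.95 threshold are assumptions chosen for the example. Principal component analysis, the other technique named there, requires a linear algebra library and is not shown.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either column is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features: dict[str, list[float]], threshold: float = 0.95) -> list[str]:
    """Keep features in order, skipping any whose |r| with a kept feature exceeds threshold."""
    kept: list[str] = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

For instance, if a file's size were recorded both in kilobytes and in bytes, the two columns would be perfectly correlated and the second would be dropped, while an unrelated entropy column would be retained.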


#### *6.4. Performance Evaluation of Machine Learning Models for Ransomware Detection*

Evaluating machine learning models for ransomware detection is crucial to determining how effectively they detect ransomware and prevent its spread. This section discusses the metrics used to measure the performance of these models, including accuracy, precision, recall, F1-score, and the ROC curve.
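As a small worked sketch, four of the metrics listed above can be computed directly from the counts of a binary confusion matrix, with ransomware treated as the positive class. The function name and the convention of returning 0.0 when a denominator is zero are assumptions made for this example; the ROC curve is omitted because it requires classifier scores rather than counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute accuracy, precision, recall, and F1-score from confusion-matrix counts.

    tp: ransomware samples correctly flagged
    fp: benign samples wrongly flagged
    tn: benign samples correctly passed
    fn: ransomware samples missed
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

With 90 true positives, 10 false positives, 880 true negatives, and 20 false negatives, this gives an accuracy of 0.97 and a precision of 0.90, with a recall of roughly 0.82.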


The proportion of files flagged as ransomware that are truly ransomware is known as precision. A model with a high precision score produces few false positives among the files it flags, making it less likely to mistakenly label benign files as ransomware [52].

