**5. Performance Evaluation**

In this section, we evaluate the proposed approach in terms of several performance metrics. To conduct our experiments, we use a repository of 500 .odt files collected from different sources, with sizes ranging from 10 KB to 30 MB. All experiments were conducted on an Ubuntu 18.04 machine with a 1.8 GHz Intel Core i5 processor and 4 GB of RAM. A protected version of each file was created by running a shell script that performs all the steps of the proposed framework discussed in Section 4. We performed multiple experiments to measure the performance of the proposed SH-VARR framework. SH-VARR uses zip/unzip for file compression/decompression, as this is the default algorithm used with XML-based documents; nevertheless, SH-VARR retains the flexibility to operate with any other compression algorithm. Therefore, three compression algorithms (zip, gzip, and bzip2) were investigated under our experimental setup. We evaluate the proposed SH-VARR framework with respect to storage overhead, time requirements, CPU utilization, and memory usage.
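As an illustrative sketch only (not the exact script used in our experiments), the following Bash driver shows how such a batch run can be organized. The `create_snapshot.sh` helper, the corpus path, and the output layout are placeholders:

```bash
#!/bin/bash
# Illustrative batch driver (a sketch, not the exact experiment script).
# create_snapshot.sh is a hypothetical helper that performs the framework
# steps of Section 4 and writes the protected snapshot to the given path.
CORPUS=./odt_corpus            # placeholder path to the 500 .odt files
RESULTS=results.csv
echo "file,algorithm,orig_bytes,snap_bytes,elapsed_ms" > "$RESULTS"

for f in "$CORPUS"/*.odt; do
  for alg in zip gzip bzip2; do
    start=$(date +%s%3N)                      # wall-clock start (ms, GNU date)
    ./create_snapshot.sh "$f" "$alg" "$f.$alg.snap"
    end=$(date +%s%3N)                        # wall-clock end (ms)
    orig=$(stat -c %s "$f")                   # original size in bytes
    snap=$(stat -c %s "$f.$alg.snap")         # snapshot size in bytes
    echo "$f,$alg,$orig,$snap,$((end - start))" >> "$RESULTS"
  done
done
```

Each row records the original and snapshot sizes together with the elapsed wall-clock time, which the storage and time measurements below draw on.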

Creating a protected version of a file (i.e., a snapshot) is a major step in our framework and results in extra storage requirements. Hence, our objective is to quantify this storage overhead, which depends mainly on the compression algorithm used to create the snapshot. Figure 6a–c show how the storage overhead grows with the original file size when using the zip, gzip, and bzip2 compression algorithms, respectively, and Figure 6d illustrates all cases together for the purpose of comparison. Generally, as the file size increases, the size of the resulting snapshot increases proportionately; with that said, the snapshot remains smaller than the original file. The comparison makes it quite evident that the bzip2-based SH-VARR slightly outperforms the other two versions in storage. However, it consumes more time, as we discuss next, which implies a trade-off between time and storage overhead. Given the low cost of storage in today's technologies, the time required to create a protected snapshot may be the more pronounced factor.

**Figure 6.** Storage overhead by SH-VARR snapshot based on three compression algorithms. (**a**) Using zip algorithm; (**b**) Using gzip algorithm; (**c**) Using bzip2 algorithm; (**d**) All algorithms.
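The storage overhead in Figure 6 can be summarized as a per-algorithm size ratio. A minimal sketch, assuming the hypothetical `results.csv` produced by the driver above:

```bash
# Sketch: per-algorithm storage ratio (snapshot bytes / original bytes),
# aggregated over the whole corpus from the hypothetical results.csv.
awk -F, 'NR > 1 { orig[$2] += $3; snap[$2] += $4 }
         END    { for (a in orig) printf "%s: %.3f\n", a, snap[a] / orig[a] }' results.csv
```

A ratio below 1.0 confirms that each snapshot stays smaller than the file it protects.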

The proposed SH-VARR framework involves several steps to create a protected snapshot for each file version, so it is important to measure the time required to perform this operation. Figure 7a–c show how the time requirement grows with the original file size when creating the snapshot with the zip, gzip, and bzip2 compression algorithms, respectively; Figure 7d illustrates all cases together for comparison purposes. Creating a protected version of a small file (e.g., less than 1 MB) takes a negligible amount of time, on average not exceeding 120 ms. For larger files exceeding 10 MB, however, more time is required. The measured time varies because compression depends on the amount of redundancy in each file and on the type of content (text, images, etc.) it contains. The zip and gzip algorithms yield fairly comparable results, and both perform much better than bzip2; in fact, bzip2 consumes considerably more time to create the protected version, especially for large files.

**Figure 7.** Time requirement for SH-VARR snapshot based on three compression algorithms. (**a**) Using zip algorithm; (**b**) Using gzip algorithm; (**c**) Using bzip2 algorithm; (**d**) All algorithms.
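Because the sub-second timings of small files are noisy, each measurement is best averaged over several repetitions. A sketch, again using the hypothetical `create_snapshot.sh` helper and a placeholder `sample.odt`:

```bash
# Sketch: average snapshot-creation time over N repetitions per algorithm.
N=5
for alg in zip gzip bzip2; do
  total=0
  for i in $(seq "$N"); do
    start=$(date +%s%3N)
    ./create_snapshot.sh sample.odt "$alg" /tmp/sample.snap
    end=$(date +%s%3N)
    total=$((total + end - start))
  done
  echo "$alg: avg $((total / N)) ms"
  rm -f /tmp/sample.snap
done
```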

Figure 8a–c show how CPU utilization varies with the original file size when creating the snapshot in the proposed SH-VARR scheme with the zip, gzip, and bzip2 compression algorithms, respectively; Figure 8d illustrates all cases together for the purpose of comparison. Here, CPU utilization is the share of processor time consumed while creating a protected version of each file. Generally, for small files, CPU utilization increases with file size; for larger files, it levels off to a roughly constant value. By monitoring the CPU utilization of each snapshot-creation job, we observed that bzip2 consistently produced the highest CPU utilization.
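One way to capture such per-job CPU figures (a sketch; the tooling in our runs may differ) is GNU time's verbose report, whose "Percent of CPU this job got" line equals user plus system time divided by wall-clock time:

```bash
# Sketch: CPU share of one snapshot job via GNU time (/usr/bin/time,
# not the shell built-in). create_snapshot.sh is the hypothetical helper.
for alg in zip gzip bzip2; do
  pct=$(/usr/bin/time -v ./create_snapshot.sh sample.odt "$alg" /tmp/s.snap 2>&1 \
        | grep 'Percent of CPU')
  echo "$alg -> $pct"
done
```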

Figure 9a–c show how memory usage changes with the original file size when creating the snapshot in the proposed SH-VARR scheme with the zip, gzip, and bzip2 compression algorithms, respectively; Figure 9d illustrates all cases together for comparison purposes. For zip and gzip, memory usage is almost fixed (around 6.8 KB) and shows no dependence on file size. For bzip2, memory usage increases with file size and then remains constant (around 28 KB) for large files. This is because none of the compression algorithms in our assessment (zip, gzip, and bzip2) loads the entire file into memory; instead, each processes the file as a stream, consuming a fixed-size chunk of data at a time, and the amount of memory needed depends on the compression method used and the file size involved.
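Peak resident set size is one common proxy for per-job memory usage; the values in Figure 9 are small enough that they likely reflect the compressors' stream-buffer sizes rather than whole-process memory, so the GNU time sketch below is only one way to obtain comparable numbers:

```bash
# Sketch: peak memory (maximum resident set size) of one bzip2 snapshot
# job. Max RSS covers the whole process image, so absolute values may
# differ from the stream-buffer figures reported in Figure 9.
/usr/bin/time -v ./create_snapshot.sh sample.odt bzip2 /tmp/s.snap 2>&1 \
  | grep 'Maximum resident set size'
```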

**Figure 8.** CPU utilization by SH-VARR snapshot based on three compression algorithms. (**a**) Using zip algorithm; (**b**) Using gzip algorithm; (**c**) Using bzip2 algorithm; (**d**) All algorithms.

**Figure 9.** Memory usage by SH-VARR snapshot based on three compression algorithms. (**a**) Using zip algorithm; (**b**) Using gzip algorithm; (**c**) Using bzip2 algorithm; (**d**) All algorithms.

Finally, we compare the proposed mechanism with the work presented in [54,55]. In [55], the authors presented a ransomware-protection framework that depends on a network connection to back up files on a local or remote server; however, they did not provide any performance evaluation of their framework in terms of time and storage requirements. In [54], the authors proposed backing up critical data in a fully isolated spare space that ransomware cannot reach, regardless of the privileges it obtains. The authors assumed that the computing device has a portion of extra space that can be used to create a backup volume storing encoded files with reverse deltas. This differs from the proposed work, which can hold both reverse deltas and complete snapshots of files. We also used compression techniques to utilize storage more efficiently. Moreover, our proposed work is portable because it can be shipped as a plugin attached to documents, a feature that neither [54] nor [55] supports.
