**1. Introduction**

Compact data structures [1] are examined in this paper as they can provide real-time processing and compression of remote sensing images. These structures store data in a compact form that occupies reduced space, while functions allow each datum or group of data to be accessed and queried directly and efficiently without an initial full decompression. The size of such a structure should also be close to the information-theoretic minimum. The idea was explored and examined by Guy Jacobson in his doctoral thesis in 1988 [2] and in a paper he published a year later [3]. Earlier works had expressed similar ideas; however, Jacobson's paper is often considered the starting point of this topic. Since then, the field has gained more attention and a number of research papers have been published. Algorithms such as the FM-index [4,5] and the Burrows-Wheeler transform [6] were proposed, and applications were released, notable examples of which include bzip2 (https://linux.die.net/man/1/bzip2), Bowtie [7], and SOAP2 [8]. One advantage of using compact data structures is that the compressed data can be loaded into main memory and accessed directly; the smaller compressed size also helps data move through communication channels faster. Another advantage is that there is no need to repeatedly compress and decompress the data, as is the case with data compressed by a classical compression algorithm such as gzip or bzip2, or by a specialized algorithm such as CCSDS 123.0-B-1 [9] or KLT+JPEG 2000 [10,11]. The resulting image has the same quality as the original.

Hyperspectral images are image data that contain many bands from across the electromagnetic spectrum. They are usually captured by hyperspectral satellite and airborne sensors. Data are extracted from certain bands of the spectrum to help us find objects of specific interest, such as oil fields and minerals. However, due to their large size and the huge amount of data that have been collected, hyperspectral images are normally compressed by lossy or lossless algorithms to save space. In the past several decades, much research has gone into keeping storage sizes to a minimum. To retrieve the data, however, it is still necessary to decompress all of it. With our approach using compact data structures, we can query the data without fully decompressing them in the first place, and this is the main motivation for this work.

Prediction is one of the schemes used in lossless compression. CALIC (Context Adaptive Lossless Image Compression) [12,13] and 3D-CALIC [14] belong to this class of scheme. In 1994, Wu et al. introduced CALIC, which uses both context modeling and prediction of the pixel values. In 2000, the same authors proposed a related scheme called 3D-CALIC, in which the predictor was extended to pixels across bands. Later, in 2004, Magli et al. [15] proposed M-CALIC, whose algorithm is related to 3D-CALIC. All these methods take advantage of the fact that in a hyperspectral image, neighboring pixel values in the same band are usually close to each other (spatial correlation), and even more so for neighboring pixels of two neighboring bands (spectral correlation).

Differential encoding is another way of encoding an image, obtained by taking the difference between neighboring pixels; in this work, it is treated as a special case of the predictive method that exploits only the spectral correlation. This correlation between pixels weakens as the bands grow further apart, so its effectiveness is expected to decrease when the bands are far from each other.
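As a minimal sketch (pure Python, with hypothetical function names; not the actual encoder evaluated in this work), band-wise differential encoding and its lossless inverse can be written as follows, where a cube is a list of bands and each band is a flat list of pixel values:

```python
def diff_encode(cube):
    """Store the first band verbatim; replace every later band with its
    pixel-wise difference from the previous band (spectral residuals)."""
    encoded = [list(cube[0])]
    for prev, cur in zip(cube, cube[1:]):
        encoded.append([c - p for p, c in zip(prev, cur)])
    return encoded

def diff_decode(encoded):
    """Invert diff_encode by cumulatively adding the residual bands."""
    decoded = [list(encoded[0])]
    for residuals in encoded[1:]:
        decoded.append([p + r for p, r in zip(decoded[-1], residuals)])
    return decoded

# Two spectrally correlated 2 x 2 bands: the residuals are small,
# which is what makes the differenced cube easier to compress.
cube = [[100, 102, 98, 101], [101, 103, 99, 103]]
enc = diff_encode(cube)
assert enc[1] == [1, 1, 1, 2]       # small spectral residuals
assert diff_decode(enc) == cube     # lossless round trip
```

Because neighboring bands are highly correlated, the residual values cluster near zero and can be stored in fewer bits than the raw samples.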

The latest studies on hyperspectral image compression, both lossy and lossless, are focused on CCSDS 123.0, vector quantization, Principal Component Analysis (PCA), JPEG2000, and the Lossy Compression Algorithm for Hyperspectral Image Systems (HyperLCA), among many others. Some of these research works are listed in [16–19]. In this work, however, we investigate lossless compression of hyperspectral images through the proposed *k*²-raster for 3D images, which is a compact data structure that can provide bit-rate reduction as well as direct access to the data without full decompression. We also explore the use of a predictor and a differential encoder as preprocessing on the compact data structure to see if it can provide us with further bit-rate reduction. The predictive method and the differential method are also compared. The flow chart shown in Figure 1 depicts how the encoding/decoding of this proposal works.

This paper is organized as follows: In Section 2, we present the *k*²-raster and discuss it in detail, beginning with the quadtree, followed by the *k*²-tree and the *k*²-raster. Later in the same section, details of the predictive method and the differential method are discussed. Section 3 shows the experimental results on how the two methods fare using the *k*²-raster on hyperspectral images, and more results on how other factors such as different *k*-values can affect the bit rates. Finally, we present our conclusions in Section 4.

**Figure 1.** A flow chart showing the encoding and decoding of this coder.

#### **2. Materials and Methods**

One way to build a structure that is small and compact is to use a tree structure without pointers. Pointers usually take up a large amount of space, each one occupying on the order of 32 or 64 bits on most modern machines. A tree structure with *n* pointers has a storage cost of O(*n* log *n*) bits, whereas a pointer-less tree occupies only O(*n*) bits. In pointer-less trees, the rank and select functions [3] are used to access the elements of the structure, and they require only simple arithmetic to find a parent's or child's position. This is the premise on which compact data structures are based. In this work, we use the *k*²-raster from Ladra et al. [20], a concept developed from the *k*²-tree, also a type of compact data structure, as well as from the idea of recursive decomposition used in quadtrees. The results of the *k*²-raster were quite favorable for the data sets that were used. We therefore extend their approach and investigate whether that structure can be used for 3D hyperspectral images. The Results section will show that the results are quite competitive compared with other commonly used classical compression techniques, with a bit-rate reduction of up to 55% for the test images. With further experimentation using predictive and differential preprocessing, an additional bit-rate reduction of up to 64% can be attained. For that reason, we propose in this paper an encoder that applies the predictive or differential method to the *k*²-raster for hyperspectral images.

#### *2.1. Quadtrees*

Quadtree structures [21], which have been used in many kinds of data representation such as image processing and computer graphics, are based on the principle of recursive decomposition. As there are many variants of quadtrees, we describe the one pertinent to our discussion: the region quadtree. Basically, a quadtree is a tree structure in which each internal node has 4 children. Given a 2D square matrix, it is partitioned recursively into four equal subquadrants. In the tree built to represent this, the root node at level 0 has 4 children at level 1, each child representing one subquadrant. Next, if a subquadrant is larger than 2 × 2, it is partitioned into 4 further subquadrants, and a new level 2 is added to the tree, and so on recursively. Note that the tree nodes are traversed in left-to-right order.

Consider a matrix of size *n* × *n* where *n* is a power of 2: it is recursively divided until each subquadrant has a size of 2 × 2. For example, if the size of the matrix is 8 × 8, the recursive division of the matrix yields 8²/2² = 16 subquadrants. It should be noted that the value of *n* in the image matrix needs to be a power of 2. Otherwise, the matrix has to be enlarged widthwise and heightwise to the next power of 2, with the additional pixels padded with zeros. As *k*²-trees are based on quadtrees, the division and the resulting tree of a quadtree are very similar to those of a *k*²-tree. Figure 2 illustrates how a quadtree's recursive partitioning works.
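The padding and recursive four-way split described above can be sketched in Python as follows (the helper names are our own, chosen for illustration):

```python
def next_pow2(n):
    """Smallest power of 2 that is greater than or equal to n."""
    p = 1
    while p < n:
        p *= 2
    return p

def pad_matrix(mat):
    """Enlarge a matrix widthwise and heightwise to the next power of 2,
    padding the additional pixels with zeros."""
    n = next_pow2(max(len(mat), len(mat[0])))
    return [[mat[r][c] if r < len(mat) and c < len(mat[0]) else 0
             for c in range(n)] for r in range(n)]

def count_leaf_quadrants(size, leaf=2):
    """Recursively split a size x size quadrant (size a power of 2) into
    four equal subquadrants until the leaf size is reached, and count
    the resulting leaf subquadrants."""
    if size == leaf:
        return 1
    return 4 * count_leaf_quadrants(size // 2, leaf)

padded = pad_matrix([[1] * 6 for _ in range(6)])  # a 6 x 6 matrix grows to 8 x 8
assert len(padded) == 8 and len(padded[0]) == 8
assert count_leaf_quadrants(8) == 16              # matches the 8 x 8 example above
```

Each recursive split contributes one level of 4-ary branching to the tree, which is why an 8 × 8 matrix ends with 8²/2² = 16 leaf subquadrants.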

**Figure 2.** A graph of 6 nodes (top) with its 8 × 8 binary adjacency matrix at various stages of recursive partitioning. At the bottom, a *k*²-tree (*k* = 2) is constructed from the matrix.

#### *2.2. LOUDS*

The *k*²-tree is based on unary encoding and LOUDS (Level-Order Unary Degree Sequence), a compact data structure introduced by Guy Jacobson in his thesis and paper [2,3]. A bit string is formed by a breadth-first traversal (going from left to right) of an ordinal (rooted, ordered) tree structure. Each node is encoded with a string of '1' bits whose length indicates the number of children it has, and each string ends with a '0' bit. If the node has no children, a single '0' bit suffices.
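A minimal sketch of this encoding in Python (the dictionary-based tree representation is our own choice for illustration; some presentations of LOUDS also prepend a virtual super-root "10", which we omit here):

```python
from collections import deque

def louds_bits(tree, root):
    """Encode an ordinal tree (dict: node -> ordered list of children)
    as a LOUDS bit string: for each node in breadth-first, left-to-right
    order, emit one '1' per child followed by a terminating '0'."""
    bits, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        children = tree.get(node, [])
        bits.extend([1] * len(children) + [0])
        queue.extend(children)
    return bits

# Tree: a has children b and c; b has child d; c and d are leaves.
tree = {'a': ['b', 'c'], 'b': ['d']}
assert louds_bits(tree, 'a') == [1, 1, 0, 1, 0, 0, 0]
```

The resulting string uses 2*n* + 1 bits for an *n*-node tree (each node contributes one '1' as a child and one terminating '0', plus the root's bits), far below the O(*n* log *n*) bits a pointer-based layout would need.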

The parent and child relationships can be computed by two cornerstone functions for compact data structures: rank and select. These functions give us information about a node's first child, next sibling(s), and parent, without the need for pointers. They are described below:

- rank*b*(*m*): returns the number of occurrences of bit *b* in the bit string up to and including position *m*.
- select*b*(*m*): returns the position of the *m*-th occurrence of bit *b* in the bit string.

By default, *b* is 1, i.e., rank(*m*) = rank₁(*m*). These operations are partial inverses of each other: rank(select(*m*)) = *m* always holds, and select(rank(*m*)) = *m* holds whenever position *m* contains a '1' bit. Since a linear scan is required to process the rank and select functions, the worst-case time complexity is O(*n*).
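The two functions can be sketched as straightforward linear scans over a 1-indexed bit string (faster o(*n*) implementations with auxiliary tables exist, but are beyond this illustration):

```python
def rank(bits, m, b=1):
    """Number of positions 1..m (1-indexed) holding bit b: an O(n) scan."""
    return sum(1 for bit in bits[:m] if bit == b)

def select(bits, m, b=1):
    """Position (1-indexed) of the m-th occurrence of bit b, or -1 if
    there are fewer than m occurrences."""
    count = 0
    for i, bit in enumerate(bits, start=1):
        if bit == b:
            count += 1
            if count == m:
                return i
    return -1

bits = [1, 1, 0, 1, 0, 0, 1, 0, 0]
assert rank(bits, 7) == 4                  # four '1's among the first seven bits
assert select(bits, 4) == 7                # the 4th '1' sits at position 7
assert rank(bits, select(bits, 3)) == 3    # rank undoes select
```

Note that select(rank(bits, 3)) here returns 4, not 3, since position 3 holds a '0': select only inverts rank at positions that contain a '1' bit.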

To clarify how these functions work, consider the binary tree depicted in Figure 3, where the version on the left shows the values and the one on the right shows the numbering of the same tree. If a node has two children, its value is set to 1; otherwise, it is set to 0. The values of this tree are put into the bit string shown in Figure 4. Figure 5 shows how the position of the left child, right child, or parent of a given node *m* is computed with the rank and select functions. An example follows:

To find the left child of node 8, we first compute rank(8), the total number of 1s from node 1 up to and including node 8, which is 7. Therefore, the left child is located at 2\*rank(8) = 2\*7 = 14 and the right child at 2\*rank(8)+1 = 2\*7+1 = 15. The parent of node 8 can be found by computing select(8/2) = select(4). The answer is obtained by counting the '1' bits starting from node 1, skipping the nodes with '0' bits. When we reach node 4, the count of '1' bits is 4, so node 4 is the parent of node 8.
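These navigation rules can be sketched in Python as below, with the linear-scan rank and select repeated so the snippet is self-contained. The bit string here describes a hypothetical 9-node tree of this form (bit *m* is 1 iff node *m* has two children), not the tree of Figure 3:

```python
def rank(bits, m):
    """Number of '1' bits among positions 1..m (1-indexed)."""
    return sum(bits[:m])

def select(bits, m):
    """Position (1-indexed) of the m-th '1' bit, or -1 if absent."""
    count = 0
    for i, bit in enumerate(bits, start=1):
        count += bit
        if bit and count == m:
            return i
    return -1

def left_child(bits, m):  return 2 * rank(bits, m)
def right_child(bits, m): return 2 * rank(bits, m) + 1
def parent(bits, m):      return select(bits, m // 2)

# Hypothetical tree: nodes 1, 2, 4, and 7 each have two children.
bits = [1, 1, 0, 1, 0, 0, 1, 0, 0]
assert left_child(bits, 7) == 8    # rank(7) = 4, so children are 8 and 9
assert right_child(bits, 7) == 9
assert parent(bits, 8) == 7        # select(8 // 2) = select(4) = 7
```

Checking consistency both ways, parent(left_child(bits, *m*)) returns *m* for any internal node *m*, which is exactly the pointer-free navigation the formulas in Figure 5 provide.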

**Figure 3.** A binary tree example for LOUDS. The version on the left shows the values of the nodes and the one on the right shows the same tree with the numbering of the nodes in left-to-right order. In this case, the numbering starts with 1 at the root.


**Figure 4.** A bit string with the values from the binary tree in Figure 3.


**Figure 5.** With the rank and select functions listed in the first column, we can navigate the binary tree in Figure 3 and compute the position of the left child, right child, or parent of a node.

In the next section, we will explain how the rank function can be used to determine the children's positions in a *k*²-tree, thus enabling us to query the values of the cells.
