1. Introduction
Matter, energy, and information are the three fundamental elements constituting the objective world. Information technology plays a key role in humanity’s perception of the objective world, and infrared imaging technology is one of its vital components. Infrared imaging technology encompasses types such as line-array detector scanning imaging and area-array detector staring imaging [
1]. The area-array detector captures the entire field of view at once and is commonly used in applications such as night vision devices, video surveillance, and others. In contrast, the line-array detector captures panoramic images by rotating 360 degrees, making it ideal for large-field applications like airborne small-target detection and border security surveillance. As infrared imaging technology advances toward higher resolution, faster frame rates, and greater pixel bit depth, high-rate data transmission and storage face corresponding challenges. For example, a 14-bit line-scan image with 3072 pixels per column and approximately 60,000 columns per panoramic image has an imaging cycle of about 25 microseconds per column, leading to an imaging rate of 240 MB/s and a frame size of 350 MB. Lossless data compression, which can reduce data redundancy, is essential for improving data transmission and storage efficiency.
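As a quick check of the quoted figures, the following sketch reproduces the arithmetic under our own assumption that each 14-bit sample is padded into a 16-bit (2-byte) word; the rounding to "240 MB/s" and "350 MB" depends on the MB convention used.

```python
# Sketch of the data-rate arithmetic, assuming 14-bit samples stored in 16-bit words.
PIXELS_PER_COLUMN = 3072
COLUMNS_PER_FRAME = 60_000
BYTES_PER_PIXEL = 2            # 14-bit sample padded to 16 bits
COLUMN_PERIOD_S = 25e-6        # one column imaged roughly every 25 microseconds

rate = PIXELS_PER_COLUMN * BYTES_PER_PIXEL / COLUMN_PERIOD_S   # bytes per second
frame = PIXELS_PER_COLUMN * COLUMNS_PER_FRAME * BYTES_PER_PIXEL

print(f"imaging rate ~ {rate / 2**20:.0f} MiB/s")   # ~234 MiB/s (~246 MB/s decimal)
print(f"frame size  ~ {frame / 2**20:.0f} MiB")     # ~352 MiB (~369 MB decimal)
```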
Image compression methods exploit the natural compressibility of images by utilizing correlations between adjacent pixels to reduce redundancy and concentrate information; entropy coding is then applied to remove coding redundancy and achieve further compression. Image transformation and prediction are widely used techniques for reducing image redundancy. To complement these approaches, dictionary-based compression methods, such as Lempel–Ziv–Welch (LZW), Lempel–Ziv 77 (LZ77), and Lempel–Ziv 78 (LZ78), are also employed [
2,
3]. These methods reduce redundancy by identifying and encoding repeated patterns or sequences in the image data, which can be especially effective for certain types of image content. Transform-based image compression converts images into a domain with less redundancy, where information is concentrated and easier to encode. The image is usually transformed into the frequency or spatial domain and then encoded using the properties of the transform coefficients, such as image compression based on discrete wavelet transform (DWT) [
4,
5,
6], discrete cosine transform (DCT) [
7], and integer discrete Tchebichef transform [
8]. Prediction-based image compression exploits the correlation between pixels. The residual image, obtained by subtracting the predicted image from the original image, contains less redundant information. The residual image typically has a narrow range of pixel values, making it more effectively compressed through entropy coding techniques such as Huffman coding [
9] or arithmetic coding [
10]. Typical prediction-based compression includes DPCM [
11] and LOCO-I [
12]. Predictive techniques are typically integrated with transformations, such as utilizing DPCM for DC coefficients in the DCT within the JPEG, or applying DPCM to wavelet coefficients in JPEG 2000. This amalgamation allows for further compression by exploiting the inter-coefficient correlation. The intra prediction coding in video coding standards such as H.264/AVC and High Efficiency Video Coding (HEVC) also utilizes DPCM [
13,
14]. Dictionary-based compression methods dynamically build or reference predefined dictionaries of patterns, replacing repeated sequences with shorter symbols to achieve efficient compression without relying on transformation-specific knowledge. However, the data encoded by the dictionary may still contain redundancy, such as certain symbols or indices appearing more frequently. To address this, entropy coding is applied to further compress the data by assigning shorter codes to more frequently occurring symbols, such as in the Lempel–Ziv–Markov chain Algorithm (LZMA) and Deflate [
15,
16]. Dictionary-based lossless image compression has inherent limitations: these methods rely on repeated patterns within the data and therefore cannot fully exploit, or eliminate, the spatial redundancy in images. Furthermore, dictionary encoding lacks sufficient capability to handle fine detail. These limitations become particularly evident in 14-bit line-scan panoramic images, which feature a large number of source symbols, a wide symbol range, abundant detail, complex structures, and rapidly changing scenes.
Traditional image compression methods, such as JPEG, JPEG-LS, and JPEG 2000 [
17,
18,
19,
20,
21], are designed to minimize perceived quality loss by the human visual system (HVS) while reducing data transmission rates to improve the efficiency of image transmission and storage. JPEG (Joint Photographic Experts Group) is a widely used lossy image compression standard that reduces file sizes by discarding information that is less noticeable to the human eye, often resulting in some loss of image quality. JPEG-LS (JPEG Lossless and Near-Lossless Compression), on the other hand, is a standard for lossless or near-lossless compression, providing higher compression ratios while maintaining the original image quality. In recent years, more advanced image compression schemes have been continuously proposed and developed. For instance, ref. [
22] proposed a compressive sensing-based image compression system. Nevertheless, the advantages of such schemes over JPEG have not been significant enough to justify the creation of new standards.
Image compression includes both lossy and lossless methods. JPEG is a lossy image compression method designed for lower bit-depth images, specifically for 8-bit images. Ref. [
22] is essentially a lossy compression method. However, lossy compression may not be suitable for many imaging applications that require high precision, such as hyperspectral imaging and infrared weak target detection. Since lossless compression methods can fully recover the original data, they are more favored in these applications. JPEG 2000 offers lossless compression capabilities, achieving efficient compression at the cost of increased computational and memory resource consumption. Its complexity arises from the implementation of the 5/3 lifting wavelet transform, bit-plane coding, and MQ coding, which refers to a context-based arithmetic coding technique used to efficiently encode the quantized wavelet coefficients in the image, improving compression performance by exploiting the statistical dependencies between the coefficients. JPEG-LS achieves a low-complexity lossless compression that is easy to implement in hardware, using simple prediction, context modeling, and Golomb coding. This approach sacrifices compression efficiency in favor of speed improvement. The performance of JPEG-LS improves with simpler image scenes, whereas in more complex scenes, the prediction and context updating become more intricate, leading to a decrease in compression speed. JPEG-LS is only applicable to 8-bit and 12-bit images and is not suitable for images with higher bit depths. The complexity of image redundancy removal and entropy coding, along with limitations in pixel bit depth, restricts the application of JPEG-based algorithms in line-scan imaging with high data rates and high pixel bit depths.
This study focuses on 14-bit line-scan infrared panoramic images. Unlike traditional area-array images, in these images each row is generated by the same photosensitive element, leading to stronger inter-column correlation. Conventional JPEG-series compression methods do not take the characteristics of line-scan images into account. Based on the characteristics of 14-bit line-scan infrared panoramic images, this paper analyzes the feasibility of removing spatial redundancy through inter-column differencing. Inter-column differencing DPCM prediction is used in place of the complex wavelet transform in JPEG 2000 to remove spatial redundancy from the image. This paper also designs an improved Huffman coding scheme to replace the complex entropy coding in JPEG 2000; the improved Huffman coding simplifies the compression process by using a code table. Additionally, a method for generating the code table is proposed, which avoids the pixel statistics step of entropy coding. Based on the proposed methods, a low-complexity, code-table-based lossless compression algorithm is ultimately implemented using a simple lookup. The structure of this paper is as follows:
Section 2 introduces related work on Huffman coding,
Section 3 analyzes methods for redundancy removal in line-scan infrared panoramic images,
Section 4 presents the code-table-based lossless compression method proposed in this paper,
Section 5 provides the experimental results, and
Section 6 concludes the paper.
2. Related Works
The speed and efficiency of image compression are common objectives in both academia and industry. There are two main approaches to improving the encoder speed: hardware acceleration and designing low-complexity algorithms. The speed of the encoder is dependent on the processor’s performance, and the slowdown in processor performance improvements has driven the development of parallel architecture processors. As a result, image encoding algorithms have transitioned from single-threaded to multi-threaded algorithms [
23,
24]. However, on embedded platforms with limited hardware resources, it is crucial to design lossless compression algorithms that support higher bit depths at low computational complexity.
The core of the method proposed in this article is Huffman coding, which we study and improve. Lossless data compression is grounded in information theory, with its theoretical limit being the entropy [
25]. Huffman coding needs to count the probability of occurrence of source symbols, assign shorter codes to the symbols with high probability and longer codes to those with low probability, thereby achieving entropy coding, which approximates the theoretical entropy value. Huffman coding requires two passes over the source data to build the frequency table and generate the code table. Vitter [
26] proposed a dynamic Huffman algorithm to scan the data only once. However, with the constant modification of the Huffman tree as new symbols appear, the dynamic Huffman algorithm leads to a rapid increase in computational effort. This algorithm is not suitable for compressing large datasets with multiple source symbols. Schwartz’s [
27] canonical Huffman coding requires minimal data storage to reconstruct the Huffman tree. Unfortunately, both Huffman and canonical Huffman coding need to count the probability of symbol occurrence before compression, which seriously affects the compression speed. Reinhardt [
28] improved compression speed by pre-allocating a Huffman code table. Nevertheless, this method requires offline data analysis to define the code table and is limited to 256 symbols, restricting its range of applications. Yunge’s [
29] dynamic code table algorithm enhances adaptability by increasing the number of code tables. However, it requires evaluating the variance of symbol changes to reselect the code table each time, leading to high complexity and reduced compression speed. In general, code-table methods require longer codes when compressing large datasets with many source symbols; although the symbols corresponding to these longer codes typically have low occurrence probabilities, constructing such long codes adversely affects both compression efficiency and processing speed. Reinhardt [
30] proposed a truncated Huffman tree algorithm that preserves only the codes of high-frequency symbols, while low-frequency symbols have no designated code. To distinguish coded from uncoded bit streams, an extra flag bit (0 or 1) is required, which lengthens every encoded symbol by one bit and significantly degrades compression performance. Xu’s [
31] modified adaptive Huffman coding algorithm adds unknown symbol nodes to achieve uniform coding without requiring additional 1-bit identifiers, but it suffers from high complexity.
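For concreteness, the following minimal Python sketch (ours, not taken from the cited works) shows the two standard steps behind canonical Huffman coding: derive code lengths from symbol frequencies with an ordinary Huffman construction, then assign canonical codes so that only the code lengths need to be stored to rebuild the table.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Code length for each symbol, computed from a frequency table via a Huffman tree."""
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    if len(heap) == 1:                      # degenerate single-symbol source
        return {next(iter(freqs)): 1}
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

def canonical_codes(lengths):
    """Canonical assignment: symbols sorted by (length, symbol) receive consecutive codes."""
    table, code, prev_len = {}, 0, 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len
        table[sym] = format(code, f"0{length}b")
        code += 1
        prev_len = length
    return table

lengths = huffman_code_lengths(Counter(b"abracadabra"))
print(canonical_codes(lengths))     # e.g. {97: '0', 98: '100', ...}
```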
To reduce the algorithmic complexity, this paper proposes a novel method for constructing a Huffman code table that bypasses the process of calculating pixel occurrence probabilities in entropy coding. Additionally, an improved Huffman coding scheme is introduced to handle the longer codes required for 14-bit images by truncating longer codes with low complexity and minimal compression ratio loss. This ultimately achieves a low-complexity lossless compression method based on a code table for infrared images.
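One common way to realize such truncation, shown purely as an illustration (the authors' exact scheme is presented in Section 4), is to keep table entries only for the frequent residual values and emit an escape code followed by the raw fixed-length value for everything else:

```python
RAW_BITS = 15    # a signed 14-bit difference needs at most 15 raw bits

def encode_with_escape(symbols, table, escape_code):
    """Table lookup for frequent symbols; escape code + fixed-length raw bits otherwise."""
    bits = []
    for s in symbols:
        if s in table:
            bits.append(table[s])
        else:
            bits.append(escape_code + format(s & (2**RAW_BITS - 1), f"0{RAW_BITS}b"))
    return "".join(bits)
```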
3. Redundancy Analysis of Line-Scan Panoramic Infrared Images
From information theory, information can be measured by self-information. Suppose the set of source symbols is $S = \{s_1, s_2, \ldots, s_n\}$ and the probability of occurrence of each symbol is $p(s_i)$. The self-information $I(s_i)$ of $s_i$ is defined by the equation
$$I(s_i) = -\log_2 p(s_i).$$
The smaller the probability of symbol $s_i$, the more information it conveys. The information entropy $H(S)$ is the mathematical expectation of the self-information $I(s_i)$; it indicates the minimum average number of bits needed to represent each source symbol in a binary computer, and is defined as
$$H(S) = -\sum_{i=1}^{n} p(s_i)\,\log_2 p(s_i).$$
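The first-order entropy figures used throughout this paper can be estimated directly from the pixel histogram; a minimal numpy sketch (ours):

```python
import numpy as np

def first_order_entropy(img):
    """Entropy in bits per pixel, estimated from the histogram of pixel values."""
    _, counts = np.unique(img, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))
```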
A two-dimensional (2D) image is a kind of information that humans can intuitively perceive. The continuous tonal distribution in nature leads to significant spatial redundancy in visible images, while the continuity of scene infrared radiation results in similar redundancy in infrared images. Spatial redundancy manifests as large numbers of neighboring pixels with little or no change, resulting in a high correlation between image pixels. Owing to this spatial redundancy, infrared images exhibit a low actual information entropy, indicating high compression potential. However, the exact source entropy is difficult to obtain and can only be approximated. In digital imaging, image differencing represents the changes between adjacent pixels; the correlation between difference pixels is weak, effectively reducing spatial redundancy. Digital image differencing can be performed between columns or between rows. Inter-column differencing is defined as
$$d(i,j) = x(i,j) - x(i,j-1),$$
where $x(i,j)$ is the current column pixel, $x(i,j-1)$ is the previous column pixel, and $d(i,j)$ is the differential pixel. The inter-row differencing can be derived analogously.
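A minimal numpy sketch of inter-column differencing and its exact inverse (illustrative only; the definition above is all the algorithm requires):

```python
import numpy as np

def column_diff(img):
    """Keep the first column; replace each remaining column by its difference from the previous one."""
    x = img.astype(np.int32)
    d = x.copy()
    d[:, 1:] = x[:, 1:] - x[:, :-1]     # d(i, j) = x(i, j) - x(i, j-1)
    return d

def column_restore(d):
    """Exact inverse: a cumulative sum along each row recovers the original image."""
    return np.cumsum(d, axis=1, dtype=np.int64)

img = np.random.randint(0, 2**14, size=(3072, 16), dtype=np.uint16)
assert np.array_equal(column_restore(column_diff(img)), img)   # lossless round trip
```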
Infrared line-scan imaging, unlike area-array imaging, captures panoramic images through a 360-degree rotating scan. In line-scan images, each row is acquired by the same photosensitive element, leading to stronger inter-column correlations than in traditional images. We select two 640 × 512 14-bit infrared images, image A and image B, each a portion of a line-scan panoramic infrared image, and analyze the correlation of the original and differential images. We calculate the inter-column correlation coefficients and the pixel occurrence probabilities. The experimental results are shown in Figure 1 and Table 1.
The inter-column correlation coefficient between columns $j$ and $j+1$ is defined as
$$r_j = \frac{\sum_{i=1}^{M}\bigl(x(i,j)-\bar{x}_j\bigr)\bigl(x(i,j+1)-\bar{x}_{j+1}\bigr)}{\sqrt{\sum_{i=1}^{M}\bigl(x(i,j)-\bar{x}_j\bigr)^2}\,\sqrt{\sum_{i=1}^{M}\bigl(x(i,j+1)-\bar{x}_{j+1}\bigr)^2}},$$
where $\bar{x}_j$ is the mean value of the $j$th column and $M$ is the number of rows. The inter-row correlation coefficient can be derived analogously.
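A corresponding numpy sketch (assuming the Pearson form given above) for the mean adjacent-column correlation:

```python
import numpy as np

def mean_adjacent_column_correlation(img):
    """Average Pearson correlation coefficient between each pair of adjacent columns."""
    x = img.astype(np.float64)
    r = [np.corrcoef(x[:, j], x[:, j + 1])[0, 1] for j in range(x.shape[1] - 1)]
    return float(np.mean(r))
```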
The experimental results indicate that the correlation coefficients of the original images all exceed 0.99, and that inter-column differencing significantly reduces the correlation of the image. The entropy of the original image A is 10.3854 bit/symbol, while the entropy of the inter-column differential image A, at 4.9182 bit/symbol, is lower than that of the inter-row differential image A, which is 6.3071 bit/symbol. Further analysis of images from 51 diverse scenarios consistently shows that the inter-column differential images have the lowest entropy. These findings are summarized in
Table 2. The 51 images are 14-bit infrared images captured by a line-scan infrared detector in different scenarios. The image sizes vary and include 640 × 512, 1000 × 2000, 2000 × 4000, 1000 × 4000, and 2000 × 8000.
Additionally, the infrared panoramic images processed in this study consist of 3072 pixels per column, with an image width of approximately 60,000 columns. When employing inter-column or inter-row differencing, the first column or first row must be preserved so that the subsequent columns or rows can be restored. Because the first column is generated at the very start of the detector’s operation, inter-column differencing only requires 3072 original pixels to be stored at once, whereas inter-row differencing would require buffering approximately 60,000 first-row pixels that arrive at different times. To maximize the compression speed, we therefore use only inter-column differencing to eliminate image redundancy.
After the image redundancy is removed, the subsequent entropy coding requires the probability distribution of the pixels. For a 14-bit infrared image, the dynamic range of pixel values is [0, 16383], but the dynamic range of the differential image is doubled to [−16383, 16383]. The experimental results also show that differential images contain many zero-valued pixels and approximately follow a Laplace distribution. Therefore, through extensive experimentation, a general probability distribution model can be found that predicts other differential images, and such a model can generate a general Huffman code table applicable to generic images. The higher the prediction accuracy of the probability distribution model, the better the compression.
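As an illustration of this idea (a sketch under our own assumptions about the model parameters, not the paper's fitted model), idealized code lengths can be derived from a Laplacian-like two-sided model centered at zero; a real code table would feed these probabilities into a Huffman or canonical Huffman construction.

```python
import numpy as np

def model_code_lengths(scale=8.0, max_abs=255):
    """Idealized code lengths ceil(-log2 p) from p(d) ∝ exp(-|d|/scale) over the
    central differential values [-max_abs, max_abs]; values outside this window
    would be handled separately (e.g., by an escape code)."""
    d = np.arange(-max_abs, max_abs + 1)
    p = np.exp(-np.abs(d) / scale)
    p /= p.sum()
    return dict(zip(d.tolist(), np.ceil(-np.log2(p)).astype(int).tolist()))

lengths = model_code_lengths()
print(lengths[0], lengths[10], lengths[-255])   # short codes near zero, long codes in the tails
```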
5. Results
The mean square error (MSE) between the original and reconstructed images is used to verify that the compression is lossless. MSE is defined as
$$\mathrm{MSE} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\bigl[\hat{x}(i,j) - x(i,j)\bigr]^2,$$
where $\hat{x}(i,j)$ is the pixel at row $i$ and column $j$ of the reconstructed image, and $x(i,j)$ is the pixel at row $i$ and column $j$ of the original image. MSE describes the reconstruction error between the reconstructed image and the original image. The MSEs of the 53 scenes calculated in our experiments are all 0, indicating that the proposed algorithm achieves lossless compression.
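The losslessness check itself is a one-liner in numpy (sketch):

```python
import numpy as np

def mse(original, reconstructed):
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.mean(diff ** 2))   # must be exactly 0.0 for a lossless round trip
```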
Additionally, the Structural Similarity Index (SSIM) is used to measure the similarity between two images, defined as
$$\mathrm{SSIM}(x,\hat{x}) = \frac{\bigl(2\mu_x \mu_{\hat{x}} + C_1\bigr)\bigl(2\sigma_{x\hat{x}} + C_2\bigr)}{\bigl(\mu_x^2 + \mu_{\hat{x}}^2 + C_1\bigr)\bigl(\sigma_x^2 + \sigma_{\hat{x}}^2 + C_2\bigr)},$$
where $\mu_x$ and $\mu_{\hat{x}}$ are the mean values of the original and reconstructed images, respectively; $\sigma_x^2$ and $\sigma_{\hat{x}}^2$ denote their variances; $\sigma_{x\hat{x}}$ represents the covariance between the two images; and $C_1$ and $C_2$ are small constants that stabilize the computation and prevent the denominator from approaching zero. All 53 images have an SSIM of 1, further confirming that the compression is lossless and the reconstructed images are structurally identical to the originals, with no loss in image quality.
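For reference, the single-window form of the SSIM defined above can be computed as follows (a sketch; published SSIM implementations usually apply the formula over sliding windows, and the constants $C_1=(0.01L)^2$, $C_2=(0.03L)^2$ with $L$ the dynamic range are the conventional choice):

```python
import numpy as np

def global_ssim(x, y, data_range=2**14 - 1):
    """SSIM computed once over the whole image, per the formula above."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return float(((2 * mx * my + c1) * (2 * cxy + c2))
                 / ((mx**2 + my**2 + c1) * (vx + vy + c2)))
# Identical original and reconstructed images give exactly 1.0.
```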
The code table is calculated from the 53 scenes, so additional scenes are needed to verify the generality of the algorithm. We capture another 37 scenes for validation, some of which are displayed in Figure 7.
The compression ratio is defined as
$$\mathrm{CR} = \frac{\text{size of the original data}}{\text{size of the compressed data}}.$$
We use a 16-bit word to store each 14-bit pixel, so the theoretical limit of the compression ratio is
$$\mathrm{CR}_{\max} = \frac{16}{H_d},$$
where $H_d$ is the entropy (in bits per pixel) of the inter-column differential image.
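In code, using the entropy of the inter-column differential image A from Section 3 as an example (the theoretical values plotted in Figure 8a are computed per image):

```python
def compression_ratio(original_bytes, compressed_bytes):
    return original_bytes / compressed_bytes

def theoretical_limit(diff_entropy_bpp, stored_bits_per_pixel=16):
    """Upper bound implied by the differential-image entropy, with 14-bit pixels in 16-bit words."""
    return stored_bits_per_pixel / diff_entropy_bpp

print(round(theoretical_limit(4.9182), 2))   # ~3.25 for differential image A
```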
5.1. Proposed Method Compared with JPEG Series Algorithms
On an experimental platform with a 12th Gen Intel(R) Core(TM) i7-12700H CPU (20 logical processors, 2.30 GHz) and 16 GB of RAM, we test JPEG 2000, JPEG XL, JPEG XT, and the method proposed in this paper; the results are shown in
Figure 8 and
Table 8.
Figure 8a shows the compression ratio test results of the proposed method and JPEG series methods on 37 images, along with the theoretical compression ratio calculated based on the entropy of the difference image.
Figure 8b presents the speed test results.
Table 8 records the average values of the compression ratio and speed from
Figure 8. It also includes the percentage change in the compression ratio of the proposed algorithm compared to both the JPEG series methods and the theoretical value.
The method proposed in this paper achieves an average compression ratio of 3.3 for infrared line-scan images, which is 5% below the theoretical value. Compared to JPEG 2000, the proposed method incurs a 10% loss in compression efficiency but provides a 20-fold speed improvement, reaching an average of 210 MB/s. It outperforms JPEG XT in both compression ratio and speed, and it is approximately 8 times faster than JPEG XL.
5.2. Proposed Method Compared with TIFF
TIFF (Tagged Image File Format) is an image storage format; the term “Tagged” refers to its tag-based file structure. TIFF allows the flexible use of compression methods while maintaining image integrity and clarity. Supported lossless compression methods include LZW, Deflate, LZMA, and PackBits [
32,
33]. The dictionary-based table lookup encoding in TIFF, such as LZW, completely avoids frequency statistics, providing a very high compression speed. However, dictionary-based lossless image compression has limitations, as it cannot effectively remove spatial redundancy in images. The Deflate method can achieve higher compression efficiency by adjusting the dictionary size. However, using an excessively large dictionary increases memory and time consumption, which may degrade the compression speed.
In the experiment, we test the performance of LZW, Deflate, LZMA, and Packbits. The results are shown in
Figure 9 and
Table 9.
The Deflate method combines LZ77 and Huffman coding. Under the condition of maximum compression efficiency, its speed is 8 times faster than JPEG 2000. However, it experiences a significant loss in compression ratio, approximately 46%. In contrast, the proposed method outperforms Deflate in both speed and efficiency.
In TIFF, the compression efficiency of the LZW method is not adjustable. Although it achieves a significant speed improvement, approximately 31 times faster than JPEG 2000, the compression ratio loss is substantial, reaching 63%. The proposed method, compared to LZW, is better at achieving high-speed image compression while maintaining high compression efficiency.
The LZMA method achieves a high compression ratio, but at the expense of compression speed: at its highest compression setting its speed is comparable to that of JPEG 2000, and it still incurs a 17% loss in compression ratio. The proposed method outperforms LZMA in both speed and efficiency.
The PackBits method is a simple variant of Run-Length Encoding (RLE). The RLE pattern is as follows: repeated symbol + count of repetitions. While the PackBits method provides high compression speed, in images with few repeating patterns, the overhead of recording the repetition count causes the compressed files to be twice the size of the original, resulting in no effective compression.
It is worth noting that the method in this paper was tested only in single-threaded, single-core mode, without any SIMD (Single Instruction, Multiple Data) optimizations. Based on the above experiments, we conclude that, compared to dictionary-based methods, the proposed method ensures high compression efficiency while achieving fast compression for line-scan panoramic infrared images.
6. Conclusions
This paper presents a new low-complexity lossless compression algorithm for 14-bit line-scan infrared images. It proposes a method for constructing a Huffman code table that replaces the pixel probability statistics step of entropy coding, thereby improving the compression speed; for images with higher bit depths, the same approach can be used to fit a new probability model and compute a corresponding code table. Additionally, an improved Huffman coding scheme is designed to handle the longer codes of 14-bit images, truncating long codes with low complexity and minimal compression ratio loss, ultimately realizing a low-complexity lossless image compression algorithm. The proposed method achieves an average compression ratio of 3.3 for infrared line-scan images, which is 5% below the theoretical value. Compared to JPEG 2000, it incurs a 10% loss in compression efficiency but provides a 20-fold speed improvement, reaching an average of 210 MB/s. It outperforms JPEG XT in both compression ratio and speed. Compared to dictionary-based lossless compression methods, the proposed method achieves high-speed compression while maintaining high compression efficiency.
The method proposed in this paper can be extended to other general image domains that require high-speed compression and can tolerate some loss in compression ratio. Additionally, if the code table is concealed, this method could also facilitate encrypted image transmission, making it applicable to secure communication systems. However, there are certain limitations to the proposed method. For instance, while the method significantly speeds up compression, the 10% loss in compression efficiency compared to JPEG 2000 suggests that further optimizations could be made to balance both the speed and compression ratio. Additionally, the proposed algorithm might face challenges when applied to more complex image types or images with larger bit depths, as the probability model used may need to be adapted to these cases.
Future research can explore several directions. First, improving the Huffman coding scheme to achieve better compression efficiency without compromising speed is a potential area of investigation. Second, adapting the proposed method to work with images of higher bit depths or more complex data types may enhance its applicability in other fields, such as medical imaging or remote sensing. This could be achieved by incorporating more widely used image prediction methods to reduce redundancy, which would allow the approach to better handle more complex image types. Finally, investigating hybrid methods that combine the strengths of both dictionary-based and statistical compression techniques could lead to even more efficient algorithms for infrared image compression.