1. Introduction
With the increasing prevalence of the Internet, the security of communication channels cannot be guaranteed due to the risk of eavesdropping. In cases where transmitted data is confidential, such as military intelligence or medical histories, encryption is a common method used to protect sensitive information. However, encrypted data is often more easily detected by eavesdroppers due to its complex nature compared to unencrypted data. Consequently, steganography [1, 2, 3, 4] has emerged as a crucial technique for enhancing information security.
Steganography, in contrast to cryptography, is concerned with concealing the existence of communication [5]. The digital objects typically used in information security include images, texts, videos, and others. Among these, digital images are frequently employed due to their redundancy and ubiquitous use in daily life. The images that result after secret data is embedded are known as stego-images, while those without such data are called cover images. The evaluation of a steganography technique involves three primary benchmarks: the capacity to conceal secret data, the visual quality of the stego-images, and the security, i.e., the resistance to detection by steganalysis [2]. A steganography scheme should therefore aim to maximize the amount of data that can be embedded in stego-images while minimizing the distortion between cover images and stego-images and avoiding suspicion of the presence of any secret information. Recently, there has been a growing interest in deep learning-based image steganography techniques, and discussions of existing methods have provided valuable insight [3].
With advances in imaging technology, higher-quality images that better resemble real-world scenes are being sought after. This has led to the development of a new format for digital images, so that digital images can now be divided into two categories: low-dynamic-range (LDR) images and high-dynamic-range (HDR) images. An LDR image is a traditional image that is composed of three color channels, red (R), green (G), and blue (B), each represented using 8 bits, for a total of 24 bits. In contrast, an HDR image represents each R, G, or B channel using 32 bits, for a total of 96 bits. With 96 bits per pixel, an HDR image can record a far wider dynamic range than an LDR image, providing greater detail in even the darkest and brightest regions beyond what the human eye can perceive. HDR images can be generated using HDR cameras or by combining multiple LDR images with different exposures [6]. Due to the high cost of storing and transmitting 96-bit HDR images, researchers have developed various compact formats for HDR images, including RGBE [7], LogLuv [8], OpenEXR [9], and JPEG-XT [10].
While steganography schemes for LDR images have reached relative maturity, the research on steganography for HDR images is still in its early stages. HDR imaging techniques have been developed to capture the full range of color and light that the human eye can perceive in the real world. As a result, HDR media contain significantly richer content than their LDR counterparts and are therefore much more valuable. In fact, most cameras and smartphones currently available on the market are capable of capturing HDR images. Thus, it is crucial to develop proper tools for protecting the intellectual property of digital HDR media from the early stages of technology development.
There are two primary categories of data hiding or steganography algorithms designed for HDR images. The first type aims to achieve high-capacity data hiding [11, 12] by conveying a significant amount of secret messages, but at the expense of producing a stego-image with substantial distortion. These algorithms are currently state-of-the-art and offer an embedding rate of at least 5 bits per pixel. The second type of algorithm aims to achieve a high image quality in data hiding [4, 13, 14, 15, 16, 17]. These algorithms leverage the RGBE HDR encoding format to conceal a small quantity of messages; however, the capacity provided by these algorithms is limited to less than 0.5 bits per pixel. They are also known as distortion-free algorithms since any distortion produced by secret message embedding is negligible, so that the tone-mapped stego-image is identical to the tone-mapped cover image. Due to their limited capacity, these distortion-free algorithms struggle to support applications that require a large capacity.
Wang et al. [11] were among the pioneers to propose an HDR steganographic algorithm that conceals secret messages inside images in the Radiance RGBE format without impairing the image quality, making it undetectable to potential attackers. Yu and Wang [18] proposed a different approach for steganography in HDR images in RGBE format, leveraging the characteristics of the format to separate flat and boundary areas in cover images and embed data using two-side methods [19] with different strategies. Li et al. [12] introduced a data hiding scheme for HDR images in LogLuv format, which involves embedding data in the least significant bit (LSB) of the HDR image. The core of Li et al.’s method is inspired by the optimal pixel adjustment process (OPAP) [20, 21], which adjusts pixel values to minimize pixel variation after LSB replacement. However, most of the aforementioned research modifies steganography techniques originally designed for LDR images to apply to HDR images, rather than developing tailored approaches for HDR image formats.
In a recent study, Yu et al. [13] presented a data-hiding approach for HDR images in the RGBE format, which can be used for image annotation or steganography. The scheme proposed by Yu et al. leverages the homogeneity present in the RGBE format to embed secret data. Building upon this work, Chang et al. [14] (2016) proposed a novel data-hiding scheme for image annotation applications by utilizing homogeneous representation groups more efficiently. However, it is noteworthy that potential vulnerabilities may exist in Yu et al.’s data-hiding scheme when applied for steganography purposes.
A. Motivation
This research is centered around the steganography scheme proposed by Yu et al. [13], which has an average capacity within the range of 0.0010–0.0026 bits per pixel. In order to fully leverage the strengths of Yu et al.’s scheme tailored for HDR images, it is worth exploring potential methods of enhancing its capacity and security for steganography purposes. Our findings reveal that, while Yu et al.’s scheme maintains imperceptibility in pixel differences before and after embedding, it suffers from a scarcity of embeddable pixels in general HDR images, potentially resulting in low capacity. Additionally, the embedding process disrupts the normal distribution of pixel values, which introduces potential weaknesses when stego-images are subjected to statistical analysis attacks. Thus, a potential security concern exists.
B. Main Contributions
This paper presents an improved steganography scheme for HDR images that are encoded with the RGBE format. The scheme builds upon the distortion-free steganography scheme proposed by Yu et al., which is thoroughly analyzed in this paper. Although Yu et al.’s scheme exhibits imperceptible changes in pixel differences before and after embedding, embeddable pixels are scarce in general HDR images, which results in low capacity. In addition, the embedding process disrupts the number distribution, making stego-images vulnerable to statistical analysis attacks.
To address these limitations, the proposed scheme employs pre-processing to convert the original pixels into embeddable pixels, thereby increasing the number of embeddable pixels. Additionally, post-processing is designed to eliminate abnormal number distributions by incorporating random numbers and addressing potential security vulnerabilities arising from statistical analysis. The primary contribution of this study lies in the proposed scheme, which not only builds upon the strengths of Yu et al.’s scheme but also significantly improves the embedding capacity of HDR images. Simultaneously, the proposed scheme ensures the preservation of visual quality and reinforces security against statistical analysis attacks. The experimental result shows that the capacity increases 10 times without visual distortion.
The rest of this paper is organized as follows. Section 2 briefly introduces the related work, including Yu et al.’s scheme and its potential weaknesses. The enhanced steganography scheme is proposed in Section 3. Experimental results and discussions are shown in Section 4, which is followed by the conclusion in Section 5.
2. Preliminary
This section provides a brief overview of the scheme proposed by Yu et al. [13], followed by an analysis of its potential weaknesses with respect to steganography security and hiding capacity.
2.1. Embedding in Yu et al.’s Scheme
The RGBE format, also known as the Radiance format, was introduced by Ward [7] in 1991 as the first efficient HDR image format. This format utilizes 32 bits to represent each pixel in an HDR image, which is more efficient compared to the uncompressed HDR image format that requires 96 bits. In the raw HDR image format, each pixel comprises three color channels: red, green, and blue, and each channel is represented by a 32-bit floating-point value. However, in RGBE images, each pixel consists of four channels to capture the red, green, blue, and exponent values. These channels are represented by 8-bit integers, and their values range from 0 to 255.
In the context of HDR image processing, we denote a pixel of raw HDR images as P(r, g, b), whereas a pixel that has been encoded with the RGBE format is represented as P(R, G, B, E). The transformation of pixel values from the raw HDR format to the RGBE format can be achieved using the methods described in Equations (1) and (2).
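The conversion can be illustrated with a minimal sketch of the encoding commonly attributed to Ward's Radiance format. It is an assumption that Equations (1) and (2) take this form; the function names `float_to_rgbe` and `rgbe_to_float` are hypothetical, and exponents that fall outside the 8-bit range are not handled.

```python
import math

def float_to_rgbe(r, g, b):
    """Encode one raw HDR pixel (floats) as four 8-bit values (R, G, B, E)."""
    m = max(r, g, b)
    if m < 1e-38:
        # Too dark to represent: use the all-zero pixel.
        return (0, 0, 0, 0)
    # m = mantissa * 2**exponent with 0.5 <= mantissa < 1
    _mantissa, exponent = math.frexp(m)
    scale = 256.0 / 2.0 ** exponent
    # The largest channel always lands in 128..255 after scaling.
    return (int(r * scale), int(g * scale), int(b * scale), exponent + 128)

def rgbe_to_float(R, G, B, E):
    """Decode (R, G, B, E) back to floats (simple variant without rounding offset)."""
    if E == 0:
        return (0.0, 0.0, 0.0)
    f = 2.0 ** (E - 128 - 8)
    return (R * f, G * f, B * f)

print(float_to_rgbe(1.0, 0.5, 0.25))
print(rgbe_to_float(128, 64, 32, 129))
```

Because the shared exponent is chosen from the largest channel, a freshly encoded pixel has a maximum color value of at least 128, which is why most cover pixels cannot be halved more than a few times.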
Due to the exponent channel in the RGBE format, multiple representations exist to express the color of a pixel. Specifically, a pixel P(R, G, B, E) can also be represented as P(2R, 2G, 2B, E − 1) (or P(R/2, G/2, B/2, E + 1)) by multiplying (or dividing) each color channel by 2 and subtracting (or adding) 1 in the exponent channel, provided that the resulting channel values remain integers in the range from 0 to 255. This feature is referred to as the homogeneity of RGBE, as defined by Yu et al. [13].
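As a sketch of this homogeneity property, the following enumerates every equivalent representation of an RGBE pixel. The function name `homogeneous_group` and the sample pixel are hypothetical illustrations, not values taken from the paper's tables.

```python
def homogeneous_group(pixel):
    """Return all equivalent (R, G, B, E) forms, sorted by ascending exponent."""
    R, G, B, E = pixel
    group = [pixel]
    # Multiply by 2 (and lower E) while every channel still fits in 8 bits.
    r, g, b, e = R, G, B, E
    while 0 < max(r, g, b) <= 127 and e > 0:
        r, g, b, e = 2 * r, 2 * g, 2 * b, e - 1
        group.append((r, g, b, e))
    # Divide by 2 (and raise E) while every channel is even.
    r, g, b, e = R, G, B, E
    while max(r, g, b) > 0 and r % 2 == g % 2 == b % 2 == 0 and e < 255:
        r, g, b, e = r // 2, g // 2, b // 2, e + 1
        group.append((r, g, b, e))
    return sorted(group, key=lambda p: p[3])

# Hypothetical pixel with three equivalent representations.
for index, element in enumerate(homogeneous_group((80, 40, 22, 130))):
    print(index, element)
```

The length of the returned list corresponds to the homogeneity value, and the position of a pixel in the list corresponds to its homogeneity index.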
To facilitate comprehension, an illustrative example is presented to explicate the embedding operations described below.
Example 1. Table 1 presents the elements of the homogeneous representation group (HRGp) of a pixel P, sorted in ascending order of homogeneity, as determined by the exponent channel values. The number of elements in HRGp is equivalent to the homogeneity value (HVp), while each element is assigned a homogeneity index (HIp). It should be noted that one element exhibits an odd value of 11 in the blue channel, which renders a further division by 2 inapplicable; the blue channel is therefore classified as the dominant channel for P. To illustrate the embedding operations in Yu et al.’s scheme in a generalized manner, Table 2 is employed to embed the corresponding secret bits into a cover pixel with varying homogeneity values. Building upon the example provided in Table 1, we first compute the HVp and HIp values of the cover pixel P. Since the HV value of pixel P is 3, Table 2 is consulted to determine that one bit can be conveyed. Assuming a secret bit of 0 is to be embedded, Table 2 indicates an HI value of 1, so P is replaced by the element of HRGp whose HI value is 1, as depicted in Table 1. Conversely, if the secret bit to be embedded is 1, the corresponding HI value is 0, and P remains unchanged.
Yu et al. proposed a classification method for pixels, dividing them into seven distinct categories, as presented in Table 3. The classification is based on the highest value present among the three color channels of a given pixel. Based on their characteristics, pixels are further classified into either “regular” or “irregular” categories. The “regular” category indicates that the highest value of a pixel across the three channels is equal to or greater than 128, while the “irregular” category refers to this value being less than 127. Importantly, a pixel can only belong to one of these two categories, but not both.
For a comprehensive understanding of Yu et al.’s scheme, readers are referred to the original paper.
2.2. Security by Statistical Analysis
The scheme proposed by Yu et al. serves the dual purpose of image annotation and image steganography. In the context of steganography, the primary objective is to ensure security against statistical analysis. As in the case of cryptography, steganography algorithms are assumed to be publicly available, and the security of the scheme is based on the concealment of secret data without detection. If it can be demonstrated that an algorithm can determine, with a higher success rate than random guessing, whether a given image contains embedded secret data, then the steganography scheme is deemed to have been compromised.
Steganalysis, akin to cryptanalysis in cryptography, involves identifying messages that have been concealed through steganography using various techniques, including perceptual, statistical, specific, and universal blind analysis [22]. While Yu et al. have shown promising results in perceptual analysis, which pertains to the level of distortion, their scheme has a clear vulnerability in specific analysis. Specific analysis concentrates on identifying weaknesses in a particular steganography scheme. While Yu et al. have asserted that stego-HDR-images do not arouse suspicion among eavesdroppers, our findings demonstrate that this claim is not always valid.
Two categories of pixels are identified (see Table 3) as suitable for the embedding of secret messages based on the following definitions [13]: embeddable and promising.
In the context of the “embeddable” category, a pixel is considered embeddable if its HV value falls within the range from 2 to 7. An embeddable pixel may also be designated as “promising”, which can be further divided into two specific categories. The first category corresponds to pixels with a maximum value of 127 in any of the Red, Green, or Blue channels, while the second category requires that the maximum value be 254 and that all three channel values be even. Specifically, the first category of “promising” pixels, with a maximum value of 127, can facilitate one multiplication operation, while the second category, with a maximum value of 254, can facilitate one division operation. Therefore, the “promising” pixels with an HV value of 2 (HVp = 2) have the ability to convey one bit of a secret message.
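The two “promising” categories can be expressed as a short predicate. This is a sketch based only on the definitions in the preceding paragraph; the function name `is_promising` is hypothetical, and exponent-range limits are ignored.

```python
def is_promising(pixel):
    """True if the pixel allows exactly one multiplication or one division by 2."""
    R, G, B, _E = pixel
    m = max(R, G, B)
    if m == 127:
        # First category: doubling gives a maximum of 254; 127 is odd, so no halving.
        return True
    if m == 254 and R % 2 == 0 and G % 2 == 0 and B % 2 == 0:
        # Second category: halving gives a maximum of 127; doubling would overflow.
        return True
    return False

print(is_promising((127, 10, 3, 128)))
print(is_promising((254, 11, 4, 128)))
```

In both categories the pixel has exactly two equivalent representations, which is why its homogeneity value is 2.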
Yu et al.’s image steganography scheme employs the technique of utilizing “promising” pixels to conceal secret messages, thereby ensuring that the magnitude of change resulting from embedding the secret data remains within a reasonable range in each case. However, upon conducting a detailed analysis, two potential vulnerabilities have been identified in Yu et al.’s steganography approach.
Observation 1. The proportion of even-valued integers present in the three color channels of “promising” pixels will be altered from 25% to 12.5% following the embedding procedure in Ref. [13]. As noted in Ref. [13], the capacity is primarily attributable to pixels with a homogeneity value of 2. These pixels make up roughly 11% of the pixels. Thus, it can be inferred that almost all embeddable pixels possess an HV of 2. As only “promising” pixels are selected for embedding secret messages in steganography, it is important to note that these pixels possess certain characteristics, namely HV = 2 and a maximum channel value of 127 or 254. In order to examine homogeneity, the following analysis is presented. With the exception of the maximum values 127 and 254, the other four values contain at least two even numbers, which can be combined to generate another element through a division by 2 or multiplication by 2 of another element. The remaining two numbers have a 50% probability of being even.
In summary, in the three color channels of embeddable pixels, the likelihood of an arbitrary number being even is 75%, while the likelihood of its being odd is 25%. Prior to message embedding, the probability that all three color channels are even in the cover image is 25%. However, after embedding the secret message, approximately half of the even numbers will become odd due to multiplication and division, resulting in a probability of 12.5% that all three color channels are even in the stego-images, as illustrated in Figure 1.
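A statistic of the kind described in Observation 1 can be sketched as follows. This is an illustrative assumption about how an analyst might measure the parity shift, not the measurement procedure used in the paper; `all_even_ratio` is a hypothetical name.

```python
def all_even_ratio(pixels):
    """Fraction of promising pixels whose three color channels are all even."""
    def all_even(p):
        return all(c % 2 == 0 for c in p[:3])

    def promising(p):
        m = max(p[:3])
        return m == 127 or (m == 254 and all_even(p))

    candidates = [p for p in pixels if promising(p)]
    if not candidates:
        return 0.0
    return sum(all_even(p) for p in candidates) / len(candidates)

sample = [(254, 10, 4, 128), (127, 10, 3, 128), (127, 2, 4, 129), (200, 2, 2, 128)]
print(all_even_ratio(sample))
```

Comparing this ratio across many images is one way a specific analysis could flag images whose parity statistics deviate from those of unmodified HDR images.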
Observation 2. The number of embeddable “irregular” pixels noticeably increases following the embedding procedure in Ref. [13]. Based on the analysis of the HDR image database in the study by Yu et al., the number of embeddable “irregular” pixels is typically negligible. However, following the embedding process, the number of embeddable “irregular” pixels noticeably increases. This operation disrupts the typical distribution, as roughly half of the “promising” pixels in each HDR image are transformed into embeddable “irregular” pixels. As shown in Figure 2, these non-normal distributions of pixel counts can easily trigger suspicion upon a specific analysis of the stego-image. The above analysis is presented in Section 4.
As the eavesdropper acquires an increasing number of HDR images, the vulnerability of the security scheme to specific forms of analysis becomes increasingly evident.
2.3. Hiding Capacity
In the context of image steganography and image annotation applications, the concealment capacity is consistently regarded as a critical factor. The experimental findings in Yu et al.’s image steganography reveal a small capacity ranging from 0.0010 to 0.0026 bits per pixel (bpp). However, this level of capacity is not considered outstanding.
In summary, there are two key challenges that require addressing: (1) security, i.e., the need to mitigate abnormal distribution of data, and (2) capacity, i.e., the requirement to enhance the embedding performance.
3. Proposed Scheme
To address the aforementioned issues, an improved steganography scheme is presented. The corresponding flowchart is depicted in Figure 3. The proposed approach involves three phases: pre-processing, embedding, and post-processing. In the pre-processing phase, the original cover image is modified so that additional pixels attain a homogeneity value that permits embedding, which increases the number of embeddable pixels. The subsequent embedding phase entails the incorporation of secret data into the cover image. In the post-processing phase, a pseudo-random number generator (PRNG) generates a random bit stream, which is utilized to adjust the pixel values of the embedded image, resulting in a stego-image with a histogram similar to that of typical natural images.
The statistical analysis conducted prior to embedding reveals that, with the exception of 0, the maximum value of the three color channels for most pixels is not less than 127. This finding indicates that the maximum pixel value cannot be divided more than once. To address this limitation, the proposed scheme assumes that the embeddable pixels can be divided or multiplied once for each of the three color channels while the exponent channel is increased or decreased by 1, respectively. To implement this assumption, we propose a straightforward scheme that involves two steps.
First, we identify specific pixels using one of four cases. Second, we adjust the pixel values using one of four different methods, depending on the case. These two steps are further elaborated below.
Step 1. The pixels with max(R, G, B) = 127 || 128 || 254 || 255, where || means the or operation, are selected.
Step 2. The selected pixels are further modified as follows.
Case 1: If max(R, G, B) = 127, the individual R, G, and B values of the pixel are unaltered.
Case 2: If max(R, G, B) = 128, the maximum value, i.e., 128, is decremented by 1, and the new maximum value becomes 127.
Case 3: If max(R, G, B) = 254, the odd pixel values are converted to even values by subtracting 1.
Case 4: If max(R, G, B) = 255, the maximum value of 255 and any other odd values are decreased by 1 to become even values.
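The pre-processing cases can be sketched as a single per-pixel function. The sketch assumes that the four cases correspond to maximum channel values of 127, 128, 254, and 255, respectively, which is inferred from the case descriptions; `preprocess_pixel` is a hypothetical name.

```python
def preprocess_pixel(pixel):
    """Turn a selected pixel into a 'promising' one; leave other pixels alone."""
    R, G, B, E = pixel
    m = max(R, G, B)
    if m == 127:
        # Case 1 (assumed condition): already promising, nothing to change.
        return pixel
    if m == 128:
        # Case 2 (assumed condition): lower every 128 to 127.
        return tuple(127 if c == 128 else c for c in (R, G, B)) + (E,)
    if m in (254, 255):
        # Cases 3 and 4 (assumed conditions): subtract 1 from odd values,
        # which also turns a maximum of 255 into 254.
        return tuple(c - (c % 2) for c in (R, G, B)) + (E,)
    # Pixel not selected in Step 1.
    return pixel

print(preprocess_pixel((128, 128, 7, 130)))
print(preprocess_pixel((255, 3, 2, 128)))
```

Each change alters a channel by at most 1, which is consistent with the claim that pre-processing keeps the visual impact small.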
The embedding procedures are analogous to those delineated in Yu et al.’s scheme and, accordingly, are briefly introduced below.
Step 1. For a pixel P in RGBE format, the elements of its homogeneous representation group (HRGp) are determined, and the homogeneity value (HVp) of this pixel is defined as the number of elements in the homogeneous representation group.
Step 2. Every element in the HRG is sorted according to the value represented in the exponent channel in ascending order, and an index is assigned to each sorted element. This enables the definition of a homogeneity index (HI) for every sorted element in HRG, with HIp ranging from 0 to (HVp − 1).
Step 3. Upon determining the homogeneous group and the homogeneity value HVK of a pixel K, the pixel capacity in bits is computed as shown in Equation (4).
Step 4. The homogeneity index table (Table 2) is utilized to facilitate the embedding process of the cover pixel K. Depending on its homogeneity value HVK, the number of bits that can be conveyed is listed in the first column of Table 2. In the third column, the bit patterns of the secret message that can be concealed, corresponding to different homogeneity indices, are described. By referring to Table 2, the cover status (HVK, HIK) can be modified, and the status of the cover pixel, which is changed to the stego status S(HVK, HI′K), is recorded. This indicates that a desired bit pattern of the secret message has been conveyed by the stego pixel. If the homogeneity value (HVK) is less than or equal to 1, this pixel cannot convey any secret message.
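The embedding steps can be sketched as follows. The index table used here is a hypothetical stand-in for Table 2: only the row for a homogeneity value of 3 follows the worked example in Section 2.1 (bit 0 maps to index 1, bit 1 maps to index 0), the row for a homogeneity value of 2 is an assumption, and other homogeneity values are simply skipped. The names `hrg`, `HYPOTHETICAL_TABLE`, and `embed` are illustrative.

```python
def hrg(pixel):
    """Homogeneous representation group, sorted by ascending exponent."""
    R, G, B, E = pixel
    group = [pixel]
    r, g, b, e = R, G, B, E
    while 0 < max(r, g, b) <= 127 and e > 0:
        r, g, b, e = 2 * r, 2 * g, 2 * b, e - 1
        group.append((r, g, b, e))
    r, g, b, e = R, G, B, E
    while max(r, g, b) > 0 and r % 2 == g % 2 == b % 2 == 0 and e < 255:
        r, g, b, e = r // 2, g // 2, b // 2, e + 1
        group.append((r, g, b, e))
    return sorted(group, key=lambda p: p[3])

# Hypothetical stand-in for Table 2: homogeneity value -> {secret bit: index}.
HYPOTHETICAL_TABLE = {
    2: {"0": 0, "1": 1},  # assumed mapping
    3: {"0": 1, "1": 0},  # follows the worked example in Section 2.1
}

def embed(pixel, bits):
    """Return (stego_pixel, number_of_bits_consumed)."""
    group = hrg(pixel)
    row = HYPOTHETICAL_TABLE.get(len(group))
    if row is None or not bits:
        return pixel, 0  # HV <= 1 or not covered by this sketch
    return group[row[bits[0]]], 1

print(embed((160, 80, 44, 129), "0"))
print(embed((160, 80, 44, 129), "1"))
```

Since every element of the group encodes the same color, switching between elements conveys bits without changing the decoded pixel value.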
Once the embedding process of our enhanced scheme is completed, the resulting embedded image exhibits a concentration of pixel values towards even numbers. To address this, we perform a further adjustment to transform the embedded images into stego-images that exhibit a normal distribution of numbers across the three color channels. This is achieved through the following two-step process. Firstly, a PRNG is employed to generate a randomly arranged bitstream. Secondly, for each of the three color channels, a random bit is selected and mapped to a corresponding number. While this process may lead to increased image distortion, it is an acceptable trade-off to achieve a higher level of security, as the eavesdropper is unable to obtain the original or cover images. Moreover, the distortion introduced by our scheme is imperceptible to the human eye. The aforementioned two steps are further elaborated below.
Case 1: Given a pixel with a maximum channel value of 127, the maximum values are increased by 1 to become 128.
Case 2: Given a pixel with a maximum channel value of 254, if the corresponding randomly generated bit is “0”, the maximum value remains unchanged. However, if the bit is “1”, the maximum value is incremented by 1 to become 255. Similarly, the values of the remaining two color channels are modified based on their respective randomly generated bits. If a bit is “1”, the value remains the same. It is worth noting that the maximum value of the embedded image is limited to 254 to avoid overflow during post-processing.
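The post-processing adjustment can be sketched as below. The sketch assumes that Case 1 applies to pixels whose maximum channel value is 127 and Case 2 to pixels whose maximum is 254, and that in Case 2 a random bit of 1 increments a channel while a bit of 0 leaves it unchanged; this is one reading of the case descriptions. `postprocess_pixel` is a hypothetical name, and Python's `random.Random` stands in for whatever PRNG the scheme actually uses.

```python
import random

def postprocess_pixel(pixel, rng):
    """Randomize channel parity so even values are no longer over-represented."""
    R, G, B, E = pixel
    m = max(R, G, B)
    if m == 127:
        # Case 1 (assumed condition): raise every 127 to 128.
        return tuple(128 if c == 127 else c for c in (R, G, B)) + (E,)
    if m == 254:
        # Case 2 (assumed condition): add one random bit per channel.
        # Channels are at most 254 here, so no value can exceed 255.
        return tuple(c + rng.getrandbits(1) for c in (R, G, B)) + (E,)
    return pixel

rng = random.Random(2024)  # seed shown only to make the sketch repeatable
print(postprocess_pixel((127, 4, 127, 130), rng))
print(postprocess_pixel((254, 10, 4, 128), rng))
```

A deployed scheme would likely draw the bits from a keyed or cryptographically secure generator rather than a fixed seed.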
The secret message extraction process is straightforward. A stego HDR image is provided, and every pixel is examined. The homogeneity value, HVK, for each stego pixel (e.g., K) that is inspected, is computed. If HVK is less than or equal to 1, no secret message is conveyed by this pixel. Alternatively, if HVK is greater than 1, the homogeneous representation group, HRGK, for this stego pixel is produced, and the number of concealed secret message bits in the cover pixel K is calculated using Equation (3). The homogeneity index of the cover pixel, HIK, is determined by comparing the cover pixel K with all elements in HRGK, and the status of the stego pixel, S(HVK, HI′K), is generated. Ultimately, by referring to the homogeneity index table (Table 2) and S(HVK, HI′K), the bits of the secret message can be extracted.
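Extraction can be sketched as the inverse lookup of the embedding step. As before, the table is a hypothetical stand-in for Table 2 in which only the row for a homogeneity value of 3 follows the worked example in Section 2.1; the names `hrg`, `HYPOTHETICAL_INVERSE_TABLE`, and `extract` are illustrative. The sketch reads pixels as they stand after embedding and does not model any interaction with post-processing.

```python
def hrg(pixel):
    """Homogeneous representation group, sorted by ascending exponent."""
    R, G, B, E = pixel
    group = [pixel]
    r, g, b, e = R, G, B, E
    while 0 < max(r, g, b) <= 127 and e > 0:
        r, g, b, e = 2 * r, 2 * g, 2 * b, e - 1
        group.append((r, g, b, e))
    r, g, b, e = R, G, B, E
    while max(r, g, b) > 0 and r % 2 == g % 2 == b % 2 == 0 and e < 255:
        r, g, b, e = r // 2, g // 2, b // 2, e + 1
        group.append((r, g, b, e))
    return sorted(group, key=lambda p: p[3])

# Hypothetical stand-in for Table 2: homogeneity value -> {index: secret bits}.
HYPOTHETICAL_INVERSE_TABLE = {
    2: {0: "0", 1: "1"},  # assumed mapping
    3: {1: "0", 0: "1"},  # follows the worked example in Section 2.1
}

def extract(stego_pixel):
    """Return the bits carried by one pixel, or '' if it carries none."""
    group = hrg(stego_pixel)
    row = HYPOTHETICAL_INVERSE_TABLE.get(len(group))
    if row is None:
        return ""  # HV <= 1 or not covered by this sketch
    index = group.index(stego_pixel)  # homogeneity index of the stego pixel
    return row.get(index, "")

print(extract((80, 40, 22, 130)))
print(extract((160, 80, 44, 129)))
```

No cover image is needed: the receiver recomputes the group from the stego pixel itself and reads the bits from the pixel's position within it.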