Article

Priori Knowledge Makes Low-Light Image Enhancement More Reasonable

by
Zefei Chen
,
Yongjie Lin
*,
Jianmin Xu
,
Kai Lu
and
Zihao Huang
School of Civil Engineering & Transportation, South China University of Technology, Guangzhou 510641, China
*
Author to whom correspondence should be addressed.
Sensors 2025, 25(17), 5521; https://doi.org/10.3390/s25175521
Submission received: 21 July 2025 / Revised: 18 August 2025 / Accepted: 19 August 2025 / Published: 4 September 2025

Abstract

This paper presents a priori knowledge-based low-light image enhancement framework, termed Priori DCE (Priori Deep Curve Estimation). The priori knowledge consists of two key aspects: (1) enhancing a low-light image is an ill-posed task, as the brightness of the enhanced image corresponding to a low-light image is uncertain. To resolve this issue, we incorporate priori channels into the model to guide the brightness of the enhanced image; (2) during the enhancement of a low-light image, the brightness of a pixel may increase or decrease. This paper treats the probability of a pixel's brightness increasing/decreasing as its priori enhancement/suppression probability. Intuitively, pixels with higher brightness should have a higher priori suppression probability, while pixels with lower brightness should have a higher priori enhancement probability. Inspired by this, we propose an enhancement function that adaptively adjusts the priori enhancement probability based on pixel brightness. In addition, we propose the Global-Attention Block (GA Block). The GA Block ensures that, during the low-light image enhancement process, each pixel in the enhanced image is computed based on all the pixels in the low-light image. This approach facilitates interactions between all pixels in the enhanced image, thereby achieving visual balance. The experimental results on the LOLv2-Synthetic dataset demonstrate that Priori DCE has a significant advantage. Specifically, compared to the SOTA Retinexformer, Priori DCE improves the PSNR index and SSIM index from 25.67 and 92.82 to 29.49 and 93.6, respectively, while the NIQE index decreases from 3.94 to 3.91.

1. Introduction

Due to irreversible environmental factors and technical constraints, photographs are often captured under suboptimal conditions, such as underexposure or overexposure [1]. This not only challenges human visual perception but also poses significant difficulties for more advanced image processing tasks, such as object detection [2], multi-object tracking [3], and instance segmentation [4]. Therefore, low-light image enhancement has been a prominent research focus. Low-light image enhancement is primarily achieved through two approaches: (1) adjusting the camera parameters to the environmental conditions before capturing, such as increasing the ISO, decreasing the shutter speed, and widening the aperture; (2) applying algorithms that map low-light images to reference ones after capturing, such as histogram equalization [5], gamma correction, and retinex [6].
Although existing methods enhance image brightness, they also introduce noise, blur, and artifacts in the enhanced images [7]. Plain methods, such as histogram equalization and gamma correction, often produce unnatural artifacts. This occurs because these methods naively apply enhancement functions to map low-light images to reference images, without considering the image as a whole. Land et al. [6] proposed the retinex hypothesis (a synthesis of the retina and cortex). According to this hypothesis, the observed image can be decomposed into two components: an illumination map, which represents the light intensity from the external environment, and a reflectance map, which corresponds to the intrinsic properties of the objects. Many researchers have applied the retinex theory to model the image decomposition process and recover the reflectance map [8,9]. However, due to limitations such as the capturing device and environmental conditions, the observed image often contains noise and distortions, which are subsequently propagated to the reflectance map during decomposition.
In the past decade, with the rapid development of computational power, methods based on convolutional neural networks (CNNs) [10] have sprung up like mushrooms after the rain [11,12]. Thanks to the powerful capacity of convolutional neural networks, these methods have demonstrated remarkable performance across a wide range of tasks. However, deep learning methods are data-driven, which initially limited their application in low-light image enhancement. With the emergence of paired image (low-light image and reference image) datasets, such as LOLv1 [13], LOLv2 [14], LSRW [15], and SICE [16], deep learning has started to be widely applied to low-light image enhancement tasks. Many researchers have attempted to use CNNs to approximate the decomposition process of the retinex theory [7,13,17,18,19]. In addition, GANs [20] have also been widely used in low-light image enhancement, such as pix2pix [21], CycleGAN [22], PD-GAN [23], and EnlightenGAN [24]. However, deep learning-based methods still suffer from noise, blur, and artifacts. To address this issue, we explain the reasons from the following three perspectives and propose corresponding improvements.
Firstly, in the process of low-light image enhancement, we intuitively assume that a pixel with higher brightness is more likely to be suppressed (become darker), while a pixel with lower brightness is more likely to be enhanced (become brighter), so as to achieve visual balance in the enhanced image. Once the structure and hyperparameters of the low-light image enhancement model are fixed, the enhancement/suppression probability of the pixels is also fixed. Therefore, we refer to this probability as the pixel's priori enhancement/suppression probability. Most existing methods are end-to-end black-box models [7,25], and their operational mechanisms are uninterpretable. Unlike end-to-end models, Guo et al. [1,26] propose using parameterized enhancement functions to individually map pixels in a low-light image, with the parameters generated by a deep learning model. In [1,26], the priori enhancement probability of any pixel is fixed at 0.5, which is evidently unreasonable. To resolve this issue, we introduce an enhancement function that adaptively adjusts the priori enhancement probability based on pixel brightness. This ensures that, during the low-light image enhancement process, pixels with lower brightness have a higher priori enhancement probability, while those with higher brightness have a lower priori enhancement probability.
Secondly, in existing low-light image enhancement methods, the receptive field [27] of each pixel in the enhanced image is limited. This limitation causes individual pixels to be inconsistent with the overall image, leading to issues such as noise, blur, and artifacts. Ref. [25] introduces the Squeeze-and-Excitation module [28] into Res2Net [29], proposing SE-Res2Net and using it as the backbone network for low-light image enhancement. This method averages each channel and multiplies the result back into the corresponding channel, thereby increasing the interaction between pixels to some extent. Recently, Carion et al. [30] introduced the Transformer [31] into computer vision tasks, proposing DETR. However, DETR computes interactions between the features at each position and all positions in the feature map, resulting in a substantial computational burden. Zhu et al. [32] incorporated deformable convolutions [33,34] into the Transformer, significantly reducing the computational complexity. Inspired by this, this paper proposes the GA Block (Global-Attention Block). The GA Block first computes four reference points for each position in the feature map. Then, the weights of these reference points are calculated based on cosine similarity. Finally, the weighted sum of the features at the four reference points is computed to obtain the output feature at the corresponding position. Since the GA Block computes each pixel in the enhanced image based on all pixels in the low-light image, it alleviates issues such as noise, blur, and artifacts in the enhanced image.
Thirdly, since the relationship between abnormal images and reference images in low-light image enhancement is many-to-many, the brightness of the enhanced image should be controllable. The brightness of the enhanced images generated by [13,24,35] is uncertain. Guo et al. [1,26] map arbitrary low-light images to enhanced images with an average brightness of 0.5. Refs. [7,19] incorporate the ratio of the brightness between the low-light and reference images into the illumination adjustment process. To address this issue, we propose priori channels that indicate the brightness of the enhanced image and are cross-concatenated with the low-light image. During the inference stage, the brightness of the enhanced image can be adjusted by modifying the priori channels. The code and experiments have been open-sourced at https://github.com/zefeichen/PrioriDCE (accessed on 18 August 2025).
The contributions of this paper are as follows:
  • This paper presents a parametric enhancement function in which the priori enhancement probability is adaptively adjusted based on the pixel's brightness.
  • This paper proposes GA Block, which allows each pixel in the enhanced image to be computed from all the pixels in the low-light image, making the enhanced image appear more natural.
  • This paper proposes priori channels for indicating the brightness of the enhanced image, which allows the brightness of the enhanced image to be freely adjusted.
  • This paper presents comprehensive experiments, and the results demonstrate that the proposed Priori DCE significantly enhances the quality of the enhanced images.

2. Related Works

2.1. Plain Methods

Histogram equalization transforms the probability density function of a low-light image into a uniform distribution using an enhancement function. Pizer et al. [36] proposed Adaptive Histogram Equalization (AHE). AHE divides the image into several subimages, such as multiple 8 × 8 subimages, and then applies histogram equalization to each subimage. Gamma correction maps a low-light image to a reference image using a gamma function. However, due to its simplicity, gamma correction has noticeable drawbacks, such as loss of detail, inappropriate exposure, and unnatural boundary enhancement. To overcome these drawbacks, Bennett et al. [37] enhanced each pixel in a low-light image by adjusting its virtual exposure. Yuan et al. [38] used an S-shaped curve with two parameters to enhance low-light images, automatically adjusting these two parameters based on the image’s brightness.

2.2. Retinex

Land et al. [6] first proposed a low-light image enhancement method based on the imaging principles of the camera, named retinex. The principle of retinex is to decompose the observed image into a reflectance map and an illumination map. Based on this principle, they first introduced SSR (Single-Scale Retinex). SSR estimates the illumination map by applying a Gaussian filter to the observed image and then calculates the reflectance map by deriving it from the observed image and the illumination map. Expanding upon SSR, Rahman et al. [39] introduced MSR (Multi-Scale Retinex), which involves estimating the illumination map through a Gaussian filter at various scales to enhance image details and textures. Nonetheless, akin to SSR, MSR encounters challenges with pronounced color casts in images. To address this issue, Jobson et al. [40] improved MSR and proposed MSRCR (Multi-Scale Retinex with Color Restoration). MSRCR builds upon MSR by introducing a color restoration factor C, to adjust for the color distortion caused by contrast enhancement in local regions of the image. Wang et al. [41] proposed using a dual logarithmic transformation function to map pixels, which helps balance detail and naturalness in the enhanced image.

2.3. Deep Learning

In recent years, the increasing computational power has led to the widespread application of deep learning-based methods in low-light image enhancement tasks. Given that the decomposition process in retinex theory remains unknown and deep learning models exhibit powerful fitting capabilities, Wei et al. [13] utilized a deep learning model to fit the decomposition process in the retinex theory and proposed RetinexNet. Jiang et al. [24] proposed EnlightenGAN, which employs both a local and a global discriminator to evaluate the local and global features of an image, thereby preserving its details and textures in the enhanced image. Cai et al. [42] incorporated the transformer [31] into the retinex model, proposing Retinexformer. As the attention mechanism in transformers is pixel-based, it considerably increases the model’s computational cost. To address this, Retinexformer modifies the attention mechanism from a pixel-based to a channel-based mechanism. However, whether using retinex theory-based methods or end-to-end methods, the essence of these approaches is to use deep learning to fit the mapping relationship from the low-light image to the reference image. Unlike the aforementioned models, Guo et al. [1] employ a series of parametric functions as enhancement functions, with a deep learning model generating the parameters for these functions. These enhancement functions are then applied to map the low-light image to the reference image. Although [1] is not fundamentally different from previous deep learning-based methods, this approach makes the low-light image enhancement process interpretable. However, the parametric functions used in the methods of [1] assign a priori enhancement probability of 0.5 to all pixels, which is clearly unreasonable.

3. Methodology

In this section, we first present the framework and data flow of the Priori DCE, as illustrated in Figure 1a. Subsequently, we provide a detailed description of the model’s key components, including the parameter generation model (Section 3.2), the enhancement function (Section 3.3), and the training process (Section 3.4).

3.1. Pipeline

Since photos taken of the same scene change with variations in lighting intensity, the relationship between an abnormal image and a reference image of the same scene is not one-to-one but many-to-many. Consequently, the input to Priori DCE comprises two components: the low-light image and the priori channels, as depicted on the left side of Figure 1a. The low-light image represents the scene information under abnormal lighting conditions, while the priori channels represent the desired brightness of the enhanced image. To achieve this goal, during the training stage, we use the channel information of the reference image as the priori channels. Unlike the training stage, the inference stage has no corresponding reference image, so the priori channels need to be predefined. Next, we first introduce the data flow during the training stage.
  • The reference image serves as the enhancement target for a low-light image of size $M \times N \times 3$, and the process of deriving the priori channels from it is illustrated in Figure 2. The reference image is first segmented by channel into Chan R, Chan G, and Chan B, denoted as $[R, G, B]$, $R, G, B \in \mathbb{R}^{M \times N \times 1}$. Then, the mean value of each channel is calculated, and the priori channels (Priori R, Priori G, and Priori B) are generated based on these mean values, denoted as $[R_p, G_p, B_p]$, $R_p, G_p, B_p \in \mathbb{R}^{M \times N \times 1}$. This process can be represented by Equation (1).
    $C_p = \mathrm{mean}(C), \quad C \in [R, G, B]$
  • Cross-concat takes the low-light image and the priori channels as inputs and merges them in the order shown in Figure 1a. The merged result is denoted as $I_p$, $I_p \in \mathbb{R}^{M \times N \times 6}$, and the merging process is illustrated in Equation (2).
    $I_p = \mathrm{Concat}([R, R_p, G, G_p, B, B_p])$
  • The parameter generation model processes $I_p$, generating parameters $A_k$ that correspond to a set of enhancement functions for each pixel in the low-light image.
  • The low-light image is enhanced using the obtained enhancement functions to generate the corresponding enhanced image.
  • Calculate the loss value between the enhanced image and the reference image, then perform backpropagation and update the parameters in the parameter generation model.
During the inference stage, in order to enhance the low-light image to the desired brightness, the low-light image is first split by channel, and the average value of each channel is calculated to obtain the corresponding Chan R, Chan G, and Chan B. Finally, Chan R, Chan G, and Chan B are scaled by a factor of $\gamma$ to serve as the priori channels, as shown in the inference stage of Figure 1a.
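To make the pipeline concrete, below is a minimal PyTorch-style sketch of how the priori channels could be built (Equation (1)) and cross-concatenated with the low-light image (Equation (2)). The helper names `build_priori_channels` and `cross_concat` are illustrative and are not taken from the released code; `gamma` plays the role of the scale factor $\gamma$ used at inference.

```python
import torch

def build_priori_channels(image: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    """Build per-channel priori maps (Equation (1)).

    image: (3, M, N) tensor. During training this is the reference image
    (gamma = 1); at inference it is the low-light image itself and gamma
    scales the target brightness, as described in Section 3.1.
    Returns a (3, M, N) tensor whose R/G/B planes are constant and equal
    to gamma times the mean of the corresponding input channel.
    """
    means = image.mean(dim=(1, 2), keepdim=True)          # (3, 1, 1)
    return (gamma * means).expand_as(image).clone()

def cross_concat(low_light: torch.Tensor, priori: torch.Tensor) -> torch.Tensor:
    """Interleave channels as [R, R_p, G, G_p, B, B_p] (Equation (2))."""
    planes = []
    for c in range(3):
        planes.append(low_light[c:c + 1])
        planes.append(priori[c:c + 1])
    return torch.cat(planes, dim=0)                        # (6, M, N)

# Usage: at inference the priori channels come from the low-light image itself.
low = torch.rand(3, 256, 256)
priori = build_priori_channels(low, gamma=2.0)
inp = cross_concat(low, priori)   # fed to the parameter generation model
```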

3.2. Parameter Generation

The parameter generation model is based on UNet [43] and comprises DownSample, UpSample, Concat, Parameter Head, and GA Block$_{C_{in},C_{out}}$. Figure 1b shows the details of DownSample, UpSample, and Parameter Head. DownSample is an average pooling layer with a kernel size of $2 \times 2$, a stride of 2, and padding of 0. Its primary role is to perform $2\times$ downsampling on the feature map, thereby increasing the model's receptive field. UpSample is a deconvolution layer with $C$ input channels, $C/2$ output channels, a kernel size of $2 \times 2$, a stride of 2, and padding of 0. Its primary role is to double the height and width of the feature map while halving the number of channels. The role of Concat is to combine feature maps from multiple scales. The Parameter Head comprises a convolutional layer and an activation layer (Tanh+2). The convolutional layer has 128 input channels, $3 \times k$ output channels, a kernel size of $3 \times 3$, a stride of 1, and padding of 1, where $k$ denotes the number of iterations of the enhancement function. This layer adjusts the channels of the output feature map according to the number of iterations of the enhancement function. The activation layer then scales the output feature map to the range $[1, 3]$. GA Block$_{C_{in},C_{out}}$ refers to a block whose input is a feature map with $C_{in}$ channels and whose output is a feature map with $C_{out}$ channels. Figure 3 illustrates the structure of GA Block$_{C_{in},C_{out}}$ on the left and the core GA on the right. In Figure 3, Conv/Linear$_{C_{in},C_{out}}$ indicates that the input to the Conv/Linear layer is a feature map with $C_{in}$ channels and the output is a feature map with $C_{out}$ channels. A Conv/Linear layer without special notation implies that the input and output feature maps have the same size and number of channels. Notably, GA Block$_{C_{in},C_{out}}$ alters only the number of channels in the feature map (from $C_{in}$ to $C_{out}$), without modifying its size.
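The following is a hedged sketch of the layer hyperparameters listed above; the class name `ParameterHead` and the stand-alone usage are illustrative, and the full UNet-style composition of the generator is omitted.

```python
import torch
import torch.nn as nn

class ParameterHead(nn.Module):
    """Conv(128 -> 3*k, 3x3, stride 1, padding 1) followed by Tanh + 2,
    scaling the outputs to the interval [1, 3] as described above."""
    def __init__(self, k: int):
        super().__init__()
        self.conv = nn.Conv2d(128, 3 * k, kernel_size=3, stride=1, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.tanh(self.conv(x)) + 2.0

# DownSample: 2x average pooling; UpSample: deconvolution that doubles the
# spatial size and halves the channel count.
down = nn.AvgPool2d(kernel_size=2, stride=2, padding=0)
up = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2, padding=0)

feat = torch.rand(1, 128, 64, 64)
pooled = down(feat)                    # (1, 128, 32, 32)
upsampled = up(feat)                   # (1, 64, 128, 128)
params = ParameterHead(k=5)(feat)      # (1, 15, 64, 64): one parameter per pixel per iteration
```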
GA is the core component of GA Block$_{C_{in},C_{out}}$, as depicted on the right side of Figure 3. To illustrate the data processing procedure of GA, we use the feature $f_p$ at the red point $p$ with coordinates $(m, n)$ as an example. The processing steps are outlined as follows.
  • A linear transformation of $f_p$ is performed using the fully connected layer Linear (top) to obtain $tf_p$, as shown in Equation (3).
    $tf_p = \mathrm{Linear}(f_p)$
  • The position of the feature $f_p$ is encoded [32], and the result of the position encoding is added to $f_p$ to obtain $query_p$, as shown in Equation (4).
    $query_p = \mathrm{PosEmb}(f_p) + f_p$
  • The fully connected layer Linear$_{C,8}$ (below) is used to linearly transform $query_p$, mapping $query_p$ with $C$ channels to a positional feature $r_p$ with 8 channels, where $r_p = [m_1, n_1, m_2, n_2, m_3, n_3, m_4, n_4]$, as shown in Equation (5). $r_p$ corresponds to four reference points related to point $p$, namely the cyan point $r_p^1(m_1, n_1)$, the green point $r_p^2(m_2, n_2)$, the pink point $r_p^3(m_3, n_3)$, and the yellow point $r_p^4(m_4, n_4)$.
    $r_p = \mathrm{Linear}_{C,8}(query_p)$
  • The weight $w_i$ of the feature $tf_{r_p^i}$ is calculated from the cosine similarity between the feature $tf_{r_p^i}$ and the feature $tf_p$, as shown in Equation (6).
    $w_i = \dfrac{e^{\cos(tf_p,\, tf_{r_p^i})}}{\sum_{j=1}^{4} e^{\cos(tf_p,\, tf_{r_p^j})}}$
  • The weighted sum of the features at the four reference points is calculated to obtain the output feature at the red point $p$, as shown in Equation (7).
    $output_p = \sum_{i=1}^{4} w_i \cdot tf_{r_p^i}$
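A rough PyTorch sketch of these steps (Equations (3)–(7)) is given below. How the predicted reference coordinates are turned into feature lookups (here, sigmoid-normalized coordinates with nearest-neighbour indexing) and the exact form of the positional embedding are assumptions, since only the overall data flow is specified above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GA(nn.Module):
    """Sketch of the Global-Attention step: for every position p, transform
    its feature, predict 4 reference points, weight their transformed
    features by softmaxed cosine similarity, and output the weighted sum."""
    def __init__(self, channels: int):
        super().__init__()
        self.value = nn.Linear(channels, channels)     # Eq. (3)
        self.pos_emb = nn.Linear(channels, channels)   # stand-in for PosEmb, Eq. (4)
        self.ref = nn.Linear(channels, 8)              # Eq. (5): four (m, n) pairs

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (H, W, C)
        H, W, C = feat.shape
        tf = self.value(feat)                                     # (H, W, C)
        query = self.pos_emb(feat) + feat                         # Eq. (4)
        coords = torch.sigmoid(self.ref(query)).view(H, W, 4, 2)  # normalised coords
        rows = (coords[..., 0] * (H - 1)).round().long()          # (H, W, 4)
        cols = (coords[..., 1] * (W - 1)).round().long()
        ref_feats = tf[rows, cols]                                # (H, W, 4, C)
        sim = F.cosine_similarity(tf.unsqueeze(2), ref_feats, dim=-1)  # Eq. (6)
        w = sim.softmax(dim=-1)                                   # (H, W, 4)
        return (w.unsqueeze(-1) * ref_feats).sum(dim=2)           # Eq. (7)

out = GA(64)(torch.rand(32, 32, 64))   # (32, 32, 64)
```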

3.3. Enhancement Function

3.3.1. Priori Probability

Under conditions of excessively strong or weak lighting, photos may suffer from overexposure or underexposure. The SICE dataset [16] includes a variety of abnormal images taken under different lighting conditions, along with their corresponding reference images. Figure 4 shows the proportion of pixels in the grey, R, G, and B channels that require enhancement (brightening) as brightness changes during the transformation from abnormal images to reference images. It is evident that as pixel brightness increases, the proportion of pixels requiring enhancement gradually decreases. This suggests that during the enhancement of abnormal images, a pixel with higher brightness should have a higher probability of being suppressed, while a pixel with lower brightness should have a higher probability of being enhanced.
The enhancement function is formed by iterating a series of parameterized basis functions. These basis functions are denoted as $f_\alpha$, where $\alpha$ is a parameter satisfying $\alpha \in I$, $I = [s, e]$. For convenience, this paper denotes the enhancement function with $k$ iterations as $f_{A_k}$, where $A_k = [\alpha_1, \alpha_2, \ldots, \alpha_i, \ldots, \alpha_k] \in I^k$, and $\alpha_i$ is the parameter of the basis function at the $i$-th iteration. The enhancement function maps the pixel's brightness from $x$ to $y$. The enhancement functions obtained by iterating the basis function once or twice are shown in Equations (8) and (9), respectively, while enhancement functions with three or more iterations follow a similar form.
$y = f_{A_1}(x) = f_{\alpha_1}(x)$
$y = f_{A_2}(x) = f_{\alpha_2}(f_{\alpha_1}(x))$
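As a concrete illustration of this $k$-fold iteration, below is a hypothetical sketch of how the iterated enhancement function could be applied per pixel; the helper name `apply_enhancement` is illustrative, and the quadratic basis used in the usage example is the one discussed in Section 3.3.2.

```python
import torch

def apply_enhancement(x: torch.Tensor, A: torch.Tensor, base_fn) -> torch.Tensor:
    """Apply the k-times iterated enhancement function f_{A_k} pixel-wise
    (Equations (8) and (9)).

    x:       image tensor with values in [0, 1].
    A:       parameter tensor of shape (k, *x.shape); A[i] holds the
             per-pixel parameter of the i-th iteration.
    base_fn: the parameterised basis function f_alpha, called as base_fn(x, alpha).
    """
    y = x
    for alpha in A:              # iterate the basis function k times
        y = base_fn(y, alpha)
    return y

# Illustration with the quadratic basis of Section 3.3.2 and k = 5:
base = lambda v, a: a * v**2 + (1 - a) * v
img = torch.rand(3, 8, 8)
params = torch.empty(5, *img.shape).uniform_(-1.0, 1.0)
out = apply_enhancement(img, params, base)    # same shape as img
```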
$A_k$ is a $k$-dimensional space spanned over the interval $I$, and the measure $M_{A_k}$ represents the size of the space $A_k$. For example, $A_1$ is a 1-dimensional space, and its corresponding measure $M_{A_1}$ is a length; $A_2$ is a 2-dimensional space, and its corresponding measure $M_{A_2}$ is an area; $A_3$ is a 3-dimensional space, and its corresponding measure $M_{A_3}$ is a volume. Each point in the space $A_k$ represents an enhancement function. For a pixel with brightness $x$, the space $A_k$ is divided into three subspaces based on whether the pixel is enhanced: (1) the subspace where $y = f_{A_k}(x) > x$, corresponding to enhancement functions that increase the pixel's brightness; this subspace is denoted as $A_k^+$; (2) the subspace where $y = f_{A_k}(x) = x$, corresponding to enhancement functions that leave the pixel unchanged; this subspace is denoted as $A_k^0$; (3) the subspace where $y = f_{A_k}(x) < x$, corresponding to enhancement functions that decrease the pixel's brightness; this subspace is denoted as $A_k^-$. It is important to note that, whether increasing, leaving unchanged, or decreasing the pixel's brightness, the overall goal is to improve the image's quality.
Since the parameter $\alpha_i$ is the output of the deep learning model, which is an inherently uninterpretable black box, we assume that the probability of $\alpha_i$ taking any value in the interval $I$ is equal; that is, $\alpha_i$ can be regarded as a random variable uniformly distributed on $I$, denoted as $\alpha_i \sim U(s, e)$. $A_k$ is the joint distribution of $(\alpha_1, \alpha_2, \ldots, \alpha_i, \ldots, \alpha_k)$; thus, the probability density function of $A_k$ is given by Equation (10).
$p(A_k) = \dfrac{1}{M_{A_k}}$
Based on the above analysis, the probability that a pixel with brightness $x$ is enhanced, unchanged, or suppressed is shown in Equation (11).
$p(A_k^+) = \dfrac{M_{A_k^+}}{M_{A_k}}, \quad p(A_k^0) = \dfrac{M_{A_k^0}}{M_{A_k}}, \quad p(A_k^-) = \dfrac{M_{A_k^-}}{M_{A_k}}, \qquad p(A_k^+) + p(A_k^0) + p(A_k^-) = 1$
$p(A_k^+)$, $p(A_k^0)$, and $p(A_k^-)$ depend solely on the basis functions, which are fixed in advance. Consequently, these probabilities are termed the priori enhancement probability, priori unchanged probability, and priori suppression probability of a pixel with brightness $x$, respectively. For simplicity, we collectively refer to them as priori probabilities.
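The priori probabilities in Equation (11) can also be approximated numerically by sampling the parameters uniformly. The following is a hypothetical Monte Carlo sketch (not from the released code) that estimates $p(A_k^+)$ for a given basis function; the demonstration call uses the quadratic basis examined in Section 3.3.2.

```python
import torch

def priori_enhancement_prob(base_fn, x: float, interval, k: int = 1,
                            n_samples: int = 100_000) -> float:
    """Monte Carlo estimate of the priori enhancement probability p(A_k^+)
    for a pixel of brightness x (Equation (11)).

    base_fn(x, alpha) is the parameterised basis function f_alpha,
    `interval` is the parameter range I = [s, e], and k is the number of
    iterations.  Parameters are drawn i.i.d. from U(s, e), matching the
    uniform-distribution assumption behind Equation (10).
    """
    s, e = interval
    alphas = torch.empty(n_samples, k).uniform_(s, e)
    y = torch.full((n_samples,), float(x))
    for i in range(k):                      # iterate the basis function k times
        y = base_fn(y, alphas[:, i])
    return (y > x).float().mean().item()    # fraction of parameter draws that enhance

# Illustration with the quadratic basis discussed in Section 3.3.2:
g = lambda v, a: a * v**2 + (1 - a) * v
print(priori_enhancement_prob(g, x=0.3, interval=(-1.0, 1.0), k=2))
```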

3.3.2. The Shortcomings of the Current Method

Zero DCE employs a series of parameterized quadratic functions $g_\alpha$ as basis functions, as shown in Equation (12).
$y = g_\alpha(x) = \alpha \cdot x^2 + (1 - \alpha) \cdot x$
where $\alpha \in I$, $I = [-1, 1]$.
Figure 5a illustrates the curves of three special basis functions $g_{-1}$, $g_0$, and $g_1$, which also act as the enhancement functions when the basis function is iterated once. The horizontal axis represents the original pixel's brightness, while the vertical axis represents the enhanced brightness. Specifically, the enhancement function $g_0$ coincides with $y = x$. As shown in Figure 5a, for a pixel with an original brightness of $x$, when the parameter $\alpha$ gradually changes from $-1$ to 0, the enhanced brightness decreases from $y_c$ to $y_b$ ($y_b = x$); when the parameter $\alpha$ then gradually changes from 0 to 1, the enhanced brightness decreases further from $y_b$ to $y_a$. The space $A_1$ of the parameter $\alpha$ is 1-dimensional, and its corresponding measure $M_{A_1}$ is a length of 2. At the same time, the measures of its corresponding subspaces $A_1^+$ and $A_1^-$ are both 1. Furthermore, the priori enhancement probability $p(A_1^+)$ and the priori suppression probability $p(A_1^-)$ satisfy $p(A_1^+) = p(A_1^-) = 0.5$.
The enhancement function $g_{A_2}$ is derived by iterating the basis function $g_\alpha$ twice. At this point, the parameter space $A_2$ is a 2-dimensional plane, and the corresponding measure $M_{A_2}$ is a square with an area of 4. Figure 5b shows pixels with different brightness and the corresponding planar space $A_2$. The vertical axis represents the pixel's brightness $x$. The $A_2$ space is a parameter space composed of the parameters $\alpha_1$ and $\alpha_2$. Surfaces of the same color are parallel to the $\alpha_1$-$\alpha_2$ plane, representing the subspace $A_2^+$ of a pixel with brightness $x$, while different colors represent the area of the corresponding surface. It can be seen that as we move upward along the vertical axis, the pixel's brightness $x$ gradually increases, and the area of the corresponding subspace $A_2^+$ decreases from 1.61 (red) to 1.34 (blue).
Figure 6 illustrates the priori enhancement probabilities for the enhancement functions $g_{A_1}$, $g_{A_2}$, and $g_{A_3}$, respectively. It is observed that the priori enhancement probability of $g_{A_1}$ remains constant at 0.5, whereas the priori enhancement probabilities of $g_{A_2}$ and $g_{A_3}$ decrease as brightness increases. Overall, these priori enhancement probabilities remain below 0.5. In other words, the priori suppression probabilities for $g_{A_2}$ and $g_{A_3}$ consistently exceed their priori enhancement probabilities, which is unreasonable.

3.3.3. Solutions

To solve the problem raised in Section 3.3.2, in this section, we propose using the cubic function $h_\alpha$ as the basis function, as shown in Equation (13).
$y = h_\alpha(x) = \alpha \cdot x^3 - 3 \cdot x^2 + (4 - \alpha) \cdot x$
where $\alpha \in I$, $I = [1, 3]$.
Figure 7a shows the curves of the basis functions $h_1$, $h_{3/(x+1)}$, and $h_3$, respectively. It can be seen that for a pixel with brightness $x$, when the parameter $\alpha$ gradually increases from 1 to $3/(x+1)$, its enhanced brightness decreases from $y_c$ to $y_b$. During this process, the pixel's brightness is enhanced. When the parameter $\alpha$ gradually increases from $3/(x+1)$ to 3, the enhanced brightness decreases from $y_b$ to $y_a$. During this process, the pixel's brightness is suppressed. For a pixel with brightness $x$, its priori enhancement probability is $p(A_1^+) = \frac{1.5}{x+1} - 0.5$. In other words, the priori enhancement probability of a pixel with brightness $x$ is adaptive.
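A minimal sketch of the cubic basis function (Equation (13)) and its adaptive priori enhancement probability for $k = 1$ is given below; the Monte Carlo check at the end is purely illustrative and not part of the method.

```python
import torch

def h(x: torch.Tensor, alpha: torch.Tensor) -> torch.Tensor:
    """Cubic basis function of Equation (13)."""
    return alpha * x**3 - 3 * x**2 + (4 - alpha) * x

def p_enhance_k1(x: torch.Tensor) -> torch.Tensor:
    """Analytic priori enhancement probability for k = 1: the measure of
    alpha in [1, 3/(x+1)) divided by the total measure 2."""
    return 1.5 / (x + 1) - 0.5

x = torch.linspace(0.1, 0.9, 5)
print(p_enhance_k1(x))                       # higher for darker pixels

# Monte Carlo check at x = 0.3 against the analytic value (about 0.654):
alphas = torch.empty(100_000).uniform_(1.0, 3.0)
print((h(torch.full_like(alphas, 0.3), alphas) > 0.3).float().mean())
```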
When the basis function $h_\alpha$ is iterated twice to obtain the enhancement function $h_{A_2}$, the corresponding parameter space $A_2$ forms a square with an area of 4. Similar to Figure 5b, Figure 7b shows the enhancement subspace $A_2^+$ corresponding to pixels with different brightness. It can be observed that as $x$ increases, the area of the enhancement subspace $A_2^+$ gradually decreases from 4 (red) to 0.94 (blue).
Figure 6 shows the priori enhancement probabilities of the enhancement functions $h_{A_1}$, $h_{A_2}$, and $h_{A_3}$, respectively. It can be observed that as the pixel's brightness gradually increases, these priori enhancement probabilities decrease from 1 to 0.
From the above analysis, it can be seen that ($h_{A_1}$, $h_{A_2}$, and $h_{A_3}$) are more suitable as enhancement functions than ($g_{A_1}$, $g_{A_2}$, and $g_{A_3}$).

3.4. Training

The Structural Similarity Index Measure (SSIM) quantifies the similarity between two different images in terms of brightness, contrast, and structure. When SSIM is used as a loss function, it is defined by Equation (14):
$L_{SSIM} = 1 - \sum_{j=1}^{J} \dfrac{(2\mu_{I_j}\mu_{\hat{I}_j} + c_1)(\sigma_{I_j,\hat{I}_j} + c_2)}{(\mu_{I_j}^2 + \mu_{\hat{I}_j}^2 + c_1)(\sigma_{I_j}^2 + \sigma_{\hat{I}_j}^2 + c_2)}$
where $j$ represents the index of an image, while $J$ denotes the total number of images. $I_j$ refers to the reference image, and $\hat{I}_j$ corresponds to the enhanced image. $\mu_{I_j}$ and $\mu_{\hat{I}_j}$ are the mean values of $I_j$ and $\hat{I}_j$, respectively. $\sigma_{I_j}$ and $\sigma_{\hat{I}_j}$ represent the variances of $I_j$ and $\hat{I}_j$, respectively, while $\sigma_{I_j,\hat{I}_j}$ is the covariance between $I_j$ and $\hat{I}_j$. $c_1$ and $c_2$ are constants.
The Mean Squared Error (MSE) is employed to quantify the brightness difference between the reference image and the enhanced image. When MSE is used as a loss function, it is defined by Equation (15).
$L_{MSE} = \sum_{j=1}^{J} \sum_{h=1}^{H} \sum_{w=1}^{W} \dfrac{1}{W \cdot H} \left\| I_j(w, h) - \hat{I}_j(w, h) \right\|^2$
where $H$ and $W$ represent the height and width of the image, respectively. $I_j(w, h)$ represents the brightness of the pixel at coordinates $(w, h)$ in the image $I_j$.
The total loss function includes $L_{SSIM}$ and $L_{MSE}$, as shown in Equation (16).
$L = \lambda_{MSE} \cdot L_{MSE} + \lambda_{SSIM} \cdot L_{SSIM}$
where $\lambda_{MSE}$ and $\lambda_{SSIM}$ represent the weights of the loss functions $L_{MSE}$ and $L_{SSIM}$, which are set to 20 and 5, respectively, in this paper.
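Below is a hedged PyTorch sketch of the combined loss of Equation (16). The SSIM term here uses standard whole-image statistics (including the usual $2\sigma$ term in the numerator) rather than the exact form of Equation (14), and the function names are illustrative.

```python
import torch
import torch.nn.functional as F

def ssim_loss(pred: torch.Tensor, target: torch.Tensor,
              c1: float = 0.01**2, c2: float = 0.03**2) -> torch.Tensor:
    """SSIM loss from whole-image statistics; pred and target are batches of
    shape (B, C, H, W) with values in [0, 1].  A windowed SSIM (sliding
    Gaussian window) would be a drop-in alternative."""
    dims = (1, 2, 3)
    mu_p, mu_t = pred.mean(dims), target.mean(dims)
    var_p = pred.var(dims, unbiased=False)
    var_t = target.var(dims, unbiased=False)
    cov = ((pred - mu_p.view(-1, 1, 1, 1)) * (target - mu_t.view(-1, 1, 1, 1))).mean(dims)
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / \
           ((mu_p**2 + mu_t**2 + c1) * (var_p + var_t + c2))
    return 1 - ssim.mean()

def total_loss(pred: torch.Tensor, target: torch.Tensor,
               lam_mse: float = 20.0, lam_ssim: float = 5.0) -> torch.Tensor:
    """Weighted sum of the MSE and SSIM terms (Equation (16))."""
    return lam_mse * F.mse_loss(pred, target) + lam_ssim * ssim_loss(pred, target)

loss = total_loss(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
```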

4. Experiments

4.1. Experimental Details

4.1.1. Implementation Details

All experiments in this paper were conducted on a platform with the Ubuntu 22.04 operating system, and the hardware environment consists of an Intel i7-13700K CPU, an NVIDIA RTX 4090 24 GB GPU, and 32 GB RAM. Priori DCE is implemented based on the PyTorch framework. To accelerate the convergence of the model training process, the optimizer is set to AdamW [44] with a weight decay of $1 \times 10^{-4}$, and the batch size, number of workers, epochs, and initial learning rate are set to 1, 4, 120, and $2 \times 10^{-5}$, respectively. The learning rate is reduced to 0.8 of its previous value every 10 epochs.
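The training configuration above corresponds to the following minimal PyTorch sketch; the Conv2d module is a placeholder standing in for the parameter generation network.

```python
import torch

model = torch.nn.Conv2d(6, 15, kernel_size=3, padding=1)   # placeholder network
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5, weight_decay=1e-4)
# Reduce the learning rate to 0.8x of its previous value every 10 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.8)

for epoch in range(120):
    # ... one pass over the training pairs with batch size 1 goes here ...
    scheduler.step()
```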

4.1.2. Datasets

In recent years, several low-light image datasets have been proposed, which are mainly classified into two categories: paired datasets (with reference) and unpaired datasets (without reference).
The paired datasets used in the experiments include LOLv1 [13], LOLv2 [14], and LSRW [15]. LOLv1 is the first dataset that includes images from real-world scenarios and contains 485 training pairs and 15 validation pairs. The LOLv2 dataset comprises two subsets: LOLv2-Real and LOLv2-Synthetic. LOLv2-Real contains 689 training pairs and 100 validation pairs, while LOLv2-Synthetic includes 900 training pairs and 100 validation pairs. The LSRW dataset also includes two subsets: LSRW-Huawei and LSRW-Nikon. LSRW-Huawei, collected with a Huawei phone, consists of 2450 training pairs and 30 validation pairs. LSRW-Nikon, collected with a Nikon camera, contains 3150 training pairs and 20 validation pairs. The unpaired datasets used in the experiments are DICM [45], LIME [8], MEF [46], and NPE [41], which contain 69, 10, 17, and 85 low-light images, respectively.

4.1.3. Metrics

The evaluation on paired datasets primarily relies on two metrics: PSNR (Peak Signal-to-Noise Ratio; higher is better) and SSIM (Structural Similarity; higher is better). PSNR quantifies the ratio of signal to noise in an image, while SSIM measures the similarity between the enhanced image and the reference image in terms of luminance, contrast, and structure. Notably, SSIM is expressed as a percentage (%) in the experiments. For unpaired datasets, the evaluation primarily utilizes NIQE [47] (Natural Image Quality Evaluator; lower is better). On the one hand, the metrics for each image in the dataset are computed and averaged to assess the model's performance, denoted as PSNR$_m$, SSIM$_m$, and NIQE$_m$, respectively. On the other hand, to evaluate the stability of enhancement performance, we use the standard deviation instead of the mean, denoted as PSNR$_s$, SSIM$_s$, and NIQE$_s$, respectively.
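The mean/standard-deviation aggregation can be summarized by the following sketch, shown here for PSNR; the formula is the standard PSNR definition, and `pairs` is dummy data standing in for (enhanced, reference) image pairs.

```python
import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> float:
    """Standard PSNR in dB for images scaled to [0, max_val]."""
    mse = torch.mean((pred - target) ** 2)
    return (10 * torch.log10(max_val ** 2 / mse)).item()

# Mean (PSNR_m) and standard deviation (PSNR_s) over a validation set.
pairs = [(torch.rand(3, 64, 64), torch.rand(3, 64, 64)) for _ in range(5)]  # dummy data
scores = torch.tensor([psnr(p, t) for p, t in pairs])
print(scores.mean().item(), scores.std().item())
```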

4.2. Ablation Study

4.2.1. The Impact of Loss Function Strategies

In order to enhance the model's capability for low-light image enhancement, the loss function used during training is composed of two components: $L_{MSE}$ and $L_{SSIM}$. To evaluate the impact of $L_{MSE}$ and $L_{SSIM}$, the model is first trained using $L_{MSE}$ as the sole loss function. Subsequently, both $L_{MSE}$ and $L_{SSIM}$ are employed together as the loss function for training.
Table 1 shows the quantitative performance of Priori DCE trained with different loss function strategies on the LOLv1, LOLv2-Real, and LOLv2-Synthetic datasets. As can be seen from Table 1, compared to using $L_{MSE}$ alone as the loss function, the inclusion of $L_{SSIM}$ improves Priori DCE's enhancement capability to some extent.
Figure 8 visually presents the enhancement results using different loss function strategies. As shown, when $L_{MSE}$ is used alone, both the brightness and visual quality of the enhanced image improve significantly. However, some blurring is observed on the surfaces of the wooden floor and the chair. Incorporating $L_{SSIM}$ into the loss function alleviates this shortcoming, leading to more natural texture transitions. Notably, when both $L_{MSE}$ and $L_{SSIM}$ are combined, the enhanced image outperforms the reference image in terms of the NIQE metric. In the reference image, the texture transition on the wooden floor is relatively sharp, whereas in the enhanced image, the texture becomes noticeably softer, improving the overall visual quality.

4.2.2. The Impact of Hyperparameter k

As the hyperparameter $k$ (the number of iterations of the basis function) increases, the range of pixel brightness $x$ that can be mapped becomes broader, allowing the enhanced pixel to become either brighter or darker. Figure 9 demonstrates the enhancement performance of Priori DCE with different $k$ on the LOLv1 dataset. The x-axis represents the hyperparameter $k$, and the y-axis shows the corresponding evaluation metrics (PSNR$_m$, SSIM$_m$, and NIQE$_m$). It is evident that as $k$ increases from 1 to 5, Priori DCE's enhancement performance gradually improves, after which it stabilizes. Figure 10 visually presents the enhancement results of Priori DCE on two low-light images (Bowling Alley and Kitchen Cabinet) with different $k$. The numbers $(a|b|c)$ at the top of the images indicate the corresponding PSNR, SSIM, and NIQE, respectively. It can be observed that when $k$ is 1, although the brightness of the enhanced image improves relative to the original low-light image, the issue of underexposure remains. This is due to the limited range of enhanced pixel values mapped by the corresponding enhancement function. As $k$ increases from 1 to 5, this issue is alleviated. The corresponding enhanced images gradually brighten, and the image quality improves significantly, with the NIQE decreasing from 5.497 and 6.044 to 2.726 and 3.769, respectively. However, increasing $k$ does not improve enhancement performance indefinitely. Once $k$ exceeds 5, the brightness and quality of the enhanced images stabilize, with the NIQE changing from 2.726 and 3.769 to 2.823 and 4.166, respectively.

4.2.3. Model Structure

To verify the role of the different modules in the proposed Priori DCE, we use a plain model without any special modules as the baseline and progressively incorporate the various modules. Training and testing are performed on the LOLv1, LOLv2-Real, and LOLv2-Synthetic datasets, with quantitative results presented in Table 2. As shown in Table 2, compared to the baseline, the inclusion of the priori channels improves model performance across all three datasets. The priori channels specify the direction of enhancement for the low-light image, so the enhanced image is more similar to the reference image. Subsequently, incorporating GA further boosts performance, as the enhancement of each pixel in the low-light image is no longer confined to local regions but can be inferred globally. Finally, the addition of the priori probability further improves enhancement performance by providing adaptive priori enhancement probabilities that vary with pixel brightness.

4.2.4. The Impact of Scale Factor γ

To investigate the effect of the scale factor $\gamma$ on enhancement performance, experiments were conducted on the DICM, LIME, MEF, and NPE datasets, with the results presented in Figure 11 (top). As shown, optimal performance is achieved on DICM, LIME, MEF, and NPE when $\gamma$ is set to 1.1, 2.9, 3.5, and 1, respectively. This is due to the variability in the brightness of different low-light images, as well as the non-uniqueness of the reference images corresponding to them; therefore, the corresponding priori channels also differ. Figure 11 (bottom) shows the proportion of pixels with different brightness in these four datasets. It can be seen that the brightness in the LIME and MEF datasets is mainly concentrated in the low-light regions. Therefore, to achieve a good enhancement effect, a larger $\gamma$ is required. In contrast, the brightness distribution in the DICM and NPE datasets is more balanced, so a smaller $\gamma$ value is sufficient.
$\gamma$ values of 0.5, 1, 2, and 4 indicate that the brightness of the enhanced image is 0.5, 1, 2, and 4 times that of the original low-light image, respectively. Figure 12 visualizes the enhancement results corresponding to these $\gamma$ values, with the different colored numbers representing the average brightness of the corresponding channels. As shown in Figure 12, the average brightness of the three channels of the original low-light image is 0.27, 0.32, and 0.33, respectively. The three channels of the enhanced image corresponding to a $\gamma$ of 0.5 are 0.18, 0.22, and 0.21, respectively, which is significantly lower than that of the original low-light image. The enhanced image for a $\gamma$ of 1 has almost the same brightness as the original low-light image. When $\gamma$ is 2, the brightness of the enhanced image is close to 0.5, and the quality of the enhanced image is also the best. When $\gamma$ is 4, the corresponding enhanced image is significantly overexposed. It can therefore be seen that the output of the model can be controlled by adjusting $\gamma$, so that enhanced images with different brightness can be generated for different low-light images.

4.3. Performance Evaluation

4.3.1. Reference Evaluation

The LOLv1 [13], LOLv2 [14], and LSRW [15] datasets are currently popular paired datasets. In this section, we evaluate several recent SOTA methods along with Priori DCE on these paired datasets, with the results presented in Table 3. As shown, EnlightenGAN, MELLEN-IC, Retinexformer, LLFlow, and Priori DCE achieve the best performance, with Priori DCE demonstrating the strongest enhancement performance. Notably, when comparing Priori DCE with the second-best performing Retinexformer on the LOLv2-Synthetic dataset, the PSNR$_m$ and SSIM$_m$ improve from 25.67 and 92.82 to 29.49 and 93.6, respectively, while the NIQE$_m$ for image quality decreases slightly from 3.94 to 3.91. Additionally, Table 3 reports the NIQE$_m$ for both the low-light images (Low) and reference images (Reference). It is evident that some methods, such as MELLEN-IC, Retinexformer, LLFlow, and Priori DCE, produce enhanced images with NIQE$_m$ lower than those of the reference images. This can be attributed to the fact that low-light image enhancement models are trained on large datasets, most of which contain high-quality reference images (lower NIQE). Consequently, these models not only learn how to enhance brightness but also improve the overall image quality.
Table 3 also reports the complexity and running speed of the different models. DeepUPE has a very small number of parameters, and its running speed even reaches 2426; in exchange, its enhancement quality is poor. Since Priori DCE introduces the GA attention mechanism, its running speed drops from 24 (Zero DCE) to 11.67; in exchange, its enhancement quality clearly surpasses that of Zero DCE. Thus, although Priori DCE improves enhancement quality, improving its operational efficiency is an important direction for future research.
Figure 13, Figure 14, Figure 15, Figure 16 and Figure 17 visually display the enhancement results on different datasets. The numbers ( a | b | c ) at the top of these images correspond to the PSNR, SSIM, and NIQE values of the respective images. The histograms at the bottom represent the frequency distribution of brightness in the R, G, and B channels for the corresponding images.
Figure 13 shows the low-light image of a swimming pool scene from the LOLv1 dataset and the corresponding enhanced images generated by several SOTA methods. As shown in the figure, the brightness of BL, DeepUPE, and Zero DCE is relatively low. Correspondingly, their brightness distribution histograms are concentrated primarily in the lower brightness range on the left. The enhancement quality of these three methods is even worse than that of the original low-light image. In contrast, EnlightenGAN, GLADNet, LIME, RetinexNet, and Zero DCE++ exhibit significant improvements in brightness. However, while the brightness is increased, different levels of noise are also introduced, with the noise in the trees and the surrounding areas of the enhanced image being particularly noticeable. RetinexNet even suffers from serious stylization issues. This leads to a decrease in enhancement quality. The performance of KinD, KinD++, MELLEN-IC, Retinexformer, LLFlow, and Priori DCE is relatively better. In the reference image, the text on the clock is brighter. In the enhanced images from KinD, KinD++, MELLEN-IC, Retinexformer, and LLFlow, there is a noticeable color shift in the clock text. Meanwhile, thanks to the inclusion of the priori channels, the enhancement of the clock text by Priori DCE is closest to the reference.
Figure 14 shows the low-light image of a grassland scene from the LOLv2-Real dataset and the corresponding enhanced images generated by several SOTA methods. BL and DeepUPE still have weak brightness, and their corresponding frequency distribution histograms are mainly concentrated in the low-brightness region. EnlightenGAN is notably yellowish. In fact, the brightness of the reference image is also low, but the brightness of the enhanced images generated by GLADNet, KinD, KinD++, RetinexNet, and Zero DCE++ is significantly increased, even surpassing that of the reference. The enhancement quality of these methods is therefore not satisfactory, as their NIQE is greater than that of the reference and even exceeds 7. In comparison, Retinexformer, LLFlow, and Priori DCE are closer to the reference. However, judging from the enhancement results of the grass area within the cyan and green boxes in the image, the Retinexformer result is relatively blurry, while the LLFlow and Priori DCE results are more satisfactory.
Figure 15 shows the low-light image of a building from the LOLv2-Synthetic dataset and the corresponding enhanced images generated by several SOTA methods. It can be observed that the brightness of the reference is high, and its frequency distribution histogram is more evenly distributed. In the rooftop area within the cyan box, the colors in the reference image are predominantly composed of gray and warm yellow tones. Among these methods, only MELLEN-IC, Retinexformer, and Priori DCE are closest to the reference, while the colors of the other methods show significant deviations. In BL, LIME, and LLFlow, the transition between the roof and the sky even shows a noticeable white area.
Figure 16 shows the low-light image of the rest room scene from the LSRW-Huawei dataset and the corresponding enhanced images generated by several SOTA methods. The brightness of BL and DeepUPE is relatively weak. EnlightenGAN, GLADNet, KinD++, and LIME exhibit significant color deviation and generate excessive noise. RetinexNet still suffers from stylization issues. MELLEN-IC and Zero DCE++ exhibit higher brightness than the reference, which results in lower PSNR and SSIM values for these methods. This occurs because these methods focus solely on enhancing the low-light image without accounting for the reference. Among these methods, Priori DCE most closely matches the reference image, successfully improving both image quality and brightness.
Figure 17 shows the low-light image of a multi-floor building from the LSRW-Nikon dataset and the corresponding enhanced images generated by several SOTA methods. It is evident that the building areas in the reference image are relatively bright. Among these methods, only LLFlow and Priori DCE yield satisfactory results in both brightness and enhancement quality. The other methods either produce insufficient brightness or exhibit noticeable color distortions. In the window highlighted by the cyan box, Priori DCE introduces more noise, resulting in lower enhancement quality compared to LLFlow.
The enhancement results above indicate that Retinexformer, LLFlow, and Priori DCE achieve good enhancement quality. However, the performance of Retinexformer and LLFlow is unstable. For instance, Retinexformer produces a darker enhanced image on the LSRW-Nikon dataset, while LLFlow introduces white transitions in the image from the LOLv2-Synthetic dataset. In contrast, Priori DCE demonstrates a clear advantage in both enhancement quality and stability across these datasets.

4.3.2. No-Reference Evaluation

DICM, LIME, MEF, and NPE are currently popular datasets without reference images. Similar to Section 4.3.1, we conduct experiments on these datasets using the same SOTA methods, with the results presented in Table 4. Since these datasets do not have reference images, we use NIQE$_m$ and NIQE$_s$ as evaluation metrics. Additionally, we calculate the average results across the different datasets, as shown in the avg column of Table 4. As shown in Table 4, the top three models are EnlightenGAN, MELLEN-IC, and Priori DCE. Among these, Priori DCE shows a clear advantage in both enhancement quality and stability. Compared to MELLEN-IC, which ranks second in the avg metric, Priori DCE reduces the NIQE$_m$ and NIQE$_s$ scores from 3.15 and 0.914 to 3.031 and 0.839, respectively. This demonstrates the effectiveness and stability advantages of Priori DCE.
Figure 18 shows the low-light image of the lunar landing scene from the DICM dataset and the corresponding enhanced images generated by several SOTA methods. It can be observed that RetinexNet still exhibits obvious stylization, while the astronaut’s color in LLFlow and GLADNet has turned white. The top three methods with the best enhancement quality are Priori DCE, MELLEN-IC, and EnlightenGAN, with corresponding NIQE values of 3.05, 2.778, and 2.34, respectively. From the zoomed-in views of the radar and wheels, it can be seen that EnlightenGAN provides the highest enhancement quality, preserving rich details while increasing brightness. Similar to EnlightenGAN, MELLEN-IC increases brightness further but sacrifices some details. Priori DCE, while retaining details, has insufficient brightness.
Figure 19 shows the low-light image of a streetlight scene from the LIME dataset and the corresponding enhanced images generated by several SOTA methods. From the zoomed-in views of the buildings and streetlight, it can be seen that GLADNet, KinD++, LIME, RetinexNet, and Zero DCE exhibit the most significant noise. LLFlow produces the brightest image with only a small amount of noise. Priori DCE introduces the least noise around the streetlight and provides the most natural transition between the streetlight and its surrounding environment.
Figure 20 shows the low-light image of Venice from the MEF dataset and the corresponding enhanced images generated by several SOTA methods. Notably, Priori DCE achieves the highest brightness while maintaining strong enhancement quality. Furthermore, the enhancement quality of EnlightenGAN, LIME, and Retinexformer is also relatively high, primarily due to the lower brightness of their enhanced images.
Figure 21 shows the low-light image of a forest from the NPE dataset and the corresponding enhanced images generated by several SOTA methods. The zoomed-in view of the sky reveals that BL, DeepUPE, GLADNet, Zero DCE++, and LLFlow largely overlook its texture, whereas RetinexNet and Priori DCE provide the most effective enhancement of the sky's texture. In the zoomed-in view of the forest, the brightness of DeepUPE, GLADNet, Retinexformer, LLFlow, and Priori DCE is relatively low, while the other methods are noticeably overexposed. KinD appears somewhat blurred, neglecting the texture. RetinexNet appears highly unnatural.

5. Conclusions

This paper incorporates priori knowledge into low-light image enhancement through the use of priori channels and priori enhancement probabilities, and names the resulting method Priori DCE. The priori channels make the brightness of the enhanced image adjustable, while the priori enhancement probabilities are adaptively adjusted according to the brightness of individual pixels. Additionally, the GA module is integrated to facilitate interactions between pixels in the enhanced image, promoting visual balance. Experimental results demonstrate the superior performance of the proposed Priori DCE. However, the introduction of the GA module improves enhancement quality at the cost of reduced operational efficiency. Addressing this trade-off is one of the main directions for future research.

Author Contributions

Z.C.: Data analysis, writing original draft; Y.L.: Supervision, review and editing; J.X.: Supervision; K.L.: Supervision; Z.H.: Review. All authors have read and agreed to the published version of the manuscript.

Funding

This paper is supported by Natural Science Foundation of Guangdong Province Youth Enhancement Project (2023A1515030120) and State Key Laboratory of Subtropical Building and Urban Science (2023KA04).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The datasets used in this work are openly available.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Guo, C.; Li, C.; Guo, J.; Loy, C.C.; Hou, J.; Kwong, S.; Cong, R. Zero-Reference Deep Curve Estimation for Low-Light Image Enhancement. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 1777–1786. [Google Scholar] [CrossRef]
  2. Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar] [CrossRef]
  3. Meinhardt, T.; Kirillov, A.; Leal-Taixé, L.; Feichtenhofer, C. TrackFormer: Multi-Object Tracking with Transformers. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 8834–8844. [Google Scholar] [CrossRef]
  4. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar] [CrossRef]
  5. Cheng, H.; Shi, X. A simple and effective histogram equalization approach to image enhancement. Digit. Signal Process. 2004, 14, 158–170. [Google Scholar] [CrossRef]
  6. Land, E.H.; McCann, J.J. Lightness and Retinex Theory. J. Opt. Soc. Am. 1971, 61, 1–11. [Google Scholar] [CrossRef]
  7. Zhang, Y.; Guo, X.; Ma, J.; Liu, W.; Zhang, J. Beyond Brightening Low-light Images. Int. J. Comput. Vis. 2021, 129, 1013–1037. [Google Scholar] [CrossRef]
  8. Guo, X.; Li, Y.; Ling, H. LIME: Low-Light Image Enhancement via Illumination Map Estimation. IEEE Trans. Image Process. 2017, 26, 982–993. [Google Scholar] [CrossRef]
  9. Fu, X.; Zeng, D.; Huang, Y.; Zhang, X.P.; Ding, X. A Weighted Variational Model for Simultaneous Reflectance and Illumination Estimation. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2782–2790. [Google Scholar] [CrossRef]
  10. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  11. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
  12. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar] [CrossRef]
  13. Wei, C.; Wang, W.; Yang, W.; Liu, J. Deep Retinex Decomposition for Low-Light Enhancement. In Proceedings of the British Machine Vision Conference, Newcastle, UK, 3–6 September 2018; British Machine Vision Association: Durham, UK, 2018. [Google Scholar]
  14. Yang, W.; Wang, W.; Huang, H.; Wang, S.; Liu, J. Sparse Gradient Regularized Deep Retinex Network for Robust Low-Light Image Enhancement. IEEE Trans. Image Process. 2021, 30, 2072–2086. [Google Scholar] [CrossRef] [PubMed]
  15. Hai, J.; Xuan, Z.; Yang, R.; Hao, Y.; Zou, F.; Lin, F.; Han, S. R2RNet: Low-light image enhancement via Real-low to Real-normal Network. J. Vis. Commun. Image Represent. 2023, 90, 103712. [Google Scholar] [CrossRef]
  16. Cai, J.; Gu, S.; Zhang, L. Learning a Deep Single Image Contrast Enhancer from Multi-Exposure Images. IEEE Trans. Image Process. 2018, 27, 2049–2062. [Google Scholar] [CrossRef]
  17. Gharbi, M.; Chen, J.; Barron, J.T.; Hasinoff, S.W.; Durand, F. Deep bilateral learning for real-time image enhancement. ACM Trans. Graph. 2017, 36, 1–12. [Google Scholar] [CrossRef]
  18. Wang, R.; Zhang, Q.; Fu, C.W.; Shen, X.; Zheng, W.S.; Jia, J. Underexposed Photo Enhancement Using Deep Illumination Estimation. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 6842–6850. [Google Scholar] [CrossRef]
  19. Zhang, Y.; Zhang, J.; Guo, X. Kindling the Darkness: A Practical Low-light Image Enhancer. In Proceedings of the 27th ACM International Conference on Multimedia, MM ’19, Nice, France, 21–25 October 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 1632–1640. [Google Scholar] [CrossRef]
  20. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
  21. Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5967–5976. [Google Scholar] [CrossRef]
  22. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2242–2251. [Google Scholar] [CrossRef]
  23. Liu, Y.; Wang, Z.; Zeng, Y.; Zeng, H.; Zhao, D. PD-GAN: Perceptual-Details GAN for Extremely Noisy Low Light Image Enhancement. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 1840–1844. [Google Scholar] [CrossRef]
  24. Jiang, Y.; Gong, X.; Liu, D.; Cheng, Y.; Fang, C.; Shen, X.; Yang, J.; Zhou, P.; Wang, Z. EnlightenGAN: Deep Light Enhancement Without Paired Supervision. IEEE Trans. Image Process. 2021, 30, 2340–2349. [Google Scholar] [CrossRef] [PubMed]
  25. Fan, G.D.; Fan, B.; Gan, M.; Chen, G.Y.; Chen, C.L.P. Multiscale Low-Light Image Enhancement Network with Illumination Constraint. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 7403–7417. [Google Scholar] [CrossRef]
  26. Li, C.; Guo, C.; Loy, C.C. Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 4225–4238. [Google Scholar] [CrossRef]
  27. Luo, W.; Li, Y.; Urtasun, R.; Zemel, R. Understanding the effective receptive field in deep convolutional neural networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS’16, Barcelona, Spain, 5–10 December 2016; Curran Associates Inc.: Red Hook, NY, USA, 2016; pp. 4905–4913. [Google Scholar]
  28. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar] [CrossRef]
  29. Gao, S.H.; Cheng, M.M.; Zhao, K.; Zhang, X.Y.; Yang, M.H.; Torr, P. Res2Net: A New Multi-Scale Backbone Architecture. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 652–662. [Google Scholar] [CrossRef]
  30. Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Proceedings of the Computer Vision—ECCV 2020, Glasgow, UK, 23–28 August 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M., Eds.; Springer Nature: Cham, Switzerland, 2020; pp. 213–229. [Google Scholar]
  31. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 6000–6010. [Google Scholar]
  32. Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection. In Proceedings of the 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Vienna, Austria, 3–7 May 2021. [Google Scholar]
  33. Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable Convolutional Networks. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 764–773. [Google Scholar] [CrossRef]
  34. Zhu, X.; Hu, H.; Lin, S.; Dai, J. Deformable ConvNets V2: More Deformable, Better Results. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 9300–9308. [Google Scholar] [CrossRef]
  35. Wang, W.; Wei, C.; Yang, W.; Liu, J. GLADNet: Low-Light Enhancement Network with Global Awareness. In Proceedings of the 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an, China, 15–19 May 2018; pp. 751–755. [Google Scholar] [CrossRef]
  36. Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; ter Haar Romeny, B.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vision Graph. Image Process. 1987, 39, 355–368. [Google Scholar] [CrossRef]
  37. Bennett, E.P.; McMillan, L. Video enhancement using per-pixel virtual exposures. In Proceedings of the ACM SIGGRAPH 2005 Papers, SIGGRAPH ’05, Los Angeles, CA, USA, 31 July–4 August 2005; Association for Computing Machinery: New York, NY, USA, 2005; pp. 845–852. [Google Scholar] [CrossRef]
  38. Yuan, L.; Sun, J. Automatic Exposure Correction of Consumer Photographs. In Proceedings of the Computer Vision—ECCV 2012, Florence, Italy, 7–13 October 2012; Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 771–785. [Google Scholar]
  39. Rahman, Z.; Jobson, D.; Woodell, G. Multi-scale retinex for color image enhancement. In Proceedings of the 3rd IEEE International Conference on Image Processing, Lausanne, Switzerland, 19 September 1996; Volume 3, pp. 1003–1006. [Google Scholar] [CrossRef]
  40. Jobson, D.; Rahman, Z.; Woodell, G. A multiscale retinex for bridging the gap between color images and the human observation of scenes. IEEE Trans. Image Process. 1997, 6, 965–976. [Google Scholar] [CrossRef] [PubMed]
  41. Wang, S.; Zheng, J.; Hu, H.M.; Li, B. Naturalness Preserved Enhancement Algorithm for Non-Uniform Illumination Images. IEEE Trans. Image Process. 2013, 22, 3538–3548. [Google Scholar] [CrossRef]
  42. Cai, Y.; Bian, H.; Lin, J.; Wang, H.; Timofte, R.; Zhang, Y. Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–3 October 2023; pp. 12504–12513. [Google Scholar]
  43. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F., Eds.; Springer Nature: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar]
  44. Loshchilov, I.; Hutter, F. Decoupled Weight Decay Regularization. In Proceedings of the International Conference on Learning Representations, Toulon, France, 24–26 April 2017. [Google Scholar]
  45. Lee, C.; Lee, C.; Kim, C.S. Contrast Enhancement Based on Layered Difference Representation of 2D Histograms. IEEE Trans. Image Process. 2013, 22, 5372–5384. [Google Scholar] [CrossRef]
  46. Ma, K.; Zeng, K.; Wang, Z. Perceptual Quality Assessment for Multi-Exposure Image Fusion. IEEE Trans. Image Process. 2015, 24, 3345–3356. [Google Scholar] [CrossRef]
  47. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “Completely Blind” Image Quality Analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212. [Google Scholar] [CrossRef]
  48. Ma, L.; Jin, D.; An, N.; Liu, J.; Fan, X.; Luo, Z.; Liu, R. Bilevel Fast Scene Adaptation for Low-Light Image Enhancement. Int. J. Comput. Vis. 2023. [Google Scholar] [CrossRef]
  49. Wang, Y.; Wan, R.; Yang, W.; Li, H.; Chau, L.P.; Kot, A. Low-Light Image Enhancement with Normalizing Flow. Proc. AAAI Conf. Artif. Intell. 2022, 36, 2604–2612. [Google Scholar] [CrossRef]
Figure 1. (a) The framework of Priori DCE. (b) Implementation details in (a).
Figure 2. The process of deriving the priori channels during the training stage.
Figure 3. Global-Attention Block.
Figure 4. The ratio of pixels with different brightness that are enhanced during the process of transforming abnormal images to reference images in the SICE dataset.
Figure 5. (a,b) illustrate the enhancement spaces of the enhancement functions g_{A1} and g_{A2}, respectively.
Figure 6. The prior enhancement probability of different enhancement functions.
Figure 7. (a,b) illustrate the enhancement spaces of the enhancement functions h_{A1} and h_{A2}, respectively.
Figure 8. The visualization of the enhancement results of Priori DCE on the same low-light image under different loss function strategies. * indicates that the value does not exist.
Figure 9. The enhancement performance of Priori DCE on the LOLv1 dataset as k gradually increases from 1 to 8.
Figure 10. The enhanced images generated by the corresponding Priori DCE as k gradually increases from 1 to 8.
Figure 11. (Top): the enhancement performance of Priori DCE with different scale factors γ on the DICM, LIME, MEF, and NPE datasets, respectively. (Bottom): the ratio of pixels with different brightness in these four datasets.
Figure 12. The enhanced results of Priori DCE when γ is set to 0.5, 1, 2, and 4, respectively. The numbers in red, green, and blue represent the average value of the corresponding channel.
Figure 13. The enhancement visualization on the LOLv1 dataset.
Figure 14. The enhancement visualization on the LOLv2-Real dataset.
Figure 15. The enhancement visualization on the LOLv2-Synthetic dataset.
Figure 16. The enhancement visualization on the LSRW-Huawei dataset.
Figure 17. The enhancement visualization on the LSRW-Nikon dataset.
Figure 18. The enhancement visualization on the DICM dataset.
Figure 19. The enhancement visualization on the LIME dataset.
Figure 20. The enhancement visualization on the MEF dataset.
Figure 21. The enhancement visualization on the NPE dataset.
Table 1. The enhancement performance of Priori DCE under different loss function strategies.

Loss | LOLv1 (PSNR_m / SSIM_m / NIQE_m) | LOLv2-Real (PSNR_m / SSIM_m / NIQE_m) | LOLv2-Synthetic (PSNR_m / SSIM_m / NIQE_m)
L_MSE | 25.015 / 75.102 / 4.444 | 25.825 / 73.812 / 5.357 | 28.326 / 91.380 / 3.869
L_MSE + L_SSIM | 25.775 / 81.222 / 3.572 | 26.828 / 80.791 / 4.381 | 29.488 / 93.598 / 3.911
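As a point of reference for the L_MSE + L_SSIM strategy compared in Table 1, the sketch below shows one common way such a combined loss can be formed in PyTorch. It is an illustration only: it assumes the SSIM term takes the standard 1 − SSIM form computed over an 11 × 11 uniform window with equal weighting of the two terms, and the helper names ssim and combined_loss are our own; the paper's exact SSIM implementation and term weighting are not restated here.

```python
# Illustrative sketch only: a pixel-wise MSE term plus a (1 - SSIM) term,
# mirroring the "L_MSE + L_SSIM" row of Table 1 under the assumptions stated above.
import torch
import torch.nn.functional as F


def ssim(x: torch.Tensor, y: torch.Tensor, c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    """Mean SSIM over an 11 x 11 uniform window; x and y are (N, C, H, W) tensors in [0, 1]."""
    mu_x = F.avg_pool2d(x, 11, stride=1, padding=5)
    mu_y = F.avg_pool2d(y, 11, stride=1, padding=5)
    var_x = F.avg_pool2d(x * x, 11, stride=1, padding=5) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, 11, stride=1, padding=5) - mu_y ** 2
    cov_xy = F.avg_pool2d(x * y, 11, stride=1, padding=5) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return ssim_map.mean()


def combined_loss(enhanced: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    """L_MSE + L_SSIM: MSE plus (1 - SSIM) between the enhanced and reference images."""
    return F.mse_loss(enhanced, reference) + (1.0 - ssim(enhanced, reference))
```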
Table 2. The enhancement performance of the models after adding different modules to the baseline. Red font indicates that the item is optimal for the column.

Priori Channels | GA | Priori Probability | LOLv1 (PSNR_m / SSIM_m / NIQE_m) | LOLv2-Real (PSNR_m / SSIM_m / NIQE_m) | LOLv2-Synthetic (PSNR_m / SSIM_m / NIQE_m)
baseline | | | 21.002 / 77.737 / 3.675 | 22.028 / 76.907 / 4.290 | 21.495 / 89.793 / 3.879
 | | | 22.408 / 76.945 / 3.823 | 23.877 / 76.815 / 4.382 | 22.453 / 90.580 / 3.818
 | | | 24.376 / 79.162 / 3.819 | 25.074 / 78.529 / 4.394 | 29.115 / 93.661 / 3.895
 | | | 25.775 / 81.222 / 3.572 | 26.828 / 80.791 / 4.381 | 29.488 / 93.598 / 3.911
Table 3. The enhancement performance of different SOTA models on the LOLv1, LOLv2, and LSRW datasets. The best and second-best performances are represented in red and blue, respectively.

Method | MACs (G) | Params (M) | FPS | LOLv1 (PSNR_m / SSIM_m / NIQE_m) | LOLv2-Real (PSNR_m / SSIM_m / NIQE_m) | LOLv2-Synthetic (PSNR_m / SSIM_m / NIQE_m) | LSRW-Huawei (PSNR_m / SSIM_m / NIQE_m) | LSRW-Nikon (PSNR_m / SSIM_m / NIQE_m)
Low | | | | – / – / 5.72 | – / – / 6.01 | – / – / 4.09 | – / – / 3.16 | – / – / 3.45
Reference | | | | – / – / 4.25 | – / – / 4.73 | – / – / 4.19 | – / – / 3.44 | – / – / 4.24
BL [48] | 150.799 | 1.606 | 146.895 | 10.31 / 40.13 / 7.31 | 12.89 / 43.53 / 7.73 | 13.58 / 61.44 / 4.74 | 11.78 / 31.24 / 3.06 | 13.43 / 36.19 / 3.85
DeepUPE [18] | 45.935 | 0.079 | 2426.394 | 12.71 / 45.04 / 7.79 | 14.60 / 47.02 / 8.23 | 13.82 / 60.50 / 4.37 | 13.63 / 36.25 / 3.00 | 13.36 / 35.97 / 3.64
EnlightenGAN [24] | | | 42.983 | 17.48 / 65.15 / 4.89 | 18.64 / 67.67 / 5.50 | 16.57 / 77.15 / 3.83 | 17.85 / 48.92 / 2.94 | 15.92 / 42.09 / 3.18
GLADNet [35] | 200.61 | 2.15 | 79.839 | 19.72 / 68.20 / 6.80 | 19.82 / 68.47 / 7.73 | 18.11 / 82.59 / 3.99 | 19.00 / 49.45 / 2.96 | 16.63 / 44.07 / 3.36
KinD [19] | 61.011 | 14.35 | 24.385 | 17.64 / 77.13 / 3.90 | 20.58 / 81.78 / 4.14 | 17.27 / 75.78 / 4.25 | 17.03 / 49.88 / 2.64 | 15.47 / 44.04 / 3.46
KinD++ [7] | 1050 | 17.42 | 14.844 | 17.75 / 75.82 / 4.01 | 17.66 / 76.09 / 4.20 | 17.48 / 78.57 / 4.76 | 16.97 / 41.15 / 3.02 | 14.74 / 36.80 / 3.72
LIME [8] | | | 3.463 | 16.05 / 48.60 / 8.79 | 17.16 / 48.02 / 9.31 | 16.37 / 73.74 / 4.76 | 17.13 / 39.31 / 3.44 | 14.64 / 34.99 / 3.61
MELLEN-IC [25] | 253 | 28.275 | 1.432 | 17.23 / 75.44 / 3.31 | 20.75 / 78.98 / 3.32 | 21.57 / 88.08 / 3.98 | 18.22 / 53.48 / 2.64 | 16.71 / 45.08 / 3.41
Retinexformer [42] | | | 9.293 | 25.15 / 84.34 / 2.97 | 22.79 / 83.86 / 3.59 | 25.67 / 92.82 / 3.94 | 16.25 / 49.48 / 2.60 | 15.56 / 42.38 / 3.27
RetinexNet [13] | | | 9.342 | 16.77 / 42.50 / 9.73 | 16.10 / 40.71 / 10.56 | 17.14 / 75.64 / 5.69 | 16.82 / 38.50 / 4.33 | 13.49 / 28.94 / 4.27
Zero DCE [1] | 517.129 | 28.539 | 24.091 | 14.86 / 56.24 / 8.22 | 18.06 / 57.95 / 8.77 | 17.76 / 81.40 / 4.36 | 16.41 / 46.19 / 3.15 | 15.05 / 41.37 / 3.40
Zero DCE++ [26] | 0.109 | 0.594 | 123.976 | 17.04 / 56.25 / 8.46 | 18.14 / 55.18 / 9.06 | 18.64 / 83.52 / 4.55 | 18.12 / 45.55 / 3.27 | 15.10 / 39.20 / 3.56
LLFlow [49] | | | 0.761 | 24.06 / 86.02 / 4.07 | 26.43 / 90.26 / 4.53 | 19.22 / 82.41 / 4.66 | 20.09 / 55.07 / 2.88 | 16.88 / 45.66 / 3.73
Priori DCE | 834.23 | 6.21 | 11.673 | 25.77 / 81.22 / 3.57 | 26.83 / 80.79 / 4.38 | 29.49 / 93.60 / 3.91 | 21.39 / 56.76 / 3.13 | 18.33 / 48.52 / 3.40
Table 4. The enhancement performance of different SOTA models on the DICM, LIME, MEF, and NPE datasets. The best and second-best performances are represented in red and blue, respectively.

Method | NIQE_m (DICM / LIME / MEF / NPE / avg) | NIQE_s (DICM / LIME / MEF / NPE / avg)
Low | 3.317 / 3.566 / 3.256 / 3.187 / 3.332 |
BL [48] | 4.078 / 4.216 / 3.383 / 4.424 / 4.025 | 1.534 / 1.933 / 0.665 / 1.877 / 1.502
DeepUPE [18] | 3.542 / 3.793 / 3.199 / 3.591 / 3.531 | 0.985 / 2.094 / 0.546 / 1.258 / 1.221
EnlightenGAN [24] | 3.056 / 3.380 / 2.895 / 3.368 / 3.175 | 0.823 / 1.514 / 0.466 / 1.241 / 1.011
GLADNet [35] | 3.276 / 3.902 / 3.179 / 3.271 / 3.407 | 0.955 / 2.608 / 0.625 / 0.986 / 1.293
KinD [19] | 3.351 / 4.357 / 3.378 / 3.269 / 3.589 | 1.017 / 3.794 / 0.464 / 0.934 / 1.552
KinD++ [7] | 3.280 / 4.853 / 3.471 / 3.636 / 3.810 | 1.000 / 4.466 / 0.473 / 1.220 / 1.790
LIME [8] | 3.471 / 3.835 / 3.488 / 3.470 / 3.566 | 1.179 / 2.364 / 0.804 / 1.258 / 1.401
MELLEN-IC [25] | 2.911 / 3.503 / 3.097 / 3.087 / 3.150 | 0.784 / 1.673 / 0.415 / 0.783 / 0.914
Retinexformer [42] | 3.353 / 3.705 / 3.139 / 3.174 / 3.343 | 0.972 / 1.468 / 0.753 / 1.014 / 1.052
RetinexNet [13] | 4.315 / 4.916 / 4.904 / 4.388 / 4.631 | 1.715 / 3.557 / 1.475 / 1.496 / 2.061
Zero DCE [1] | 3.430 / 3.786 / 3.309 / 3.433 / 3.489 | 1.195 / 2.093 / 0.861 / 1.223 / 1.343
Zero DCE++ [26] | 3.543 / 4.092 / 3.568 / 3.603 / 3.701 | 1.259 / 2.489 / 0.953 / 1.249 / 1.488
LLFlow [49] | 3.368 / 3.891 / 3.515 / 3.556 / 3.583 | 0.885 / 2.007 / 0.542 / 0.877 / 1.078
Priori DCE (Ours) | 3.155 / 3.153 / 2.848 / 2.968 / 3.031 | 0.872 / 1.174 / 0.603 / 0.708 / 0.839
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
