Article

Unsupervised Low-Light Image Enhancement in the Fourier Transform Domain

Feng Ming, Zhihui Wei and Jun Zhang
1 School of Mathematics and Statistics, Nanjing University of Science and Technology, Nanjing 210094, China
2 Qian Xuesen Academy, Nanjing University of Science and Technology, Nanjing 210094, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(1), 332; https://doi.org/10.3390/app14010332
Submission received: 18 October 2023 / Revised: 22 December 2023 / Accepted: 27 December 2023 / Published: 29 December 2023

Abstract

Low-light image enhancement is an important task in computer vision. Deep learning-based low-light image enhancement has made significant progress, but current methods still face two challenges: they rely on large collections of paired low-light/normal-light images, and they tend to amplify noise while enhancing brightness. Motivated by the experimental observation that most luminance information is concentrated in the amplitude spectrum while noise is closely related to the phase spectrum, we propose an unsupervised low-light image enhancement method in the Fourier transform domain. In our method, the low-light image is first decomposed into an amplitude component and a phase component via the Fourier transform. The luminance is enhanced by a CycleGAN in the amplitude domain, while the phase component is denoised. Cycle consistency losses in both the Fourier transform domain and the spatial domain are used during training. The proposed method has been validated on publicly available test sets and achieves superior results compared with other approaches in low-light image enhancement and noise suppression.

1. Introduction

Owing to factors such as insufficient lighting and the limitations of hardware devices, many images captured in fields like night photography, security surveillance, and medical imaging suffer from blurriness, low contrast, and color loss. These low-light images not only make it difficult to discern the specific content of a scene, but also often lack the crucial details required for computer vision tasks [1,2].
In recent years, numerous scholars have proposed methods that address low-light image enhancement from different perspectives. Traditional methods often rely on handcrafted rules or image processing techniques. Mu et al. [3] applied a hybrid genetic algorithm, combining a difference algorithm with a genetic algorithm, to low-light image enhancement; the fast search ability of the algorithm yields an optimal transform curve with better adaptive ability. Pizer et al. [4] proposed an adaptive histogram equalization method that expands the dynamic range by modifying the histogram of the image. However, histogram equalization applies a global adjustment strategy, which can cause local overexposure and amplify hidden noise. Several methods have therefore proposed improvements on this basis: Ibrahim et al. [5] used the average intensity to maintain the uniformity of illumination, and Arici et al. [6] proposed a noise-robust model to further suppress noise. In addition, Lee et al. [7] proposed a new de-distortion model to better maintain color balance.
Retinex-based image enhancement is a computational technique that aims to improve the visual quality of images by separating the reflectance component from the illumination component. The core idea of the Retinex theory [8] is that an image can be decomposed into a reflectance component and an illumination component. Based on this theory, Jobson et al. [9] proposed the Single-Scale Retinex (SSR) algorithm, which stretches the dynamic range using a Gaussian function and treats the reflectance component as the enhancement result. SSR operates on global pixels and does not account for local overexposure or underexposure. The Multi-Scale Retinex (MSR) algorithm [10] was therefore introduced as an improvement over SSR; it combines different single-scale Retinex outputs to achieve local dynamic range compression. To improve the adaptability of the model, Lee et al. [11] suggested calculating the SSR weights adaptively from the input image. Fu et al. [12] proposed merging several advantages into a single illumination estimate based on multiple derivatives. Guo et al. [13] proposed restoring the initial illumination with a structure-aware prior.
Different from traditional methods, machine learning methods leverage large datasets to learn the mapping between low-light images and their enhanced versions. While these methods show promising performance, many of them rely heavily on paired low-light/normal-light images [14,15,16], which are expensive or even impossible to obtain in real-world scenarios. LLNet [14] is the first deep learning-based low-light image enhancement (LLIE) method; it proposes that one inexpensive way to generate such training pairs is to synthesize a low-light image from its counterpart captured under normal light. RUAS [17] develops a model based on Retinex principles to analyze the inherent under-exposure patterns in low-light images, together with a collaborative learning strategy that is independent of reference images and achieves high performance with minimal computational resources. Almost all of these methods require a large number of paired low-light/normal-light images. They exploit the powerful learning ability of neural networks, but the resulting overfitting reduces the generalization ability of the models, and collecting paired images of the same scene under both low and normal light is often difficult.
Unsupervised deep learning-based methods have been developed for image enhancement to eliminate the reliance on paired data. EnlightenGAN [18] uses an attention-guided U-Net as the generator and dual discriminators to guide global and local information. In addition to global and local adversarial losses, global and local self feature preserving losses are proposed to preserve the image content before and after enhancement. The CycleGAN [19] model is an unsupervised image-to-image translation framework that can also be applied to image enhancement. These methods adopt generative adversarial networks (GANs) [20], whose adversarial loss is the key to such unsupervised learning methods.
In general, the fundamental tasks of low-light image enhancement are enhancing luminance and denoising [21], but most existing learning-based methods, such as EnlightenGAN [18] and CycleGAN [19], tend to amplify noise while enhancing brightness [22]. Some approaches enhance the image first and then remove noise; however, the enhancement stage may change local details and textures, which makes it difficult for a subsequent denoising algorithm to accurately identify and process the noise. Li et al. [23] provide a method that deals with noise and brightness in the Fourier transform domain. They observed that the brightness information of an image is concentrated in the amplitude component after the Fourier transform, while the noise primarily resides in the phase component. Although enhancement in the Fourier transform domain successfully avoids amplifying noise while enhancing luminance, their method is a supervised learning process that requires paired images and overlooks the degradation process from normal-light to low-light images.
Inspired by the method of Li et al. [23], our approach also enhances images in the Fourier transform domain. Our idea is to enhance the luminance of low-light images in the amplitude domain using the CycleGAN model, an unsupervised image generation process that also models the transformation from low-light to normal-light images. During training, we introduce cycle consistency losses in both the spatial domain and the Fourier transform domain. Considering the color deviation that can occur during the style transfer from low-light to normal-light images in the CycleGAN model, we additionally incorporate a color constancy loss function. In summary, the contributions of this paper are as follows:
(1)
Enhancing luminance with the CycleGAN model in the amplitude domain avoids amplifying noise, and a convolutional neural network denoises the phase component at the same time. Our model also considers the degradation process from normal-light to low-light images, which simulates the real environment and expands the training data so that the method adapts better to image processing under low-light conditions.
(2)
We incorporate cycle consistency loss functions in both the spatial domain and the Fourier transform domain, which provides more training signals and helps to accelerate the convergence and stability of training. The spatial-domain cycle consistency loss directly compares pixel differences between the generated images and the original images, assessing the quality of the generated images at the pixel level. Meanwhile, the cycle consistency loss in the Fourier transform domain enforces the similarity of brightness information.
The rest of this paper is organized as follows. Section 2 reviews related work; Section 3 presents our network framework and loss functions; Section 4 reports the experiments; and Section 5 concludes the paper.

2. Related Work

Low-light image enhancement based on deep learning is an important direction that has been widely studied in the field of computer vision in recent years. Based on our research content, we divide the related work into two parts. The first part lists unsupervised methods, while the second part introduces methods that enhance luminance and denoise simultaneously.

2.1. Unsupervised Deep Learning Methods

In contrast to supervised learning, unsupervised deep learning methods enhance low-light images without paired image datasets. Some approaches treat image enhancement as an image-specific mapping estimation problem and employ deep learning to compute the best mapping. Zero-DCE [24] was the first to propose a low-light enhancement network that does not need paired training data. It designed a per-pixel high-order curve, demonstrating the potential of training image enhancement networks with a no-reference loss function in the absence of reference images. Subsequently, an improved version, Zero-DCE++ [25], was proposed, which leverages a mini-network with only 10K parameters and thus significantly improves computation speed. Additionally, Ref. [26] developed a novel self-calibrating illumination (SCI) learning framework and established a cascaded illumination learning process with weight sharing to handle this task. GANs [20] are also used for low-light image enhancement [18,19,27,28]. These methods employ generative adversarial networks to encourage the distribution of the generated images to approach the distribution of the target images without paired supervision. The CycleGAN model [19,29] adopted in this paper learns to translate images from a source domain X to a target domain Y without paired examples, providing inspiration for generating “paired” images. There are, indeed, several image enhancement methods based on improved CycleGAN [30]. Tang et al. [31] proposed an improved CycleGAN model that incorporates a perceptual loss into the generator’s objective function, aiming to retain more image details and structural information during low-light image enhancement.

2.2. Simultaneous Enhancement and Denoising Methods

Traditional image enhancement methods are primarily built upon histogram equalization [32] or Retinex theory [33]. Retinex model-based approaches decompose a low-light image into a reflection component and an illumination component using some kind of prior or regularization. Recently, learning-based methods have been proposed to learn illumination enhancement [34]. For image denoising, there are many traditional methods, such as filter-based denoising and BM3D [35]. Most learning-based denoising models require paired data for training, but there are also deep unsupervised learning-based models, such as DnCNN [21] and Noise2Noise [36]. Despite the rapid development of deep learning-based low-light image enhancement, many methods focus only on luminance enhancement and neglect to remove the noise of low-light images [17,24,26]. RUAS [17], for example, amplifies the noise when discovering low-light prior architectures from a compact search space. As a result, these methods amplify noise while enhancing the brightness of the image. Some methods do consider denoising while enhancing luminance [22,37], but they enhance luminance and remove noise in the spatial domain, leading to unsatisfactory results. Studying how to enhance the brightness of low-light images without amplifying the noise is therefore meaningful work.

3. Proposed Method

In this paper, our idea is to utilize the CycleGAN model to enhance luminance without paired images. Taking advantage of the fact that, after the Fourier transform, noise is mainly concentrated in the phase domain, we propose an unsupervised low-light image enhancement framework in the Fourier transform domain. On the one hand, the CycleGAN model is used to enhance luminance in the amplitude domain. On the other hand, we use a three-layer convolutional neural network to denoise the phase component. The overall network framework is shown in Figure 1. For the input low-light image I_{low} = f(x, y), the fast Fourier transform (FFT) is given by:
F(u, v) = \sum_{x=0}^{P-1} \sum_{y=0}^{Q-1} f(x, y) \, e^{-i 2\pi \left( \frac{ux}{P} + \frac{vy}{Q} \right)},
where F(u, v) denotes the complex value in the frequency domain; u and v are the horizontal and vertical frequency components; x and y are the horizontal and vertical position coordinates in the spatial domain; P denotes the width of the image; and Q denotes the height of the image. Let R(u, v) and I(u, v) denote the real and imaginary parts of F(u, v), respectively; we can then obtain the amplitude spectrum A_{low}^{real} and phase spectrum P_{low}^{real} as follows:
A_{low}^{real} = \sqrt{R^2(u, v) + I^2(u, v)},
P_{low}^{real} = \arctan \frac{I(u, v)}{R(u, v)}.
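As a concrete illustration of this decomposition, the following is a minimal PyTorch sketch (not the authors' released code) that splits an image tensor into its amplitude and phase spectra:

```python
import torch
import torch.fft

def fft_decompose(img: torch.Tensor):
    """Split an image tensor of shape (B, C, H, W) into amplitude and phase spectra."""
    spec = torch.fft.fft2(img, dim=(-2, -1))   # complex spectrum F(u, v)
    amplitude = torch.abs(spec)                # A = sqrt(R^2 + I^2)
    phase = torch.angle(spec)                  # P = arctan(I / R)
    return amplitude, phase
```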
In Figure 1a, generator G_1 generates the normal-light amplitude spectrum A_{nor}^{fake} from the low-light amplitude spectrum A_{low}^{real}, and generator G_2 then regenerates the low-light amplitude spectrum A_{low}^{rec}:
A_{nor}^{fake} = G_1(A_{low}^{real}), \quad A_{low}^{rec} = G_2(A_{nor}^{fake})
A discriminator D_1 identifies whether an image is a generated normal-light image or an initial normal-light image. In the phase domain, we use the three-layer convolutional denoiser to denoise the phase spectrum P_{low}^{real}. To ensure the consistency of I_{low} and I_{low}^{rec} in terms of noise distribution and color offset, a residual noise map N is added to the denoised phase spectrum \hat{P}_{low}^{real}, where N is calculated as:
N = P_{low}^{real} - \hat{P}_{low}^{real}.
The final recovered image I_{low}^{rec} is obtained by the inverse fast Fourier transform (IFFT), computed as:
I_{nor}^{fake} = \mathcal{F}^{-1}\left( A_{nor}^{fake} \, e^{i \hat{P}_{low}^{real}} \right), \quad I_{low}^{rec} = \mathcal{F}^{-1}\left( A_{low}^{rec} \, e^{i P_{low}^{rec}} \right),
where \mathcal{F}^{-1}(\cdot) denotes the IFFT, A_{nor}^{fake} is the enhanced normal-light amplitude spectrum produced by generator G_1, and A_{low}^{rec} is the recovered low-light amplitude spectrum produced by generator G_2. \hat{P}_{low}^{real} is the denoised phase spectrum of the low-light image, and P_{low}^{rec} is the phase spectrum used to reconstruct the initial low-light image I_{low}.
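To make the reconstruction step explicit, here is a hedged sketch of how the enhanced amplitude and the denoised phase could be recombined via the inverse FFT; the helper names and the commented wiring of Figure 1a are illustrative assumptions, not code from the paper:

```python
import torch
import torch.fft

def fft_recompose(amplitude: torch.Tensor, phase: torch.Tensor) -> torch.Tensor:
    """Rebuild a real-valued image from amplitude and phase spectra via the inverse FFT."""
    spec = torch.polar(amplitude, phase)             # A * exp(i * P)
    return torch.fft.ifft2(spec, dim=(-2, -1)).real

# Illustrative wiring of the forward branch in Figure 1a (names are assumptions):
# A_nor_fake = G1(A_low_real)                        # enhanced amplitude spectrum
# P_hat      = denoiser(P_low_real)                  # denoised phase spectrum
# N          = P_low_real - P_hat                    # residual noise map
# I_nor_fake = fft_recompose(A_nor_fake, P_hat)
# I_low_rec  = fft_recompose(G2(A_nor_fake), P_hat + N)
```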
Figure 1b represents the reverse process; the structures of generators G_1 and G_2, discriminator D_2, and the denoiser are the same as in Figure 1a. The added noise N denotes a random Gaussian noise map.

3.1. Enhancement Module

Our method utilizes GANs to enhance luminance. Since we have no paired images, our enhancement module adopts the structure of CycleGAN [19], which consists of a cyclic architecture. The structure of the generator is based on [38]; it includes three convolutions, two fractionally-strided convolutions, several residual blocks [39], and a convolutional layer that maps features to RGB.
The discriminator employs a PatchGAN [40] structure, which maps the input to an N × N patch (matrix) X. Each value X_{ij} represents the probability of the corresponding patch being a real sample, and the average of the X_{ij} yields the final output of the discriminator. X is essentially the feature map produced by the convolutional layers.
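For readers who want a starting point, the snippet below sketches a ResNet-style generator and a PatchGAN discriminator in PyTorch in the spirit of the cited designs [38,40]; the channel widths, normalization choice, and block count are assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.InstanceNorm2d(ch))

    def forward(self, x):
        return x + self.block(x)

class Generator(nn.Module):
    """Three convolutions, residual blocks, two fractionally-strided convolutions, RGB output."""
    def __init__(self, in_ch=3, base=64, n_blocks=9):
        super().__init__()
        layers = [nn.Conv2d(in_ch, base, 7, padding=3), nn.ReLU(inplace=True),
                  nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                  nn.Conv2d(base * 2, base * 4, 3, stride=2, padding=1), nn.ReLU(inplace=True)]
        layers += [ResidualBlock(base * 4) for _ in range(n_blocks)]
        layers += [nn.ConvTranspose2d(base * 4, base * 2, 3, stride=2, padding=1, output_padding=1),
                   nn.ReLU(inplace=True),
                   nn.ConvTranspose2d(base * 2, base, 3, stride=2, padding=1, output_padding=1),
                   nn.ReLU(inplace=True),
                   nn.Conv2d(base, in_ch, 7, padding=3)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

class PatchDiscriminator(nn.Module):
    """Maps the input to an N x N patch of real/fake scores X_ij; their mean is the final output."""
    def __init__(self, in_ch=3, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 2, base * 4, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 4, 1, 4, padding=1))  # N x N patch scores

    def forward(self, x):
        return self.net(x)
```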

3.2. Denoising Module

Our denoising is carried out in the phase domain of the image. We use 3 × 3 convolutions for denoising; each layer consists of a convolution followed by a ReLU activation, and the denoised phase spectrum is used for the subsequent IFFT. The specific network architecture is shown in Figure 2.
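A minimal sketch of such a three-layer phase denoiser is given below; the hidden width and the absence of an activation after the final convolution are assumptions, since the paper only specifies 3 × 3 convolution + ReLU layers:

```python
import torch.nn as nn

class PhaseDenoiser(nn.Module):
    """Three 3x3 convolutional layers with ReLU activations, operating on the phase spectrum."""
    def __init__(self, ch=3, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, ch, 3, padding=1))

    def forward(self, phase):
        return self.net(phase)
```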

3.3. Loss Function

(1)
Adversarial loss
The generator aims to produce images that fool the discriminator, while the discriminator aims to correctly identify whether an image is a fake generated by the generator or a real, non-generated one. Because CycleGAN [19] has two generators and two discriminators, there are two adversarial losses: one for generator G_1 and discriminator D_1, and one for generator G_2 and discriminator D_2.
\mathcal{L}_{\mathrm{GAN}}(G_1, D_1) = \mathbb{E}_{I_{nor} \sim p_{\mathrm{data2}}(I_{nor})} \left[ \log D_1(I_{nor}) \right] + \mathbb{E}_{I_{low} \sim p_{\mathrm{data1}}(I_{low})} \left[ \log \left( 1 - D_1(I_{nor}^{fake}) \right) \right]
Similarly, we can obtain
\mathcal{L}_{\mathrm{GAN}}(G_2, D_2) = \mathbb{E}_{I_{low} \sim p_{\mathrm{data1}}(I_{low})} \left[ \log D_2(I_{low}) \right] + \mathbb{E}_{I_{nor} \sim p_{\mathrm{data2}}(I_{nor})} \left[ \log \left( 1 - D_2(I_{low}^{fake}) \right) \right].
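The two adversarial losses can be written as standard binary cross-entropy GAN losses; the sketch below is an illustrative formulation, not the authors' exact implementation (in practice, fake samples are detached when updating the discriminators):

```python
import torch
import torch.nn.functional as F

def adversarial_losses(D1, D2, I_nor, I_low, I_nor_fake, I_low_fake):
    """Standard GAN losses for the two generator/discriminator pairs."""
    def gan_loss(D, real, fake):
        real_score, fake_score = D(real), D(fake)
        loss_d = F.binary_cross_entropy_with_logits(real_score, torch.ones_like(real_score)) + \
                 F.binary_cross_entropy_with_logits(fake_score, torch.zeros_like(fake_score))
        loss_g = F.binary_cross_entropy_with_logits(fake_score, torch.ones_like(fake_score))
        return loss_d, loss_g

    loss_d1, loss_g1 = gan_loss(D1, I_nor, I_nor_fake)   # L_GAN(G1, D1)
    loss_d2, loss_g2 = gan_loss(D2, I_low, I_low_fake)   # L_GAN(G2, D2)
    return loss_d1 + loss_d2, loss_g1 + loss_g2
```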
(2)
Cycle consistency loss
We hope that the image I_{nor}^{fake} generated from I_{low} and then passed through generator G_2 yields a reconstruction I_{low}^{rec} consistent with the initial image, i.e., I_{low} → I_{nor}^{fake} → I_{low}^{rec} ≈ I_{low}. In addition, we combine the cycle consistency loss in the amplitude domain, which helps the model learn higher-level amplitude features. This yields the following loss function, which is the core of CycleGAN [19] and keeps the generated normal-light images structurally consistent:
\mathcal{L}_{\mathrm{cyc}}(G_1, G_2) = \mathbb{E}_{I_{nor} \sim p_{\mathrm{data2}}(I_{nor})} \left[ \| I_{nor}^{rec} - I_{nor} \|_1 \right] + \mathbb{E}_{I_{low} \sim p_{\mathrm{data1}}(I_{low})} \left[ \| I_{low}^{rec} - I_{low} \|_1 \right] + \mathbb{E}_{A_{nor}^{real} \sim p_{\mathrm{data3}}(A_{nor}^{real})} \left[ \| A_{nor}^{rec} - A_{nor}^{real} \|_1 \right] + \mathbb{E}_{A_{low}^{real} \sim p_{\mathrm{data4}}(A_{low}^{real})} \left[ \| A_{low}^{rec} - A_{low}^{real} \|_1 \right]
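A hedged sketch of this combined spatial- and amplitude-domain cycle consistency loss, with variable names mirroring the notation above and no extra weighting factors (none are specified in the equation):

```python
import torch.nn.functional as F

def cycle_consistency_loss(I_nor, I_nor_rec, I_low, I_low_rec,
                           A_nor_real, A_nor_rec, A_low_real, A_low_rec):
    """L1 cycle consistency in both the spatial (image) domain and the amplitude domain."""
    spatial = F.l1_loss(I_nor_rec, I_nor) + F.l1_loss(I_low_rec, I_low)
    amplitude = F.l1_loss(A_nor_rec, A_nor_real) + F.l1_loss(A_low_rec, A_low_real)
    return spatial + amplitude
```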
(3)
Identity loss
The meaning of this loss is that if an image from the I_{nor} domain is fed into generator G_1, the result should be as close to itself as possible, with no other conversion performed. The identity loss mainly prevents generator G_1 from automatically modifying the color of the input image while preserving G_1's function of generating normal-light images.
\mathcal{L}_{\mathrm{idt}}(G_1, G_2) = \mathbb{E}_{I_{nor} \sim p_{\mathrm{data2}}(I_{nor})} \left[ \| G_1(I_{nor}) - I_{nor} \|_1 \right] + \mathbb{E}_{I_{low} \sim p_{\mathrm{data1}}(I_{low})} \left[ \| G_2(I_{low}) - I_{low} \|_1 \right]
(4)
Color constancy loss
Since the CycleGAN framework was designed for image style transfer and does not address preserving the original colors across styles, we add a color constancy loss function to maintain color fidelity when generating normal-light images.
\mathcal{L}_{\mathrm{col}} = \mathrm{mean}\left( (r_1 - r_2)^2 + (g_1 - g_2)^2 + (b_1 - b_2)^2 \right),
where r_1, g_1, and b_1 are the average intensity values of the three channels of the normal-light image I_{nor}, and r_2, g_2, and b_2 are the average intensity values of the three channels of I_{nor}^{rec}, respectively. In summary, the overall loss is as follows:
\mathcal{L} = \mathcal{L}_{\mathrm{GAN}}(G_1, D_1) + \mathcal{L}_{\mathrm{GAN}}(G_2, D_2) + \mathcal{L}_{\mathrm{cyc}}(G_1, G_2) + \mathcal{L}_{\mathrm{idt}}(G_1, G_2) + \mathcal{L}_{\mathrm{col}}.
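To make the last two terms concrete, the sketch below implements the color constancy loss from the formula above and shows how the overall objective could be assembled; the absence of weighting coefficients mirrors the equation as written, and the function names are illustrative:

```python
import torch

def color_constancy_loss(I_nor: torch.Tensor, I_nor_rec: torch.Tensor) -> torch.Tensor:
    """Mean over the batch of the summed squared differences of per-channel average intensities."""
    mean_nor = I_nor.mean(dim=(-2, -1))      # (B, 3): average r, g, b of I_nor
    mean_rec = I_nor_rec.mean(dim=(-2, -1))  # (B, 3): average r, g, b of I_nor_rec
    return ((mean_nor - mean_rec) ** 2).sum(dim=1).mean()

# Overall objective (illustrative combination of the loss terms defined above):
# loss = (loss_gan_g1d1 + loss_gan_g2d2
#         + cycle_consistency_loss(...)
#         + identity_loss(...)
#         + color_constancy_loss(I_nor, I_nor_rec))
```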

4. Experiments

4.1. Experimental Settings

The model was trained and tested on an NVIDIA Tesla P40 graphics card. We used the same training set as EnlightenGAN [18], which includes 1016 normal-light images and 914 low-light images drawn from several datasets. Testing was performed on 10 pairs of images selected from the LOL [15] test dataset. During training, the number of epochs was set to 200, the initial learning rate to 0.0002, and the batch size to 1. The experiments were conducted on a Linux system using the PyTorch framework.
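As a rough guide to reproducing this setup, the sketch below wires the stated hyper-parameters into optimizers; the choice of Adam and its betas is an assumption commonly used for CycleGAN-style training, not a detail stated in the paper:

```python
import torch

def build_optimizers(G1, G2, D1, D2, lr=2e-4):
    """Adam optimizers for the two generators and two discriminators (betas are an assumption)."""
    opt_g = torch.optim.Adam(list(G1.parameters()) + list(G2.parameters()),
                             lr=lr, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(list(D1.parameters()) + list(D2.parameters()),
                             lr=lr, betas=(0.5, 0.999))
    return opt_g, opt_d

# Stated settings: 200 epochs, initial learning rate 2e-4, batch size 1.
```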

4.2. Experimental Comparison

We compared our results with several classic methods, LLNet [14], RetinexNet [15], Zero-DCE [24], Zero-DCE++ [25], SCI [26], EnlightenGAN [18], and RUAS [17], on the same 10 pairs of test images. For each method, we used the authors' original test code and evaluated it under the same environment settings. We computed several evaluation indexes commonly used in low-light image enhancement: peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and learned perceptual image patch similarity (LPIPS) [41]. Unlike traditional metrics that focus on pixel-wise differences, LPIPS evaluates similarity based on learned representations of image patches. It utilizes a deep neural network trained on large-scale human perceptual judgments to compute the perceptual distance between images, and it takes into account visual factors such as color, texture, and structure, providing a more comprehensive assessment of image similarity. The experimental results are shown in Figure 3.
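For reference, the three metrics are commonly computed as in the sketch below, using scikit-image and the lpips package; this tooling is an assumption and not necessarily the authors' evaluation script:

```python
import numpy as np
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_net = lpips.LPIPS(net='alex')  # deep features trained on human perceptual judgments

def evaluate_pair(pred: np.ndarray, gt: np.ndarray):
    """pred, gt: uint8 RGB images of identical size, shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(gt, pred)
    ssim = structural_similarity(gt, pred, channel_axis=-1)
    to_tensor = lambda x: torch.from_numpy(x).permute(2, 0, 1).float().unsqueeze(0) / 127.5 - 1.0
    lp = lpips_net(to_tensor(pred), to_tensor(gt)).item()
    return psnr, ssim, lp
```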
From Figure 3, it can be intuitively seen that our method is closest to the reference image after enhancement and gives the best result. Among the baselines, Zero-DCE [24], Zero-DCE++ [25], SCI [26], and RUAS [17] are under-exposed and do not enhance the luminance significantly. RetinexNet [15] has serious artifacts, and the background information is lost. LLNet [14] and EnlightenGAN [18] also achieve poor recovery; although the images become clearer, they suffer from underexposure and color distortion. The quantitative results are shown in Table 1, where our method achieves the best scores on all three evaluation indicators. We also conducted experiments on facial image enhancement. In Figure 4, our method and EnlightenGAN [18] yield natural exposure and clear details, with our method giving slightly better exposure. In comparison, RetinexNet [15] produces over-exposed artifacts, RUAS [17] has an obvious overexposure problem, the background of LLNet [14] is severely blurred, and the other methods cannot recover face information and details well either.

4.3. Noise Experiment

To verify the necessity of operating in the Fourier transform domain, that is, that amplitude-domain enhancement does not amplify the noise of the original image, we add noise to the original ten low-light test images: Gaussian noise with a variance of 5 and of 20 is added in two separate experiments. The low-light images after noise processing are re-tested, and their PSNR, SSIM, and LPIPS values are calculated. The results are shown in Table 2 and Table 3.
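To make the noise protocol explicit, the snippet below shows one way such Gaussian noise can be added; interpreting the variance on the 0–255 intensity scale is our assumption rather than a detail confirmed in the paper:

```python
import numpy as np

def add_gaussian_noise(img: np.ndarray, variance: float) -> np.ndarray:
    """Add zero-mean Gaussian noise with the given variance (0-255 intensity scale)."""
    noise = np.random.normal(0.0, np.sqrt(variance), img.shape)
    return np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)

# noisy_5  = add_gaussian_noise(low_light_img, 5)
# noisy_20 = add_gaussian_noise(low_light_img, 20)
```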
The experimental results of the same test images we selected are shown in Figure 5.
As can be seen from Figure 5, the LLNet [14] and RetinexNet [15] methods have certain denoising effects but suffer from color distortion. The noise in the results of Zero-DCE [24], Zero-DCE++ [25], SCI [26], and EnlightenGAN [18] is significantly amplified and not removed well, and a lot of noise is clearly visible in the result of RUAS [17]. Our method also achieves the best results on the three evaluation indicators in Table 2 and Table 3, accomplishing image enhancement and image denoising to a certain extent and achieving better restoration of image details and brightness. The experimental results show that low-light image processing in the Fourier transform domain can separate the noise, so the noise is not amplified during the enhancement process. Both the qualitative and quantitative results prove the validity of the model.

4.4. Denoiser Experiment

Our method uses 3 × 3 convolutions for denoising, but some methods instead use low-pass filters to deal with noise. Phase noise in the Fourier transform primarily arises from phase discontinuity or instability. A low-pass filter reduces the energy of the high-frequency portion of the spectrum by suppressing high-frequency signal components, thereby smoothing and denoising the signal. For phase-noise denoising, a low-pass filter can remove high-frequency noise components while preserving the low-frequency information in the signal, mitigating the impact of phase noise. We therefore conducted an experimental comparison between convolution and low-pass filtering for denoising. Under the same training conditions, the results on the test dataset are shown in Table 4.
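For reference, a frequency-domain low-pass alternative of the kind compared here could look like the sketch below; the ideal circular mask and the cutoff value are illustrative choices, not the exact filter used in this experiment:

```python
import torch
import torch.fft

def lowpass_filter_phase(phase: torch.Tensor, cutoff: float = 0.1) -> torch.Tensor:
    """Suppress high-frequency components of the phase map with an ideal circular low-pass mask.

    phase: (B, C, H, W) tensor; cutoff: mask radius as a fraction of the spectrum size.
    """
    B, C, H, W = phase.shape
    spec = torch.fft.fftshift(torch.fft.fft2(phase, dim=(-2, -1)), dim=(-2, -1))
    yy, xx = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    dist = torch.sqrt((yy - H / 2) ** 2 + (xx - W / 2) ** 2)
    mask = (dist <= cutoff * min(H, W)).to(spec.dtype)
    filtered = torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1)), dim=(-2, -1))
    return filtered.real
```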
When the phase noise is mainly concentrated in the high-frequency part, it can be effectively removed, and the low-frequency information in the signal preserved, by selecting a suitable low-pass filter and cutoff frequency. However, the results of this experiment may be affected by the parameter settings of our model, such as the learning rate and step size, and a better-tuned low-pass filter might yield improved denoising results in subsequent experiments.

5. Conclusions

In this paper, an unsupervised low-light image enhancement method in the Fourier transform domain is proposed. The method harnesses the CycleGAN model to enhance luminance in the amplitude domain obtained via the Fourier transform, which not only allows the noise to be separated more effectively but also enhances luminance without amplifying the noise in low-light images. By modeling the degradation process from normal-light to low-light conditions with the CycleGAN model, the approach better simulates real-world environments and adapts to the specific requirements of image processing in low-light scenarios. In addition, we incorporate cycle consistency loss functions in both the spatial domain and the Fourier transform domain: the spatial-domain cycle consistency loss assesses the quality of the generated images at the pixel level, and the Fourier-domain cycle consistency loss enforces the similarity of brightness information. The experiments with added noise further confirm the effectiveness of the proposed model. The experimental results show significant improvements in both objective evaluation metrics and subjective visual quality, offering a novel research direction for low-light image enhancement. In the future, we will focus on enhancement tasks for large images, which still require a lot of training time with current models.

Author Contributions

Methodology, Z.W.; Software, Z.W.; Formal analysis, Z.W.; Writing—original draft, F.M.; Writing—review & editing, F.M.; Supervision, J.Z.; Project administration, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Publicly available datasets were analyzed in this study. The data can be found at https://github.com/VITA-Group/EnlightenGAN, accessed on 26 December 2023.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Li, J.; Feng, X.; Hua, Z. Low-light image enhancement via progressive-recursive network. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 4227–4240.
  2. Wu, C.; Shao, S.; Tunc, C.; Satam, P.; Hariri, S. An explainable and efficient deep learning framework for video anomaly detection. Clust. Comput. 2021, 25, 2715–2737.
  3. Mu, D.; Xu, C.; Ge, H. Hybrid genetic algorithm based image enhancement technology. In Proceedings of the 2011 International Conference on Internet Technology and Applications, Wuhan, China, 16–18 August 2011; pp. 1–4.
  4. Pizer, M.; Johnston, E.; Ericksen, P.; Yankaskas, C.; Muller, E. Contrast-limited adaptive histogram equalization: Speed and effectiveness. In Proceedings of the Conference on Visualization in Biomedical Computing, Atlanta, GA, USA, 22–25 May 1990; pp. 337–345.
  5. Ibrahim, H.; Kong, N.S.P. Brightness preserving dynamic histogram equalization for image contrast enhancement. IEEE Trans. Consum. Electron. 2007, 53, 1752–1758.
  6. Arici, T.; Dikbas, S.; Altunbasak, Y. A histogram modification framework and its application for image contrast enhancement. IEEE Trans. Image Process. 2009, 18, 1921–1935.
  7. Lee, C.; Kim, J.; Lee, C.; Kim, C. Optimized brightness compensation and contrast enhancement for transmissive liquid crystal displays. IEEE Trans. Circuits Syst. Video Technol. 2014, 24, 576–590.
  8. Land, E.H.; McCann, J.J. Lightness and retinex theory. J. Opt. Soc. Am. 1971, 61, 1–11.
  9. Jobson, D.J.; Rahman, Z.; Woodell, G.A. Properties and performance of a center/surround retinex. IEEE Trans. Image Process. 1997, 6, 451–462.
  10. Jobson, D.J.; Rahman, Z.; Woodell, G.A. A multiscale retinex for bridging the gap between color images and the human observation of scenes. IEEE Trans. Image Process. 1997, 6, 965–976.
  11. Lee, C.H.; Shih, J.L.; Lien, C.C.; Han, C.C. Adaptive multiscale retinex for image contrast enhancement. In Proceedings of the 2013 International Conference on Signal-Image Technology & Internet-Based Systems, Kyoto, Japan, 2–5 December 2013; pp. 43–50.
  12. Fu, X.; Zeng, D.; Huang, Y.; Liao, Y.; Ding, X.; Paisley, J. A fusion-based enhancing method for weakly illuminated images. Signal Process. 2016, 129, 82–96.
  13. Guo, X.; Li, Y.; Ling, H. LIME: Low-light image enhancement via illumination map estimation. IEEE Trans. Image Process. 2016, 26, 982–993.
  14. Lore, K.G.; Akintayo, A.; Sarkar, S. LLNet: A deep autoencoder approach to natural low-light image enhancement. Pattern Recognit. 2017, 61, 650–662.
  15. Wei, C.; Wang, W.; Yang, W.; Liu, J. Deep retinex decomposition for low-light enhancement. arXiv 2018, arXiv:1808.04560.
  16. Lv, F.; Lu, F.; Wu, J.; Lim, C. MBLLEN: Low-light image/video enhancement using CNNs. In Proceedings of the British Machine Vision Conference, Newcastle upon Tyne, UK, 3–6 September 2018.
  17. Liu, R.; Ma, L.; Zhang, J.; Fan, X.; Luo, Z. Retinex-inspired unrolling with cooperative prior architecture search for low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021.
  18. Jiang, Y.; Gong, X.; Liu, D.; Cheng, Y.; Fang, C.; Shen, X.; Yang, J.; Zhou, P.; Wang, Z. EnlightenGAN: Deep light enhancement without paired supervision. IEEE Trans. Image Process. 2021, 30, 2340–2349.
  19. Zhu, J.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017; pp. 2223–2232.
  20. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27.
  21. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155.
  22. Wu, W.; Weng, J.; Zhang, P.; Wang, X.; Yang, W.; Jiang, J. URetinex-Net: Retinex-based deep unfolding network for low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022.
  23. Li, C.; Guo, C.L.; Zhou, M.; Liang, Z.; Zhou, S.; Feng, R.; Loy, C.C. Embedding Fourier for ultra-high-definition low-light image enhancement. arXiv 2023, arXiv:2302.11831.
  24. Guo, C.; Li, C.; Guo, J.; Loy, C.C.; Hou, J.; Kwong, S.; Cong, R. Zero-reference deep curve estimation for low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020.
  25. Li, C.; Guo, C.; Loy, C.C. Learning to enhance low-light image via zero-reference deep curve estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 4225–4238.
  26. Ma, L.; Ma, T.; Liu, R.; Fan, X.; Luo, Z. Toward fast, flexible, and robust low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5637–5646.
  27. Yang, W.; Wang, S.; Fang, Y.; Wang, Y.; Liu, J. From fidelity to perceptual quality: A semi-supervised approach for low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 3063–3072.
  28. Chen, Y.-S.; Wang, Y.-C.; Kao, M.-H.; Chuang, Y.-Y. Deep photo enhancer: Unpaired learning for image enhancement from photographs with GANs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6306–6314.
  29. Liu, M.; Breuel, T.; Kautz, J. Unsupervised image-to-image translation networks. Adv. Neural Inf. Process. Syst. 2017, 30, 700–708.
  30. Cho, S.W.; Baek, N.R.; Koo, J.H.; Arsalan, M.; Park, K.R. Semantic segmentation with low light images by modified CycleGAN-based image enhancement. IEEE Access 2020, 8, 93561–93585.
  31. Tang, G.; Ni, J.; Chen, Y.; Cao, W.; Yang, S.X. An improved CycleGAN-based model for low-light image enhancement. IEEE Sens. J. 2023.
  32. Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; Romeny, B.T.H.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 1987, 39, 355–368.
  33. Land, E.H. The retinex theory of color vision. Sci. Am. 1977, 237, 108–129.
  34. Gharbi, M.; Chen, J.; Barron, J.T.; Hasinoff, S.W.; Durand, F. Deep bilateral learning for real-time image enhancement. ACM Trans. Graph. 2017, 36, 1–12.
  35. Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 2007, 16, 2080–2095.
  36. Lehtinen, J.; Munkberg, J.; Hasselgren, J.; Laine, S.; Karras, T.; Aittala, M.; Aila, T. Noise2Noise: Learning image restoration without clean data. arXiv 2018, arXiv:1803.04189.
  37. Xu, X.; Wang, R.; Fu, C.; Jia, J. SNR-aware low-light image enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022.
  38. Johnson, J.; Alahi, A.; Li, F.-F. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the Computer Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016.
  39. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
  40. Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017.
  41. Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
Figure 1. The network framework of our proposed methods.
Figure 2. The structure of the denoiser in the phase domain.
Figure 3. Enhancement results of each method. (a) Input. (b) LLNet [14]. (c) RetinexNet [15]. (d) Zero-DCE [24]. (e) Zero-DCE++ [25]. (f) SCI [26]. (g) EnlightenGAN [18]. (h) RUAS [17]. (i) Ours. (j) “GT” is the ground truth.
Figure 4. Enhancement results of each low-light face image. (a) Noisy image. (b) LLNet [14]. (c) RetinexNet [15]. (d) Zero-DCE [24]. (e) Zero-DCE++ [25]. (f) SCI [26]. (g) EnlightenGAN [18]. (h) RUAS [17]. (i) Ours.
Figure 5. Enhancement results of adding Gaussian noise with a variance of 20 for each method. (a) Noisy image. (b) LLNet [14]. (c) RetinexNet [15]. (d) Zero-DCE [24]. (e) Zero-DCE++ [25]. (f) SCI [26]. (g) EnlightenGAN [18]. (h) RUAS [17]. (i) Ours. (j) “GT” is the ground truth.
Table 1. Quantitative comparison of the results of each method. The best results are in bold.
Method              PSNR           SSIM     LPIPS
LLNet [14]          28.09178683    0.8009   0.2233
RetinexNet [15]     28.01414532    0.5184   0.4011
Zero-DCE [24]       27.63589704    0.6290   0.2490
Zero-DCE++ [25]     27.66889766    0.5464   0.2630
SCI [26]            27.55003468    0.5452   0.2610
EnlightenGAN [18]   27.77812987    0.7078   0.2334
RUAS [17]           27.76587577    0.5709   0.2889
Ours                28.67713423    0.8056   0.1207
Table 2. Quantitative results of adding Gaussian noise with a variance of 5 for each method. The best results are in bold.
Method              PSNR           SSIM     LPIPS
LLNet [14]          27.97043800    0.7468   0.2772
RetinexNet [15]     27.80025249    0.2079   0.8041
Zero-DCE [24]       27.70068916    0.3056   0.6482
Zero-DCE++ [25]     27.69722996    0.2790   0.6575
SCI [26]            27.68119815    0.2718   0.6620
EnlightenGAN [18]   27.81145379    0.3794   0.5937
RUAS [17]           27.8588932     0.2532   0.6900
Ours                28.01347462    0.7652   0.1873
Table 3. Quantitative results of adding Gaussian noise with a variance of 20 for each method. The best results are in bold.
Method              PSNR           SSIM     LPIPS
LLNet [14]          28.06767111    0.5964   0.5720
RetinexNet [15]     27.92532647    0.0720   1.1575
Zero-DCE [24]       27.91220942    0.0859   1.1218
Zero-DCE++ [25]     27.88815927    0.0759   1.1510
SCI [26]            27.92670618    0.0741   1.1769
EnlightenGAN [18]   27.87609919    0.1411   0.9781
RUAS [17]           27.92985406    0.0621   1.2399
Ours                28.11758516    0.6818   0.4379
Table 4. Quantitative comparison results for each method. The best results are in bold.
Denoiser            PSNR           SSIM     LPIPS
3 × 3 convolution   28.67713423    0.8056   0.1207
Low-pass filter     28.43365017    0.7932   0.1519
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
