1. Introduction
Distortion phase recovery is an optical measurement technique used to quantitatively describe the phase and amplitude distribution of an optical wave as it propagates. By comparing the reconstructed and actual wavefronts, it can identify and correct distortions within a system, thereby enhancing imaging quality and performance. Distortion phase recovery is applied in a wide range of fields, such as atmospheric optics, laser device manufacturing, and optical communication, and it plays a crucial role in supporting the development of adaptive optics systems.
Because determining the system’s aberration function accurately and quickly is essential to the operation of adaptive optics systems, numerous techniques have been devised to accomplish this. In more traditional methods, the aberration function can be directly measured using wavefront sensors such as the Shack–Hartmann wavefront sensor [1,2], the curvature wavefront sensor [3], the shearing interferometer [4], and multi-order diffractive optical elements [5,6,7]. These wavefront sensors, which rely on complex, specialized optical hardware, offer high precision and good stability. However, they also suffer from drawbacks such as high hardware costs, high computational complexity, and limited scalability. Since the 1990s, artificial neural networks (ANNs) and deep learning have been applied to determine the Zernike coefficients representing a given wavefront [8], because they are capable of learning complex relationships without the need for explicit physical rule programming [9,10]. As research has progressed, methods have also been developed to reconstruct wavefront phases directly from intensity images using deep learning [11]. Using ANNs, whether directly or indirectly, for phase recovery allows the desired features to be inferred directly from the data, enabling a higher degree of adaptability. However, it is important to acknowledge that conventional neural networks for phase reconstruction incur significant computational costs and memory usage, and their processing speed remains slightly inferior to that of hardware wavefront sensors.
Optical neural networks (ONNs) constructed using optical matrices have emerged as a promising alternative for next-generation neural computing [12]. By leveraging the speed of light and the massive parallelism of optical signals within a medium, optical networks offer potential solutions to the challenges faced by their electronic counterparts, such as limited computational power and high energy consumption. In 2018, Lin et al. introduced an all-optical deep learning framework in which multiple layers of diffractive surfaces form the structure of a deep learning network, referred to as the diffractive deep neural network (DDNN) [13]. A DDNN performs element-wise multiplication by exploiting the interaction between light and matter: the ‘pixels’ on each diffractive surface act as the ‘neurons’ of a network layer, and these neurons are coupled through optical diffraction. DDNNs possess tremendous flexibility [14,15] and scalability [16], along with significant advantages in processing power and performance. Subsequently, researchers thoroughly investigated and verified the generalization abilities of DDNNs, applying them in a variety of fields, such as scattering imaging [17], gesture classification [18], and orbital angular momentum spectrum measurement [19].
DDNNs have demonstrated promising achievements in the retrieval and correction of optical interference phases. In 2022, Zhao et al. proposed an adaptive optical compensation scheme based on a DDNN to restore the distortions induced by oceanic turbulence on vortex beams [20]. A comparative analysis with wavefront recovery schemes based on CNN and GS algorithms revealed that the DDNN achieved the highest improvement in mode purity for compensated vortex beams. The same team subsequently introduced a hybrid optoelectronic deep neural network (HOEDNN), in which the DDNN is trained to map distorted orbital angular momentum (OAM) intensity patterns to the intensity distribution that would be obtained without oceanic turbulence, and a CNN is then employed to recognize the output of the DDNN [21]. In 2023, Goi et al. introduced a compact multilayer diffractive neural network module imprinted on an imaging sensor [22]. This module first focuses the input light through a lens and then reconstructs the Zernike-based pupil phase distribution from the point spread function. By integrating CMOS sensors with diffractive elements, they achieved direct pupil phase recovery based on the superposition of the first 14 orders of Zernike polynomials. Furthermore, in the field of real-time wavefront correction, Cui et al. trained a DDNN as a wavefront corrector, validated its correction performance in scenarios such as off-axis and binary stars, and positioned it between the imaging lens and the image plane to improve the wavefront correction frequency [23].
Existing research predominantly emphasizes end-to-end direct processing. However, for optical systems whose aberrations are relatively simple and easy to model, Zernike polynomials, which represent the common distortion modes of optical systems, can describe wavefront aberrations more effectively and concisely. To bridge the gap concerning indirect phase retrieval within the domain of diffractive neural networks combined with adaptive optics, we propose an indirect phase retrieval scheme based on Zernike polynomials, as shown in Figure 1. We employ deep learning to train a set of transmissive diffractive layers, achieving the all-optical inversion of the mapping between the distorted phase carried by the input beam and its corresponding Zernike coefficients. After the jointly modulated light carrying an unknown distorted wavefront passes through the prepared multilayer diffractive element, the desired intensity distribution is obtained in specific regions of the output plane, as shown in Figure 1a. The intensity within these regions corresponds to the coefficients of the Zernike polynomials, and a straightforward combination operation yields the distribution of the unknown distorted phase, as shown in
Figure 1b. The simulation results indicate an average root mean square (RMS) error of 0.086λ for the phase obtained via this method, which meets the imaging quality requirements of the system. This approach holds great potential for widespread application across various fields in the future.
2. Theory and Analysis
2.1. Indirect Phase Recovery Scheme Based on Diffraction Neural Network
The conceptual illustration of our proposed diffractive neural network-based indirect phase recovery scheme is presented in Figure 2. In this system, the wavefront detection module comprises a diffraction network and a charge-coupled device (CCD1), while the wavefront control module is managed by a personal computer (PC). The wavefront correction module consists of a polarizer and a half-wave plate (HWP), used to adjust the polarization direction of the incident beam, and a spatial light modulator (SLM), used for phase modulation. In this scheme, an incident beam distorted by factors such as fluids (e.g., atmospheric turbulence) or biological tissue has its phase modulated layer by layer by a well-trained multilayer diffractive neural network, and an intensity representation of the Zernike coefficients of the distorted phase is obtained in the output field. The intensity distribution captured by CCD1, whose resolution must exceed 400 × 400, is then fed into the computer, where the phase screen is reconstructed and inverted to produce a compensatory phase screen. Finally, the spatial light modulator corrects the beam, effectively compensating for the distorted wavefront. In this scheme, a 50:50 beam splitter (BS) divides the light into two beams: one completes the wavefront detection described above, and the other serves as a verification beam. The latter is imaged onto CCD2 through a lens, allowing the point spread function (PSF), which describes the response of a focusing imaging system to a point source or point object, to be compared before and after correction.
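To make the detection step more concrete, the sketch below shows one way a CCD1 frame could be reduced to ten per-region intensity values before the coefficients are recovered on the PC. The region centres, the region size, and the helper name read_region_intensities are illustrative assumptions, not values taken from this paper; the actual readout layout follows Figure 3.

```python
import numpy as np

def read_region_intensities(frame, centers, half_size=10):
    """Average the CCD1 intensity over each pre-defined readout region.

    frame   : 2-D NumPy array holding one CCD1 frame (>= 400 x 400 pixels).
    centers : list of (row, col) centres of the ten coefficient regions
              (hypothetical values; the real layout is defined by the design).
    """
    values = []
    for r, c in centers:
        patch = frame[r - half_size:r + half_size, c - half_size:c + half_size]
        values.append(patch.mean())  # one intensity value per Zernike term
    return np.array(values)
```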
The diffractive neural network is an all-optical machine learning platform that performs a given task through successive transmissive layers, with each diffraction layer typically composed of tens of thousands of diffraction units that modulate the phase or amplitude of the incident light. Similar to deep learning techniques, it can learn a mapping relationship from a large dataset. Through error backpropagation and optimization methods such as stochastic gradient descent, it refines the modulation values of each layer so as to map a complex-valued input field containing the optical information of interest to the desired output field. Assuming that the amplitude of a light beam is $A(r,\theta)$ and its wavefront aberration is $\varphi(r,\theta)$, where $r$ and $\theta$ are polar coordinates on the input pupil plane, the pupil function $U(r,\theta)$ of the input beam can be expressed as follows:
$$U(r,\theta) = A(r,\theta)\exp\left[jk\varphi(r,\theta)\right] \quad (1)$$
where $k = 2\pi/\lambda$ is the wave number and the imaginary unit $j = \sqrt{-1}$.
As has been established, $\varphi(r,\theta)$ can be expressed as a linear combination of a series of Zernike polynomials, represented as follows:
$$\varphi(r,\theta) = \sum_{n=1}^{N} a_n Z_n(r,\theta) \quad (2)$$
In this expression, $Z_n(r,\theta)$ represents the $n$-th Zernike polynomial, and $a_n$ represents the $n$-th Zernike coefficient.
Moreover, since the output-plane intensity cannot take negative values, the coefficients $a_n$ undergo a sigmoid transformation so that both positive and negative Zernike coefficients can be represented as intensities within the (0, 1) range:
$$b_n = \frac{1}{1 + e^{-a_n}} \quad (3)$$
where $b_n$ represents the transformed value of the $n$-th coefficient; its relationship with the intensity of the output plane is illustrated in Figure 3, in which $(x_n, y_n)$ denotes the coordinates of the $n$-th readout region in the output plane. According to Equations (1) and (2), the complex amplitude of the distorted input beam and the Zernike coefficients $a_n$ representing the wavefront aberration satisfy a certain mapping relationship:
$$\{a_n\} = F\{U(r,\theta)\} \quad (4)$$
This paper utilizes the learning capability of a diffractive neural network, analogous to that of a deep neural network, to accurately estimate the mapping $F$. Equation (3) gives the relationship between the output field intensity and the coefficients $a_n$, and this transformation addresses the difficulty of representing negative values on the output plane.
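As a minimal numerical sketch of Equations (2) and (3), and of the inverse step later used to build the compensatory phase screen, the snippet below maps Zernike coefficients to (0, 1) targets and rebuilds a phase screen from recovered coefficients. The precomputed basis array of Zernike modes is assumed to exist and is not defined here.

```python
import numpy as np

def sigmoid(a):
    """Eq. (3): map a Zernike coefficient a_n into the (0, 1) range."""
    return 1.0 / (1.0 + np.exp(-a))

def inverse_sigmoid(b, eps=1e-6):
    """Recover a_n from the (0, 1) intensity value read off the output plane."""
    b = np.clip(b, eps, 1.0 - eps)
    return np.log(b / (1.0 - b))

def reconstruct_phase(coeffs, basis):
    """Eq. (2): sum_n a_n * Z_n(r, theta).

    coeffs : recovered Zernike coefficients a_n, shape (N_modes,).
    basis  : assumed precomputed Zernike modes, shape (N_modes, H, W).
    """
    return np.tensordot(coeffs, basis, axes=1)
```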
2.2. Network Structure and Parameter
A DDNN is a system composed of a series of transmissive or reflective diffraction layers and is endowed with the capacity to learn and modulate the optical field. Each layer consists of N × N pixels, where each pixel serves as a sensing node corresponding to a neuron in a neural network. The complex-valued transmission coefficient (amplitude and phase) of each pixel in a diffractive element is a trainable network parameter. The propagation of light between diffraction layers in the DDNN follows a connectivity pattern highly analogous to that of traditional fully connected neural networks, in which each unit in a diffraction layer is connected to all units in the next layer. Each diffractive neuron can be regarded as the source of a secondary wave, whose amplitude and phase are jointly determined by the neuron’s complex-valued transmission coefficient and its input field. The free-space propagation (FSP) between neurons of adjacent layers obeys the Rayleigh–Sommerfeld diffraction formula [24]. Therefore, the optical field of the secondary wave can be expressed as follows:
$$w_i^l(x,y,z) = \frac{z - z_i}{r^2}\left(\frac{1}{2\pi r} + \frac{1}{j\lambda}\right)\exp\left(\frac{j2\pi r}{\lambda}\right) \quad (5)$$
where $w_i^l$ represents the secondary wave of the $i$-th neuron, located at position $(x_i, y_i, z_i)$ in the $l$-th layer of the DDNN; $r = \sqrt{(x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2}$ is the distance from this neuron to the observation point $(x, y, z)$; and $\lambda$ is the wavelength.
The transmission coefficient $t_i^l$ of layer $l$ can be represented by amplitude and phase terms, formulated as follows:
$$t_i^l(x_i, y_i, z_i) = a_i^l(x_i, y_i, z_i)\exp\left[j\phi_i^l(x_i, y_i, z_i)\right] \quad (6)$$
This paper considers the ideal case of a pure phase-type DDNN structure, in which $a_i^l(x_i, y_i, z_i) = 1$ and optical loss is neglected. According to the Huygens–Fresnel principle, the incident wave at layer $l$ is the coherent superposition of the secondary waves emitted by every unit in layer $l-1$ [25]. Therefore, the complex amplitude $u_i^l$ of the light field output by the $i$-th neuron at position $(x_i, y_i, z_i)$ in layer $l$ can be expressed as follows:
$$u_i^l(x_i, y_i, z_i) = t_i^l(x_i, y_i, z_i)\sum_{k} u_k^{l-1}\, w_k^{l-1}(x_i, y_i, z_i) \quad (7)$$
In this equation, the summation $\sum_{k} u_k^{l-1} w_k^{l-1}(x_i, y_i, z_i)$ is the total field produced by all sensing nodes of layer $l-1$ after propagating to layer $l$, which then undergoes phase modulation by the $i$-th neuron of layer $l$. In this way, the secondary waves of each layer diffract to the next layer and eventually reach the output layer.
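The following PyTorch sketch illustrates this layer-by-layer forward model. For numerical convenience it uses an angular-spectrum propagator as a stand-in for the Rayleigh–Sommerfeld summation of Equation (5); the function names, the uniform layer spacing, and the phase-only modulation are stated assumptions for illustration rather than the paper's released implementation.

```python
import torch

def propagate(u, dx, wavelength, z):
    """Free-space propagation of a complex field u over distance z
    (angular-spectrum method, used here in place of the Rayleigh-Sommerfeld sum)."""
    H, W = u.shape[-2:]
    fx = torch.fft.fftfreq(W, d=dx)
    fy = torch.fft.fftfreq(H, d=dx)
    FY, FX = torch.meshgrid(fy, fx, indexing="ij")
    kz2 = (2 * torch.pi / wavelength) ** 2 - (2 * torch.pi * FX) ** 2 - (2 * torch.pi * FY) ** 2
    mask = (kz2 > 0).float()                       # drop evanescent components
    kz = torch.sqrt(torch.clamp(kz2, min=0.0))
    transfer = torch.exp(1j * kz * z) * mask
    return torch.fft.ifft2(torch.fft.fft2(u) * transfer)

def ddnn_forward(u_in, phase_layers, dx, wavelength, d):
    """Pure-phase DDNN: propagate, modulate, repeat for each diffraction layer."""
    u = u_in
    for phi in phase_layers:
        u = propagate(u, dx, wavelength, d)        # free-space step between layers
        u = u * torch.exp(1j * phi)                # phase-only modulation, a_i^l = 1
    u = propagate(u, dx, wavelength, d)            # final step to the output plane
    return u.abs() ** 2                            # intensity recorded by CCD1
```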
The fully connected structure of the network requires a high level of connectivity between the diffractive elements of each layer. The derived model of the secondary wavefront field is valid only when information is sufficiently transferred between layers, i.e., when the diffracted output of each layer can reach every unit of the next layer. Therefore, before designing the network parameters, it is necessary to calculate the diffraction angle of the light passing through the diffractive neurons and to ensure that the next diffractive layer is fully covered. The maximum half-cone diffraction angle can be calculated from the Fraunhofer diffraction formula; taking the minimum diffraction order, the maximum half-cone angle is as follows:
$$\theta_{\max} = \arcsin\left(\frac{\lambda}{a}\right) \quad (8)$$
where $a$ represents the size of the diffractive neurons in this model.
It is evident that a larger wavelength combined with smaller neurons yields a larger half-cone diffraction angle; for this reason, terahertz lasers were commonly used as light sources in previous studies. In this paper, a He–Ne laser with a wavelength of 632.8 nm is employed as the light source, following the general approach used to design visible-light diffractive neural networks in reference [26]. For a square diffraction layer, the diffraction radius $R$ of each diffraction point must be greater than the side length $L$ of the diffraction layer, ensuring that the entire region of the next layer’s diffractive elements is covered by the output light field of the previous layer. The side length $L$ of the diffraction layer can be expressed in terms of the number of neurons $N$ and the neuron size $a$, while the diffraction radius is determined by both the maximum half-cone diffraction angle and the interlayer distance $d$. These physical relationships can be summarized as Equation (9):
$$L = Na, \quad R = d\tan\theta_{\max}, \quad R \geq L \quad (9)$$
By combining this with Equation (8), it can be deduced that the interlayer distance $d$ must satisfy the following inequality:
$$d \geq \frac{Na}{\tan\left[\arcsin\left(\lambda/a\right)\right]} \quad (10)$$
As shown in Figure 4, the diffractive neural network proposed in this paper for indirect phase recovery consists of 5 layers of diffractive elements. Each layer has 400 × 400 pixels, and each pixel has a size of 4 μm. Without considering pixel gaps, the side length of the square diffraction layer is 3.2 mm. Based on the above calculation, the distance d between adjacent diffraction layers is set to 20 mm. Considering the intensity requirements at the receiving plane, the distance between the last diffraction layer and the CCD is set to 10 mm.
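A quick numerical check of this spacing is sketched below. The unit readings (micrometres for the neuron size, millimetres for the layer side and spacings) and the use of the stated layer side rather than the pixel count are assumptions adopted for this sketch.

```python
import numpy as np

# Plausibility check of the stated geometry (unit readings are assumptions).
wavelength = 632.8e-9            # He-Ne laser wavelength, m
neuron_size = 4e-6               # diffractive-neuron size, assumed 4 um
layer_side = 3.2e-3              # diffraction-layer side length, assumed 3.2 mm

theta_max = np.arcsin(wavelength / neuron_size)   # maximum half-cone angle, Eq. (8)
d_min = layer_side / np.tan(theta_max)            # minimum interlayer distance, Eq. (10)
print(f"theta_max = {np.degrees(theta_max):.1f} deg, d_min = {d_min * 1e3:.1f} mm")
# -> theta_max = 9.1 deg, d_min = 20.0 mm, matching the 20 mm spacing under these assumptions
```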
2.3. Network Backpropagation
To ensure consistency between the network output and the target output, the Adam optimizer is employed during training to adjust the network weights and minimize the loss function. The loss function in this paper consists of two parts. First, assume that a training input is $x$, where $x$ represents a distorted wavefront phase. The ground truth for each input corresponds to an array $A = (a_1, a_2, \ldots, a_{10})$, whose elements are the coefficients of the first ten Zernike polynomials, so that each training sample can be represented as the pair $\{x, A\}$. The mapping relationship between the distorted phase and the true coefficients is denoted as $A = G(x)$. Performing the prediction task with the diffractive optical neural network involves modulating the distorted wavefront phase $x$ onto a coherent light beam to obtain the input field $U_{\mathrm{in}}$, and then collecting the intensity $I_{\mathrm{out}}$ of the output field with a photoelectric coupling device.
The training dataset thus encodes the mapping relationship $A = G(x)$, where $x$ represents the input data with a distorted wavefront phase, while the mapping realized in the optical system is $I_{\mathrm{out}} = \hat{G}(U_{\mathrm{in}})$. These two mappings share a common essence but can be treated as two different approximation targets that guide the network convergence and progressively optimize the phase parameters $\phi_i^l$ of the diffractive neurons. Consequently, two distinct loss terms are derived: the RMS error between the output-plane intensity $I_{\mathrm{out}}$ and the ideal output intensity $I_{\mathrm{target}}$, and the RMS error between the coefficient array $\hat{A}$ computed from the output plane and the target coefficient array $A$ of the training set. Mathematically, these can be expressed as follows:
$$\mathrm{Loss1} = \sqrt{\frac{1}{M}\sum_{m=1}^{M}\left(I_{\mathrm{out}}^{(m)} - I_{\mathrm{target}}^{(m)}\right)^2} \quad (11)$$
$$\mathrm{Loss2} = \sqrt{\frac{1}{10}\sum_{n=1}^{10}\left(\hat{a}_n - a_n\right)^2} \quad (12)$$
where $M$ is the number of pixels on the output plane, and $\hat{a}_n$ and $a_n$ denote the computed and target coefficients, respectively.
Loss1 is the loss on the output-plane light intensity of the diffractive neural network; it suppresses stray light spots in the background of the output field and thus provides overall optimization. Loss2 is the loss on the label accuracy of the network’s phase inversion; it controls the intensity of the effective regions on the output plane. Therefore, the combined total loss of the network can be expressed as follows:
$$\mathrm{Loss} = \mathrm{Loss1} + \mathrm{Loss2} \quad (13)$$
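A compact PyTorch rendering of the two RMS terms and their combination might look like the following sketch; the tensor names are illustrative, and the equal weighting of the two terms follows the simple sum written above rather than a weighting stated by the paper.

```python
import torch

def ddnn_loss(I_out, I_target, c_pred, c_true):
    """Combined training loss (illustrative sketch).

    I_out / I_target : output-plane and ideal intensity maps.
    c_pred / c_true  : coefficient values read from the ten output regions
                       versus their ground-truth labels.
    """
    loss1 = torch.sqrt(torch.mean((I_out - I_target) ** 2))   # whole-plane RMS error
    loss2 = torch.sqrt(torch.mean((c_pred - c_true) ** 2))    # coefficient-label RMS error
    return loss1 + loss2
```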
3. Datasets and Network Training
To assess the performance of the diffraction neural network-based indirect phase recovery approach, we employed the Zernike polynomial method to simulate phase aberrations, generating a substantial volume of wavefront data for the training and validation datasets. To independently validate the network model’s effectiveness on single Zernike aberrations and combined Zernike aberrations, two distinct datasets were created for training: one consisting of single Zernike aberrations and the other of combined Zernike aberrations.
Dataset 1: Phase distortions generated from single Zernike polynomials ranging from Z1 to Z10 were utilized in this study. The ground truth consisted of Zernike coefficients, which were individually transformed using Equation (3) to obtain values ranging between zero and one. These transformed coefficients represented the intensity values assigned to the corresponding output regions. The transformation was applied only to non-zero coefficients to avoid interference from zero terms. Dataset 1 comprised 10,000 training images and 2000 testing images. The training set consisted of 10 different individual Zernike polynomials, each with 1000 images, while the testing set included 200 images for each polynomial.
Dataset 2: The distorted phases were generated by combining Zernike polynomials of orders one to ten. The output ground truth was obtained by transforming the coefficients of each Zernike term through Equation (3), resulting in output intensity values within the range (0, 1). Dataset 2 comprises a total of 12,000 images, with 10,000 images in the training set and 2000 images in the testing set.
For most applications of adaptive optics (AO) systems, the wavefront errors typically fall within a specific range over a given time frame. Drawing on accumulated experience in astronomical observations, we set the peak-to-valley (PV) values of the input aberration phases within the range of 0.3λ to 3λ, with an average PV value of 1.5λ and an average RMS error of 0.25λ. The dataset comprises a total of 12,000 images generated using Zernike polynomials of orders two to eleven. Among these, 10,000 images constitute the training set, and the remaining 2000 images form the test set. The distribution of the aforementioned data is illustrated in Figure 5.
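As an illustrative sketch of how such training phases could be generated, the function below draws random mode weights, scales the resulting screen to a target PV value in the stated range, and returns both the ground-truth coefficients and the phase screen. The precomputed basis of Zernike modes and the uniform sampling of the raw weights are assumptions of this sketch.

```python
import numpy as np

def random_zernike_phase(basis, pv_range=(0.3, 3.0), rng=None):
    """Draw one combined Zernike phase screen with a prescribed PV value.

    basis    : assumed precomputed Zernike modes, shape (N_modes, H, W).
    pv_range : target peak-to-valley range, expressed in wavelengths.
    """
    rng = np.random.default_rng() if rng is None else rng
    coeffs = rng.uniform(-1.0, 1.0, size=basis.shape[0])   # raw mode weights (assumed uniform)
    phase = np.tensordot(coeffs, basis, axes=1)
    pv_target = rng.uniform(*pv_range)                     # desired PV in lambda
    pv_now = phase.max() - phase.min()
    scale = pv_target / max(pv_now, 1e-12)
    return coeffs * scale, phase * scale                   # ground-truth coeffs + phase screen
```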
The diffraction neural network model for indirect phase recovery was trained with the PyTorch framework under Python 3.6 on an NVIDIA GeForce RTX 3080 GPU with 12 GB of memory. The training environments and parameters for the two networks were identical. The Adam optimizer was employed to optimize the parameters of the diffraction neural network, with training conducted over 100 epochs using a batch size of 128 and a learning rate of 0.01. The loss function and mean square error (MSE) decline curves of the training and validation sets during training are shown in Figure 6. The total duration of a single training session was 8 h.
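A skeleton of the training loop with the reported hyperparameters is sketched below; train_loader, ddnn_forward, ddnn_loss, and the region read-out helper are assumed to be defined as in the earlier sketches and are not part of the paper's released code.

```python
import torch

# Five trainable 400 x 400 phase layers, initialized to zero phase.
phase_layers = [torch.zeros(400, 400, requires_grad=True) for _ in range(5)]
optimizer = torch.optim.Adam(phase_layers, lr=0.01)

for epoch in range(100):                                  # 100 epochs
    for u_in, I_target, c_true in train_loader:           # batches of 128 samples (assumed loader)
        I_out = ddnn_forward(u_in, phase_layers, dx=4e-6, wavelength=632.8e-9, d=20e-3)
        c_pred = read_region_intensities_torch(I_out)     # hypothetical differentiable read-out
        loss = ddnn_loss(I_out, I_target, c_pred, c_true)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```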
We constructed a five-layer diffraction neural network to learn the mapping relationship between the distorted phase and the Zernike decomposition coefficients. The input beam is represented as a complex amplitude, while the output is expressed as a light intensity distribution. The training process continuously adjusts the phase value of each pixel in the five diffraction layers, progressively minimizing the differences between the 10 predicted Zernike coefficients and the ground truth. The phase distributions obtained for each layer after training are depicted in Figure 7. The physical preparation of the diffraction layers can subsequently be accomplished using techniques such as photolithography or 3D printing based on the obtained phase distributions.
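If the trained phase maps are to be fabricated, one common conversion, sketched here under the assumption of a refractive index of about 1.5 for the printed or etched material, turns each wrapped phase value into a surface-relief height:

```python
import numpy as np

def phase_to_height(phase, wavelength=632.8e-9, n_material=1.5):
    """Convert a trained phase profile to an etch/print height map.

    Uses h = phi * lambda / (2*pi*(n - 1)); the material index is an assumption.
    """
    phase = np.mod(phase, 2 * np.pi)              # wrap each value into one 2*pi zone
    return phase * wavelength / (2 * np.pi * (n_material - 1.0))
```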
5. Conclusions
To address the lack of indirect phase recovery methods based on diffractive neural networks, and to target the distortion phase recovery problem in optical systems whose aberration modes are simple and easy to model, we propose a diffraction neural network-based indirect phase inversion scheme in this work. The scheme utilizes the passive diffraction of multiple layers of diffractive units to invert, within a specific range, the Zernike polynomial coefficients corresponding to a distorted phase. The mathematical model and mapping relationships of the DDNN are derived, and the DDNN model is trained to obtain the optimal diffraction-layer phase distributions that meet the phase modulation requirements. When a distorted beam is incident on the diffraction network, the trained model outputs the Zernike coefficients corresponding to the distorted phase.

The simulation results demonstrate that this scheme can identify single-order Zernike phases and recover combined phases within a specific range. The network is evaluated through its output coefficients and phase correction results, showing that it significantly reduces the mean square error of the distorted phase and thus greatly improves the imaging quality of optical systems. The simulations also verify the impact of the number of diffraction layers on network performance.

By endowing the diffractive neural network with wavefront sensing capabilities, this work achieves low-power, high-speed phase recovery, offering advantages in convenience and cost control over the Shack–Hartmann sensor used in traditional adaptive optics systems. The proposed wavefront recovery scheme also demonstrates superior performance in terms of power consumption and real-time processing compared to wavefront recovery schemes relying on deep learning networks. However, the complexity of the recoverable wavefront is constrained by the designed light intensity distribution on the output plane, which limits the scheme’s effectiveness for high-order complex phases. Further research is required on the physical implementation and performance enhancement of this scheme. In summary, this scheme provides a new approach to distorted wavefront recovery, and future optical experiments are expected to validate it and contribute to the development of adaptive optics systems.