*Article* **Study of Image Classification Accuracy with Fourier Ptychography**

**Hongbo Zhang 1,\*, Yaping Zhang 2,\*, Lin Wang 3, Zhijuan Hu 4, Wenjing Zhou 5, Peter W. M. Tsang 6, Deng Cao <sup>1</sup> and Ting-Chung Poon <sup>3</sup>**


**Abstract:** In this research, the accuracy of image classification with Fourier Ptychography Microscopy (FPM) has been systematically investigated. Multiple linear regression shows a strong linear relationship between image classification accuracy and image visual appearance quality, as measured by PSNR and SSIM, across multiple training datasets including MNIST, Fashion-MNIST, CIFAR, Caltech 101, and customized training datasets. It is, therefore, feasible to predict the image classification accuracy based only on PSNR and SSIM. It is also found that the image classification accuracy of the higher resolution FPM-reconstructed images is significantly different from that of the lower resolution images under the lower numerical aperture (NA) condition. The difference is less pronounced under the higher NA condition.

**Keywords:** Fourier ptychography; image classification; deep learning; neural network

#### **1. Introduction**

Fourier Ptychography Microscopy (FPM) is a computational microscopy imaging technique potentially able to achieve a wide field of view and high-resolution imaging. It involves the use of a sequence of LEDs (LED array), which illuminates sequentially onto the target. Based on the sequential illumination, iterated sampling methods in the frequency domain are used for recovering higher resolution images [1–3]. Different types of Fourier Ptychography techniques have been proposed in the past for improving imaging resolutions [1–7]. Among them, multiplexed coded illumination techniques, laser-based implementations, aperture scanning Fourier Ptychography, camera scanning Fourier Ptychography, multi-camera approach, single-shot Fourier Ptychography, speckle illumination, X-ray Fourier Ptychography, and diffuser modulation have achieved successes [1]. More recent research has specifically addressed the Fourier Ptychography imaging problems such as the brightfield, phase, darkfield, reflective, multi-slice, and fluorescence imaging [3–5].

Driven by the significant interest in deep learning, a few different methods have been developed to solve the ill-posed Fourier Ptychography imaging problems. Notably, a convolutional neural network has been successfully applied for solving the Fourier Ptychography imaging problem [7]. A U-Net type structure-based generative adversarial network (GAN) was used to obtain similar quality image reconstruction from fewer samples (26 images versus a few hundred images) [8]. More promisingly, the U-Net plus GAN structure has been shown to reconstruct high-resolution images using only a small amount of lower resolution image input data [9]. An interpretable deep learning approach has also shown its effectiveness for the imaging of scattering materials [9,10].

**Citation:** Zhang, H.; Zhang, Y.; Wang, L.; Hu, Z.; Zhou, W.; Tsang, P.W.M.; Cao, D.; Poon, T.-C. Study of Image Classification Accuracy with Fourier Ptychography. *Appl. Sci.* **2021**, *11*, 4500. https://doi.org/10.3390/app11104500

Academic Editor: Jacek Wojtas

Received: 20 February 2021 Accepted: 11 May 2021 Published: 14 May 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Until recently, the majority of Fourier Ptychography imaging studies have focused primarily on achieving high-resolution imaging as judged by human visual perception. A significant open question is the impact of the technique on the broad spectrum of downstream visual tasks such as classification, segmentation, and object detection. The ability to answer this question is critical for applications such as industrial robotics, medical imaging, and industrial automation [10,11].

It is well known that deep learning is time-consuming and imposes significant computational challenges. Significant research effort has nevertheless shortened training: recent breakthroughs have reduced the training time on ImageNet from days to hours, and even further to minutes [12]. All these breakthroughs, however, come at a price: they require expensive GPU devices for training the deep neural network. The cost of the hardware is reflected in the price of the GPU card; for example, an NVIDIA P-100 GPU card costs about USD 6000 on the current market [12]. Training on all possible combinations of FPM reconstruction quality to obtain the image classification accuracy is feasible but time-consuming and costly. As such, it is desirable to formulate the relationship between the FPM reconstructed image quality and the image classification accuracy, in order to avoid training on all the different combinations of the data while still being able to identify the optimal FPM reconstruction parameters for the best image classification accuracy [13–20].

In this research, we use the Fourier Ptychography technique to produce higher resolution (FPM-based) images alongside lower resolution (without FPM) images for image classification. Following the FPM reconstruction, a deep convolution neural network is constructed to evaluate the image recognition accuracy for both the higher and lower resolution images. A multiple linear regression model is used to regress the relationship between the independent variables, namely the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM), and the dependent variable, the image classification accuracy. Based on the regression model, it becomes feasible to infer the image classification accuracy directly from PSNR and SSIM rather than through intensive and time-consuming deep learning-based image classification training. Because PSNR and SSIM are easier to calculate and have a lower computational cost than deep learning image classification training, the proposed method is useful for predicting the image classification accuracy of FPM-related visual tasks. Additionally, our research provides insights into the general effectiveness of FPM reconstruction for image classification by comparing images with and without FPM reconstruction (that is, higher and lower resolution images). Finally, our contribution of using PSNR and SSIM to predict image classification accuracy is not limited to the FPM technique but is also a general approach applicable to other imaging techniques such as digital holography, optical scanning holography, and transport of intensity imaging [21,22].

#### **2. Methodology**

#### *2.1. Fourier Ptychography*

For the imaging task, consider an object complex field $O_0(x, y)$ with spectrum $O_0(k_x, k_y)$, where $k_x$ and $k_y$ denote the spatial frequencies along the x and y directions. The *n*-th captured low-resolution raw image intensity is $I_n(x, y)$ ($n = 1, 2, 3, \ldots, N$) with a spectrum of $G_n(k_x, k_y)$. The coherent transfer function (CTF) of the microscope objective is given as $C(k_x, k_y) = \mathrm{circ}\left(k_r / (k_0 \mathrm{NA})\right)$, where $k_r = \sqrt{k_x^2 + k_y^2}$. The function $\mathrm{circ}(r/r_0)$ denotes a value 1 within a circle of radius $r_0$ and 0 otherwise, and $r = \sqrt{x^2 + y^2}$. NA is the numerical aperture of the microscope objective and $k_0$ is the wavenumber of the light source.
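The CTF definition above can be sketched numerically as a binary pupil mask. The following is a minimal sketch, not the authors' code: the function name, the 64-pixel grid, and the use of the 2.76 μm value as an effective sample-plane pixel size are all assumptions made for illustration.

```python
import numpy as np

def coherent_transfer_function(n_pixels, pixel_size, wavelength, na):
    """Binary pupil C(kx, ky) = circ(k_r / (k0 * NA)) on an FFT-ordered grid.

    pixel_size is an ASSUMED effective pixel size at the sample plane (metres).
    """
    k0 = 2 * np.pi / wavelength                              # wavenumber of the source
    # Angular spatial frequencies sampled by an n_pixels-wide image
    k = 2 * np.pi * np.fft.fftfreq(n_pixels, d=pixel_size)
    kx, ky = np.meshgrid(k, k, indexing="xy")
    kr = np.sqrt(kx**2 + ky**2)                              # radial spatial frequency
    return (kr <= k0 * na).astype(float)                     # 1 inside the pupil, 0 outside

# Illustrative values: 630 nm source, NA = 0.1, hypothetical 64 x 64 grid
ctf = coherent_transfer_function(n_pixels=64, pixel_size=2.76e-6,
                                 wavelength=630e-9, na=0.1)
```

A larger NA widens the pass band, so more of the object spectrum reaches each raw image.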

The FPM reconstruction starts from an initial guess of the spectrum distribution of the *n*-th raw image, $G_n^0(k_x, k_y)$, according to:

$$g_n^0(x, y) = \mathcal{F}^{-1}\left\{ G_n^0(k_x, k_y) \right\} = \mathcal{F}^{-1}\left\{ O_0(k_x - k_{xn}, k_y - k_{yn})\, C(k_x, k_y) \right\} \tag{1}$$

where $\mathcal{F}$ and $\mathcal{F}^{-1}$ represent the Fourier transform and inverse Fourier transform, respectively. $k_{xn}$ and $k_{yn}$ indicate the phase-shift caused by the oblique illumination along the x and y directions. To update $g_n^0$, we replace the amplitude of $g_n^0$ with $\sqrt{I_n}$ as:

$$g_n^1(x, y) = \sqrt{I_n}\, \frac{g_n^0}{\left| g_n^0 \right|} \tag{2}$$

The superscript indices 0 and 1 in $g_n(x, y)$ denote the low-resolution image before and after the update, respectively. The updated spectrum is $G_n^1(k_x, k_y) = \mathcal{F}\left\{ g_n^1(x, y) \right\}$; as such, the updated spectrum $O_0^1(k_x, k_y)$ is:

$$O_0^1(k_x - k_{xn}, k_y - k_{yn}) = O_0(k_x - k_{xn}, k_y - k_{yn}) \left[ 1 - C(k_x, k_y) \right] + G_n^1(k_x, k_y) \tag{3}$$

Applying Equations (1) to (3) to $I_n$ from $n = 1$ to $n = N$ constitutes one iteration. Convergence is assessed with the error function $E_k$ of the *k*-th iteration:

$$E_k = \sum_{n=1}^{N} \sum_{x, y} \left\{ \left| g_{k,n}^0(x, y) \right| - \sqrt{I_n(x, y)} \right\}^2 \tag{4}$$

The second summation indicates the pixel-by-pixel summation for every single image. In $g_{k,n}^0$, the superscript 0 and the subscript $k$ indicate the complex distribution of the *n*-th low-resolution image before the *k*-th iteration is completed. Following the *k*-th iteration, the spectrum $O_k(k_x, k_y)$, in which the high-frequency components are maintained, is recovered. Its intensity distribution is:

$$I_k = \left| O_k(x, y) \right|^2 = \left| \mathcal{F}^{-1}\left\{ O_k(k_x, k_y) \right\} \right|^2 \tag{5}$$

$O_k(k_x, k_y)$ in Equation (5) denotes the desired spectrum after the *k*-th iteration; correspondingly, $O_0(k_x, k_y)$ denotes the original spectrum prior to the iterative update. The same convention applies to $I_k$ and $O_k(x, y)$. Figure 1 shows the sequence of the FPM algorithm.

**Figure 1.** The iteration process of FPM. In the iteration, the sampling rate of the initial guess of the high-resolution object is higher than the collected low-resolution images. Through the iterative process, the reconstructed image is increased in spatial resolution.
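The iteration in Equations (1) to (5) can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. All helper names are hypothetical. The illumination shifts are integer pixel offsets in a centred spectrum, the object is synthetic, and the initial guess is a constant field. The updated spectrum is also restricted to the pupil support by multiplying $G_n^1$ by $C$, a common implementation choice that Equation (3) does not write explicitly.

```python
import numpy as np

def centered_fft2(field):
    return np.fft.fftshift(np.fft.fft2(field))

def centered_ifft2(spectrum):
    return np.fft.ifft2(np.fft.ifftshift(spectrum))

def circular_ctf(m, radius):
    """Centred binary pupil of m x m pixels (radius in frequency pixels)."""
    yy, xx = np.mgrid[-(m // 2):m - m // 2, -(m // 2):m - m // 2]
    return (xx**2 + yy**2 <= radius**2).astype(float)

def crop(spectrum, shift, m):
    """View of the m x m spectrum window centred at DC + shift."""
    c = spectrum.shape[0] // 2
    r0, c0 = c + shift[0] - m // 2, c + shift[1] - m // 2
    return spectrum[r0:r0 + m, c0:c0 + m]

def simulate_low_res(obj, shifts, ctf):
    """Forward model: one low-resolution intensity image per illumination."""
    spectrum = centered_fft2(obj)
    m = ctf.shape[0]
    return [np.abs(centered_ifft2(crop(spectrum, s, m) * ctf)) ** 2 for s in shifts]

def fpm_reconstruct(images, shifts, ctf, hr_size, n_iter=10):
    m = ctf.shape[0]
    # ASSUMPTION: constant-field initial guess for the high-resolution object
    spectrum = centered_fft2(np.ones((hr_size, hr_size), dtype=complex))
    errors = []
    for _ in range(n_iter):
        err = 0.0
        for intensity, shift in zip(images, shifts):
            window = crop(spectrum, shift, m)                       # view into spectrum
            g0 = centered_ifft2(window * ctf)                       # Eq. (1)
            # Error term of Eq. (4), evaluated before the amplitude replacement
            err += np.sum((np.abs(g0) - np.sqrt(intensity)) ** 2)
            g1 = np.sqrt(intensity) * np.exp(1j * np.angle(g0))     # Eq. (2)
            # Eq. (3), with the update masked by the pupil (assumption)
            window[...] = window * (1 - ctf) + centered_fft2(g1) * ctf
        errors.append(float(err))
    return np.abs(centered_ifft2(spectrum)) ** 2, errors            # Eq. (5)

# Synthetic demonstration with a hypothetical 5 x 5 illumination grid
rng = np.random.default_rng(0)
obj = 0.2 + 0.8 * rng.random((64, 64))
ctf = circular_ctf(32, radius=8)
shifts = [(dr, dc) for dr in range(-8, 9, 4) for dc in range(-8, 9, 4)]
shifts.sort(key=lambda s: s[0] ** 2 + s[1] ** 2)    # start from the central illumination
images = simulate_low_res(obj, shifts, ctf)
recon, errors = fpm_reconstruct(images, shifts, ctf, hr_size=64)
```

The `errors` list holds one value of the error function per iteration and can be used as a stopping criterion.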

In the simulation, we chose to use a 630 nm red laser. The camera CCD resolution is 2.76 μm. The distance between the LED array and the sample is 90 mm, and the gap between LEDs is 4 mm. The Fourier Ptychography image reconstruction process starts from the lower resolution direct images. In our study, we used 225 lower resolution images (15 × 15 LED array). Based on the 225 lower resolution images, we performed spectrum sampling in the frequency domain. Based on Equation (3), the higher resolution image spectrum was updated until the specified convergence condition, evaluated through Equation (4), was reached. For the comparison between the FPM reconstructed image (the higher resolution image) and the lower resolution image (the raw image that does not go through the FPM reconstruction process), the image from the middle LED of the 15 by 15 LED array was used. The system numerical aperture (NA) was varied from 0.05 to 0.5.
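The illumination wavevector shifts $k_{xn}$ and $k_{yn}$ follow from the geometry given above. The sketch below is a hedged illustration that assumes a planar LED array centred on the optical axis and plane-wave illumination from each LED. The constant and function names are hypothetical, and the sign convention of the shift is an assumption.

```python
import math

WAVELENGTH = 630e-9     # m, from the text
LED_GAP = 4e-3          # m, from the text
LED_DISTANCE = 90e-3    # m, from the text
GRID = 15               # 15 x 15 LED array, from the text
K0 = 2 * math.pi / WAVELENGTH

def illumination_wavevectors():
    """Map LED index (i, j) to its transverse wavevector (k_xn, k_yn)."""
    half = GRID // 2
    vectors = {}
    for i in range(-half, half + 1):
        for j in range(-half, half + 1):
            x, y = i * LED_GAP, j * LED_GAP
            r = math.sqrt(x * x + y * y + LED_DISTANCE ** 2)
            # k0 times the direction cosines of the LED position (sign is a convention)
            vectors[(i, j)] = (K0 * x / r, K0 * y / r)
    return vectors

vectors = illumination_wavevectors()
# Largest illumination NA, reached by the corner LEDs
max_illumination_na = max(math.hypot(kx, ky) for kx, ky in vectors.values()) / K0
```

Under these assumptions the corner LEDs correspond to an illumination NA of about 0.40, which indicates how far the sampled spectrum extends beyond the objective pass band.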

#### *2.2. Image Classification*

For the evaluation of the accuracy of image classification, deep convolution neural networks were used. We used six different neural networks for systematic image classification. The first deep convolution neural network was designed to train on the MNIST dataset. Its architecture is shown in Figure 2 and has the following structure. Given an input image of size 28 by 28, a 3 by 3 convolution was performed, producing 16 feature maps of size 26 by 26. A 2 by 2 max-pooling was then performed, producing 16 feature maps of size 13 by 13. A further 3 by 3 convolution led to 32 feature maps of size 10 by 10. Subsequently, a 2 by 2 max-pooling was conducted, yielding 32 feature maps of size 5 by 5. A final 3 by 3 convolution yielded 64 feature maps of size 2 by 2. A fully connected layer of size 10 followed. Finally, softmax was performed for image classification.

**Figure 2.** Convolution neural network architecture for image classification using MNIST data.
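The feature map sizes of such a network can be checked with the standard output-size formula, floor((n + 2p − k)/s) + 1. The sketch below assumes stride 1 and no padding for the convolutions, which is the setting that reproduces the 28 to 26 step, and a stride of 2 for the 2 by 2 pooling. The function names are hypothetical.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window=2, stride=2):
    """Spatial output size of a max-pooling layer."""
    return (size - window) // stride + 1

size = 28
trace = [size]
for op in ("conv", "pool", "conv", "pool", "conv"):
    size = conv_out(size, 3) if op == "conv" else pool_out(size)
    trace.append(size)

print(trace)  # [28, 26, 13, 11, 5, 3]
```

With these assumed settings, the first two stages reproduce the 26 by 26 and 13 by 13 sizes quoted above, while the later convolutions give 11 by 11 and 3 by 3. The 10 by 10 and 2 by 2 sizes in the text would correspond to a different kernel, stride, or padding setting that is not specified here.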

Similarly, the same network architecture was also used for the Fashion-MNIST dataset training, as shown in Figure 3. The Fashion-MNIST dataset is built following the principle of the MNIST dataset. It has an identical format to the MNIST dataset in terms of image size, but has a more complex geometry, with 10 classes: t-shirt, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, and ankle boot [14]. It is known that Fashion-MNIST is a more challenging data benchmark for image classification tasks [15].

**Figure 3.** Convolution neural network architecture for image classification using Fashion-MNIST data.

It is worthwhile to note that the purpose of our research was not to achieve the highest image classification accuracy; rather, we wanted to compare the image classification accuracy of FPM reconstructed images (higher resolution) with that of the lower resolution images (raw lower resolution images as collected). We also wanted to discover the relationship between PSNR, SSIM, and image classification accuracy. As such, we randomly chose only part of the entire 60,000 images. Through our experiments, we found that a partial set of images is able to achieve reasonable training accuracy while preventing overfitting. Thus, in this research, we chose to use 2560 MNIST images for training. There are two advantages. First, the use of fewer images enables more rapid training. For example, training on 50,000 FPM-processed images requires over 30 h on a moderate performance laptop, whereas training on 2560 images requires nearly 20 times less training time. Second, this allowed us to run multiple trainings for different NAs within a reasonable time frame, and thus we were able to build the regression model between PSNR, SSIM, and classification accuracy. Nevertheless, the principles of the method proposed in this research can be applied to an arbitrary number of images for training and evaluation.

Similarly, for Fashion-MNIST, 2560 FPM-processed images were used for training. A total of 512 FPM-processed images were used for the evaluation of the classification accuracy. Stochastic gradient descent with momentum was used as the optimizer for both MNIST and Fashion-MNIST training. Compared to batch gradient descent, mini-batch stochastic gradient descent is more capable of reducing the training error; as such, it was adopted to minimize the training error. The training learning rate was chosen as 0.0001. For all the convolution layers, a padding of 1 was used. Eighty epochs were iterated before the training stopped.
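The optimizer described above can be illustrated on a toy problem. The following is a minimal sketch of mini-batch stochastic gradient descent with momentum applied to a synthetic linear least-squares task, not to the networks in this study. The learning rate, momentum, and batch size are illustrative values chosen so that the toy problem converges quickly, and the function name is hypothetical.

```python
import numpy as np

def sgd_momentum(x, y, lr=0.01, momentum=0.9, batch_size=32, epochs=80, seed=0):
    """Fit y ~ x @ w with mini-batch SGD and classical (heavy-ball) momentum."""
    rng = np.random.default_rng(seed)
    w = np.zeros(x.shape[1])
    velocity = np.zeros_like(w)
    for _ in range(epochs):
        order = rng.permutation(len(x))                 # reshuffle every epoch
        for start in range(0, len(x), batch_size):
            idx = order[start:start + batch_size]
            residual = x[idx] @ w - y[idx]
            grad = 2 * x[idx].T @ residual / len(idx)   # gradient of the mean squared error
            velocity = momentum * velocity - lr * grad  # accumulate past gradients
            w = w + velocity
    return w

# Synthetic, noise-free data (hypothetical values for illustration only)
rng = np.random.default_rng(1)
x = rng.normal(size=(256, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = x @ true_w
w = sgd_momentum(x, y)
```

The velocity term averages recent gradients, which damps the noise introduced by evaluating the gradient on small batches.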

The CIFAR dataset was also used for both training and evaluation. The CIFAR dataset includes 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. Unlike MNIST and Fashion-MNIST, which have only a single channel, images from the CIFAR dataset have three channels (RGB). Similar to MNIST and Fashion-MNIST, 2560 FPM-processed images were used for training, and 512 FPM-processed images were used for the evaluation of the classification accuracy. A residual type of neural network structure was adopted for training and evaluation. The residual network structure is known to enable deeper networks without the vanishing gradient problem [16], which suggests that even with deeper layers, the training is still able to converge within a reasonable amount of training time. The residual network is also able to perform well on the fusion of image feature maps, taking advantage of its long-short memory mechanisms, and thus achieves better image classification accuracy [16]. The learning rate was chosen as 0.1. In total, 80 training epochs were conducted. The residual network structure is shown in Figure 4. To reduce the likelihood of over-fitting, image augmentation including image translation and reflection was conducted. Cropping was performed prior to the start of training [17].

**Figure 4.** Residual convolution neural network architecture for image classification using the CIFAR dataset. The line (on the right side) in the residual layer represents a skipped connection.
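The skipped connection can be illustrated with a minimal sketch. For brevity it uses dense (matrix) layers instead of convolutions, which is a simplification, and the function names and shapes are hypothetical. The block computes ReLU(x + F(x)), so the input passes through unchanged whenever the learned transform F contributes nothing.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = ReLU(x + F(x)) with F(x) = w2 @ ReLU(w1 @ x)."""
    transformed = w2 @ relu(w1 @ x)   # residual branch F(x)
    return relu(x + transformed)      # skip connection adds the input back

x = np.array([1.0, 2.0, 3.0])
zero = np.zeros((3, 3))
identity_output = residual_block(x, zero, zero)   # F(x) = 0, so the block passes x through
```

Because the identity path is always available, gradients can flow through the addition even when the residual branch has small weights.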

The Caltech 101 dataset was also used for training. Caltech 101 contains 101 classes including plane, chair, soccer, brain, and others. Each class has about 60 images. In total, 25 categories were used for training and evaluation. In contrast to the MNIST, Fashion-MNIST, and CIFAR image sets, which have limited resolutions of 28 by 28 and 32 by 32, the Caltech 101 image resolution is much higher, mostly with a size of 300 by 200. The use of larger images can better ensure the FPM image reconstruction accuracy: the FPM reconstruction method involves iteratively sampling the overlap region of the lower resolution images in the frequency domain, so a larger input can help to improve the reconstruction accuracy [1]. The inclusion of larger input images also makes the input dataset richer, which helps to build a more diversified relationship model between image classification accuracy, PSNR, and SSIM. Similar to the training performed on the CIFAR dataset, a deep residual network, ResNet-50, was used for the image classification task, as shown in Figure 5. The neural network had 177 layers connected by residual blocks. Before the training started, color and cropping-based image augmentation was performed to reduce the likelihood of overfitting [17,18]. Note that the rectified linear unit (ReLU) activation function was used in the residual network. This function makes it possible to train a deeper neural network without the vanishing gradient problem because it combines the advantages of linear and nonlinear transformations of the input. Batch normalization reduced the value variation for each layer, thus increasing the stability of the deep network training and reducing the number of epochs needed to achieve the ideal classification accuracy.

**Figure 5.** Residual convolution neural network architecture for image classification training using the Caltech 101 dataset. Res2s\_branch2a represents the second stage and the second branch.
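The two building blocks mentioned above, batch normalization and the ReLU activation, can be sketched in a few lines. This is a simplified training-mode version without running statistics or learned parameters; the function names and the synthetic activations are assumptions for illustration.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature (column) over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def relu(x):
    """Identity for positive inputs, zero otherwise."""
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
activations = rng.normal(loc=5.0, scale=3.0, size=(128, 4))   # synthetic layer outputs
normalized = batch_norm(activations)
activated = relu(normalized)
```

After normalization every feature has approximately zero mean and unit variance over the batch, which is the reduction in value variation described in the text.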

Customized data were also used to test the image classification accuracy with and without FPM reconstruction. Two customized training datasets were included, namely the flowers and apple pathology datasets. The flowers dataset included 4242 images with sub-categories of daisy, dandelion, flowers, rose, sunflower, and tulip. The number of images per category was balanced. The balanced apple pathology dataset had 3200 images, covering black rot, cedar rust, scab, and healthy apples. For the flowers dataset, SqueezeNet was used for the classification of the different types of flowers, as shown in Figure 6. SqueezeNet consists of repeated squeeze and expansion neural network modules. The squeeze and fire modules employ a 1 by 1 filter. The direct benefit of using a 1 by 1 convolution filter is that the network requires fewer parameters and is more memory efficient. The subsequent downsampling (e.g., pool10) enables a larger activation map, which is beneficial for maximizing the classification accuracy.
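The parameter saving from 1 by 1 filters can be made concrete with a short count. The sketch below compares a plain 3 by 3 convolution with a squeeze-and-expand module that produces the same number of output channels. The channel counts (96 inputs, 16 squeeze filters, 64 + 64 expand filters) are illustrative values chosen for the example and are not taken from the network in Figure 6; the function names are hypothetical.

```python
def conv_params(in_channels, out_channels, kernel, bias=True):
    """Number of weights (and biases) in one convolution layer."""
    return out_channels * (in_channels * kernel * kernel + (1 if bias else 0))

def fire_params(in_channels, squeeze, expand_1x1, expand_3x3):
    """Squeeze with 1x1 filters, then expand with parallel 1x1 and 3x3 filters."""
    return (conv_params(in_channels, squeeze, 1)
            + conv_params(squeeze, expand_1x1, 1)
            + conv_params(squeeze, expand_3x3, 3))

plain = conv_params(96, 128, 3)        # direct 3x3 convolution, 96 -> 128 channels
fire = fire_params(96, 16, 64, 64)     # squeeze to 16, expand to 64 + 64 = 128 channels

print(plain, fire)  # 110720 11920
```

In this example the squeeze step reduces the parameter count by roughly a factor of nine, because the expensive 3 by 3 filters only see 16 input channels instead of 96.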

For the apple pathology dataset, a Google Inception-based network was used for the classification of the apple diseases, as shown in Figure 7. Google Inception employs the inception layer as the backbone of the network. In total, nine repeated inception modules were used. Within the inception modules, filters of different sizes were used, namely 1 by 1, 3 by 3, and 5 by 5. This saves weight parameters in the deep network. Additionally, dividing the sequential filters into four branches within the inception layer further reduced the number of parameters, thus enabling a further increase in computational efficiency.

During training, a fine-tuning approach was used. The first three layers of the network were frozen and only the last two layers were trained. The data were augmented through reflection along one side of the image, followed by random translation and scaling along both sides of the image. A mini-batch size of 5 was used throughout the training process. Random shuffling of the data was performed to reduce the likelihood of over-fitting. Twenty percent of the data was used for validation of the training accuracy. To achieve good training accuracy, a small initial learning rate of 0.0003 was used. In total, six epochs of training were found to be sufficient for achieving a converged training accuracy.
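The reflection and translation steps of the augmentation can be sketched as below. This is a simplified illustration with a hypothetical function name and shift range; random scaling is omitted, and pixels shifted in from outside the image are filled with zeros, which is an assumption.

```python
import numpy as np

def augment(image, rng, max_shift=3):
    """Random horizontal reflection followed by a random translation."""
    if rng.random() < 0.5:
        image = image[:, ::-1]                          # reflect along one side
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    # Copy the part of the image that stays inside the frame after the shift
    src = image[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

rng = np.random.default_rng(0)
image = rng.random((28, 28))          # synthetic stand-in for a training image
augmented = augment(image, rng)
```

Applying a fresh random transform each time an image is drawn means that the network rarely sees exactly the same input twice.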

**Figure 6.** SqueezeNet convolution neural network architecture for image classification training using the flowers dataset. Fire2\_squeeze means the second stage squeeze module.

**Figure 7.** Google inception convolution neural network architecture for image classification training using the apple pathology dataset. Conv2\_3 × 3 represents the second stage, 3 × 3 convolution filter.

Following the training on the MNIST, Fashion-MNIST, CIFAR, Caltech 101, flowers, and apple pathology datasets, the image classification accuracy was obtained for the FPM reconstructed image, the lower resolution image, and the ground truth image. Furthermore, PSNR and SSIM were calculated for the FPM reconstructed image, the lower resolution image, and the ground truth image, using the ground truth image as the reference. Multiple linear regression was then performed with PSNR and SSIM as the independent variables and image classification accuracy as the dependent variable. A *p*-value of 0.05 was used as the significance threshold of the F-Test performed on the goodness of fit of the regression.
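The two image quality metrics can be sketched as follows. PSNR is computed from the mean squared error in the usual way. The SSIM function below is a simplified single-window (global) variant evaluated over the whole image, whereas standard implementations average SSIM over local sliding windows, so its values will generally differ from those of a library routine. The function names, the [0, 1] data range, and the test images are assumptions for illustration.

```python
import numpy as np

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB, with `reference` as ground truth."""
    mse = np.mean((reference - test) ** 2)
    return float("inf") if mse == 0 else float(10 * np.log10(data_range ** 2 / mse))

def global_ssim(reference, test, data_range=1.0):
    """Single-window SSIM over the whole image (simplified, not the windowed form)."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = reference.mean(), test.mean()
    vx, vy = reference.var(), test.var()
    cov = np.mean((reference - mx) * (test - my))
    return float(((2 * mx * my + c1) * (2 * cov + c2))
                 / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))

rng = np.random.default_rng(0)
reference = rng.random((32, 32))      # synthetic stand-in for a ground truth image
shifted = reference + 0.1             # uniform brightness offset of 0.1

print(round(psnr(reference, shifted), 3))  # 20.0
```

Both metrics require only a handful of array operations per image pair, which is the low computational cost that motivates using them as predictors.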

#### **3. Results**

The results of Fourier Ptychography are shown in Figures 8–10 for the different datasets. Figure 8 shows the results using MNIST and Fashion-MNIST.

**Figure 8.** (**A**) Lower resolution MNIST image; (**B**) FPM reconstructed MNIST image; (**C**) original MNIST image; (**D**) lower resolution Fashion MNIST image; (**E**) FPM reconstructed fashion MNIST image; (**F**) original Fashion MNIST image.

**Figure 9.** (**A**) Lower resolution CalTech 101 image; (**B**) FPM reconstructed CalTech 101 image; (**C**) original CalTech 101 image; (**D**) lower resolution CIFAR image; (**E**) FPM reconstructed CIFAR image; (**F**) original CIFAR image.

**Figure 10.** (**A**) Lower resolution apple pathology image; (**B**) FPM-reconstructed apple pathology image; (**C**) original apple pathology image; (**D**) lower resolution flowers image; (**E**) FPM-reconstructed flowers image; (**F**) original flowers image.

The lower resolution image, FPM reconstructed image, and original image based on CIFAR and CalTech 101 are shown in Figure 9. Similarly, the lower resolution image, FPM reconstructed image, and original image based on the flowers and apple pathology datasets are shown in Figure 10. Consistently, all four datasets show the effectiveness of FPM reconstruction in the generation of higher resolution images.

The results of PSNR, SSIM, and classification accuracy for the CIFAR, CalTech101, MNIST, Fashion-MNIST, flowers, and apple pathology datasets are shown in Tables 1–3.


**Table 1.** PSNR, SSIM, and classification accuracy for CIFAR and CalTech101.


The multiple variable linear regression between independent variables of PSNR and SSIM and the dependent variable of classification accuracy was performed. Figure 11 shows the linear relationships between SSIM, PSNR, and classification accuracy. As shown in Figure 11, the linear relationship between image classification accuracy, PSNR, and SSIM is evident.


**Table 3.** PSNR, SSIM, and classification accuracy for flowers and apple pathology datasets.

**Figure 11.** Visualization of data distribution for multiple linear regression between image classification accuracy, PSNR, and SSIM.

The *p*-values, an indicator of the efficacy of the multiple linear regression, are shown in Table 4. It is also clear that all the F-Test *p*-values are less than 0.05 showing that a significant linear relationship exists between these variables within the linear regression model.
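The regression and its F-statistic can be sketched with ordinary least squares. The data below are synthetic values generated for illustration only and are not the results reported in Tables 1–3. The function name is hypothetical, and the p-value step assumes that SciPy is available and is skipped otherwise.

```python
import numpy as np

def fit_multiple_linear_regression(psnr, ssim, accuracy):
    """Fit accuracy = b0 + b1 * PSNR + b2 * SSIM and return (beta, R^2, F)."""
    design = np.column_stack([np.ones_like(psnr), psnr, ssim])
    beta, *_ = np.linalg.lstsq(design, accuracy, rcond=None)
    fitted = design @ beta
    ss_res = np.sum((accuracy - fitted) ** 2)            # unexplained variation
    ss_tot = np.sum((accuracy - accuracy.mean()) ** 2)   # total variation
    n, p = len(accuracy), 2                              # observations, predictors
    f_stat = ((ss_tot - ss_res) / p) / (ss_res / (n - p - 1))
    return beta, float(1 - ss_res / ss_tot), float(f_stat)

# SYNTHETIC example data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
psnr_values = np.array([12.0, 15.0, 18.0, 21.0, 24.0, 27.0, 30.0, 33.0])
ssim_values = np.array([0.30, 0.42, 0.47, 0.60, 0.66, 0.80, 0.85, 0.93])
accuracy = 5 + 1.5 * psnr_values + 40 * ssim_values + rng.normal(0, 1, size=8)

beta, r_squared, f_stat = fit_multiple_linear_regression(psnr_values, ssim_values, accuracy)

# The F-Test p-value is the upper tail of the F distribution with (p, n - p - 1)
# degrees of freedom; it is computed only if SciPy is installed.
try:
    from scipy import stats
    p_value = float(stats.f.sf(f_stat, 2, len(accuracy) - 3))
except ImportError:
    p_value = None
```

A p-value below the 0.05 threshold indicates that PSNR and SSIM jointly explain a statistically significant share of the variation in accuracy.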

PSNR and SSIM increased with higher values of numerical aperture. For the purpose of illustrating the increased image classification accuracy due to the increase in PSNR and SSIM, the neural network activation map under different numerical apertures is presented. Figure 12 shows the Google inception convolution neural network activation map and the reconstructed images corresponding to each numerical aperture. The reconstructed image on the top of Figure 12 is the input of the Google inception neural network, and the activation map is the gradient of the features learned from the neural network.


**Table 4.** F-Test *p*-values for multiple linear regression between image classification accuracy, PSNR, and SSIM for MNIST, Fashion-MNIST, CIFAR, CalTech 101, and the flowers and apple pathology datasets.

**Figure 12.** Visualization of the apple pathology images and Google inception convolution neural network activation map (gradient) at the 141st layer. In total, the Google inception convolution neural network had 144 layers. The 141st layer is followed by the fully connected layer, the softmax layer, and the output layer. The green color corresponds to the vanished gradient. Fewer features in the activation map correspond to the reduction in gradient.

#### **4. Conclusions**

In this study, we have investigated image classification accuracy with and without FPM reconstruction using six different image classifiers. We have also compared the image classification accuracy of the FPM reconstructed images with that of the lower resolution images shown in Figures 8–10. It is clear that the lower resolution image has lower visual quality than the FPM-reconstructed image. This finding is reinforced by the significant difference between the lower resolution and FPM reconstructed images in terms of image classification accuracy, especially under the lower NA conditions (e.g., 0.05 NA). For MNIST, Fashion-MNIST, CIFAR, Caltech 101, and the customized training datasets, when the NA is lower, the classification accuracy of the lower resolution images becomes significantly lower than that of the FPM-reconstructed images. In contrast, when the image quality is higher under a higher NA condition (e.g., 0.5 NA), the image classification performance differs less between images with and without FPM reconstruction, which is possibly limited by classifier capabilities.

Nevertheless, the use of FPM reconstruction to improve image classification accuracy is meaningful because the lower NA image without FPM reconstruction suffers from a lower accuracy of image classification. This observation is clear across all six datasets with different image classifiers. The severity of this degradation is illustrated by the finding that, for the Caltech 101 dataset, the accuracy of image classification under the lower NA (0.05) and lower resolution conditions is 87% lower than that of the ground truth. In this situation, the use of FPM reconstruction is indeed helpful, improving the image classification accuracy from 4.75% to 20%. Our results also suggest that different classifiers differ in their ability to deal with lower NA images. The differences could be caused by different neural network structures. For example, the Google Inception network is found to better retain the image classification accuracy even for the lower NA images, which is likely due to the ability of the Inception module within the network to handle lower resolution images.

We further built a multiple linear regression between image classification accuracy (dependent variable) and PSNR and SSIM (independent variables). Results show that there is a linear relationship between the dependent variable and the independent variables (Figure 11). The related regression F-Test *p*-value is also smaller than 0.05, which indicates that the linear relationship is statistically significant. The linear relationship implies that it is feasible to predict the image classification accuracy based on the PSNR and SSIM values. The impact of noise on image classification performance has been documented in previous literature [23,24]. Specifically, Figure 12 shows the activation maps, which are the gradient features learned by the Google inception neural network. It is evident that for a low numerical aperture, which is associated with a lower SSIM and PSNR, more of the learned gradient features vanish or are reduced. It is clear that a more strongly vanished gradient in the learned features corresponds to the lower numerical apertures of 0.1 and 0.3, thus compromising the network classification accuracy. For a numerical aperture of 0.5, the vanished gradient is reduced, which corresponds to the improved image classification accuracy.

This study had some limitations. First, a greater number of trials for training and the evaluation of image classification accuracy for different NAs are needed. The linear relationship between SSIM, PSNR, and image classification accuracy needs to be evaluated based on a greater number of such trials. Specifically, an examination of the relationship is also needed under extremely low or high NA conditions, e.g., lower than 0.05 NA or greater than 0.5 NA. Furthermore, the factors involved in the study are limited. We have not included different noise types, training and testing data ratios, or a more extensive number of classifiers. We plan to address these limitations in future work.

**Author Contributions:** Conceptualization, H.Z., Y.Z., L.W., Z.H., T.-C.P. and P.W.M.T.; methodology, H.Z., W.Z., D.C. and L.W.; software, H.Z. and D.C.; validation, Y.Z., T.-C.P. and P.W.M.T.; formal analysis, H.Z. and T.-C.P.; investigation, T.-C.P.; resources, Y.Z.; data curation, H.Z.; writing, original draft preparation, H.Z. and L.W.; writing, review and editing, T.-C.P. and W.Z.; visualization, Y.Z.; supervision, T.-C.P.; project administration, H.Z. and Y.Z.; funding acquisition, Y.Z. and P.W.M.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** National Natural Science Foundation of China (11762009, 61865007); Natural Science Foundation of Yunnan Province (2018FB101); the Key Program of Science and Technology of Yunnan Province (2019FA025); General Research Fund (GRF) of Hong Kong SAR, China (Grant No: 11200319).

**Institutional Review Board Statement:** Not Applicable.

**Informed Consent Statement:** Not Applicable.

**Data Availability Statement:** The data used in this study include MNIST, Fashion MNIST, Cifar, and Caltech 101. They are available at the following URLs: MNIST: http://yann.lecun.com/exdb/mnist/; Fashion MNIST: https://www.kaggle.com/zalando-research/fashionmnist; Cifar: https://www.cs.toronto.edu/~kriz/cifar.html; Caltech 101: http://www.vision.caltech.edu/Image_Datasets/Caltech101/.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **A New Method to Verify the Measurement Speed and Accuracy of Frequency Modulated Interferometers**

**Toan-Thang Vu 1, Thanh-Tung Vu 1,\*, Van-Doanh Tran 1, Thanh-Dong Nguyen <sup>1</sup> and Ngoc-Tam Bui 1,2,\***


**\*** Correspondence: tung.vuthanh@hust.edu.vn (T.-T.V.); tambn@shibaura-it.ac.jp (N.-T.B.)

**Abstract:** The measurement speed and measurement accuracy of a displacement measuring interferometer are key parameters. To verify these parameters, a fast and high-accuracy motion is required. However, the displacement induced by a mechanical actuator generates disadvantageous features, such as slow motion, hysteresis, distortion, and vibration. This paper proposes a new method for a nonmechanical high-speed motion using an electro-optic modulator (EOM). The method is based on the principle that all displacement measuring interferometers measure the phase change to calculate the displacement. This means that the EOM can be used to accurately generate phase change rather than a mechanical actuator. The proposed method is then validated by placing the EOM into an arm of a frequency modulation interferometer. By using two lock-in amplifiers, the phase change in an EOM and, hence, the corresponding virtual displacement could be measured by the interferometer. The measurement showed that the system could achieve a displacement at 20 kHz, a speed of 6.08 mm/s, and a displacement noise level < 100 pm/√Hz above 2 kHz. The proposed virtual displacement can be applied to determine both the measurement speed and accuracy of displacement measuring interferometers, such as homodyne interferometers, heterodyne interferometers, and frequency modulated interferometers.

**Keywords:** electro-optic modulator; frequency modulation; displacement measuring interferometer

#### **1. Introduction**

High-precision technology is increasingly critical for industrial applications. The demand for high-speed and precise processing has been on the rise in various fields. To meet these requirements, different types of displacement measuring sensors are available, including capacitive sensors, linear encoders, and laser interferometers. In a short measurement range, capacitive sensors can obtain sub-nanometer resolution [1,2]. However, capacitive sensors have some disadvantages, such as sensitivity to temperature and humidity, short stand-off distance, and relatively low bandwidth [3,4]. Linear encoders, which can measure both distance and displacement at nanometer resolution over a long measurement range, are widely used in machine tools [5,6]. However, their complex structure and large volume limit their application in ultra-precision measurements. Among these sensors, displacement measuring interferometers have increasingly been adopted because of their high level of accuracy and traceability to the definition of the meter. Numerous displacement measuring interferometers have been developed, such as homodyne [7–9], heterodyne [10–12], and frequency/phase modulation interferometers [13–16]. In open-air environments, the measurement range and accuracy of the homodyne interferometer are reduced due to the refractive index fluctuation [17]. The heterodyne frequency is less than 20 MHz, and, hence, the measurement speed is approximately 5 m/s [18,19]. For a frequency modulation interferometer, the measurement speed is limited by the modulation frequency of the laser source. Laser diodes (LDs) are widely used as the laser source in interferometers due to their advantageous features, such as high power, compactness, and long lifespan. In particular, the high modulation frequency of LDs can be obtained by applying a current modulation at the GHz level [20,21]. Hence, the measurement speed of the frequency modulation interferometer can theoretically reach a level of 10 m/s. Therefore, a considerable challenge for the interferometer is the measurement speed verification at nanoscale uncertainty. Normally, a piezo-electric (PZT) actuator is used to move the mirrors in interferometry because of its high motion resolution. However, the motion of the PZT stage is limited to a frequency of several kHz. In addition, a mechanical displacement PZT system typically shows hysteresis, which is a type of cyclic error [22–24]. Hysteresis introduces uncertainties into measurements, which should be suppressed or removed. Voice coil actuators are also used for many high-precision motion applications [25,26]. Such an actuator can achieve a resolution of 2 nm over a range of 1 mm [27]. Two major disadvantages of voice coil actuators are heat output and disturbance from moving wires [28]. Hence, voice coil actuators are not suitable for wide-range nano-positioning at high speed.

**Citation:** Vu, T.-T.; Vu, T.-T.; Tran, V.-D.; Nguyen, T.-D.; Bui, N.-T. A New Method to Verify the Measurement Speed and Accuracy of Frequency Modulated Interferometers. *Appl. Sci.* **2021**, *11*, 5787. https://doi.org/10.3390/app11135787

Academic Editor: Jacek Wojtas

Received: 3 June 2021 Accepted: 19 June 2021 Published: 22 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The measurement speed and measurement precision of displacement measuring interferometers are key parameters. For interferometers, the measurement speed or rate refers to the time required to take one displacement measurement, expressed in seconds or Hz. In our previous works, a high modulation frequency of 3 MHz was successfully applied to the LD to improve the measurement speed of the frequency modulated interferometer [13,14,29]. Moreover, a high-precision phase meter was developed to measure the phase change in the interferometer [12]. To verify the measurement accuracy of our proposed interferometers, the displacement of the target mirror was measured using a capacitive sensor (D100, Physik Instrumente) integrated into the PZT stage. However, the bandwidth of the capacitive sensor was less than 3 kHz, and the resolution was 1 nm over a displacement range of 2 μm [13,14]. Therefore, it remains a significant challenge to verify both parameters using a nanometer-resolution, high-speed displacement actuator. In this paper, a high-speed virtual displacement without hysteresis is proposed and validated. It is noteworthy that all types of displacement interferometers determine target displacement by measuring the phase shift. A pure phase modulation at high frequency can be generated using an electro-optic modulator (EOM). This means that the EOM can be used to induce the phase shift, instead of mechanical motion generated by PZT or voice coil actuators, to confirm the measurement speed and accuracy of displacement measuring interferometers. By using an EOM, the phase shift can be generated at frequencies ranging from several kHz to some tens of MHz without distortion, hysteresis, or vibration. Hence, both the measurement accuracy and speed of the interferometer under assessment can be accurately verified. In the primary experiment, an EOM was placed into an arm of a frequency modulated interferometer. The virtual displacement was implemented using the EOM, in which the phase was changed using a modulation voltage at 20 kHz. Using two lock-in amplifiers (LIAs) and a Lissajous diagram, the phase change in the EOM and the corresponding virtual displacement could be measured using the frequency modulated interferometer.

In summary, the main contributions of this paper are listed as follows: (1) the calculation and design of a virtual displacement generated using an EOM; (2) the validation and measurement of the virtual displacement using a frequency modulated interferometer; (3) the comparison of the resulting measurement using the interferometer and theoretical result of the motion. Both the measurement speed and accuracy of the frequency modulated interferometer can be verified using the proposed method. This is significant in high-speed measuring applications, such as spindle error measurement, vibration measurement, and machine tool calibration.

#### **2. Methodology**

Figure 1a shows the schematic of the frequency modulated interferometer. The laser source was a laser diode (LD), in which the frequency was modulated with a sinusoidal signal by a modulation current injection. The angular frequency *ω*(*t*) of the LD is expressed as

$$
\omega(t) = \omega_0 + \Delta\omega \sin(\omega t), \tag{1}
$$

where *ω*<sub>0</sub> (= 2π*f*<sub>0</sub>), *ω*, and Δ*ω* are the initial angular frequency of the LD, acting as the carrier angular frequency, the angular modulation frequency, and the angular frequency excursion of the modulation, respectively. An isolator was used to prevent unwanted light from returning to the LD. A half-wave plate (HWP) was employed to rotate the polarization plane of the incident light at 45°. A beam splitter (BS) divided the incident light from the isolator into two beams: a reflected beam and a transmitted beam. The reflected beam was sent to a mirror (M1) in the reference arm. In the measurement arm, the transmitted beam was sent to another mirror (M2) through a polarizer (P) and an EOM. The polarizer was placed in front of the EOM to ensure the correct polarization of the input light. After reflection from the mirrors, the beams of the two arms recombined at the BS to generate interference.

**Figure 1.** Virtual displacement-measuring interferometer using a frequency-modulated laser source. (**a**) Schematic design and (**b**) the experimental setup. LD: laser diode; HWP: half-wave plate; BS: beam splitter; M: mirror; P: polarizer; EOM: electro-optic modulator; EO crystal: electro-optic crystal; NF: neutral density filter; HVA: high-voltage amplifier; FG: function generator; APD: avalanche photodetector; LIA: lock-in amplifier; ADC: analog-to-digital converter.

The intensity *I*(*τ*, *t*) of the interference signal [13,14] is written as

$$\begin{aligned} I(\tau, t) &= E_{01}^2 + E_{02}^2 + 2E_{01}E_{02}\cos(\omega_0\tau)J_0(m) + 4E_{01}E_{02}\cos(\omega_0\tau)\sum_{n=1}^{\infty} J_{2n}(m)\cos(2n\omega t) \\ &\quad - 4E_{01}E_{02}\sin(\omega_0\tau)\sum_{n=1}^{\infty} J_{2n-1}(m)\cos[(2n-1)\omega t], \end{aligned} \tag{2}$$

where *τ*, *m*, *E*<sub>01</sub> and *E*<sub>02</sub>, *n*, and *J*<sub>0</sub>(*m*), *J*<sub>2*n*</sub>(*m*), and *J*<sub>2*n*−1</sub>(*m*) are the time delay between the two arms of the interferometer, the modulation index, the amplitudes of the electric fields in the reference and measurement arms, an integer, and the Bessel functions, respectively. Here,

$$m = \frac{\Delta\omega}{\omega} \sin\left(\frac{\omega\tau}{2}\right) \approx \frac{2\pi\Delta f\, n_{\text{air}} L}{c}, \tag{3}$$

where Δ*f* (Δ*ω* = 2πΔ*f*), *n*<sub>air</sub>, *L*, and *c* are the frequency modulation excursion, the refractive index of air, the unbalance length of the interferometer, and the speed of light in a vacuum, respectively. A divider split *I*(*τ*, *t*) into two parts, which were coupled with the two purely sinusoidal signals of 2*ω* and 3*ω* from the function generator. By using the two LIAs [13,14], the intensities *I*<sub>2*ω*</sub> and *I*<sub>3*ω*</sub> of the 2*ω* and 3*ω* harmonics of *I*(*τ*, *t*), respectively, are produced as follows:

$$\begin{aligned} I_{2\omega} &= 2E_{01}E_{02}\cos(\omega_0\tau)J_2(m), \\ I_{3\omega} &= -2E_{01}E_{02}\sin(\omega_0\tau)J_3(m). \end{aligned} \tag{4}$$

Here, *J*<sub>2</sub>(*m*) and *J*<sub>3</sub>(*m*) are the second- and third-order Bessel functions, respectively. By using the Lissajous diagram, *ω*<sub>0</sub>*τ* is determined. The total phase difference *Φ* between the arms is

$$\Phi = \omega_0\tau = \arctan\left(-\frac{I_{3\omega}}{I_{2\omega}} \frac{J_2(m)}{J_3(m)}\right). \tag{5}$$

To generate the virtual displacement, a sinusoidal voltage *V*(*t*) was applied to the EOM:

$$V(t) = V_{EOM} \sin(\omega_{EOM} t), \tag{6}$$

where *V*<sub>EOM</sub> and *ω*<sub>EOM</sub> are the amplitude and angular frequency of *V*(*t*). For the EOM (4002, Newport) [30], the optical phase difference Δ*Φ*<sub>1</sub> induced by applying *V*(*t*) is

$$
\Delta\Phi_1 = \frac{\pi}{V_\pi} V, \tag{7}
$$

where *V*<sub>π</sub> is the half-wave voltage. Because the laser beam in the measurement arm was phase modulated twice (Figure 1), substituting Equation (6) into Equation (7) gives the phase Δ*Φ*<sub>EOM</sub> = 2Δ*Φ*<sub>1</sub>:

$$
\Delta\Phi_{EOM} = 2\Delta\Phi_1 = \frac{2\pi}{V_\pi}V = \frac{2\pi}{V_\pi}V_{EOM}\sin(\omega_{EOM}t). \tag{8}
$$

However, the total phase difference *Φ* between the arms of the interferometer is

$$\Phi = \omega_0\tau = \frac{4\pi n_{\text{air}}(L-l)}{\lambda} + \frac{4\pi}{\lambda}(n_e + \Delta n_e)l = \frac{4\pi n_{\text{air}}(L-l)}{\lambda} + \frac{4\pi}{\lambda}n_e l + \Delta\Phi_{EOM}, \tag{9}$$

where *L*, *n*<sub>air</sub>, *l*, *n*<sub>e</sub>, Δ*n*<sub>e</sub>, and *λ* are the unbalance length of the interferometer, the refractive index of air, the length and unperturbed refractive index of the electro-optic (EO) crystal, the change in *n*<sub>e</sub> induced by applying *V*(*t*), and the laser wavelength, respectively. Here,

$$
\Delta\Phi_{EOM} = \frac{4\pi}{\lambda} \Delta n_e l = \frac{4\pi}{\lambda} \Delta l, \tag{10}
$$

Δ*l* is the change in the optical path in the EO crystal, which is defined as the virtual displacement of M2 due to applying *V(t)*. Substituting Equation (8) into Equation (10), Δ*l* becomes

$$
\Delta l = \frac{\lambda}{2} \frac{V_{EOM}}{V_\pi} \sin(\omega_{EOM} t). \tag{11}
$$

Substituting Equations (5) and (10) into Equation (9), Δ*l* measured by the interferometer is

$$\Delta l = \frac{\lambda}{4\pi} \left\{ \arctan\left( -\frac{I_{3\omega}}{I_{2\omega}} \frac{J_2(m)}{J_3(m)} \right) \right\} - \left[ (L-l)n_{\text{air}} + n_e l \right]. \tag{12}$$

In this experiment, we compared the measurement in Equation (12) with the calculated displacement in Equation (11) at the high phase-modulation frequency *ω*<sub>EOM</sub>.
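The chain from Equation (4) to Equation (12) can be checked numerically with a short sketch. It simulates the two lock-in outputs for a known sinusoidal virtual displacement and recovers that displacement from the harmonic ratio. The static phase offset and the displacement amplitude are hypothetical values chosen for illustration, the Bessel functions are evaluated from their power series, and `atan2` is used instead of a plain arctangent so that the quadrant is preserved (a choice made here, not a procedure stated in the paper).

```python
import math


def bessel_j(order, x, terms=25):
    """Bessel function of the first kind J_order(x) from its power series."""
    total = 0.0
    for k in range(terms):
        coeff = (-1) ** k / (math.factorial(k) * math.factorial(k + order))
        total += coeff * (x / 2.0) ** (2 * k + order)
    return total


def harmonics(phase, m, e01=1.0, e02=1.0):
    """Lock-in outputs I_2w and I_3w of Equation (4) for a given phase."""
    i2 = 2.0 * e01 * e02 * math.cos(phase) * bessel_j(2, m)
    i3 = -2.0 * e01 * e02 * math.sin(phase) * bessel_j(3, m)
    return i2, i3


def recover_phase(i2, i3, m):
    """Invert Equation (4); atan2 keeps the quadrant (an assumption here)."""
    return math.atan2(-i3 / bessel_j(3, m), i2 / bessel_j(2, m))


WAVELENGTH = 635e-9   # m, laser wavelength used in the experiment
M_INDEX = 2.4         # modulation index quoted in the text
F_EOM = 20e3          # Hz, EOM drive frequency
STATIC_PHASE = 0.3    # rad, hypothetical offset from the static path terms
AMPLITUDE = 76e-9     # m, hypothetical virtual-displacement amplitude

max_error = 0.0
for step in range(200):
    t = step / (200 * F_EOM)  # sample one EOM period
    true_dl = AMPLITUDE * math.sin(2 * math.pi * F_EOM * t)
    # Equations (9) and (10): static phase plus the EOM-induced phase.
    phase = STATIC_PHASE + 4 * math.pi * true_dl / WAVELENGTH
    i2, i3 = harmonics(phase, M_INDEX)
    # Equation (12): subtract the static term and scale by lambda / (4 pi).
    measured_dl = (WAVELENGTH / (4 * math.pi)
                   * (recover_phase(i2, i3, M_INDEX) - STATIC_PHASE))
    max_error = max(max_error, abs(measured_dl - true_dl))

print("largest reconstruction error (m):", max_error)
```

In this noise-free setting the recovered displacement matches the simulated one to numerical precision. Phase excursions beyond ±π would additionally require unwrapping, which the sketch omits.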

For the frequency modulated interferometer, the maximum measurable speed *V*<sub>max</sub> can be given by [13]

$$V_{\max} \le k \frac{\lambda}{4} f, \tag{13}$$

where *k* and *f* are a coefficient and the modulation frequency of the LD, respectively. Here, *k* represents the ratio of the cutoff frequency of the LIAs to the modulation frequency of the LD; normally *k* = 0.1–0.8. In this paper, a hysteresis-free virtual displacement at a frequency of tens of kHz was generated using an EOM.

#### **3. Experiments and Results**

A photograph of the experimental setup is shown in Figure 1b. The laser source was an LD (HL6312G, Thorlabs Inc., Newton, NJ, USA), and it was frequency modulated with a sine-wave signal (a modulation frequency of 20 MHz and a modulation excursion of 570 MHz) by modulating the injection current. To produce a high-speed virtual motion, the 4002 EOM with *V*<sub>π</sub> = ~125 V at *λ* = 635 nm was used. The EOM was phase modulated using a 20 kHz sine-wave signal *V*(*t*) with *V*<sub>0</sub> = 2 V, which was generated by a digital function generator (Moku:labs, Liquid Instruments) and then amplified 30 times by a high-voltage amplifier. Δ*Φ*<sub>EOM</sub> was produced, and the 20 kHz sine-wave virtual displacement Δ*l* could thus be attained with a peak-to-peak (p-p) amplitude of ~152 nm (=635 nm × 2 V × 30/2/125 V). *I*(*τ*, *t*) was detected using an avalanche photodetector (DET08CFC/M, Thorlabs). To extract *I*<sub>2*ω*</sub> (2*ω* = 40 MHz) and *I*<sub>3*ω*</sub> (3*ω* = 60 MHz), we used two digital LIAs (Moku:labs, Liquid Instruments), which were synchronized with each other and had a cutoff frequency of 200 kHz. *I*<sub>2*ω*</sub> and *I*<sub>3*ω*</sub> were then recorded using an analog-to-digital converter (AD16-16U(PCI)EV, Contec Co., Osaka, JP) at a sampling frequency of 715 kHz to eliminate MHz-order noise in the harmonics. In this experiment, the modulation index was *m* ≈ 2.4. This led to *J*<sub>2</sub>(*m*) = 2.15 × *J*<sub>3</sub>(*m*). In addition, because (*L* − *l*)*n*<sub>air</sub> + *n*<sub>e</sub>*l* ≈ *n*<sub>air</sub>*L* ≈ 0.2 m, Equation (12) is rewritten as

$$
\Delta l = \frac{\lambda}{4\pi} \left\{ \arctan\left( -2.15\, \frac{I_{3\omega}}{I_{2\omega}} \right) \right\} - L n_{\text{air}}. \tag{14}
$$
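As a rough cross-check of the quoted modulation index, Equation (3) can be evaluated with the stated excursion of 570 MHz and the optical path *n*<sub>air</sub>*L* ≈ 0.2 m. The sketch below also evaluates the Bessel ratio *J*<sub>2</sub>(*m*)/*J*<sub>3</sub>(*m*) from the power series; the variable names are illustrative, and the result should only be expected to land near the rounded values given in the text, not to reproduce them exactly.

```python
import math

C_VACUUM = 299_792_458.0  # m/s, speed of light in a vacuum
DELTA_F = 570e6           # Hz, frequency modulation excursion from the text
OPTICAL_PATH = 0.2        # m, n_air * L as approximated in the text


def bessel_j(order, x, terms=25):
    """Bessel function of the first kind J_order(x) from its power series."""
    total = 0.0
    for k in range(terms):
        coeff = (-1) ** k / (math.factorial(k) * math.factorial(k + order))
        total += coeff * (x / 2.0) ** (2 * k + order)
    return total


# Equation (3), small-argument form: m = 2 pi * delta_f * n_air * L / c.
m = 2 * math.pi * DELTA_F * OPTICAL_PATH / C_VACUUM

# Ratio that scales I_3w / I_2w inside the arctangent of Equation (12).
ratio = bessel_j(2, m) / bessel_j(3, m)

print(f"modulation index m = {m:.3f}")
print(f"J2(m) / J3(m)      = {ratio:.3f}")
```

Because the ratio depends on *m*, any drift in the excursion or the unbalance length changes the scale factor in the arctangent, which is one reason to keep the modulation conditions fixed during a measurement.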

The experiments were performed under the conditions in Table 1. The second and third harmonics that were clearly detected by LIAs are shown in Figure 2a. A Lissajous diagram using the two harmonics is shown in Figure 2b. A sine-wave displacement at 20 kHz and a p-p amplitude of ~152 nm was obtained over 1 ms, as shown in Figure 3a. This led to a measured speed of 6.08 mm/s (=2Δ*l* × *fEOM* = 2 × 152 nm × 20 kHz).
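The arithmetic behind the quoted peak-to-peak amplitude and the measured speed can be reproduced in a few lines. The sketch below simply follows the expressions given in the text; the variable names are illustrative, and the speed is computed from the rounded amplitude of 152 nm, as in the text.

```python
WAVELENGTH_NM = 635.0  # nm, laser wavelength
V0 = 2.0               # V, function-generator amplitude
GAIN = 30.0            # high-voltage amplifier gain
V_PI = 125.0           # V, approximate half-wave voltage of the EOM
F_EOM_HZ = 20e3        # Hz, EOM drive frequency

# Peak-to-peak virtual displacement as written in the text:
# 635 nm x 2 V x 30 / 2 / 125 V.
pp_amplitude_nm = WAVELENGTH_NM * V0 * GAIN / 2.0 / V_PI

# Speed as written in the text: 2 x (rounded amplitude) x f_EOM.
rounded_nm = round(pp_amplitude_nm)
average_speed_mm_s = 2 * rounded_nm * 1e-9 * F_EOM_HZ * 1e3

print(f"p-p amplitude: {pp_amplitude_nm:.1f} nm")
print(f"measured speed: {average_speed_mm_s:.2f} mm/s")
```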

**Table 1.** The experimental conditions for the virtual displacement measurement.


**Figure 2.** Harmonic detection: (**a**) second and third harmonics; (**b**) Lissajous diagram using *I*2*<sup>ω</sup>* and *I*3*<sup>ω</sup>* harmonics.

The result obtained by the interferometer was compared with a calculated displacement function *F* using Equation (11), *F* = *A* + *B* × sin(2π × 20 kHz × *t*), where *A* = −0.152 μm and *B* = −0.076 μm. The difference between the measured result and *F* is shown in Figure 3b. A displacement difference of ~30 nm and a standard deviation of ~6 nm were attained. The displacement noise floor analyzed from Figure 3b is depicted in Figure 3c. Above 2 kHz, a displacement noise floor of less than 100 pm/√Hz was achieved. A dominant noise peak at 20 kHz can be seen. This led to the cyclic error of the sine-wave displacement in Figure 3a, and this noise level was ~100 pm/√Hz. The result showed that our method could generate a stable and high-precision displacement that could be used as a reference. Even though unwanted noise caused by vibration and air disturbance still existed in the measurement, the method is promising for high-speed measurements.

**Figure 3.** Measurement: (**a**) virtual displacement measured by the interferometer (solid line) compared with the calculated displacement function *F* = *A* + *B* × sin(2π × 20 kHz × *t*) (dotted line; see Equation (11), where *A* = −0.152 μm and *B* = −0.076 μm); (**b**) difference between the result obtained by the interferometer and *F*; and (**c**) displacement noise floor analyzed from (**b**) by Fourier transform.

From Equation (13), when *k* = 0.8, *λ* = 635 nm, and *f* = 20 MHz, a maximum measurable speed of 2.54 m/s can be achieved. However, due to the hardware limitation caused by the low bandwidth of the high-voltage amplifier (HVA) and LIAs, we could not perform a virtual displacement of several m/s. We plan to use a higher-bandwidth HVA and LIAs to improve the measurement speed. Another disadvantage of the proposed method is that the use of the EOM in one arm of the interferometer can cause the polarization mixing effect and, hence, may induce some noise in the interference signal. To improve the interference signal, some polarization optics can be used in the next experiment.
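The speed limit quoted above follows directly from Equation (13). As a small illustration, the sketch below evaluates the bound for the stated wavelength and modulation frequency over the range of *k* given in the text; the function name is illustrative.

```python
def max_measurable_speed(k, wavelength_m, mod_freq_hz):
    """Upper bound of Equation (13): V_max <= k * (lambda / 4) * f."""
    return k * wavelength_m / 4.0 * mod_freq_hz


WAVELENGTH_M = 635e-9  # m, laser wavelength
MOD_FREQ_HZ = 20e6     # Hz, LD modulation frequency

# k is the ratio of the LIA cutoff frequency to the modulation frequency.
for k in (0.1, 0.4, 0.8):
    v = max_measurable_speed(k, WAVELENGTH_M, MOD_FREQ_HZ)
    print(f"k = {k:.1f}: V_max <= {v:.4f} m/s")

v_max = max_measurable_speed(0.8, WAVELENGTH_M, MOD_FREQ_HZ)
```

The bound for *k* = 0.8 is far above the 6.08 mm/s demonstrated with the 20 kHz virtual displacement, consistent with the statement that the experiment was limited by the HVA and LIA bandwidth.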

#### **4. Conclusions**

In this study, high-speed displacement generation using an electro-optic modulator (EOM) was proposed and validated. A frequency modulated interferometer was established to measure the virtual displacement produced by the EOM. To produce a high-speed displacement, an EOM was phase changed using a high-speed sinusoidal modulation voltage. A 20 kHz sine-wave signal with an amplified amplitude of 60 V was applied to the EOM. A corresponding sinusoidal virtual displacement of ~152 nm at 20 kHz was then generated. The interferometer extracted an interference signal that contained the phase change in the EOM. Two LIAs and a Lissajous diagram were adopted for detecting the phase change and calculating the virtual displacement. The measurement showed that the virtual displacement, with an amplitude of 152 nm at 20 kHz and a measurement speed of 6.08 mm/s, was successfully measured by the interferometer. A displacement noise floor < 100 pm/√Hz above 2 kHz was achieved. This experiment was successful at developing a displacement reference at high speed. For future study, the possibility of using a higher modulation for the LDs and the residual amplitude modulation effect will be investigated, as well as high-speed mechanical displacements when the relevant hardware is available.

**Author Contributions:** Conceptualization, T.-D.N. and T.-T.V. (Thanh-Tung Vu); methodology, T.-T.V. (Toan-Thang Vu); software, V.-D.T.; validation, T.-D.N. and V.-D.T.; formal analysis, N.-T.B.; investigation, T.-T.V. (Thanh-Tung Vu); resources, N.-T.B.; data curation, V.-D.T.; writing—original draft preparation, T.-T.V. (Toan-Thang Vu); writing—review and editing, T.-T.V. (Thanh-Tung Vu); visualization, T.-T.V. (Toan-Thang Vu); funding acquisition, N.-T.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Hanoi University of Science and Technology, grant number T2020-PC-201.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** This study did not report any data.

**Acknowledgments:** This work was supported by the Centennial SIT Action for the 100th anniversary of Shibaura Institute of Technology to enter the top ten Asian Institute of Technology.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Communication* **Integrated IR Modulator with a Quantum Cascade Laser**

**Janusz Mikołajczyk \* and Dariusz Szabra**

Institute of Optoelectronics, Military University of Technology, S.Kaliskiego 2, 00908 Warsaw, Poland; dariusz.szabra@wat.edu.pl

**\*** Correspondence: janusz.mikolajczyk@wat.edu.pl; Tel.: +48-261-839-792

**Abstract:** This paper presents an infrared pulsed modulator into which quantum cascade lasers and a current driver are integrated. The main goal of this study was to determine the capabilities of a new modulator design based on the results of its electrical model simulation and laboratory experiments. A simulation model is a unique tool because it includes the electrical performance of the lasing structure, signal wiring, and driving unit. In the laboratory model, a lasing structure was mounted on the interfacing poles as close to the switching electronics as possible with direct wire bonding. The radiation pulses and laser biasing voltage were registered to analyze the influence of laser module impedance. Both simulation and experimental results demonstrated that the quantum cascade laser (QC laser) design strongly influenced the shape of light, driving current, and biasing voltage pulses. It is a complex phenomenon depending on the laser construction and many other factors, e.g., the amplitude and time parameters of the supplying current pulses. However, this work presents important data to develop or modify numerical models describing QC laser operation. The integrated modulator provided pulses with a 20–100 ns duration and a frequency of 1 MHz without any active cooling. The designed modulator ensured the construction of a sensor based on direct laser absorption spectroscopy, applying the QC laser with spectral characteristics matched to absorption lines of the detected substances. It can also be used in optical ranging and recognition systems.

**Keywords:** quantum cascade laser; laser controller; infrared modulator; laser spectroscopy; free space optics

#### **1. Introduction**

Quantum cascade (QC) lasers are commonly used radiation sources in many applications that operate at mid- and far-infrared wavelengths [1–3]. These lasers are mainly used in chemical sensing and spectroscopy, in which low noise and high stability of both the light power and spectra are required [4,5]. For this application, a special low-noise current driver and temperature stabilizing unit were constructed [6]. However, these lasers are also being used in other applications as a result of their features, e.g., high power, modulation bandwidth, direct current control, compact size, room-temperature operation, and operation in highly transparent spectral ranges of the atmosphere. These features are important for telecommunication, security, and defense technologies [7,8].

Some research has been conducted on modeling techniques for the simulation of QC lasers. Its main focus was to describe the physical processes of electron transport in multiple-quantum-well heterostructures in order to improve the operating temperature, efficiency, and spectral range [9].

Moreover, some electrical equivalent circuits have been developed to simulate the laser operation and to analyze the laser electrical features and driving electronics. These works use an electrical circuit simulator without creating additional mathematical functions or numerical calculations [10]. This approach is less computationally intensive and can be applied to lasing structure optimization or to an application-level device.

Although a simulation model of a realistic circuit should consider five rate equations of quantum wells, three-level or two-level models with certain approximations have been used [10–12]. Based on these models, the light-current characteristics of QC lasers were analyzed. These simulations were limited to intensity modulation characteristics for different ratios to the threshold current.

**Citation:** Mikołajczyk, J.; Szabra, D. Integrated IR Modulator with a Quantum Cascade Laser. *Appl. Sci.* **2021**, *11*, 6457. https://doi.org/10.3390/app11146457

Academic Editors: Vincenzo Spagnolo and Alessandro Belardini

Received: 29 April 2021 Accepted: 9 July 2021 Published: 13 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

However, only a few papers describe the pulse response of a QC laser [13]. The obtained results were used to define the influence of different bias currents on the turn-on delay of the conversion of the current pulse to the light signal. In the described simulations, no model of the current driving unit was applied, and the results were verified against data from other circuit models or numerical calculations. These limitations can be critical for some aspects of applied science in which laser-based systems are analyzed.

QC lasers require a new class of pulse drivers because their operational characteristics are different from those of a laser diode [14]. These lasers need driving signals (current and voltage) an order of magnitude higher than those of a bipolar diode laser [15]. The high voltage results from the applied cascade construction, in which the energy of the emitted photons is multiplied by the number of stages (e.g., 30–50). The voltage can reach 20 V or more. The applied intersubband transitions are characterized by a lower optical gain than interband transitions and therefore require high supply currents. As a result, the threshold current density of QC lasers is more than one order of magnitude higher than that of diode lasers [16]. These features represent a challenge when constructing a laser driver that must handle both the required current and voltage. In these devices, various methods are applied to connect the signals to the QC laser. The most popular is to design unique signal poles or sockets on the printed circuit board (PCB) to connect the pins of the laser housing, e.g., butterfly, transistor outline package (TO), high heat load housing (HHL), and others. In short-pulse transmission, however, the main issues are impedance matching and inductance minimization. Matching can be impossible because the laser impedance depends on its construction and fabrication, temperature, and level of driving current [17]. Most critical is the dynamic change in impedance during current switching.

The properties of QC laser modulator radiation are matched to its application. Most often, they concern a required shape and the power of the emitted light. In the case of gas sensing devices, the applied laser absorption spectroscopy (LAS) technique determines the optical signal shape. This shape has a form ranging from a simple pulse to complex waveforms. For example, a QC laser pulse with a time duration of 500 ns was used in an intra-pulsed tuned LAS to detect the spectra of nitrogen oxide (NO) and nitrous oxide (N2O) at around 5.25486 μm and 4.52284 μm, respectively [18]. Shorter pulses are also used, for example, in cavity-enhanced absorption spectroscopy to detect ammonia (NH3), with a QC laser emitting 50 ns pulses at 200 kHz frequency around 6.8 μm [19].

More complex optical signals are preferable in tunable laser absorption spectroscopy (TLAS). To scan across an absorption line, the QC laser spectrum is tuned by applying both a current ramp signal and a high-frequency wave. Many LAS setups use tunable QC lasers, e.g., direct absorption spectroscopy or wavelength modulation [20]. In these setups, IR modulators generated a current signal consisting of, for example, a slow ramp (10 Hz) and a sinusoid (8 kHz) [21].

A high rate of radiation modulation is the primary goal of IR modulator construction for free-space optics (FSO). Some experimental results with GHz-range modulation of QC lasers have been described. In those tests, some specific architectures of the laser structure were used [22]. For example, a laser waveguide with a microstrip line ensured a modulation rate of up to 14 GHz [23]. However, 'high speed' QC laser technology is underdeveloped. Modulation speeds of up to 26.5 GHz were obtained using a specially designed QC laser placed on a cold finger with continuous-flow liquid nitrogen and driven with a bias-tee and a microwave signal [24].

There are many other applications for which it is important to design a compact and low-power consumption IR modulator. For example, it could be used as an alarm beacon with optical signals detected at extended ranges, a jamming device to disorientate an incoming missile threat, or a light source for beam riding and ranging systems. There are only a few such devices described in the literature [25]. However, they can work in pulsed mode with closed construction, allowing light pulse generation in strictly defined pulse configurations at the maximum repetition rate of 500 kHz.

This study describes an integrated IR pulsed modulator consisting of a QC laser structure and a current switching module. It ensured the generation of short light pulses (tens of ns) with a maximum frequency of 1 MHz, using direct wire bonding of the lasing structure and the current terminals. Various simulations of this configuration were performed considering different QC laser electrical parameters. Finally, the preliminary experiments were performed with the designed modulator. The obtained results provided experimental verification of, e.g., the influence of both QC laser parameters and the driving signal interface on the light pulses, determination of the electrical scheme of the QC laser, and the integrated modulator. In the future, this modulator can be installed in some standard housings, e.g., HHL, butterfly.

#### **2. QC Compact Laser Modulator Design**

A compact laser modulator was built using a laser switching high-speed iC-G30 iCSY HG20M module (iC-House Corporation, Bodenheim, Germany) and a laser structure designed at the Institute of Electron Technology (Warsaw, Poland) [26,27]. The module is equipped with six channels, providing a speed of up to 250 MHz, current pulses of up to 5 A (per channel), and an output voltage of 30 V. Its main capabilities are independent and simple voltage control of channel currents, parallel channel operation of up to 6 A (constant mode) and up to 30 A (pulsed mode), and thermal shutdown. The QC laser was designed to operate at room temperature. Its parameters and construction mean it can be mounted directly on the signal pole of the laser switching module without the need to use an extra cooling unit. The optical power and voltage vs. current characteristics of the laser and a picture of the integrated modulator are presented in Figure 1.

**Figure 1.** Optical power and voltage vs. current of the mounted QC laser (**a**) and a picture of the integrated IR modulator (**b**).
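The current limits quoted for the switching module can be written down as a simple check. The sketch below is only an illustration of those quoted figures: the function name `total_current` is hypothetical, and applying the 5 A per-channel limit in constant mode as well is an assumption, since the text states it for pulses only.

```python
from typing import Sequence

# Limits quoted in the text for the switching module.
MAX_CHANNELS = 6
MAX_PER_CHANNEL_A = 5.0      # current pulses per channel
MAX_TOTAL_PULSED_A = 30.0    # parallel channel operation, pulsed mode
MAX_TOTAL_CONSTANT_A = 6.0   # parallel channel operation, constant mode


def total_current(channel_currents_a: Sequence[float], pulsed: bool = True) -> float:
    """Return the summed channel current, or raise ValueError if a quoted limit is exceeded."""
    if len(channel_currents_a) > MAX_CHANNELS:
        raise ValueError("the module has only six channels")
    # Assumption: the 5 A per-channel limit is applied in both modes.
    if any(i < 0 or i > MAX_PER_CHANNEL_A for i in channel_currents_a):
        raise ValueError("each channel must stay within 0 to 5 A")
    total = sum(channel_currents_a)
    limit = MAX_TOTAL_PULSED_A if pulsed else MAX_TOTAL_CONSTANT_A
    if total > limit:
        raise ValueError(f"total of {total} A exceeds the {limit} A limit")
    return total


# Six channels in parallel at full pulse current reach the quoted 30 A.
print(total_current([5.0] * 6))
```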

#### **3. Simulations**

The pulse operation performance of the laser modulator was analyzed using SPICE (Simulation Program with Integrated Circuit Emphasis) models of the switching module and the prepared electrical model of the QC laser (LTSpice XVII software ver. 17.0.27.0, Analog Devices, Norwood, MA, USA). No PCB technology parameters were included in the simulations. This assumption simplifies the simulations, but PCB effects should be considered under actual conditions [28]. The manufacturer of the switching module supplied its SPICE description. For the QC laser, the SPICE model was prepared using data from Alpes Lasers (St Blaise, Switzerland) for the lasing structure, and IXYS Corporation (Milpitas, CA, USA) for the inductance of the laser package [29,30]. In Figure 2, an electrical schematic of the IR lighting device is presented. The Alpes laser model combines a resistor (*R*<sub>L</sub>) and two capacitors (*C*<sub>1</sub>, *C*<sub>2</sub>). The *R*<sub>L</sub> values depend on the laser operation point (e.g., for a low current: 10–20 Ω, and for the lasing current: 1–3 Ω). The values of *C*<sub>1</sub> (~100 pF) and *C*<sub>2</sub> (below 100 pF) are mainly determined by the bonding pads and the laser mounting technology, respectively. IXYS Corporation determines the inductance level (*L*) for various signal interface technologies applied in laser housings. For example, a connection of two points at a distance of 0.2 mm, with a copper wire 0.014 mm in diameter and a height of 0.06 mm, gives an inductance of 3.6 nH.

**Figure 2.** Electrical scheme of the IR modulator (*V*<sub>DD</sub>: voltage supply of the module; *VI* (*CI*): voltage control of output current; *EF*-*EN*: differential triggering signals).
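The quoted component values indicate where ringing can be expected. The sketch below is a back-of-the-envelope check, not the LTspice model used in this study: it assumes the load reduces to a pad capacitance in parallel with a branch formed by the wire inductance in series with the laser resistance, driven by a current step. Under that assumption the branch current follows a standard second-order response. The function name `rlc_figures` and the choice of 2 Ω (inside the quoted 1–3 Ω lasing range) are illustrative.

```python
import math


def rlc_figures(r_ohm: float, l_henry: float, c_farad: float) -> tuple[float, float]:
    """Return (resonance frequency in Hz, damping ratio) of the assumed R-L-C loop."""
    f0 = 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))
    zeta = 0.5 * r_ohm * math.sqrt(c_farad / l_henry)
    return f0, zeta


# Values quoted in the text: 3.6 nH connection, ~100 pF pad capacitance.
# Assumption: 2 ohm as a representative lasing resistance.
f0, zeta = rlc_figures(r_ohm=2.0, l_henry=3.6e-9, c_farad=100e-12)
print(f"f0 = {f0 / 1e6:.0f} MHz, zeta = {zeta:.2f}")
# prints: f0 = 265 MHz, zeta = 0.17
```

A damping ratio well below 1 means the assumed loop is underdamped, so oscillations at the pulse edges would be expected in this simplified picture.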

A simulation of the light device was performed for different lasing structure electrical parameters (Figure 3). The reactive parts of the load impedance did not influence the pulse amplitude (Figure 3a,b). The capacitances generate some oscillations at the pulse rising edge, caused by feedback signal conditions. As these capacitances grow, the response time constant also increases. The most critical issue is load inductance, considering the pulse shape and its oscillations. We observed a slower modulator response and current high-level changes for both biasing ranges. The current-voltage dependence of the inductor and the signal resonance conditions defined these effects. Even at the rising and falling parts of the pulse, these signal fluctuations can be critical for the current amplitude, causing, e.g., laser structure damage. The bonding pad capacitance is an almost ideal plate capacitor with a PTFE dielectric [31]. The bonding wire inductance is generally 1 nH/mm. However, all of these parameters vary from laser to laser. To reduce pulse degradation, i.e., overshoot and ringing, any unnecessary hardware between the switching unit and the laser must be avoided. That is why a direct and short connection of the laser to the electronics is preferable [32].
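The influence of wire length on overshoot can be illustrated with a small time-domain integration. This is a minimal sketch under stated assumptions, not a reproduction of the simulations in Figure 3: the load is reduced to a pad capacitance in parallel with a wire inductance in series with the laser resistance, the driver is an ideal current step, and the values 2 Ω and 100 pF are representative picks from the ranges quoted earlier. The helper names `wire_inductance` and `peak_laser_current` are hypothetical.

```python
def wire_inductance(length_mm, nh_per_mm=1.0):
    """Rule-of-thumb bonding-wire inductance in henries (about 1 nH/mm, per the text)."""
    return length_mm * nh_per_mm * 1e-9


def peak_laser_current(r_ohm, l_henry, c_farad, i_step_a=1.0, t_end_s=20e-9, dt_s=1e-12):
    """Integrate the reduced circuit and return the peak current in the R-L branch."""
    v = 0.0     # voltage across the pad capacitance
    i = 0.0     # current through the wire inductance and laser resistance
    peak = 0.0
    for _ in range(int(t_end_s / dt_s)):
        # Semi-implicit Euler: update the branch current first, then the node voltage.
        i += dt_s * (v - r_ohm * i) / l_henry
        v += dt_s * (i_step_a - i) / c_farad
        peak = max(peak, i)
    return peak


# Assumed values: 2 ohm laser resistance, 100 pF pad capacitance, 1 A current step.
for length_mm in (1.0, 3.0, 10.0):
    peak = peak_laser_current(2.0, wire_inductance(length_mm), 100e-12)
    print(f"{length_mm:4.1f} mm wire: peak current = {peak:.2f} x step amplitude")
```

In this reduced model the peak grows with wire length, which is in line with the preference for a direct and short connection stated above.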

The load resistance directly influences the current amplitude and the response time constant (Figure 3c). High resistance attenuates signal fluctuations at the pulse rising edge and increases its fall time. The current level is inversely proportional to this resistance. This is a crucial issue for QCL switching, because the laser resistance is dynamic. The lasing structure must be supplied both to ensure population inversion (high resistance, low current, no light) and to emit photons (low resistance, high current, light) [33].
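The damping effect of the load resistance can be estimated analytically. The sketch below is an approximation under the same kind of simplifying assumption as a textbook second-order circuit (pad capacitance in parallel with a series inductance and resistance, driven by a current step); it addresses only ringing, not the current amplitude, and it is not the model used for Figure 3c. The function names and the 3.6 nH and 100 pF values taken from the earlier examples are illustrative choices.

```python
import math


def critical_resistance(l_henry, c_farad):
    """Resistance above which the assumed R-L branch current no longer rings."""
    return 2.0 * math.sqrt(l_henry / c_farad)


def overshoot_fraction(r_ohm, l_henry, c_farad):
    """Fractional overshoot of a second-order step response (0.0 if not underdamped)."""
    zeta = r_ohm / critical_resistance(l_henry, c_farad)  # damping ratio
    if zeta >= 1.0:
        return 0.0
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta * zeta))


L_H, C_F = 3.6e-9, 100e-12  # assumed: quoted example inductance and pad capacitance
print(f"critical resistance = {critical_resistance(L_H, C_F):.1f} ohm")
for r_ohm in (1.0, 3.0, 10.0, 20.0):
    print(f"R = {r_ohm:4.1f} ohm: overshoot = {100 * overshoot_fraction(r_ohm, L_H, C_F):.1f} %")
```

With these assumed values the critical resistance is about 12 Ω, so the quoted lasing resistance of 1–3 Ω falls in the ringing regime while the upper part of the 10–20 Ω low-current range does not.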

**Figure 3.** Simulation results of the modulator for different lasing structure electrical parameters: capacitance (**a**), inductance (**b**), and resistance (**c**).

#### **4. Tests of the Laser Switching Module**

The switching module tests were performed on a special test board to analyze the influence of the load resistance on the shapes of the current pulses generated by the switching module. The electrical signals were registered using the Tektronix MSO 6 oscilloscope also used in the laboratory tests (Section 5), with a CT-1 current probe and a TDP-1500 active differential voltage probe. Figure 4 presents both the current and voltage signals for the two load resistances. These resistances were measured using a Keithley 236 source-measure unit. The module was operating in the current source region, where the current depends only on the 'load' of the biasing voltage. In this region, its output resistance is lower than 300 mΩ.

**Figure 4.** Normalized electrical driving signals: current (**a**) and voltage (**b**).

We observed both voltage and current oscillations at the falling slope for low load resistance. Impedance mismatching caused transmission and reflection effects in the signal line. Increasing the load resistance minimized these effects, with limited bandwidth.

The load resistance influenced both edges of the voltage signals, but for the current, this was observed only at the falling edge. These differences can modulate the lasing conditions of the QC structure, which are determined by both the current and voltage signals. In practice, these conditions are difficult to predict because they are related to the change in the QC laser impedance during its operation. Therefore, both the signal interface and the QC laser parameters form the shapes of the light pulses.

#### **5. Laboratory Tests of the IR Modulator**

The preliminary tests of the IR modulator were performed using the lab setup presented in Figure 5. The modulator was placed on the test board to supply all signals (the voltage supply of the switching module, the differential triggering signals, and the biasing of the QC laser). The device was switched using an AFG 3252 generator (Tektronix, Beaverton, OR, USA). The light pulses were registered with the PVI-4TE-10.6 detection module (VIGO System S.A., Warsaw, Poland) with a responsivity of 2.5 × 10<sup>4</sup> V/W and a bandwidth of 500 MHz. An MSO 6 oscilloscope (Tektronix) with an active probe (TDP-1500, Tektronix) visualized the voltage biasing of the laser structure. It was not technically possible to register the current pulses and analyze their shapes, because the laser wire was bonded directly to the driver signal pole.

**Figure 5.** View of the testing setup and lasing device placed on the test board irradiating detection module.
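The detector figures allow a quick conversion from registered voltage to optical power and a check that the detector is fast enough for the pulses of interest. The sketch below reads the quoted responsivity as 2.5 × 10<sup>4</sup> V/W and uses the common single-pole rule of thumb for rise time (0.35 divided by bandwidth), which is an approximation rather than a datasheet value. The 1 V example signal and the function names are hypothetical.

```python
RESPONSIVITY_V_PER_W = 2.5e4   # quoted for the detection module
BANDWIDTH_HZ = 500e6           # quoted detector bandwidth


def optical_power_w(signal_v, responsivity_v_per_w=RESPONSIVITY_V_PER_W):
    """Convert a detector output voltage to incident optical power in watts."""
    return signal_v / responsivity_v_per_w


def rise_time_s(bandwidth_hz=BANDWIDTH_HZ):
    """Approximate 10-90 % rise time of a single-pole response (0.35 / bandwidth)."""
    return 0.35 / bandwidth_hz


# Hypothetical 1 V detector signal, chosen only to illustrate the conversion.
print(f"optical power = {optical_power_w(1.0) * 1e6:.0f} uW")
print(f"detector rise time = {rise_time_s() * 1e9:.1f} ns")
```

Under this rule of thumb the detector rise time is below 1 ns, well under the pulse durations of tens of ns considered here.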

Figure 6 presents shapes of the registered voltage signals and radiation pulses for different pulse durations at a frequency of 1 MHz. The voltage signal preceded the light pulse by ~5 ns, and the laser biasing voltage limited its amplitude. This agrees well with the data described in [28].

Some fluctuations caused by impedance mismatching of the signal interface were also observed. The shapes of the voltage signals obtained during reference measurements using a resistor (Figure 4b) and with the QC laser are notably similar. However, these shapes differed from those of the light pulses, which showed a slower rising edge and oscillations.

**Figure 6.** Registered signals for different pulse durations: biasing voltage (**a**) and radiation pulses (**b**).

The experimental and simulation results share some features, e.g., an inflection in the rising slope, oscillations at the top of the signal, and some peaks at the falling edge (Figure 7). These effects were not registered in the tests with resistances, indicating that their source is the powered laser structure. This is a new aspect for analyses of QC laser module construction for applied science. The experimental results for the laser biasing voltage and the light pulse also showed a time delay of ~5 ns. This phenomenon was analyzed during the modulation simulations.

**Figure 7.** Comparison of experimental results (laser biasing voltage and light pulse) with simulation results (laser current).

#### **6. Conclusions**

This study presents the preliminary tests of an integrated IR modulator. A unique characteristic of this study was the simulation of the electrical signals supplying the quantum cascade laser, and the integration of the lasing structure and the current switching module. For this purpose, an electric circuit for the lasing structure was proposed. The simulated results defined the influence of impedance mismatching on both the current and voltage supply signals. It was shown that both the signal interface and the laser parameters form the shape of the light pulses. The QCL impedance has a non-Ohmic character, so more advanced models are required to describe the behavior of the laser. A perfect impedance match is also impossible, because the QCL impedance changes with the modulation of the driving signal.

Finally, the integrated IR modulator was constructed. The shapes of its light pulses were comparable to the driving current obtained during simulations. These results confirm the phenomenon of dynamic changes in QC laser impedance with current pulse duration. The electrical circuit simulation alone does not guarantee that these effects are observed. However, the results provide new knowledge in the field of modeling QC lasing structures using the laser-system approach.

The practical result of the work is an IR modulator that generates MWIR pulses using a QC laser operated at room temperature. The light time parameters, a pulse duration of a few tens of ns and a maximum frequency of 1 MHz, are unique results considering the performance of the other available compact QC laser modules.

Theoretically, the applied switching module also provides a bandwidth of 250 MHz and the generation of complex waveforms using six independently controlled current signals of up to 5 A, and a biasing voltage of 30 V. In the future, it will give new opportunities for many advanced applications. Such a modulator with combined optical signals is needed, e.g., in tunable direct laser absorption spectroscopy, multi-level optical signal transmission, and ultra-short pulse generation with a pre-biasing current.

**Author Contributions:** Conceptualization, J.M.; methodology, J.M. and D.S.; validation, J.M.; formal analysis, J.M.; investigation, D.S.; resources, J.M.; data curation, J.M.; writing—original draft preparation, J.M.; writing—review and editing, J.M. and D.S.; visualization, D.S. Both authors have read and agreed to the published version of the manuscript.

**Funding:** Research funded by Narodowe Centrum Badań i Rozwoju (Grant No. MAZOWSZE/0196/19-00) and by Military University of Technology (Grant No. UGB/22-786/2020/WAT).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** I would like to thank Zbigniew Zawadzki from the Institute of Optoelectronics, MUT, for technical support. The assistance of the Sieć Badawcza Łukasiewicz – Instytut Technologii Elektronowej (Kamil Pierściński) in providing support with quantum cascade laser technology is gratefully acknowledged.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**

