Article

Multi-Class Classification and Multi-Output Regression of Three-Dimensional Objects Using Artificial Intelligence Applied to Digital Holographic Information

School of Electronics Engineering, Vellore Institute of Technology (VIT), Chennai 600127, Tamilnadu, India
* Author to whom correspondence should be addressed.
Sensors 2023, 23(3), 1095; https://doi.org/10.3390/s23031095
Submission received: 20 November 2022 / Revised: 18 December 2022 / Accepted: 29 December 2022 / Published: 17 January 2023
(This article belongs to the Collection 3D Imaging and Sensing System)

Abstract
Digital holographically sensed 3D data processing, which is useful for AI-based vision, is demonstrated. Three prominent learning approaches were utilized for the proposed multi-class classification and multi-output regression tasks of the chosen 3D objects in supervised learning: learning directly from the sensed holograms; from concatenated intensity–phase (whole information) images formed from the intensity and phase computationally retrieved from the holograms; and from phase-only (depth information) images. Each dataset comprised 2268 images obtained from the chosen eighteen 3D objects. The efficacy of our approaches was validated on experimentally generated digital holographic data and then quantified and compared using specific evaluation metrics. The machine learning classifiers had better AUC values for different classes on the hologram and whole information datasets compared to the CNN, whereas the CNN performed better on the phase-only image dataset. The MLP regressor showed stable predictions on the test and validation sets, with a fixed EV regression score of 0.00, compared to the CNN and the other regressors for the hologram and phase-only image datasets, whereas the RF regressor performed better on the validation set for the whole information dataset, with a fixed EV regression score of 0.01, compared to the CNN and the other regressors.

1. Introduction

Multi-class classification and multi-output regression tasks [1] are deep learning applications that map multiple inputs to a discrete class label and to several continuous outputs, respectively. These supervised learning techniques [2] play a vital role in the development of artificial intelligence systems in which decisions are made through discrete and continuous labels by considering multiple inputs according to the problem at hand. Studies have emerged in the areas of learning and decision making that used multi-class classification and multi-output regression tasks such as Alzheimer’s disease classification [3,4,5,6], food ingredient classification [7], river quality prediction [8], natural gas demand forecasting [9], drug efficacy prediction [10], prediction of the audio spectrum of wind noise (represented by several sound pressure variables) of a given vehicle component [11], real-time prediction of multiple gas tank levels of a Linz–Donawitz converter gas system [12], simultaneous estimation of different biophysical parameters from remote sensing images [13], and channel estimation via the prediction of several received signals [14]. However, these real-world problems [11,12,13,14] still face major challenges, such as missing feature/target values and the presence of noise, owing to the complexity of real domains. Despite these challenges, multi-output regression methods have been shown to offer better predictive performance and computational efficiency [15]. Therefore, in the present work, we studied the implications of multi-class classification and multi-output regression tasks on 3D objects by using deep neural networks to develop intelligent and pragmatic three-dimensional (3D) vision systems. The design and development of such robust systems requires efficient 3D object data-acquisition systems, 3D object reconstruction techniques for the sensed data, and prudent algorithms to process the 3D information in real time. In this context, digital holography [16,17,18,19] is a potential technique for the digital sensing and computational retrieval of 3D information in a real-time process, and several applications in information processing have been demonstrated [20,21,22]. Digital holography is a coherent imaging technique in which optically generated holograms are digitally detected by a CCD/CMOS sensor and thereafter numerically reconstructed to retrieve the whole information from the optical wavefront. The information stored in the digital hologram about the complete 3D volume of the object can thus be numerically processed to obtain a digital complex-valued image that is a quantitative measure of the intensity and phase of the sensed complex object wavefront [23]. The unwrapped phase images [24] carry the quantitative depth features and are significantly important in gathering and processing the 3D information [20,21]. In our approach, we thus demonstrated five-class classification and regression tasks on the 3D objects using holograms, concatenated intensity–phase information, and phase-only holographic information via deep convolutional neural network (CNN) learning. The CNN is one of the major deep neural networks and a basic building block of deep learning.
The ability of a CNN to stack many convolutional and pooling layers in the feature-extraction stage and to use an arbitrary number of neurons in the classification stage has made it feasible for different kinds of tasks in digital holography, such as classification, autofocusing, fringe-pattern denoising, image segmentation, image super-resolution, and hologram reconstruction [25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43]. A CNN [44] consists of feature-extraction and classification layers. The feature-extraction stage contains a series of convolutional and pooling layers, and the classification stage is composed of a dense layer and an output layer. The convolutional layer applies convolutional kernels to its input, and the output of each convolution is passed through a non-linear activation function. The number of kernels and the kernel size in a convolutional layer are specified by the user according to the task at hand. The dense layer consists of multiple neurons, while the output layer can contain multiple neurons or a single neuron depending on the application.
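As a concrete illustration of these building blocks, the following minimal sketch (using the Keras API, consistent with the TensorFlow implementation mentioned in Section 4) applies one convolutional layer with eight 3 × 3 kernels and ReLU activation, followed by a 2 × 2 max-pooling layer; the layer sizes here are illustrative only.

```python
import tensorflow as tf

# Illustrative sketch of the convolution + pooling pattern described above.
x = tf.random.normal((1, 160, 160, 1))            # one grayscale input image
conv = tf.keras.layers.Conv2D(filters=8,          # number of kernels (user-chosen)
                              kernel_size=3,      # 3 x 3 kernel
                              activation="relu")  # non-linear activation
pool = tf.keras.layers.MaxPooling2D(pool_size=2)  # reduces feature-map size

features = pool(conv(x))
print(features.shape)  # (1, 79, 79, 8): smaller spatial size, 8 feature maps
```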
Kim et al. proposed a VGG 19-layer neural network to perform the hologram classification of unlabeled living cells [45]. Lam et al. performed invariant hologram classification of deformable objects using a deep CNN [46]. Zhu et al. proposed a deep learning method for the classification of microplastics using digital holograms [47]. Reddy et al. proposed deep-learning-based binary classification of 3D objects for a holographic phase-only image dataset [48]. Shimobaba et al. performed deep-CNN-based multi-class classification of data pages for holographic memory [29]. Jo et al. proposed a deep CNN to perform the holographic image classification of unlabeled living cells for anthrax and non-anthrax spores [37]. Ren et al. demonstrated the autofocusing problem as a regression task in digital holography using a deep CNN for an amplitude and phase-only hologram dataset at different recording distances [28]. In the present work, five-class classification and regression tasks using a deep CNN applied to holograms, reconstructed intensity and phase information combined in a single image, and phase (depth)-only image datasets of eighteen 3D objects are proposed. The advantage of this work compared to the previous works [28,29,37,45,46,47,48,49] is that the five-class classification and regression tasks were performed on datasets of holograms, concatenated intensity–phase images, and phase-only images of eighteen 3D objects. These three datasets were constructed from recorded digital holograms using a complex-wave retrieval method [50]. Off-axis digital Fresnel holograms of the 3D objects were recorded in an off-axis geometry at different distances; by post-processing these holograms, the datasets of concatenated intensity–phase images and phase-only images were obtained. The three datasets were then passed through the deep CNN to perform the five-class classification and regression tasks. These tasks, applied in supervised learning to the digital holographic information of the eighteen 3D objects using a deep CNN, are equivalent to 3D object allocation and prediction performed on the digital holographic datasets, producing discrete and continuous labels as output, which is the rationale behind the present work. The CNN was trained on all three datasets separately to generate the results. For the five-class classification, results such as loss/accuracy graphs of the training/validation sets, a confusion matrix, performance metrics, receiver operating characteristics (ROC), and precision-recall characteristics are shown for the validation of the present work. Similarly, for the five-class regression task, loss, mean square error (MSE), and mean absolute error (MAE) curves were plotted for both the training/validation sets, and performance metrics such as the MAE, R2 score (coefficient of determination), and explained-variance (EV) regression score for the test/validation sets are shown for the confirmation of the work. Further, the CNN was compared with machine learning classifiers and regressors such as KNN, MLP, DT, RF, and ET separately on all three datasets for both tasks. The proof of the proposed concept was demonstrated using a real-time off-axis digital holographic experiment for sensing and retrieving the 3D object information, which was further processed using AI/ML techniques.

2. Theory

In this section, we describe the digital hologram sensing, retrieval, and processing of the 3D information using the deep CNN and machine learning algorithms.

2.1. Sensing and Retrieval of 3D Information of the Objects Using Off-Axis Digital Fresnel Holography

The construction and modeling of the 3D objects are shown in Section S1.1 of the Supplementary Materials. Figure 1 shows the schematic diagrams of four of the eighteen 3D objects used for the recording of the off-axis digital Fresnel holograms.
The 3D objects shown in Figure 1 consisted of two different planes, namely the first plane and the second plane, which had different features and were separated by a distance of $z = 8$ mm. The construction of the remaining fourteen 3D objects was similar to that of the four 3D objects shown in Figure 1 but with different features on each plane [48]. In total, eighteen 3D objects were considered for the proposed five-class classification and regression tasks [49]. The 3D objects were characterized by their intensity and phase information. In the construction shown in Figure 1, when light passed through the first plane, the amplitude and phase information of the object with the features of the first plane were obtained. Then, after propagating the distance $z$ through free space, the amplitude and phase information of the next object features in the second plane were also obtained.
Section S1.2 presents the details of the digital recording and numerical reconstruction of the holograms to obtain the complex 3D object wave information. Figure 2 describes the experimental setup (a Mach–Zehnder digital holographic recording geometry in an off-axis scheme) used for recording the holograms of the 3D objects [49]. A He-Ne laser source with a wavelength of $\lambda = 632.8$ nm was used. The holograms were recorded using a CMOS sensor with a square pixel pitch of 6 μm × 6 μm at an interference angle of $\theta = 1.4^{\circ}$. The size of each recorded hologram was 1600 × 1600 pixels. Next, the complex-wave retrieval method [50] was applied to the recorded holograms of the 3D objects to obtain the complex-wave fields of the objects at the recording plane. Further, an inverse Fresnel transform was applied to the retrieved complex-wave field to obtain a 2D digital complex-valued image at the object plane. The 2D digital complex-valued images contained the 3D information in the form of the intensity and phase.
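The exact complex-wave retrieval procedure is described in Ref. [50]; as a rough numerical illustration of the overall reconstruction step, the sketch below isolates the off-axis interference term by Fourier filtering and back-propagates it to the object plane. The filter window, the sideband location, and the use of angular-spectrum propagation in place of the inverse Fresnel transform are all simplifying assumptions.

```python
import numpy as np

def reconstruct_off_axis(hologram, wavelength=632.8e-9, pitch=6e-6, d=0.2):
    """Illustrative off-axis reconstruction (not the method of Ref. [50]):
    Fourier-filter the +1 diffraction order, then back-propagate by the
    recording distance d using the angular-spectrum method."""
    N = hologram.shape[0]
    H = np.fft.fftshift(np.fft.fft2(hologram))
    # Keep a window around the +1 order; its position depends on the
    # off-axis angle (an offset of N//4 pixels is assumed here).
    win, cy, cx = N // 8, N // 2 + N // 4, N // 2
    sideband = np.zeros_like(H)
    sideband[cy - win:cy + win, cx - win:cx + win] = H[cy - win:cy + win,
                                                       cx - win:cx + win]
    u = np.fft.ifft2(np.fft.ifftshift(sideband))  # complex wave at the sensor
    fx = np.fft.fftfreq(N, pitch)                 # spatial frequencies
    FX, FY = np.meshgrid(fx, fx)
    k = 2 * np.pi / wavelength
    kz = np.sqrt((k**2 - (2 * np.pi * FX)**2
                  - (2 * np.pi * FY)**2).astype(complex))
    return np.fft.ifft2(np.fft.fft2(u) * np.exp(-1j * kz * d))
```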
The intensity and phase information present in the 2D digital complex-valued images was extracted and united via concatenation to form concatenated intensity–phase (whole information) images; phase (depth)-only information was also extracted from the 2D digital complex-valued images to form phase images. The off-axis Mach–Zehnder holographic geometry is suitable for both transmitting and reflecting types of objects; the object beam arm can be modified appropriately for reflective objects or specimens. In the present paper, we modeled the 3D objects in transmission mode to demonstrate the proof of concept of the proposed application.
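The concatenation step itself is a simple side-by-side joining of the intensity and phase arrays, as the following minimal sketch shows (the stand-in complex field and variable names are illustrative; the paper uses unwrapped phase images [24], whereas np.angle returns the wrapped phase):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in 2D complex-valued image; in practice this is the output of the
# reconstruction step (e.g., the previous sketch).
obj = rng.standard_normal((1600, 1600)) * np.exp(
    1j * rng.uniform(0, 2 * np.pi, (1600, 1600)))

intensity = np.abs(obj) ** 2                       # intensity information
phase = np.angle(obj)                              # wrapped phase; the paper
                                                   # uses unwrapped phase [24]
whole = np.concatenate([intensity, phase], axis=1) # 1600 x 3200 "whole" image
print(whole.shape)
```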

2.2. Multi-Class Classification and Multi-Output Regression of 3D Objects Using Holographic Information

Section S1.3 shows the equations that governed the formation of the 3D object datasets of sensed holograms, concatenated intensity–phase images, and phase-only images. The holographic information of 3D objects can be processed using an AI-based approach in several ways. One method is to learn directly from the sensed hologram data. Another is to learn from the 3D object information retrieved from the digital holograms, i.e., by forming a dataset of the retrieved intensities and phases combined into concatenated intensity–phase images. Since the phase information contains the depth features, a phase-only image dataset can also be learned to accomplish the tasks. In the present paper, we addressed the above approaches for the multi-class classification and multi-output regression tasks of the 3D objects in supervised learning by using a deep CNN and comparing the results with those of standard machine learning algorithms. As noted in the Introduction, these tasks amount to 3D object allocation and prediction performed on the digital holographic datasets, producing discrete and continuous labels as output. The eighteen 3D objects considered for this problem were divided into five sub-classes (Class-a, Class-b, Class-c, Class-d, and Class-e) to perform the five-class classification and regression tasks using the following equations:
$$\text{Class-a}: \{T_1\} = \{a_{d_i},\ b_{d_i},\ c_{d_i},\ e_{d_i}\} \tag{1}$$

$$\text{Class-b}: \{T_2\} = \{f_{d_i},\ g_{d_i},\ h_{d_i},\ k_{d_i}\} \tag{2}$$

$$\text{Class-c}: \{T_3\} = \{l_{d_i},\ m_{d_i},\ n_{d_i},\ o_{d_i}\} \tag{3}$$

$$\text{Class-d}: \{T_4\} = \{p_{d_i},\ q_{d_i},\ r_{d_i}\} \tag{4}$$

$$\text{Class-e}: \{T_5\} = \{s_{d_i},\ t_{d_i},\ u_{d_i}\} \tag{5}$$
where $d_i$ represents the distance between the recording plane and the object plane, and $i$ denotes the indices of the individual objects. The combined objects circle–pentagon ($a_{d_i}$), circle–triangle ($b_{d_i}$), circle–square ($c_{d_i}$), and circle–rectangle ($e_{d_i}$) were considered for Class-a. The combined objects square–circle ($f_{d_i}$), square–triangle ($g_{d_i}$), square–rectangle ($h_{d_i}$), and square–pentagon ($k_{d_i}$) were considered for Class-b. The combined objects triangle–circle ($l_{d_i}$), triangle–square ($m_{d_i}$), triangle–rectangle ($n_{d_i}$), and triangle–pentagon ($o_{d_i}$) were considered for Class-c. The combined objects pentagon–circle ($p_{d_i}$), pentagon–square ($q_{d_i}$), and pentagon–triangle ($r_{d_i}$) were considered for Class-d. Finally, the combined objects rectangle–circle ($s_{d_i}$), rectangle–square ($t_{d_i}$), and rectangle–triangle ($u_{d_i}$) were considered for Class-e. The five-class classification and regression tasks of the 3D objects using the hologram dataset are governed by Equations (1)–(5). The five-class classification and regression tasks for the concatenated intensity–phase (whole information) image dataset were performed by using Equations (6)–(10):
$$\text{Class-a}: \{RT_{INPH}^{1}\} = \{R_{a_{d_i},INPH},\ R_{b_{d_i},INPH},\ R_{c_{d_i},INPH},\ R_{e_{d_i},INPH}\} \tag{6}$$

$$\text{Class-b}: \{RT_{INPH}^{2}\} = \{R_{f_{d_i},INPH},\ R_{g_{d_i},INPH},\ R_{h_{d_i},INPH},\ R_{k_{d_i},INPH}\} \tag{7}$$

$$\text{Class-c}: \{RT_{INPH}^{3}\} = \{R_{l_{d_i},INPH},\ R_{m_{d_i},INPH},\ R_{n_{d_i},INPH},\ R_{o_{d_i},INPH}\} \tag{8}$$

$$\text{Class-d}: \{RT_{INPH}^{4}\} = \{R_{p_{d_i},INPH},\ R_{q_{d_i},INPH},\ R_{r_{d_i},INPH}\} \tag{9}$$

$$\text{Class-e}: \{RT_{INPH}^{5}\} = \{R_{s_{d_i},INPH},\ R_{t_{d_i},INPH},\ R_{u_{d_i},INPH}\} \tag{10}$$
The five-class classification and regression tasks of the phase (depth)-only image dataset were performed by using Equations (11)–(15):
$$\text{Class-a}: \{RT_{PH}^{1}\} = \{R_{a_{d_i},PH},\ R_{b_{d_i},PH},\ R_{c_{d_i},PH},\ R_{e_{d_i},PH}\} \tag{11}$$

$$\text{Class-b}: \{RT_{PH}^{2}\} = \{R_{f_{d_i},PH},\ R_{g_{d_i},PH},\ R_{h_{d_i},PH},\ R_{k_{d_i},PH}\} \tag{12}$$

$$\text{Class-c}: \{RT_{PH}^{3}\} = \{R_{l_{d_i},PH},\ R_{m_{d_i},PH},\ R_{n_{d_i},PH},\ R_{o_{d_i},PH}\} \tag{13}$$

$$\text{Class-d}: \{RT_{PH}^{4}\} = \{R_{p_{d_i},PH},\ R_{q_{d_i},PH},\ R_{r_{d_i},PH}\} \tag{14}$$

$$\text{Class-e}: \{RT_{PH}^{5}\} = \{R_{s_{d_i},PH},\ R_{t_{d_i},PH},\ R_{u_{d_i},PH}\} \tag{15}$$
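In code, Equations (1)–(15) reduce to a fixed grouping of the object codes into the five classes; a minimal sketch (object codes follow the paper's notation, and the integer labels are illustrative) is:

```python
# Grouping of the combined-object codes into the five classes, following
# Equations (1)-(15); each object is recorded at several distances d_i.
classes = {
    "Class-a": ["a", "b", "c", "e"],  # circle-pentagon, circle-triangle, ...
    "Class-b": ["f", "g", "h", "k"],  # square-circle, square-triangle, ...
    "Class-c": ["l", "m", "n", "o"],  # triangle-circle, triangle-square, ...
    "Class-d": ["p", "q", "r"],       # pentagon-circle, pentagon-square, ...
    "Class-e": ["s", "t", "u"],       # rectangle-circle, rectangle-square, ...
}
# Integer label for each object code (0 = Class-a, ..., 4 = Class-e).
label_of = {obj: idx
            for idx, objs in enumerate(classes.values())
            for obj in objs}
print(label_of["h"])  # 1: square-rectangle belongs to Class-b
```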
The deep CNN was used to perform the five-class classification and regression tasks on the datasets of holograms, concatenated intensity–phase images, and phase-only images. Further, the five-class classification and regression results obtained with the deep CNN for the different digital holographic datasets were compared with those of machine learning algorithms, namely the K-nearest neighbor (KNN), multi-layer perceptron (MLP), decision tree (DT), random forest (RF), and extra trees (ET) algorithms. In this way, the five-class classification and regression tasks were performed for the different digital holographic datasets using deep learning and machine learning frameworks.

3. Architecture of CNN for Multi-Class Classification and Multi-Output Regression

Figure 3 shows a block diagram of the CNN that was used to perform the five-class classification and regression tasks for the different digital holographic datasets, which consisted of holograms, concatenated intensity–phase (whole information) images, and phase (depth)-only images; i.e., the CNN took its inputs from each of the three datasets independently. Section S2.1 provides the details of the mathematical model of the CNN used for the multi-class classification and multi-output regression.
Figure 3 describes the architecture of the CNN, which contained four convolutional layers, four pooling layers, a fully connected layer, and an output layer. The classification stage was used for both the five-class classification and regression; for the five-class regression, the classification stage was modified into a regression stage to implement the task. Each convolutional layer operated on its input with a set of kernels to generate the output, which was then processed by a pooling layer, and each convolutional layer used the rectified linear unit (ReLU) activation function. The number of kernels was 8 in the first convolutional layer, 16 in the second, 32 in the third, and 64 in the fourth; the kernel size was 3 × 3 in all convolutional layers. The pooling layer accepted the input from the convolutional layer to reduce the dimensionality of the feature map; the pooling technique used here was MaxPooling2D. The pooling layer did not affect the number of parameters because it only reduced the dimensionality of the feature map. After four successive stages of convolution and pooling in the feature-extraction stage, the final pooling-stage output was given to the classification stage to perform the five-class classification and regression tasks. The 2D output of the fourth pooling stage was converted into 1D data through a flatten layer before being processed by the fully connected layer, which contained 16 neurons. The output layer received the input from the fully connected layer to perform the five-class classification and regression tasks: the softmax function was used for the five-class classification and the linear function for regression, and the output layer contained five neurons for both tasks. A summary of the proposed deep CNN model used for the five-class classification and regression tasks is shown in Table 1.
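A sketch of this architecture, assembled from the description above and Table 1, is given below in Keras. The 160 × 160 input size follows Section 4; the padding, strides, and the dense-layer activation are not stated in the paper and are assumptions here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(task="classification"):
    """Sketch of the described CNN: four Conv2D (ReLU) + MaxPooling2D stages
    with 8/16/32/64 kernels of size 3, a 16-neuron dense layer, and a
    5-neuron output (softmax for classification, linear for regression)."""
    m = models.Sequential()
    for i, filters in enumerate((8, 16, 32, 64)):
        if i == 0:
            m.add(layers.Conv2D(filters, 3, activation="relu",
                                input_shape=(160, 160, 1)))
        else:
            m.add(layers.Conv2D(filters, 3, activation="relu"))
        m.add(layers.MaxPooling2D(2))        # halves the feature-map size
    m.add(layers.Flatten())                  # 2D feature maps -> 1D vector
    m.add(layers.Dense(16, activation="relu"))   # dense activation assumed
    activation = "softmax" if task == "classification" else "linear"
    m.add(layers.Dense(5, activation=activation))
    return m
```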
Section S2.2 provides the details on the performance metrics used for the multi-class classification and multi-output regression tasks.

4. Dataset Preparation

The datasets of the holograms, concatenated intensity–phase images, and phase (depth)-only images were created using the eighteen 3D objects considered in this study. Digital holograms of the eighteen 3D objects were created using the Mach–Zehnder off-axis digital holographic geometry shown in Figure 2. The eighteen 3D objects were used in the off-axis geometry to form 63 holograms with a size of 1600 × 1600 pixels using a CMOS sensor at 15 different distances of $d_1 = 180$ mm, $d_2 = 185$ mm, $d_3 = 200$ mm, $d_4 = 201$ mm, $d_5 = 205$ mm, $d_6 = 210$ mm, $d_7 = 220$ mm, $d_8 = 250$ mm, $d_9 = 251$ mm, $d_{10} = 255$ mm, $d_{11} = 300$ mm, $d_{12} = 305$ mm, $d_{13} = 310$ mm, $d_{14} = 311$ mm, and $d_{15} = 315$ mm. Figure 4 shows the digital holograms of five of the 3D objects recorded at a distance of $d_3 = 200$ mm.
The holograms of the eighteen 3D objects were used to obtain 2D digital complex-valued images using the complex-wave retrieval method [50]. The datasets of the holograms, concatenated intensity–phase images, and phase-only images were formed by rotating each image in the respective datasets in steps of $5^{\circ}$. These three digital holographic datasets, each consisting of 2268 images, were then passed through the deep CNN independently, as shown in Figure 3, to perform the five-class classification and regression tasks. The hologram and phase-image size considered for the input of the CNN was 160 × 160, resized from 1600 × 1600; the concatenated intensity–phase image size considered for the input of the CNN was 160 × 160, resized from 1600 × 3200. The five-class classification and regression tasks of the hologram dataset were governed by Equations (1)–(5), those of the concatenated intensity–phase image dataset by Equations (6)–(10), and those of the phase-only image dataset by Equations (11)–(15). Figure 5a shows the concatenated intensity–phase image of the object circle–pentagon ($R_{a_{d_1},INPH}$) in Class-a at a distance of $d_1 = 180$ mm; similarly, Figure 5b shows the concatenated intensity–phase image of the object square–rectangle ($R_{h_{d_1},INPH}$) in Class-b at $d_1 = 180$ mm. Figure 6a shows the reconstructed phase image ($R_{g_{d_1},PH}$) of the object square–triangle in Class-b at $d_1 = 180$ mm, and Figure 6b shows the reconstructed phase image ($R_{k_{d_1},PH}$) of the object square–pentagon in Class-b at $d_1 = 180$ mm. After performing the five-class classification and regression on the datasets of the holograms, concatenated intensity–phase images, and phase-only images separately using the deep CNN, the results were further compared with those of machine learning algorithms such as KNN, MLP, DT, RF, and ET.
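A minimal sketch of this dataset construction (using SciPy; the 36 rotation steps per image, which make 63 × 36 = 2268 samples, and the interpolation order are assumptions consistent with the stated counts) is:

```python
import numpy as np
from scipy.ndimage import rotate, zoom

def augment(image, step_deg=5.0, n_rotations=36):
    """Rotate an image in 5-degree steps and resize each rotation to the
    160 x 160 CNN input size (handles 1600 x 1600 and 1600 x 3200 inputs)."""
    samples = []
    for k in range(n_rotations):
        r = rotate(image, angle=k * step_deg, reshape=False, order=1)
        factors = (160 / r.shape[0], 160 / r.shape[1])  # per-axis resizing
        samples.append(zoom(r, factors, order=1))
    return np.stack(samples)

hologram = np.random.rand(1600, 1600)   # stand-in for a recorded hologram
print(augment(hologram).shape)          # (36, 160, 160)
```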
The number of nearest neighbors considered for the KNN classifier and regressor was $k = 5$. The MLP classifier and regressor, which consisted of a single hidden layer with ReLU as the activation function, were trained using an Adam optimizer with a learning rate and regularization rate (α) of 0.0003. The DT classifier and regressor were trained by setting max_depth = 2. The RF classifier and regressor were trained by setting n_estimators = 10, max_depth = None, and min_samples_split = 2. The ET classifier and regressor were trained similarly to the RF classifier and regressor using the same parameters. For the five-class classification and regression tasks, all three digital holographic datasets were separated into training, validation, and test sets with respective proportions of 75:15:10. The training set consisted of 340 images each in Class-a, Class-b, Class-c, and Class-d, and 341 images in Class-e. The validation set consisted of 68 images in each of the five classes. The test set consisted of 45 images each in Class-a, Class-b, Class-c, and Class-d, and 47 images in Class-e. For the five-class classification, the deep CNN was trained for 100 epochs using an Adam optimizer with a learning rate of 0.0003, and categorical cross-entropy was used as the loss function. For the five-class regression, the deep CNN was trained in the same way but with the mean square error (MSE) as the loss function; the metrics considered were the MSE and the mean absolute error (MAE). The learning rate of the deep CNN model was fixed throughout training for both tasks. The deep CNN was implemented in Python using TensorFlow, and the machine learning classifiers and regressors were implemented using scikit-learn.
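The stated training configuration can be sketched as follows (the MLP hidden-layer width is not given in the paper and is left at scikit-learn's default; all other hyperparameters are as listed above):

```python
import tensorflow as tf
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier

# CNN: Adam with a fixed learning rate of 0.0003; categorical cross-entropy
# for the five-class classification (MSE for the regression variant).
model = build_cnn("classification")          # from the sketch in Section 3
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=3e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, epochs=100, validation_data=(x_val, y_val))

# Machine learning baselines with the stated hyperparameters.
classifiers = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "MLP": MLPClassifier(activation="relu", solver="adam",
                         learning_rate_init=3e-4, alpha=3e-4),
    "DT":  DecisionTreeClassifier(max_depth=2),
    "RF":  RandomForestClassifier(n_estimators=10, max_depth=None,
                                  min_samples_split=2),
    "ET":  ExtraTreesClassifier(n_estimators=10, max_depth=None,
                                min_samples_split=2),
}
```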

5. Results and Discussion

5.1. Training of CNN for Multi-Class Classification on Holograms

The CNN was trained on the hologram dataset with batch sizes of 21/20 images for the training/validation sets, corresponding to 81/17 steps per epoch for the two sets. Figure 7 shows the loss/accuracy plot obtained by the CNN. The figure shows that the validation error was higher than the training error and that the accuracy of the training set was greater than that of the validation set, which indicated that the CNN was overfitting.
The testing of the CNN on the hologram dataset was performed separately with a batch size of 23 images. Figure 8 describes the multi-class confusion matrix obtained from the test set. Further, the performance of the multi-class classification was compared with that of machine learning classifiers such as KNN, MLP, DT, RF, and ET. The multi-class confusion matrices obtained by the KNN, MLP, DT, RF, and ET classifiers with a batch size of 23 images for the hologram dataset on the test set are also shown in Figure 8; in each case, the confusion matrix covers all five classes. Section S2.3 provides the details of the general confusion matrix for the five classes.
The performance metrics obtained from the CNN, KNN, MLP, DT, RF, and ET classifiers for the hologram dataset are shown in Table 2. The metric macro average was obtained by averaging over all five classes for the respective labels. The metrics of the micro average, weighted average, and samples average were calculated using the confusion matrix. Table 2 shows that the CNN had a greater accuracy for Class-a compared to the other classes. The KNN and MLP classifiers had a greater accuracy for Class-b compared to the other classes. The DT, RF, and ET classifiers had a higher accuracy for Class-b and Class-e compared to the other classes. Table 3 shows the computational costs and complexity parameters such as the floating-point operations (FLOPs), training time, and test time for the CNN, as well as for the other machine learning classifiers (KNN, MLP, DT, RF, and ET) for the hologram dataset.
Table 3 shows that the number of floating-point operations (FLOPs), the training time, and the test time for the CNN were all greater than those of the other machine learning classifiers. The receiver operating characteristic (ROC) and precision-recall characteristics were also used to describe the performance of the five-class classification task. Figure 9 shows the ROCs obtained from the CNN, KNN, MLP, DT, RF, and ET classifiers for the hologram dataset.
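Per-class ROC curves such as those in Figure 9 can be generated in a one-vs-rest fashion; a minimal sketch (assuming scikit-learn, integer test labels y_test, and per-class probability scores y_score from the classifier; the random stand-in data below is for illustration only) is:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

# Stand-in labels/scores; in practice these come from the test set and
# the classifier's predicted class probabilities.
rng = np.random.default_rng(0)
y_test = rng.integers(0, 5, size=230)
y_score = rng.dirichlet(np.ones(5), size=230)  # rows sum to 1, like softmax

y_bin = label_binarize(y_test, classes=[0, 1, 2, 3, 4])
for c in range(5):
    fpr, tpr, _ = roc_curve(y_bin[:, c], y_score[:, c])
    print(f"Class {c}: AUC = {auc(fpr, tpr):.2f}")
```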
In Figure 9a, it can be seen that the CNN had a better area under the curve (AUC) value of 0.57 for Class-a compared to the other classes. Similarly, the KNN classifier had a better AUC value for Class-d compared to the other classes. The MLP classifier had equal AUC values for all five classes. The DT classifier had better AUC values for Class-a, -b, and -e compared to the other classes. The RF and ET classifiers had better AUC values for Class-a, -b, -c, and -e compared to Class-d. Figure 10 describes the precision-recall characteristics obtained by the CNN, KNN, MLP, DT, RF, and ET classifiers for the hologram dataset.
Figure 10a shows that the CNN had a lower precision as the recall approached unity for all five classes. The remaining precision-recall characteristics show that the machine learning classifiers likewise had a lower precision as the recall approached unity for all five classes.

5.2. Training of CNN for Multi-Class Classification on Concatenated Intensity–Phase Image Dataset

The CNN was trained on the concatenated intensity–phase image dataset in the same manner as on the hologram dataset. The loss/accuracy plot obtained by the CNN is shown in Figure 11, which shows that the validation error was higher than the training error and that the accuracy of the training set was greater than that of the validation set. Therefore, it can be said that the CNN model was overfitting.
The testing of the CNN on the concatenated intensity–phase image dataset was performed in the same manner as that of the hologram dataset. The multi-class confusion matrix obtained by the CNN is shown in Figure 12. Further, the CNN was compared with the machine learning classifiers. The testing of the machine learning classifiers on the concatenated intensity–phase image dataset was performed in the same manner as that of the hologram dataset. Figure 12 shows the confusion matrix obtained by the KNN, MLP, DT, RF, and ET classifiers for all five classes for the concatenated intensity–phase image dataset.
The performance metrics obtained by the CNN, KNN, MLP, DT, RF, and ET classifiers for the concatenated intensity–phase image dataset are shown in Table 4, which shows that the CNN, DT, RF, and ET classifiers had a higher accuracy for Class-a compared to the other classes. The KNN and MLP classifiers had a greater accuracy for Class-b and -c compared to the other classes. Table 5 shows the computational costs and complexity parameters, such as the floating-point operations (FLOPs), training time, and test time, for the CNN and the machine learning classifiers (KNN, MLP, DT, RF, and ET) for the concatenated intensity–phase image dataset.
Table 5 shows that the number of floating-point operations (FLOPs), the training time, and the test time for the CNN were all higher than those of the other machine learning classifiers. Figure 13 shows the ROCs obtained from the CNN and the machine learning classifiers for all five classes for the concatenated intensity–phase image dataset.
Figure 13a,b shows that the CNN and KNN classifiers had higher AUC values for Class-a compared to the other classes. The MLP classifier had equal AUC values for all five classes. The DT, RF, and ET classifiers had higher AUC values for Class-b, -e, and -c compared to the other classes. Figure 14 shows the precision-recall characteristics obtained by the CNN and the machine learning classifiers for the concatenated intensity–phase image dataset.
Figure 14a shows that the CNN had a lower precision as the recall approached unity, as did the KNN, MLP, DT, RF, and ET classifiers.

5.3. Training of CNN for Multi-Class Classification on Phase-Only Information

The CNN was trained on the phase-only image dataset in the same manner as that of the hologram dataset. The loss/accuracy plot for both the training/validation sets obtained by the CNN for the phase-only image dataset is shown in Figure 15.
Figure 15 shows that the validation error was higher compared to the training error and that the accuracy of the training set was higher compared to that of the validation set. This showed that the CNN model was overfitting. The testing of the CNN for the phase-only image dataset on the test set was performed in the same manner as that of the hologram dataset. The confusion matrix obtained by the CNN for the five classes is shown in Figure 16.
Further, the performance of the five-class classification task was compared with that of the machine learning classifiers. The testing of the machine learning classifiers for the phase-only image dataset on the test set was performed in the same manner as for the hologram dataset. Figure 16 also describes the confusion matrix for all five classes obtained by the KNN, MLP, DT, RF, and ET classifiers for the phase-only image dataset. The performance metrics obtained by the CNN, KNN, MLP, DT, RF, and ET classifiers on the phase-only image dataset are shown in Table 6.
Table 6 shows that the CNN and RF classifiers had a greater accuracy for Class-e and Class-c compared to the other classes. The KNN and DT classifiers had a higher accuracy for Class-d compared to the other classes. The MLP classifier had a higher accuracy for Class-b compared to the other classes. The ET classifier achieved a higher accuracy for Class-a and Class-c compared to the other classes. Table 7 shows the computational costs and complexity parameters, such as the floating-point operations (FLOPs), training time, and test time, for the CNN and the other machine learning classifiers (KNN, MLP, DT, RF, and ET).
Table 7 shows that the number of floating-point operations (FLOPs), the training time, and the test time for the CNN were all higher than those of the machine learning classifiers. Figure 17 depicts the ROCs obtained from the CNN, KNN, MLP, DT, RF, and ET classifiers for the phase-only image dataset.
In Figure 17a, it can be seen that the CNN achieved the highest AUC value of 0.93 for Class-c compared to the other classes. The remaining machine learning classifiers achieved a lower AUC value for Class-c compared to that of the CNN. Overall, the CNN had better AUC values for all of the classes compared to the KNN, MLP, DT, RF, and ET classifiers. The precision-recall characteristics for all five classes obtained by the CNN, KNN, MLP, DT, RF, and ET classifiers for the phase-only image dataset are shown in Figure 18.
Figure 18a shows that the CNN had a lower precision when the recall approached unity for all the classes. Similarly, for the remaining precision-recall characteristics, it can be seen that the KNN, MLP, DT, RF, and ET classifiers also had a lower precision when the recall approached unity for all the classes.

5.4. Training of CNN for Multi-Output Regression on Holograms

The training of the CNN for the five-class regression on the hologram dataset was performed in the same manner as that of the five-class classification on the hologram dataset. Figure 19 shows the loss/MSE/MAE plot obtained by the CNN for the training/validation sets; the validation error was higher than the training error. The loss and MSE plots were the same for both the training/validation sets. Further, Figure 19 shows that the validation MAE was greater than the training MAE, which indicated that the CNN was overfitting. The testing of the CNN was performed separately with a batch size of 23 images for the test set on the hologram dataset. Evaluation metrics such as the mean absolute error (MAE), R2 score (coefficient of determination), and explained-variance (EV) regression score were used to measure the performance of the five-class regression task. The evaluation metrics obtained by the CNN are shown in Table 8 and were compared with those of the machine learning regressors; the evaluation metrics obtained by the KNN, MLP, DT, RF, and ET regressors with a batch size of 23 images for the hologram dataset on the test set are also shown in Table 8.
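All three regression metrics reported in Tables 8–13 are available in scikit-learn; a minimal sketch (y_true and y_pred stand for the multi-output targets and predictions; the random stand-in data is for illustration only) is:

```python
import numpy as np
from sklearn.metrics import (explained_variance_score, mean_absolute_error,
                             r2_score)

# Stand-in multi-output targets/predictions (five outputs per sample).
rng = np.random.default_rng(0)
y_true = rng.random((230, 5))
y_pred = y_true + 0.05 * rng.standard_normal((230, 5))

print(f"MAE = {mean_absolute_error(y_true, y_pred):.3f}")
print(f"R2  = {r2_score(y_true, y_pred):.3f}")
print(f"EV  = {explained_variance_score(y_true, y_pred):.2f}")
```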
Table 8 shows that the MLP regressor had a better performance for the five-class regression tasks on the test set with a stable EV regression score of 0.00 compared to the CNN and other regressors. The testing of the CNN for the validation set on the hologram dataset was also performed with a batch size of 20 images. The evaluation metrics obtained by the CNN and the other machine learning regressors for the validation set are shown in Table 9. The machine learning regressors were also tested on the validation set with a batch size of 20 images for the hologram dataset.
Table 9 shows that the MLP regressor had a consistent performance on the validation set compared to the CNN and the other machine learning regressors with a fixed EV regression score of 0.00.

5.5. Training of CNN for Multi-Output Regression on Concatenated Intensity–Phase Image Dataset

The training of the CNN for the five-class regression on the concatenated intensity–phase image dataset was performed in the same manner as that of the five-class classification on the hologram dataset. Figure 20 shows the loss/MSE/MAE plot obtained by the CNN on the training/validation sets.
Figure 20 shows that the validation error was greater than the training error. The loss and MSE plots were the same for both the training/validation sets. Further, the validation MAE was greater than the training MAE; therefore, it can be said that the CNN model was overfitting. The testing of the CNN for the concatenated intensity–phase image dataset on the test set was performed in the same manner as for the hologram dataset. The evaluation metrics obtained by the CNN are shown in Table 10 and were compared with those of the machine learning regressors, which are also shown in Table 10. The testing of the machine learning regressors for the concatenated intensity–phase image dataset on the test set was performed in the same manner as for the hologram dataset.
Table 10 shows that the MLP regressor had a better performance for the five-class regression tasks on the test set with a fixed EV regression score of 0.00 compared to the CNN and the other machine learning regressors. The testing of the CNN for the concatenated intensity–phase image dataset on the validation set was performed in the same manner as that of the hologram dataset. The evaluation metrics obtained by the CNN and the other machine learning regressors on the validation set are shown in Table 11. The testing of the machine learning regressors for the whole information dataset on the validation set was performed in the same manner as that of the hologram dataset.
Table 11 shows that the RF regressor had a better performance on the validation set compared to the CNN and the other machine learning regressors with a stable EV regression score of 0.01.

5.6. Training of CNN for Multi-Output Regression on Phase-Only Information

The training of the CNN for the five-class regression on the phase-only image dataset was performed in the same manner as that of the five-class classification on the hologram dataset. The loss/MSE/MAE plot for the training/validation sets is provided in Figure 21, which shows that the error for the training set was lower than the error for the validation set. The loss and MSE plots were the same for both the training/validation sets, as depicted in Figure 21. Further, the validation MAE was higher than the training MAE, which showed that the CNN model was overfitting. The testing of the CNN for the phase-only image dataset on the test set was performed in the same manner as for the hologram dataset. The evaluation metrics obtained by the CNN on the test set were then compared with those of the machine learning regressors; the evaluation metrics obtained from the CNN and the machine learning regressors on the test set are shown in Table 12. The testing of the machine learning regressors for the phase-only image dataset on the test set was performed in the same manner as for the hologram dataset.
Table 12 shows that the MLP regressor had a better performance for the five-class regression tasks on the test set with a fixed EV regression score of 0.00 compared to the CNN, KNN, DT, RF, and ET regressors. The testing of the CNN for the phase-only image dataset on the validation set was performed in the same manner as that of the hologram dataset. The evaluation metrics obtained by the CNN and the other machine learning regressors on the validation set are shown in Table 13. The testing of the machine learning regressors for the validation set on the phase-only image dataset was performed in the same manner as that of the hologram dataset.
Table 13 shows that the MLP regressor had a good performance on the validation set compared to the CNN and the other machine learning regressors with a stable EV regression score of 0.00.

6. Conclusions

In this paper, digital holographic information in datasets comprising holograms, concatenated intensity–phase images (formed by combining the reconstructed intensity and phase images), and phase-only images was used for the proposed multi-class classification and multi-output regression tasks by using deep learning and machine learning techniques. Each dataset comprised 2268 images. A deep CNN was applied to all three datasets independently to perform the five-class classification and regression tasks; as discussed above, these tasks, applied in supervised learning to the digital holographic information of the eighteen 3D objects, amount to 3D object allocation and prediction performed on the digital holographic datasets, producing discrete and continuous labels as output. For the five-class classification task, results such as the error/accuracy plots, error matrix, evaluation metrics, receiver operating characteristics (ROCs), and precision-recall characteristics were shown for the confirmation of the work. Similarly, for the five-class regression task, results such as the error/mean square error (MSE)/mean absolute error (MAE) plots and evaluation metrics were shown for the confirmation of the work. The CNN overfitted all three datasets, as shown by the error/accuracy graphs for classification and by the loss/MSE/MAE curves on the training/validation sets for regression. The machine learning classifiers had better AUC values for different classes on the datasets of holograms and concatenated intensity–phase images when compared to the CNN, whereas the CNN had higher AUC values for all five classes on the phase-only image dataset when compared to the other machine learning classifiers. The MLP regressor had a better performance on the test/validation sets for the hologram and phase-only image datasets, with a fixed EV regression score of 0.00, compared to the CNN and the other machine learning regressors [51]. The RF regressor had a better performance on the validation set for the concatenated intensity–phase image dataset, with a stable EV regression score of 0.01, compared to the CNN and the other regressors [52]. Overall, the CNN and the machine learning classifiers and regressors (KNN, MLP, DT, RF, and ET) showed complementary strengths across the five-class classification and regression tasks, with the best-performing model depending on the dataset and the task.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/s23031095/s1, Figure S1: 3D object used in the off-axis digital holographic recording geometry; Figure S2: Off-axis digital holographic recording geometry used for the recording of hologram of 3D object; Figure S3: General Multi-Class Confusion Matrix.

Author Contributions

U.M.R.N.: CNN architecture design, data collection, analysis and interpretation of results, manuscript preparation; A.N.: supervision, conceptualization, digital holographic experiment and informational retrieval, manuscript review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Engineering Research Board (SERB), Department of Science and Technology, Government of India, grant number CRG/2018/003906 and the APC was funded by VIT Chennai.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Chollet, F. Deep Learning with Python, 2nd ed.; Manning: New York, NY, USA, 2018; Available online: https://www.manning.com/books/deep-learning-with-python (accessed on 1 November 2017).
2. Mitchell, T.M. Machine Learning, 1st ed.; McGraw-Hill: New York, NY, USA, 1997; pp. 1–421.
3. Mehmood, A.; Maqsood, M.; Bashir, M.; Shuyuan, Y. A Deep Siamese Convolution Neural Network for Multi-Class Classification of Alzheimer Disease. Brain Sci. 2020, 10, 84.
4. Farooq, A.; Anwar, S.; Awais, M.; Rehman, S. A deep CNN based multi-class classification of Alzheimer’s disease using MRI. In Proceedings of the 2017 IEEE International Conference on Imaging Systems and Techniques (IST), Beijing, China, 18–20 October 2017; pp. 1–6.
5. Ramzan, F.; Khan, M.U.G.; Rehmat, A.; Iqbal, S.; Saba, T.; Rehman, A.; Mehmood, Z. A Deep Learning Approach for Automated Diagnosis and Multi-Class Classification of Alzheimer’s Disease Stages Using Resting-State fMRI and Residual Neural Networks. J. Med. Syst. 2020, 44, 37.
6. Islam, J.; Zhang, Y. A Novel Deep Learning Based Multi-class Classification Method for Alzheimer’s Disease Detection Using Brain MRI Data. In Brain Informatics; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2017; Volume 10654.
7. Pan, L.; Pouyanfar, S.; Chen, H.; Qin, J.; Chen, S.-C. DeepFood: Automatic Multi-Class Classification of Food Ingredients Using Deep Learning. In Proceedings of the 2017 IEEE 3rd International Conference on Collaboration and Internet Computing (CIC), San Jose, CA, USA, 15–17 October 2017; pp. 181–189.
8. Dzeroski, S.; Demšar, D.; Grbović, J. Predicting Chemical Parameters of River Water Quality from Bioindicator Data. Appl. Intell. 2000, 13, 7–17.
9. Aras, H.; Aras, N. Forecasting Residential Natural Gas Demand. Energy Sources 2004, 26, 463–472.
10. Li, H.; Zhang, W.; Chen, Y.; Guo, Y.; Li, G.-Z.; Zhu, X. A novel multi-target regression framework for time-series prediction of drug efficacy. Sci. Rep. 2017, 7, 40652.
11. Kuznar, D.; Mozina, M.; Bratko, I. Curve prediction with kernel regression. In Proceedings of the ECML/PKDD 2009 Workshop on Learning from Multi-Label Data, Bled, Slovenia, 7 September 2009; pp. 61–68.
12. Han, Z.; Liu, Y.; Zhao, J.; Wang, W. Real time prediction for converter gas tank levels based on multi-output least square support vector regressor. Control Eng. Pract. 2012, 20, 1400–1409.
13. Tuia, D.; Verrelst, J.; Alonso, L.; Perez-Cruz, F.; Camps-Valls, G. Multioutput Support Vector Regression for Remote Sensing Biophysical Parameter Estimation. IEEE Geosci. Remote Sens. Lett. 2011, 8, 804–808.
14. Sanchez-Fernandez, M.; De-Prado-Cumplido, M.; Arenas-Garcia, J.; Perez-Cruz, F. SVM Multiregression for Nonlinear Channel Estimation in Multiple-Input Multiple-Output Systems. IEEE Trans. Signal Process. 2004, 52, 2298–2307.
15. Borchani, H.; Varando, G.; Bielza, C.; Larrañaga, P. A survey on multi-output regression. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2015, 5, 216–233.
16. Schnars, U.; Falldorf, C.; Watson, J.; Jüptner, W. Digital Holography and Wavefront Sensing: Principles, Techniques and Applications; Springer: Berlin/Heidelberg, Germany, 2015.
17. Goodman, J.W. Introduction to Fourier Optics, 4th ed.; W. H. Freeman: New York, NY, USA, 2017.
18. Goodman, J.W.; Lawrence, R.W. Digital image formation from electronically detected holograms. Appl. Phys. Lett. 1967, 11, 77–79.
19. Schnars, U.; Jüptner, W. Direct recording of holograms by a CCD target and numerical reconstruction. Appl. Opt. 1994, 33, 179–181.
20. Nelleri, A.; Joseph, J.; Singh, K. Recognition and classification of three-dimensional phase objects by digital Fresnel holography. Appl. Opt. 2006, 45, 4046–4053.
21. Anith, N.; Unnikrishnan, G.; Joby, J. Three-dimensional object recognition from digital Fresnel hologram by wavelet matched filtering. Opt. Commun. 2006, 259, 499–506.
22. Reddy, B.L.; Ramachandran, P.; Nelleri, A. Compressive complex wave retrieval from a single off-axis digital Fresnel hologram for quantitative phase imaging and microlens characterization. Opt. Commun. 2021, 478, 126371.
23. Cuche, E.; Bevilacqua, F.; Depeursinge, C. Digital holography for quantitative phase-contrast imaging. Opt. Lett. 1999, 24, 291–293.
24. Ghiglia, D.C.; Romero, L.A. Robust two-dimensional weighted and unweighted phase unwrapping that uses fast transforms and iterative methods. J. Opt. Soc. Am. A 1994, 11, 107–117.
25. Wang, H.; Lyu, M.; Situ, G. eHoloNet: A learning-based end-to-end approach for in-line digital holographic reconstruction. Opt. Express 2018, 26, 22603–22614.
26. Pitkäaho, T.; Manninen, A.; Naughton, T.J. Focus prediction in digital holographic microscopy using deep convolutional neural networks. Appl. Opt. 2019, 58, A202–A208.
27. Wang, H.; Rivenson, Y.; Jin, Y.; Wei, Z.; Gao, R.; Günaydın, H.; Bentolila, L.A.; Kural, C.; Ozcan, A. Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nat. Methods 2019, 16, 103–110.
28. Ren, Z.; Xu, Z.; Lam, E.Y. Learning-based nonparametric autofocusing for digital holography. Optica 2018, 5, 337–344.
29. Shimobaba, T.; Kuwata, N.; Homma, M.; Takahashi, T.; Nagahama, Y.; Sano, M.; Hasegawa, S.; Hirayama, R.; Kakue, T.; Shiraki, A.; et al. Convolutional neural network-based data page classification for holographic memory. Appl. Opt. 2017, 56, 7327–7330.
30. Pitkäaho, T.; Manninen, A.; Naughton, T.J. Digital hologram reconstruction segmentation using a convolutional neural network. In Digital Holography and Three-Dimensional Imaging; Paper Th3A.1; OSA Technical Digest (Optica Publishing Group): Bordeaux, France, 2019.
31. Di, J.; Wang, K.; Li, Y.; Zhao, J. Deep learning-based holographic reconstruction in digital holography. In Digital Holography and Three-Dimensional Imaging; Optical Society of America: Washington, DC, USA, 2020; p. HTu4B.2.
32. Trujillo, C.; Garcia-Sucerquia, J. Deep learning for digital holographic microscopy: Automatic detection of phase objects in raw holograms. In Digital Holography and Three-Dimensional Imaging; Optical Society of America: Washington, DC, USA, 2020; p. HTu4B.3.
33. Yan, K.; Yu, Y.; Huang, C.; Sui, L.; Qian, K.; Asundi, A. Fringe pattern denoising based on deep learning. Opt. Commun. 2019, 437, 148–152.
34. Zeng, T.; So, H.K.H.; Lam, E.Y. RedCap: Residual encoder-decoder capsule network for holographic image reconstruction. Opt. Express 2020, 28, 4876–4887.
35. Meng, Z.; Pedrini, G.; Lv, X.; Ma, J.; Nie, S.; Yuan, C. DL-SI-DHM: A deep network generating the high-resolution phase and amplitude images from wide-field images. Opt. Express 2021, 29, 19247–19261.
36. Ma, S.; Liu, Q.; Yu, Y.; Luo, Y.; Wang, S. Quantitative phase imaging in digital holographic microscopy based on image inpainting using a two-stage generative adversarial network. Opt. Express 2021, 29, 24928–24946.
37. Jo, Y.; Park, S.; Jung, J.; Yoon, J.; Joo, H.; Kim, M.-H.; Kang, S.-J.; Choi, M.C.; Lee, S.Y.; Park, Y. Holographic deep learning for rapid optical screening of anthrax spores. Sci. Adv. 2017, 3, e1700606.
38. Li, H.; Chen, X.; Chi, Z.; Mann, C.; Razi, A. Deep DIH: Single-Shot Digital In-Line Holography Reconstruction by Deep Learning. IEEE Access 2020, 8, 202648–202659.
39. Priscoli, M.D.; Memmolo, P.; Ciaparrone, G.; Bianco, V.; Merola, F.; Miccio, L.; Bardozzo, F.; Pirone, D.; Mugnano, M.; Cimmino, F.; et al. Neuroblastoma Cells Classification through Learning Approaches by Direct Analysis of Digital Holograms. IEEE J. Sel. Top. Quantum Electron. 2021, 27, 5500309.
40. Yin, D.; Gu, Z.; Zhang, Y.; Gu, F.; Nie, S.; Ma, J.; Yuan, C. Digital Holographic Reconstruction Based on Deep Learning Framework with Unpaired Data. IEEE Photonics J. 2019, 12, 3900312.
41. Li, J.; Li, Y.; Li, J.; Zhang, Q.; Yang, G.; Chen, S.; Wang, C. Single Exposure Optical Image Watermarking Using a cGAN Network. IEEE Photonics J. 2021, 13, 6900111.
42. Ren, Z.; So, H.K.-H.; Lam, E.Y. Fringe Pattern Improvement and Super-Resolution Using Deep Learning in Digital Holography. IEEE Trans. Ind. Inform. 2019, 15, 6179–6186.
43. Zhu, Y.; Yeung, C.H.; Lam, E.Y. Holographic Classifier: Deep Learning in Digital Holography for Automatic Micro-objects Classification. In Proceedings of the 2020 IEEE 18th International Conference on Industrial Informatics (INDIN), Warwick, UK, 20–23 July 2020; pp. 515–520.
44. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
45. Kim, S.-J.; Wang, C.; Zhao, B.; Im, H.; Min, J.; Choi, H.J.; Tadros, J.; Choi, N.R.; Castro, C.M.; Weissleder, R.; et al. Deep transfer learning-based hologram classification for molecular diagnostics. Sci. Rep. 2018, 8, 17003.
46. Lam, H.; Tsang, P. Invariant Classification of Holograms of Deformable Objects Based on Deep Learning. In Proceedings of the 2019 IEEE 28th International Symposium on Industrial Electronics (ISIE), Vancouver, BC, Canada, 12–14 June 2019; pp. 2392–2396.
47. Zhu, Y.; Yeung, C.H.; Lam, E.Y. Digital holographic imaging and classification of microplastics using deep transfer learning. Appl. Opt. 2021, 60, A38–A47.
48. Reddy, B.L.; Mahesh, R.N.U.; Nelleri, A. Deep convolutional neural network for three-dimensional objects classification using off-axis digital Fresnel holography. J. Mod. Opt. 2022, 69, 705–717.
49. Mahesh, R.N.U.; Nelleri, A. Deep convolutional neural network for binary regression of three-dimensional objects using information retrieved from digital Fresnel holograms. Appl. Phys. A 2022, 128, 157.
50. Liebling, M.; Blu, T.; Unser, M. Complex-wave retrieval from a single off-axis hologram. J. Opt. Soc. Am. A 2004, 21, 367–377.
51. Jiao, S.; Gao, Y.; Feng, J.; Lei, T.; Yuan, X. Does deep learning always outperform simple linear regression in optical imaging? Opt. Express 2020, 28, 3717–3731.
52. Fan, Y.; Pang, X.; Udalcovs, A.; Natalino, C.; Zhang, L.; Spolitis, S.; Bobrovs, V.; Schatz, R.; Yu, X.; Furdek, M.; et al. Linear Regression vs. Deep Learning for Signal Quality Monitoring in Coherent Optical Systems. IEEE Photonics J. 2022, 14, 8643108.
Figure 1. The 3D objects used in the off-axis digital holographic recording geometry: (a) circle–triangle; (b) square–rectangle; (c) square–pentagon; (d) pentagon–square. Circle: 2 mm in diameter; triangle: 2 mm in x and y directions; square: 2 mm in x and y directions; pentagon: 2 mm in x and y directions; rectangle: 2 mm in x direction and 1 mm in y direction. The distance between the first plane and second plane was 8 mm in the z direction.
Figure 2. Off-axis digital holographic recording geometry used for the recording of holograms of 3D objects. SF: spatial filter assembly; CL: collimation lens; BS: beam splitter; M: mirror; CMOS: camera sensor.
Figure 3. Block diagram of CNN for multi-class classification and multi-output regression.
Figure 4. Representative images of the digital holograms of 3D objects recorded at a distance of $d_3 = 200$ mm: (a) circle–triangle ($b_{d_3}$); (b) square–rectangle ($h_{d_3}$); (c) triangle–rectangle ($n_{d_3}$); (d) triangle–pentagon ($o_{d_3}$); (e) pentagon–triangle ($r_{d_3}$).
Figure 5. (a) Concatenated intensity–phase image of the circle–pentagon ($R_{a_{d_1},\mathrm{INPH}}$) object that belonged to Class-a; (b) concatenated intensity–phase image of the square–rectangle ($R_{h_{d_1},\mathrm{INPH}}$) object that belonged to Class-b.
Figure 6. Reconstructed phase images at a distance of $d_1 = 180$ mm: (a) square–triangle ($R_{g_{d_1},\mathrm{PH}}$); (b) square–pentagon ($R_{k_{d_1},\mathrm{PH}}$).
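For orientation, the snippet below sketches a generic single-FFT Fresnel back-propagation that turns a recorded hologram into a complex field whose magnitude and angle give intensity and phase images of the kind shown in Figures 5 and 6. It is an illustrative stand-in rather than the paper's exact pipeline (the reference list includes the single-hologram complex-wave retrieval of Liebling et al. [50]); the wavelength, pixel pitch, and hologram array are placeholder values.

```python
import numpy as np

def fresnel_reconstruct(hologram: np.ndarray, wavelength: float,
                        pixel_pitch: float, d: float) -> np.ndarray:
    """Single-FFT Fresnel back-propagation of a hologram to distance d.

    Constant amplitude/phase pre-factors are omitted; they do not affect
    the recovered intensity contrast or the relative phase map.
    """
    ny, nx = hologram.shape
    k = 2.0 * np.pi / wavelength
    x = (np.arange(nx) - nx / 2) * pixel_pitch
    y = (np.arange(ny) - ny / 2) * pixel_pitch
    X, Y = np.meshgrid(x, y)
    chirp = np.exp(1j * k * (X**2 + Y**2) / (2.0 * d))  # quadratic phase kernel
    return np.fft.fftshift(np.fft.fft2(hologram * chirp))

holo = np.random.rand(1024, 1024)          # placeholder hologram frame
U = fresnel_reconstruct(holo, wavelength=632.8e-9, pixel_pitch=3.45e-6, d=0.18)
intensity = np.abs(U) ** 2                 # intensity image
phase = np.angle(U)                        # wrapped phase; unwrap before use as depth
```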
Figure 7. Error/accuracy plot for the training/validation sets of the hologram dataset.
Figure 8. Multi-class confusion matrix for the hologram dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
Figure 9. ROCs for the hologram dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
Figure 10. Precision-recall characteristics for the hologram dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
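The per-class curves in Figures 9 and 10 follow the usual one-vs-rest recipe for multi-class problems. A minimal scikit-learn sketch is shown below; the labels and scores are random placeholders rather than the experimental outputs, and whether the authors used this exact plotting code is an assumption.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

rng = np.random.default_rng(0)
class_names = list("abcde")
y_true = rng.integers(0, 5, size=100)            # placeholder class indices
scores = rng.random((100, 5))                    # placeholder per-class probabilities
Y = label_binarize(y_true, classes=[0, 1, 2, 3, 4])  # one-vs-rest binarization

for i, name in enumerate(class_names):
    fpr, tpr, _ = roc_curve(Y[:, i], scores[:, i])
    plt.plot(fpr, tpr, label=f"Class-{name} (AUC = {auc(fpr, tpr):.2f})")
plt.plot([0, 1], [0, 1], "k--")                  # chance diagonal
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```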
Figure 11. Loss and accuracy plot for the concatenated intensity–phase (IP) image dataset.
Figure 12. Multi-class confusion matrix for the whole information dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
Figure 13. ROCs for the whole information dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
Figure 14. Precision-recall characteristics for the whole information dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
Figure 15. Error and accuracy measurements on the training/validation sets for the phase-only image dataset.
Figure 16. Five-class confusion matrix for the phase-only image dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
Figure 17. ROCs for the phase-only image dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
Figure 18. Precision-recall characteristics for the phase-only image dataset: (a) CNN; (b) KNN; (c) MLP; (d) DT; (e) RF; (f) ET.
Figure 19. Loss/MSE/MAE plot for the hologram dataset.
Figure 20. Loss/MSE/MAE plot on the training/validation sets for the whole information dataset.
Figure 21. Loss/MSE/MAE plot for the training/validation sets of the phase-only image dataset.
Table 1. Model summary of the CNN.

| Layer | Input | Output | Number of Parameters |
|---|---|---|---|
| Conv1 | 160 × 160 × 3 | 158 × 158 × 8 | 224 |
| MaxPooling2D | 158 × 158 × 8 | 79 × 79 × 8 | 0 |
| Conv2 | 79 × 79 × 8 | 77 × 77 × 16 | 1168 |
| MaxPooling2D | 77 × 77 × 16 | 38 × 38 × 16 | 0 |
| Conv3 | 38 × 38 × 16 | 36 × 36 × 32 | 4640 |
| MaxPooling2D | 36 × 36 × 32 | 18 × 18 × 32 | 0 |
| Conv4 | 18 × 18 × 32 | 16 × 16 × 64 | 18,496 |
| MaxPooling2D | 16 × 16 × 64 | 8 × 8 × 64 | 0 |
| Fully connected | 4096 | 16 | 65,552 |
| Output | 16 | 5 | 85 |
| Total number of parameters | | | 90,165 |
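A minimal Keras sketch that reproduces the Table 1 summary is given below. The 3 × 3 kernels and layer widths follow directly from the printed shapes and parameter counts; the three-channel 160 × 160 input (implied by Conv1's 224 parameters), the ReLU activations, and the softmax output head are assumptions not stated in the table.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(num_outputs: int = 5) -> tf.keras.Model:
    """CNN matching the layer shapes and parameter counts of Table 1."""
    return models.Sequential([
        layers.Conv2D(8, (3, 3), activation="relu",
                      input_shape=(160, 160, 3)),        # -> 158x158x8, 224 params
        layers.MaxPooling2D((2, 2)),                     # -> 79x79x8
        layers.Conv2D(16, (3, 3), activation="relu"),    # -> 77x77x16, 1168 params
        layers.MaxPooling2D((2, 2)),                     # -> 38x38x16
        layers.Conv2D(32, (3, 3), activation="relu"),    # -> 36x36x32, 4640 params
        layers.MaxPooling2D((2, 2)),                     # -> 18x18x32
        layers.Conv2D(64, (3, 3), activation="relu"),    # -> 16x16x64, 18,496 params
        layers.MaxPooling2D((2, 2)),                     # -> 8x8x64
        layers.Flatten(),                                # -> 4096
        layers.Dense(16, activation="relu"),             # 65,552 params
        layers.Dense(num_outputs, activation="softmax"), # 85 params
    ])

model = build_cnn()
model.summary()   # total parameters: 90,165, as in Table 1
```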
Table 2. Evaluation metrics for the CNN, KNN, MLP, DT, RF, and ET classifiers on the hologram dataset.

| Model | Metric | Class-a | Class-b | Class-c | Class-d | Class-e | Micro Average | Macro Average | Weighted Average | Samples Average |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN | Accuracy | 0.78 | 0.65 | 0.61 | 0.74 | 0.48 | | | | |
| CNN | Precision | 0.00 | 0.00 | 0.38 | 0.00 | 0.00 | 0.13 | 0.07 | 0.11 | 0.13 |
| CNN | Recall | 0.00 | 0.00 | 0.43 | 0.00 | 0.00 | 0.13 | 0.09 | 0.13 | 0.13 |
| CNN | F1-Score | 0.00 | 0.00 | 0.40 | 0.00 | 0.00 | 0.13 | 0.08 | 0.12 | 0.13 |
| KNN | Accuracy | 0.87 | 0.91 | 0.57 | 0.57 | 0.83 | | | | |
| KNN | Precision | 0.00 | 0.00 | 0.00 | 0.22 | 0.00 | 0.20 | 0.04 | 0.05 | 0.09 |
| KNN | Recall | 0.00 | 0.00 | 0.00 | 0.40 | 0.00 | 0.09 | 0.08 | 0.09 | 0.09 |
| KNN | F1-Score | 0.00 | 0.00 | 0.00 | 0.29 | 0.00 | 0.12 | 0.06 | 0.06 | 0.09 |
| MLP | Accuracy | 0.13 | 0.91 | 0.43 | 0.22 | 0.13 | | | | |
| MLP | Precision | 0.13 | 0.00 | 0.43 | 0.22 | 0.13 | 0.23 | 0.18 | 0.27 | 0.23 |
| MLP | Recall | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 0.91 | 0.80 | 0.91 | 0.91 |
| MLP | F1-Score | 0.23 | 0.00 | 0.61 | 0.36 | 0.23 | 0.37 | 0.28 | 0.40 | 0.37 |
| DT | Accuracy | 0.65 | 0.87 | 0.48 | 0.39 | 0.87 | | | | |
| DT | Precision | 0.00 | 0.00 | 0.00 | 0.20 | 0.00 | 0.13 | 0.04 | 0.04 | 0.13 |
| DT | Recall | 0.00 | 0.00 | 0.00 | 0.60 | 0.00 | 0.13 | 0.12 | 0.13 | 0.13 |
| DT | F1-Score | 0.00 | 0.00 | 0.00 | 0.30 | 0.00 | 0.13 | 0.06 | 0.07 | 0.13 |
| RF | Accuracy | 0.65 | 0.87 | 0.83 | 0.65 | 0.87 | | | | |
| RF | Precision | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| RF | Recall | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| RF | F1-Score | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ET | Accuracy | 0.65 | 0.87 | 0.83 | 0.61 | 0.87 | | | | |
| ET | Precision | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ET | Recall | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| ET | F1-Score | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
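The per-class and averaged scores in Tables 2, 4, and 6 correspond to standard scikit-learn computations; the sketch below shows one plausible way to obtain them (whether the authors used these exact calls is an assumption, and the label arrays are placeholders). Note that "samples" averaging applies to a binarized, multilabel view of the same predictions.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support
from sklearn.preprocessing import label_binarize

classes = list("abcde")
y_true = np.array(["a", "c", "c", "d", "e", "b"])  # placeholder labels
y_pred = np.array(["a", "c", "b", "d", "e", "e"])  # placeholder predictions

# Per-class precision/recall/F1 (the Class-a ... Class-e columns)
p, r, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=classes, zero_division=0)

# Micro/macro/weighted averages (the remaining columns)
for avg in ("micro", "macro", "weighted"):
    print(avg, precision_recall_fscore_support(
        y_true, y_pred, labels=classes, average=avg, zero_division=0)[:3])

# "Samples" averaging needs a multilabel (one-hot) view of the same data
Y_true = label_binarize(y_true, classes=classes)
Y_pred = label_binarize(y_pred, classes=classes)
print("samples", precision_recall_fscore_support(
    Y_true, Y_pred, average="samples", zero_division=0)[:3])
```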
Table 3. Computational costs and complexities for the CNN, KNN, MLP, DT, RF, and ET classifiers on the hologram dataset.

| Parameter | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| Floating-point operations (FLOPs) | 45,223,104 | 384,000 | 76,805 | 76,807 | 1,536,000 | 1,536,000 |
| Training time (s) | 6560 | 175 | 190 | 125 | 116 | 119 |
| Test time (s) | 163 | 94 | 80 | 67 | 71 | 78 |
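The training and test times in Tables 3, 5, and 7 can be measured with simple wall-clock timing around fit and predict, as sketched below; the estimator, image size, and split sizes are placeholders (each dataset holds 2268 images in total), and FLOP counting is omitted since it depends on architecture-specific counters.

```python
import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Placeholder data standing in for flattened 160 x 160 images
X_train = np.random.rand(1800, 160 * 160)
y_train = np.random.randint(0, 5, size=1800)
X_test = np.random.rand(400, 160 * 160)

clf = KNeighborsClassifier(n_neighbors=5)

t0 = time.perf_counter()
clf.fit(X_train, y_train)
train_time = time.perf_counter() - t0       # "Training time (s)" column

t0 = time.perf_counter()
clf.predict(X_test)
test_time = time.perf_counter() - t0        # "Test time (s)" column

print(f"training: {train_time:.1f} s, test: {test_time:.1f} s")
```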
Table 4. Evaluation metrics for the CNN, KNN, MLP, DT, RF, and ET classifiers on the whole information dataset.

| Model | Metric | Class-a | Class-b | Class-c | Class-d | Class-e | Micro Average | Macro Average | Weighted Average | Samples Average |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN | Accuracy | 0.91 | 0.61 | 0.74 | 0.74 | 0.61 | | | | |
| CNN | Precision | 0.00 | 0.50 | 0.38 | 0.00 | 0.27 | 0.30 | 0.23 | 0.31 | 0.30 |
| CNN | Recall | 0.00 | 0.11 | 0.75 | 0.00 | 0.75 | 0.30 | 0.32 | 0.30 | 0.30 |
| CNN | F1-Score | 0.00 | 0.18 | 0.50 | 0.00 | 0.40 | 0.30 | 0.22 | 0.23 | 0.30 |
| KNN | Accuracy | 0.74 | 0.83 | 0.91 | 0.61 | 0.83 | | | | |
| KNN | Precision | 0.60 | 0.00 | 0.00 | 0.00 | 0.00 | 0.38 | 0.12 | 0.18 | 0.13 |
| KNN | Recall | 0.43 | 0.00 | 0.00 | 0.00 | 0.00 | 0.13 | 0.09 | 0.13 | 0.13 |
| KNN | F1-Score | 0.50 | 0.00 | 0.00 | 0.00 | 0.00 | 0.19 | 0.10 | 0.15 | 0.13 |
| MLP | Accuracy | 0.30 | 0.83 | 0.09 | 0.26 | 0.17 | | | | |
| MLP | Precision | 0.30 | 0.00 | 0.09 | 0.26 | 0.17 | 0.21 | 0.17 | 0.20 | 0.21 |
| MLP | Recall | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.83 | 0.80 | 0.83 | 0.83 |
| MLP | F1-Score | 0.47 | 0.00 | 0.16 | 0.41 | 0.30 | 0.33 | 0.27 | 0.32 | 0.33 |
| DT | Accuracy | 0.91 | 0.78 | 0.74 | 0.57 | 0.83 | | | | |
| DT | Precision | 0.00 | 0.60 | 0.00 | 0.27 | 0.00 | 0.38 | 0.17 | 0.22 | 0.26 |
| DT | Recall | 0.00 | 0.50 | 0.00 | 0.60 | 0.00 | 0.26 | 0.22 | 0.26 | 0.26 |
| DT | F1-Score | 0.00 | 0.55 | 0.00 | 0.37 | 0.00 | 0.31 | 0.18 | 0.22 | 0.26 |
| RF | Accuracy | 0.91 | 0.74 | 0.70 | 0.78 | 0.87 | | | | |
| RF | Precision | 0.00 | 0.00 | 0.33 | 0.00 | 1.00 | 0.50 | 0.27 | 0.26 | 0.09 |
| RF | Recall | 0.00 | 0.00 | 0.17 | 0.00 | 0.25 | 0.09 | 0.08 | 0.09 | 0.09 |
| RF | F1-Score | 0.00 | 0.00 | 0.22 | 0.00 | 0.40 | 0.15 | 0.12 | 0.13 | 0.09 |
| ET | Accuracy | 0.91 | 0.74 | 0.78 | 0.70 | 0.83 | | | | |
| ET | Precision | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.33 | 0.20 | 0.26 | 0.04 |
| ET | Recall | 0.00 | 0.00 | 0.17 | 0.00 | 0.00 | 0.04 | 0.03 | 0.04 | 0.04 |
| ET | F1-Score | 0.00 | 0.00 | 0.29 | 0.00 | 0.00 | 0.08 | 0.06 | 0.07 | 0.04 |
Table 5. Computational costs and complexities for the CNN, KNN, MLP, DT, RF, and ET classifiers on the whole information dataset.

| Parameter | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| Floating-point operations (FLOPs) | 45,223,104 | 384,000 | 76,805 | 76,807 | 1,536,000 | 1,536,000 |
| Training time (s) | 5012 | 164 | 178 | 134 | 121 | 131 |
| Test time (s) | 139 | 89 | 73 | 62 | 64 | 72 |
Table 6. Evaluation metrics for the CNN, KNN, MLP, DT, RF, and ET classifiers on the phase-only image dataset.

| Model | Metric | Class-a | Class-b | Class-c | Class-d | Class-e | Micro Average | Macro Average | Weighted Average | Samples Average |
|---|---|---|---|---|---|---|---|---|---|---|
| CNN | Accuracy | 0.78 | 0.65 | 0.65 | 0.87 | 0.91 | | | | |
| CNN | Precision | 0.00 | 0.00 | 0.50 | 0.50 | 0.00 | 0.43 | 0.20 | 0.24 | 0.43 |
| CNN | Recall | 0.00 | 0.00 | 1.00 | 0.67 | 0.00 | 0.43 | 0.33 | 0.43 | 0.43 |
| CNN | F1-Score | 0.00 | 0.00 | 0.67 | 0.57 | 0.00 | 0.43 | 0.25 | 0.31 | 0.43 |
| KNN | Accuracy | 0.78 | 0.83 | 0.65 | 0.87 | 0.83 | | | | |
| KNN | Precision | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| KNN | Recall | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| KNN | F1-Score | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| MLP | Accuracy | 0.22 | 0.83 | 0.30 | 0.13 | 0.17 | | | | |
| MLP | Precision | 0.22 | 0.00 | 0.30 | 0.13 | 0.17 | 0.21 | 0.17 | 0.19 | 0.21 |
| MLP | Recall | 1.00 | 0.00 | 1.00 | 1.00 | 1.00 | 0.83 | 0.80 | 0.83 | 0.83 |
| MLP | F1-Score | 0.36 | 0.00 | 0.47 | 0.23 | 0.30 | 0.33 | 0.27 | 0.30 | 0.33 |
| DT | Accuracy | 0.74 | 0.65 | 0.43 | 0.83 | 0.78 | | | | |
| DT | Precision | 0.00 | 0.00 | 0.19 | 0.00 | 0.00 | 0.16 | 0.04 | 0.02 | 0.13 |
| DT | Recall | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.13 | 0.20 | 0.13 | 0.13 |
| DT | F1-Score | 0.00 | 0.00 | 0.32 | 0.00 | 0.00 | 0.14 | 0.06 | 0.04 | 0.13 |
| RF | Accuracy | 0.87 | 0.61 | 0.91 | 0.83 | 0.78 | | | | |
| RF | Precision | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.50 | 0.20 | 0.13 | 0.04 |
| RF | Recall | 0.00 | 0.00 | 0.33 | 0.00 | 0.00 | 0.04 | 0.07 | 0.04 | 0.04 |
| RF | F1-Score | 0.00 | 0.00 | 0.50 | 0.00 | 0.00 | 0.08 | 0.10 | 0.07 | 0.04 |
| ET | Accuracy | 0.87 | 0.65 | 0.87 | 0.83 | 0.78 | | | | |
| ET | Precision | 0.50 | 0.00 | 0.00 | 0.00 | 0.00 | 0.50 | 0.10 | 0.07 | 0.04 |
| ET | Recall | 0.33 | 0.00 | 0.00 | 0.00 | 0.00 | 0.04 | 0.07 | 0.04 | 0.04 |
| ET | F1-Score | 0.40 | 0.00 | 0.00 | 0.00 | 0.00 | 0.08 | 0.08 | 0.05 | 0.04 |
Table 7. Computational costs and complexities for the CNN, KNN, MLP, DT, RF, and ET classifiers on the phase-only image dataset.

| Parameter | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| Floating-point operations (FLOPs) | 45,223,104 | 384,000 | 76,805 | 76,807 | 1,536,000 | 1,536,000 |
| Training time (s) | 5635 | 159 | 181 | 141 | 123 | 134 |
| Test time (s) | 96 | 79 | 68 | 61 | 58 | 67 |
Table 8. Evaluation metrics for the test set of the hologram dataset.

| Metric | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| MAE | 0.46 | 0.30 | 0.48 | 0.34 | 0.35 | 0.32 |
| R2 score | −1.20 | −0.09 | −1.13 | −0.74 | −0.40 | −0.26 |
| EV regression score | −0.83 | −0.01 | 0.00 | −0.64 | −0.27 | −0.13 |
Table 9. Evaluation metrics for the validation set of the hologram dataset.

| Metric | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| MAE | 0.46 | 0.34 | 0.42 | 0.31 | 0.31 | 0.32 |
| R2 score | −1.35 | −0.65 | −1.10 | −0.44 | −0.14 | −0.27 |
| EV regression score | −0.77 | −0.33 | 0.00 | −0.33 | −0.13 | −0.19 |
Table 10. Evaluation metrics for the test set of the whole information dataset.

| Metric | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| MAE | 0.33 | 0.28 | 0.47 | 0.33 | 0.31 | 0.32 |
| R2 score | −0.93 | −0.11 | −1.26 | −1.01 | −0.14 | −0.61 |
| EV regression score | −0.30 | −0.02 | 0.00 | −0.79 | −0.02 | −0.42 |
Table 11. Evaluation metrics for the validation set of the whole information dataset.

| Metric | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| MAE | 0.25 | 0.31 | 0.51 | 0.28 | 0.31 | 0.28 |
| R2 score | −0.17 | −0.36 | −2.01 | −0.45 | −0.13 | −0.23 |
| EV regression score | −0.16 | −0.23 | 0.00 | −0.43 | 0.01 | −0.06 |
Table 12. Evaluation metrics for the test set of the phase-only image dataset.

| Metric | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| MAE | 0.36 | 0.33 | 0.44 | 0.30 | 0.32 | 0.30 |
| R2 score | −0.58 | −0.38 | −0.86 | −0.70 | −0.17 | −0.19 |
| EV regression score | −0.20 | −0.10 | 0.00 | −0.62 | −0.10 | −0.13 |
Table 13. Evaluation metrics for the validation set of the phase-only image dataset.

| Metric | CNN | KNN | MLP | DT | RF | ET |
|---|---|---|---|---|---|---|
| MAE | 0.44 | 0.35 | 0.44 | 0.37 | 0.34 | 0.33 |
| R2 score | −0.79 | −0.43 | −0.88 | −1.51 | −0.41 | −0.56 |
| EV regression score | −0.62 | −0.23 | 0.00 | −1.21 | −0.15 | −0.30 |
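The three regression metrics reported in Tables 8–13 map directly onto scikit-learn's mean_absolute_error, r2_score, and explained_variance_score (the "EV regression score"); a self-contained sketch with placeholder multi-output targets follows.

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, r2_score,
                             explained_variance_score)

# Placeholder multi-output targets and predictions
y_true = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y_pred = np.array([[0.2, 0.8], [0.9, 0.3], [0.7, 0.9], [0.4, 0.1]])

print("MAE:", mean_absolute_error(y_true, y_pred))
print("R2 score:", r2_score(y_true, y_pred))
print("EV regression score:", explained_variance_score(y_true, y_pred))
```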
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
