1. Introduction
In today’s era, there is an enormous demand for high data rates, bandwidth, and capacity for live streaming, social networking, and video conferencing [1]. Radio frequency technology has been widely used for terrestrial communication, but it has limited bandwidth and suffers from interference, which decreases capacity and security. Radio frequency links offer lower data rates, need large antennas, and are highly susceptible to multipath fading. Free space optics (FSO) is considered a strong candidate to complement and integrate with next generation technologies and is currently regarded as one of the key technologies for solving the last mile access problem of connecting the optical fiber backbone to user terminals. It may offer high-speed broadband access in regions where fiber optic cabling is impractical or costly. The challenge is to deliver a high speed, stable, and secure communication service from core network infrastructure (e.g., central office, local exchange) to remote sites (e.g., businesses, offices, individual homes, hilly areas, or other end-user locations) [2]. Unrestricted spectrum access, low latency, the absence of radio frequency and electromagnetic interference, and the possibility of quick and easy installation are some of the main features of this technology.
As FSO uses air as the transmission medium to carry the laser pulses, there is no need to dig up the ground, as in the case of optical fiber [3,4]. Thus, utilizing this technology for last mile communication or in difficult terrain can significantly contribute to the connectivity between remote rural areas and cities. It also finds applications in the interconnection of high-rise buildings in urban areas and of office blocks or structures on opposite sides of highways, railways, or rivers that are difficult to bridge.
The technology faces many challenges, such as scintillation, turbulence, geometric losses, and attenuation due to different weather conditions. Random variations in temperature and pressure cause the refractive index of air to fluctuate, causing atmospheric turbulence. Scintillation may result in rapid fluctuations in the phase and intensity of light as it propagates through the atmosphere [5]. Geometric losses arise from the spreading of the optical beam, caused by diffraction during propagation through the atmosphere, and from misalignment of the transmitter and receiver. Performance estimation could be useful for applications including FSO link beam tracking and localization. These losses impact the received signal strength and may be considered as constant factors for a fixed link configuration [6]. Weather conditions, like fog, haze, and clouds, may heavily attenuate the signal by absorbing and scattering the information-carrying photons, affecting link availability [7].
Dual polarization comprises two distinct, orthogonally polarized optical signals at the same wavelength. Splitting the data across the two polarizations means that each polarization may need to carry only half the data rate. Channels can be spaced closer together by utilizing dual polarization quadrature phase shift keying (DP-QPSK) as the modulation technique, thus reducing the bandwidth needed to transmit the optical signal [8].
In wavelength division multiplexing (WDM), signals at different wavelengths share the same transmission medium, thus improving the capacity of the transmission. Information rates and spectral efficiency can be increased by employing mode division multiplexing (MDM), with parallel transmission of information through different modes within the same frequency band [9]. Further capacity enhancement is achieved by incorporating MDM along with WDM in an FSO system, utilizing both spatial modes and multiple wavelengths over free-space channels [10,11]. DP-QPSK combined with WDM and MDM enables the simultaneous transmission of numerous independent data streams at different wavelengths, polarizations, and modes. This may enable a very high data rate throughput and capacity enhancement of optical networks.
In the literature, several techniques have been proposed to improve the spectral efficiency, optical signal to noise ratio (OSNR), capacity, and data rate of the FSO link [12,13,14,15,16,17]. The authors of [12] modeled the cloud-induced attenuation of the Tokyo and Sendai regions of Japan by deriving its probability distribution function using curve-fitting techniques. An experimental evaluation of an FSO link under fog and haze has been carried out under controlled conditions employing the Kim, Kruse, and Al Naboulsi prediction models in [13]. The simulation results have been compared with experimental data, exhibiting the same trend of decreasing received voltage levels with increasing intensity of fog or haze. The authors of [2] analyzed the attenuation resulting from fog in hilly regions of India. The PSK modulation technique results in better performance of the FSO link at a data rate of 40 Gbps and a transmission wavelength of 1550 nm.
A bidirectional WDM-FSO link has been investigated employing 16-quadrature amplitude modulation (16-QAM), orthogonal frequency division multiplexing (OFDM), and on–off keying (OOK) modulation formats, considering the gamma-gamma channel model in [14]. A data rate of 320 Gbps has been achieved over a transmission range of 1000 m for possible data center applications. A 16 × 10 Gbps FSO communication link has been analyzed for fog-induced attenuation in diverse regions by employing WDM with the OFDM technique [15]. A digital signal processing (DSP) module has been incorporated to mitigate channel-induced limitations, achieving a link range up to 10.75 km while maintaining a bit error rate of 10⁻⁹.
The authors of [16] estimated the prediction accuracy of received signal strength indicators (RSSI) on the receiver side of an FSO link by considering real-time, non-linear atmospheric conditions in a maritime environment. Different models proposed in the literature predict the attenuation of the link based on the specific weather conditions of a region. Multi-impairments (fog, clouds, and haze) are time varying and depend on atmospheric and environmental variables.
Regions with diverse meteorological conditions should be considered for analysis in order to evaluate the performance of the FSO link at high data rates under multi-impairment conditions. Delhi, Washington, Cape Town, and London are metropolitan cities in distinct meteorological regions; they have high population densities and serve as hubs of information technology. These regions need better means of communication with higher achievable data rates under different impairments. Network planners must calculate the link length between the transmitter and receiver, link availability, outage probability, network dependability, and other factors based on the power margin/loss budget. It is therefore important to investigate the performance of the link under the aggregate attenuation caused by different weather conditions.
In the present work, data on fog, haze, and cloud climate conditions of four distinct geographical regions have been analyzed to derive a unified empirical model for the prediction of attenuation at optical wavelengths caused by multi-impairments in any region. The DP-QPSK modulation technique with Laguerre Gaussian (LG 00 and LG 01) modes has been employed to investigate the performance of the WDM-MDM-based FSO link under fog, haze, and cloud weather conditions. Accurate attenuation predictions may help determine the required transmitter power, antenna alignment, receiver sensitivity, and other system parameters to achieve reliable communication. Further, performance estimation of the system has been achieved by employing random forest (RF), k-nearest neighbors (KNN), multi-layer perceptron (MLP), and gradient boosting (GB) machine learning (ML) techniques. The rest of the paper is organized as follows: Section 2 defines the different attenuation prediction models. The empirical model and its performance evaluation are described in Section 3. Section 4 details the system design, and the results of the analysis are included in Section 5.
5. Results and Discussion
In this section, the performance of the proposed MDM-WDM FSO link employing DP-QPSK modulation is investigated under real weather conditions. The signal to noise ratio (SNR) of the received signal has been analyzed and compared as a function of transmission range for different channels, considering diverse geographical regions. Next, the received power has been obtained as a function of the propagation range for the four regions, and ML techniques have been employed for SNR estimation of the received signal.
Figure 8 depicts the comparative analysis of attenuation due to different weather conditions in the four regions. It can be observed that clouds and haze cause a maximum attenuation of 10.06 and 4.84 dB/km in the case of New Delhi and London regions, respectively. Fog is the main contributing factor affecting visibility in Washington and Cape Town regions. Maximum attenuation has been reported in the New Delhi region due to cloud and haze weather conditions in December, February, and January months, respectively.
Next, a unified empirical model has been developed for the prediction of overall attenuation of a region due to different weather conditions at 1550 nm. To describe the relationship between a set of visibility values and average attenuation, a least squares method has been utilized. The performance of the model has been measured using R squared and mean square error (MSE) as performance metrics.
R squared measures the proportion of the variance of the dependent variable that can be predicted from the independent variable. It indicates the goodness of fit of the regression model and is defined as [18]:
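In its standard form, which is assumed here to be the one intended, the coefficient of determination can be written as follows. The symbols are illustrative notation rather than the original's: $y_i$ is the observed attenuation, $\hat{y}_i$ the value predicted by the regression model, $\bar{y}$ the mean of the observations, and $n$ the number of observations.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```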
Mean square error is the sum of the squared differences between the predicted and observed values divided by the number of observations. MSE measures the spread of the observations about the fitted curve and is given as [18]:
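Assuming the standard definition is meant, the mean square error can be written as follows. The notation is illustrative: $y_i$ is the observed value, $\hat{y}_i$ the predicted value, and $n$ the number of observations.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
```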
Table 5 depicts the regression models for predicting attenuation (dB/km) as a function of visibility for the different regions, along with the corresponding R squared and MSE values as performance metrics. Values of R squared fall between 0 and 1; the higher the value, the better the model fits the data. MSE is well suited to regression problems because squaring the errors penalizes large deviations severely. The maximum and minimum attenuation have been predicted for the Delhi and Washington regions, respectively. In the case of the Delhi region, there is a better correlation between visibility and attenuation for the data set with different weather conditions. A moderate value of R squared may indicate that the model is not over-fitted, so that it performs better on unseen values of visibility. The performance metrics and generalization of the empirical models may be further improved by increasing the number of values in the data set and by incorporating more regions. The minimum mean square error and the maximum coefficient of determination have been reported for the models developed for the Washington and Delhi regions, respectively.
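The least squares procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the fitting code used for Table 5: it assumes a straight-line model of attenuation against visibility (the actual model forms are those listed in Table 5), the function names `fit_line` and `goodness_of_fit` are hypothetical, and the visibility and attenuation pairs are made-up values, not data from the four regions.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def goodness_of_fit(y, y_hat):
    """Return (R squared, MSE) of predictions y_hat against observations y."""
    n = len(y)
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual sum of squares
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    return 1.0 - ss_res / ss_tot, ss_res / n


# Hypothetical visibility (km) and attenuation (dB/km) pairs, for illustration only.
visibility_km = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
attenuation_db_km = [9.5, 7.8, 5.9, 4.1, 3.2, 1.6]

intercept, slope = fit_line(visibility_km, attenuation_db_km)
predicted = [intercept + slope * v for v in visibility_km]
r2, mse = goodness_of_fit(attenuation_db_km, predicted)
print(f"attenuation ~= {intercept:.2f} + ({slope:.2f}) * visibility")
print(f"R squared = {r2:.3f}, MSE = {mse:.3f}")
```

The same two metrics can be computed for any other model form by replacing the line that builds `predicted`.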
Table 6 and Table 7 list the simulation parameters used in the OptiSystem 20.0 software and the transmission wavelengths of the different channels, respectively.
Figure 9a displays the SNR versus range for the New Delhi region with the constellation diagram obtained at a transmission range of 2 km considering channel 9 and channel 10. It can be observed from the graphs that SNR varies from 54.7 dB to 34.73 dB and 52.52 dB to 34.62 dB for Ch 10 and Ch 9, respectively.
Figure 9b displays the SNR values for the Cape Town region depicting channel 1 and channel 2 along with the constellation diagrams observed at a transmission range of 2 km. A 3 dB higher value for SNR has been achieved in the case of the LG 01 mode at a wavelength of 1550 nm.
Figure 10a displays the SNR versus range for the Washington region with the constellation diagram obtained at a transmission range of 2 km considering channel 9 and channel 10. It can be observed from the graphs that SNR varies from 52.84 dB to 42.26 dB and 54.86 dB to 42.94 dB for Ch 9 and Ch 10, respectively.
Figure 10b displays the SNR values for the London region depicting channel 1 and channel 2 along with the constellation diagrams observed at a transmission range of 2 km. A 3 dB higher value for SNR has been achieved in the case of the LG 01 mode at a wavelength of 1550 nm. As the range increases, the signal to noise ratio decreases due to geometric, absorption, and scattering losses.
Figure 11a shows the scatter plot of SNR versus range for the different geographical regions. It may be observed from the figure that the SNR of the New Delhi region falls off linearly by 18 dB with an increase in the propagation range from 200 to 2000 m. SNR values above 38 dB have been reported for the other three regions over the observed propagation range. Minimum and maximum SNRs have been obtained in the case of the Delhi and Washington regions, respectively.
Figure 11b depicts the received power as a function of the transmission range for the four regions. It may be observed from the figure that the received power decreases linearly by 20.79 dB in the case of the Delhi region with an increase in the propagation range from 200 to 2000 m. A minimum received power of about −45 dBm has been obtained over the observed transmission range in the case of the other three regions due to the effect of multi-impairments.
Further, different ML techniques (RF, KNN, MLP, GB) have been employed to assess the performance of the DP-QPSK-based FSO link using a Jupyter Notebook in Python 3.0. The transmission range, wavelength, average attenuation, and LG mode are the input features, and SNR is the modeling target of the ML models. Initially, the entire set of observations has been split into two subsets, with 80% of the data used for training and 20% for testing. The models have been trained using the first subset and tested using the second. The models develop relationships by learning from the features and target values of the training data and then predict the SNR value for unseen data. Scaling has been performed beforehand to bring the features of the data set to similar ranges.
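The split and scaling steps can be sketched as follows. This is an illustrative outline in plain Python rather than the notebook code used in this work: the helper names are hypothetical, the sample values are invented, the seed is arbitrary, and standardization to zero mean and unit variance is an assumption, since the type of scaling is not stated. The scaler is fitted on the training subset only and then applied to both subsets.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into (train, test) lists."""
    rng = random.Random(seed)  # the seed value is an arbitrary choice
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(rows):
    """Column-wise mean and standard deviation of the training features."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    # A constant column has zero spread; use 1.0 to avoid division by zero.
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(columns, means)]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize each feature column with the training statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical samples: ([range_m, wavelength_nm, attenuation_db_km, lg_mode], snr_db)
samples = [([200.0 * k, 1550.0, 4.0, float(k % 2)], 55.0 - 2.0 * k)
           for k in range(1, 11)]

train, test = train_test_split(samples)              # 80% / 20%
means, stds = fit_scaler([features for features, _ in train])
x_train = apply_scaler([features for features, _ in train], means, stds)
x_test = apply_scaler([features for features, _ in test], means, stds)
y_train = [snr for _, snr in train]
y_test = [snr for _, snr in test]
print(len(x_train), "training samples,", len(x_test), "test samples")
```

Fitting the scaler on the training subset alone keeps information from the test subset out of the training stage.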
In RF, multiple decision trees are built and their outputs are averaged to give the final prediction. This may reduce the prediction error, since averaging over the ensemble smooths out the errors of the individual trees. The number of random variables considered at each split of a single decision tree is set to 42. A total of 100 trees has been used as a further hyperparameter of the RF algorithm. In the KNN algorithm, an optimal K value of 5 has been chosen, and the prediction for a new data point is made by selecting the K closest neighbors based on the distance between the points in the training set and the new point. For classification, the predicted class is the majority class among the K neighbors; for a continuous target such as SNR, the prediction is obtained from the target values of the K neighbors, typically by averaging. MLP incorporates input, hidden, and output layers of neurons to improve the input–output mapping capabilities. Two hidden layers with 100 and 50 neurons, respectively, have been used in our work. The MLP algorithm updates the model’s weights using an optimizer together with back propagation; stochastic gradient descent (SGD) has been used to minimize the loss function in the MLP model. In the GB method, an ensemble of weak learners is trained successively, with each weak learner attempting to correct the errors of the preceding learners. GB converges within a small number of iterations, and the model performs well on both training and test data, indicating low bias.
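As an illustration of the KNN step with K = 5, the sketch below predicts a continuous target by averaging the targets of the nearest neighbors. It is a simplified stand-in, not the implementation used in this work: the function name `knn_predict`, the Euclidean distance metric, the unweighted average, and the one-feature toy data are all assumptions made for the example.

```python
import math


def knn_predict(x_train, y_train, x_new, k=5):
    """Predict the target of x_new as the mean target of its k nearest neighbors."""
    # Pair each training target with the Euclidean distance to the new point.
    by_distance = sorted((math.dist(x, x_new), y) for x, y in zip(x_train, y_train))
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical one-feature data: a scaled range value against SNR in dB.
x_train = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y_train = [50.0, 48.0, 46.0, 44.0, 42.0, 40.0, 38.0]

print(knn_predict(x_train, y_train, [3.0], k=5))  # mean of 44, 46, 42, 48, 40 -> 44.0
```

Because the prediction depends on distances, the features should be on comparable scales before the neighbors are selected.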
The objective is to estimate the performance of the ML models in terms of SNR for different input feature values.
Table 8 depicts the performance metrics employed to analyze the performance of the RF, KNN, MLP, and GB ML techniques. The MSE, mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R squared) have been determined. The GB technique is able to handle overfitting and noise and works well with a small data set. The highest R squared (0.99) and the lowest MSE (0.11), MAE (0.25), and RMSE (0.33) values have been reported in the case of the GB algorithm, indicating close agreement between the predictions and the test data. ML models thus perform well in the estimation and prediction of the performance of the system.
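The error metrics of Table 8 are related in a simple way, which the following sketch makes explicit. It is a minimal example under stated assumptions: the function name `regression_errors` is hypothetical and the SNR values are invented, not taken from Table 8. R squared is left out because it is defined earlier in this section.

```python
import math


def regression_errors(y_true, y_pred):
    """Return MSE, MAE, and RMSE of predictions against observed values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    return {"mse": mse, "mae": mae, "rmse": rmse}


# Hypothetical observed and predicted SNR values in dB.
observed = [40.0, 42.0, 44.0, 46.0]
estimated = [41.0, 42.0, 43.0, 48.0]

errors = regression_errors(observed, estimated)
print(errors)
```

Since RMSE is the square root of MSE, the reported GB values are mutually consistent: the square root of 0.11 is approximately 0.33.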