1. Introduction
The demand for high-speed communication continues to grow, spurring technological innovations to boost efficiency and performance. Wireless communication is often preferred over wired communication because it is easier to set up and allows ubiquitous data transmission without cables. Such innovations have proven crucial in the development of mobile networks. Radio frequency (RF) systems are gradually exhausting their spectrum resources, necessitating the use of a higher frequency range to boost data rates. Many researchers have been working on the millimeter-wave band for the fifth generation of mobile communications (5G), which has the potential to deliver a high data rate (1–10 Gbps). However, it is important to note that the increased frequency could have negative health consequences.
Visible light communication (VLC), light fidelity (LiFi), and optical camera communication (OCC) are new feasible candidates that could replace radio frequency communication. The advantages of VLC/LiFi/OCC compared with RF communication can be described as follows. Provided that dimming and flicker-free technologies are used, light waves have no harmful impact on human health. Ref. [1] shows that optical modulation frequencies up to 200 Hz can be considered without damaging human eyes. Visible light waves have a bandwidth more than 1000 times that of radio waves. Furthermore, visible light sources already exist in the lighting infrastructure of smart houses, hospitals, and automobiles, lowering the cost of adopting VLC/LiFi/OCC systems compared with radio frequency systems.
Consequently, many businesses are devoting a significant portion of their budgets to studying this technology. The IEEE 802.15 tutorials [2] provide an overview of optical wireless communication (OWC) technology. The wide range of protocol complexity and the particular characteristics of OWC are also discussed in [2]. The IEEE 802.15.7a Task Group (TG7a) has divided the system into two categories:
High-rate OCC for IoT applications with four modulation schemes [3], considering a mobility environment.
High-rate OCC for vehicular applications with three modulation schemes [3], considering a high-mobility environment.
In contrast to VLC and LiFi systems, which use photodiodes to receive data, optical camera communication uses a camera as the detector. Refs. [4,5] show that the OCC system performance depends on the camera type. Two types of cameras are available on the market for this purpose: global-shutter and rolling-shutter cameras. With a global-shutter camera, the primary determinant of OCC performance is the camera frame rate, which decides whether the sampling rate meets the Nyquist sampling requirement. With a rolling-shutter camera, the sampling rate is determined by both the camera frame rate and the camera’s rolling rate.
OFDM (orthogonal frequency-division multiplexing) is a modulation technique that encodes digital data on multiple carrier frequencies. It is useful in high-data-rate communication systems because the bandwidth is partitioned into orthogonal sub-carriers, which reduces the distortion caused by inter-symbol interference (ISI). Because the sub-carriers are kept orthogonal by means of the Fourier transform, their spectra can overlap without degrading signal performance. A cyclic prefix is appended to each OFDM symbol to compensate for the distortion.
The camera on-off keying (C-OOK) system, which can maintain a high data rate, was introduced in [6]. IEEE 802.15.7-2018 standardized this method as well. On the other hand, the C-OOK system has several disadvantages, such as a high bit error rate (BER) and a short communication distance. In 2019, we proposed 2D-OFDM (two-dimensional OFDM) based on screen-code OCC in [7]. 2D-OFDM relies on the asynchronous quick-link (A-QL) code to capture the 2D-OFDM signal. The transmission distance of 2D-OFDM is a disadvantage, as it only operates reliably at 3 m [7]. MIMO C-OOK, proposed in [8], combines the matched filter with the MIMO technique for high-rate OCC, but the mobility effect was not considered. We presented a rolling-shutter orthogonal frequency-division multiplexing (RS-OFDM) technique for a high-data-rate OCC system in [9]. By exploiting the camera’s rolling-shutter effect, the OFDM waveform could be recovered from the LED intensity levels in each image. However, that work exposed a weakness under the mobility effect, which is an essential consideration for OCC systems. In this paper, we propose an RS-OFDM scheme with an LED detection method based on deep learning and the YOLOv5 algorithm to achieve high accuracy in real time. With YOLOv5, we can achieve a low bit error rate under the mobility effect. In addition, a deep learning decoder improves OCC performance compared with the traditional approach in [9]. By using deep learning for start-of-frame detection, we can reduce the error in the mobile environment.
The remainder of this study is organized as follows. In Section 2, we present the technical contributions. In Section 3, we review related work on OFDM techniques for OWC systems. Section 4 describes the system architecture of RS-OFDM using deep learning. Section 5 presents the simulation and implementation results of the RS-OFDM system in a mobility environment. Section 6 concludes this study.
3. Related Works
In 1966, the Fourier transform (FT) was applied to banks of sinusoidal carriers on the basis of their orthogonality. To reduce the ISI effect, the cyclic prefix (CP) was suggested in 1969 as an enhancement to OFDM systems. Researchers began to use OFDM for practical wireless communication in 1980. The FT is given in Equation (1).
Because data are transmitted by a light source, OWC signals must be non-negative. The waveform must therefore be pre-processed before it enters the IDFT block. Asymmetrically clipped optical OFDM (ACO-OFDM) and DC-biased optical OFDM (DCO-OFDM) are the two schemes currently developed for Li-Fi systems. On the other hand, the wavelet transform has several advantages over the Fourier transform, including no cyclic-prefix (CP) redundancy, reduced sub-channel interference, and greater spectral separation. The IEEE Standards Association recently defined wavelet OFDM (the wavelet transform is a primary topic of IEEE 1901 for power line communication (PLC)). It has been shown that wavelet OFDM can outperform traditional OFDM. Ref. [10] reports that wavelet OFDM is better than DCO-OFDM in terms of spectrum efficiency, side-lobe suppression, and bit error rate, and [11] proposes wavelet packet approaches for OWC systems and shows that wavelet OFDM has a smaller PAPR (peak-to-average power ratio) and is more robust to channel imperfections. However, wavelet OFDM is difficult and costly to deploy, so OFDM based on the Fourier transform is widely applied in communication systems.
3.1. DCO-OFDM
DCO-OFDM is preferred over ACO-OFDM for IM/DD (intensity modulation/direct detection) techniques by some industrial companies and institutes, such as PureLiFi and the Fraunhofer Heinrich Hertz Institute in Europe, as shown in [12,13], since the DCO-OFDM scheme provides better bandwidth efficiency: all subcarriers are used rather than just the odd subcarriers. DCO-OFDM needs a post-processing step to guarantee that the light source waveform is unipolar. Adding a DC bias to generate a non-negative waveform within the LED operation range is the simplest post-processing approach. Despite its higher bandwidth efficiency, DCO-OFDM has worse BER performance than ACO-OFDM because of the presence of the DC bias.
Prior to arriving at the IDFT block, the OFDM signal X must be constrained to have Hermitian symmetry:
where m is the index of the OFDM elements and N is the size of the OFDM symbol. The signal X is a real signal in the time domain after the IDFT block. The time-domain samples of X are given by:
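In the standard formulation, which is assumed here, the Hermitian constraint and the resulting time-domain samples can be written as shown below. The subscripted symbols X_m and x_k, the sample index k, and the 1/N normalization are notational assumptions.

```latex
% Hermitian symmetry of the IDFT input (assumed notation)
X_{N-m} = X_m^{*}, \quad m = 1, \dots, \tfrac{N}{2} - 1, \qquad X_0 = X_{N/2} = 0

% k-th time-domain sample after the IDFT; the conjugate pairs combine into a real value
x_k = \frac{1}{N} \sum_{m=0}^{N-1} X_m \, e^{\, j 2 \pi k m / N}
    = \frac{2}{N} \sum_{m=1}^{N/2-1} \operatorname{Re}\!\left\{ X_m \, e^{\, j 2 \pi k m / N} \right\},
\quad k = 0, \dots, N-1
```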
The signal is real but still has negative components after this operation. It must then be DC-biased to ensure that it is non-negative as well as real.
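The DCO-OFDM steps described above (Hermitian mapping, IDFT, DC bias) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the bias value, and the clipping of any residual negative samples are illustrative choices.

```python
import numpy as np

def hermitian_map(data_symbols):
    """Build a length-N vector X with X[m] = conj(X[N - m]) (assumed layout)."""
    data_symbols = np.asarray(data_symbols, dtype=complex)
    half = len(data_symbols)                 # number of data-carrying subcarriers
    n = 2 * (half + 1)                       # IDFT size N
    x_freq = np.zeros(n, dtype=complex)      # bins 0 and N/2 stay zero
    x_freq[1:half + 1] = data_symbols        # bins 1 .. N/2 - 1
    x_freq[half + 2:] = np.conj(data_symbols[::-1])  # mirrored conjugates
    return x_freq

def dco_ofdm(data_symbols, bias):
    """Return the biased, non-negative waveform and the raw IDFT output."""
    x_time = np.fft.ifft(hermitian_map(data_symbols))
    # Hermitian symmetry makes the imaginary part vanish (up to rounding).
    biased = x_time.real + bias
    # Illustrative assumption: clip whatever the bias did not lift above zero.
    return np.clip(biased, 0.0, None), x_time
```

For example, `dco_ofdm(symbols, bias=1.0)` with 15 QPSK symbols yields a 32-sample waveform that an LED driver could reproduce, since every sample is real and non-negative.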
3.2. ACO-OFDM
The ACO-OFDM technique is frequently used in Li-Fi systems that employ photodiodes as receivers [14,15]. Unlike DCO-OFDM, ACO-OFDM transmits only over odd subcarriers. In return, the BER performance of ACO-OFDM is more consistent than that of DCO-OFDM. Because null data are placed on the even subcarriers, the clipping noise falls onto those subcarriers and can be separated from the data at the receiving side, as seen in [16]. In ACO-OFDM, data symbols are carried on just the odd subcarriers to guarantee that the transmitted signal after the IDFT is real and non-negative. The IDFT’s input signal, X, consists solely of odd subcarriers. The resulting real signal after the IDFT block is as follows:
where k is the index of the OFDM elements and N is the size of the OFDM symbol.
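The ACO-OFDM idea above can be illustrated with a short sketch. It assumes the usual construction in which data occupy the odd subcarriers of a Hermitian-symmetric vector and the IDFT output is clipped at zero; the function name and return values are illustrative, not taken from this work.

```python
import numpy as np

def aco_ofdm(data_symbols):
    """Map symbols to odd subcarriers, apply the IDFT, and clip at zero."""
    data_symbols = np.asarray(data_symbols, dtype=complex)
    n = 4 * len(data_symbols)                 # IDFT size N
    odd_bins = np.arange(1, n // 2, 2)        # 1, 3, ..., N/2 - 1
    x_freq = np.zeros(n, dtype=complex)       # even subcarriers carry null data
    x_freq[odd_bins] = data_symbols
    x_freq[n - odd_bins] = np.conj(data_symbols)   # Hermitian symmetry
    x_time = np.fft.ifft(x_freq).real
    # Odd-only loading gives x[k + N/2] = -x[k], so clipping loses no information.
    return np.clip(x_time, 0.0, None), odd_bins
```

In this construction the clipping noise lands only on the even subcarriers, and each odd subcarrier of the clipped signal carries half the amplitude of the original data symbol, so the receiver can read the data from the odd bins alone.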
4. System Architecture
4.1. Channel Coding
In digital communication systems, bit errors are detected and corrected via a method called channel coding, commonly referred to as forward error correction (FEC) coding. Both the transmitter and the receiver take part in channel coding. On the transmit side, the encoder adds redundant bits (parity bits) to the raw data before modulation. On the receive side, the decoder uses this redundancy to find and correct errors caused by noise, interference, and fading during transmission. To improve the OCC system performance, channel coding is employed in this paper, as shown in Figure 1.
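The encoder and decoder roles described above can be illustrated with a small block code. The specific code is not named in this section, so the sketch below assumes a Hamming(7,4) code purely for illustration; the matrices and function names are illustrative choices.

```python
import numpy as np

# Systematic Hamming(7,4): 4 data bits followed by 3 parity bits (assumed code).
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])      # generator matrix, 4 x 7
H = np.hstack([P.T, np.eye(3, dtype=int)])    # parity-check matrix, 3 x 7

def encode(data_bits):
    """Transmit side: append parity bits to 4 data bits."""
    return (np.asarray(data_bits, dtype=int) @ G) % 2

def decode(received_bits):
    """Receive side: correct up to one flipped bit and return the 4 data bits."""
    r = np.array(received_bits, dtype=int) % 2
    syndrome = (H @ r) % 2
    if syndrome.any():
        # The syndrome equals the column of H at the position of the error.
        err_pos = int(np.argmax((H.T == syndrome).all(axis=1)))
        r[err_pos] ^= 1
    return r[:4]
```

With this code, any single bit error in a 7-bit codeword is corrected at the receiver, at the cost of three extra bits per four data bits.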
4.2. Hermitian Mapping
The technical features of a rolling-OFDM system are presented in this section. Instead of feeding the data symbols directly into the IDFT block, as in conventional radio frequency OFDM, each symbol must first pass through the Hermitian mapping block. After that, the signal is routed into the IFFT. The sole task of the Hermitian block is to ensure that the IDFT’s output is purely real.
The Hermitian mapping process is described by the equation below.
4.3. Cyclic-Prefix (CP)
The cyclic prefix (CP) is important in the OFDM system for mitigating inter-symbol interference. The length of the CP is determined by the symbol length and the time delay of the channel. The cyclic prefix is generated by copying the tail of the OFDM symbol to the front of the same symbol.
Figure 2 shows the creation of the cyclic prefix and its insertion into the OFDM symbol.
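The copy operation described above can be sketched in a few lines. The function names and the `cp_len` parameter are illustrative assumptions; the CP length actually used in this work is not stated here.

```python
import numpy as np

def add_cyclic_prefix(symbol, cp_len):
    """Prepend the last cp_len samples of the OFDM symbol to its front."""
    symbol = np.asarray(symbol)
    if cp_len == 0:
        return symbol.copy()
    return np.concatenate([symbol[-cp_len:], symbol])

def remove_cyclic_prefix(frame, cp_len):
    """Receiver side: drop the prefix and keep the original symbol."""
    return np.asarray(frame)[cp_len:]
```

For a symbol of N samples, the transmitted frame has N + cp_len samples, and its first cp_len samples repeat the last cp_len samples of the symbol.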
4.4. Pilot
Before transmission, pilots must be added to the signal so that the receiver can estimate the channel. The minimum pilot density and the pilot positions are critical design choices for a system. The pilot spacing used for the OFDM symbol was studied and implemented in [9].
The maximum pilot spacing depends on the OFDM symbol’s length N, the spacing between subcarriers, the OFDM system’s bandwidth, the spatial sampling period, and the time delay between spatial samples.
The spacing between two adjacent pilots should be small enough for suitable interpolation performance. It should be emphasized, however, that the accuracy of the estimation is not proportional to the number of pilots. When the channel is over-estimated, pilots placed too closely may compromise performance.
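The role of pilot spacing in interpolation can be illustrated with a small sketch. It assumes a comb-type pilot layout, a least-squares estimate at the pilot positions, and linear interpolation in between; these choices, the function names, and the toy channel are illustrative and are not taken from [9].

```python
import numpy as np

def pilot_positions(n_subcarriers, spacing):
    """Comb layout: one pilot every `spacing` subcarriers, plus the last one."""
    idx = np.arange(0, n_subcarriers, spacing)
    if idx[-1] != n_subcarriers - 1:
        idx = np.append(idx, n_subcarriers - 1)   # avoid extrapolating at the edge
    return idx

def estimate_channel(rx, pilot_idx, pilot_value=1.0):
    """Least-squares estimate at the pilots, linear interpolation elsewhere."""
    rx = np.asarray(rx, dtype=complex)
    h_pilot = rx[pilot_idx] / pilot_value
    k = np.arange(len(rx))
    return (np.interp(k, pilot_idx, h_pilot.real)
            + 1j * np.interp(k, pilot_idx, h_pilot.imag))

# Toy channel that varies slowly (linearly) across 32 subcarriers.
n = 32
h_true = (1.0 - 0.01 * np.arange(n)) + 1j * 0.005 * np.arange(n)
idx = pilot_positions(n, spacing=4)
tx = np.ones(n, dtype=complex)        # pilots (and dummy data) all equal to 1
h_est = estimate_channel(h_true * tx, idx)
```

When the channel varies slowly relative to the pilot spacing, as in this toy case, interpolation recovers it well; a channel that changes faster between pilots would need a smaller spacing.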
4.5. Deep Learning for LED Detection
In OCC systems, region-of-interest (RoI) algorithms are well established. Most RoI methods use object-based and feature-based detection approaches with real-time processing for object detection. As previously noted, with the rolling-shutter effect, LEDs appear in images as strips whose intensity corresponds to the OFDM waveform. Each image includes many strips, which complicates RoI detection, particularly under the mobility effect. Deep learning neural networks are well-established methods for computer vision applications, e.g., object detection, image classification, localization, and image reconstruction. Convolutional neural networks (CNNs) have emerged as promising candidates for deep learning-based computer vision applications. The CNN-based YOLO algorithm is a state-of-the-art, real-time object detection system. This paper proposes the customization and training of YOLO models for LED detection and tracking, considering the rolling-shutter and mobility effects.
To verify the performance of our approach, the experimental dataset was collected from real scenes. We recorded daytime and nighttime video footage with the mobility effect and generated 2500 blurry and clear images at different exposure times. Using these images as the dataset, we labeled the images and trained the YOLOv5 model, which was modified with 5/7 convolution layers. Only one detection class was included. The corresponding number of filters in the final convolution layers was 38. The average loss of the YOLOv5 model was about 0.09 after 10,000 training epochs, with a low processing time of 0.0985 s per epoch (NVIDIA GeForce GTX 1050).
4.6. Starting Point of OFDM Frame Detection
Van de Beek et al. suggested a joint time-frequency synchronization approach that uses the CP to compute a maximum likelihood function for estimating the time and frequency offsets [17,18]. By exploiting the properties of the cyclic prefix, the start of the OFDM frame can also be detected. In [9], we applied the Van de Beek algorithm to detect the frame of the RS-OFDM symbol in real time for the OCC system. However, as mentioned above, the mobility channel affects the OCC system in several ways (blur effect and SNR reduction, among others), so the conventional algorithm does not perform well for the RS-OFDM system. In this paper, we propose deep learning for detecting the start of the OFDM frame, as shown in Figure 3. We use a basic deep learning neural network model with two hidden layers to avoid overfitting; with five or more hidden layers, the model overfits and the accuracy on the test dataset is undermined. Following preamble detection, we can accurately detect the start of a frame of OFDM signals, improving the OCC system performance compared with matched filter technology in a mobility environment. The dataset was collected from several cases (different distances and velocities) with a rolling-shutter camera. The results of our proposed approach compared with the conventional scheme in [9] are shown in Section 5.
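For context, the conventional CP-based detection that the deep learning detector is compared against can be sketched as follows. This is a simplified, time-offset-only version of the maximum likelihood metric of Van de Beek et al. for a real-valued intensity waveform; the function names, the default `rho = 1.0` (the noise-free limit of SNR/(SNR + 1)), and the synthetic test signal are illustrative assumptions, not the implementation of [9].

```python
import numpy as np

def cp_timing_metric(rx, n_fft, cp_len, rho=1.0):
    """ML-style timing metric: |gamma(theta)| - rho * Phi(theta)."""
    rx = np.asarray(rx, dtype=float)
    n_pos = len(rx) - n_fft - cp_len + 1
    metric = np.empty(n_pos)
    for theta in range(n_pos):
        head = rx[theta:theta + cp_len]                     # candidate CP
        tail = rx[theta + n_fft:theta + n_fft + cp_len]     # samples N later
        gamma = abs(np.dot(head, tail))                     # correlation term
        energy = 0.5 * (np.dot(head, head) + np.dot(tail, tail))
        metric[theta] = gamma - rho * energy
    return metric

def detect_frame_start(rx, n_fft, cp_len, rho=1.0):
    """Return the offset at which the CP best matches the symbol tail."""
    return int(np.argmax(cp_timing_metric(rx, n_fft, cp_len, rho)))

# Synthetic check: one CP-prefixed symbol embedded in random samples.
rng = np.random.default_rng(1)
n_fft, cp_len, offset = 64, 16, 37
symbol = rng.normal(size=n_fft)
frame = np.concatenate([symbol[-cp_len:], symbol])
rx = np.concatenate([rng.normal(size=offset), frame, rng.normal(size=50)])
start = detect_frame_start(rx, n_fft, cp_len)
```

The metric peaks where the CP and the symbol tail coincide. Blur and reduced SNR weaken that correlation, which is the limitation in a mobility environment noted above.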
4.7. Packet Structure
The packet format of the proposed scheme, shown in Figure 4, is discussed in this section. The proposed data frame structure contains many data packets. Every packet comprises multiple data sub-packets (DSs) to provide compatibility with frame rate variation. Each DS contains payload data and a sequence number (SN), and all DSs in the same packet carry the same payload and the same SN. The SN represents the serial number of the packet. Two scenarios can be distinguished based on the transmitter’s packet rate and the camera’s frame rate. In the case of undersampling, the frame rate is lower than the packet rate of the transmitter (LED). In the case of oversampling, the frame rate is several times higher than the transmitter’s packet rate. The transmitter’s packet rate is defined as the number of packets carrying distinct payloads that are sent across the transmission medium in a given time period (for example, 20 packets/s). The SN conveys a data packet’s sequence information, which helps the receiver determine the arrival of a new payload in the oversampling case and detect lost payloads in the undersampling case. The SN length can be adjusted depending on the OCC system parameters and the optical channel; the number of missing packets that can be detected grows as the SN length increases.
Figure 5 illustrates how to apply the SN to the two scenarios: oversampling and undersampling.
4.7.1. Undersampling
The undersampling case occurs if the frame rate drops below the transmitter’s packet rate. Some payloads are lost in this case.
Figure 5 depicts the use of the SN to detect a missed payload. The SN can identify the missed payload if the SN length is long enough. The data frame obtained from payload n − 1 carries the SN n − 1. The SN expected in the following data frame is n; however, the data frame actually received has an SN of n + 1. The loss is identified by comparing the SNs of two adjacent data sub-packets, which shows that payload n was missed. The number of distinguishable states depends on the SN’s length. For example, the sequence number can detect up to seven missing payloads of transmitted packets if the SN length is 3 bits. Once errors are recognized, it is much easier to correct them. An error is flagged when two successive packets have non-consecutive SNs (n − 1 and n + 1), as shown in Figure 5.
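The SN comparison described above can be sketched as follows. The sketch assumes that the SN is a counter that wraps around modulo 2^(SN length) and that every captured frame belongs to a different packet (pure undersampling); the function name is illustrative.

```python
def missed_payloads(prev_sn, curr_sn, sn_bits):
    """Number of payloads lost between two successively received SNs."""
    modulus = 1 << sn_bits                  # e.g., 8 states for a 3-bit SN
    return (curr_sn - prev_sn - 1) % modulus
```

With a 3-bit SN, receiving SN n + 1 directly after SN n − 1 gives a result of 1 (payload n was missed), and the largest detectable gap is seven payloads, consistent with the example above.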
4.7.2. Oversampling
When a rolling-shutter camera’s frame rate exceeds (is at least double) the transmitter’s packet rate, every data packet is sampled at least twice (i.e., in two images). The receiver therefore obtains the same packet more than once, which can cause confusion when packets are merged. The SN is added to each DS to help the receiver reduce the influence of the camera’s frame rate variation. Each packet contains identical DSs, which helps the receiver remove redundant data. When a DS is received, the receiver checks whether its SN is compatible with the previous one. As seen in Figure 5, the receiver discards consecutive packets with the same SN and merges packets with successive SNs (n − 1, n, n + 1).
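The discard-and-merge rule for the oversampling case can be sketched as follows. The representation of a received sub-packet as an `(sn, payload)` tuple and the function name are illustrative assumptions.

```python
def merge_oversampled(sub_packets):
    """Keep one payload per run of equal SNs, in arrival order.

    sub_packets: iterable of (sn, payload) tuples as decoded from images.
    """
    payloads = []
    prev_sn = None
    for sn, payload in sub_packets:
        if sn == prev_sn:
            continue                 # same SN as before: duplicate capture, discard
        payloads.append(payload)     # new SN: a new payload has arrived
        prev_sn = sn
    return payloads
```

For instance, the sequence of captures with SNs 1, 1, 2, 2, 3 is reduced to three payloads, one for each distinct consecutive SN.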