### *4.1. System Hardware*

The system incorporates the following hardware components, as shown in Figure 6: a TFmini Plus LiDAR, an ultrasonic sensor, a Bluetooth module, an Arduino Uno, and the user's smartphone. A servo, a buzzer, and a power bank are also used to operate the device.

**Figure 6.** System Hardware.

Figure 7 displays a photo of the LidSonic V2.0 device, which comprises smart glasses with sensors and an Arduino Uno board. The Arduino Uno microcontroller is the central nervous system of the LidSonic V2.0 device: it integrates and manages the sensors and actuators and transfers sensor data to the smartphone app through Bluetooth. It is configured to control how the servo, sensors, and other components interact. The LiDAR unit contains the TFmini Plus LiDAR sensor [135], which is connected with a servo and a laser as a single assembly. The laser beam is installed above the TFmini Plus LiDAR and indicates where the LiDAR is pointed, so that various objects can be scanned and categorized in order to build a valid dataset. The data collected by the TFmini Plus LiDAR from its spatial environment are transferred to the Arduino unit. Some of this information is used by the Arduino to detect obstacles and activate the buzzer as needed, while the rest is relayed through Bluetooth to the smartphone app. The servo controls the movement of the unit comprising the TFmini Plus LiDAR and the laser. The ultrasonic sensor is capable of detecting a wide range of obstructions. It is also utilized to compensate for the TFmini Plus LiDAR's inadequacies by recognizing transparent obstructions on the route of visually impaired people. The ultrasonic sensor detects objects within a 30-degree angle and has a detection range of 0.02 m–4.5 m [136]. The Arduino unit analyzes the data from the ultrasonic sensor, and if an item is detected, the buzzer is actuated. The buzzer sounds distinct tones to notify visually impaired persons of the different sorts of objects detected by the sensors. The buzzer and its tone frequencies are controlled by the Arduino based on the identified items. The smartphone app's hardware comprises a microphone for user instructions, Bluetooth for interfacing with the LidSonic V2.0 device, and speakers for vocal feedback regarding the detected objects.

**Figure 7.** (**a**) The LidSonic V2.0 device and glasses; (**b**) the LidSonic V2.0 device mounted on a glasses frame.

### 4.1.1. TFmini-S LiDAR

A LiDAR uses a light pulse emitted by a laser diode. The light strikes an object and is reflected; a sensor detects the reflected light and determines the time of flight (ToF). The TFmini-S is a high-performance, single-point, short-range LiDAR sensor manufactured by Benewake, based on the OPT3101 long-range proximity and distance sensor analog front end (AFE) and the ToF principle [137]. The TFmini-S LiDAR communicates over UART (TTL)/I2C, can be powered by a conventional 5 V supply, and has a total power consumption of 0.6 W.

The TFmini-S LiDAR has a refresh rate of 1000 Hz and a measurement range of 10 cm to 12 m. It provides ±6 cm accuracy between 0.1 m and 6 m, and 1 percent accuracy between 6 m and 12 m. The operational temperature range is approximately 0 °C to 60 °C, and the beam angle is 3.5° [138]. Data from the TFmini-S LiDAR can be collected quickly and precisely. The LiDAR introduces no geometric distortions and can be used at any time of day or night [138]. When no object is identified within the 12 m range, the sensor returns a value of 65,535.

The TFmini-S has the advantages of low cost, small volume, low energy consumption, and multiple interfaces to satisfy various requirements, but it has the disadvantage of not being able to detect transparent objects, such as glass doors (we used an ultrasonic sensor to compensate for this). It provides stable, accurate, sensitive, and high-frequency ranging outdoors across surfaces with various degrees of reflectivity. Few studies have been conducted on the utilization of LiDAR to assist the visually impaired and identify their needs. The devices that aid the visually impaired typically make use of very expensive, Linux-based LiDARs [139].

### 4.1.2. Ultrasonic Sensor

An ultrasonic sensor is one of the best tools for detecting obstacles because of its low price, low energy consumption, sensitivity to practically all types of objects [40], and the fact that ultrasonic waves can be transmitted over a range of 2 cm to 300 cm. Furthermore, ultrasonic sensors can detect objects in darkness, dust, smoke, electromagnetic interference, and harsh environments [140].

In an ultrasonic sensor, a transducer transmits and receives ultrasonic pulses that carry information about the distance between an object and the sensor; a single ultrasonic unit both sends and receives the signals [41]. The HC-SR04 ultrasonic sensor has a <15° effective angle, a resolution of 0.3 cm, an operating frequency of 40 kHz, and a measurement angle of 30°. The range of ultrasonic sensors is reduced when the waves are reflected off smooth surfaces at a low angle of incidence or when the beam opening is narrow. Optical sensors, on the other hand, are unaffected by these issues; their shortcomings are that they are sensitive to natural ambient light and rely on the optical properties of the object [37]. Ultrasonic sensors are often employed in industrial systems to calculate object distance and flow velocity. ToF is the time required for an ultrasonic wave to travel from the transmitter to the receiver after being reflected by an object. Equation (1) can be used to calculate the distance d from the transmitter, where c is the velocity of sound [141]:

$$\mathbf{d} = [\mathbf{c} \times (\text{ToF})]/2 \tag{1}$$
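As a minimal illustration of Equation (1) in Python (the actual processing runs on the Arduino), the helper below converts a measured round-trip time of flight into a one-way distance; the function name and speed-of-sound constant are illustrative, not taken from the LidSonic firmware.

```python
# Minimal sketch of Equation (1): distance from ultrasonic time of flight.
# The helper name and the speed-of-sound constant are illustrative.

SPEED_OF_SOUND_M_PER_S = 343.0  # approximate speed of sound in air at ~20 °C

def distance_from_tof(tof_seconds: float) -> float:
    """Return the one-way distance in metres given the round-trip ToF."""
    return (SPEED_OF_SOUND_M_PER_S * tof_seconds) / 2.0

# Example: a 5.8 ms round trip corresponds to roughly 1 m.
print(round(distance_from_tof(0.0058), 2))  # ~0.99 m
```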

Ultrasonic sensors outperform infrared sensors and lasers in some respects: infrared sensors cannot work in the dark and produce incorrect readings when there is no light. However, there are inherent drawbacks that restrict the application of ultrasonic instruments for mapping or other tasks requiring great accuracy in enclosed spaces. Due to sonar cross-talk, they are less reliable and have a reduced range, large beam coverage, latency, and low update rates [12]. If the obstacle surface is inclined (i.e., surfaces formed of triangles or with rough edges), the receiver detects only an undetectably small portion of the reflected energy, which causes the ultrasonic range estimations to fail [142].

### *4.2. System Software*

The LidSonic V2.0 system consists of the LidSonic V2.0 device and the LidSonic V2.0 Smartphone App (see Figure 8). The LidSonic V2.0 device's sensor module contains software that controls and manages the sensors (LiDAR and ultrasonic sensors) and actuators (the servo and laser beam). This module also carries out the basic logical processing of sensor data in order to generate buzzer alerts regarding discovered items.

The smartphone app's dataset module collects data from the LidSonic V2.0 device and appropriately stores the dataset, including the labels. The machine and deep learning module is located in the smartphone app and allows the models to be trained, used for inference, and evaluated. Two Google APIs are used by the voice module. The text-to-speech API is used to provide audio feedback from the smartphone app, such as spoken information regarding adjacent objects identified by the sensors, using the mobile speakers. The Google speech-to-text API is used to transform user voice instructions and evaluate them so that the app can take relevant actions, such as labeling and relabeling data objects.


**Figure 8.** LidSonic V2.0 Software Modules.

The master algorithm is given in Algorithm 1. The array VoiceCommands (the various commands sent to the LidSonic V2.0 system) and AIType are the master algorithm's inputs. AIType indicates the type of classification approach to be used, either machine learning (ML) or deep learning (DL). Label, Relabel, VoiceOff, VoiceOn, and Classify are the VoiceCommands. The user gives the system the Label and Relabel voice commands to label or relabel an object observed by the system. The commands VoiceOff and VoiceOn switch voice feedback off and on, for cases where the user simply wants to hear the buzzer that alerts them when an object is close rather than hearing the names of all the objects being recognized in the surroundings. When the user wants to identify a specific obstacle, they can use the voice command Classify; this command can be used even if vocal feedback is turned off. The master algorithm produces three outputs: LFalert, HFalert, and VoiceFeedback, which are used to notify the user about various items through a buzzer or voice feedback.
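The sketch below illustrates the command dispatch of the master algorithm in Python form; the stub helpers and printed messages are placeholders for the real sensor, dataset, and voice modules, and only the command names and the ML/DL switch follow the text.

```python
# Schematic sketch of the master algorithm's command dispatch (Algorithm 1).
# The command names and the ML/DL switch follow the text; the stub helpers
# below are placeholders for the real sensor, dataset, and voice modules.

def classify_obstacle(ai_type: str) -> str:
    # Placeholder: the real system runs a Weka (ML) or TensorFlow (DL) model.
    return f"high obstacle (predicted by {ai_type})"

def master_loop(voice_commands, ai_type="ML"):
    voice_feedback_on = True
    for command in voice_commands:
        if command in ("Label", "Relabel"):
            print(f"{command}: storing the current LiDAR data object")
        elif command == "VoiceOff":
            voice_feedback_on = False      # keep only buzzer alerts
        elif command == "VoiceOn":
            voice_feedback_on = True
        elif command == "Classify":
            # Available even when continuous voice feedback is switched off.
            print("VoiceFeedback:", classify_obstacle(ai_type))
        if voice_feedback_on:
            print("VoiceFeedback: describing nearby detected objects")

master_loop(["VoiceOff", "Classify", "VoiceOn", "Label"], ai_type="DL")
```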


The LidSonic V2.0 system operates different modules and subsystems for numerous purposes, as shown by the master algorithm. The ServoSensorsModuleSubSystem, a subsystem of the sensors module, uses the angle and position as inputs to determine the servo's starting position and motion and to control the position of the LiDAR sensor. The LaserSensorsModuleSubSystem indicates the direction in which the LiDAR is pointing (this is only for development purposes and assists the developer in identifying the object being scanned by the LiDAR). The UltrasonicSensorsModuleSubSystem returns the data output from the ultrasonic sensor, "UD2O" (the user's distance from the object, computed from the ultrasonic sensor data). The LiDARSensorsModuleSubSystem returns two outputs from the LiDAR sensor: "LD2O" (the user's distance from the object, computed from the LiDAR sensor data) and "LDO" (a LiDAR data object that contains detailed data about the objects). The ObsDetWarnSensorsModuleSubSystem returns "FeedbackType"; it detects objects and informs the user about them via the buzzer and voice feedback. The MLDatasetModule provides the Weka datasets, labeled "MLDataset", after receiving the inputs "LDO", "Label", and "Relabel". The DLDatasetModule provides CSV files, the "DLDataset". The MLModule returns "MOL" (the object level below or above the floor) and "VoiceCommands". The DLModule returns "DOL" (the object level below or above the floor) and "VoiceCommands". The VoiceModule transforms speech to text and vice versa using VoiceCommands and VoiceFeedback as inputs. In the next sections, more algorithms, figures, and text are used to describe the four modules, as well as their inputs and outputs.

### *4.3. Sensor Module*

Figure 9 illustrates how the LidSonic V2.0 glasses use the ultrasonic and LiDAR sensors to observe the world. The ultrasonic sound pulse is directed in front of the user, as shown by the dotted green line, to detect any objects in front of the user; it can also detect transparent obstacles, such as glass doors or walls, which the LiDAR may miss. The LiDAR sensor range is represented by dotted blue lines; the sensor has a range of 10 cm to 12 m. We covered a 60-degree region in front of the user with a servo motor that moves the LiDAR sensor, which equates to an area of "m" meters on the floor. This floor area "m", covered by the LiDAR sensor for a user of 1.7 m in height, would be around 3.5 m (we ignored the fact that the glasses are at eye level rather than at the top of the head). The figure also displays the floor area "n", the strip closest to the user that is not covered by the LiDAR sensor; we deactivated this strip in order to minimize false alarms triggered by the user's knee while walking. It would be around 0.15 m for a person of 1.7 m in height. Within the "m" floor space, the LidSonic V2.0 system identifies any obstacles, including descending stairs, using the LiDAR sensor.
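As a rough check of these figures, the sketch below recomputes the near strip n and the covered strip m from the device height and the scan's angular limits; the two tilt angles (measured from the vertical) are assumptions chosen to reproduce the 60-degree span described above, not values taken from the device.

```python
import math

# Rough geometric sketch of the LiDAR floor coverage in Figure 9.
# The tilt angles (measured from the vertical) are assumed values that
# reproduce the 60-degree scan span described in the text.

def floor_coverage(device_height_m, near_tilt_deg, far_tilt_deg):
    """Return (n, m): the uncovered near strip and the covered strip on the floor."""
    near = device_height_m * math.tan(math.radians(near_tilt_deg))
    far = device_height_m * math.tan(math.radians(far_tilt_deg))
    return near, far - near

n, m = floor_coverage(1.7, near_tilt_deg=5.0, far_tilt_deg=65.0)  # 60-degree span
print(f"n = {n:.2f} m, m = {m:.2f} m")  # roughly 0.15 m and 3.5 m, as in the text
```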

**Figure 9.** Sensor Coverage.

The flow diagram of the obstacle detection and warning subsystem is shown in Figure 10. The LidSonic V2.0 device uses data from the ultrasonic sensor (UD2O, the distance to the object detected by the ultrasonic sensor) and activates the buzzer with a low-frequency warning, the LFalert, if this distance falls below a threshold of 0.5. The LidSonic V2.0 device additionally checks the closest point read by the LiDAR sensor (LD2O, the D2O detected by the LiDAR sensor): if that point is higher than the floor, the LFalert is triggered, signaling that the obstacle might be a high obstacle, a bump, and/or ascending stairs, etc. If it is below the floor, this means that there are obstructions such as descending stairs and/or holes, and the high-frequency buzzer tone, HFalert, is triggered. The ML module provides the user with voice feedback depending on the anticipated obstacle type. The predicted value is converted to MOL (the object level detected by the ML module); the buzzer is actuated with the LFalert if the predicted type is an object above the floor, otherwise the HFalert is triggered. Similarly, the predicted value of the DL module is converted to DOL (the object level detected by the DL module); if the predicted type is an object above the floor, the buzzer is activated with the LFalert, otherwise the HFalert. The figure only shows "MOL < Floor"; however, the value of MOL or DOL is used depending on the algorithm selected.

**Figure 10.** Detection and Warning System.

Algorithm 2 depicts the algorithm for the obstacle detection, warning, and feedback subsystem. It takes the ultrasonic data (UD2O), the nearby LiDAR distance readings (LD2O), the object level calculated by the machine learning module (MOL), and the object level calculated by the deep learning module (DOL) as inputs. The ObsDetWarnSensorsModuleSubSystem function analyzes the data for detection and generates audio alarms. High-frequency buzzer tones (HFalert), low-frequency buzzer tones (LFalert), and VoiceFeedback are the output alerts. The subsystem has a logical function that accepts the inputs UD2O, LD2O, and MOL and returns the kind of obstacle (whether the obstacle is an object above floor level, etc.). No action is required if the output is Floor. If the output is HighObs, however, the obstacle is a wall, a high obstacle, or similar, and an LFalert instruction is delivered to the buzzer to start the low-frequency tone. If the obstacle is of the LowObs type, the buzzer parameter HFalert is used to activate the high-frequency tone. A high-frequency tone was chosen for low obstacles because they are potentially more hazardous and destructive than high obstacles, and the high-frequency tone may be more noticeable than the low-frequency tone.
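A condensed sketch of this detection and warning logic is given below as illustrative Python (the real subsystem runs on the Arduino and drives a physical buzzer). Only the 0.5 threshold comes from the text, and it is assumed here to be in metres; the floor-relative height convention and the alert strings are placeholders.

```python
# Condensed sketch of the obstacle detection and warning logic
# (Figure 10 / Algorithm 2). Only the 0.5 ultrasonic threshold comes from the
# text (assumed to be in metres); heights are expressed relative to the floor
# (floor = 0), and the returned strings stand in for the two buzzer tones.

ULTRASONIC_THRESHOLD_M = 0.5

def detect_and_warn(ud2o_m, lidar_point_height_m, predicted_level_m=None):
    """ud2o_m: ultrasonic distance; lidar_point_height_m: height of the closest
    LiDAR point relative to the floor; predicted_level_m: MOL or DOL."""
    alerts = []
    if ud2o_m < ULTRASONIC_THRESHOLD_M:
        alerts.append("LFalert")   # close (possibly transparent) obstacle
    if lidar_point_height_m > 0:
        alerts.append("LFalert")   # high obstacle, bump, or ascending stairs
    elif lidar_point_height_m < 0:
        alerts.append("HFalert")   # hole or descending stairs
    if predicted_level_m is not None:
        alerts.append("LFalert" if predicted_level_m >= 0 else "HFalert")
    return alerts

# A wall section 1.2 m above the floor, with nothing within ultrasonic range:
print(detect_and_warn(ud2o_m=2.0, lidar_point_height_m=1.2, predicted_level_m=1.2))
```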


### *4.4. Dataset Module*

Machine learning is strongly reliant on data; it is the most important factor that makes algorithm training possible and allows accurate results to be obtained from the trained models. Our dataset includes a 680-example training set and is formatted as ARFF files for Weka and CSV files for TensorFlow deep learning. The dataset contains the distance data collected by the LidSonic V2.0 device's LiDAR sensor.

Table 3 shows the eight classes in the dataset, the kinds of obstacles that we accounted for, and the number of examples/instances of each. Note that the images in the table include both indoor and outdoor conditions: Deep Hole, Ascending Stairs, Descending Stairs, and Wall were trained in indoor environments, while the rest of the objects were trained in outdoor environments. For our smart glasses, we were not especially concerned with the exact specifications of the objects but rather with their broader types. For example, differentiating between Descending Stairs and a Deep Hole is important, because the former is something that blind people might wish to use (go down), whereas they would avoid the latter; the exact type of deep hole does not matter, because the user would aim to avoid any deep hole. We trained the machine and deep learning algorithms with the data generated by the LiDAR sensor for these objects and did not program the software with the objects' specifications; hence, the objects themselves are not explicitly defined.

Table 4 shows the preprocessing and feature extraction approaches. PF1 requires that the LidSonic V2.0 device's upward line scan is stored with the same angle indexing as the downward scan and ends with the class label. PF2 first applies PF1 and then extracts just eleven angle readings by keeping one reading in every six of the 60 readings along with the last angle distance reading (in essence, skipping five readings at a time, assuming that an object does not lie entirely within this gap and considering that the user is moving, so that the gap is moving too). It also computes the height at the angle nearest to the user, which is the LidSonic V2.0 device's starting point, as well as at the middle angle of the LidSonic V2.0 device's scan. For both of these angles, we need the distance from the user to the obstacle, d2, and the distance from the user to the ground, d1; together with the two height calculations, these give the points needed to compute the slope between h1 and h2. The two heights and the slope are added to the 11 angle distance readings to create the 14-feature dataset, DS2. The 60 angle distance readings are the dataset features of DS1.
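The following is a simplified sketch of the PF2 feature extraction; the sampling indices, the two angles chosen for h1 and h2, and the auxiliary distances are illustrative assumptions that follow the description (11 of the 60 angle readings plus two heights and the slope, giving 14 features) rather than the exact implementation.

```python
# Simplified sketch of the PF2 feature extraction described in Table 4.
# Keeping every sixth reading of the 60-angle scan plus the last one yields
# 11 distances; h1, h2, and the slope between them complete the 14 features.
# The index choices and the height/slope helpers are illustrative.

def select_angles(scan_60):
    return scan_60[::6][:10] + [scan_60[-1]]     # 11 of the 60 readings

def height_from_distance(c, g, device_height):
    """Obstacle height by similar triangles (see Figure 11): h = t - t*(c/g)."""
    return device_height * (1.0 - c / g)

def extract_pf2_features(scan_60, ground_dists, horiz_dists, device_height=1.7):
    """ground_dists/horiz_dists: per-angle distance to the ground (g) and
    horizontal distance (b) for the two chosen angles (nearest and middle)."""
    features = select_angles(scan_60)
    h1 = height_from_distance(scan_60[0], ground_dists[0], device_height)
    h2 = height_from_distance(scan_60[30], ground_dists[1], device_height)
    slope = (h2 - h1) / (horiz_dists[1] - horiz_dists[0])
    return features + [h1, h2, slope]            # 14 features in total

scan = [2.0 - 0.02 * i for i in range(60)]       # dummy 60-angle scan
print(len(extract_pf2_features(scan, ground_dists=(2.1, 1.4), horiz_dists=(0.3, 1.8))))
```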


**Table 3.** Obstacle Dataset.

**Table 4.** Preprocessing and Feature Extraction.


Figure 11 depicts the model used for calculating the obstacle height. The distance between the LidSonic V2.0 device and the ground along the line of sight is represented by g; it is the hypotenuse of the larger triangle, colored blue. The LiDAR distance from the LidSonic V2.0 device to an obstacle is c. We can compute the height of the object, h, using the similar-triangle law and the value of c. Similar triangles are two triangles whose corresponding sides are in the same ratio and whose corresponding angles are equal.

**Figure 11.** Obstacle Height.

In ΔPQR and ΔPDE, ∠DPE is common and ∠PDE = ∠PQR (corresponding angles). (1)

$$\Rightarrow \Delta \text{PQR} \sim \Delta \text{PDE} \text{ (AA criterion for similar triangles)} \tag{2}$$

Hence, from (1) and (2):

$$\mathbf{PR}/\mathbf{PE} = \mathbf{PQ}/\mathbf{PD} \tag{3}$$

From Equation (3), the ratio r is calculated as:

$$\mathbf{r} = \mathbf{c}/\mathbf{g} \tag{4}$$

then,

$$\mathbf{a} = \mathbf{t} \times \mathbf{r} \tag{5}$$

where t is the height of the LidSonic V2.0 device above the ground. The height of the obstacle is:

$$\mathbf{h} = \mathbf{t} - \mathbf{a} \tag{6}$$

The horizontal distance b from the user to the obstacle is calculated by Equation (7), where d is the horizontal distance from the user to the point at which the LiDAR beam meets the ground:

$$\mathbf{b} = \mathbf{d} \times \mathbf{r} \tag{7}$$

There is an error of approximately ±3 cm when computing the height of an object, which we consider insignificant in our case, because we do not require the exact height but, rather, its nature (high, low, etc.). The height is calculated and used as a feature in the dataset; we added two height features, h1 and h2, for this purpose.
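As a short numerical illustration of Equations (4)–(7), the sketch below computes the obstacle height and horizontal distance from one pair of LiDAR readings; the example values are arbitrary.

```python
# Sketch of Equations (4)-(7): obstacle height and horizontal distance from
# the two distances along one line of sight. t and d are, respectively, the
# device height above the ground and the horizontal distance to the point
# where the beam meets the ground; the example values below are arbitrary.

def obstacle_height_and_distance(c, g, t, d):
    r = c / g          # Equation (4): ratio of the similar triangles
    a = t * r          # Equation (5): vertical drop down to the obstacle top
    h = t - a          # Equation (6): obstacle height
    b = d * r          # Equation (7): horizontal distance to the obstacle
    return h, b

# The beam hits an obstacle at 1.2 m; it would reach the ground at 2.4 m.
h, b = obstacle_height_and_distance(c=1.2, g=2.4, t=1.7, d=1.7)
print(f"h = {h:.2f} m, b = {b:.2f} m")   # h = 0.85 m, b = 0.85 m
```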

Another crucial parameter included in our dataset is the slope between h2 and h1, shown in Figure 12. Since the value of the slope varies with the slope of the ground or the kind of obstacle, especially in the case of stairs, it is a significant feature. Equation (8) calculates the slope as follows:

$$\mathbf{s} = \frac{\mathbf{h}2 - \mathbf{h}1}{\mathbf{b}2 - \mathbf{b}1} \tag{8}$$

We created two types of datasets: DS1, which collects 60 feature values using the LiDAR distances at 60 angles, and DS2, which extracts 11 features from DS1 and adds three more, namely two obstacle heights from two different angles and the slope between these two positions, giving a total of 14 features. We constructed six training models to be examined, evaluated, and analyzed for optimal utilization. Two different approaches were investigated: Weka-based machine learning and TensorFlow-based neural networks. We used K\* (KStar), Random Committee, and IBk as classifiers in Weka in order to train six machine learning models; for our system, these were the most successful classification methods [39]. The second approach is a TensorFlow-based deep learning technique that utilizes the two datasets; we constructed two deep learning models, one for each. Machine and deep learning algorithms are discussed in the next subsection.

Tables 5 and 6 list the two types of datasets that we used in our research, along with a sample of collected data for each obstacle class. The DS1 labels range from a01–a60 and end with the obstacle class to give a total of 61 features. DS2 has 11 angle labels in addition to h1, h2, s, and the obstacle class.


**Table 5.** Dataset 1 (D1) Sample.


**Table 6.** Dataset 2 (D2) Sample.

Algorithm 3 outlines how our system's dataset is created. It takes the CSVHeader, LDO, and Features as inputs. Using the BuildingDataset function, the CSV header from CSVHeader is first placed in the new dataset file, CSVFile. Data are then collected from LDO and saved in a LogFile using the DataCollection method. LDO represents the LiDAR distance readings, while the loop records the data in the proper format, saving the LiDAR downward data in the original order and reversing the order of the LiDAR upward data.
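A minimal sketch of this logging step is given below; the file name, header, and dummy values are illustrative, and only the reversal of the upward sweep and the appended class label follow Algorithm 3.

```python
import csv

# Minimal sketch of the dataset-building step in Algorithm 3. Each downward
# sweep is stored as-is and each upward sweep is reversed, so that both rows
# list the 60 angle readings in the same order; the class label is appended
# at the end. The file name, header, and dummy values are illustrative.

HEADER = [f"a{i:02d}" for i in range(1, 61)] + ["class"]

def append_sweep(csv_path, readings_60, direction, label):
    row = list(readings_60) if direction == "down" else list(reversed(readings_60))
    with open(csv_path, "a", newline="") as f:
        csv.writer(f).writerow(row + [label])

with open("DS1_sample.csv", "w", newline="") as f:
    csv.writer(f).writerow(HEADER)

sweep = [120 + i for i in range(60)]                  # dummy LiDAR distances (cm)
append_sweep("DS1_sample.csv", sweep, "down", "Wall")
append_sweep("DS1_sample.csv", sweep, "up", "Wall")
```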


Figure 13 depicts the LidSonic V2.0 smartphone app's user interface, which is used for building the dataset. The LiDAR sensor sends data to the mobile app through Bluetooth, which the app saves in a file named LogFile. On the left-hand side is the prediction mode, in which the user hears verbal feedback describing the recognized hazard. In addition, it shows some of the evaluation measurements for the three classifiers. For example, KstarT-Elapsed Time (ms) shows the time in milliseconds required to build the Kstar classifier (we provide more details regarding the classifiers in the next subsection) for the DS1 dataset in the white box and the DS2 dataset in the blue box. KstarI-Elapsed Time (ms) shows the inference time required to predict an object for datasets DS1 and DS2, respectively. When the D1 INFERENCE button is pressed, the evaluation measurements of DS1 are presented in the white boxes for each classifier, while the D2 INFERENCE button shows the results obtained for DS2.

**Figure 12.** Slope.

**Figure 13.** LidSonic V2.0 App. (**a**) Prediction Mode. (**b**) Build Dataset Mode.

The LogFile data are displayed in the figure's right-hand mobile app view. The first line shows the real-number ultrasonic sensor measurement. We will explore this further in future work to determine whether it is worthwhile to include it as a feature; the ultrasonic measurements were not included in the dataset for this study. The LiDAR sensor's downward and upward 60-degree readings are presented in the following two lines, each of which contains 60 comma-separated numbers. The LiDAR sensor is linked to a servo that rotates in one-degree increments downwards and upwards, capturing the distance from the device to the object at each degree position. The 60-degree downward and upward measurements are acquired in this manner. Each line of data thus provides 60 measurements of the distance from the user's eye to the object, each at a distinct line-of-sight angle. Every two lines of the 60-degree downward and upward measurements are followed by another real number, and so on.

### *4.5. Machine and Deep Learning Module*

We tested multiple types of models on different types of datasets to determine which performed the best. Algorithm 4 shows the method used to preprocess the data and extract features. The inputs are Dataset, LIndex1, and LIndex2. This method preprocesses the dataset and extracts the features, with the only variation between the ARFF files (for WEKA) and the CSV files (for TensorFlow) being the file format. To begin, we move to the location where the dataset's data begin. The SelectedFeatures function extracts 11 data values from the 60 values of each line. The CalculateHeight function takes the two pointing locations of the LiDAR sensor, LIndex1 and LIndex2, to acquire their heights, yielding Height[1] and Height[2]. The two values are then passed to the CalculateSlope function to determine the slope, and the results are finally incorporated into the output file.
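The difference between the two output formats amounts to the file header and layout: an ARFF header for WEKA and a plain CSV header for TensorFlow. A hedged sketch is shown below; the relation name, the angle attribute names, and the class list (a subset of the eight classes) are illustrative, while the 14-feature layout follows the DS2 description.

```python
# Sketch of the file-format split in Algorithm 4: the same 14 DS2 features are
# written either with an ARFF header (for WEKA) or a CSV header (for
# TensorFlow). The relation name, angle attribute names, and class subset
# below are illustrative.

CLASSES = ["DeepHole", "AscendingStairs", "DescendingStairs", "Wall"]  # subset of the 8 classes
ANGLE_ATTRS = [f"a{i:02d}" for i in range(1, 12)]
DS2_ATTRS = ANGLE_ATTRS + ["h1", "h2", "s"]

def write_arff_header(path):
    with open(path, "w") as f:
        f.write("@relation lidsonic_ds2\n")
        for attr in DS2_ATTRS:
            f.write(f"@attribute {attr} numeric\n")
        f.write("@attribute class {" + ",".join(CLASSES) + "}\n@data\n")

def write_csv_header(path):
    with open(path, "w") as f:
        f.write(",".join(DS2_ATTRS + ["class"]) + "\n")

write_arff_header("DS2.arff")
write_csv_header("DS2.csv")
```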


Algorithms 5 and 6 provide high-level algorithms for the machine and deep learning modules. A detailed explanation is presented in Sections 4.5.1 and 4.5.2.



The prediction mode can be used to aid visually impaired users in three distinct ways: the prediction button, the throw (fling) gesture, or a voice instruction. To utilize the voice instruction, the user double-taps the screen to access the speech-to-text API and then speaks the command "Prediction Mode". Alternatively, the prediction mode is accessed by flinging the screen.

### 4.5.1. Machine Learning Models (WEKA)

WEKA, a Java-based open-source program that contains a collection of machine learning algorithms for data mining applications [143], was employed to train the dataset with three classifiers, KStar, IBk, and Random Committee, which were carefully selected from the detailed experiments carried out in our previous works, where they provided the best results [39].
