FPGA Applied to Latency Reduction for the Tactile Internet

Junior, José C. V. S.; Silva, Sérgio N.; Torquato, Matheus F.; Mahmoodi, Toktam; Dohler, Mischa; Fernandes, Marcelo A. C.

doi:10.3390/s22207851


by José C. V. S. Junior ¹,†, Sérgio N. Silva ¹, Matheus F. Torquato ¹, Toktam Mahmoodi ², Mischa Dohler ² and Marcelo A. C. Fernandes ¹,³,*,†

¹ Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil
² Centre for Telecommunications Research, Department of Engineering, King’s College London, London WC2R 2LS, UK
³ Department of Computer Engineering and Automation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.

Sensors 2022, 22(20), 7851; https://doi.org/10.3390/s22207851

Submission received: 27 August 2022 / Revised: 9 October 2022 / Accepted: 10 October 2022 / Published: 16 October 2022

(This article belongs to the Special Issue Advances in Tactile Sensing and Robotic Grasping)


Abstract: Tactile internet applications allow robotic devices to be remotely controlled over a communication medium with an unnoticeable time delay. In bilateral communication, the acceptable round trip latency is usually 1 ms up to 10 ms, depending on the application requirements. The communication network is estimated to generate 70% of the total latency, and master and slave devices produce the remaining 30%. Thus, this paper proposes a strategy to reduce the 30% of the total latency produced by such devices. The strategy is to use FPGAs to minimize the execution time of device-associated algorithms. With this in mind, this work presents a new hardware reference model for modules that implement nonlinear positioning and force calculations and a tactile system formed by two robotic manipulators. In addition to presenting the implementation details, simulations and experimental tests are performed in order to validate the proposed hardware model. Results associated with the FPGA sampling rate, throughput, latency, and post-synthesis occupancy area are analyzed.

Keywords: tactile internet; latency reduction; haptic devices; FPGA

1. Introduction

Tactile internet is conceptually defined as the new generation of internet connectivity which will combine very low latency with extremely high availability, reliability and security [1,2]. Another feature that has been pointed out is that this new generation will be centered around applications that use human-machine communications (H2M) alongside devices that are compatible with tactile sensations [3,4,5]. Currently, the IEEE 1918.1 standard [6] defines characteristics of the tactile internet, where both the structure and description of application scenarios and definitions are presented. A tactile internet environment is basically composed of a local device (known as a master) and a remote device (known as a slave), where the master device is responsible for controlling the slave device over the internet through a two-way data communication network [7,8]. Bidirectional communication is needed to simulate the physical laws of action and reaction, where action can be represented as sending operational commands and reaction can be represented as the forces resulting from that action. In tactile internet applications, the desired time delay for device communication is characterized by an ultra-low latency. In bilateral communication, the required round trip latency ranges from 1 ms up to 10 ms depending on the application requirements [9,10,11,12].

According to [13], in a tactile internet application, 30% of the total system latency is generated by the master and slave devices. These devices demand high processing speeds, as they must repeatedly execute a variety of computationally expensive algorithms and techniques. These algorithms involve arithmetic operations and the evaluation of linear and nonlinear equations that need to be computed at high sampling rates in order to maintain application fidelity. The remaining 70% of the latency is caused by the communication network, which makes such networks unsuitable for these latency constraints [14]. To address this problem, some research groups have been studying communication networks in the context of the tactile internet. The works [15,16,17] show some of the techniques that can be used to reduce network latency.

Other groups have been studying prediction techniques, where many algorithms have been studied and proposals using artificial intelligence (AI) have proved to be effective [18]. On the other hand, the implementation of complex AI-based prediction methods can further increase the latency of the computer systems present in master and slave devices. Alternatively, new approaches such as using field-programmable gate arrays (FPGAs) can improve the performance of master and slave devices in a tactile system environment. FPGAs enable the creation of customizable hardware, which allows algorithms to be parallelized and optimized at the logic gate level to speed up their operations. Literature results show that computationally expensive algorithms can achieve speedups of up to 1000× over software implementations when custom-implemented in FPGAs [19,20,21,22,23,24,25].

In this context, this paper proposes a hardware implementation that targets the 30% of the total latency related to tactile devices. The project uses FPGA devices to minimize the execution time of algorithms associated with master and slave devices. FPGAs allow the parallelization of algorithms and latency reduction compared to software systems embedded in traditional architectures with general purpose processors and microcontrollers. In an effort to validate the proposed strategy, this paper presents a discrete reference model that can be adjusted for different types of master and slave devices in a tactile internet system. Validation results, throughput, and post-synthesis figures obtained for the proposed hardware implementation using FPGA devices are presented. Comparisons with other works in the literature show that using FPGA can significantly accelerate the processing speed in tactile devices. Thus, this work makes the following contributions:

A new discrete reference model for a tactile internet system.
A novel reference architecture for hardware design on FPGA for tactile master and slave devices.
A new reference architecture for forward and inverse kinematics on FPGA.
A new strategy to reduce the latency on tactile internet based on FPGA.
Comparison of performance of the proposed hardware model with other proposals in the literature.

The remainder of this paper is organized as follows: Section 2 presents the related works in the literature; Section 3 introduces a new discrete reference model for a tactile internet system; Section 4 describes the PHANToM Omni robot used as master and slave device; Section 5 presents the simulated tactile internet model; Section 6 gives a detailed description of the reference hardware architectures proposed in this paper; Section 7 presents and analyzes the synthesis results obtained from the described implementation, including a comparison to other works; Section 8 presents the final considerations.

2. Related Work

The authors of [26] presented a tactile internet environment that used a glove-type device in conjunction with a robotic manipulator. The environment was developed using a general purpose processor, which made the execution of the algorithms sequential. In order to send the data, the tactile glove produced a latency of approximately 4.82 ms, and the hardware responsible for performing the inverse kinematics calculations took an interval of 0.95 ms. The latency values obtained in this application could be improved by hardware structures that allow algorithm parallelization.

Studies in the literature demonstrate the benefit of using FPGA to accelerate the sample rate for data acquisition from devices associated with haptic systems. The authors of [27] presented an implementation for controlling a 3-DoF (Degree of Freedom) device. The presented technique proposed to increase the device sampling rate using FPGA hardware together with a real-time operating system (RTOS) in order to increase the resolution acquisition of the stiffness sensor. The control technique presented was developed in 32-bit fixed point, and trigonometric functions were implemented using lookup tables.

The work described in [28] presented a control system for one-dimensional haptic devices (1-DoF). The FPGA control implementation used single-precision floating point representation (IEEE Std 754) and the algorithms performed all calculations in 50 μs. The processing time was satisfactory; however, the data frame size to be sent over the network increased with the number of DoFs. This peculiarity can increase latency for more complex haptic systems with many DoFs. Along the same lines as the previous works, an implementation for bilateral control of single-dimensional haptic devices (1-DoF) was presented in [29]. A more accurate control technique based on sliding mode control (SMC) was implemented in FPGA, and CORDIC (COordinate Rotation DIgital Computer) was used to assist in performing the complex calculations. The hardware was designed to locally control two devices, one master and one slave. The implementation used a 24-bit fixed point format, with 9 bits for the integer part and 14 bits for the fractional part, and the total execution time of the controllers was 7.2375 μs.

The works [27,28,29] presented a control that depends directly on the encoder reading of the device motors. Usually in commercial models, accessing the device electronics can be tricky requiring some reverse engineering and specific knowledge to make the appropriate encoder connections. On the other hand, some works abstract the data acquisition and work directly with robotics algorithms. These algorithms may require high computational power that can surpass the capabilities of many general-purpose processors (GPPs) that perform the operations sequentially.

Some studies demonstrate the benefit of using FPGA to accelerate robotic manipulation algorithms related to haptic systems. A hardware architecture implemented in FPGA for performing the forward kinematics of 5-DoF robots using floating point arithmetic was described in [30]. In this hardware implementation, all the forward kinematics calculations were performed within 1.24 μs, which represents 67 clock cycles at a frequency of 54 MHz. The equivalent software implementation has a total processing time of 1.61036 ms. Overall, the hardware implementation is 1298× faster than the software implementation, which means a considerable acceleration in the forward kinematics processing time.

The authors of the paper [31] presented an FPGA implementation of inverse kinematics, velocity calculation and acceleration of a 3-DoF robot. Three systems were created: the first one did not use any arithmetic co-processor and floating point operations were performed in software; in the second system a floating point co-processor was used, which allowed the execution of the four basic mathematical operations in hardware; lastly, the third system also had a custom arithmetic co-processor, but in this case it allowed hardware computation of the square root. The overall times to perform the calculations were 2324 μs, 560 μs and 143 μs, and the total logic elements used from the entire device were 4501 (4%), 5840 (5%) and 7219 (6%), respectively. The work uses a hardware–software approach to implement inverse kinematics, in which the critical parts were implemented in FPGA to accelerate the whole process.

In [32], hardware able to control a 6-DoF device using 32-bit fixed point representation is presented, where 21 bits were used for the fractional part and 11 bits for the integer part. In that work, a CORDIC implementation was used to assist in performing the trigonometric calculations. The total time spent to compute the forward kinematics was 3 μs and for the inverse kinematics the time was 4.5 μs for a clock of 50 MHz. However, in the presented proposal, some calculations were performed sequentially, that is, the forward kinematics required 150 clock cycles and the inverse kinematics 225 cycles. The use of partial parallelization in the execution of robotic manipulation algorithms provided a significant increase in system throughput. Nevertheless, it is important to note that there is still room for improvement since all calculations can be computed in parallel.

Another hardware implementation of inverse kinematics was presented in [33]. The device used was a 10-DoF biped robot. A CORDIC implementation was used to perform the trigonometric calculations. The execution time needed to compute the kinematics of the 10 joints in FPGA was 0.44 μs. In this paper, a comparison with a software implementation was also performed, and the time taken to perform the same calculations was 3342 μs, i.e., the gain on execution, or speedup, on custom FPGA hardware was 7595×. The resulting error between both implementations was acceptable for this specific control.

In [34], an FPGA implementation of the forward and inverse kinematics of a 5-DoF device was presented. The hardware was developed using a fixed point representation where 32 bits were used for the angle representation, of which 15 bits were for the fractional part. For the device spatial positioning, 16 bits were used, of which 7 bits were for the fractional part. In the implementation of trigonometric functions, a combination of techniques using lookup tables (LUTs) and Taylor series was used. To perform the necessary calculations, a finite-state machine (FSM) model was used to reduce the use of hardware resources; however, the use of such an FSM led to a sequential computation of the robotic manipulation algorithms. In this model, the forward kinematics implementation achieved a runtime of 680 ns and the inverse kinematics 940 ns, that is, for the 50 MHz clock, the forward kinematics took 34 clock cycles and the inverse kinematics took 47 cycles. Using such approaches to reduce the use of hardware resources increases computation runtime. For tactile device applications, it is important to optimize the runtime rather than the use of hardware resources.

Similarly, an FPGA implementation of forward and inverse kinematics for a 7-DoF device was presented in [35]; however, only the 3-DoF required to control the device movement were implemented in hardware. The proposal used a 32-bit fixed point representation and a CORDIC was used to execute the trigonometric functions. To validate the proposal, the FPGA was set to receive the three reference angles, perform the forward kinematics and then the inverse. The model was developed based on a pipeline and the operating frequency used was 100 MHz. As a result, the model calculation took 2 μs to perform the entire kinematics algorithm, which represented 200 clock cycles.
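The cycle counts and speedups quoted in this section follow from two simple relations: cycles = latency × clock frequency, and speedup = software time / hardware time. The short sketch below re-derives them from the figures reported above; the function and dictionary names are illustrative only, and the speedups are truncated to integers, which is an assumption that happens to match the quoted values.

```python
def cycles(latency_s: float, clock_hz: float) -> int:
    """Clock cycles elapsed during a given latency at a given clock frequency."""
    return round(latency_s * clock_hz)


def speedup(software_s: float, hardware_s: float) -> int:
    """Software-to-hardware execution time ratio, truncated to an integer."""
    return int(software_s / hardware_s)


# (latency in seconds, clock in Hz) as reported for each cited work
reported = {
    "[30] FK, 5-DoF": (1.24e-6, 54e6),
    "[32] FK, 6-DoF": (3e-6, 50e6),
    "[32] IK, 6-DoF": (4.5e-6, 50e6),
    "[34] FK, 5-DoF": (680e-9, 50e6),
    "[34] IK, 5-DoF": (940e-9, 50e6),
    "[35] FK+IK, 3-DoF": (2e-6, 100e6),
}

cycle_counts = {name: cycles(t, f) for name, (t, f) in reported.items()}
speedup_30 = speedup(1.61036e-3, 1.24e-6)  # software vs. FPGA in [30]
speedup_33 = speedup(3342e-6, 0.44e-6)     # software vs. FPGA in [33]

for name, n in cycle_counts.items():
    print(f"{name}: {n} cycles")
print(f"[30] speedup: {speedup_30}x, [33] speedup: {speedup_33}x")
```

Reading the table this way makes the trade-off explicit: at a fixed clock, latency can only fall by cutting the number of cycles, which is what a more parallel architecture aims to do.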

In this context, it can be seen that the use of FPGA-based computing can accelerate haptic device control algorithms. Unlike traditional hardware that processes information sequentially, FPGA enables parallel information processing. However, most studies from the literature have developed partially parallel implementations, that is, implementations in which parts of the algorithms are executed sequentially. Unlike the research previously mentioned, this study presents a new approach in which the execution of the robotic manipulation algorithms is performed in a fully parallel hardware implementation. This proposed implementation provides a latency reduction for the tactile devices and enables tactile internet applications.

3. Discrete Model of the Tactile Internet

A discrete model of the tactile internet system is proposed and presented in Figure 1. This model consists of seven subsystems: the Operator (OP), Master Device (MD), Hardware of the MD (HMD), Network (NW), Hardware of the SD (HSD), Slave Device (SD) and the Environment (ENV). It is assumed that the signals are sampled at time $t_{s}$.

The OP is an entity responsible for generating stimuli that can be in the form of position signals, speed, force, image, sound or any other. These stimuli are sent to the devices involved so that some kind of task can be performed in some kind of environment. The environment, the ENV subsystem, receives the stimuli from the OP and generates feedback signals associated with sensations such as reactive force information and tactile information that are sent back to the OP. The interaction between the OP and the ENV is performed through the master and slave devices, MD and SD, respectively.

Specifically in this work, the MD is characterized as a local device and the SD as a remote one, and both of them are responsible for transforming the stimuli and sensations associated with the OP and ENV into signals to be processed. Tactile devices (MD and SD) can take the form of robotic manipulators, haptic devices, tactile gloves and others that may be developed in the future. In the coming years, the introduction of new types of sensors and actuators is expected, which will form the basis for the development of new tactile devices.

Although there are no tactile internet standards nor products yet, it can be affirmed that future tactile devices will be integrated with a hardware responsible for all operational metrics and calculations. Within this conjecture, this work adds a couple of modules to the discrete model (as per Figure 1), called HMD and HSD. HMD is responsible for performing all transformations and calculations associated with MD, and HSD performs the equivalent operations for the SD. Several algorithms associated with transformation, compression, control, prediction will be under the responsibility of these two modules. The authors from [36] present a few approaches focusing on the reduction of kinesthetic data and tactile signal compression, which can be applied to the model.

Based on the model presented in Figure 1, the signals generated by the OP can be characterized by the array $a(n)$, expressed as

$$a(n) = [a_{1}(n), \dots, a_{i}(n), \dots, a_{N_{OP}}(n)], \tag{1}$$

where $a_{i}(n)$ is the i-th stimulus at the n-th instant and $N_{OP}$ is the total number of stimuli signals generated by the OP. At every n-th moment the stimulus array, $a(n)$, is received by the MD, which transforms the stimuli into a set of $N_{MD}$ signals expressed as

$$b(n) = [b_{1}(n), \dots, b_{i}(n), \dots, b_{N_{MD}}(n)], \tag{2}$$

where $b_{i}(n)$ is the i-th signal generated by the MD at the n-th instant. It can be stated that at each n-th moment a set of stimuli $a(n)$ generates a set of signals $b(n)$ that depends on the type of MD and the sensor set associated with the device. Especially important is the fact that the signals generated by the MD, $b(n)$, have heterogeneous characteristics in which each i-th signal $b_{i}(n)$ can represent an angle, spatial coordinate, pixel of an image, audio sample or any other information associated with a stimulus generated by the OP. In practice, the signals grouped by the $b(n)$ array originate from sensors coupled to the MD and the amount of data may vary according to the amount of information to be sent, $N_{MD}$.

The set of signals expressed by $b(n)$ is sent to the HMD (Figure 1), which has the function of processing this information before sending it to the NW subsystem. Calculations associated with calibration, linear and nonlinear transformations and signal compression are performed by the HMD. Essentially, the majority of the computational effort of the MD is in this subsystem. At each n-th instant $t_{s}$ the HMD processes the array $b(n)$, generating an information array $c(n)$ expressed by

$$c(n) = [c_{1}(n), \dots, c_{i}(n), \dots, c_{N_{HMD}^{f}}(n)], \tag{3}$$

where $c_{i}(n)$ is the i-th signal generated by the HMD towards the NW subsystem at the n-th instant $t_{s}$ and $N_{HMD}^{f}$ is the number of signals. It is expected that $N_{HMD}^{f} < N_{MD}$, in order to minimize latency during the transmission in the NW subsystem.

The NW subsystem, as shown in Figure 1, characterizes the communication medium that links the OP to the ENV. In this model, the data propagates through two different channels called the forward channel, which transmits the OP data towards the ENV, and the backwards channel, which transmits the ENV signals towards the OP. The signals transmitted by the forward and backwards channels may be disturbed and delayed. In the case of the forward channel, the received signal, $v(n)$, may be expressed as

$$v(n) = [v_{1}(n), \dots, v_{i}(n), \dots, v_{N_{HMD}^{f}}(n)], \tag{4}$$

where

$$v_{i}(n) = c_{i}(n - d_{i}^{f}(n)) + r_{i}^{f}(n), \tag{5}$$

in which $r_{i}^{f}(n)$ represents the added noise and $d_{i}^{f}(n)$ represents a delay associated with the i-th information sent in $c(n)$. In this model, the noise can be characterized as a Gaussian random variable of zero mean and variance $\sigma_{rf}^{2}$, and the delays are characterized as integers, that is, they occur at a granularity of $t_{s}$. It is important to note that the NW subsystem can take the shape of the Internet, a metropolitan area network (MAN), a local area network (LAN), or even a direct connection between an MD and a workstation or computer.
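The forward channel of Equation (5) can be illustrated with a few lines of Python. This is a minimal sketch rather than the authors' simulator: the function name, the history layout `c[k][i]` and the choice to treat samples before n = 0 as zero are all assumptions made for the example.

```python
import random


def channel_output(c, n, delays, sigma, rng):
    """Return v(n), with v_i(n) = c_i(n - d_i(n)) + r_i(n) as in Equation (5).

    c      : history of transmitted arrays, c[k][i] = c_i(k)
    n      : current sample index
    delays : delays[i] = d_i(n), an integer number of sampling periods t_s
    sigma  : standard deviation of the zero-mean Gaussian noise r_i(n)
    rng    : random.Random instance used to draw the noise
    """
    v = []
    for i, d in enumerate(delays):
        k = n - d
        # Assumption: nothing was transmitted before n = 0, so use 0.0 there.
        delayed = c[k][i] if k >= 0 else 0.0
        v.append(delayed + rng.gauss(0.0, sigma))
    return v


rng = random.Random(0)
# Three signals over six sampling instants: c_i(k) = 10*i + k
c = [[float(k), 10.0 + k, 20.0 + k] for k in range(6)]

# Noise-free case isolates the effect of the per-signal delays.
v_clean = channel_output(c, n=5, delays=[0, 2, 5], sigma=0.0, rng=rng)
# Same delays with additive Gaussian noise of standard deviation 0.01.
v_noisy = channel_output(c, n=5, delays=[0, 2, 5], sigma=0.01, rng=rng)
print(v_clean, v_noisy)
```

Because each element of the array has its own delay, the entries of v(n) may originate from different sampling instants, which is why the model indexes the delay by i.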

As shown in Figure 1, the HSD receives the $v(n)$ signal through the forward channel and has the role of generating control signals to the SD through the signal

$$u(n) = [u_{1}(n), \dots, u_{i}(n), \dots, u_{N_{HSD}^{f}}(n)], \tag{6}$$

where $N_{HSD}^{f}$ is the number of control signals and $u_{i}(n)$ is the i-th control signal at the n-th instant $t_{s}$ associated with the array $u(n)$. It is important to note that there may be various types of SD, from real robotic manipulators to virtual tools in computational environments. Thus, it can be stated without loss of generality that the HSD can perform the inverse of the HMD processing in addition to specific algorithms associated with the type of SD. For example, if the SD is a robotic manipulator, the HSD must additionally implement closed-loop control algorithms, whereas if the SD is a virtual arm, the HSD must implement positioning algorithms for a given virtual reality platform. The SD does not have to correspond directly with the MD, e.g., the MD can be a glove while the SD is a drone. However, it is desirable that the stimulus generated by the SD is a copy of the stimulus generated by the OP, that is, within the model presented in Figure 1, the SD generates a signal expressed as

$$\hat{a}(n) = [\hat{a}_{1}(n), \dots, \hat{a}_{i}(n), \dots, \hat{a}_{N_{OP}}(n)], \tag{7}$$

where $\hat{a}_{i}(n)$ is an estimate of the i-th stimulus $a_{i}(n)$ generated by the OP. Thus, the estimate of the stimulus generated by the OP, $\hat{a}_{i}(n)$, is applied to the ENV subsystem, which represents a given real or virtual environment with which the OP is interacting.

In the backwards direction, the stimulus actions generated by the OP, $a(n)$, and represented by $\hat{a}(n)$, receive a group of reactions from the ENV subsystem that can be characterized in the model by the set of signals expressed by

$$o(n) = [o_{1}(n), \dots, o_{i}(n), \dots, o_{N_{ENV}}(n)], \tag{8}$$

where $N_{ENV}$ is the number of stimulus signals and $o_{i}(n)$ is the i-th stimulus signal at the n-th instant $t_{s}$. Reaction signals grouped into $o(n)$ can be in the form of force, touch, temperature, etc.

Reaction signals are captured by the SD, which turns this information into electrical signals from real sensors or, if the SD is in a virtual reality environment, from virtual sensors. After capturing this information the SD transmits these signals to the HSD. In the model presented in Figure 1, the signals generated by the SD are expressed as

$$g(n) = [g_{1}(n), \dots, g_{i}(n), \dots, g_{N_{SD}}(n)], \tag{9}$$

where $g_{i}(n)$ is the i-th signal generated by the SD at the n-th instant of time $t_{s}$ and $N_{SD}$ is the number of signals. The HSD in turn processes this information and sends it to the NW subsystem through the array $h(n)$, expressed by

$$h(n) = [h_{1}(n), \dots, h_{i}(n), \dots, h_{N_{HSD}^{b}}(n)], \tag{10}$$

where $h_{i}(n)$ is the i-th signal generated by the HSD at the n-th instant of time $t_{s}$ and $N_{HSD}^{b}$ is the number of signals.

The signal received by the HMD through the backwards channel of the NW subsystem can be expressed as

$$q(n) = [q_{1}(n), \dots, q_{i}(n), \dots, q_{N_{HSD}^{b}}(n)], \tag{11}$$

where

$$q_{i}(n) = h_{i}(n - d_{i}^{b}(n)) + r_{i}^{b}(n), \tag{12}$$

in which $r_{i}^{b}(n)$ represents an added noise and $d_{i}^{b}(n)$ represents a delay associated with the i-th information transmitted in $q(n)$ by the backwards channel. Similarly to the forward channel, the noise can also be characterized as a Gaussian random variable of zero mean and variance $\sigma_{rb}^{2}$, and the delays are characterized as integers with $t_{s}$ granularity. The HMD processes the $q(n)$ signal information and generates a set of control signals that will act on the MD and can be characterized as

$$p(n) = [p_{1}(n), \dots, p_{i}(n), \dots, p_{N_{HMD}^{b}}(n)], \tag{13}$$

where $p_{i}(n)$ is the i-th signal generated by the HMD at the n-th instant of time $t_{s}$ and $N_{HMD}^{b}$ is the number of signals. The MD in turn will synthesize the reaction stimuli generated by the environment, i.e., the ENV subsystem. Based on the model, it is possible to characterize these reaction stimuli as a signal expressed by

$$\hat{o}(n) = [\hat{o}_{1}(n), \dots, \hat{o}_{i}(n), \dots, \hat{o}_{N_{ENV}}(n)], \tag{14}$$

where $\hat{o}_{i}(n)$ is an estimate of the i-th stimulus $o_{i}(n)$ generated in the ENV subsystem. Examples of reaction stimuli generated or synthesized by the MD are touch, force and temperature.

In addition to the latency associated with the NW subsystem that characterizes the communication medium between the OP and ENV subsystems, the MD, HMD, HSD, and SD subsystems also add latency to the system. Based on the work presented in [13,14], these components represent 30% of the total latency. The latency of the MD and SD subsystems is associated with sensors and actuators that can be mechanical, electrical, electromechanical and other variations. HMD and HSD latencies are associated with the processing time of the algorithms in these devices and, depending on the type of hardware and implementation architecture, this latency can be considerably reduced.
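To make the split concrete, the snippet below applies the 30%/70% estimate to the 1 ms and 10 ms round trip targets mentioned in the introduction. It is only an arithmetic illustration; the function name and the assumption that the share stays fixed across targets are not taken from the cited works.

```python
def latency_budget(round_trip_ms: float, device_share: float = 0.30):
    """Split a round trip latency target into device and network budgets (ms)."""
    devices_ms = round_trip_ms * device_share
    network_ms = round_trip_ms - devices_ms
    return devices_ms, network_ms


for target_ms in (1.0, 10.0):
    devices_ms, network_ms = latency_budget(target_ms)
    print(f"{target_ms} ms round trip: devices {devices_ms:.2f} ms, "
          f"network {network_ms:.2f} ms")
```

Under this reading, the MD, HMD, HSD and SD together would have roughly 0.3 ms available for a 1 ms round trip, which shows why microsecond-scale processing in the HMD and HSD matters.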

4. PHANToM Omni Device Model (MD & SD)

Based on the scheme presented in Figure 1, this section presents details associated with the MD and SD used as reference for the hardware system proposed in this research. The MD and SD are characterized as a three-degree-of-freedom (3-DoF) robotic manipulator called the PHANToM Omni [37] (Figure 2). The PHANToM Omni has been widely used in the literature, as presented in [38,39]. In this work, two of these devices are used: one as the MD and the other as the SD.

As can be seen from Figure 3, the PHANToM Omni physical structure is formed by a base, an arm with two segments $L_{1}$ and $L_{2}$, which are interconnected by three rotary joints $\theta_{1}$, $\theta_{2}$ and $\theta_{3}$, and a tool. The variables presented in Figure 3 are $L_{1} = 0.135$ m, $L_{2} = L_{1}$, $L_{3} = 0.025$ m and $L_{4} = L_{1} + A$, where $A = 0.035$ m, as described in [40]. These detailed features of the device are essential for performing the kinematics and dynamic calculations.

4.1. Forward Kinematics

The kinematics of manipulator devices makes use of the relationship between operational coordinates and joint coordinates. Forward kinematics (FK) correlates the angular variables of the joints with the Cartesian system. That is, given an array of joint coordinates, it is possible to determine the spatial position of the tool through the equations

$$x = -\sin(\theta_{1}) (L_{2} \sin(\theta_{3}) + L_{1} \cos(\theta_{2})), \tag{15}$$

$$y = -L_{2} \cos(\theta_{3}) + L_{1} \sin(\theta_{2}) + L_{3}, \tag{16}$$

$$z = L_{2} \cos(\theta_{1}) \sin(\theta_{3}) + L_{1} \cos(\theta_{1}) \cos(\theta_{2}) - L_{4}, \tag{17}$$

where x, y and z are variables that determine the spatial position of the tool in Cartesian space.
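Equations (15)–(17) translate directly into software. The sketch below is a plain floating point reference, not the FPGA architecture; the function name is illustrative and the link dimensions are assumed to be in metres.

```python
import math

# Link dimensions, assumed to be in metres; numeric values follow the text.
L1 = 0.135
L2 = L1
L3 = 0.025
A = 0.035
L4 = L1 + A


def forward_kinematics(theta1, theta2, theta3):
    """Tool position (x, y, z) from joint angles in radians, Equations (15)-(17)."""
    # Term shared by x and z: projection of both segments on the horizontal plane.
    reach = L2 * math.sin(theta3) + L1 * math.cos(theta2)
    x = -math.sin(theta1) * reach
    y = -L2 * math.cos(theta3) + L1 * math.sin(theta2) + L3
    z = math.cos(theta1) * reach - L4
    return x, y, z


home = forward_kinematics(0.0, 0.0, 0.0)
print(home)
```

Note that x and z share the factor `reach`, so only four trigonometric evaluations are needed for the three coordinates.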

4.2. Inverse Kinematics

In inverse kinematics (IK), the relationship between the joint angles and the Cartesian system is reversed, that is, given the spatial position of the tool it may be possible to determine the joint coordinates. The solution to this process is not as straightforward as in forward kinematics. In forward kinematics, the position of the tool is determined solely by the displacements of the joints. In inverse kinematics, the equations are composed of nonlinear calculations formed by trigonometric functions. Depending on the manipulator structure, multiple solutions may be possible for the same tool position, or there may be no solution for a particular set of tool positions. Based on the works [40,41,42], the value of $\theta_{1}$ can be defined through the equation

$$\theta_{1} = -\operatorname{atan2}(x, z + L_{4}), \tag{18}$$

where x and z represent Cartesian coordinates and $L_{4}$ corresponds to the dimension of the arm shown in Figure 3.

To calculate the other two joints, $\theta_{2}$ and $\theta_{3}$, it is necessary to perform intermediate calculations. Thus, one can obtain R, r, $\gamma$, $\beta$ and $\alpha$ through the equations

$$R = \sqrt{x^{2} + (z + L_{4})^{2}}, \tag{19}$$

$$r = \sqrt{x^{2} + (z + L_{4})^{2} + (y - L_{3})^{2}}, \tag{20}$$

$$\gamma = \operatorname{acos}\left(\frac{L_{1}^{2} - L_{2}^{2} + r^{2}}{2 L_{1} r}\right), \tag{21}$$

$$\beta = \operatorname{atan2}(y - L_{3}, R), \tag{22}$$

and

$$\alpha = \operatorname{acos}\left(\frac{L_{1}^{2} + L_{2}^{2} - r^{2}}{2 L_{1} L_{2}}\right). \tag{23}$$

After performing the intermediate calculations it is possible to calculate $\theta_{2}$ through the equation

$$\theta_{2} = \gamma + \beta. \tag{24}$$

Finally, the value corresponding to the $\theta_{3}$ joint can be obtained through the equation

$$\theta_{3} = \theta_{2} + \alpha - \frac{\pi}{2}. \tag{25}$$
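The inverse kinematics chain can be checked numerically with a round trip through the forward kinematics of Section 4.1. The sketch below is a floating point reference under stated assumptions: link dimensions are taken in metres, r is computed as the straight-line distance $\sqrt{R^{2} + (y - L_{3})^{2}}$, the `_clamp` helper is added only to guard `acos` against rounding, and the single solution branch given by Equations (18)–(25) is used.

```python
import math

# Link dimensions, assumed to be in metres; numeric values follow the text.
L1 = 0.135
L2 = L1
L3 = 0.025
A = 0.035
L4 = L1 + A


def _clamp(value):
    """Keep acos arguments inside [-1, 1] despite floating point rounding."""
    return max(-1.0, min(1.0, value))


def inverse_kinematics(x, y, z):
    """Joint angles in radians from a tool position, Equations (18)-(25)."""
    theta1 = -math.atan2(x, z + L4)                                    # Eq. (18)
    R = math.hypot(x, z + L4)                                          # Eq. (19)
    r = math.hypot(R, y - L3)                                          # Eq. (20), r > 0 required
    gamma = math.acos(_clamp((L1**2 - L2**2 + r**2) / (2 * L1 * r)))   # Eq. (21)
    beta = math.atan2(y - L3, R)                                       # Eq. (22)
    alpha = math.acos(_clamp((L1**2 + L2**2 - r**2) / (2 * L1 * L2)))  # Eq. (23)
    theta2 = gamma + beta                                              # Eq. (24)
    theta3 = theta2 + alpha - math.pi / 2                              # Eq. (25)
    return theta1, theta2, theta3


def forward_kinematics(theta1, theta2, theta3):
    """Tool position from joint angles, Equations (15)-(17), used for the check."""
    reach = L2 * math.sin(theta3) + L1 * math.cos(theta2)
    x = -math.sin(theta1) * reach
    y = -L2 * math.cos(theta3) + L1 * math.sin(theta2) + L3
    z = math.cos(theta1) * reach - L4
    return x, y, z


angles = (0.3, 0.4, 0.2)
recovered = inverse_kinematics(*forward_kinematics(*angles))
print(angles, recovered)
```

For joint angles inside the branch covered by these equations, the recovered angles match the originals to floating point precision; positions outside the reachable workspace are clamped rather than rejected in this sketch.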

4.3. Kinesthetic Feedback Force

The kinesthetic feedback force allows the environment to be “felt”, i.e., when the SD comes into physical contact with an object, the MD will receive a counter force. This model can be implemented through the equation

$$\tau = J^{T} F, \tag{26}$$

where $\tau$ defines the torque array that will be applied to each joint ($\theta_{1}$, $\theta_{2}$ and $\theta_{3}$) of the PHANToM Omni associated with the MD, $J^{T}$ is the transpose of the Jacobian matrix and $F$ is the force array resulting from the interaction of the SD with the ENV. The torque array $\tau$ can be expressed as

$$\tau = [\tau_{1}, \tau_{2}, \tau_{3}]. \tag{27}$$

The Jacobian matrix $J$ incorporates structural information about the manipulator and is defined as

$$J = \begin{bmatrix} J_{11} & J_{12} & J_{13} \\ J_{21} & J_{22} & J_{23} \\ J_{31} & J_{32} & J_{33} \end{bmatrix}, \tag{28}$$

where

$$J_{11} = -\cos(\theta_{1}) (L_{2} \sin(\theta_{3}) + L_{1} \cos(\theta_{2})), \tag{29}$$

$$J_{21} = 0, \tag{30}$$

$$J_{31} = -L_{1} \cos(\theta_{2}) \sin(\theta_{1}) - L_{2} \sin(\theta_{3}) \sin(\theta_{1}), \tag{31}$$

$$J_{12} = L_{1} \sin(\theta_{1}) \sin(\theta_{2}), \tag{32}$$

$$J_{22} = L_{1} \cos(\theta_{2}), \tag{33}$$

$$J_{32} = -L_{1} \sin(\theta_{2}) \cos(\theta_{1}), \tag{34}$$

$$J_{13} = -L_{2} \sin(\theta_{1}) \cos(\theta_{3}), \tag{35}$$

$$J_{23} = L_{2} \sin(\theta_{3}), \tag{36}$$

and

$$J_{33} = L_{2} \cos(\theta_{3}) \cos(\theta_{1}). \tag{37}$$

The force array

F

is expressed as

F = [F_{x}, F_{y}, F_{z}]

(38)

and can be obtained through sensors internal or external to the device. According to Equation (26), the

τ

torque array representing the resulting force at each joint can be defined as

τ_{1} = J_{11} F_{x} + J_{21} F_{y} + J_{31} F_{z},

(39)

τ_{2} = J_{12} F_{x} + J_{22} F_{y} + J_{32} F_{z},

(40)

and

τ_{3} = J_{13} F_{x} + J_{23} F_{y} + J_{33} F_{z} .

(41)
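As a software cross-check of Equations (28)–(41), the Jacobian entries and the product τ = J^T F can be written directly in Python. This is only a sketch: the function names `jacobian` and `joint_torques` are hypothetical, and the link lengths used in the example are illustrative values, not the device's actual constants.

```python
import math


def jacobian(theta1, theta2, theta3, L1, L2):
    """Build the 3x3 Jacobian of Eq. (28) from Eqs. (29)-(37), row by row."""
    s1, c1 = math.sin(theta1), math.cos(theta1)
    s2, c2 = math.sin(theta2), math.cos(theta2)
    s3, c3 = math.sin(theta3), math.cos(theta3)
    return [
        # J11, J12, J13
        [-c1 * (L2 * s3 + L1 * c2), L1 * s1 * s2, -L2 * s1 * c3],
        # J21, J22, J23
        [0.0, L1 * c2, L2 * s3],
        # J31, J32, J33
        [-L1 * c2 * s1 - L2 * s3 * s1, -L1 * s2 * c1, L2 * c3 * c1],
    ]


def joint_torques(J, F):
    """Eqs. (39)-(41): tau_i = J_1i * Fx + J_2i * Fy + J_3i * Fz."""
    return [sum(J[k][i] * F[k] for k in range(3)) for i in range(3)]


# Illustrative values only (assumed link lengths in metres, force in newtons).
J = jacobian(0.2, 0.4, 0.1, L1=0.132, L2=0.132)
tau = joint_torques(J, F=[0.5, 1.0, -0.3])
print(tau)
```

Each torque component uses one column of J, so the three sums are independent of each other and can be evaluated concurrently.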

5. Simulated Tactile Internet Model

Figure 1 and Figure 4 detail the structure used for the FPGA hardware design, in which an operator, OP, handles a PHANToM Omni on the master side, MD, which is connected to the HMD, which in this case is dedicated FPGA hardware. Data are transmitted through the network, the NW subsystem, to the HSD, which is also dedicated FPGA hardware. The HSD is also connected to a PHANToM Omni that interacts with the environment, the ENV subsystem. Figure 4 also details the backwards direction, from the ENV to the OP.

The OP is modeled as an information source responsible for generating a spatial trajectory through discrete signals expressed in the

a (n)

array. At each n-th instant

t_{s}

the OP sends three variables

x^{O P} (n)

,

y^{O P} (n)

and

z^{O P} (n)

representing the positioning of the MD tool (Figure 2 and Figure 3) in Cartesian space, which is expressed by

a (n) = [x^{O P} (n), y^{O P} (n), z^{O P} (n)] .

(42)

Both devices, master and slave PHANToM Omni, and the structures that form the system were modeled and simulated on Matlab/Simulink [43] and Xilinx System Generator. This step simulates the spatial movement of the MD tool by the operator, that is, at each instant of time,

t_{s}

, a spatial movement is performed and a new signal

a (n)

is generated by the OP.

The PHANToM Omni has encoders at its three joints that translate the spatial positioning into the three angles

θ_{1}

,

θ_{2}

and

θ_{3}

(Figure 2 and Figure 3). Thus, based on Figure 4, it can be said that MD converts the signal

a (n)

into a signal expressed as

b (n) = [θ_{1}^{M D} (n), θ_{2}^{M D} (n), θ_{3}^{M D} (n)]

(43)

and forwards it to the HMD at every n-th instant of time

t_{s}

.

Then, as can be seen in Figure 4, the

b (n)

signal propagates to the HMD, which on receiving the signal transforms the joint positioning angles,

b (n)

, into a spatial position by calculating the FK according to Equations (15)–(17). All equations are implemented in FPGA through a hardware module called FK-HMD. The equations are implemented in parallel, which can significantly reduce the processing time. The use of FK is motivated by a reduction in the amount of information used, i.e., an N-DoF robotic manipulator generates N joint angles, which can be converted into only three values associated with the spatial position of the tool, x, y and z. On the other hand, this strategy increases the number of calculations to be performed by the MD, which is compensated by the parallel implementation of the algorithm in FPGA. It is essential to note that custom hardware operating in parallel keeps the processing time from being substantially affected by N.

Based on Section 3, after the FK calculation by the FK-HMD hardware module, a new discrete signal is created that can be expressed by

c (n) = [x^{H M D} (n), y^{H M D} (n), z^{H M D} (n)]

(44)

where

x^{H M D} (n)

,

y^{H M D} (n)

and

z^{H M D} (n)

are the values of the spatial coordinate array generated by the HMD to be sent to HSD via the communication medium, NW. The FK-HMD hardware module generates a new

c (n)

array every n-th instant of time.

After the transmission through the forward channel, here called FC, the signal received by the HSD can be expressed as

v (n) = [x^{H S D} (n), y^{H S D} (n), z^{H S D} (n)] .

(45)

Based on Equation (5) the spatial coordinate signal received by HSD can be expressed as

x^{H S D} (n) = x^{H M D} (n - d_{x}^{f} (n)) + r_{x}^{f} (n),

(46)

y^{H S D} (n) = y^{H M D} (n - d_{y}^{f} (n)) + r_{y}^{f} (n),

(47)

and

z^{H S D} (n) = z^{H M D} (n - d_{z}^{f} (n)) + r_{z}^{f} (n)

(48)

where

d_{x}^{f} (n)

,

d_{y}^{f} (n)

,

d_{z}^{f} (n)

,

r_{x}^{f} (n)

,

r_{y}^{f} (n)

and

r_{z}^{f} (n)

are the delays and noise terms associated with the FC.
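The channel model of Equations (46)–(48) can be illustrated per coordinate with a short Python sketch. Several choices below are assumptions made only for illustration: the delay is an integer number of samples (constant or given by a function of n), the noise is Gaussian, and the first sample is held until delayed data are available. The function name `channel` is hypothetical.

```python
import random


def channel(samples, delay, noise_std=0.0, seed=0):
    """Apply out(n) = in(n - d(n)) + r(n) to one coordinate of the signal."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    out = []
    for n in range(len(samples)):
        d = delay(n) if callable(delay) else delay
        k = max(n - d, 0)  # assumption: hold the first sample before data arrive
        out.append(samples[k] + rng.gauss(0.0, noise_std))
    return out


# Hypothetical x-coordinate trajectory, a 2-sample delay and mild noise.
x_hmd = [0.00, 0.01, 0.02, 0.03, 0.04]
x_hsd = channel(x_hmd, delay=2, noise_std=1e-4)
print(x_hsd)
```

Applying the same function independently to the x, y and z components reproduces the structure of Equations (46)–(48), where each coordinate has its own delay and noise term.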

Since, in this case, the slave PHANToM Omni, SD, copies the movement of the master PHANToM Omni, MD, the HSD must run a feedback control system on the three joints of the slave PHANToM Omni, here expressed as

θ^{S D} (n) = [θ_{1}^{S D} (n), θ_{2}^{S D} (n), θ_{3}^{S D} (n)]

(49)

that is,

θ_{1}^{S D} (n)

,

θ_{2}^{S D} (n)

,

θ_{3}^{S D} (n)

are control variables associated with the SD. The control system, illustrated in Figure 4 as FCS, must minimize the error,

e^{F C S} (n)

, between

θ^{S D} (n)

and the reference signal

θ^{H S D} (n)

characterized as

θ^{H S D} (n) = [θ_{1}^{H S D} (n), θ_{2}^{H S D} (n), θ_{3}^{H S D} (n)]

(50)

where

e^{F C S} (n) = θ^{H S D} (n) - θ^{S D} (n)

(51)

and

[\begin{matrix} e_{1}^{F C S} (n) \\ e_{2}^{F C S} (n) \\ e_{3}^{F C S} (n) \end{matrix}] = [\begin{matrix} θ_{1}^{H S D} (n) \\ θ_{2}^{H S D} (n) \\ θ_{3}^{H S D} (n) \end{matrix}] - [\begin{matrix} θ_{1}^{S D} (n) \\ θ_{2}^{S D} (n) \\ θ_{3}^{S D} (n) \end{matrix}] .

(52)

The

θ^{S D} (n)

signal is obtained from the SD via sensors (encoders) at the SD joints and the

θ^{H S D} (n)

signal is obtained from the IK-HSD hardware module shown in Figure 4. This hardware module implements all inverse kinematics equations presented in Section 4.2, i.e., Equations (18)–(25). There are several techniques and approaches that can be used in the FCS module ranging from more traditional techniques such as a proportional-integral-derivative controller [44] to more innovative artificial intelligence based techniques [45,46].
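As one concrete instance of the traditional option mentioned above, the sketch below shows a discrete proportional-integral-derivative controller applied per joint to the error of Equation (52). It is not the controller used in this work: the class and function names, the gains and the sampling time are all hypothetical and chosen only for illustration.

```python
class JointPID:
    """Discrete PID controller for one joint (one of many possible FCS designs)."""

    def __init__(self, kp, ki, kd, ts):
        self.kp, self.ki, self.kd, self.ts = kp, ki, kd, ts
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, theta_ref, theta_meas):
        error = theta_ref - theta_meas  # one row of Eq. (52)
        self.integral += error * self.ts
        derivative = (error - self.prev_error) / self.ts
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def fcs_step(controllers, theta_hsd, theta_sd):
    """Return one control value per joint for a single sampling instant."""
    return [c.step(r, m) for c, r, m in zip(controllers, theta_hsd, theta_sd)]


# Hypothetical gains and sampling time.
controllers = [JointPID(kp=2.0, ki=0.5, kd=0.01, ts=1e-3) for _ in range(3)]
u = fcs_step(controllers, theta_hsd=[0.1, 0.2, 0.3], theta_sd=[0.0, 0.2, 0.25])
print(u)
```

The three controllers share no state, so the per-joint updates are independent of each other.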

The CPD-HSD and JPD-HSD modules, illustrated in Figure 4, represent the prediction and detection algorithms in Cartesian space and joint space, respectively. These modules are responsible for minimizing the effect of the latency and noise added by the FC of the tactile internet system (Equations (46)–(48)). Depending on the prediction and detection technique used, the HSD may use only one of the modules, namely the CPD-HSD or the JPD-HSD. There is still no consensus on whether Cartesian space or joint space is better suited to minimizing the latency and noise introduced by the channel. Several works in the literature propose using only one of the spaces, while others try to use the information from both simultaneously.

Similarly to the FCS module, approaches ranging from the more traditional techniques up to more innovative techniques based on artificial intelligence have been used in the CPD-HSD and JPD-HSD modules [47,48,49,50,51]. Thus, it can be said that

θ^{H S D} (n)

is an estimate of the

b (n)

signal generated by the MD.

At each n-th time, the FCS acts on the SD through the

u (n)

signal, detailed in Figure 1 and Figure 4, which in the case of the PHANToM Omni can be expressed as

u^{H S D} (n) = [τ_{1}^{H S D} (n), τ_{2}^{H S D} (n), τ_{3}^{H S D} (n)]

(53)

where

τ_{i}^{H S D} (n)

is the torque applied to the i-th joint. The FCS will act as a tracking mechanism, making the SD follow the path traveled by the MD. Finalizing the data stream associated with the forward channel, it can be said that the

\hat{a} (n)

signal is an estimate of the spatial position generated by the OP, i.e.,

\hat{a} (n) = [{\hat{x}}^{O P} (n), {\hat{y}}^{O P} (n), {\hat{z}}^{O P} (n)] .

(54)

The interaction of the PHANToM Omni, SD, with ENV can vary from free movement to physical contact. When some kind of physical contact occurs, the SD detects the touch and sends this information back to the HSD. As per the model detailed in Figure 4 the ENV sends back to SD the information associated with the contact force in the spatial plane, expressed here as,

o (n) = [F_{x}^{E N V} (n), F_{y}^{E N V} (n), F_{z}^{E N V} (n)] .

(55)

The value associated with the contact force information can be measured directly through SD-coupled force sensors or indirectly estimated through other types of sensors that may be SD-coupled or inserted into the environment [52]. In the case of the model presented in Figure 4, the SD sends the HSD the spatial position of the object’s surface, obtained through sensors spread in the ENV. The signal expressed as

s^{O B J} (n) = [x^{O B J} (n), y^{O B J} (n), z^{O B J} (n)]

(56)

represents the spatial position of the closest object to the SD tool. Thus, based on the information already described, every n-th time

t_{s}

the SD sends to the HSD a signal characterized by the array

g (n)

expressed as

g (n) = [θ^{S D} (n), s^{O B J} (n)] .

(57)

In the HSD, when the signal

g (n)

is received, the Split module separates the

θ^{S D} (n)

signal and sends it to the FCS and the FK-HSD hardware module. In addition, the signal

s^{O B J} (n)

is sent to the FBF-HSD hardware module, as detailed in Figure 4. The FK-HSD hardware module performs the forward kinematics calculation in the same way as FK-HMD, so that the current spatial position of the SD tool in the environment, ENV, can be obtained. At every n-th instant

t_{s}

FK-HSD generates a signal expressed as

l (n) = [x^{E N V} (n), y^{E N V} (n), z^{E N V} (n)]

(58)

where

x^{E N V} (n)

,

y^{E N V} (n)

and

z^{E N V} (n)

are the spatial coordinates of the tool in the ENV module, computed from

θ^{S D} (n)

. The FBF-HSD hardware module implements the calculations associated with the generation of the feedback force from the contact between the tool and the object. Based on the work presented in [52] the contact force, represented by the

h (n)

signal, can be expressed as

h (n) = [F_{x}^{H S D} (n), F_{y}^{H S D} (n), F_{z}^{H S D} (n)],

(59)

where

F_{x}^{H S D} (n) = h_{x} (n) (x^{O B J} (n) - x^{E N V} (n)),

(60)

F_{y}^{H S D} (n) = h_{y} (n) (y^{O B J} (n) - y^{E N V} (n)),

(61)

and

F_{z}^{H S D} (n) = h_{z} (n) (z^{O B J} (n) - z^{E N V} (n)) .

(62)

In these equations, the constants

h_{x} (n)

,

h_{y} (n)

and

h_{z} (n)

represent the elasticity coefficients associated with the object. It is important to note that in this model the

h (n)

signal is a synthesized version of the real force value here characterized by the

o (n)

array.
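Equations (60)–(62) amount to a per-axis spring-like law, which the following Python sketch reproduces. The function name `feedback_force` and the numerical values (positions in metres, coefficients in N/m) are hypothetical.

```python
def feedback_force(obj, env, h):
    """Eqs. (60)-(62): F_k = h_k * (obj_k - env_k) for k in x, y, z."""
    return [hk * (o - e) for hk, o, e in zip(h, obj, env)]


# Hypothetical object surface position, tool position and elasticity coefficients.
s_obj = [0.10, 0.20, 0.30]
l_env = [0.10, 0.19, 0.32]
h_coef = [500.0, 500.0, 500.0]
print(feedback_force(s_obj, l_env, h_coef))
```

Each axis needs one subtraction and one multiplication and does not depend on the other two, so the three components can be computed at the same time.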

After the feedback force calculation process, as illustrated in Figure 4, the

h (n)

signal is transmitted to the HMD via the backwards channel (BC) which, similarly to FC, adds latency and noise. The signal received by the HMD can be expressed as

q (n) = [F_{x}^{H M D} (n), F_{y}^{H M D} (n), F_{z}^{H M D} (n)]

(63)

where

F_{x}^{H M D} (n) = F_{x}^{H S D} (n - d_{x}^{b} (n)) + r_{x}^{b} (n),

(64)

F_{y}^{H M D} (n) = F_{y}^{H S D} (n - d_{y}^{b} (n)) + r_{y}^{b} (n),

(65)

and

F_{z}^{H M D} (n) = F_{z}^{H S D} (n - d_{z}^{b} (n)) + r_{z}^{b} (n)

(66)

where

d_{x}^{b} (n)

,

d_{y}^{b} (n)

,

d_{z}^{b} (n)

,

r_{x}^{b} (n)

,

r_{y}^{b} (n)

and

r_{z}^{b} (n)

are the latencies and the noises associated with the BC.

Similarly to the HSD, the HMD minimizes the effect of latency and noise through operations in Cartesian and joint space. For the HMD, the calculations associated with the Cartesian space are performed by the CPD-HMD module and those associated with the joint space by the JPD-HMD module. In addition to the prediction and detection calculations, the HMD must transform the force signals received through signal

q (n)

into torques to be applied to the MD joints, which is accomplished by the KFF-HMD hardware module. KFF-HMD implements Equations (39)–(41), presented in Section 4.3, and generates the signal expressed as

p (n) = [τ_{1}^{H M D} (n), τ_{2}^{H M D} (n), τ_{3}^{H M D} (n)]

(67)

where

τ_{i}^{H M D} (n)

is the torque associated with the i-th joint of the MD. Since the PHANToM Omni is a haptic device, it already has a built-in control system, FCS, which uses as reference signal the torques associated with the

p (n)

array.

After applying the torques to the MD joints via the

p (n)

signal, the OP receives the feedback force signal; in other words, the OP feels the object touched by the SD in the ENV. This sensation is represented by the

\hat{o} (n)

signal expressed as

\hat{o} (n) = [{\hat{F}}_{x}^{E N V} (n), {\hat{F}}_{y}^{E N V} (n), {\hat{F}}_{z}^{E N V} (n)] .

(68)

As illustrated in Figure 4, the MD, HMD, NW, HSD, and SD subsystems have the following runtimes:

t_{M D}

,

t_{H M D}

,

t_{N W}

,

t_{H S D}

and

t_{S D}

, respectively. The sum of these times, taking into account the forward direction (from OP to ENV) and the backwards direction (from ENV to OP), represents the total system latency, which can be expressed as

t_{latency} = 2 (t_{MD} + t_{HMD} + t_{NW} + t_{HSD} + t_{SD}) .

(69)

Some works presented in the literature review agree that the ideal requirement is that

t_{latency} \leq 1 ms

; other works, however, point out that the latency requirement can be expressed as

t_{latency} \leq 10 ms

, depending on the application [9,10,11,12,53]. Considering that

30 %

of the total latency time

t_{latency}

is spent by MD, HMD, HSD, and SD, it can be understood that

(t_{MD} + t_{HMD} + t_{HSD} + t_{SD}) \leq \frac{0.3 t_{latency}}{2} .

(70)

Assuming an equal time division among MD, HMD, HSD, and SD it is possible to affirm that the time associated with hardware,

t_{hardware}

, whether the master, HMD, or the slave device, HSD, can be expressed as

t_{HMD} = t_{HSD} = t_{hardware} \leq \frac{0.3 t_{latency}}{8} .

(71)

Taking the

1 ms

and 10 ms constraints into consideration and substituting these values in Equation (71), it is possible to affirm that the hardware time,

t_{hardware}

, must meet the

t_{hardware} \leq 37.5 μ s

constraint for all cases (

1 ms

condition) or the

t_{hardware} \leq 375 μ s

constraint for some specific cases (

10 ms

condition).
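The arithmetic behind Equations (69)–(71) can be checked with a few lines of Python. The function name `hardware_budget` and its parameter names are hypothetical; the 30% device share and the equal split among the four blocks (MD, HMD, HSD and SD) are the assumptions stated in the text.

```python
def hardware_budget(t_latency, device_share=0.3, n_blocks=4):
    """Per-block time budget following Eqs. (70) and (71), in seconds."""
    one_way_devices = device_share * t_latency / 2.0  # Eq. (70): one direction
    return one_way_devices / n_blocks                 # Eq. (71): equal split


for t_latency in (1e-3, 10e-3):
    budget_us = hardware_budget(t_latency) * 1e6
    print(f"t_latency = {t_latency * 1e3:.0f} ms, t_hardware <= {budget_us:.1f} us")
```

The division by 2 accounts for the forward and backwards directions in Equation (69), and the division by 4 for the equal split, which together give the factor of 8 in Equation (71).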

Recent studies from the literature show that the

1 ms

restriction (

t_{hardware} \leq 37.5 μ s

) is difficult to achieve using hardware devices based on embedded systems such as microprocessors and microcontrollers [26,54]. The

10 ms

restriction (

t_{hardware} \leq 375 μ s

) is achieved in specific cases where the SD is a virtual environment and the HSD is a high-performance computer [53]. Thus, this work aims to minimize the execution time in the HMD,

t_{HMD}

, and HSD,

t_{HSD}

, using FPGA. In other words, the target is to achieve a

t_{hardware} \leq 37.5 μ s

.

This paper presents a hardware reference model for the FK-HMD, KFF-HMD, IK-HSD, FK-HSD, and FBF-HSD modules illustrated in Figure 4. The complete model, presented in detail in the next section, uses a parallel implementation methodology that prioritizes high throughput, i.e., it aims to minimize the execution times of the modules

t_{FK}

,

t_{KFF}

,

t_{IK}

and

t_{FBF}

, illustrated in Figure 4.

This work does not propose dedicated hardware reference models for the CPD-HSD, JPD-HSD, CPD-HMD, JPD-HMD and FCS modules as there are several techniques and algorithms that can be applied to them. However, considering the hardware time constraints,

t_{hardware}

, it is also important to use dedicated hardware structures, such as FPGA-based circuits, for these modules. Studies in the literature foresee the use of AI-based techniques for these modules; however, it is essential to note that AI techniques and algorithms implemented on general-purpose processor-based hardware platforms can lead to higher processing times [19,20,21,22,23,24,25].

6. Implementation Description

The FK-HMD and KFF-HMD hardware modules associated with the master device (HMD) and the IK-HSD, FK-HSD, and FBF-HSD hardware modules associated with the slave device (HSD) (Figure 4) were designed using a parallel implementation in order to prioritize the processing speed. The implementations were designed in FPGA using a hybrid scheme with fixed point and floating point representation in distinct parts of the proposed architecture. In the portions that adopt the fixed point format, the variables follow a notation expressed as

[s V . N]

indicating that the variable is formed by V bits of which N bits are intended for the fractional part and the s symbol indicates that the variable is signed. In this case, the number of bits intended for the integer part is

V - N - 1

. For the representation of floating point variables, the notation [F32] is adopted. Most of the implemented circuits were designed using the 32-bit single-precision (IEEE 754) floating point format. The fixed point format was used only in the circuit that implements the trigonometric function block (TFB) module, as illustrated in Figure 5. The TFB is the module responsible for performing trigonometric operations through a hardware implementation of CORDIC (COordinate Rotation DIgital Computer) [55], for which a Xilinx CORDIC IP Core was used. This implementation represents data in fixed point format using the

[s 16.13]

representation.
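To give an idea of what a CORDIC block computes, the sketch below models the rotation-mode algorithm for sine and cosine in Python. It is a floating-point model of the generic algorithm, not a description of the Xilinx CORDIC IP Core: the function name `cordic_sin_cos` and the choice of 16 iterations are assumptions, and the multiplications by powers of two stand in for the bit shifts a hardware implementation would use.

```python
import math


def cordic_sin_cos(angle, iterations=16):
    """Rotation-mode CORDIC; valid for |angle| up to about 1.74 rad."""
    # Elementary rotation angles atan(2^-i), normally stored in a small table.
    table = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Pre-compensate the constant gain accumulated by the micro-rotations.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = gain, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0  # rotate towards a zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * table[i]
    return y, x  # (sin(angle), cos(angle))


s, c = cordic_sin_cos(math.pi / 4)
print(s, c, math.sin(math.pi / 4), math.cos(math.pi / 4))
```

Each iteration uses only additions, shifts and one table lookup, which is why the algorithm maps well onto FPGA logic; the residual angle, and hence the error, roughly halves with every iteration.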

As illustrated in Figure 5, the TFB module receives data from external circuits in the 32-bit floating point format. A conversion to the fixed point representation denoted by the

[s 16.13]

notation is performed through the Float to Fixed-point (F2FP) module that has been implemented in hardware. After the CORDIC hardware operations are performed, the data in the fixed point format is transformed back to the 32-bit floating point through the Fixed-point to Float (FP2F) module which was also implemented in hardware.
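The effect of the F2FP and FP2F conversions can be modelled in Python as follows. With V = 16 and N = 13, the format has V - N - 1 = 2 integer bits, so it covers the range [-4, 4) with a resolution of 2^-13. The rounding mode (round to nearest) and the saturation on overflow are assumptions of this sketch, since the text does not specify them, and the function names are hypothetical.

```python
import math

V, N = 16, 13                      # [s16.13]: 16 bits, 13 of them fractional
SCALE = 1 << N                     # 2^13 = 8192 steps per unit
Q_MIN, Q_MAX = -(1 << (V - 1)), (1 << (V - 1)) - 1


def float_to_fixed(x):
    """F2FP model: quantize to a signed 16-bit integer code."""
    q = math.floor(x * SCALE + 0.5)     # assumed: round to nearest
    return max(Q_MIN, min(Q_MAX, q))    # assumed: saturate on overflow


def fixed_to_float(q):
    """FP2F model: convert the integer code back to a real value."""
    return q / SCALE


angle = math.pi / 2
code = float_to_fixed(angle)
print(code, fixed_to_float(code), abs(fixed_to_float(code) - angle))
```

For in-range values the round-trip error of this model is at most half a step, i.e., 2^-14, which bounds the quantization error introduced at the TFB input under these assumptions.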

Several of the proposed methods to be presented use the constants

L_{1}

,

L_{2}

,

L_{3}

and

L_{4}

. They represent physical characteristics of the PHANToM Omni device as illustrated in Figure 2. These constants use the 32-bit floating point numeric representation.

6.1. Forward Kinematics (FK-HMD and FK-HSD)

As illustrated in Figure 4, both the hardware associated with the master device (HMD) and the hardware associated with the slave device (HSD) implement forward kinematics through the FK-HMD and FK-HSD modules, respectively. These modules have the same FPGA-implemented circuit, differing only in the input and output signals. They are designed to work with three input signals, one for each component of the angular positioning of the device’s joints, and three output signals, one for each component of the position of the device’s tool in the Cartesian system. The input signals are

θ_{1} [F 32] (n)

,

θ_{2} [F 32] (n)

and

θ_{3} [F 32] (n)

and the output signals are

x [F 32] (n)

,

y [F 32] (n)

and

z [F 32] (n)

. For FK-HMD, the input signals represent the

θ_{1}^{M D} [F 32] (n)

,

θ_{2}^{M D} [F 32] (n)

and

θ_{3}^{M D} [F 32] (n)

signals, and the output signals represent the

x^{H M D} [F 32] (n)

,

y^{H M D} [F 32] (n)

and

z^{H M D} [F 32] (n)

signals. In the case of the FK-HSD module, the input signals represent the signals

θ_{1}^{S D} [F 32] (n)

,

θ_{2}^{S D} [F 32] (n)

and

θ_{3}^{S D} [F 32] (n)

and the output signals represent the signals

x^{E N V} (n)

,

y^{E N V} (n)

and

z^{E N V} (n)

. At every n-th instant, all the computations required to calculate the forward kinematics are executed in parallel.

Based on Equation (15), the algorithm used for calculating

x [F 32] (n)

was implemented in FPGA through the generic circuit illustrated in Figure 6. The circuit was designed to work with three input signals

θ_{1} [F 32] (n)

,

θ_{2} [F 32] (n)

and

θ_{3} [F 32] (n)

and one output signal. These signals are forwarded to TFB sub circuits where sine and cosine calculations are performed. For this process the constants

L_{1}

and

L_{2}

, three multipliers, one inverter and one adder are used.

The calculation of

y [F 32] (n)

based on Equation (16) was implemented in FPGA through the generic circuit shown in Figure 7. The circuit was designed to work with two input signals

θ_{2} [F 32] (n)

and

θ_{3} [F 32] (n)

and one output signal. These signals are routed to TFB sub circuits to perform sine and cosine calculations. In the process flow two multipliers, two adders, one inverter and the constants

L_{1}

and

L_{2}

are used.

The generic circuit illustrated in Figure 8 was implemented in FPGA to perform the calculation of

z [F 32] (n)

and it is based on Equation (17). The circuit is designed to work with three input signals

θ_{1} [F 32] (n)

,

θ_{2} [F 32] (n)

and

θ_{3} [F 32] (n)

and one output signal. These signals are routed to TFB sub circuits in order to perform sine and cosine calculations. In the process flow four multipliers, two adders, one inverter and the constants

L_{1}

,

L_{2}

and

L_{4}

are used.

In the FK-HMD module the

θ_{1}^{M D} [F 32] (n)

,

θ_{2}^{M D} [F 32] (n)

and

θ_{3}^{M D} [F 32] (n)

input signals are received through the

b (n)

array (Equation (43) in Section 5); then all calculations are performed in parallel, resulting in the

c (n)

array (Equation (44) in Section 5) with the

x^{H M D} [F 32] (n)

,

y^{H M D} [F 32] (n)

and

z^{H M D} [F 32] (n)

signals as shown in Figure 4. For the FK-HSD module the

θ_{1}^{S D} [F 32] (n)

,

θ_{2}^{S D} [F 32] (n)

and

θ_{3}^{S D} [F 32] (n)

input signals enter the module via the

θ^{S D} (n)

array (Equation (49) in Section 5) and after performing all parallel computations, the resulting signals

x^{E N V} (n)

,

y^{E N V} (n)

and

z^{E N V} (n)

are output from the module via the

l (n)

array (Equation (58) in Section 5).
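A software counterpart of the FK module can be sketched as below. Equations (15)–(17) are not repeated in this section, so the closed form used here is an assumption: it was chosen because its partial derivatives with respect to the joint angles agree with the Jacobian entries in Equations (29)–(37). The link constants are illustrative values, selected so that the zero-angle pose matches the starting position reported in Section 7, and the function name is hypothetical.

```python
import math

# Illustrative constants in metres (assumed, not taken from the paper).
L1, L2, L3, L4 = 0.132, 0.132, 0.025, 0.167


def forward_kinematics(t1, t2, t3, l1=L1, l2=L2, l3=L3, l4=L4):
    """Assumed FK form: joint angles (rad) to tool position (x, y, z)."""
    reach = l2 * math.sin(t3) + l1 * math.cos(t2)  # shared by x and z
    x = -math.sin(t1) * reach
    y = l3 - l2 * math.cos(t3) + l1 * math.sin(t2)
    z = -l4 + math.cos(t1) * reach
    return x, y, z


print(forward_kinematics(0.0, 0.0, 0.0))
print(forward_kinematics(math.pi / 2, math.pi / 4, math.pi / 4))
```

In this form, x, y and z share only the sine and cosine values of the joint angles, so once those are available the three coordinates can be evaluated independently.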

6.2. Inverse Kinematics (IK-HSD)

The hardware associated with the slave device (HSD) implements the inverse kinematics through the IK-HSD module, as shown in Figure 4. The IK-HSD FPGA-implemented circuit is designed to work with three input signals

x^{H S D} [F 32] (n)

,

y^{H S D} [F 32] (n)

and

z^{H S D} [F 32] (n)

and three output signals

θ_{1}^{H S D} [F 32] (n)

,

θ_{2}^{H S D} [F 32] (n)

and

θ_{3}^{H S D} [F 32] (n)

. However, to calculate

θ_{2}^{H S D} [F 32] (n)

(Equation (24)) and

θ_{3}^{H S D} [F 32] (n)

(Equation (25)) it is first necessary to perform intermediate calculations to obtain the values of

R [F 32] (n)

,

r [F 32] (n)

,

β [F 32] (n)

,

γ [F 32] (n)

and

α [F 32] (n)

. Based on Equations (18), (24) and (25), the algorithms for calculating

θ_{1}^{H S D} [F 32] (n)

,

θ_{2}^{H S D} [F 32] (n)

and

θ_{3}^{H S D} [F 32] (n)

were implemented in FPGA through the generic circuits illustrated in Figure 9, Figure 10 and Figure 11 respectively.

As already described, and according to the illustrations shown in Figure 10 and Figure 11, to perform the calculations of

θ_{2}^{H S D} [F 32] (n)

and

θ_{3}^{H S D} [F 32] (n)

it is first necessary to perform the intermediate calculations of

γ [F 32] (n)

(Equation (21)),

β [F 32] (n)

(Equation (22)) and

α [F 32] (n)

(Equation (23)). However, these calculations depend on the calculation of

R [F 32] (n)

and

r [F 32] (n)

. Then, when the IK-HSD module receives the input signals at every n-th instant the circuit shown in Figure 9 performs the calculation of

θ_{1}^{H S D} [F 32] (n)

in parallel with the generic circuits illustrated in Figure 12 and Figure 13 which were implemented in FPGA to perform the calculation of

R [F 32] (n)

and

r [F 32] (n)

based on Equations (19) and (20).

The circuit shown in Figure 12, used to obtain

R [F 32] (n)

, is designed to work with two input signals

x^{H S D} [F 32] (n)

and

z^{H S D} [F 32] (n)

and one output signal. This design contains two multipliers, two adders, the

L_{4}

constant and a sub-circuit called Sqrt, which was implemented in hardware to calculate the square root.

The

r [F 32] (n)

calculation is performed through the circuit shown in Figure 13. This circuit is designed to work with three input signals

x^{H S D} [F 32] (n)

,

y^{H S D} [F 32] (n)

and

z^{H S D} [F 32] (n)

and one output signal. The circuit consists of three multipliers, four adders, one inverter, the constants

L_{3}

and

L_{4}

, and, again, the

Sqrt

sub-circuit.

After the parallel processing of

θ_{1}^{H S D} [F 32] (n)

,

R [F 32] (n)

and

r [F 32] (n)

, the circuits responsible for calculating

γ [F 32] (n)

,

β [F 32] (n)

and

α [F 32] (n)

are also executed in parallel through the FPGA implementations of the generic circuits illustrated in Figure 14, Figure 15 and Figure 16. The value of

γ [F 32] (n)

is obtained through the circuit shown in Figure 14 which is based on Equation (21). The circuit is designed to work with an input signal

r [F 32] (n)

and one output signal. It consists of five multipliers, two adders, one divider, one TFB sub-circuit to calculate the arccosine and the constants

L_{1}

and

L_{2}

.

The circuit for obtaining

β [F 32] (n)

illustrated in Figure 15 is based on Equation (22) and is designed to work with two input signals

y^{H S D} [F 32] (n)

and

R [F 32] (n)

and one output signal. The circuit is composed of one adder, one inverter, a TFB sub-circuit to perform the arctangent calculation and the

L_{3}

constant.

The value of

α [F 32] (n)

is obtained from the circuit shown in Figure 16 which is based on Equation (23) and is designed to work with an input signal

r [F 32] (n)

and one output signal. The circuit is composed of five multipliers, two adders, one inverter, one divider, one TFB sub-circuit to perform the arccosine calculation and the constants

L_{1}

and

L_{2}

.

To complete the process, after performing the calculations of

β [F 32] (n)

,

γ [F 32] (n)

and

α [F 32] (n)

, it is possible to obtain the

θ_{2}^{H S D} [F 32] (n)

and

θ_{3}^{H S D} [F 32] (n)

values in parallel through the circuits shown in Figure 10 and Figure 11.

6.3. Kinesthetic Feedback Force (KFF-HMD)

As illustrated in Figure 4, the hardware associated with the master device (HMD) implements the kinesthetic feedback force through the KFF-HMD module. Based on Equation (26), the KFF-HMD module was implemented in FPGA through the generic circuit illustrated in Figure 17. This circuit is composed of sub-circuits that correspond to parts of Equation (26). The sub-circuit called JM, described in Equation (28), is responsible for calculating the Jacobian matrix. The KFF sub-circuit combines the output of the Jacobian matrix (JM) module with the force array from Equation (38), as in Equations (39)–(41).

The circuit shown in Figure 17 has the input signals

θ_{1}^{M D} [F 32] (n)

,

θ_{2}^{M D} [F 32] (n)

and

θ_{3}^{M D} [F 32] (n)

that are received from the master device (MD) and also the

F_{x} [F 32] (n)

,

F_{y} [F 32] (n)

and

F_{z} [F 32] (n)

signals that are received from the hardware associated to the slave device (HSD). The three output signals are:

τ_{1}^{H M D} [F 32] (n)

,

τ_{2}^{H M D} [F 32] (n)

and

τ_{3}^{H M D} [F 32] (n)

.

The JM module that represents the sub-circuit responsible for performing the Jacobian matrix calculation consists of nine elements:

J_{11} [F 32] (n)

,

J_{21} [F 32] (n)

,

J_{31} [F 32] (n)

,

J_{12} [F 32] (n)

,

J_{22} [F 32] (n)

,

J_{32} [F 32] (n)

,

J_{13} [F 32] (n)

,

J_{23} [F 32] (n)

and

J_{33} [F 32] (n)

. The calculation of

J_{21} [F 32] (n)

based on Equation (30) does not have an associated circuit since its value is 0, i.e.,

J_{21} [F 32] (n) = 0

. Based on Equation (29), the algorithm for calculating

J_{11} [F 32] (n)

was implemented in FPGA according to the generic circuit illustrated in Figure 18. The circuit was designed to work with three input signals and one output signal. It uses the constants

L_{1}

and

L_{2}

and has three TFB sub-circuits: two for performing the cosine calculation and one for obtaining the sine value.

The calculation of

J_{31} [F 32] (n)

, based on Equation (31), was implemented in FPGA according to the generic circuit illustrated in Figure 19. The circuit was designed to work with three input signals and one output signal. The circuit has three TFB modules, two for sine calculation and one for cosine value and uses the

L_{1}

and

L_{2}

constants.

The generic circuit illustrated in Figure 20 was implemented in FPGA to perform the calculation of

J_{12} [F 32] (n)

and is based on Equation (32). The circuit was designed to work with two input signals and one output signal. The circuit has two TFB sub circuits to perform sine calculation and uses the

L_{1}

constant.

Based on Equation (33), the algorithm for calculating

J_{22} [F 32] (n)

was implemented in FPGA according to the generic circuit illustrated in Figure 21. The circuit was designed to work with one input signal and one output signal. The circuit has a TFB sub-circuit to perform cosine calculation and uses the constant

L_{1}

.

The calculation of

J_{32} [F 32] (n)

based on Equation (34) was implemented in FPGA according to the generic circuit illustrated in Figure 22. The circuit was designed to work with two input signals and one output signal. In addition to the use of the constant

L_{1}

, the circuit has two TFB sub circuits, one for performing the cosine calculation and one for the sine.

The generic circuit illustrated in Figure 23 was implemented in FPGA to perform the calculation of

J_{13} [F 32] (n)

and is based on Equation (35). The circuit was designed to work with two input signals and one output signal. In addition to using the constant

L_{2}

, the circuit has two TFB sub circuits, one for performing cosine calculation and one for the sine.

Based on Equation (36), the algorithm for calculating

J_{23} [F 32] (n)

was implemented in FPGA according to the generic circuit illustrated in Figure 24. The circuit was designed to work with one input signal and one output signal. The circuit contains a TFB sub-circuit to perform the sine calculation and uses the

L_{2}

constant.

The calculation of

J_{33} [F 32] (n)

, based on Equation (37), was implemented in FPGA according to the generic circuit illustrated in Figure 25. The circuit was designed to work with two input signals and one output signal. In addition to the use of constant

L_{2}

, the circuit has two TFB sub-circuits to perform the cosine calculation.

All displayed circuits related to the JM sub-circuits are calculated in parallel at each n-th instant. The results are then sent to the KFF module which also performs the calculations of

τ_{1}^{H M D} [F 32] (n)

,

τ_{2}^{H M D} [F 32] (n)

and

τ_{3}^{H M D} [F 32] (n)

in parallel. The KFF circuit shown in Figure 17 is designed to work with twelve input signals and three output signals.

Based on Equation (39), the algorithm for calculating

τ_{1}^{H M D} [F 32] (n)

was implemented in FPGA according to the generic circuit illustrated in Figure 26. The circuit was designed to work with six inputs and one output.

The calculation of

τ_{2}^{H M D} [F 32] (n)

based on Equation (40) was implemented in FPGA according to the generic circuit illustrated in Figure 27. The circuit was designed to work with six inputs and one output.

The generic circuit illustrated in Figure 28 has been implemented in FPGA to perform the calculation of

τ_{3}^{H M D} [F 32] (n)

and it is based on Equation (41). The circuit was designed to work with six inputs and one output.

6.4. Feedback Force (FBF-HSD)

As illustrated in Figure 4 the hardware associated with the slave device (HSD) implements the feedback force via the FBF-HSD module. The FPGA-implemented circuit of the FBF-HSD module is designed to work with six input signals and three output signals. Among the six input variables,

x^{O B J} [F 32] (n)

,

y^{O B J} [F 32] (n)

and

z^{O B J} [F 32] (n)

represent the spatial position of the closest object to the SD tool and the other three

x^{E N V} [F 32] (n)

,

y^{E N V} [F 32] (n)

and

z^{E N V} [F 32] (n)

represent the spatial position of the SD tool in the ENV module. The three outputs

F_{x}^{H S D} [F 32] (n)

,

F_{y}^{H S D} [F 32] (n)

and

F_{z}^{H S D} [F 32] (n)

represent the touch of the tool on the object. The variables

h_{x}

,

h_{y}

and

h_{z}

represent the elasticity coefficients associated with the object. All FBF-HSD module calculations are performed in parallel.

Based on Equation (60), the algorithm used for calculating

F_{x}^{H S D} [F 32] (n)

was implemented in FPGA according to the generic circuit illustrated in Figure 29. The circuit was designed to work with two input signals

x^{O B J} [F 32] (n)

and

x^{E N V} [F 32] (n)

and one variable

h_{x}

.

The calculation of

F_{y}^{H S D} [F 32] (n)

, based on Equation (61), was implemented in FPGA according to the generic circuit illustrated in Figure 30. The circuit was designed to work with two input signals

y^{O B J} [F 32] (n)

and

y^{E N V} [F 32] (n)

and one variable

h_{y}

.

The generic circuit shown in Figure 31 was implemented in FPGA to perform the calculation of

F_{z}^{H S D} [F 32] (n)

and it is based on Equation (62). The circuit was designed to work with two input signals

z^{O B J} [F 32] (n)

and

z^{E N V} [F 32] (n)

and one variable

h_{z}

.
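To illustrate why the three force components can be evaluated in parallel, the following minimal sketch computes them independently per axis. Equations (60) to (62) are not reproduced in this section, so the linear elastic law used here, F = h (p^{OBJ} - p^{ENV}), is an assumption, as are the names `Vec3` and `feedback_force` and all numeric values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vec3:
    x: float
    y: float
    z: float


def feedback_force(obj: Vec3, env: Vec3, h: Vec3) -> Vec3:
    """Per-axis feedback force under an ASSUMED linear elastic law.

    Assumption (not taken from the paper): F = h * (obj - env) on each axis.
    No axis depends on another, which is what allows a hardware
    implementation to evaluate all three at the same time.
    """
    return Vec3(
        x=h.x * (obj.x - env.x),
        y=h.y * (obj.y - env.y),
        z=h.z * (obj.z - env.z),
    )


force = feedback_force(
    obj=Vec3(0.10, 0.00, -0.05),   # hypothetical object position
    env=Vec3(0.08, 0.00, -0.05),   # hypothetical tool position
    h=Vec3(100.0, 100.0, 100.0),   # hypothetical elasticity coefficients
)
```

Each output uses only two inputs and one coefficient, matching the two-input, one-variable interface described for the circuits in Figures 29 to 31.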

7. Results

The entire tactile internet model infrastructure presented in Figure 4 was implemented in order to validate the FPGA hardware implementation. A spatial trajectory representing the data sent by the OP through the signal a(n) (Equation (42)) was created to validate the entire developed environment.

The created trajectory varies all three angles of the MD articulation (Figure 3). The MD is first considered to be in the initial angular position θ_{1}^{MD}(0) = 0, θ_{2}^{MD}(0) = 0 and θ_{3}^{MD}(0) = 0, which corresponds to the spatial position x^{OP}(0) = 0, y^{OP}(0) = -0.107 and z^{OP}(0) = -0.035 of the tool, as illustrated in Figure 32. Initially, the first joint is moved to θ_{1}^{MD}(vn) = π/2, where v is the number of samples corresponding to 4 s, resulting in the position x^{OP}(vn) = -0.132, y^{OP}(vn) = -0.107 and z^{OP}(vn) = -0.167. Then, the second joint is moved to θ_{2}^{MD}(vn) = π/4, which results in the position x^{OP}(vn) = -0.093, y^{OP}(vn) = -0.013 and z^{OP}(vn) = -0.167. Finally, the third joint is moved to θ_{3}^{MD}(vn) = π/4, resulting in the position x^{OP}(vn) = -0.186, y^{OP}(vn) = 0.025 and z^{OP}(vn) = -0.167. The path created is within the limits of the device workspace and takes a total time of t_{1} = 12 s, of which 4 s are used to move each joint.
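The joint-space trajectory described above can be sketched as follows. The sketch assumes a linear ramp for each joint (the interpolation profile is not stated in this section) and 400 samples per 4 s segment, which follows from the 1200 samples over 12 s reported for the test; the function and constant names are illustrative only.

```python
import math

SAMPLES_PER_JOINT = 400   # v: 4 s per joint, assuming 1200 samples over 12 s
TARGETS = (math.pi / 2, math.pi / 4, math.pi / 4)  # final angles of joints 1..3


def joint_trajectory(targets=TARGETS, v=SAMPLES_PER_JOINT):
    """Return a list of (theta1, theta2, theta3) tuples, one per sample.

    Joints move one at a time; each ramp is linear (an assumption) and
    reaches its target on the last sample of its segment.
    """
    angles = [0.0, 0.0, 0.0]          # initial position: all joints at 0
    samples = []
    for joint, target in enumerate(targets):
        for k in range(1, v + 1):
            angles[joint] = target * (k / v)
            samples.append(tuple(angles))
    return samples


trajectory = joint_trajectory()
```

Feeding each tuple through a forward kinematics model would yield the Cartesian way-points quoted in the text; that model is defined earlier in the paper and is not reproduced here.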

To validate the circuits of the modules implemented in FPGA, equivalent software models were used and the results of both implementations were compared. The software models use a 32-bit floating-point format, while the hardware modules run a parallel implementation with a hybrid representation that uses both 32-bit floating-point and fixed-point representations in different parts of the proposed architecture, as presented in Section 6. In all scenarios, the signal sampling rate (or throughput) was R_{s} = \frac{1}{t_{s}} (samples per second), where t_{s} is the time between consecutive samples.

From the experimental results, the mean square error (MSE) between the software model and the hardware implementation proposed in this work was calculated as

MSE = \frac{1}{Q} \sum_{n = 0}^{Q - 1} \left( M^{SW}[F32](n) - M[F32](n) \right)^{2},

(72)

where Q represents the number of tested samples, M^{SW}[F32](n) corresponds to the variables of the software model and M[F32](n) corresponds to the variables of the model implemented in FPGA.
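As a minimal sketch of how Equation (72) can be evaluated, the function below computes the MSE between two sample sequences. The `quantize` helper and its 16 fractional bits are hypothetical stand-ins used only to produce a hardware-like signal; they do not reflect the word lengths of the proposed architecture.

```python
def mean_square_error(reference, measured):
    """MSE between two equal-length sequences, as in Equation (72)."""
    if len(reference) != len(measured):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("at least one sample is required")
    q = len(reference)
    return sum((sw - hw) ** 2 for sw, hw in zip(reference, measured)) / q


def quantize(value, fractional_bits=16):
    """Round to a fixed-point grid (hypothetical 16 fractional bits)."""
    scale = 1 << fractional_bits
    return round(value * scale) / scale


software = [0.001 * n for n in range(1200)]   # stand-in for M^SW[F32](n)
hardware = [quantize(s) for s in software]    # stand-in for M[F32](n)
error = mean_square_error(software, hardware)
```

With rounding to 16 fractional bits, each sample differs by at most 2^{-17}, so the resulting MSE cannot exceed 2^{-34}.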

The number of tested samples for the results presented here is Q = 1200, which corresponds to the number of samples in the generated trajectory. The variables of the hardware model, M[F32](n), depend on the module being evaluated. For forward kinematics, since the FK-HMD and FK-HSD modules share the same implementation, the variables x[F32](n), y[F32](n) and z[F32](n) change according to the respective module. For the FK-HMD module, these variables correspond to x^{HMD}[F32](n), y^{HMD}[F32](n) and z^{HMD}[F32](n), and for the FK-HSD module they correspond to x^{ENV}[F32](n), y^{ENV}[F32](n) and z^{ENV}[F32](n), as presented in Section 6. For inverse kinematics, the variables M[F32](n) of the IK-HSD module correspond to θ_{1}^{HSD}[F32](n), θ_{2}^{HSD}[F32](n) and θ_{3}^{HSD}[F32](n). For the kinesthetic feedback force, the variables M[F32](n) of the KFF-HMD module correspond to τ_{1}^{HMD}[F32](n), τ_{2}^{HMD}[F32](n) and τ_{3}^{HMD}[F32](n). For the feedback force, the variables M[F32](n) of the FBF-HSD module correspond to F_{x}^{HSD}[F32](n), F_{y}^{HSD}[F32](n) and F_{z}^{HSD}[F32](n). Finally, in the MSE equation, M^{SW}[F32](n) corresponds to the same variables in the software-implemented model.

Table 1 shows the mean square error between the software models and the hardware models proposed in this paper. The MSE values are small, showing that the forward kinematics (FK-HMD and FK-HSD), inverse kinematics (IK-HSD), kinesthetic feedback force (KFF-HMD) and feedback force (FBF-HSD) modules had an acceptable response, even when using a hybrid representation, compared to the software model that uses a floating-point representation. For the variables of the FK-HMD and FK-HSD modules, the error was on the order of 10^{-8}; for the IK-HSD module, 10^{-6}; for the KFF-HMD module, 10^{-7}; and for the FBF-HSD module, 10^{-16}. These values demonstrate that the FPGA implementations behaved equivalently to the software models.

In a hardware implementation, it is important to analyze post-synthesis requirements such as hardware resource usage and execution time. In FPGAs, resource usage is measured through lookup tables (LUTs), registers and digital signal processor (DSP) units, to name a few. After validating the hardware-implemented models, synthesis results were obtained for a Xilinx Virtex 6 XC6VLX240T-1FF1156 FPGA. This Virtex 6 FPGA has 37,680 slices that group 301,440 flip-flops, 150,720 logical cells that can be used to implement logical functions or memories, and 768 DSP cells with multipliers and accumulators. The implementations and results were produced with Matlab/Simulink and the Xilinx System Generator.

Table 2 presents the post-synthesis results related to FPGA resource utilization, sampling time, and throughput for the modules FK-HMD, KFF-HMD, FK-HSD, IK-HSD, and FBF-HSD. The first column shows the name of the module, and the next three columns (registers, LUTs and multipliers) give the amounts of FPGA resources used. The registers column gives the number of flip-flops used, followed by the percentage of the total. The LUTs column gives the number of LUTs used, followed by the percentage of the total. The multipliers column gives the number of internal DSP48 multipliers used, followed by the percentage of the total. The t_{s} column gives the sampling time, in nanoseconds, obtained for each hardware module. Finally, the R_{s} column gives the throughput (R_{s} = \frac{1}{t_{s}}), in mega-samples per second, for the hardware modules.

The synthesis results presented in Table 2 show that the resources used for the FK-HMD and FK-HSD modules were the same. Each of these modules, individually, used 3041 registers (1.01% of the available registers), 5.31% of the LUTs, and 1.43% of the embedded DSP48 multipliers. The IK-HSD module consumed 1.04% of the registers, 9.36% of the LUTs and 3.52% of the multipliers. The KFF-HMD module consumed 1.03%, 8.13% and 6.25% of the registers, LUTs and multipliers, respectively. Finally, the FBF-HSD module used 0.11% of the registers, 0.82% of the LUTs and 1.17% of the multipliers.

Based on the data presented in Table 2, the HMD modules (FK-HMD and KFF-HMD), which are associated with the MD device, consumed 6154 registers (2.04%), 20,259 LUTs (13.44%) and 59 multipliers (7.68%). For the hardware associated with the SD device, the HSD modules (FK-HSD, IK-HSD and FBF-HSD) consumed 6513 registers (2.16%), 23,351 LUTs (15.49%) and 47 multipliers (6.12%).

The hardware resources consumed by the HMD hardware modules and the HSD hardware modules were low. Even if all modules are implemented in a single FPGA, the consumption remains low. The total hardware resources used in the FPGA by all modules (FK-HMD, KFF-HMD, FK-HSD, IK-HSD and FBF-HSD) were 12,667 registers (4.20%), 43,610 LUTs (28.93%) and 106 multipliers (13.80%). This low resource consumption shows that the proposed implementations occupy little of the FPGA, which allows other, separate implementations to run on the same device concurrently.
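The totals quoted above can be reproduced with a short calculation. The sketch below takes the per-device subtotals and the Virtex 6 capacities from the text; it assumes the 150,720 logical cells are the LUT capacity, which is consistent with the reported percentages. The names `DEVICE`, `HMD`, `HSD` and `utilization` are illustrative.

```python
# Device capacities quoted in the text for the Virtex 6 XC6VLX240T.
DEVICE = {"registers": 301_440, "luts": 150_720, "multipliers": 768}

# Per-device subtotals quoted in the text (from Table 2).
HMD = {"registers": 6_154, "luts": 20_259, "multipliers": 59}
HSD = {"registers": 6_513, "luts": 23_351, "multipliers": 47}


def utilization(used):
    """Return {resource: (count, percent of device)} rounded to 2 decimals."""
    return {k: (n, round(100 * n / DEVICE[k], 2)) for k, n in used.items()}


total = {k: HMD[k] + HSD[k] for k in DEVICE}
report = utilization(total)
```

The same function applied to `HMD` or `HSD` alone yields the per-device percentages given in the preceding paragraph.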

As per Table 2, the throughput values, R_{s}, were high: 21.27 MSps for the FK-HMD and FK-HSD modules, 4.58 MSps for the IK-HSD module, 14.28 MSps for the KFF-HMD module and 47.61 MSps for the FBF-HSD module. These results make the modules suitable for critical applications with strict time constraints, as is the case with tactile internet applications. The times presented in Table 2 correspond to the critical path (the path with the maximum delay in the entire design) on the FPGA.
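The relation R_{s} = 1/t_{s} can be checked against the per-module times reported in this section (47 ns, 218 ns, 70 ns and 21 ns). The sketch below is illustrative; the truncation to two decimals is an assumption chosen because it reproduces the quoted figures, and the helper names are not from the paper.

```python
import math

# Sampling times t_s in nanoseconds, as quoted for each module.
T_S_NS = {"FK-HMD": 47, "FK-HSD": 47, "IK-HSD": 218, "KFF-HMD": 70, "FBF-HSD": 21}


def throughput_msps(t_s_ns):
    """R_s = 1 / t_s, with t_s in ns and the result in mega-samples per second."""
    return 1_000 / t_s_ns


def truncate(value, decimals=2):
    """Drop digits beyond `decimals` without rounding up."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


rates = {name: truncate(throughput_msps(t)) for name, t in T_S_NS.items()}
```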

Table 3 shows the speedup obtained relative to the latency time constraints. The first column shows the latency constraints of 1 ms and 10 ms [9,10,11,12]. The second column shows the latency limit for the hardware, that is, the value within which the hardware must respond for the application to function normally (these values are estimates obtained from Equation (71) for both time constraints). The third column shows the latency of the hardware implementation presented here. The speedup, in the fourth column, is calculated from the latency limit for each constraint and the time of the proposed hardware. It is worth mentioning that this is an estimate to guide the calculations.

The 1 ms constraint corresponds to a maximum latency limit of 37.5 μs for acceptable hardware performance. For the 10 ms constraint, the maximum limit is 375 μs. According to Equation (71), the value t_{hardware} presented in Table 3 corresponds to the sum of the latencies of the five implemented modules (Table 2): two modules associated with the MD device (FK-HMD and KFF-HMD) and three modules associated with the SD device (FK-HSD, IK-HSD, and FBF-HSD).

Thus, the value of 403 ns presented in Table 3 is the sum of two contributions. The two modules of the master component total 117 ns, of which 47 ns comes from the FK-HMD module and 70 ns from the KFF-HMD module. The three modules of the slave component total 286 ns, of which 47 ns comes from the FK-HSD module, 218 ns from the IK-HSD module and 21 ns from the FBF-HSD module. For the 1 ms constraint, the implementation therefore presented a 93× speedup relative to the 37.5 μs limit, and for the 10 ms constraint, the speedup was 930× relative to the 375 μs limit.
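The arithmetic behind these figures can be written out as a short check. The sketch uses only values stated in this section; integer division is an assumption chosen to match the whole-number speedups reported, and the variable names are illustrative.

```python
MODULE_LATENCY_NS = {
    "FK-HMD": 47, "KFF-HMD": 70,                  # master device (MD)
    "FK-HSD": 47, "IK-HSD": 218, "FBF-HSD": 21,   # slave device (SD)
}
# Hardware latency limits in ns for each round-trip constraint.
LIMIT_NS = {"1 ms": 37_500, "10 ms": 375_000}

# Total hardware latency: sum of the five module latencies.
t_hardware_ns = sum(MODULE_LATENCY_NS.values())

# Speedup: how many times the hardware latency fits in each limit.
speedup = {c: limit // t_hardware_ns for c, limit in LIMIT_NS.items()}
```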

The sampling times obtained for the five modules implemented in this work were notably short, which allowed the hardware to meet the time constraints required in a tactile internet environment. Hardware latency was significantly below the required limits, as shown in Table 3. These values are well below the 30% share of total latency attributed to devices in the literature. Since the communication medium is estimated to account for 70% of application latency, the network's share can be increased because the latency of the hardware devices proved to be significantly low. In other words, the latency budget not spent on the hardware devices can be consumed in the network.

It is important to remember that in a more complex tactile internet environment, several other algorithms need to be implemented in hardware, such as prediction algorithms, dynamic control and AI-based techniques. However, since the proposed implementations consume few hardware resources, such modules could also be implemented in the same shared hardware, as resources would still be available.

Table 4 compares the results obtained by the implementation proposed in this work with equivalent results from state-of-the-art works. The first column indicates the references of the related works. The next two columns show the FPGA platform used and the number of degrees of freedom of the device. The fourth column presents the type of numerical representation used in the implementation and, finally, the last four columns present the latency reported by each reference for the forward kinematics (FK), inverse kinematics (IK), kinesthetic force feedback (KFF) and feedback force (FBF) modules, respectively.

As described in Table 4, a hardware model for calculating the forward kinematics of a 5-DoF device is presented in [30]. For the trigonometric calculations, a Taylor series expansion was implemented in FPGA to compute the sine, cosine, and arctangent functions. The proposed hardware was implemented using a 32-bit floating-point representation, and the calculations are performed in parallel. The total time to perform the calculations was 1240 ns. Compared with the model presented in [30], the 32-bit floating-point forward kinematics (FK) implementation proposed in this work achieved a speedup of 26.38×.

The work presented in [31] shows the results of an implementation of the inverse kinematics module using a 32-bit floating-point representation. Three types of implementations are presented, but only the one with the best performance was compared. It used an Altera Cyclone IV FPGA, on which a microprocessor system based on the Nios II soft processor was built. This processor performs operations such as addition, multiplication, subtraction, division and square root in hardware. The equations allow partial parallelization of individual operations, decreasing the computation time. The kinematic model was designed to work with a 3-DoF device, and the time required for the calculation is 143,000 ns. Compared with the model proposed in [31], the inverse kinematics (IK) implementation presented in this work, which uses a 32-bit floating-point representation, achieved a speedup of 655.96×.

The kinematics models presented in [32], described in Table 4, cover forward and inverse kinematics implementations for controlling a 6-DoF device using a 32-bit fixed-point representation. The modules were implemented using 21 bits for the fractional part and 11 bits for the integer part. For the forward kinematics (FK), 3000 ns are required to perform all calculations, and for the inverse kinematics (IK), 4500 ns are required. Compared with these results, the floating-point implementation proposed in this work achieved a speedup of 63.82× for forward kinematics and 20.64× for inverse kinematics.

The research presented in [33] proposed a hardware implementation of inverse kinematics to control a 10-DoF device. Although the robotic model has 10 DoF, the equations consist only of subtraction and division operations. Regarding trigonometric calculations, only the arctangent is used in the equations, and the square root is obtained through the CORDIC module. The hardware was designed using a 32-bit fixed-point representation; however, the number of bits used in the fractional part was not specified. The architecture proposed to calculate the inverse kinematics requires 440 ns to perform the computation. All calculations are performed by the hardware in parallel. Compared with the model presented in [33], the 32-bit floating-point inverse kinematics (IK) implementation proposed in this work achieved a speedup of 2.01×. The processing time in [33] is low considering the number of DoF, but this is because its algorithms are less complex.

The authors in [34] present the results of a fixed-point implementation of forward and inverse kinematics to control a 5-DoF device, as described in Table 4. The proposed hardware implementation uses 32-bit (15 bits for the fractional part) and 16-bit (7 bits for the fractional part) numerical representations in different parts of the modules. The equations associated with the forward and inverse kinematics make use of the trigonometric functions sine, cosine, arctangent, and arccosine. The arctangent and arccosine were computed with a Taylor series expansion. The time required to perform the calculations is 680 ns and 940 ns for forward and inverse kinematics, respectively. Compared with the model presented in [34], the floating-point implementation proposed in this work achieved a speedup of 14.46× for forward kinematics and 4.31× for inverse kinematics.

Unlike the previous works (Table 4), the authors of [35] present a single hardware design that calculates forward and inverse kinematics together. The hardware computes all calculations in parallel, and the forward and inverse kinematics share the same processing time. An ARM processor was used to handle communication between the modules, and the FPGA was used to calculate the kinematics model. The CORDIC module was used to perform the trigonometric calculations. The proposed model uses a 32-bit fixed-point representation. The total time to perform the calculation is 2000 ns. This time covers the entire process; separate times for each module were not specified. For comparison, adding the t_{s} of the FK module proposed here to that of the IK module gives a total time of 265 ns, according to Table 4. Hence, the hardware developed in this work achieved a 7.54× speedup over the model presented in [35].

It can be seen from Table 4 that none of the state-of-the-art works presented a hardware implementation of all four robotics algorithms presented here. It is also noted that only two works used a floating-point numerical representation. The floating-point implementation of robotics algorithms proposed in this work showed significant gains when compared to the works in the literature, with speedups ranging from 2.01× to 655.96×, as shown in Table 4. The different numbers of degrees of freedom (DoF) of the devices can influence the values of sampling time and throughput. Another factor that can influence these values is the type of FPGA used for synthesis. Because the implementation in this work was designed as a parallel architecture, an increase in the number of DoF does not necessarily translate into a significant increase in sampling time.
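The speedups quoted in this comparison follow directly from dividing each reported time by the corresponding latency of this work (47 ns for FK, 218 ns for IK, 265 ns for both). The sketch below recomputes them; truncation to two decimals is an assumption that reproduces the published figures, and the names `RELATED`, `OURS` and `truncated_speedup` are illustrative.

```python
import math

T_FK_NS, T_IK_NS = 47, 218   # this work's FK and IK latencies

# (reference, module, reported time in ns) as quoted in the text above.
RELATED = [
    ("[30]", "FK", 1_240), ("[31]", "IK", 143_000),
    ("[32]", "FK", 3_000), ("[32]", "IK", 4_500),
    ("[33]", "IK", 440),
    ("[34]", "FK", 680), ("[34]", "IK", 940),
    ("[35]", "FK+IK", 2_000),
]
OURS = {"FK": T_FK_NS, "IK": T_IK_NS, "FK+IK": T_FK_NS + T_IK_NS}


def truncated_speedup(their_ns, our_ns):
    """Ratio of reported time to this work's time, cut to two decimals."""
    return math.floor(100 * their_ns / our_ns) / 100


speedups = {(ref, mod): truncated_speedup(t, OURS[mod]) for ref, mod, t in RELATED}
```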

8. Conclusions

This paper presented an FPGA hardware reference model for modules implementing four robotics-associated algorithms. The FK-HMD and FK-HSD modules implement the forward kinematics, the IK-HSD module implements the inverse kinematics, the KFF-HMD module implements the kinesthetic feedback force, and the FBF-HSD module implements the feedback force. The parallel FPGA implementation of these modules is intended to increase the tactile system's processing speed to meet the latency constraints required for tactile internet applications. The modules were designed using a fully parallel implementation based on a hybrid scheme that uses fixed-point and floating-point representations in distinct parts of the architecture. Unlike the state-of-the-art works, this work describes and implements four different robotics algorithms in FPGA. The implementations presented in this work achieve higher module processing speed than equivalent implementations from the state of the art. All the modules presented here were analyzed based on the synthesis results, which included FPGA resource utilization, sampling time, and throughput. Based on the synthesis results, it was observed that the implementations achieved processing times far below the latency limit of 1 ms. The hardware modules achieved a 93× speedup relative to the 37.5 μs time constraint. This work demonstrates that using embedded systems on devices such as FPGAs enables parallel algorithm implementation, thus speeding up data processing and minimizing execution time. These runtime gains can make processing feasible for critical applications that have tight time constraints or must process a large amount of data in a short time frame.

Author Contributions

All the authors have contributed to various degrees to ensure the quality of this work (e.g., J.C.V.S.J., T.M., M.D. and M.A.C.F. conceived the idea and experiments; J.C.V.S.J., T.M., M.D. and M.A.C.F. designed and performed the experiments; J.C.V.S.J., S.N.S., M.F.T., T.M., M.D. and M.A.C.F. analyzed the data; J.C.V.S.J., S.N.S., M.F.T., T.M., M.D. and M.A.C.F. wrote the paper; T.M., M.D. and M.A.C.F. coordinated the project). All authors have read and agreed to the published version of the manuscript.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)—Finance Code 001.

Acknowledgments

The authors wish to acknowledge the financial support of the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Fettweis, G. The Tactile Internet: Applications and Challenges. Veh. Technol. Mag. IEEE 2014, 9, 64–70.
2. Dohler, M. The tactile internet IoT, 5G and cloud on steroids. In Proceedings of the 5G Radio Technology Seminar, Exploring Technical Challenges in the Emerging 5G Ecosystem, London, UK, 15 March 2015; pp. 1–16.
3. Aijaz, A.; Dohler, M.; Aghvami, A.H.; Friderikos, V.; Frodigh, M. Realizing The Tactile Internet: Haptic Communications over Next Generation 5G Cellular Networks. arXiv 2015, arXiv:1510.02826.
4. Berg, D.V.D.; Glans, R.; Koning, D.D.; Kuipers, F.A.; Lugtenburg, J.; Polachan, K.; Venkata, P.T.; Singh, C.; Turkovic, B.; Wijk, B.V. Challenges in Haptic Communications Over the Tactile Internet. IEEE Access 2017, 5, 23502–23518.
5. Moskvitch, K. Tactile Internet: 5G and the Cloud on steroids. Eng. Technol. 2015, 10, 48–53.
6. Holland, O.; Steinbach, E.; Prasad, R.V.; Liu, Q.; Dawy, Z.; Aijaz, A.; Pappas, N.; Chandra, K.; Rao, V.S.; Oteafy, S.; et al. The IEEE 1918.1 “Tactile Internet” Standards Working Group and its Standards. Proc. IEEE 2019, 107, 256–279.
7. Maier, M.; Chowdhury, M.; Rimal, B.P.; Van, D.P. The tactile internet: Vision, recent progress, and open challenges. IEEE Commun. Mag. 2016, 54, 138–145.
8. Simsek, M.; Aijaz, A.; Dohler, M.; Sachs, J.; Fettweis, G. The 5G-Enabled Tactile Internet: Applications, requirements, and architecture. In Proceedings of the 2016 IEEE Wireless Communications and Networking Conference, Doha, Qatar, 3–6 April 2016; pp. 1–6.
9. Li, C.; Li, C.P.; Hosseini, K.; Lee, S.B.; Jiang, J.; Chen, W.; Horn, G.; Ji, T.; Smee, J.E.; Li, J. 5G-based systems design for tactile Internet. Proc. IEEE 2018, 107, 307–324.
10. Antonakoglou, K.; Xu, X.; Steinbach, E.; Mahmoodi, T.; Dohler, M. Toward Haptic Communications Over the 5G Tactile Internet. IEEE Commun. Surv. Tutor. 2018, 20, 3034–3059.
11. Nasrallah, A.; Thyagaturu, A.S.; Alharbi, Z.; Wang, C.; Shao, X.; Reisslein, M.; ElBakoury, H. Ultra-low latency (ULL) networks: The IEEE TSN and IETF DetNet standards and related 5G ULL research. IEEE Commun. Surv. Tutor. 2018, 21, 88–145.
12. Simsek, M.; Aijaz, A.; Dohler, M.; Sachs, J.; Fettweis, G. 5G-enabled tactile internet. IEEE J. Sel. Areas Commun. 2016, 34, 460–473.
13. Szabo, D.; Gulyas, A.; Fitzek, F.H.; Lucani, D.E. Towards the Tactile Internet: Decreasing Communication Latency with Network Coding and Software Defined Networking. In Proceedings of the European Wireless 2015; 21th European Wireless Conference, Budapest, Hungary, 20–22 May 2015; pp. 1–6.
14. Dohler, M.; Mahmoodi, T.; Lema, M.A.; Condoluci, M.; Sardis, F.; Antonakoglou, K.; Aghvami, H. Internet of skills, where robotics meets AI, 5G and the Tactile Internet. In Proceedings of the 2017 European Conference on Networks and Communications (EuCNC), Oulu, Finland, 12–15 June 2017; pp. 1–5.
15. Sachs, J.; Andersson, L.A.A.; Araújo, J.; Curescu, C.; Lundsjö, J.; Rune, G.; Steinbach, E.; Wikström, G. Adaptive 5G Low-Latency Communication for Tactile Internet Services. Proc. IEEE 2019, 107, 325–349.
16. Maier, M.; Ebrahimzadeh, A. Towards immersive tactile Internet experiences: Low-latency FiWi enhanced mobile networks with edge intelligence. IEEE/OSA J. Opt. Commun. Netw. 2019, 11, B10–B25.
17. Mekikis, P.; Ramantas, K.; Antonopoulos, A.; Kartsakli, E.; Sanabria-Russo, L.; Serra, J.; Pubill, D.; Verikoukis, C. NFV-Enabled Experimental Platform for 5G Tactile Internet Support in Industrial Environments. IEEE Trans. Ind. Inform. 2020, 16, 1895–1903.
18. Yu, Q.; Wang, C.; Ma, X.; Li, X.; Zhou, X. A Deep Learning Prediction Process Accelerator Based FPGA. In Proceedings of the 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Shenzhen, China, 4–7 May 2015; pp. 1159–1162.
19. De Souza, A.C.; Fernandes, M.A. Parallel fixed point implementation of a radial basis function network in an fpga. Sensors 2014, 14, 18223–18243.
20. Da Costa, A.L.X.; Silva, C.A.D.; Torquato, M.F.; Fernandes, M.A.C. Parallel Implementation of Particle Swarm Optimization on FPGA. IEEE Trans. Circuits Syst. II Express Briefs 2019, 66, 1875–1879.
21. Coutinho, M.G.F.; Torquato, M.F.; Fernandes, M.A.C. Deep Neural Network Hardware Implementation Based on Stacked Sparse Autoencoder. IEEE Access 2019, 7, 40674–40694.
22. Torquato, M.F.; Fernandes, M.A.C. High-Performance Parallel Implementation of Genetic Algorithm on FPGA. Circ. Syst. Signal Process. 2019.
23. Da Silva, L.M.D.; Torquato, M.F.; Fernandes, M.A.C. Parallel Implementation of Reinforcement Learning Q-Learning Technique for FPGA. IEEE Access 2019, 7, 2782–2798.
24. Lopes, F.F.; Ferreira, J.C.; Fernandes, M.A.C. Parallel Implementation on FPGA of Support Vector Machines Using Stochastic Gradient Descent. Electronics 2019, 8, 631.
25. Noronha, D.H.; Torquato, M.F.; Fernandes, M.A. A parallel implementation of sequential minimal optimization on FPGA. Microprocess. Microsyst. 2019, 69, 138–151.
26. Arjun, N.; Ashwin, S.M.; Polachan, K.; Prabhakar, T.V.; Singh, C. An End to End Tactile Cyber Physical System Design. In Proceedings of the 2018 4th International Workshop on Emerging Ideas and Trends in the Engineering of Cyber-Physical Systems (EITEC), Porto, Portugal, 11 April 2018; pp. 9–16.
27. O’Malley, M.K.; Sevcik, K.S.; Kopp, E. Improved haptic fidelity via reduced sampling period with an FPGA-based real-time hardware platform. J. Comput. Inf. Sci. Eng. 2009, 9, 011002.
28. Tanaka, H.; Ohnishi, K.; Nishi, H. Haptic communication system using FPGA and real-time network framework. In Proceedings of the Industrial Electronics, IECON’09, 35th Annual Conference of IEEE, Osaka, Japan, 3–7 July 2009; pp. 2931–2936.
29. Franc, M.; Hace, A. A study on the FPGA implementation of the bilateral control algorithm towards haptic teleoperation. Autom. J. Control. Meas. Electron. Comput. Commun. 2013, 54.
30. Sánchez, D.F.; Muñoz, D.M.; Llanos, C.H.; Motta, J.M. A reconfigurable system approach to the direct kinematics of a 5 dof robotic manipulator. Int. J. Reconfigurable Comput. 2010, 2010.
31. Gac, K.; Karpiel, G.; Petko, M. FPGA based hardware accelerator for calculations of the parallel robot inverse kinematics. In Proceedings of the 2012 IEEE 17th International Conference on Emerging Technologies Factory Automation (ETFA 2012), Krakow, Poland, 17–21 September 2012; pp. 1–4.
32. Wu, M.; Kung, Y.; Huang, Y.; Jung, T. Fixed-point computation of robot kinematics in FPGA. In Proceedings of the 2014 International Conference on Advanced Robotics and Intelligent Systems (ARIS), Taipei, Taiwan, 6–8 June 2014; pp. 35–40.
33. Wong, C.C.; Liu, C.C. FPGA realisation of inverse kinematics for biped robot based on CORDIC. Electron. Lett. 2013, 49, 332–334.
34. Linh, H.; Thi, B.; Kung, Y.S. Digital hardware realization of forward and inverse kinematics for a five-axis articulated robot arm. Math. Probl. Eng. 2015, 2015.
35. Jiang, Z.; Dai, Y.; Zhang, J.; He, S. Kinematics calculation of minimally invasive surgical robot based on FPGA. In Proceedings of the 2017 IEEE International Conference on Robotics and Biomimetics (ROBIO), Macau, China, 5–8 December 2017; pp. 1726–1730.
36. Steinbach, E.; Strese, M.; Eid, M.; Liu, X.; Bhardwaj, A.; Liu, Q.; Al-Ja’afreh, M.; Mahmoodi, T.; Hassen, R.; El Saddik, A.; et al. Haptic Codecs for the Tactile Internet. Proc. IEEE 2019, 107, 447–470.
37. Geomagic. Phantom Omni, Device Guide. Available online: https://support.3dsystems.com/s/article/Haptic-Device-Guides (accessed on 12 October 2022).
38. Song, G.; Guo, S.; Wang, Q. A Tele-operation system based on haptic feedback. In Proceedings of the IEEE International Conference on Information Acquisition, Veihai, China, 20–23 August 2006; pp. 1127–1131.
39. Sansanayuth, T.; Nilkhamhang, I.; Tungpimolrat, K. Teleoperation with inverse dynamics control for PHANToM Omni haptic device. In Proceedings of the 2012 SICE Annual Conference (SICE), Akita, Japan, 22–23 August 2012; pp. 2121–2126.
40. Silva, A.J.; Ramirez, O.A.D.; Vega, V.P.; Oliver, J.P.O. Phantom omni haptic device: Kinematic and manipulability. In Proceedings of the Electronics, Robotics and Automotive Mechanics Conference, CERMA’09, Cuernavaca, Mexico, 22–25 September 2009; pp. 193–198.
41. Cavusoglu, M.C.; Feygin, D. Kinematics and Dynamics of Phantom (tm) Model 1.5 Haptic Interface; EECS Department, University of California, Berkeley: Berkeley, CA, USA, 2001.
42. San Martin, J.; Triviño, G. A study of the Manipulability of the PHANToM OMNI Haptic Interface. In Proceedings of the VRIPHYS, Madrid, Spain, 6–7 November 2006; pp. 127–128.
43. MATLAB. Simulink—R2016a; The MathWorks Inc.: Natick, MA, USA, 2016.
44. Kumar, A.; Gaidhane, P.J.; Kumar, V. A nonlinear fractional order PID controller applied to redundant robot manipulator. In Proceedings of the 2017 6th International Conference on Computer Applications In Electrical Engineering-Recent Advances (CERA), Roorkee, India, 5–7 October 2017; pp. 527–532.
45. Yang, C.; Ma, H.; Fu, M. Intelligent Control of Robot Manipulator. In Advanced Technologies in Modern Robotic Applications; Springer: Singapore, 2016; pp. 49–96.
46. Rahimi, H.; Nazemizadeh, M. Dynamic analysis and intelligent control techniques for flexible manipulators: A review. Adv. Robot. 2014, 28, 63–76.
47. Tang, S.H.; Ang, C.K.; Ariffin, M.K.A.B.M.; Mashohor, S.B. Predicting the Motion of a Robot Manipulator with Unknown Trajectories Based on an Artificial Neural Network. Int. J. Adv. Robot. Syst. 2014, 11, 176.
48. Chen, Y.; Li, L. Predictable Trajectory Planning of Industrial Robots with Constraints. Appl. Sci. 2018, 8, 2648.
49. Xiang, Y. Simulation and Analysis of Three-Dimensional Space Path Prediction for Six-Degree-of-Freedom (SDOF) Manipulator. 3D Res. 2019, 10, 15.
50. Bócsi, B.; Nguyen-Tuong, D.; Csató, L.; Schölkopf, B.; Peters, J. Learning inverse kinematics with structured prediction. In Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, CA, USA, 25–30 September 2011; pp. 698–703.
Shen, S.; Song, A.; Li, T. Predictor-Based Motion Tracking Control for Cloud Robotic Systems with Delayed Measurements. Electronics 2019, 8, 398. [Google Scholar] [CrossRef] [Green Version]
Yang, C.; Xie, Y.; Liu, S.; Sun, D. Force Modeling, Identification, and Feedback Control of Robot-Assisted Needle Insertion: A Survey of the Literature. Sensors 2018, 18, 561. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Junior, J.C.V.S.; Torquato, M.F.; Noronha, D.H.; Silva, S.N.; Fernandes, M.A.C. Proposal of the Tactile Glove Device. Sensors 2019, 19, 5029. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Weber, P.; Rueckert, E.; Calandra, R.; Peters, J.; Beckerle, P. A low-cost sensor glove with vibrotactile feedback and multiple finger joint and hand motion sensing for human-robot interaction. In Proceedings of the 2016 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), New York, NY, USA, 26–31 August 2016; pp. 99–104. [Google Scholar]
Volder, J.E. The CORDIC trigonometric computing technique. IRE Trans. Electron. Comput. 1959, EC-8, 330–334. [Google Scholar] [CrossRef]

Figure 1. Proposed discrete model of the tactile internet system.

Figure 2. PHANToM Omni—MD and SD.

Figure 3. PHANToM Omni structure—MD and SD.

Figure 4. Detailed discrete model of a tactile internet system.

Figure 5. Proposed circuit for calculating trigonometric functions—TFB.

Figure 6. Proposed forward kinematics circuit for obtaining the $x[F32](n)$ spatial coordinate (Equation (15))—FK-HMD and FK-HSD.

Figure 7. Proposed forward kinematics circuit for obtaining the $y[F32](n)$ spatial coordinate (Equation (16))—FK-HMD and FK-HSD.

Figure 8. Proposed forward kinematics circuit for obtaining the $z[F32](n)$ spatial coordinate (Equation (17))—FK-HMD and FK-HSD.

Figure 9. Proposed inverse kinematics circuit for obtaining the $\theta_{1}^{HSD}[F32](n)$ angular position (Equation (18))—IK-HSD.

Figure 10. Proposed inverse kinematics circuit for obtaining the $\theta_{2}^{HSD}[F32](n)$ angular position (Equation (24))—IK-HSD.

Figure 11. Proposed inverse kinematics circuit for obtaining the $\theta_{3}^{HSD}[F32](n)$ angular position (Equation (25))—IK-HSD.

Figure 12. Proposed circuit to perform the calculation of $R[F32](n)$ (Equation (19))—IK-HSD.

Figure 13. Proposed circuit to perform the calculation of $r[F32](n)$ (Equation (20))—IK-HSD.

Figure 14. Proposed circuit to perform the calculation of $\gamma[F32](n)$ (Equation (21))—IK-HSD.

Figure 15. Proposed circuit to perform the calculation of $\beta[F32](n)$ (Equation (22))—IK-HSD.

Figure 16. Proposed circuit to perform the calculation of $\alpha[F32](n)$ (Equation (23))—IK-HSD.

Figure 17. Proposed circuit to calculate kinesthetic feedback force (Equation (26))—KFF-HMD.

Figure 18. Proposed circuit to calculate the Jacobian matrix $J_{11}[F32](n)$ (Equation (29))—JM.

Figure 19. Proposed circuit to calculate the Jacobian matrix $J_{31}[F32](n)$ (Equation (31))—JM.

Figure 20. Proposed circuit to calculate the Jacobian matrix $J_{12}[F32](n)$ (Equation (32))—JM.

Figure 21. Proposed circuit to calculate the Jacobian matrix $J_{22}[F32](n)$ (Equation (33))—JM.

Figure 22. Proposed circuit to calculate the Jacobian matrix $J_{32}[F32](n)$ (Equation (34))—JM.

Figure 23. Proposed circuit to calculate the Jacobian matrix $J_{13}[F32](n)$ (Equation (35))—JM.

Figure 24. Proposed circuit to calculate the Jacobian matrix $J_{23}[F32](n)$ (Equation (36))—JM.

Figure 25. Proposed circuit to calculate the Jacobian matrix $J_{33}[F32](n)$ (Equation (37))—JM.

Figure 26. Proposed circuit to calculate the torque of the $\tau_{1}^{HMD}[F32](n)$ joint (Equation (39))—KFF.

Figure 27. Proposed circuit to calculate the torque of the $\tau_{2}^{HMD}[F32](n)$ joint (Equation (40))—KFF.

Figure 28. Proposed circuit to calculate the torque of the $\tau_{3}^{HMD}[F32](n)$ joint (Equation (41))—KFF.

Figure 29. Proposed circuit to calculate the feedback force $F_{x}^{HSD}[F32](n)$ (Equation (60))—FBF-HSD.

Figure 30. Proposed circuit to calculate the feedback force $F_{y}^{HSD}[F32](n)$ (Equation (61))—FBF-HSD.

Figure 31. Proposed circuit to calculate the feedback force $F_{z}^{HSD}[F32](n)$ (Equation (62))—FBF-HSD.

Figure 32. Trajectory used to validate hardware modules.

Table 1. Mean squared error (MSE) results for floating-point implementation.

Module	Variable	MSE
FK-HMD	$x^{H M D} [F 32] (n)$	$2.333 \times 10^{- 8}$
	$y^{H M D} [F 32] (n)$	$8.316 \times 10^{- 9}$
	$z^{H M D} [F 32] (n)$	$1.656 \times 10^{- 8}$
KFF-HMD	$τ_{1}^{H M D} [F 32] (n)$	$1.467 \times 10^{- 7}$
	$τ_{2}^{H M D} [F 32] (n)$	$5.207 \times 10^{- 9}$
	$τ_{3}^{H M D} [F 32] (n)$	$3.350 \times 10^{- 7}$
FK-HSD	$x^{E N V} [F 32] (n)$	$2.333 \times 10^{- 8}$
	$y^{E N V} [F 32] (n)$	$8.316 \times 10^{- 9}$
	$z^{E N V} [F 32] (n)$	$1.656 \times 10^{- 8}$
IK-HSD	$θ_{1}^{H S D} [F 32] (n)$	$3.731 \times 10^{- 6}$
	$θ_{2}^{H S D} [F 32] (n)$	$2.847 \times 10^{- 6}$
	$θ_{3}^{H S D} [F 32] (n)$	$2.702 \times 10^{- 6}$
FBF-HSD	$F_{x}^{H S D} [F 32] (n)$	$2.437 \times 10^{- 16}$
	$F_{y}^{H S D} [F 32] (n)$	$1.731 \times 10^{- 16}$
	$F_{z}^{H S D} [F 32] (n)$	$3.360 \times 10^{- 16}$

Table 2. FPGA resource utilization, sampling rate and throughput results for floating-point format.

Module Name	Registers (Flip-Flops)	LUTs	Multipliers (DSP48)	$t_{s}$ (ns)	$R_{s}$ (MSps)
FK-HMD	3041 ( $1.01$ %)	8008 ( $5.31$ %)	11 ( $1.43$ %)	47	$21.27$
KFF-HMD	3113 ( $1.03$ %)	12,251 ( $8.13$ %)	48 ( $6.25$ %)	70	$14.28$
FK-HSD	3041 ( $1.01$ %)	8008 ( $5.31$ %)	11 ( $1.43$ %)	47	$21.27$
IK-HSD	3149 ( $1.04$ %)	14,107 ( $9.36$ %)	27 ( $3.52$ %)	218	$4.58$
FBF-HSD	323 ( $0.11$ %)	1236 ( $0.82$ %)	9 ( $1.17$ %)	21	$47.61$
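
The last two columns of Table 2 appear to be reciprocals of each other: every tabulated $R_{s}$ matches $1/t_{s}$ truncated to two decimal places. The short Python sketch below checks that reading. The relation is inferred from the numbers rather than stated in the table, and the helper names `sampling_rate_msps` and `truncate` are hypothetical.

```python
import math

# (t_s in ns, R_s in MSps) copied from Table 2.
TABLE_2 = {
    "FK-HMD": (47, 21.27),
    "KFF-HMD": (70, 14.28),
    "FK-HSD": (47, 21.27),
    "IK-HSD": (218, 4.58),
    "FBF-HSD": (21, 47.61),
}


def sampling_rate_msps(t_s_ns: float) -> float:
    """Assumed relation R_s = 1 / t_s; 1 / (t_s ns) equals 1e3 / t_s MSps."""
    return 1e3 / t_s_ns


def truncate(value: float, decimals: int = 2) -> float:
    """Drop (not round) digits beyond the given number of decimals."""
    factor = 10 ** decimals
    return math.floor(value * factor) / factor


for module, (t_s_ns, r_s_table) in TABLE_2.items():
    r_s = truncate(sampling_rate_msps(t_s_ns))
    print(f"{module}: t_s = {t_s_ns} ns -> R_s = {r_s:.2f} MSps (table: {r_s_table})")
```

Under this assumption the slowest module, IK-HSD, bounds the sampling rate of a chain that includes it at about 4.58 MSps.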

Table 3. Hardware speedup related to the time limits for the 1 ms and 10 ms latency constraints.

Time Restriction	Latency Limit	$t_{hardware}$	Speedup
$1 ms$	$37.5 μ s$	$403 ns$	93×
$10 ms$	$375 μ s$	$403 ns$	930×
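
The arithmetic behind Table 3 can be reproduced in a few lines of Python. This is a sketch under two assumptions that the table itself does not state: that $t_{hardware}$ is the sum of the five per-module $t_{s}$ values in Table 2 (which does add up to 403 ns), and that the speedup is the latency limit divided by $t_{hardware}$, reported as a whole number. The function name `speedup` is hypothetical.

```python
# t_s per module in ns, copied from Table 2.
T_S_NS = {"FK-HMD": 47, "KFF-HMD": 70, "FK-HSD": 47, "IK-HSD": 218, "FBF-HSD": 21}

# Latency limits in ns (37.5 us and 375 us), copied from Table 3.
LATENCY_LIMIT_NS = {"1 ms": 37_500, "10 ms": 375_000}

# Speedup values as printed in Table 3.
SPEEDUP_TABLE = {"1 ms": 93, "10 ms": 930}

# Assumption: t_hardware is the sum of the five module times.
t_hardware_ns = sum(T_S_NS.values())


def speedup(limit_ns: float, hardware_ns: float) -> float:
    """Assumed definition: ratio of the latency limit to the hardware time."""
    return limit_ns / hardware_ns


for constraint, limit_ns in LATENCY_LIMIT_NS.items():
    ratio = speedup(limit_ns, t_hardware_ns)
    print(
        f"{constraint}: {limit_ns} ns / {t_hardware_ns} ns = {ratio:.2f} "
        f"(table: {SPEEDUP_TABLE[constraint]}x)"
    )
```

With these inputs the ratios are roughly 93.05 and 930.52, in line with the 93× and 930× reported in the table.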

Table 4. Comparative table with state of the art works.

Reference	Device	DoF	Data Type	FK	IK	KFF	FBF
This work	Virtex 6	3	Floating P.	$47 ns$	$218 ns$	$70 ns$	$21 ns$
[30]	Virtex 2	5	Floating P.	$1240 ns$	-	-	-
[31]	Cyclone IV	3	Floating P.	-	$143,000 ns$	-	-
[32]	Unknown	6	Fixed P.	$3000 ns$	$4500 ns$	-	-
[33]	Cyclone IV	10	Fixed P.	-	$440 ns$	-	-
[34]	Cyclone IV	5	Fixed P.	$680 ns$	$940 ns$	-	-
[35]	Artix 7	3	Fixed P.	$2000 ns$		-	-

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Junior, J.C.V.S.; Silva, S.N.; Torquato, M.F.; Mahmoodi, T.; Dohler, M.; Fernandes, M.A.C. FPGA Applied to Latency Reduction for the Tactile Internet. Sensors 2022, 22, 7851. https://doi.org/10.3390/s22207851
