1. Introduction
There has been a significant advancement towards ‘smart’ and ‘smarter’ systems due to the integration of smart objects into the existing and new infrastructure of today’s data-intensive applications [1,2]. This move has led to the evolution of the IoT, which is currently driving sectors such as agriculture, manufacturing, and smart healthcare [3,4,5,6]. IoT devices range from simple wearables to large machines, each containing sensor chips or microcontrollers [7]. These devices can exist in physical or virtual space. Physical space includes humans, vehicles, residences, computers, switches, routers, smart devices, road networks, office buildings, etc. Virtual space includes software, data streams, virtual machines, virtual networks, etc. [8].
The building blocks of IoT include: Sensor, Aggregator, Communication channel, eUtility, and Decision Trigger. Sensors measure physical properties by employing mechanical, electrical, chemical, optical, or other effects at an interface to a controlled process or open environment. An aggregator is a software implementation based on mathematical function(s) that transforms groups of raw data into intermediate, aggregated data. Aggregators help in managing ‘big’ data. A communication channel is a medium by which data is transmitted (e.g., wireless, wired, etc.). An eUtility is an abstraction of unforeseen future services that can be incorporated in future types of IoTs yet to be defined. It may include databases, mobile devices, software or hardware systems, etc. A decision trigger creates the final result(s) needed to satisfy the purpose, specification, and requirements of a specific IoT device.
Most IoT devices follow a layered architecture comprising three or four layers [9,10,11,12,13,14,15,16], as shown in Figure 1.
The perception layer comprises the physical level of objects and how they interact with the surrounding environment by collecting and processing information [17]. This level includes objects that can interact with the external world and are also equipped with computing capability. The network layer, also known as the communication layer, transports the data provided by the perception layer to the application layer. This layer can be broken down into six (6) sublayers, viz., PHY, MAC, transport, network, session, and application. It includes all the technologies and protocols that make this connection possible. There are several protocols used by IoT devices [1,2]. The choice of communication protocol depends on the type of technology being used: Zigbee, BLE, NFC, Z-Wave, LPWAN, etc. [18]. Some network layer protocols have been developed for computationally constrained devices, of which IoT devices form part. For instance, to meet the requirements of WSNs, the 6LoWPAN protocol and routing protocols such as RPL were created. Also, to secure data in transit, TLS (for TCP) and DTLS (for UDP) were developed. Application layer messaging protocols such as CoAP, MQTT, AMQP, etc. have also been developed and tailored specifically for M2M communication. The support layer enhances the operation of the other layers, providing storage and computing services. The application layer includes all the software necessary to offer a specific service. The data from the previous levels are stored, aggregated, filtered, and processed, and later used in making informed decisions or in the provision of real-time IoT applications.
Information sharing among IoT devices is made possible by application layer messaging protocols such as CoAP, WebSocket, DDS, XMPP and AMQP. These protocols are not optimal in instances where header and message sizes, together with resource constraints, are of great concern. Besides, with the rapid growth in the number of IoT devices, these messaging protocols are expected to support D2D communication efficiently. This research paper proposes a lightweight messaging protocol that reduces message header complexity without compromising the security of the protocol. The remaining sections of the research paper are organized as follows: Section 2 reviews related literature. The research motivation is presented in Section 3. Section 4 describes the research approach used in the study. Section 5 discusses the results. Section 6 concludes the research paper.
2. Related Works
The most common application layer messaging protocols used by IoT devices include: CoAP, WebSocket, DDS, XMPP and AMQP.
CoAP is a specialized web transfer protocol for use with constrained nodes and constrained networks. The work on CoRE aims at realizing the REST architecture in a suitable form for constrained nodes. Constrained networks such as 6LoWPAN support the fragmentation of IPv6 packets into small link-layer frames; however, this causes a significant reduction in packet delivery probability. CoAP has been designed to keep the message overhead small, thus limiting the need for fragmentation [19]. It supports both UDP [20] and TCP transport protocols, with UDP being the default. It offers optional reliability and supports both unicast and multicast requests. It supports asynchronous message exchanges and also has simple proxy and caching capabilities.
The WebSocket protocol was developed to address the high overhead of HTTP polling. Historically, bidirectional communication between a client and a server has required an abuse of HTTP to poll the server for updates while sending upstream notifications as distinct HTTP calls [21]. This results in problems such as the server being forced to use several different underlying TCP connections for each client: one for sending information to the client and a new one for each incoming message. Besides, the wire protocol has a high overhead, with each client-to-server message having an HTTP header. Furthermore, the client-side script is forced to maintain a mapping from the outgoing connections to the incoming connections to track replies. The WebSocket protocol addresses these problems by using a single TCP connection for traffic in both directions. Combined with the WebSocket API, it provides an alternative to HTTP polling for two-way communication between a client and a server [22].
Some IoT device communication is based on a publish/subscribe protocol such as MQTT. MQTT is a client-server publish/subscribe messaging transport protocol applicable in constrained environments for communication in M2M and IoT contexts. The protocol runs over TCP/IP, or over other network protocols that provide ordered, lossless, bi-directional connections. The use of the publish/subscribe message pattern provides one-to-many message distribution and decoupling of applications [23].
XMPP is also used for the near-real-time exchange of information. It is an application profile of XML that enables the exchange of structured yet extensible data (called “XML stanzas”) between any two or more network entities, and it runs over TCP [24,25]. It is typically implemented using a distributed client-server architecture, wherein a client needs to connect to a server in order to gain access to the network and thus be allowed to exchange XML stanzas with other entities (which can be associated with other servers). Within XMPP, one server can optionally connect to another server to enable inter-domain or inter-server communication after the communicating servers negotiate a connection between themselves.
DDS is a middleware protocol and API for data-centric connectivity. It integrates the components of a system and provides the low-latency data connectivity, extreme reliability, and scalable architecture that business and mission-critical IoT applications need. The middleware is a software layer that lies between the operating system and applications, as shown in Figure 2. It enables the various components of the system to communicate and share data more easily. It abstracts the application from the details of the operating system, network transport, and low-level data formats. Low-level details like data wire format, discovery, connections, reliability, protocols, transport selection, QoS, security, etc., are managed by the middleware.
AMQP is a binary wire-level protocol that allows the reliable exchange of messages between two entities. It is a corporate messaging protocol designed for reliability, security, provisioning, and interoperability, and it supports both request/response and publish/subscribe architectures. The protocol also offers a wide range of features related to messaging, such as reliable queueing, topic-based publish-and-subscribe messaging, flexible routing, and transactions. An AMQP communication system requires that either the publisher or the consumer creates an “exchange” with a given name and then broadcasts that name. Publishers and consumers use the name of this exchange to discover each other. Messages are exchanged in various ways: directly, by topic, or based on headers [27].
A summary of the most commonly used application layer messaging protocols is shown in Table 1.
The performance of IoT devices and applications is significantly influenced by the choice of messaging protocol. The application layer messaging protocols are widely used, yet they differ considerably from each other. For example, CoAP has the smallest message size and overhead compared with the other messaging protocols. MQTT is lightweight and has the smallest header, at 2 bytes per message, but its requirement of a TCP connection increases the overall overhead and thus the whole message size. AMQP is also a lightweight binary protocol; however, its support for security, reliability, provisioning, and interoperability increases the overhead and message size. WebSocket requires more power and resources than any of the other protocols, and the requirement decreases across the remaining protocols, with CoAP requiring the least [27].
4. Methodology
This section presents LiMP, a lightweight messaging protocol based on CoAP. We discuss CoAP in detail and then present how a lighter version was developed. CoAP deals with message interchanges asynchronously over transport protocols such as UDP (default) and TCP. This is done logically using a layer of messages that supports optional reliability with exponential back-off. It defines four (4) types of messages: Confirmable, Non-confirmable, Acknowledgement, and Reset.
Logically, CoAP can be considered as using a two-layer approach: a messaging layer that deals with the transport and the asynchronous nature of the interactions, and the request/response interactions using Method and Response Codes, as shown in Figure 3. The CoAP messaging model is based on the exchange of messages over UDP/TCP between endpoints and may also be used over DTLS and TLS.
4.1. CoAP Stack
CoAP messages are transported over UDP by default (i.e., each CoAP message occupies the data section of one UDP datagram). CoAP messages are encoded in a simple binary format. The message format is shown in Figure 4.
A detailed representation of the various sections is shown in Figure 5.
The message format starts with a fixed 4-byte header (indicated as (1) in Figure 5). The header consists of the following fields: Version (VER), Type (T), Token Length (TKL), Code and a message identifier (Message ID). The ‘VER’ field indicates the CoAP version. The ‘Type (T)’ field indicates whether a particular message is Confirmable, Non-confirmable, an Acknowledgement or a Reset. The ‘TKL’ field gives the length of the Token in bytes. The ‘Code’ field is designated for request/response codes. It is similar to the HTTP request methods and response status codes. For instance, a successful request to a server will return a status code, which can be found in the ‘Code’ field of the response message. The ‘Message ID’ is an identifier used to match messages of type Acknowledgement/Reset to messages of type Confirmable/Non-confirmable. A summary of CoAP’s header fields is shown in Table 2.
The header is followed by a ‘Token’ field (marked as (2) in Figure 5), which is used to match responses to requests independently from the underlying messages. After the ‘Token’ field comes the ‘Options’ field (marked as (3) in Figure 5), which is made up of metadata such as the message format, ETag, etc. A payload marker comes after the ‘Options’ field. This marks the beginning of the actual message payload, marked in Figure 5 as (4). The presence of a marker followed by a zero-length payload is processed as a message format error.
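The fixed header described above can be illustrated with a minimal Python sketch. It assumes the standard CoAP bit layout (2-bit version, 2-bit type and 4-bit token length in the first byte, followed by an 8-bit code and a 16-bit Message ID in network byte order). The function names `pack_coap_header` and `parse_coap_header` are hypothetical and are not taken from the authors' implementation.

```python
import struct


def pack_coap_header(version, msg_type, token, code, message_id):
    """Build the fixed 4-byte CoAP header followed by the token bytes."""
    tkl = len(token)
    if tkl > 8:
        raise ValueError("token longer than 8 bytes")
    if not (0 <= version <= 3 and 0 <= msg_type <= 3):
        raise ValueError("version and type are 2-bit fields")
    # First byte: VER (2 bits) | T (2 bits) | TKL (4 bits)
    first = (version << 6) | (msg_type << 4) | tkl
    # "!BBH": network byte order, two single bytes and one 16-bit integer
    return struct.pack("!BBH", first, code, message_id) + token


def parse_coap_header(data):
    """Split a CoAP message into its header fields, token and remainder."""
    if len(data) < 4:
        raise ValueError("message shorter than the 4-byte header")
    first, code, message_id = struct.unpack("!BBH", data[:4])
    tkl = first & 0x0F
    if tkl > 8:
        raise ValueError("TKL values 9-15 are a message format error")
    if len(data) < 4 + tkl:
        raise ValueError("message truncated inside the token")
    return {
        "version": first >> 6,
        "type": (first >> 4) & 0x03,
        "tkl": tkl,
        "code": code,
        "message_id": message_id,
        "token": data[4:4 + tkl],
        "rest": data[4 + tkl:],  # options, payload marker and payload
    }
```

The `rest` entry is left unparsed here; decoding the delta-encoded options and locating the payload marker would be a separate step.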
4.2. Analysis of the CoAP Stack
The current CoAP message format allows a Token (whose length is specified by the TKL) to be added to each request and response. Also, the Message-ID is used to match each request to a response. Figure 6 shows a sample packet analysis output of a point-to-point communication between two IoT devices. It can be observed that a sample request can be sent without a Token, which sets the TKL to 0 in the message header.
The Token and the Message-ID are redundant, since both seem to perform the same role. Besides, the length of the Token is encoded in 4 bits in the header, so TKL values of up to 15 bytes can be expressed. Even so, the CoAP specification limits the Token to 0–8 bytes and processes TKL values of 9–15 as a message format error.
Furthermore, four (4) request Method Codes (GET, POST, PUT and DELETE) and twenty-one (21) Response Codes are supported in the CoAP message format. All other Method Codes are unassigned. These request and response codes were borrowed from the HTTP request and response code formats. Not all of these codes are used in normal communication between network entities; hence, they can be reduced to make the protocol more lightweight. The ‘Options’ field in the CoAP message is delta-encoded and defines several options that can be included in a message. Fourteen (14) default ‘Option’ formats are supported. A sample of these fields can be seen after the Message-ID in Figure 6. These ‘Option’ formats describe the structure of the message sent, specifying fields such as Content-Format, ETag, Max-Age, etc. Most of these Option formats are unused, which makes them redundant.
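To make the size of the CoAP code space concrete, the following sketch decodes the 8-bit ‘Code’ field into the class.detail notation used by the CoAP specification (a 3-bit class followed by a 5-bit detail). The helper names are hypothetical; the method numbers are the four request methods named above.

```python
# Request methods use class 0; the detail value selects the method.
METHODS = {1: "GET", 2: "POST", 3: "PUT", 4: "DELETE"}


def split_code(code):
    """Return (class, detail) for an 8-bit CoAP code."""
    if not 0 <= code <= 0xFF:
        raise ValueError("code must fit in 8 bits")
    return code >> 5, code & 0x1F


def format_code(code):
    """Render a code in the dotted c.dd form, e.g. 2.05."""
    cls, detail = split_code(code)
    return f"{cls}.{detail:02d}"


def describe(code):
    """Name request methods; fall back to the dotted form otherwise."""
    cls, detail = split_code(code)
    if cls == 0 and detail in METHODS:
        return METHODS[detail]
    return format_code(code)
```

For example, the byte 0x45 decodes to 2.05 (Content) and 0x84 to 4.04 (Not Found), which shows how an 8-bit field is spent on codes that a simple point-to-point exchange may never use.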
4.3. LiMP Stack
LiMP supports both TCP and UDP transport protocols. It is a minimal protocol in which messages are encoded in a simple binary format. Its simplicity is achieved by removing the redundant and unused fields in the standard CoAP implementation. The message format is shown in Figure 7.
Figure 8 gives detailed information on the message format. The message format starts with a fixed-size 2-byte header (marked as (1)). The message header comprises the following fields: Version (VER), Type (T), Code, Content-Format (CF) and a message identifier (Message ID).
The header fields are described as follows:
Version (VER): VER is a 2-bit unsigned integer that indicates the LiMP version number. The default value is set to 1 (01 in binary) to indicate the initial release.
Type (T): The T field is a 2-bit unsigned integer that indicates whether the message is of the type Confirmable (0), Non-confirmable (1), Acknowledgement (2), or Reset (3).
Code: This field stores a 2-bit unsigned integer that indicates the request and response Method Codes: GET (0), POST (1), BadRequest (2), ServiceUnavailable (3). Detailed responses to requests can be included in the message payload.
Content-Format (CF): The CF is a 2-bit unsigned integer that indicates the encoding of the payload contents. It was narrowed down to the two (2) most common content formats: AppXML (XML format, 0) and AppJSON (JSON format, 1).
Message-ID: The Message-ID is an 8-bit unsigned integer in network byte order. It is used to match requests to responses. This is 8 bits shorter than CoAP’s Message ID. To deal with request/response replays by an attacker or a malicious device, an ETag is included in the message payload to detect such attacks, since 8-bit or 16-bit (in the case of CoAP) Message-IDs are computationally easy for an attacker to enumerate. The ETag is a hexadecimal value resulting from a bitwise operation on the Message-ID and a current timestamp embedded in the message payload. Further details are provided in Section 5.1.
The header is followed by the message payload whose content format is indicated by the ‘CF’ field.
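The 2-byte header can be sketched in Python as follows. The bit order within the first byte (VER, T, Code, CF from most to least significant) is an assumption, since the exact layout is given only in the figures; Reset is taken as 3, following CoAP's numbering, which is also an assumption. The names `pack_limp` and `unpack_limp` are hypothetical and do not come from the authors' Golang implementation.

```python
# Field values as listed in the text; RST = 3 is an assumption (CoAP numbering).
TYPES = {"CON": 0, "NON": 1, "ACK": 2, "RST": 3}
CODES = {"GET": 0, "POST": 1, "BadRequest": 2, "ServiceUnavailable": 3}
CONTENT_FORMATS = {"AppXML": 0, "AppJSON": 1}


def pack_limp(version, msg_type, code, content_format, message_id, payload=b""):
    """Encode a LiMP message: 2-byte header followed by the payload."""
    for name, value in (("version", version), ("type", msg_type),
                        ("code", code), ("content_format", content_format)):
        if not 0 <= value <= 3:
            raise ValueError(f"{name} must fit in 2 bits")
    if not 0 <= message_id <= 0xFF:
        raise ValueError("message_id must fit in 8 bits")
    # Assumed layout of byte 0: VER | T | Code | CF, two bits each.
    first = (version << 6) | (msg_type << 4) | (code << 2) | content_format
    return bytes([first, message_id]) + payload


def unpack_limp(data):
    """Decode a LiMP message into header fields and payload."""
    if len(data) < 2:
        raise ValueError("message shorter than the 2-byte header")
    first, message_id = data[0], data[1]
    return {
        "version": first >> 6,
        "type": (first >> 4) & 0x03,
        "code": (first >> 2) & 0x03,
        "content_format": first & 0x03,
        "message_id": message_id,
        "payload": data[2:],
    }
```

A confirmable POST carrying JSON would then be built with `pack_limp(1, TYPES["CON"], CODES["POST"], CONTENT_FORMATS["AppJSON"], 42, b'{"t": 21.5}')`, giving two header bytes in front of the payload.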
4.4. Benchmark Test and Analysis
A benchmark analysis of CoAP and LiMP was conducted on several embedded devices to evaluate the flexibility and efficiency of the proposed messaging protocol. TCP and UDP versions of both CoAP and LiMP were implemented and used in the benchmark test. The ARM devices used were an Nvidia Jetson Nano (N), an Nvidia Tegra (T), a Raspberry Pi (P) and a phone (I), as shown in Figure 9. ARM-based devices were selected because most IoT devices use this architecture.
The Nvidia Jetson Nano and the Tegra have an ARM Cortex-A57 MPCore processor, the Raspberry Pi has an ARM Cortex-A53 processor, and the phone has an Apple A10 Fusion chipset. All the IoT devices except the phone were running an ARM-based Debian operating system, whereas the phone was running iOS. Two categories of benchmark test were considered (as shown in Figure 10), viz., intra-device communication and out-of-domain communication. The intra-device communication test was done on the same LAN, and the out-of-domain communication test was done between the devices and a remote server (16 hops from the local network). The key performance indicators include:
The RTT is the measured time taken for an IoT device to send a request and receive a response. A total of 14 (point-to-point communications) × 4 (protocols) scenarios were used in the evaluation. A base implementation of CoAP and LiMP (written in Golang) was used in the evaluation analysis (source code is available at: https://github.com/jayluxferro/levis, accessed on 6 January 2022).
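As an illustration of how an RTT of this kind can be sampled, the sketch below times a request/response callable with a monotonic clock. It is not the authors' benchmark harness: `measure_rtt_ms` and `fake_exchange` are hypothetical names, and the 2 ms sleep merely stands in for a real network exchange.

```python
import time
from statistics import mean


def measure_rtt_ms(exchange, repeats=10):
    """Call exchange() repeatedly and return each round-trip time in ms."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        exchange()  # send one request and block until its response arrives
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def fake_exchange():
    """Placeholder for a real request/response over TCP or UDP."""
    time.sleep(0.002)


samples = measure_rtt_ms(fake_exchange, repeats=5)
print(f"mean RTT over {len(samples)} runs: {mean(samples):.2f} ms")

# Size of the evaluation matrix described above.
PROTOCOLS = ["CoAP-TCP", "CoAP-UDP", "LiMP-TCP", "LiMP-UDP"]
POINT_TO_POINT_PAIRS = 14
scenario_count = POINT_TO_POINT_PAIRS * len(PROTOCOLS)
print(f"scenarios: {scenario_count}")
```

In a real run, `exchange` would wrap one send and one blocking receive on an already open socket, so that connection setup is not counted in every sample.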
5. Results and Discussion
This section presents an analysis of the benchmark test and discusses the findings.
Table 3 shows the default PDU sizes (in bytes) of LiMP and CoAP. The PDU size of LiMP-TCP is 16% smaller than the CoAP-TCP. Furthermore, the PDU size of LiMP-UDP is 23% smaller than the CoAP-UDP.
Figure 11 shows the RTT for a point-to-point communication between two Jetson Nano devices. LiMP-TCP outperformed the CoAP-TCP. The LiMP-UDP outperformed all the other protocols; averaging 18 milliseconds (ms). This amounts to an efficiency of 17.758% more than CoAP.
In Figure 12, LiMP-UDP outperformed all the rest, averaging 5.7 ms. Similar observations were made in Figure 13, Figure 14, Figure 15 and Figure 16. The efficiency of LiMP-TCP is 10.28% higher than that of CoAP-TCP, and that of LiMP-UDP is 18.04% higher than that of CoAP-UDP.
In Figure 13, the efficiency of LiMP-TCP is 13.54% higher than that of CoAP-TCP, and that of LiMP-UDP is 20.85% higher than that of CoAP-UDP.
In Figure 14, the efficiency of LiMP-TCP is 11.22% higher than that of CoAP-TCP, and that of LiMP-UDP is 34.58% higher than that of CoAP-UDP.
In Figure 15, the efficiency of LiMP-TCP is 14.30% higher than that of CoAP-TCP, and that of LiMP-UDP is 20.54% higher than that of CoAP-UDP.
In Figure 16, the efficiency of LiMP-TCP is 13.76% higher than that of CoAP-TCP, and that of LiMP-UDP is 44.39% higher than that of CoAP-UDP.
The RTTs for communication between the iPhone and the other embedded devices were lowest for LiMP-UDP compared with the other protocols, as shown in Figure 17, Figure 18 and Figure 19. For instance, in Figure 19, LiMP-UDP, with an average RTT of 4.55 ms, was about three times faster than CoAP-UDP. In Figure 18, the efficiency of LiMP-TCP is 39.90% higher than that of CoAP-TCP, and that of LiMP-UDP is 56.22% higher than that of CoAP-UDP.
In Figure 17, the efficiency of LiMP-TCP is 31.60% higher than that of CoAP-TCP, and that of LiMP-UDP is 72.11% higher than that of CoAP-UDP.
In Figure 19, the efficiency of LiMP-TCP is 39.47% higher than that of CoAP-TCP, and that of LiMP-UDP is 67.10% higher than that of CoAP-UDP.
For the communication between the two phones, the marginal difference between the protocols averages 0.2 ms. In Figure 20, the efficiency of LiMP-TCP is 17.65% higher than that of CoAP-TCP, and that of LiMP-UDP is 21.32% higher than that of CoAP-UDP.
For the communication between the devices and the remote server, the marginal difference between LiMP-UDP and CoAP-UDP averages 5 ms. LiMP-TCP also outperformed CoAP-TCP, with a marginal difference of 0.7 ms. These observations are shown in Figure 21, Figure 22, Figure 23 and Figure 24. In Figure 21, the efficiency of LiMP-TCP is 15% higher than that of CoAP-TCP, and that of LiMP-UDP is 15.90% higher than that of CoAP-UDP.
In Figure 22, the efficiency of LiMP-TCP is 15.04% higher than that of CoAP-TCP, and that of LiMP-UDP is 15.85% higher than that of CoAP-UDP.
In Figure 23, the efficiency of LiMP-TCP is 14.86% higher than that of CoAP-TCP, and that of LiMP-UDP is 16.15% higher than that of CoAP-UDP.
In Figure 24, the efficiency of LiMP-TCP is 15.45% higher than that of CoAP-TCP, and that of LiMP-UDP is 16.37% higher than that of CoAP-UDP.
In summary, for communication over LAN, LiMP-TCP outperformed CoAP-TCP by an average of 21%, whereas LiMP-UDP outperformed CoAP-UDP by over 37%. For device-to-remote-server communication, LiMP outperformed CoAP by an average of 15%. LiMP-UDP achieved the fastest RTT. LiMP-TCP is the better choice over CoAP-TCP in instances where the transport protocol has to be TCP.
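The averages in this summary can be cross-checked from the per-figure percentages quoted in this section. The sketch below assumes that "efficiency" means the relative RTT reduction, (RTT_CoAP − RTT_LiMP) / RTT_CoAP; the text does not state the formula, but this reading is consistent with the Figure 19 remark that a 67.10% gain corresponds to roughly a threefold difference. Figure 11 is left out because only a single percentage is quoted for it.

```python
from statistics import mean


def improvement_pct(rtt_coap_ms, rtt_limp_ms):
    """Relative RTT reduction of LiMP over CoAP (assumed definition)."""
    return (rtt_coap_ms - rtt_limp_ms) / rtt_coap_ms * 100.0


# Percentages quoted in the text for the LAN scenarios after Figure 11,
# listed in figure order (the last entry is Figure 20).
lan_tcp = [10.28, 13.54, 11.22, 14.30, 13.76, 31.60, 39.90, 39.47, 17.65]
lan_udp = [18.04, 20.85, 34.58, 20.54, 44.39, 72.11, 56.22, 67.10, 21.32]

# Percentages quoted for the remote-server scenarios (Figures 21 to 24).
remote_tcp = [15.00, 15.04, 14.86, 15.45]
remote_udp = [15.90, 15.85, 16.15, 16.37]

print(f"LAN TCP mean:    {mean(lan_tcp):.2f}%")
print(f"LAN UDP mean:    {mean(lan_udp):.2f}%")
print(f"Remote mean:     {mean(remote_tcp + remote_udp):.2f}%")
```

Under this reading, the LAN TCP mean is close to 21%, the LAN UDP mean lies above 37%, and the remote mean is between 15% and 16%, in line with the figures stated in the summary.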
5.1. Security Analysis of LiMP
This section provides a security analysis of LiMP in comparison to CoAP. The CoAP header (as shown in Figure 4) contains a 16-bit Message-ID used to identify request/response messages, giving a total of 65,536 possible identifiers. One can argue that reducing the Message-ID to 8 bits (as shown in Figure 7), in the case of LiMP, makes the protocol vulnerable to message spoofing. In this section, we demonstrate the following:
the ease of spoofing both 16-bit and 8-bit Message-IDs, and
how LiMP uses a simple ETag generation mechanism to prevent message spoofing.
Table 4 and Figure 25 show how easy it is to compute both 16-bit and 8-bit Message-IDs. This analysis was made on the same devices used in the benchmark analysis.
It can be observed that enumerating both 8-bit and 16-bit Message-IDs is computationally feasible, which makes it easy for a compromised node to spoof them. The fastest compute time was 1.890 ms in the case of the 16-bit Message-ID and 0.005 ms for the 8-bit Message-ID.
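The size of the two identifier spaces can be illustrated with a short brute-force loop. This is only a sketch of the idea and not the benchmark that produced the timings above; `enumerate_ids` is a hypothetical helper, and the timing it reports depends entirely on the machine it runs on.

```python
import time


def enumerate_ids(bits, target):
    """Try every identifier of the given width until the target is hit.

    Returns the number of guesses made and the elapsed time in ms.
    """
    start = time.perf_counter()
    guesses = 0
    for candidate in range(1 << bits):
        guesses += 1
        if candidate == target:
            break
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return guesses, elapsed_ms


# Worst case: the target is the last identifier in the space.
for bits in (8, 16):
    guesses, elapsed_ms = enumerate_ids(bits, (1 << bits) - 1)
    print(f"{bits}-bit space: {guesses} guesses in {elapsed_ms:.3f} ms")
```

Even the 16-bit space needs at most 65,536 guesses, which is why neither identifier width can be relied on as a defence against spoofing on its own.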
LiMP uses the 8-bit Message-ID as a seed to generate an ETag to prevent message spoofing. The ETag generation technique is based on binary-shift operations, since these are cheaper to compute than other algebraic operations. The ETag generation mechanism (Algorithm 1) uses the seed together with a timestamp value (which is also contained in the message payload) to produce a unique hexadecimal string (the ETag). It is assumed that the IoT devices are time-synchronized; hence, time becomes a good entropy source. Line 1 of Algorithm 1 is the procedure for the generation of the ETag. As input parameters, it requires the seed value and the timestamp (in milliseconds). The binary equivalent of the seed value is prefixed to the binary value of the current timestamp. The resulting value is then converted to its hexadecimal equivalent, which becomes the ETag. The ETag is collision-resistant because it uses the current timestamp as a source of entropy. The seed value ensures that concurrent requests do not end up having the same ETag value.
Algorithm 1 ETag Generation Algorithm.
1: procedure generate(seed, timestamp)
2: % seed (bounded between 0, 255 inclusive), timestamp (time in milliseconds)
3: return bin2hex(bin(seed) + bin(timestamp))
4: end procedure
5: procedure bin2hex(binary_data)
6: return hex(int(binary_data))
7: end procedure
8: procedure hex2bin(hex_data)
9: return bin(int(hex_data))
10: end procedure
11: procedure is_valid(e_tag, timestamp, seed)
12: if generate(seed, timestamp) == e_tag then
13: return True
14: else
15: return False
16: end if
17: end procedure
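A runnable Python sketch of Algorithm 1 is given below. It is one possible reading of the pseudocode and not the authors' implementation: the seed is placed in front of the timestamp with a left shift, and the timestamp is given a fixed 48-bit width (the constant `TIMESTAMP_BITS` is an assumption introduced here) so that the boundary between seed and timestamp is unambiguous.

```python
# Assumed fixed width for the millisecond timestamp; not specified in the text.
TIMESTAMP_BITS = 48


def generate_etag(seed, timestamp_ms):
    """Prefix the 8-bit seed to the timestamp bits and return them as hex."""
    if not 0 <= seed <= 255:
        raise ValueError("seed must be between 0 and 255 inclusive")
    if not 0 <= timestamp_ms < (1 << TIMESTAMP_BITS):
        raise ValueError("timestamp does not fit in the assumed width")
    # Shift the seed past the timestamp field, then OR the timestamp in.
    combined = (seed << TIMESTAMP_BITS) | timestamp_ms
    return format(combined, "x")


def is_valid(e_tag, timestamp_ms, seed):
    """Recompute the ETag from the received fields and compare."""
    return generate_etag(seed, timestamp_ms) == e_tag


def split_etag(e_tag):
    """Recover (seed, timestamp_ms) from an ETag, mirroring hex2bin."""
    combined = int(e_tag, 16)
    return combined >> TIMESTAMP_BITS, combined & ((1 << TIMESTAMP_BITS) - 1)
```

A receiver would call `is_valid` with the Message-ID from the header and the timestamp from the payload. A check that the timestamp is recent would also be needed to reject replays, but that step is outside what Algorithm 1 specifies.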
6. Conclusions and Recommendation
With the proliferation of IoT devices, it has become essential to simplify the development of network and application layer protocols for M2M communication. Although the most common application layer protocols such as CoAP, MQTT, XMPP, DDS, and AMQP are applicable in IoT networks, they are limited in instances where minimal overhead and message sizes are key requirements. For data-centric IoT, minimal header and message complexity, as well as efficient delivery of messages, is key. In this research paper, we proposed a lightweight messaging protocol with a minimal header size (2 bytes) and a PDU of 68 and 44 bytes for TCP and UDP, respectively. It can be argued that the reduced header size compromises the security of the proposed protocol. Therefore, we proposed the use of an ETag as a mechanism through which the spoofing of messages can be detected. We also demonstrated that the proposed messaging protocol has a faster RTT due to reduced complexity: for communication over LAN, LiMP-TCP outperformed CoAP-TCP by an average of 21%, whereas LiMP-UDP outperformed CoAP-UDP by over 37%. For device-to-remote-server communication, LiMP outperformed CoAP by an average of 15%. In the context of IoT, aside from D2D communication, host discovery and multicast communication are essential. Future work will explore how this minified protocol can be leveraged for host discovery with minimal broadcast overhead.