### *5.2. Ranking*


The Ranking component implements ranking mechanisms for IoT resources. Its purpose is to help users and applications not only find a set of resources relevant to their needs, but also select the best or most appropriate one(s) from that set. There are multiple criteria for ranking IoT resources, such as data type, proximity, latency and availability. Therefore, IoTCrawler's Ranking component supports application-dependent, multi-criteria ranking. Within the IoTCrawler framework, the Ranking component is available to the Search Enabler component to facilitate entity discovery. The Ranking component relies on an NGSI-LD-compliant endpoint as a backend, which is typically the Indexing component but can be any NGSI-LD broker. Upon receiving a query request and its ranking criteria, the Ranking component first forwards the query to the underlying index or broker to get the set of IoTStream entities matching the query. A ranking function then computes a ranking score for each result according to the ranking criteria, and the score is attached to each IoTStream result as an additional property. The ranking criteria specify the relevance of different properties to the application. The current ranking function computes a weighted average of the QoI values of an IoTStream entity, where the weight values are specified in the ranking criteria, but it can easily be adapted to other ranking criteria. In this way, the ranking addresses the search requirement **R-3** by ordering the search results.
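The weighted-average ranking described above can be sketched as follows. This is an illustrative Python sketch, not the actual IoTCrawler implementation; the QoI property names (`completeness`, `timeliness`) and the `rankingScore` property name are assumptions.

```python
# Hypothetical sketch of the ranking function: a weighted average of the
# Quality-of-Information (QoI) values of each IoTStream result, attached
# to the result as an additional property. Property names are illustrative.

def ranking_score(stream: dict, weights: dict) -> float:
    """Weighted average of the stream's QoI values; the weights come from
    the ranking criteria supplied with the query."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(w * stream.get(prop, 0.0) for prop, w in weights.items()) / total

def rank(results: list, weights: dict) -> list:
    for stream in results:
        # Attach the score to each IoTStream result as an additional property.
        stream["rankingScore"] = ranking_score(stream, weights)
    return sorted(results, key=lambda s: s["rankingScore"], reverse=True)

streams = [
    {"id": "urn:ngsi-ld:IotStream:1", "completeness": 0.9, "timeliness": 0.5},
    {"id": "urn:ngsi-ld:IotStream:2", "completeness": 0.6, "timeliness": 0.9},
]
ranked = rank(streams, {"completeness": 0.7, "timeliness": 0.3})
```

Swapping in a different `ranking_score` function is all that is needed to support other ranking criteria, which mirrors the adaptability claimed above.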

The Ranking component offers an extended NGSI-LD interface, where ranking criteria can be specified in addition to the query. To avoid any influence of the indexing strategies implemented by the Indexing component and to focus on the performance of the Ranking component itself, we evaluated the Ranking component in a simplified architecture consisting only of the Ranking component and an NGSI-LD broker. Although the Ranking component supports horizontal scaling (adding more instances behind a load balancer) due to its stateless implementation, in this evaluation we tested only a single instance. To assess the scalability of the component, multiple queries were sent, both directly to the broker and to the ranking + broker combination. For the ranking + broker combination, we used a single ranking weight as the ranking criterion, meaning that results were sorted based on the value of a single property. We varied the number of concurrent query requests and measured the latency in retrieving the results. Each request returned 1000 entities, with an entity size of approximately 7 kB.
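A request against such an extended interface might look as follows. This is a hedged sketch only: the host, the endpoint path and the `ranking` parameter name are assumptions for illustration, since the paper does not specify the exact wire format of the extended interface.

```python
# Hypothetical request to the Ranking component's extended NGSI-LD
# interface: a standard entity query plus ranking criteria. The component
# forwards the query part to the underlying index/broker and then scores
# the returned IotStream entities. Endpoint and parameter names are assumed.
import json
import urllib.parse

base = "http://ranking.example:8080/ngsi-ld/v1/entities"
params = {
    "type": "IotStream",
    # Single ranking weight, as in the evaluation: results are sorted
    # by the value of one property only.
    "ranking": json.dumps({"completeness": 1.0}),
}
url = base + "?" + urllib.parse.urlencode(params)
```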

The results shown in Figure 16 indicate that the Ranking component introduces a small latency in retrieving the results, but it can nevertheless scale with the volume of query requests.

**Figure 16.** Ranking latency.

### *5.3. Search Enabler*

The Search Enabler component is responsible for providing a functionally rich query language and the search interface for searching over the metadata of discovered sensors and streams. Using GraphQL technology, the IoTCrawler search component offers end-to-end functionality for performing complex queries, allowing users to access data coming from distributed large-scale IoT deployments. Any complex GraphQL query is decomposed and resolved via a corresponding number of atomic NGSI-LD queries, as prescribed by the NGSI-LD standard. The schema-based approach of GraphQL makes it possible to describe key entities (see the IoTStream ontology [14]) and the relationships between them. A compiled schema becomes the basis for the query parser/validator engine and for a GUI, where users can design their queries. To comply with the linked data approach, all types and their properties in the schema are annotated with type URIs according to the IoTCrawler data model. Annotations describe hierarchical relations (equivalent to subclassOf) between types, which are considered during the query resolution process. This makes the component fully compliant with the ontologies used for data modelling. For example, to describe a set of sensors hosted by a platform, a correct definition in terms of the SOSA ontology would be "system hosted by a platform", which means that sensors, actuators and other subtypes belong to the more generic type used in this statement. The use of types and subtypes, and the consideration of their relations during the query resolution process, is an exclusive feature of the Search Enabler component developed for the IoTCrawler platform. Another exclusive feature developed for IoTCrawler is the resolution of nested filters on top of NGSI-LD. Nested filters are equivalent to join clauses in traditional query languages (e.g., SPARQL), where multiple entity types can be returned or used as filters in a query.
The recursive query resolution processor traverses all the types used as filters or output fields and initiates the corresponding number of NGSI-LD requests. GraphQL queries designed and tested via GraphiQL (GUI) can be integrated into IoT applications and executed programmatically. Results are returned in machine-interpretable JSON format. Alongside the GraphQL-based search, IoTCrawler is equipped with a rule-/pattern-based generator and mapping mechanism for generating filter conditions [7]. As a result, a state-based context model empowers GraphQL queries with context-based reasoning. The described search functionality of the Search Enabler is performed on top of the federated metadata infrastructure, which employs security- and privacy-aware mechanisms.
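A fragment of such an annotated schema could look as follows. This is a hypothetical sketch, not the actual IoTCrawler schema: the directive names (`@uri`, `@subclassOf`) and field names are assumptions chosen to illustrate the URI annotation and subclassOf relations described above; the ontology URIs are the standard SOSA and IoTStream namespaces.

```graphql
# Illustrative annotated schema fragment; directive and field names are assumed.
type IotStream @uri(value: "http://purl.org/iot/ontology/iot-stream#IotStream") {
  id: ID!
  generatedBy: Sensor
}

# Sensor is annotated as a subclass of the more generic sosa:System, so a
# query phrased over systems also matches sensors during resolution.
type Sensor @uri(value: "http://www.w3.org/ns/sosa/Sensor")
            @subclassOf(value: "http://www.w3.org/ns/sosa/System") {
  id: ID!
  name: String
  isHostedBy: Platform
}

type Platform @uri(value: "http://www.w3.org/ns/sosa/Platform") {
  id: ID!
}
```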

The Search Enabler component offers a GraphQL interface, where search queries expressed in GraphQL are resolved via HTTP requests over the NGSI-LD interface. Since NGSI-LD allows querying only one type of entity per request, complex GraphQL queries (requesting more than one data type) require a corresponding number of NGSI-LD requests. The number and order of the subsequent requests are determined by the Search Enabler according to the structure of the GraphQL query. For example, a simple query of stream identifiers (streams{ id }) would be resolved by a single NGSI-LD request for entities of type iot-stream:IotStream (query #1). Extending the query with the names of sensors (query #2) requires an additional resolution step: one NGSI-LD request for each sensor ID associated with a stream from the result list of query #1. Further extension of query #2, e.g., with the names of the properties observed by the sensors, requires yet another resolution step: one NGSI-LD request for each property ID associated with the sensors. In case different sensors observe the same property, the Search Enabler avoids duplicating NGSI-LD requests.
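The resolution scheme for query #2 can be sketched as follows. The broker call is stubbed and the entity shapes are simplified assumptions; the point is the request pattern: one NGSI-LD request for the stream type, then one request per *unique* referenced sensor.

```python
# Illustrative sketch of resolving the GraphQL query
#   streams { id  sensor { name } }
# into NGSI-LD requests, with deduplication of repeated references.

def ngsi_ld_get(entity_type=None, entity_id=None):
    """Stub standing in for an HTTP GET against the NGSI-LD broker."""
    sensors = {
        "urn:s1": {"id": "urn:s1", "name": "Sensor 1"},
        "urn:s2": {"id": "urn:s2", "name": "Sensor 2"},
    }
    if entity_type == "iot-stream:IotStream":
        return [{"id": "urn:st1", "generatedBy": "urn:s1"},
                {"id": "urn:st2", "generatedBy": "urn:s1"},  # same sensor twice
                {"id": "urn:st3", "generatedBy": "urn:s2"}]
    return sensors[entity_id]

def resolve_streams_with_sensors():
    requests_made = 1
    streams = ngsi_ld_get(entity_type="iot-stream:IotStream")  # query #1
    cache = {}
    for s in streams:
        sid = s["generatedBy"]
        if sid not in cache:            # dedup: each sensor fetched only once
            cache[sid] = ngsi_ld_get(entity_id=sid)
            requests_made += 1
        s["sensor"] = {"name": cache[sid]["name"]}
    return streams, requests_made

streams, n_requests = resolve_streams_with_sensors()
```

With three streams but only two distinct sensors, the sketch issues 3 NGSI-LD requests rather than 4, illustrating the duplicate-avoidance mentioned above.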

For performance benchmarking, four different GraphQL queries have been selected. The queries differ in their complexity (requesting from 1 to 4 different entity types) and therefore require different numbers of NGSI-LD requests. The expected number of NGSI-LD requests *N* depends on (1) the number of requested types *T* and, if *T* > 1, on (2) the number of unique entities *R* of the subsequent types referenced in the result set. More formally:

$$N = (T - 1) \cdot R + 1 \tag{1}$$
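As a quick worked check of Equation (1): a single-type query needs exactly one request, while a 3-type query over 100 referenced entities needs (3 − 1) · 100 + 1 = 201 requests. A minimal sketch:

```python
# Worked check of Equation (1): expected number of NGSI-LD requests N for
# a GraphQL query over T entity types, where R is the number of unique
# entities of each subsequent type referenced in the result set.

def expected_requests(T: int, R: int) -> int:
    if T <= 1:
        return 1              # a single-type query resolves in one request
    return (T - 1) * R + 1
```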

The caching mechanism avoids duplicate requests, so the real number of requests can be significantly lower than expected. During the experiments, we measured the average GraphQL query execution times and summed the execution times of the corresponding NGSI-LD queries. The dependency on the number of results is demonstrated by limiting them within the range 1–500 with a step size of 100. Each experiment was repeated 10 times and the average times were calculated. Figure 17 shows the average query execution time as a function of the number of results. Figure 18 compares the GraphQL query execution time with the summed execution time of the corresponding NGSI-LD requests. From Figure 18d, it can be seen that GraphQL query execution is faster than the execution of the corresponding NGSI-LD requests. This can be explained by the particular query's structure, in which two types (observable properties and platforms) can be resolved in parallel. In the cases without parallel type resolution (Figure 18a–c), the overhead of the GraphQL engine is no higher than 0.2 s (1% of the overall query execution time). For complex queries with parallel type resolution, the overhead is fully compensated. The experiments were conducted using the NGSI-LD broker (Scorpio) running on an Intel NUC i5-5250U with 8 GB of RAM. The Search Enabler and the GraphQL client were running on a laptop with an Intel Core i7-5600U and 16 GB of RAM; both were connected to a 1 GB/s local network.

The Search Enabler addresses the machine-initiated search challenge (**R-2**) by providing programmatic interfaces (APIs), to which remote IoT applications can send search requests and get results back in an automated way.

**Figure 18.** Request execution time (Next Generation Service Interface for Linked Data (NGSI-LD)) vs. query execution time (GraphQL); panel (**d**): 4 types (+Platforms).
