1. Introduction
Rapid advancements in Earth-observing systems have led to a large amount of remote sensing data that can be used in disaster monitoring, response, mitigation and recovery [1,2,3]. These data have become significant in the application of geographic information technology to disaster reduction. Additionally, such systems can be used to retrieve high-spatiotemporal resolution images and establish global coverage of digital earth through aerospace remote sensing technology [4,5,6,7].
Remote sensing images play a critical role in the disaster reduction process because they deliver spatially related information in a direct and intuitive manner [8]. Based on the discrete global grid system [9], observed images are commonly cut into tiles and incorporated into pyramid models to provide broad data service for every phase of disaster management. For instance, high-resolution image layers are widely used as base maps for collaborative plotting [10] and feature extraction [1] in disaster mitigation processes [11].
An image layer is a complete or partially continuous tile pyramid that is built from a set of images. Tiles of different levels or areas of the pyramid can be cut from various image sources. Using Google Maps as an example, its tiles originate from QUICKBIRD, LANDSAT, IKONOS and other satellite sensors [12,13]. Each tile in a specific layer is unique, with its coding representing a single three-dimensional spatial location on the Earth’s surface.
In the disaster reduction process, the demand for image layers is diverse and ever-changing. Using disaster warning as an example, the monitoring of different disasters is associated with distinct requirements regarding the spatial and temporal resolutions of observed remote sensing images (see Figure 1). Moreover, different tasks in one disaster reduction process introduce different needs regarding temporal scales and image resolution [14]. In a complete disaster reduction visualization, the disaster scene continually accesses various image tiles that differ with respect to their temporal and spatial resolutions to satisfy different degrees of terrain feature recognition. For instance, in the process of flood reduction, basic disaster environment visualization requires 1 km spatial resolution for daily monitoring [15]. Additionally, 5 m to 10 m spatial resolution is required for water extraction in the response phase [1], and sub-meter resolution is required for the extraction of flood-affected buildings in the loss assessment phase [16]. These images are generally accessed continuously during each phase of disaster reduction; thus, it is difficult to pre-organize them into the same pyramid-based dataset. Therefore, many datasets are independently stored for diverse disaster reduction tasks. When a disaster occurs, real-time observed images are obtained and used to build new pyramids. Then, new image layers are released to represent the destroyed areas with different spatial and temporal resolutions, in addition to the historical image layers.
With the advancement in disaster reduction studies, an increasingly large number of remote sensing image services overlap in disaster-affected areas. Therefore, tiles that have the same name but come from different datasets create redundancies in the processes of tile retrieval, transmission and visualization. This paper proposes a hybrid NoSQL-based on-demand retrieval method to address this inefficiency in multi-layer image tile services. The method is twofold. First, it provides a semantics-based layer description model, which represents richer dataset information and builds correlations based on task demands. In the retrieval process, the method automatically filters irrelevant layers and selects the most suitable layers for tile retrieval by matching dataset semantic information with the real-time visualization demand. Second, the description model and tile selection process are implemented on a two-layer NoSQL database architecture: an in-memory distributed database serves as the first layer for tile caching, and a document database serves as the second layer for the effective storage and querying of large numbers of tiles. Moreover, at the transmission level, the HTTP/2.0 protocol is adopted to improve tile scheduling efficiency.
The remainder of this paper is organized as follows. Section 2 introduces the concept of multi-layer pyramid overlapping and the visualization requirements. In Section 3, the semantic model and on-demand tile matching flow are presented, while Section 4 introduces the two-layer database design and its implementation. Section 5 presents the experiment. A performance evaluation and the associated discussion are presented in Section 6. Finally, Section 7 offers conclusions.
2. Background and Motivation
The global multi-resolution pyramid is a discrete global grid model that defines a hierarchical division of the surface of the Earth [17,18]. Many division methods have been proposed in the literature, such as the spherical quad-tree [19], the ISEA model, QTM and the latitude/longitude model. The latitude/longitude model is the most practical division model in actual applications because the structure is easily understandable and operable for users to manage large-scale and multi-resolution images and to locate any fixed grid space by applying the following division rules [20]:
- The surface of the Earth is transformed to a regular rectangle ranging from −180 to 180 degrees in longitude and −90 to 90 degrees in latitude. Values outside of this area are invalid.
- The tile span at pyramid level k is two times that at level k + 1. The global pyramid starts from level 0, which has two tiles whose spans are each 180 degrees.
- The ratio of the number of pyramid tiles between the transversal and vertical directions is 2:1 at any pyramid level.
- The coding order for pyramid tiles starts from west to east and from south to north at any pyramid level.
- The coding order for pyramid levels starts from top to bottom.
- When the geographical location and height are specified globally, the resulting tile is unique.
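These rules admit a direct computation of a tile's column and row code from a longitude/latitude pair. The following Python sketch is our own illustration of the rules (the function name and the clamping of the extreme eastern/northern coordinates are assumptions, not code from the paper):

```python
def tile_code(lon: float, lat: float, level: int) -> tuple[int, int]:
    """Column/row code at a pyramid level under the latitude/longitude model:
    level 0 has two 180-degree tiles, spans halve at each level, and coding
    runs west-to-east (columns) and south-to-north (rows)."""
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates outside the valid global rectangle")
    span = 180.0 / (2 ** level)                       # tile span in degrees
    col = min(int((lon + 180.0) // span), 2 ** (level + 1) - 1)
    row = min(int((lat + 90.0) // span), 2 ** level - 1)
    return col, row
```

At level 0 this yields the two root tiles (0, 0) and (1, 0); the `min` clamp keeps the boundary coordinates 180° E and 90° N inside the last tile rather than producing an out-of-range code.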
Based on the rules, any image with a geospatial reference can be mapped into the global grid model. Furthermore, the image can be cut into an independent local pyramid (see Figure 2). The maximum level of the local pyramid is determined based on the spatial resolution of the image. Suppose the pixel size of a tile is s × t; then, the spatial resolution of tiles (f(l)) at level l can be calculated (taking the longitude direction as an example) as follows:

f(l) = 180 / (2^l × s)  (1)
Suppose the spatial area of the image is A and its pixel matrix is m × n; then, the spatial resolution of the image (r(x)) can be defined (taking the longitude direction as an example) as follows:

r(x) = (x_max − x_min) / m  (2)

where (x_min, y_min) and (x_max, y_max) are the minimum and maximum coordinates of area A, respectively. Therefore, the maximum level of the local pyramid (l_max) can be calculated based on Equations (1) and (2) by requiring f(l_max) ≤ r(x), which gives

l_max = ⌈log2(180 / (s × r(x)))⌉.
Thus, a unique tile can be computed based on the level, column and row number. Moreover, if a longitude and latitude (x, y) are given, the associated tile coding (X, Y) in any level of the pyramid can be retrieved.
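Under these rules, the quantities above can be computed directly. The helpers below are a sketch; the function names and the default tile width s = 256 (a common tile size, not fixed by the paper) are our own assumptions:

```python
import math

def tile_resolution(level: int, s: int = 256) -> float:
    """Longitude resolution f(l) in degrees/pixel of a tile at a level:
    the level's tile span, 180 / 2**level degrees, spread over s pixels."""
    return 180.0 / (2 ** level * s)

def image_resolution(x_min: float, x_max: float, m: int) -> float:
    """Longitude resolution r(x) of an image whose m pixel columns cover
    the longitude extent [x_min, x_max] of area A."""
    return (x_max - x_min) / m

def max_pyramid_level(r: float, s: int = 256) -> int:
    """Smallest level whose tile resolution f(l) is at least as fine as r."""
    return math.ceil(math.log2(180.0 / (s * r)))
```

For example, an image whose resolution equals that of level-5 tiles maps to a maximum local pyramid level of 5.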
Clearly, when the pyramid layer is single, tile visualization is simple because the result of a retrieval based on the above formulas is unique. However, in disaster reduction applications, multiple pyramids are produced and overlaid in the same disaster-affected area. Every new layer is an independent sub-pyramid from a respective data source. As shown in Figure 3, layers A, B and C overlap at pyramids of level k, k + 1 and k + 2 in the same area, which produces namesake tiles that belong to the respective layers at these levels. When viewpoint retrieval is used for a tile in this region, it cannot distinguish the namesake tiles from these layers. Repeated tile access and loading lead to visual confusion and discontinuity in refreshed scenes, and more resources are wasted in servers and transmission.
The traditional tile retrieval method is oriented toward a single layer, and it calculates a unique tile code based on the spatial location of the current viewpoint [21,22]. When multiple image layers are overlaid in a spatial region, redundant tiles can exist with the same name but from different layers in certain areas that correspond to the spatial intersection of multiple image datasets [23]. Because the viewpoint-based tile retrieval method cannot distinguish the target tile from namesake tiles as multiple layers are overlaid, users generally perform layer switching to ensure that the visible layer is unique at any given time [24]. In such a situation, the target layer is clear before any tile searching occurs.
However, in a disaster reduction process, real-time image data are continuously accessed to build new layers, each of which is considerably different with respect to the temporal and spatial resolutions and extents [25]. Loading only one layer at a time can neither represent the complete disaster characteristics of a disaster-affected area nor satisfy the variety of spatial-temporal resolution demands in the process of disaster reduction visualization. Additionally, it is complex and time consuming to integrate various datasets with different spatial resolutions and scale them into one pyramid, which may also lead to temporal chaos if the temporal resolutions of the integrated datasets do not match.
The viewpoint-based tile retrieval method cannot filter the namesake tiles from multiple layers; it must traverse every dataset stored in the database according to tile level, row and column codes [26]. Then, all of the searched tiles are repeatedly loaded from the server to the client. Clearly, considerable tile redundancy occurs and wastes service resources in the process of retrieval. In the servers, every tile request forces the database to search all of the image datasets and return a group of namesake results, causing repeated and unnecessary retrieval in irrelevant datasets, as well as query speed degradation. In the clients, the namesake tiles are received, parsed and drawn for visualization individually. This process considerably decreases the refresh speed of the visual scene and greatly increases the memory pressure of the clients. In the transfer layer, large numbers of invalid tiles occupy the majority of the bandwidth resources, which seriously affects the transmission rate of available tiles. Moreover, as the number of datasets stored in the database increases, the wasted resources and low efficiency of retrieval become more serious.
An on-demand retrieval method for multi-layer image tiles is proposed in this paper. It forces servers to automatically analyze the visualization demand based on the viewpoint location and reduction task information. Then, the tile is actively selected from suitable layers to respond to the clients, instead of traversing all the datasets or relying on a man-machine layer control switch. As shown in Figure 4, when the viewpoint is far from the surface of the Earth, servers provide tiles from the low-resolution pyramid A for basic visualization. As the viewpoint moves to the flood-affected area, the servers select medium-resolution pyramid B for the flood extraction tasks. When the viewpoint is close to the surface of the flooded city, the servers actively choose high-resolution tiles only from pyramid C for the loss assessment tasks. This autonomy takes full advantage of the flexibility and diversity of the multi-layer image service to satisfy the various demands of disaster reduction visualization and guarantee an efficient tile request for 3D scene refreshing by the clients.
3. Methodology
The viewpoint-based method can easily retrieve unique tiles from a single layer, but as the number of overlaid layers increases, this method cannot distinguish the most suitable tiles from multiple datasets. To overcome this problem, we propose a semantic description model of image layers. Multi-dimensional semantics are defined in the model, and an annotation approach based on the resource description framework (RDF) is used to describe the layers. Layer selection and tile filtering are performed in a matching process.
3.1. Semantic Description of an Image Layer
Semantic annotation has been widely used in geoprocessing, particularly in the composition of semantic web services [27,28]. This approach features high semantic integrity and formalizes metadata representation to make the metadata machine readable [29], which improves intercomprehension and interoperation among datasets [30,31].
The semantic description model contains five elements: the spatial semantics describe the spatial region range of a dataset; temporal semantics record the collection time of remote sensing images; resolution semantics define the vertical range of the local pyramid; priority semantics control the order in which datasets are scheduled; and theme semantics are a set of sub-tags that record disaster information, such as disaster areas, disaster types, and other factors. The detailed descriptions and functions of the semantics are as follows.
3.1.1. Spatial Semantics
The two-dimensional range of a dataset on the surface of the Earth is described by spatial semantics. In the disaster reduction process, a new set of images is typically cut and joined into one image whose shape seamlessly fits with the administrative region of the disaster area. Spatial semantics represent a complex polygon shape that accurately describes the true extent of the composite image. Spatial semantics are used to precisely filter irrelevant datasets that are outside the viewport.
3.1.2. Temporal Semantics
Temporal semantics describe the order and life cycle of the dataset in the time dimension. The time range of each dataset is defined in the form of a time stamp that provides support for sequence analysis and multi-temporal presentation of multiple layers. These are typical applications in which tiles from multiple layers are requested and scheduled at the same time; thus, the stamp is the key indicator that controls the selection sequence of namesake tiles.
3.1.3. Resolution Semantics
The spatial resolution of an image determines the maximum level of its pyramid. In the pyramid building process, the maximum level of the current pyramid reflects the resolution semantics, which specify the visible extent of the image layer in the vertical range of 3D space and thus represent the functional information of the current pyramid. For instance, when the viewpoint reaches a location that is close to the surface, the user is more concerned with the feature details of the high-resolution image dataset than with the low-resolution texture of the global background dataset. In such a situation, the level information is applied to further exclude layers that are outside the field of view of interest.
3.1.4. Priority Semantics
Priority is the secondary index that determines the order in which the layers are loaded. In the emergency response and loss assessment stages of disaster reduction, many new layers overlap with respect to spatial extent, visible depth and life cycle. Priority is used to distinguish these highly overlapping layers. The priority is initialized manually, based on expert experience, when the pyramid is stored in the database. For example, if the current images are more suitable for water extraction in a flood disaster, the priority of the corresponding pyramid is set higher than that of pyramids from other sensors. A default setting is also supported: the latest stored pyramid has a higher priority when the other indicators are the same. This setting ensures that the tile search process can always find a single appropriate layer, in keeping with the tile uniqueness rule of the global discrete grid division.
3.1.5. Theme Semantics
The theme semantic constraint is established to distinguish image differences in terms of sensor types, data sources, related disaster events and other information. In the disaster area visualization task, different disasters feature various preferences regarding image resolution and corresponding sensor types. In theme semantics, a group of sub-tags is built to describe the theme characteristics of the image layer. It also effectively increases the association constraints of different visualization tasks and datasets, which further supports users in selecting and discarding layers in batches.
The RDF is adopted to represent the semantic annotation. An RDF file effectively organizes the semantics of each dataset and transforms them into resource information that the database can easily identify and analyze. An RDF file, as shown in Scheme 1, is produced when a new pyramid has been built. Then, the annotation information is interpreted and stored in the database for the matching process.
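As a concrete illustration of the five elements, the annotation for one flood-related layer might carry content such as the following before serialization to RDF. The field names and every value below are hypothetical examples of ours, not the paper's actual RDF vocabulary, which Scheme 1 defines:

```python
# Hypothetical annotation content for one image layer; in the paper this
# information is serialized to an RDF file when the pyramid is built.
layer_annotation = {
    "spatial": {                        # true extent as a polygon (WKT here)
        "boundary": "POLYGON((112.0 29.0, 114.5 29.0, 114.5 31.2, "
                    "112.0 31.2, 112.0 29.0))",
    },
    "temporal": {                       # collection time range as time stamps
        "start": "2016-07-01T00:00:00Z",
        "end":   "2016-07-03T00:00:00Z",
    },
    "resolution": {"max_level": 16},    # vertical visibility of the pyramid
    "priority": 5,                      # scheduling order among overlapping layers
    "theme": {                          # sub-tags linking the layer to tasks
        "disaster_type": "flood",
        "task": "water-extraction",
        "sensor": "GF-2",
    },
}
```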
3.2. The Matching Process
The traditional viewpoint-based tile retrieval method cannot determine the differences among namesake tiles belonging to different layers; instead, it must traverse all the datasets to find available tiles. The improvement presented in this paper lies in the description of the multi-dimensional features of the dataset based on semantic information and adaptive matching tiles combined with the current viewpoint position information in the retrieval process.
Figure 5 shows the process of automatic tile matching and filtering.
The detailed steps of the above process are as follows:
- Step 1: A tile request is obtained and first parsed into requirement semantics of visualization tasks, including the geographic extent of the viewport, pyramid level of the tile, theme information, priority requirement and time constraints.
- Step 2: The metadata of all the datasets are traversed to perform the intersection operation based on the viewport range and the boundary polygon of every dataset; the datasets that intersect the viewport form list1.
- Step 3: Traverse list1 to select datasets whose pyramid level range contains the currently requested level; the search results generate list2.
- Step 4: Match the theme information with the theme tags of every dataset in list2; the matched datasets compose list3. Datasets in list3 represent task preferences for layer services.
- Step 5: In list3, sort the datasets based on priority and select those with the highest priority to generate list4.
- Step 6: A time constraint is applied to select the most suitable dataset from list4. The default setting is to choose the latest dataset; a unique dataset can also be searched for by setting a special time period in the tile request condition.
Additionally, as shown in Figure 5, each diamond represents a judgment: if the current list has only one member, the ideal dataset has been selected, and the process goes straight to the end node. If the current list has no members, the process relaxes the filter conditions by restoring the previous list and skipping from the current filtering node to the next one. In particular, if the first judgment node returns null, meaning that no dataset intersects the current viewport range, the process stops and returns null to the application server.
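The cascade can be sketched as successive filters with the fallback rule above: a filter that would empty the candidate list is skipped, while an empty spatial intersection terminates the search. The sketch below is illustrative only; it uses bounding boxes instead of the paper's complex polygons and omits the early exit when a single survivor remains:

```python
def intersects(a, b):
    """Axis-aligned bounding-box test; boxes are (x0, y0, x1, y1)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def match_layer(request, datasets):
    def narrow(current, predicate):
        kept = [d for d in current if predicate(d)]
        return kept if kept else current   # too strict: skip this filter node

    # Step 2: spatial intersection is mandatory; an empty result ends the search.
    list1 = [d for d in datasets if intersects(d["bbox"], request["viewport"])]
    if not list1:
        return None
    # Step 3: the pyramid level range must contain the requested level.
    list2 = narrow(list1, lambda d: d["min_level"] <= request["level"] <= d["max_level"])
    # Step 4: theme tags must match the task's theme information.
    list3 = narrow(list2, lambda d: request["theme"] in d["themes"])
    # Step 5: keep only the highest-priority datasets.
    top = max(d["priority"] for d in list3)
    list4 = [d for d in list3 if d["priority"] == top]
    # Step 6: by default, the most recently acquired dataset wins.
    return max(list4, key=lambda d: d["time"])
```

A request over a flood-affected area at a deep level would thus select a high-priority flood layer, while a theme with no matching layer falls back to the previous candidate list instead of returning nothing.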
The process of layer matching and filtering can effectively select the most suitable image layer that satisfies the current visualization task. To take full advantage of server efficiency for data searching and information publishing, the implementation of the matching and filtering process is arranged on the server side. A two-layer NoSQL database architecture is designed to effectively meet the storage and access needs of mass datasets.
4. Implementation
A two-layer hybrid architecture based on key-value and document NoSQL databases is designed for efficient layer storage and tile retrieval. NoSQL databases have been widely used to store and manage large amounts of image data retrieved via pyramid and hash indexing techniques on the server side [32]. The pyramid provides the rules for image cutting, and hash indexing offers a solution for storing and searching these tiles in a highly effective manner. NoSQL is one of the key technologies for large-capacity storage and fast retrieval oriented toward the continuous growth of datasets [33,34,35,36]. The main features of NoSQL databases are as follows [7,37]:
The most popular categories of NoSQL databases include key-value databases and document databases. Key-value databases are used for fast and simple operations because they provide simple mapping from the key to the corresponding value, which yields fast object searches. Document databases offer flexible data models with query possibilities. They are considered an evolution of key-value databases because they include all the benefits of key-value databases while supporting strong and rich query capabilities.
4.1. Storage Design
The two-layer database architecture includes the storage layer and the cache layer. The storage layer adopts MongoDB to manage large numbers of image layers and their metadata. MongoDB provides a rich data structure to store the semantic information of datasets. In MongoDB, the GridFS structure supports efficient storage of mass tile data, and the collection structure provides effective metadata management [38]. Each tile pyramid is stored in GridFS, and the RDF file of its metadata is stored as a document in the metadata collection.
The cache layer provides an extensible cache pool based on Redis, a key-value NoSQL database. As an in-memory database, Redis offers high performance for data reading, writing and querying. Additionally, its horizontal scalability and distributed design support fast caching and the secondary access requirements of massive tile data. Cached tiles are evenly distributed among the memory nodes. The key of each tile is designed as “Dataset name: Tile location”; the dataset name is based on the GridFS name of the dataset that contains the tile in MongoDB, and the tile location includes the row, column and level number of the discrete global grid system. This design ensures consistent tile indexes across the two databases.
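For example, the composite key might be assembled as follows. The exact separator and the ordering of level, column and row inside the location part are our assumptions; the paper only fixes the “Dataset name: Tile location” pattern:

```python
def tile_key(dataset_name: str, level: int, col: int, row: int) -> str:
    """Redis key in the 'Dataset name: Tile location' pattern, where the
    dataset name matches the tile's GridFS dataset in MongoDB."""
    return f"{dataset_name}:{level}_{col}_{row}"
```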
On top of the two-layer databases, an application server is built to implement the layer matching process. When a tile request is received, the application server parses the request and performs steps 2 and 3 (see Scheme 2) of the process via a spatial union query in MongoDB. The query results, including theme, time and priority semantics, are returned to the application server. The remaining steps of the process are quickly implemented in the server memory. The suitable layer is selected at the end of the matching process. Then, the server assembles the tile key to search in Redis. If no tile is found in the cache, the server searches again in MongoDB; the result is first returned to the server and subsequently sent to Redis for caching. The entire query process is shown in Figure 6.
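The cache-aside part of this flow can be sketched as follows; plain dictionaries stand in for the Redis cluster and MongoDB GridFS so that the logic is runnable without either server (a real deployment would use the redis-py and pymongo clients):

```python
class TileService:
    """Sketch of the two-layer tile lookup, with stand-in stores."""

    def __init__(self, store):
        self.cache = {}      # stand-in for the Redis cache layer
        self.store = store   # stand-in for MongoDB GridFS storage

    def get_tile(self, dataset: str, level: int, col: int, row: int):
        key = f"{dataset}:{level}_{col}_{row}"
        tile = self.cache.get(key)          # 1. try the cache layer first
        if tile is None:
            tile = self.store.get(key)      # 2. fall back to the storage layer
            if tile is not None:
                self.cache[key] = tile      # 3. populate the cache for reuse
        return tile
```

After a miss is served from storage, the same tile is answered from the cache on secondary access, which is the behavior the cache layer exists to provide.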
4.2. Scheduling Optimization
To enhance the response efficiency and transfer stability between the clients and servers, tile scheduling optimization is implemented, including the multi-queue least recently used (LRU) replacement method and HTTP/2.0-based multiplexing tile transmission.
4.2.1. Multi-Queue LRU Replacement
The technique of geospatial data caching is widely used in cloud-based environments [39,40]. A pyramid model of tiles provides a good management and caching method for geospatial data in a cloud-based environment [41]. Prefetching methods that predict data access [42,43] have been proposed to improve tile caching when the cache storage size is limited. As large memory sizes become more available in caching techniques, taking advantage of memory database capabilities (large storage size and high-efficiency data retrieval and memory updating) could further enhance the tile hit rate and secondary access efficiency in the visualization process.
The LRU algorithm, which is widely used in cache data management, eliminates data according to the historical records of data access; its core idea is that recently visited data have a higher probability of being visited again [44]. The multi-queue LRU replacement algorithm divides the LRU queue into several queues, each with a different access priority. Multi-queue LRU has a higher hit rate than the traditional LRU algorithm but is also associated with higher complexity and computational costs. Because the clustered in-memory database has sufficient memory capacity and query efficiency, the multi-queue LRU replacement algorithm is adopted to manage cached tiles in memory.
As demonstrated in Figure 7, there are several queues in the cache, from Q0 to Qk, with access priorities rising in turn. A new tile is stored in Q0 when it is first requested by a client. As its number of accesses increases beyond the threshold, the tile is removed from the current queue and added to the head of a higher-level queue. If the tile has not been accessed for a certain time, it is removed from the current queue and sent to the head of a lower-level queue. Tiles at the end of a full queue are removed and pushed to the head of a historical queue. The order of tile retrieval in the cache depends on the access priority of the queues, and the history queue has the lowest priority. If a tile in the history queue is re-accessed, its priority is restored, and it returns to the head of the queue it last occupied. Finally, the history queue clears the “useless” tiles according to the LRU rule. This algorithm makes full use of the efficient retrieval capability of Redis and takes advantage of the clustered memory space to retain popular tiles for as long as possible, which ensures a high tile hit rate and improves the secondary access efficiency.
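A compact sketch of this scheme follows. The queue capacities and promotion threshold are illustrative parameters of ours, and the time-based demotion to a lower queue is omitted for brevity:

```python
from collections import OrderedDict

class MultiQueueLRU:
    """Sketch of multi-queue LRU: queues Q0..Qk with rising access
    priority plus a history queue for evicted tiles."""

    def __init__(self, levels=3, capacity=4, promote_after=2):
        self.queues = [OrderedDict() for _ in range(levels)]  # Q0 .. Qk
        self.history = OrderedDict()   # evicted tiles, lowest priority
        self.hits = {}                 # access counts per tile key
        self.capacity = capacity       # per-queue capacity (illustrative)
        self.promote_after = promote_after

    def _evict(self, level):
        # Tiles falling off a full queue go to the head of the history queue;
        # the history queue itself is cleaned by the plain LRU rule.
        while len(self.queues[level]) > self.capacity:
            key, value = self.queues[level].popitem(last=False)  # LRU end
            self.history[key] = (value, level)
            while len(self.history) > self.capacity:
                self.history.popitem(last=False)

    def put(self, key, value):
        # A newly requested tile always enters Q0.
        self.queues[0][key] = value
        self.hits[key] = 1
        self._evict(0)

    def get(self, key):
        for level, queue in enumerate(self.queues):
            if key in queue:
                self.hits[key] += 1
                queue.move_to_end(key)                    # refresh recency
                if (self.hits[key] >= self.promote_after
                        and level + 1 < len(self.queues)):
                    value = queue.pop(key)                # promote one level
                    self.queues[level + 1][key] = value
                    self.hits[key] = 1
                    self._evict(level + 1)
                    return value
                return queue[key]
        if key in self.history:                           # re-access restores
            value, level = self.history.pop(key)
            self.queues[level][key] = value
            self.hits[key] = 1
            self._evict(level)
            return value
        return None
```

Frequently accessed tiles climb toward Qk and survive longer, while cold tiles drift into the history queue and are eventually discarded.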
4.2.2. HTTP/2.0-Based Multiplexing Tile Transmission
Tile requests from distributed clients are converted to HTTP requests and sent to the server. Then, the server parses the URL and searches the database. After the tiles are retrieved, they are returned to the clients through the network. HTTP is an application protocol that is most commonly used on the Internet; it is also the common language between the clients and servers.
Although most hypertext transfer on the Web is still based on HTTP/1.x, data scheduling based on the traditional HTTP/1.x transmission protocol suffers from poor temporal efficiency and continuity. Repeated connection establishment and disconnection between the clients and servers leads to substantial time consumption, and the repeated transmission of HTTP headers wastes a large amount of network resources.
In a real-time tile visualization application, the application server must respond to many tile requests in a relatively short period to ensure continuous visualization by the clients. In HTTP/1.x, a connection is established for a single request-response exchange, and the connection is closed when the response is received; this process of establishment and disconnection requires a large amount of extra transfer time. Moreover, HTTP/1.x uses a pipeline approach that queues several requests into a serial, single-threaded process: a request at the back of the queue must wait until the preceding request has been handled. This single-threaded method is prone to tile blocking and response timeouts when the number of requested tiles rapidly increases, which easily causes visualization stalls in the clients.
The HTTP/2.0-based transport protocol can achieve asynchronous connection multiplexing, which can greatly enhance the efficiency of tile transmission. In contrast to HTTP/1.x, which builds a connection for every request-response, HTTP/2.0 establishes a continuous connection for all data requests. It decomposes a request message into multiple frames for transmission. The frames are reassembled at the receiving end so that a number of requests can be transmitted and received in parallel without affecting each other. The responses are also sent and received in this same way (see Figure 8). The multiplexing tile transmission of HTTP/2.0 fully uses the network to minimize tile blocking in the concurrent access situation.