*Article*

**LandQ*v2*: A MapReduce-Based System for Processing Arable Land Quality Big Data**

**Xiaochuang Yao 1,\*, Mohamed F. Mokbel 2, Sijing Ye 3, Guoqing Li 1, Louai Alarabi 2, Ahmed Eldawy 4, Zuliang Zhao 5, Long Zhao 5 and Dehai Zhu 5,\***


Received: 25 May 2018; Accepted: 6 July 2018; Published: 10 July 2018

**Abstract:** Arable land quality (ALQ) data are a foundational resource for national food security. With the rapid development of spatial information technologies, the annual acquisition and update of ALQ data covering the country have become faster and more accurate. ALQ data are mainly vector-based spatial big data in the ESRI (Environmental Systems Research Institute) shapefile format. Although the shapefile is the most common GIS vector data format, the usage of ALQ data is, unfortunately, very constrained by its massive size and the limited capabilities of traditional applications. To tackle these issues, this paper introduces LandQ*v2*, a MapReduce-based parallel processing system for ALQ big data. The core of LandQ*v2* comprises four key technologies: data preprocessing, the distributed R-tree index, the spatial range query, and the map tile pyramid model-based visualization. Firstly, ALQ big data are transformed by a MapReduce-based parallel algorithm from the ESRI Shapefile format to the GeoCSV file format in HDFS (Hadoop Distributed File System); then, spatial coding-based partitioning and R-tree indexing are executed to support the spatial range query operation. In addition, the visualization of ALQ big data with a GIS (Geographic Information System) web API (Application Programming Interface) uses a MapReduce program to generate a single image or pyramid tiles for big data display. Finally, a set of experiments running on a live system deployed on a cluster of machines shows the efficiency and scalability of the proposed system. All of the functions supported by LandQ*v2* are integrated into SpatialHadoop, and they can also efficiently support other distributed spatial big data systems.

**Keywords:** spatial big data; parallel processing; MapReduce; arable land quality (ALQ); GIS

#### **1. Introduction**

Arable land is a foundational resource for national food security. Arable land quality (ALQ) data describe the quality evaluation of arable land through classification and gradation. With the rapid development of spatial information technologies, the annual acquisition and update of ALQ data covering the country have become faster and more accurate. ALQ data are dominated by spatial vector data in the ESRI shapefile format. In China, the government began to perform all-round arable land quality monitoring [1,2] and evaluation work in 2011, and by 2013 it had formed a relatively complete and large arable land quality dataset of approximately 2.51 TB in the ESRI shapefile format [3]. Making such an enormous amount of data readily available requires a management information system with high-performance computing, which plays a crucial role in giving the government access to the data for scientific decision-making and in keeping this valuable spatial data safe, readily available, and practical to use.

The traditional standalone GIS development pattern has been challenged by the volume, velocity, and variety of the arable land quality big dataset. Our team previously developed a GIS cluster-based management information system, LandQ*v1* (version 1) [3]. In that first version, the ArcGIS platform was used as the basic development framework and Oracle 11g was adopted as the spatial database. However, with the continuous operation of LandQ*v1*, growing challenges and overhead made it very hard to use such a rich spatial vector dataset at the national scale. On the one hand, data preprocessing takes a large amount of time and system performance is far from ideal. On the other hand, the large number of tables not only causes data organization difficulties, but also brings about client service scheduling problems. In addition, there are serious deficiencies in data querying and visualization for ALQ big data.

The rise of big data has brought about a remarkable change in the traditional GIS industry, especially through cloud computing technology [4,5], which provides a potential solution to these computational challenges. Several platforms have been developed for processing and analyzing spatial big data, such as Hadoop-GIS [6], SpatialHadoop [7], GeoMesa [8], GeoSpark [9], ST-Hadoop [10], and others. Meanwhile, related application systems have sprung up in different fields around the world, covering satellite and remote sensing data [11,12], sensor data [13], social media [14], and others [15,16]. These systems differ in direction and emphasis according to their application areas. In this paper, we consider the national land and resource information field, especially arable land quality (ALQ) data, and adopt cloud computing technology to solve the practical problems of ALQ big data storage and management.

LandQ*v2* (version 2) is a MapReduce-based parallel processing system that lays out the necessary infrastructure to store, index, query, and visualize the nation's arable land quality (ALQ) big data. To solve the problem of data storage in the cloud environment, we designed the GeoCSV model to support data processing; we then used a spatial coding-based approach to partition the whole dataset and build a distributed R-tree index in HDFS. In addition, the spatial query and visualization operations are developed and implemented as MapReduce programs. All these functions, supported by LandQ*v2*, are integrated into SpatialHadoop [7], a very successful framework for spatial big data, and they can also efficiently support other distributed spatial big data systems.

#### **2. LandQ***v2* **System Overview**

Figure 1 shows the LandQ*v2* system overview. The system was developed in a cloud computing environment and mainly includes four layers, namely, the spatial data infrastructure layer, the cloud computing platform layer, the key technologies layer, and the system application layer. The spatial data infrastructure layer is composed of ordinary servers that form a resilient and scalable cluster, mainly providing data storage and computing resources for the upper layers. For the cloud computing platform layer, this paper selects the mainstream Hadoop ecosystem; the current system architecture mainly involves the processing system (MapReduce), the programming language (Pig), and the storage system (HDFS). The core of the system is the key technologies layer, which includes the vector data cloud storage model, spatial data partitioning, the distributed R-tree index, the tile pyramid model-based visualization, and others, as shown in Figure 1. The system application layer builds on the layers below to handle and leverage arable land quality (ALQ) big data.

In the system development process, in order to improve the compatibility and continuity of LandQ*v2*, existing open source standards and basic tools are used. For example, the cloud storage model, GeoCSV, is proposed based on the OGC (Open Geospatial Consortium) standard, the characteristics of distributed storage, and the parallel programming model. For spatial data processing, the ESRI Geometry API (Application Programming Interface) for Java is used to analyze simple geometric objects. In addition, in the data visualization engine, the mainstream ArcGIS Runtime for WPF (Windows Presentation Foundation) for desktop development and the ESRI Web API for web development are used to meet the application requirements of the system client. All of these functions supported by LandQ*v2* are integrated into SpatialHadoop.

**Figure 1.** The LandQ*v2* system overview.

#### **3. Key Technologies in LandQ***v2*

#### *3.1. Data Preprocessing*

Arable land quality (ALQ) data are mainly vector-based spatial data in the ESRI Shapefile format, a mainstream spatial vector data storage format [17] thanks to its simple structure, high precision, and rapid display capabilities. However, in the era of spatial big data, this format faces severe challenges. On the one hand, its storage volume is constrained by a 2 GB upper limit, which results in thousands of small files and increases the difficulty of data management and processing. On the other hand, there are obvious shortcomings in its field types, index efficiency, and network transmission. In addition, visualizing a Shapefile generally requires professional GIS tools or software. To address these shortcomings, GeoCSV is proposed based on the OGC standard; the format is also designed around the ideas of distributed storage and the parallel programming model.

#### 3.1.1. Vector Data Cloud Storage Model

As Figure 2 shows, the data structure of GeoCSV [18] adopts an object-based vector data model to store spatial geometric elements in the cloud environment. GeoCSV uses a simple key-value storage model and the OGC-WKT format [19] to describe the spatial geometry. It takes advantage of CSV (comma-separated values) files to organize the spatial vector data: each record represents exactly one spatial geometry object, and a CSV file contains a number of such records. This fits the cloud computing platform well and is conducive to spatial data segmentation, processing, and analysis.

**Figure 2.** The spatial vector data cloud storage model, GeoCSV [18].

The GeoCSV model has the following advantages: (1) existing spatial vector data are stored object by object, which supports fine-grained parallel computation; (2) the data structure is simple, follows the OGC standard, and transmits well over the network; (3) the record-per-line CSV layout is conducive to distributed data segmentation, processing, and analysis; and (4) the spatial and attribute data formats are unified, making it easy to add a field or adjust the data structure. A minimal sketch of a record in this format is given below.
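To make the model concrete, the following is a minimal sketch of what a GeoCSV record might look like and how it could be split into attributes and a WKT geometry. The field values, delimiter handling, and class name are illustrative assumptions, not the exact schema used by LandQ*v2*.

```java
import java.util.Arrays;
import java.util.List;

// A minimal sketch of a GeoCSV record: one line per spatial object,
// attributes first, geometry last as an OGC-WKT string.
// Field names, values, and delimiter handling are illustrative assumptions.
public class GeoCsvRecordDemo {
    public static void main(String[] args) {
        // id, province, quality grade, geometry (WKT polygon)
        String record = "110101,Beijing,3,\"POLYGON ((116.3 39.9, 116.4 39.9, 116.4 40.0, 116.3 40.0, 116.3 39.9))\"";

        // Split the attribute fields from the quoted WKT geometry.
        int wktStart = record.indexOf('"');
        String wkt = record.substring(wktStart + 1, record.length() - 1);
        List<String> attrs = Arrays.asList(record.substring(0, wktStart - 1).split(","));

        System.out.println("attributes = " + attrs);  // [110101, Beijing, 3]
        System.out.println("geometry   = " + wkt);    // POLYGON ((...))
    }
}
```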

#### 3.1.2. Data Conversion

Data conversion from Shapefile to GeoCSV is executed by one MapReduce job, as shown in Figure 3. In this way, batch conversion can be implemented, turning multiple small Shapefiles into one large GeoCSV file. Since a Shapefile is composed of multiple sub-files (.shp, .shx, .dbf, etc.), each with its own header, the conversion of a single Shapefile is treated as a separate subtask in the data conversion algorithm. Each subtask is primarily a two-part process: Shapefile parsing, followed by spatial object reconstruction in GeoCSV.

**Figure 3.** The MapReduce-based parallel data conversion algorithm.
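As a rough illustration of the map side of this job, the sketch below emits one GeoCSV line per spatial object. For simplicity, it assumes an input format has already parsed each Shapefile record (geometry plus .dbf attributes) into a delimited text line; the actual LandQ*v2* reader parses the .shp/.shx/.dbf sub-files directly, so this is a simplified stand-in rather than the system's implementation.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// A minimal sketch of the map side of the Shapefile-to-GeoCSV conversion.
// Assumption: the input format has already turned one Shapefile record
// into a "attrs|wkt" text line; real parsing of .shp/.dbf is omitted.
public class ShapefileToGeoCsvMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    private final Text out = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split the pre-parsed record into attributes and WKT geometry.
        String[] parts = value.toString().split("\\|", 2);
        String attrs = parts[0];
        String wkt = parts[1];

        // Emit one GeoCSV line per spatial object: attributes first, then
        // the OGC-WKT geometry quoted so embedded commas stay intact.
        out.set(attrs + ",\"" + wkt + "\"");
        context.write(NullWritable.get(), out);
    }
}
```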

#### *3.2. Spatial Index*

In the distributed storage environment, a spatial dataset is separated into several blocks distributed to different nodes in the cluster [20], so spatial index technology designed for a single node cannot be used directly. Compared with a centralized index, a distributed index adds communication protocols and communication overhead among the nodes in the cluster [21]. The index mechanism of a distributed storage system must therefore take full account of the distributed system and the organization of the data storage.

#### 3.2.1. Data Partitioning

In this section, we present the spatial partitioning architecture based on the spatial coding-based approach (SCA) [22]. Figure 4 summarizes the six steps, which are carried out by two MapReduce jobs: (1) computing the spatial location: the central point of each spatial object is used in place of its full geometry; (2) defining the spatial coding matrix (SCM): the spatial coding values are organized as a spatial coding matrix; (3) computing the sensing information set (SIS): the SIS includes the spatial code, location, size, and count; (4) computing the spatial partitioning matrix (SPM): the SPM is built from the SIS and the block size in HDFS; (5) spatial data partitioning: every spatial object is matched to its spatial code and written into the corresponding data block in HDFS; (6) allocating the data blocks: all blocks are distributed to the cluster nodes according to their spatial codes. A sketch of steps (1) and (2) is given after Figure 4.

**Figure 4.** The algorithm of spatial partitioning based on SCA (Spatial Coding-based Approach) [22].
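The following is a minimal sketch of steps (1) and (2) under simplifying assumptions: each object is represented by the center of its bounding box, and a row-major grid code stands in for the Hilbert code actually used by the HCA; the dataset extent values are illustrative.

```java
// A minimal sketch of steps (1)-(2) of the SCA partitioning: represent each
// spatial object by the center of its bounding box and map that point to a
// cell code in a 2^k x 2^k spatial coding matrix (SCM). A row-major code is
// used here for brevity; the paper's Hilbert coding (HCA) would replace it.
public class SpatialCodeDemo {
    // Extent of the whole dataset (illustrative values, roughly China).
    static final double MIN_X = 73.0, MIN_Y = 18.0, MAX_X = 135.0, MAX_Y = 54.0;

    /** Map a point to a cell code in a (1 << k) x (1 << k) grid. */
    static long code(double x, double y, int k) {
        int n = 1 << k;
        int col = (int) Math.min(n - 1, (x - MIN_X) / (MAX_X - MIN_X) * n);
        int row = (int) Math.min(n - 1, (y - MIN_Y) / (MAX_Y - MIN_Y) * n);
        return (long) row * n + col; // row-major; Hilbert order in the paper
    }

    public static void main(String[] args) {
        // Step (1): use the central point of the object's MBR instead of
        // the full geometry.
        double cx = (116.3 + 116.4) / 2, cy = (39.9 + 40.0) / 2;
        // Step (2): the code indexes a cell of the spatial coding matrix.
        System.out.println("SCM code = " + code(cx, cy, 4));
    }
}
```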

#### 3.2.2. Distributed R-Tree Index

The R-tree index is the most influential data access method in GIS databases [23]. As Figure 5 shows, the distributed R-tree index includes a local index and a global index [24]. The local index, also called the data block index, is created for the collection of spatial objects in each data block after the spatial partitioning steps in HDFS. An index header file is generated for each data block; its contents include the envelope rectangle (MBR), the count of spatial objects (Count), the offset to the start of the data (Offset), and other basic information. The global index is created from the local indexes; the purpose of this phase is to build a global index structure covering all partitions in HDFS. The global index file is generated by concatenating all the local index files into one file, and its content includes the data block id (ID), the envelope rectangle (MBR), the total size (Size), and others. A simplified sketch of these structures follows Figure 5.

**Figure 5.** The distributed R-tree index.
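The sketch below summarizes the index metadata described above as plain Java structures. The field names follow the text (MBR, Count, Offset, ID, Size), but the on-disk layout of the SpatialHadoop index files is not reproduced here.

```java
// A minimal sketch of the local and global index metadata described above.
// Field names follow the text; the exact on-disk layout is not reproduced.
public class IndexEntries {
    /** Axis-aligned envelope rectangle. */
    static class MBR {
        double x1, y1, x2, y2;
        MBR(double x1, double y1, double x2, double y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }
        boolean intersects(MBR o) {
            return x1 <= o.x2 && o.x1 <= x2 && y1 <= o.y2 && o.y1 <= y2;
        }
    }

    /** Local index header: one per data block in HDFS. */
    static class LocalIndexHeader {
        MBR mbr;       // envelope of all objects in the block
        long count;    // number of spatial objects (Count)
        long offset;   // byte offset to the start of the data (Offset)
    }

    /** Global index entry: one per partition, built from the local headers. */
    static class GlobalIndexEntry {
        long blockId;  // data block ID
        MBR mbr;       // envelope of the whole block
        long size;     // total size of the block in bytes (Size)
    }
}
```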

#### *3.3. Spatial Range Query*

The spatial query operation is closely related to the spatial data storage model and spatial index, and it can be regarded as their reverse process. In the non-indexed Hadoop cloud environment, a spatial query needs to traverse all the spatial information records to match the corresponding results, and for large-scale spatial data, the query efficiency is extremely low.

As shown in Figure 6, this paper organizes the spatial range query algorithm into three phases (sketched in code after the figure): (a) the global filtering phase. The global index stores the basic information of all the data blocks in HDFS. Because the global index file is relatively small, this phase is executed at the master node; all query operations go through global filtering first, and the global index file usually resides in the memory of the master node. (b) The local filtering phase. This phase filters the candidate sets within the data blocks returned by global filtering. The local index consists mainly of the leaf nodes of the distributed R-tree stored in each data block's header file. (c) The refining phase. Step (b) yields all leaf nodes in the current data block that intersect the query range. The refining phase then matches each spatial geometric object in the candidate set: if the object intersects or is contained in the spatial query range, it is written to the results file; otherwise, it is discarded.

**Figure 6.** The spatial query steps based on the distributed R-tree index.
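A minimal, single-process sketch of the three phases is given below. The in-memory lists stand in for the HDFS-resident global and local index files, and MBRs stand in for object geometries in the refining phase; a real deployment would run phases (b) and (c) as MapReduce tasks on the selected blocks.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal sketch of the three query phases: global filter over block MBRs,
// local filter over leaf-node MBRs, then refinement against the query window.
// The structures are simplified in-memory stand-ins for the HDFS index files.
public class RangeQueryDemo {
    record Rect(double x1, double y1, double x2, double y2) {
        boolean intersects(Rect o) {
            return x1 <= o.x2() && o.x1() <= x2 && y1 <= o.y2() && o.y1() <= y2;
        }
    }
    record Block(long id, Rect mbr, List<Rect> leaves) {}

    static List<Rect> rangeQuery(List<Block> globalIndex, Rect query) {
        List<Rect> results = new ArrayList<>();
        // (a) Global filtering: keep only blocks whose MBR meets the query.
        for (Block b : globalIndex) {
            if (!b.mbr().intersects(query)) continue;
            // (b) Local filtering: candidate leaves inside the block.
            for (Rect leaf : b.leaves()) {
                // (c) Refining: leaves stand in for object geometries here;
                // a real system tests each geometry, not just its MBR.
                if (leaf.intersects(query)) results.add(leaf);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Block b = new Block(1, new Rect(0, 0, 10, 10),
                List.of(new Rect(1, 1, 2, 2), new Rect(8, 8, 9, 9)));
        System.out.println(rangeQuery(List.of(b), new Rect(0, 0, 3, 3)));
    }
}
```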

#### *3.4. Data Visualization*

The tile pyramid model follows the idea of "vertical grading, horizontal blocking": the map data are divided into uniform rectangular tiles, where the number of levels depends on the map scale and the number of tiles is determined by the picture size [25].

The visualization based on the tile pyramid model is developed by employing a three-phase technique [26]: (1) the partitioning phase: here, we use the default Hadoop partitioning to split the input spatial big data into partitions in HDFS; (2) the plotting phase: through an internal loop, all tiles occupied by each spatial object are plotted in the tile pyramid model; (3) the merging phase: partial images having the same index code are combined into one final image. As shown in Figure 7, each tile in the map pyramid model is uniquely identified by its level, row, and column. In the parallel algorithm, the construction of each tile can be regarded as a separate subtask, and the construction of the entire pyramid as a collection of such subtasks.

**Figure 7.** The schematic diagram of the tile pyramid model.

In general, we assume that the enclosing rectangle of the spatial object shape is mbr, with minimum and maximum coordinates (*x*1, *y*1) and (*x*2, *y*2), respectively. The following formulas then give the range of tile rows (r) and columns (c) covered by the spatial object at pyramid level *l*:

$$r_{min}(\text{shape}) = \left\lfloor \frac{mbr_{y1} - MBR_{y1}}{MBR_{length}/2^l} \right\rfloor \tag{1}$$

$$r_{max}(\text{shape}) = \left\lceil \frac{mbr_{y2} - MBR_{y1}}{MBR_{length}/2^l} \right\rceil \tag{2}$$

$$c_{min}(\text{shape}) = \left\lfloor \frac{mbr_{x1} - MBR_{x1}}{MBR_{width}/2^l} \right\rfloor \tag{3}$$

$$c_{max}(\text{shape}) = \left\lceil \frac{mbr_{x2} - MBR_{x1}}{MBR_{width}/2^l} \right\rceil \tag{4}$$

where [r*min*, r*max*] and [c*min*, c*max*] are the row and column number intervals of the tiles covered by the spatial object shape, MBR is the envelope of the entire map, and *l* is the pyramid level.

In particular, if the map tile size is p pixels, the spatial resolutions of a tile at level *l* in the *x* and *y* directions satisfy the following conditions:

$$r_x = \frac{MBR_{width}}{p \times 2^l} \tag{5}$$

$$r_y = \frac{MBR_{length}}{p \times 2^l} \tag{6}$$
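As a worked illustration of Equations (1)-(6), the following sketch computes the tile row and column ranges and the tile resolution for one object envelope; the map extent, level, and tile size values are illustrative assumptions.

```java
// A minimal sketch of Equations (1)-(6): compute the rows and columns of the
// tiles covered by an object's bounding box at pyramid level l, and the tile
// resolution. Floor/ceil follow the min/max convention in the formulas.
public class TileRangeDemo {
    // Envelope of the whole map (illustrative values).
    static final double MBR_X1 = 73.0, MBR_Y1 = 18.0;
    static final double MBR_WIDTH = 62.0, MBR_LENGTH = 36.0;

    public static void main(String[] args) {
        int l = 8;                              // pyramid level
        int p = 256;                            // tile size in pixels
        double tileH = MBR_LENGTH / (1 << l);   // tile height in map units
        double tileW = MBR_WIDTH / (1 << l);    // tile width in map units

        // Object envelope mbr = (x1, y1)-(x2, y2).
        double x1 = 116.3, y1 = 39.9, x2 = 116.4, y2 = 40.0;

        long rMin = (long) Math.floor((y1 - MBR_Y1) / tileH);  // Eq. (1)
        long rMax = (long) Math.ceil((y2 - MBR_Y1) / tileH);   // Eq. (2)
        long cMin = (long) Math.floor((x1 - MBR_X1) / tileW);  // Eq. (3)
        long cMax = (long) Math.ceil((x2 - MBR_X1) / tileW);   // Eq. (4)

        double rx = MBR_WIDTH / (p * (double) (1 << l));       // Eq. (5)
        double ry = MBR_LENGTH / (p * (double) (1 << l));      // Eq. (6)

        System.out.printf("rows [%d, %d], cols [%d, %d]%n", rMin, rMax, cMin, cMax);
        System.out.printf("resolution rx=%.6f, ry=%.6f map units/pixel%n", rx, ry);
    }
}
```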

#### **4. System Test and Results**

In this section, the key technologies mentioned above are tested with the national arable land quality (ALQ) big data. The tests cover three aspects: data conversion, the spatial query operation, and visualization performance. The test results are given below.

#### *4.1. Data Conversion*

The parallel conversion experiment for the vector data covers two aspects. On the one hand, the parallel algorithm proposed in this paper is compared with the data transformation tool in the ArcGIS software; on the other hand, the parallel algorithm is tested in the cluster environment. The experimental environment therefore includes both a stand-alone machine and a cluster; Table 1 lists the configuration of the experimental environment. Table 2 shows the test data, with ESRI Shapefile sizes varying from 2 GB to 128 GB.


**Table 1.** The test environment for the data conversion algorithm.


**Table 2.** The datasets for the parallel conversion test.

The experimental results are shown in Figures 8 and 9. The conversion time of both algorithms increases with the size of the ESRI Shapefile data. At the same time, it is clear that the MapReduce-based parallel conversion algorithm proposed in this paper is more time-efficient. Moreover, as the number of ESRI Shapefiles increases, the efficiency gain becomes clearer, reaching approximately 4 to 6 times that of ArcToolbox.

**Figure 8.** The time consumption comparison of data conversion in a stand-alone environment.

**Figure 9.** The time consumption comparison of data conversion in a cluster environment.

Figure 9 shows the time consumption comparison of the data conversion in the cluster environment. Here, we only test a data volume of 64 GB. The test results show that the execution time of the algorithm becomes shorter as the number of cluster nodes increases, which fully embodies the advantages of the cloud computing environment.

#### *4.2. Data Query*

#### 4.2.1. Index Quality of the R-Tree

The R-tree spatial index quality can be measured by the envelope rectangles of its index tree hierarchies, using the formulas below. Area(*T*) in Equation (7) represents the total area covered by the MBRs of an index layer of the R-tree; generally, the smaller the coverage area, the better the R-tree index performance. Overlap(*T*) in Equation (8) represents the overlap area between the MBRs within an index layer; the larger the overlap area, the more search paths a query may have to follow, which increases query time and greatly reduces retrieval efficiency. Thus, the smaller the overlap area, the better the R-tree performance. The R-tree spatial index quality analysis formulas are as follows:

$$\text{Area}(T) = \sum_{i=1}^{n} \text{Area}(T_i.\text{MBR}) \tag{7}$$

$$\text{Overlap}(T) = \sum_{i=1}^{n} \sum_{j=i+1}^{n} \text{Area}(T_i.\text{MBR} \cap T_j.\text{MBR}) \tag{8}$$
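The sketch below computes both measures for a toy set of R-tree node MBRs, following Equations (7) and (8); the rectangles are illustrative.

```java
// A minimal sketch of Equations (7) and (8): total MBR area and pairwise
// MBR overlap area for one layer of R-tree nodes. Smaller values of both
// indicate a better index, as discussed above.
public class RTreeQualityDemo {
    record Rect(double x1, double y1, double x2, double y2) {
        double area() { return Math.max(0, x2 - x1) * Math.max(0, y2 - y1); }
        double overlap(Rect o) {
            double w = Math.min(x2, o.x2()) - Math.max(x1, o.x1());
            double h = Math.min(y2, o.y2()) - Math.max(y1, o.y1());
            return (w > 0 && h > 0) ? w * h : 0;
        }
    }

    public static void main(String[] args) {
        Rect[] nodes = {
            new Rect(0, 0, 4, 4), new Rect(3, 3, 7, 7), new Rect(8, 0, 10, 2)
        };
        double area = 0, overlap = 0;
        for (int i = 0; i < nodes.length; i++) {
            area += nodes[i].area();                       // Eq. (7)
            for (int j = i + 1; j < nodes.length; j++) {
                overlap += nodes[i].overlap(nodes[j]);     // Eq. (8)
            }
        }
        System.out.println("Area(T) = " + area + ", Overlap(T) = " + overlap);
    }
}
```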

As shown in Figure 10, the spatial coding-based data partition (Hilbert coding-based approach, HCA) shows better index quality than random sampling for building an R-tree, in terms of both Area(*T*) and Overlap(*T*). There are two main reasons. (1) Because of the randomness of the sample set, the sampling-based method loses spatial features, whereas the spatial coding-based partitioning method exploits the adjacency of spatial codes, which largely preserves the spatial distribution of the vector data. (2) The sampling-based partitioning method considers only the spatial location of the samples, whereas the coding-based method considers more influencing factors, including the spatial location, the number of elements, and the number of data bytes. In addition, because of the randomness of the samples, the sampling-based method can produce different partitioning strategies for the same dataset at the same sampling rate, so the algorithm lacks stability.

**Figure 10.** The index quality comparison of the R-tree based on random sampling and HCA.

#### 4.2.2. Spatial Query Performance

Based on the distributed R-tree, we test the query performance of LandQ*v2* here, measuring the query time consumption for different numbers of jobs and nodes.

In experiment 1, the results in Figure 11 compare the query performance for different numbers of jobs under two modes, namely, non-index and R-tree index. The spatial query execution time increases with the number of parallel query jobs. For the non-index mode, the growth trend is linear, mainly because every query traverses all the data blocks. For the R-tree index mode, since the search touches only the relevant data blocks, there is no obvious linear growth trend, and the time depends largely on the query itself. In both modes, the more query jobs there are, the longer the total time.

**Figure 11.** The query performance comparison between the different job numbers.

Experiment 2 tests the vector data query efficiency with the spatial index on different numbers of cluster nodes. The experimental results are shown in Figure 12. The efficiency of the data query increases with the number of cluster nodes, showing that the parallel query algorithm in this paper performs and scales well.

**Figure 12.** The query performance comparison between the different node numbers.

Through the above comparison tests, the following conclusions can be drawn: (1) the efficiency of the spatial query based on the spatial index is obviously improved; (2) the R-tree index based on spatial coding-based data partitioning also shows a clear advantage in the spatial range query; (3) as the number of nodes increases, the efficiency of the parallel spatial query improves significantly, so the cluster can be expanded to improve data retrieval efficiency.

#### *4.3. Visualization Performance*

#### 4.3.1. Tile Pyramid Construction Performance

In this section, we run two experiments. The first compares the parallel construction algorithm proposed in this paper with the mainstream ArcGIS Server; the second tests the algorithm's performance in the cluster. The experimental data are listed in Table 3.

**Table 3.** The datasets for visualization.


Experiment 1 was conducted in the same hardware environment with consistent tile pyramid model parameters. The test results are shown in Figure 13: under the same test environment, the algorithm proposed in this paper performs better. In experiment 2, the efficiency of parallel construction for different datasets is compared across different numbers of cluster nodes, with the tile pyramid level set to 8. The results are shown in Figure 14: for the same task, the greater the number of cluster nodes, the clearer the advantage of parallel tile pyramid construction.

**Figure 13.** The time consumption comparison of the tile pyramid construction in a single-machine environment.

**Figure 14.** The time consumption comparison of the tile pyramid construction in a cluster environment.

#### 4.3.2. Visualization for Arable Land Quality Big Data

Based on the parallel construction algorithm of the tile pyramid model, the provincial- and county-level arable land quality (ALQ) big data are visualized. Following the OGC map standards, we construct the map tiles and integrate them well with the ESRI API for JavaScript.

For the 2013 national- and provincial-level arable land quality data, the map elements were built into tile pyramids in parallel. In this paper, we show the national economic classification, which reflects the economic value of the arable land. The resulting tile details are listed in Table 4. The tile pyramid model has 8 levels, with map scales ranging from 1:20 million to 1:31.25 million. The parallel build time was 1 h 15 min 7 s for the county dataset and 11 min 52 s for the provincial dataset.

**Table 4.** The tile details of the parallel tile pyramid construction.


Figures 15 and 16 show the global and local browsing diagrams of the national arable land quality data, respectively. In line with the WMS (Web Map Service) standard, the resulting map tiles are stored in folders by scale and are accessed and loaded by the web client. At the same time, they integrate well with other layers in the client, such as the national administrative boundary, railway, and water layers.

**Figure 15.** The map tiles of the national arable land quality dataset.

**Figure 16.** The schematic diagram of visualization for the national arable land quality dataset.

#### **5. Discussion**

The increasing amount of arable land quality (ALQ) data has brought real challenges for processing and managing vector-based spatial big data, especially in terms of data processing efficiency, and traditional GIS solutions have been unable to meet the needs of this national application. Here, we presented LandQ*v2*, a MapReduce-based parallel processing system for national arable land quality (ALQ) big data. LandQ*v2* is a complete application system and is integrated into SpatialHadoop; the relevant technologies can also be applied to other areas.

In LandQ*v2*, four key technologies are designed and implemented. The system is based on the cloud storage model for spatial vector data, GeoCSV, which addresses many shortcomings of the ESRI Shapefile in data storage and processing. The MapReduce-based batch conversion algorithm improves the efficiency of converting data from the ESRI Shapefile to GeoCSV. To address the problem of data query efficiency, a spatial coding-based approach for partitioning spatial big data is presented, and a distributed R-tree index is then built for the spatial range query; this method improves the query performance on spatial big data as well as the data balance in HDFS [22]. For the visualization of remote sensing data, the tile pyramid model is a mature solution [27,28]; in this paper, we use this model to solve the problem of visualizing vector-based spatial big data. The tile pyramid model-building algorithm is proposed and parallelized with the MapReduce program.

We tested the above key technologies and system functions. Compared with existing GIS tools, all of the parallel algorithms, including data conversion, data partitioning, the distributed R-tree index, data query, and visualization, show good performance, and at the same time they take full advantage of the cluster's scalability. Compared with LandQ*v1*, the visualization of the ALQ dataset at the national level makes the macroscopic grasp of the data more comprehensive and scientific, which will be useful to policymakers.

#### **6. Conclusions**

Recent advancements in cloud computing technology have offered a potential solution to big data challenges [29]. This paper presents LandQ*v2*, a MapReduce-based parallel processing system for arable land quality (ALQ) big data built on the Hadoop cloud computing platform. The system architecture, key technologies, and tests presented in this study demonstrate the following remarkable advantages of the developed system: (1) the spatial vector big data cloud storage model, GeoCSV, is designed based on the characteristics of spatial vector data and the advantages of the Hadoop cloud platform; (2) for the distributed R-tree index in LandQ*v2*, a data partitioning strategy based on spatial coding is designed, and the parallel construction of the distributed R-tree index is realized; through experiments, the efficiency and feasibility of different distributed spatial indexing algorithms are verified from two perspectives, spatial index quality and data balance; (3) parallel processing methods for ALQ big data, including the data conversion algorithm, the spatial range query, and tile pyramid model construction, are implemented with the MapReduce program. Using ALQ big data, the experiments verify the efficiency and feasibility of the proposed vector data processing algorithms.

**Author Contributions:** X.Y. developed the LandQ*v2* and wrote this paper. M.F.M., L.A. and A.E. are the contributors of the SpatialHadoop and provided key technical guidance for the system development. S.Y., Z.Z., and L.Z. assisted in the system development. G.L. and D.Z. are the supporters of this project and supervised this paper.

**Funding:** This work was sponsored by National Key Research and Development Program of China from MOST (2016YFB0501503).

**Acknowledgments:** We would like to thank the anonymous reviewers and editors for commenting on this paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
