State-of-the-Art Geospatial Information Processing in NoSQL Databases
Abstract
1. Introduction
2. Geospatial Data Characteristics and Related Concepts
- Raster data consist of a matrix of cells (grains or pixels) arranged in rows and columns (a grid); each cell holds a value representing information such as brightness or temperature.
- Vector data consist of individual points stored as (x, y) pairs in 2D or (x, y, z) triples in 3D. The points are connected according to defined orders or rules to form lines, polygons, surfaces, and solids. Most of the discussion in this paper refers to vector data.
- According to ISO 19109:2015, a feature is defined as the “abstraction of real-world phenomena”. Features may have attributes, e.g., spatial attributes giving the location/extent of the feature, thematic attributes giving descriptive characteristics of the feature, and also other kinds of attributes, such as metadata/quality.
- A geometry is any geometric shape that can represent a feature’s spatial attribute, such as a point (0D), line (1D), polygon/surface (2D), or solid/volume (3D). Geometries can be embedded in 1D, 2D, or 3D space, and the dimension of a geometry must be smaller than or equal to the dimension of the space in which it is embedded. For simple cases such as visualization on traditional 2D maps, points, lines, and polygons may be sufficient. More complex cases that require 3D space also need surfaces and volumes/solids.
- A coordinate reference system (CRS), or a spatial reference system (SRS), is a coordinate-based system for locating geographical entities and establishing their relationships. Popular coordinate reference systems include the geocentric coordinate system, geographic coordinate system (WGS84 datum), Universal Transverse Mercator (UTM), and Cartesian coordinate system.
- For coordinate reference systems, well-known text (WKT) is a text markup language, defined by the Open Geospatial Consortium (OGC), that represents coordinate reference systems and the conversions between them.
- The EPSG Geodetic Parameter Dataset (also called the EPSG registry), which was developed by the European Petroleum Survey Group (EPSG) in 1985, is a public collection of the definitions of coordinate reference systems and coordinate transformations. The EPSG code is widely used in geographic information systems and GIS libraries.
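The raster/vector distinction and the dimension rule listed above can be sketched in a few lines of Python. This is only an illustration: the names `GEOMETRY_DIMENSION` and `can_embed` are hypothetical and do not come from any GIS library, and the sample values are made up.

```python
# Raster: a grid of cells, each holding one value (here, made-up temperatures).
raster = [
    [21.5, 21.7, 22.0],
    [21.4, 21.9, 22.3],
]  # 2 rows x 3 columns

# Vector: points stored as (x, y) pairs; their order turns them into shapes.
line = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 0.0)]  # closed ring

# Dimension of each geometry type named in the text (hypothetical lookup).
GEOMETRY_DIMENSION = {"point": 0, "line": 1, "polygon": 2, "solid": 3}


def can_embed(geometry_type: str, space_dimension: int) -> bool:
    """A geometry fits only in a space of equal or higher dimension."""
    return GEOMETRY_DIMENSION[geometry_type] <= space_dimension
```

Under this rule a polygon fits in 2D or 3D space, while a solid requires 3D space.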
3. The Research Methodology
4. State-of-the-Art Geospatial Processing in NoSQL Databases
4.1. MongoDB
- An array: <field>: [ <x>, <y> ]
- An embedded document: <field>: {<field1>: <x>, <field2>: <y> }.
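As a rough illustration of how the two legacy forms relate to GeoJSON, the following Python sketch converts either form into a GeoJSON Point. The helper name `legacy_to_geojson` is hypothetical; the sketch assumes that x is the longitude and y the latitude, and that an embedded document is read in field order regardless of the field names.

```python
from typing import Mapping, Sequence, Union

LegacyPair = Union[Sequence[float], Mapping[str, float]]


def legacy_to_geojson(pair: LegacyPair) -> dict:
    """Convert a legacy coordinate pair into a GeoJSON Point (sketch).

    Arrays are read as [x, y]; embedded documents are read in field order,
    first field = x and second field = y, whatever the fields are called.
    """
    values = list(pair.values()) if isinstance(pair, Mapping) else list(pair)
    if len(values) != 2:
        raise ValueError("a legacy coordinate pair needs exactly two values")
    x, y = values
    return {"type": "Point", "coordinates": [x, y]}
```

For example, `legacy_to_geojson([-73.97, 40.77])` and `legacy_to_geojson({"lng": -73.97, "lat": 40.77})` produce the same Point.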
4.2. Couchbase
4.3. Neo4j
- WITH point({latitude: toFloat('13.43'), longitude: toFloat('56.21')}) AS p1,
- point({latitude: toFloat('13.10'), longitude: toFloat('56.41')}) AS p2
- RETURN toInteger(distance(p1, p2)/1000) AS km
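To see roughly what the query above computes, the distance between the two WGS 84 points can be approximated in Python with the haversine formula. This is a sketch under stated assumptions: a spherical Earth with the mean radius `EARTH_RADIUS_M` (a value chosen here, not taken from Neo4j), so the database may return a slightly different number.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in meters


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Same points as p1 and p2 in the Cypher query, truncated to whole kilometers.
km = int(haversine_m(13.43, 56.21, 13.10, 56.41) / 1000)
```

With these inputs the sketch gives a distance of about 42 km.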
4.4. Apache Cassandra
- CREATE CUSTOM INDEX (IF NOT EXISTS)? <index_name>
- ON <table_name> ()
- USING 'com.stratio.cassandra.lucene.Index'
- WITH OPTIONS = <options>
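The template above can be filled in programmatically. The Python sketch below renders one concrete statement; the helper `lucene_index_cql`, the table and index names, and the column names `lat` and `lon` are hypothetical. The option keys `refresh_seconds` and `schema` and the `geo_point` mapper are assumptions recalled from the plug-in's documentation and should be checked against the installed version. The sketch does not escape single quotes inside option values.

```python
import json


def lucene_index_cql(index_name: str, table_name: str, options: dict) -> str:
    """Fill in the CREATE CUSTOM INDEX template with concrete values."""
    # A CQL map literal: keys and values are single-quoted strings.
    rendered = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return (
        f"CREATE CUSTOM INDEX IF NOT EXISTS {index_name}\n"
        f"ON {table_name} ()\n"
        "USING 'com.stratio.cassandra.lucene.Index'\n"
        f"WITH OPTIONS = {{{rendered}}};"
    )


# Assumed schema: one geo_point field built from two numeric columns.
schema = {"fields": {"place": {"type": "geo_point", "latitude": "lat", "longitude": "lon"}}}
statement = lucene_index_cql(
    "places_idx",
    "places",
    {"refresh_seconds": "1", "schema": json.dumps(schema)},
)
```

The resulting `statement` string could then be passed to a CQL client or pasted into `cqlsh`.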
4.5. Apache HBase
4.6. Redis
- GEOADD building 15.45244 -76.78506 my-house
- ZREM building my-house
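Redis keeps geospatial members in a sorted set, which is why a member is removed with the sorted-set command ZREM, and it derives each member's score from a geohash of the coordinates. The Python sketch below implements the standard base-32 geohash encoding to show the idea; it is not Redis's internal routine, which uses a 52-bit integer variant, and the function name `geohash_encode` is illustrative.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(latitude: float, longitude: float, precision: int = 9) -> str:
    """Encode a point as a standard base-32 geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    use_longitude = True  # bits alternate, starting with longitude
    while len(bits) < precision * 5:
        rng, value = (lon_range, longitude) if use_longitude else (lat_range, latitude)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append(1)
            rng[0] = mid  # keep the upper half
        else:
            bits.append(0)
            rng[1] = mid  # keep the lower half
        use_longitude = not use_longitude
    chars = []
    for i in range(0, len(bits), 5):
        index = 0
        for bit in bits[i:i + 5]:
            index = (index << 1) | bit
        chars.append(BASE32[index])
    return "".join(chars)
```

Nearby points tend to share a prefix, so a prefix or range scan over the encoded values can serve as a coarse proximity search.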
4.7. Amazon DynamoDB
4.8. Elasticsearch
- POST /example/_doc
- {
- "location" : {
- "type" : "point",
- "coordinates" : [-77.03653, 38.897676]
- }
- }
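Elasticsearch accepts the same point either as GeoJSON, as in the request above, or as WKT. The following Python sketch converts the GeoJSON form into its WKT equivalent; the helper name `geojson_point_to_wkt` is hypothetical, and only Point geometries are handled. Both formats list the longitude before the latitude.

```python
def geojson_point_to_wkt(geometry: dict) -> str:
    """Render a GeoJSON Point as WKT, keeping the longitude-first order."""
    if geometry.get("type", "").lower() != "point":
        raise ValueError("only Point geometries are handled in this sketch")
    lon, lat = geometry["coordinates"][:2]
    return f"POINT ({lon} {lat})"


# The document body from the request above, as a Python dict.
doc = {"location": {"type": "point", "coordinates": [-77.03653, 38.897676]}}

# Equivalent document body using the WKT representation.
wkt_doc = {"location": geojson_point_to_wkt(doc["location"])}
```

Here `wkt_doc["location"]` holds the string `POINT (-77.03653 38.897676)`.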
4.9. Splunk
- [<lookup_name>]
- external_type = geo
- filename = <name_of_KMZ_file>
- feature_id_element = <XPath_expression>
4.10. Solr
- geojson: [geo f=mySpatialField w=GeoJSON].
- LatLonPointSpatialField: this is most commonly used for latitude–longitude point data;
- RptWithGeometrySpatialField: for indexing and searching for non-point data (it can do points, as well, but it cannot do sorting/boosting);
- BBoxField: for indexing bounding boxes, queries using a box, or specifying a search predicate (Intersects, Within, Contains, Disjoint, Equals);
- LatLonType (deprecated): it still exists, but has been replaced by LatLonPointSpatialField.
- &q=*:*&fq={!bbox sfield=store}&pt=45.15,-93.85&d=5
- &q=*:*&fq={!geofilt sfield=store}&pt=45.15,-93.85&d=5
- d is the radial distance, usually in kilometers;
- pt is the center point using the format “latitude, longitude”;
- sfield is a spatial index field;
- the geofilt filter retrieves results based on the geospatial distance (circle distance) using a given point as the center of a circle and d as the radius;
- the bbox (bounding box) filter uses the bounding box of the geofilt circle.
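The difference between the two filters can be emulated outside Solr. The Python sketch below builds the request parameters and then tests points against a circle (geofilt) and against the box enclosing that circle (bbox). It assumes a spherical Earth with radius `EARTH_RADIUS_KM` and a small-distance approximation for the box that ignores the poles and the antimeridian; none of the function names are Solr APIs.

```python
import math
from urllib.parse import urlencode

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def spatial_params(filter_name: str, sfield: str, lat: float, lon: float, d_km: float) -> str:
    """Build a URL-encoded parameter string for a geofilt or bbox filter."""
    return urlencode({
        "q": "*:*",
        "fq": f"{{!{filter_name} sfield={sfield}}}",
        "pt": f"{lat},{lon}",
        "d": d_km,
    })


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((phi2 - phi1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_geofilt(point: tuple, center: tuple, d_km: float) -> bool:
    """Circle test: the point lies within d_km of the center."""
    return haversine_km(center[0], center[1], point[0], point[1]) <= d_km


def in_bbox(point: tuple, center: tuple, d_km: float) -> bool:
    """Box test: the point lies inside the box that encloses the circle."""
    dlat = math.degrees(d_km / EARTH_RADIUS_KM)
    dlon = dlat / math.cos(math.radians(center[0]))  # approximation
    return abs(point[0] - center[0]) <= dlat and abs(point[1] - center[1]) <= dlon


center = (45.15, -93.85)
corner = (45.19, -93.79)  # near a corner of the box, outside the circle
```

With d = 5, the point `corner` passes `in_bbox` but fails `in_geofilt`, which illustrates why bbox can return more results than geofilt for the same parameters.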
5. Comparisons of Geospatial Data Processing in NoSQL Databases
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
Simple Features | Egenhofer | RCC8 |
---|---|---|
equals | equal | EQ |
disjoint | disjoint | DC |
intersects | ¬disjoint | ¬DC |
touches | meet | EC |
within | inside + coveredBy | NTPP + TPP |
contains | contains + covers | NTPPi + TPPi |
overlaps | overlap | PO |
Database Model | NoSQL Databases |
---|---|
Document | MongoDB (21), Couchbase (1) |
Graph | Neo4j (6) |
Wide column | Cassandra (5), HBase (24) |
Key-value | Redis (2) |
Multi-model | Amazon DynamoDB (0) |
Search engine | Elasticsearch (3), Splunk (0), Solr (3) |
Database Name | Primary Database Model | Secondary Database Models | Website | Developer | Server Operating Systems | APIs and Other |
---|---|---|---|---|---|---|
MongoDB | Document | | www.mongodb.com | MongoDB, Inc | OS X, Windows | Proprietary protocol using JSON |
Couchbase | Document | Key-value | http://www.couchbase.com | Couchbase, Inc. | OS X, Windows | Native language bindings for CRUD, query, search and analytics APIs |
Neo4j | Graph | | neo4j.com | Neo4j, Inc. | Linux, Unix, Windows | Bolt protocol, Cypher query language, Java API, Neo4j-OGM, RESTful HTTP API, Spring Data Neo4j |
Cassandra | Wide column | | cassandra.apache.org | Apache Software Foundation | BSD, Linux, | Proprietary protocol |
HBase | Wide column | | hbase.apache.org | Apache Software Foundation | Linux, | Java API, RESTful HTTP API, Thrift |
Redis | Key-value | Document, graph DBMS, search engine | redis.io | Salvatore Sanfilippo | Linux, OS X, Solaris, Windows | Proprietary protocol |
Amazon DynamoDB | Document, Key-value | | aws.amazon.com/dynamodb/ | Amazon | Hosted | RESTful HTTP API |
Elasticsearch | Search engine | Document | https://www.elastic.co/elasticsearch/ | Elastic | Linux, OS X, Solaris, Windows | Java API, RESTful HTTP/JSON API |
Splunk | Search engine | | www.splunk.com | Splunk Inc. | BSD, Linux, OS X, Windows | HTTP REST |
Solr | Search engine | | https://lucene.apache.org/solr/ | Apache Software Foundation | All OS with a Java VM | Java API, RESTful HTTP/JSON API |
Database | Supported Geometry Objects | Main Supported Geometry Functions | Supported Spatial Indexes | Declarative Query Language | Data Format |
---|---|---|---|---|---|
MongoDB | Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection | $geoIntersects, $geoWithin, $near, $nearSphere | 2dsphere index, 2d index based on geohash | not supported | GeoJSON objects, Legacy Coordinate Pairs |
Couchbase | Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection | BBox | R-Tree, designed by users | N1QL | GeoJSON |
Amazon DynamoDB | Point | GeoPoint, putPoint, deletePoint, updatePoint, queryRectangle, queryRadius | geohash | not supported | GeoJSON |
Cassandra | Point, LineString, Polygon | intersects, contains, is_within (provided by a plug-in) | Lucene index (provided by a plug-in) | Cassandra Query Language (CQL) | WKT |
HBase | No | No | No | not supported | No |
Neo4j | Only Point in Neo4j; Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection in Neo4j Spatial Library | distance(), point()-WGS 84 2D, point()-WGS 84 3D, point()-Cartesian 2D, point()-Cartesian 3D | B+Tree, R-Tree (default) | Cypher | Graph Model, RDF |
Elasticsearch | Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection | geo_shape, geo_bounding_box, geo_distance, and geo_polygon | GeohashPrefixTree and QuadPrefixTree | Domain Specific Language (DSL) | GeoJSON and WKT |
Redis | Point | geoadd, geodist, geohash, geopos, georadius and georadiusbymember | geohash | not supported | GeoJSON |
Solr | points, circles, envelopes, line strings, polygons, and “multi” variants of these | convexHull, enclosingDisk, | Prefix tree, geohash, and quadtree | “Lucene” query parser (default) | WKT or GeoJSON |
Splunk | Point, LineString, Polygon, MultiPoint, MultiLineString, MultiPolygon, GeometryCollection | geom, inputlookup | Summary indexing | Search Processing Language (SPL) | KMZ or KML |
Data Models | Main Characteristics in Terms of Geospatial Processing | Main Applications | Academic Attention |
---|---|---|---|
Graph database | 1. Fast graph traversal 2. Distance calculations and location query applications 3. Limited geospatial indexes, queries, and functions | Spatio-temporal data (moving object) | Low |
Document databases | 1. Good and stable performance in geospatial data queries and retrievals 2. Wide application scenarios 3. Abundant and effective geospatial management, indexes, queries, and functions | Widely (distributed or web/cloud platform) | High |
Wide column databases | 1. Fast mass data insertion and data retrieval 2. Require users to design or construct indexes and query functions 3. Limited geospatial queries and functions, but provide basic classes’ extension | Widely (distributed or web/cloud platform) | High |
Key-value database | 1. Fast data loading and workloads execution 2. In-memory storage and specific application scenario 3. Limited geospatial queries and functions | Tracking applications | Low |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Guo, D.; Onstein, E. State-of-the-Art Geospatial Information Processing in NoSQL Databases. ISPRS Int. J. Geo-Inf. 2020, 9, 331. https://doi.org/10.3390/ijgi9050331