River Basin Cyberinfrastructure in the Big Data Era: An Integrated Observational Data Control System in the Heihe River Basin
Abstract
1. Introduction
2. Description of the Proposed System
2.1. System Overview
2.2. Automated Data Reception and Storage
2.3. Automated Data Quality Control
2.4. Distributed Storage System
2.5. Model Integration
2.6. Visualizations
3. Application in the Heihe River Basin
3.1. Case Study Area and the Overall Implementation
3.2. Data Management and Service
3.3. Real-Time Online Data Browsing and Analysis
3.4. Data Downloading on Demand
3.5. Intelligent Analysis of the Status of the Observational Network
3.6. Anomaly Detection
3.7. Online Computing with Integrated Models
4. Summary and Outlook
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Toffler, A. The Third Wave, 1st ed.; Bantam Books, Inc.: New York, NY, USA, 1980; pp. 199–236.
- Bertot, J.C.; Choi, H. Big data and e-government: Issues, policies, and recommendations. In Proceedings of the 14th Annual International Conference on Digital Government Research, Quebec, QC, Canada, 17–20 June 2013; pp. 1–10.
- Simović, A. A Big Data smart library recommender system for an educational institution. Libr. Hi Tech 2018, 36, 498–523.
- Guo, H.; Wang, L.; Liang, D. Big Earth Data from space: A new engine for Earth science. Sci. Bull. 2016, 61, 505–513.
- Baumann, P.; Mazzetti, P.; Ungar, J.; Barbera, R.; Barboni, D.; Beccati, A.; Bigagli, L.; Boldrini, E.; Bruno, R.; Calanducci, A.; et al. Big data analytics for earth sciences: The EarthServer approach. Int. J. Digit. Earth 2016, 9, 3–29.
- Li, D.R. Towards Geo-spatial Information Science in Big Data Era. Acta Geod. Cartogr. Sin. 2016, 45, 379–384. (In Chinese)
- Jones, M.O.; Jones, L.A.; Kimball, J.S.; McDonald, K.C. Satellite passive microwave remote sensing for monitoring global land surface phenology. Remote Sens. Environ. 2011, 115, 1102–1114.
- Li, X. Characterization, controlling and reduction of uncertainties in the modeling and observation of land-surface systems. Sci. China Earth Sci. 2014, 57, 80–87.
- El-Zeiny, A.M.; Effat, H.A. Environmental monitoring of spatiotemporal change in land use/land cover and its impact on land surface temperature in El-Fayoum governorate, Egypt. Remote Sens. Appl. Soc. Environ. 2017, 8, 266–277.
- Cheng, G.S.; Li, X.; Zhao, W.; Xu, Z.; Feng, Q.; Xiao, S.; Xiao, H. Integrated study of the water-ecosystem-economy in the Heihe River Basin. Natl. Sci. Rev. 2014, 1, 413–428.
- Li, X.; Zhao, N.; Jin, R.; Liu, S.; Sun, X.; Wen, X.; Wu, D.; Zhou, Y.; Guo, J.; Chen, S.; et al. Internet of Things to network smart devices for ecosystem monitoring. Sci. Bull. 2019, 64, 1234–1245.
- Zhang, M.H.; Li, X. Drone-enabled Internet-of-Things relay for environmental monitoring in remote areas without public networks. IEEE Internet Things J. 2020, 7, 7648–7662.
- Hart, J.K.; Martinez, K. Toward an environmental Internet of Things. Earth Space Sci. 2015, 2, 194–200.
- Lvovich, I.Y.; Lvovich, Y.E.; Preobrazhenskiy, A.P.; Preobrazhenskiy, Y.P.; Choporov, O. Modeling of information processing in the internet of things at agricultural enterprises. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2019; Volume 315, p. 032029.
- Guo, J.W.; Shang, Q.S.; Chang, H.L.; Liu, F.; Li, J.; Wu, A. Design of Field Observation Data Automatic Assembling System. Remote Sens. Technol. Appl. 2013, 28, 399–404. (In Chinese)
- Wu, A.D.; Guo, J.W.; Wang, L.X. Improvement and Application of automatic data in Heihe river basin downloading system. Remote Sens. Technol. Appl. 2015, 30, 1027–1032. (In Chinese)
- Wang, H.W.; Zhang, W.G.; Yu, X.W.; Zhang, X.; Deng, G.; Liu, Y.; Wang, J.; Li, F. Design and Operation of Network Management Platform for Forest Ecological Positioning Observation System. World For. Res. 2018, 31, 28–33. (In Chinese)
- Khayyat, Z.; Ilyas, I.F.; Jindal, A.; Madden, S.; Ouzzani, M.; Papotti, P.; Quiané-Ruiz, J.-A.; Tang, N.; Yin, S. Bigdansing: A system for big data cleansing. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Australia, 31 May 2015–4 June 2015; Association for Computing Machinery: New York, NY, USA, 2015.
- Štefanič, P.; Cigale, M.; Jones, A.C.; Knight, L.; Taylor, I.; Istrate, C.; Suciu, G.; Ulisses, A.; Stankovski, V.; Taherizadeh, S.; et al. SWITCH workbench: A novel approach for the development and deployment of time-critical microservice-based cloud-native applications. Future Gener. Comput. Syst. Int. J. Escience 2019, 99, 197–212.
- Koulouzis, S.; Martin, P.; Zhou, H.; Hu, Y.; Wang, J.; Carval, T.; Grenier, B.; Heikkinen, J.; De Laat, C.; Zhao, Z. Time-critical data management in clouds: Challenges and a Dynamic Real-Time Infrastructure Planner (DRIP) solution. Concurr. Comput. Pract. Exp. 2020, 32.
- Liu, X.; Song, H.; Liu, A. Intelligent UAVs Trajectory Optimization from Space-Time for Data Collection in Social Networks. IEEE Trans. Netw. Sci. Eng. 2020, 8, 853–864.
- Huang, S.; Liu, A.; Zhang, S.; Wang, T.; Xiong, N. BD-VTE: A Novel Baseline Data based Verifiable Trust Evaluation Scheme for Smart Network Systems. IEEE Trans. Netw. Sci. Eng. 2020.
- Ren, Y.; Wang, T.; Zhang, S.; Zhang, J. An intelligent big data collection technology based on micro mobile data centers for crowdsensing vehicular sensor network. Pers. Ubiquitous Comput. 2020.
- Aftab, M.U.; Oluwasanmi, A.; Alharbi, A.; Sohaib, O.; Nie, X.; Qin, Z.; Ngo, S.T. Secure and dynamic access control for the Internet of Things (IoT) based traffic system. Peerj Comput. Sci. 2021.
- Díaz, J.J.; Mura, I.; Franco, J.F.; Akhavan-Tabatabaei, R. aiRe-A web-based R application for simple, accessible and repeatable analysis of urban air quality data. Environ. Model. Softw. 2021, 138.
- Gorton, I. Cyberinfrastructures: Bridging the Divide between Scientific Research and Software Engineering. Computer 2014, 47, 48–55.
- Wang, L.; Wang, S.; Ran, Y. Data sharing and data set application of watershed allied telemetry experimental research. IEEE Geosci. Remote. Sens. Lett. 2014, 11, 2020–2024.
- Shih, Y.T.; Cheng, H.M.; Sung, S.H.; Hu, W.C.; Chen, C.H. Quantification of the calibration error in the transfer function-derived central aortic blood pressures. Am. J. Hypertens. 2011, 24, 1312–1317.
- Zhou, B.; Chen, Q.; Xiao, P. The error propagation analysis of the received signal strength-based simultaneous localization and tracking in wireless sensor networks. IEEE Trans. Inf. Theory 2017, 63, 3983–4007.
- Fei, H.; Xiao, F.; Li, G.H.; Sun, L.J. An Anomaly Detection Method of Wireless Sensor Network Based on Multi-Modals Data Stream. Chin. J. Comput. 2017, 40, 1829–1842. (In Chinese)
- Zhang, M.H.; Li, X.; Wang, L.L. An adaptive outlier detection and processing approach towards time series sensor data. IEEE Access 2019, 7, 175192–175212.
- Zhang, M.H.; Guo, J.W.; Li, X.; Jin, R. Data-driven anomaly detection approach for time-series streaming data. Sensors 2020, 20, 5646.
- Guo, J.W.; Liu, F. Automatic data quality control of observations in wireless sensor network. IEEE Geosci. Remote. Sens. Lett. 2014, 12, 716–720.
- Schwichtenberg, H. Installing Entity Framework Core. In Modern Data Access with Entity Framework Core; Apress: Berkeley, CA, USA, 2018; pp. 15–29.
- Albertini, O.R.; Bhargov, D.; Denissov, A.; Guerrero, F.; Jayaram, N.; Kak, N.; Khanna, E.; Kislal, O.; Kumar, A.; McQuillan, F.; et al. Image classification in Greenplum database using deep learning. In Proceedings of the International Conference on Management of Data, Portland, OR, USA, 14–19 June 2020; ACM SIGMOD Record. pp. 1–4.
- Wu, A.D.; Guo, J.W.; Yang, P.F. Research on an Application of Shared Architecture for Ecological Monitoring-oriented IoT Streaming Data. IEEE Access 2020, 8, 195385–195397.
- Wu, A.D.; Che, T. Application Research of 3D Visualization System for Three Poles Scientific Discovery. J. Glaciol. Geocryol. 2021, 43, 1–11. (In Chinese)
- OpenLayers API Docs. Available online: https://openlayers.org/en/latest/apidoc/ (accessed on 26 March 2021).
- ECharts Docs. Available online: https://echarts.apache.org/en/api.html#echarts (accessed on 26 March 2021).
- WebSocket. Available online: https://developer.mozilla.org/en-US/docs/Web/API/WebSocket (accessed on 26 March 2021).
- Li, X.; Cheng, G.; Liu, S.; Xiao, Q.; Ma, M.; Jin, R.; Che, T.; Liu, Q.; Wang, W.; Qi, Y.; et al. Heihe Watershed Allied Telemetry Experimental Research (HiWATER): Scientific objectives and experimental design. Bull. Am. Meteorol. Soc. 2013, 94, 1145–1160.
- Li, X.; Liu, S.; Xiao, Q.; Ma, M.; Jin, R.; Che, T.; Wang, W.; Hu, X.; Xu, Z.; Wen, J.; et al. A multiscale dataset for understanding complex eco-hydrological processes in a heterogeneous oasis system. Sci. Data 2017, 170083.
- Jin, R.; Li, X.; Yan, B.; Luo, W.; Li, X.; Guo, J. Introduction of eco-hydrological wireless sensor network in the Heihe River Basin. Adv. Earth Sci. 2012, 27, 993–1005.
- Liu, S.; Li, X.; Xu, Z.; Che, T.; Xiao, Q.; Ma, M.; Liu, Q.; Jin, R.; Guo, J.; Wang, L.; et al. The Heihe integrated observatory network: A basin-scale land surface processes observatory in China. Vadose Zone J. 2018, 17, 180072.
- Xu, Z.; Liu, S.; Li, X.; Shi, S.; Wang, J.; Zhu, Z.; Xu, T.; Wang, W.; Ma, M. Intercomparison of surface energy flux measurement systems used during the HiWATER-MUSOEXE. J. Geophys. Res. Atmos. 2013, 118, 13140–13157.
- Jin, R.; Li, X.; Yan, B.; Li, X.; Luo, W.; Ma, M.; Guo, J.; Kang, J.; Zhu, Z.; Zhao, S. A nested eco-hydrological wireless sensor network for capturing the surface heterogeneity in the midstream area of the Heihe River Basin, China. IEEE Geosci. Remote Sens. Lett. 2014, 11, 2015–2019.
- Liu, F.; Guo, J.W. Study on quality control approach for Heihe wireless sensor network observation data. Remote Sens. Technol. Appl. 2013, 28, 252–257. (In Chinese)
- Wang, J.; Li, X.; Lu, L.; Fang, F. Parameter sensitivity analysis of crop growth models based on the Extended Fourier Amplitude Sensitivity Test method. Environ. Model. Softw. 2013, 48, 171–182.
- Wang, J.; Li, X.; Lu, L.; Fang, F. Estimating near future regional corn yields by integrating multi-source observations into a crop growth model. Eur. J. Agron. 2013, 49, 126–140.
- Jian, K.; Rui, J.; Shaojie, Z.; Linna, C. Spatial Sampling Design of the Sensor Network for Monitoring the Surface Freeze / thaw Cycles over the Heterogeneous Surface in the Heihe River Basin. Remote Sens. Technol. Appl. 2014, 29, 833–838. (In Chinese)
- Jin, R.; Li, X.; Liu, S.M. Understanding the heterogeneity of soil moisture and evapotranspiration using multiscale observations from satellites, airborne sensors, and a ground-based observation matrix. IEEE Geosci. Remote. Sens. Lett. 2017, 14, 2132–2136.
- Fortran 90. Available online: https://www.fortran90.org/ (accessed on 26 March 2021).

No. | System Name | Year | Main Functions | References |
---|---|---|---|---|
1 | BigDansing | 2015 | A big data cleaning system to tackle efficiency, scalability, and ease-of-use issues in data cleaning; it can be run on most general-purpose data processing platforms, ranging from DBMSs to MapReduce-like frameworks. | [18] |
2 | SWITCH | 2019 | It offers a flexible co-programming architecture that provides an abstraction layer and an underlying infrastructure environment, which can help to both specify and support the life cycle of time-critical cloud native applications. | [19] |
3 | DRIP | 2019 | It was developed for the dynamic optimization of data services in research support environments and might be used for a number of similar applications involving distributed services and large, dynamic datasets with further investigation and development. | [20] |
4 | SPS-IUTO | 2020 | To achieve significant improvements in terms of energy and redundant data, a matrix completion-based sampling point selection joint intelligent unmanned trajectory optimization (SPS-IUTO) scheme for unmanned aerial vehicles (UAVs) was proposed to plan sampling points for UAVs in both time and space. | [21] |
5 | BD-VTE | 2020 | A novel baseline data based verifiable trust evaluation (BD-VTE) scheme was proposed to guarantee security at a low cost for massive data. The BD-VTE scheme includes a verifiable trust evaluation (VTE) mechanism, an effectiveness-based incentive (EI) mechanism, and a secondary path planning (SPP) strategy, which are used for reliable trust evaluation, reasonable reward, and efficient path adjustment, respectively. | [22] |
6 | DRMCS | 2020 | DRMCS, a data collection scheme for mobile crowdsensing vehicular networks, was proposed to enhance the data collection rate in vehicular networks for opportunistic communication. | [23] |
7 | SDAC | 2021 | A novel secure and dynamic access control (SDAC) model was developed for IoT networks (smart traffic control and roadside parking management). It allows IoT devices to securely communicate and share information through wired and wireless networks (cellular networks or Wi-Fi). | [24] |
8 | aiRe | 2021 | This open-access tool simplifies air quality data analysis and visualization, with the desirable effects of removing ownership costs, fostering appropriation by nonexpert users, and ultimately promoting informed decision making for the general public and local government authorities. | [25] |

Method | Type | Function | Impact on Data |
---|---|---|---|
Instrument null value | Format conversion | Detect null values caused by the instrument | Depending on the strategy, the data may be modified |
Unit and format conversion | Format conversion | Detect and handle unit and format inconsistencies in observational data | The data will be modified |
Null value during transmission and calculation | Format conversion | Detect and handle null values caused by other reasons | Depending on the strategy, the data may be modified |
Outlier | Format conversion | Detect and handle data that do not adhere to data trends | Depending on the strategy, the data may be modified |
Redundant processing | Format conversion | Detect and delete duplicate data based on timestamps | The duplicate data will be deleted |
Dataset integrity | Quality evaluation | Check whether all the variables of integrated observations have observational values. For example, is there a missing value in a temperature profile? | The data will not be modified but will be tagged |
Timeliness | Quality evaluation | Check the timeliness of warehousing data | No data will be modified, but the system will tag the variable |
Frequency consistency | Quality evaluation | Check whether the data are collected according to the acquisition frequency | No data will be modified, but the system will tag the variable |
Data integrity | Quality evaluation | Tag the data according to the result of outlier detection | No data will be modified, but the system will tag the variable |
Data imperfection | Quality evaluation | Tag the data according to the result of null value detection | No data will be modified, but the system will tag the variable |
Data outlier | Quality evaluation | Tag the abnormal data according to the result of outlier detection | No data will be modified, but the system will tag the variable |
Instrument consistency | Quality evaluation | Detect abnormal data caused by the instrument | No data will be modified, but the system will tag the variable |
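
The cleaning and tagging methods listed in the table above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's implementation: the sentinel values, the function names `drop_duplicates` and `tag_records`, and the sample readings are hypothetical, and the modified z-score rule is a simple stand-in for the outlier detection methods the paper cites.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical sentinel values a logger might emit for "no reading" (assumption).
NULL_SENTINELS = {-9999.0, -6999.0}


def drop_duplicates(records):
    """Redundant processing: keep the first record seen for each timestamp."""
    seen = set()
    unique = []
    for ts, value in records:
        if ts not in seen:
            seen.add(ts)
            unique.append((ts, value))
    return unique


def tag_records(records, expected_step, threshold=3.5):
    """Quality evaluation: attach tags to each record without changing values.

    Tags used here (all names are illustrative):
      "null"           value is missing or an instrument sentinel
      "outlier"        modified z-score (median/MAD based) exceeds threshold
      "frequency_gap"  spacing to the previous record differs from expected_step
    """
    records = sorted(records, key=lambda r: r[0])
    valid = [v for _, v in records if v is not None and v not in NULL_SENTINELS]
    med = median(valid) if valid else None
    mad = median(abs(v - med) for v in valid) if valid else None

    tagged = []
    prev_ts = None
    for ts, value in records:
        tags = []
        if value is None or value in NULL_SENTINELS:
            tags.append("null")
        elif mad and 0.6745 * abs(value - med) / mad > threshold:
            tags.append("outlier")
        if prev_ts is not None and ts - prev_ts != expected_step:
            tags.append("frequency_gap")
        tagged.append((ts, value, tags))
        prev_ts = ts
    return tagged


# Made-up 10-minute soil temperature readings, for illustration only.
start = datetime(2021, 3, 26, 0, 0)
step = timedelta(minutes=10)
raw = [
    (start, 12.1),
    (start + step, 12.3),
    (start + step, 12.3),        # duplicate timestamp
    (start + 2 * step, -9999.0), # instrument null value
    (start + 3 * step, 12.2),
    (start + 5 * step, 45.0),    # late and implausible
    (start + 6 * step, 12.4),
]
for ts, value, tags in tag_records(drop_duplicates(raw), step):
    print(ts.isoformat(), value, tags)
```

Keeping deletion (`drop_duplicates`) separate from tagging (`tag_records`) mirrors the table's split between methods that modify data and methods that only tag it.
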

Device Type | Number of Nodes | Main Observational Variables | Communication Mode |
---|---|---|---|
SoilNET | 50 | soil moisture/temperature | ZigBee, GPRS/3G/4G |
WATERNET | 55 | soil moisture/temperature/salinity, rainfall, snow depth, air moisture/temperature, and wind speed/direction | GPRS/3G/4G/Radio |
LAINET | 50 | leaf area index | GPRS/3G/4G/Radio |
AWS | 18 | soil moisture/temperature/heat flux, surface temperature, air moisture/temperature/pressure, wind speed/direction, and radiation | GPRS/3G/4G/Radio |

No. | Model | Function | Development Language |
---|---|---|---|
1 | WOFOST crop growth model | Using the physiological and ecological processes of crops (e.g., assimilation, respiration, transpiration, and dry matter partitioning) as the simulation basis, the WOFOST crop growth model simulates crop growth under potential, water-limited, and nutrient-limited conditions. | Fortran [52] |
2 | Spatial kriging interpolation model | Taking soil moisture in typical irrigated farmland in the upper reaches of the Heihe River as the object of study, Python extension packages are used to analyze the spatial variability of the observational data and to build a spatial kriging interpolation model that estimates soil moisture in the study area. | Python |
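
To make the spatial kriging interpolation model in the table above more concrete, the following is a minimal ordinary kriging sketch. It is not the integrated model itself, which the table says relies on Python extension packages; this version uses only the standard library. The exponential variogram and its parameters (`nugget`, `sill`, `rng`), the function names, and the sample coordinates and soil moisture values are all assumptions chosen for illustration. In practice the variogram would be fitted to the observations.

```python
import math


def exponential_variogram(h, nugget=0.0, sill=1.0, rng=100.0):
    """Exponential semivariogram with assumed parameters; gamma(0) = 0."""
    if h == 0:
        return 0.0
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h / rng))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(points, values, target, variogram=exponential_variogram):
    """Estimate the value at `target` from observations at `points`.

    Builds the ordinary kriging system: semivariances between observation
    pairs, bordered by a row and column of ones that force the weights
    to sum to 1 (the last unknown is the Lagrange multiplier).
    """
    n = len(points)
    a = [
        [variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
        for i in range(n)
    ]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve_linear(a, b)[:n]
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights


# Made-up soil moisture observations (m^3/m^3) at four node locations (m).
nodes = [(0.0, 0.0), (0.0, 10.0), (10.0, 0.0), (10.0, 10.0)]
moisture = [0.20, 0.24, 0.22, 0.30]
estimate, weights = ordinary_kriging(nodes, moisture, (5.0, 5.0))
print(round(estimate, 4), [round(w, 4) for w in weights])
```

Because the target in this example sits at the centre of a square of nodes, symmetry gives each node the same weight, so the estimate equals the plain mean of the four values. Moving the target toward a node shifts weight to that node, and at a node location the estimate reproduces the observed value.
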

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Guo, J.; Zhang, M.; Shang, Q.; Liu, F.; Wu, A.; Li, X. River Basin Cyberinfrastructure in the Big Data Era: An Integrated Observational Data Control System in the Heihe River Basin. Sensors 2021, 21, 5429. https://doi.org/10.3390/s21165429