Using Machine Learning to Predict Visitors to Totally Protected Areas in Sarawak, Malaysia
Abstract
1. Introduction
2. Literature Review
3. Materials and Methods
3.1. Study Area
3.2. Data Description
3.3. Machine Learning Methods
3.3.1. K-NN Algorithm
3.3.2. Naïve Bayes
3.3.3. Decision Tree
3.4. Performance Evaluation
- Classification accuracy measures the proportion of cases that the model classifies correctly: the number of correctly classified cases divided by the total number of observations. The higher the accuracy, the better the model.
- Sensitivity is the true positive rate: the proportion of positive cases that the model correctly identifies as positive.
- Specificity is the true negative rate: the proportion of negative cases that the model correctly identifies as negative.
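The three measures above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' evaluation code: the function names and the toy labels are assumptions, and the one-vs-rest treatment of the three visitor categories (1 = Low, 2 = Medium, 3 = High) is one common way to extend sensitivity and specificity to a multi-class target.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN), treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive and p != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Correctly classified cases divided by all observations."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int],
                positive: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    tp, _, _, fn = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int],
                positive: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred, positive)
    return tn / (tn + fp)


# Toy labels (hypothetical, not taken from the study data).
y_true = [1, 1, 2, 2, 3, 3, 3, 1]
y_pred = [1, 2, 2, 2, 3, 3, 1, 1]

acc = accuracy(y_true, y_pred)
sens_high = sensitivity(y_true, y_pred, positive=3)
spec_high = specificity(y_true, y_pred, positive=3)
```

Here the "High" category (code 3) is the positive class; repeating the call with `positive=1` or `positive=2` gives the per-class rates for the other two categories.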
4. Results
4.1. Prediction Model for Local Visitors
4.2. Prediction Model for Foreign Visitors
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Attribute | Description | Value |
---|---|---|
Visitors_CatL | Visitors’ category for local visitors | 1 = Low visitors; 2 = Medium visitors; 3 = High visitors |
Visitors_CatF | Visitors’ category for foreign visitors | 1 = Low visitors; 2 = Medium visitors; 3 = High visitors |
Type of park | Type of the park | 1 = National park; 2 = Nature reserve; 3 = Wildlife centre |
Size | Size of the park | km² |
Natural | Number of natural characteristics available at the park | 1, 2, …, 10 |
Rec_Service | Number of recreational services available | 1, 2, …, 10 |
Type of connectivity | Types of connectivity | 1 = Road; 2 = Road and water; 3 = Road and air; 4 = Water and air; 5 = Road, water, and air |
Distance city | Distance of the park from the nearest city | km |
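As a hedged illustration of how the coded attributes in the table above might be held in code, the sketch below defines one record per park and checks the categorical codes against the table. The class name, field names, and the example values are hypothetical; only the code-to-label mappings and the 1 to 10 ranges come from the table.

```python
from dataclasses import dataclass

# Code-to-label mappings copied from the attribute table.
VISITOR_CATEGORY = {1: "Low visitors", 2: "Medium visitors", 3: "High visitors"}
PARK_TYPE = {1: "National park", 2: "Nature reserve", 3: "Wildlife centre"}
CONNECTIVITY = {
    1: "Road",
    2: "Road and water",
    3: "Road and air",
    4: "Water and air",
    5: "Road, water, and air",
}


@dataclass(frozen=True)
class ParkRecord:
    """One observation; field names are illustrative, not the authors' schema."""
    visitors_cat_local: int    # Visitors_CatL
    visitors_cat_foreign: int  # Visitors_CatF
    park_type: int             # Type of park
    size_km2: float            # Size
    natural: int               # Natural (1..10)
    rec_service: int           # Rec_Service (1..10)
    connectivity: int          # Type of connectivity
    distance_city_km: float    # Distance city

    def __post_init__(self) -> None:
        # Reject codes that do not appear in the attribute table.
        if self.visitors_cat_local not in VISITOR_CATEGORY:
            raise ValueError("unknown local visitor category")
        if self.visitors_cat_foreign not in VISITOR_CATEGORY:
            raise ValueError("unknown foreign visitor category")
        if self.park_type not in PARK_TYPE:
            raise ValueError("unknown park type")
        if self.connectivity not in CONNECTIVITY:
            raise ValueError("unknown connectivity type")
        if not 1 <= self.natural <= 10 or not 1 <= self.rec_service <= 10:
            raise ValueError("count attributes must lie in 1..10")
        if self.size_km2 <= 0 or self.distance_city_km < 0:
            raise ValueError("size must be positive and distance non-negative")


# Hypothetical record, for illustration only.
example = ParkRecord(
    visitors_cat_local=3, visitors_cat_foreign=2, park_type=1,
    size_km2=27.3, natural=6, rec_service=4, connectivity=1,
    distance_city_km=18.0,
)
park_type_label = PARK_TYPE[example.park_type]
```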
No | Algorithm | Accuracy Rate (%) | Std Deviation (+/−) | Misclassification Rate (%) |
---|---|---|---|---|
1. | K-NN | 76.48 | 6.78 | 23.52 |
2. | Naïve Bayes | 75.28 | 5.71 | 24.72 |
3. | Decision tree—gain ratio | 80.65 | 4.35 | 19.35 |
4. | Decision tree—Gini | 80.65 | 4.77 | 19.35 |
5. | Decision tree—entropy | 80.65 | 4.77 | 19.35 |
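The three decision tree variants in the table differ only in the criterion used to choose a split. The sketch below gives the standard textbook definitions of entropy, the Gini index, information gain, and gain ratio; it is an illustration under that assumption, since the exact formulas implemented by the authors' software are not stated here. Function names and the toy labels are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (base 2) of a label collection."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels: Sequence[str]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent: Sequence[str],
                     children: List[Sequence[str]]) -> float:
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children if ch)
    return entropy(parent) - weighted


def gain_ratio(parent: Sequence[str],
               children: List[Sequence[str]]) -> float:
    """Information gain divided by the split information of the partition."""
    n = len(parent)
    split_info = -sum(
        (len(ch) / n) * math.log2(len(ch) / n) for ch in children if ch
    )
    if split_info == 0:
        return 0.0  # a split that sends every case to one branch is useless
    return information_gain(parent, children) / split_info


# Toy example: a split that separates the two classes perfectly.
parent = ["High"] * 4 + ["Low"] * 4
children = [["High"] * 4, ["Low"] * 4]

parent_entropy = entropy(parent)
parent_gini = gini(parent)
ratio = gain_ratio(parent, children)
```

Gain ratio penalises splits with many small branches, which plain information gain tends to favour; on a balanced two-way split like the one above, the two coincide.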
No | English Rules of Gain Ratio |
---|---|
1. | The park is a high-visitor park when the distance from the park to the nearest city is less than 22.6 km. |
2. | The park is a medium-visitor park when the size of the park is less than 61.88 km² and the distance from the park to the nearest city is between 22.6 km and 115 km. |
3. | The park is a low-visitor park when the size of the park is greater than 98.66 km² and the distance from the park to the nearest city is greater than 22.6 km. |
4. | The park is a low-visitor park when the distance from the park to the nearest city is greater than 115 km and the size of the park is less than 98.66 km². |
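The four rules above translate directly into a small function. This is a sketch of the rules as listed, not the fitted tree itself: the function name is hypothetical, the handling of values that fall exactly on a threshold is an assumption, and combinations that none of the listed rules covers return `None` rather than a guessed category.

```python
from typing import Optional


def classify_local(distance_km: float, size_km2: float) -> Optional[str]:
    """Apply the listed gain-ratio rules for local visitors, in order."""
    # Rule 1: close to a city.
    if distance_km < 22.6:
        return "High"
    # Rule 2: small park at a moderate distance (inclusive bounds assumed).
    if size_km2 < 61.88 and 22.6 <= distance_km <= 115:
        return "Medium"
    # Rule 3: large park beyond 22.6 km.
    if size_km2 > 98.66 and distance_km > 22.6:
        return "Low"
    # Rule 4: remote park that is not large.
    if distance_km > 115 and size_km2 < 98.66:
        return "Low"
    # No listed rule applies to this combination.
    return None


# Hypothetical parks, for illustration only.
near_city = classify_local(distance_km=10.0, size_km2=500.0)
small_midrange = classify_local(distance_km=50.0, size_km2=30.0)
large_far = classify_local(distance_km=50.0, size_km2=200.0)
remote_small = classify_local(distance_km=200.0, size_km2=50.0)
uncovered = classify_local(distance_km=50.0, size_km2=80.0)
```

The last call shows a gap in the listed rules: a park of 80 km² located 50 km from the nearest city matches none of them.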
No | Algorithm | Accuracy Rate (%) | Std Deviation (+/−) | Misclassification Rate (%) |
---|---|---|---|---|
1. | K-NN | 83.43 | 3.77 | 16.57 |
2. | Naïve Bayes | 84.35 | 3.19 | 15.65 |
3. | Decision tree—gain ratio | 84.35 | 3.19 | 15.65 |
4. | Decision tree—Gini | 84.35 | 3.19 | 15.65 |
5. | Decision tree—entropy | 84.35 | 3.19 | 15.65 |
No | English Rules of Gain Ratio |
---|---|
1. | The park is a high foreign visitors park when the age of the park is more than 54.5 years. |
2. | The park is a high foreign visitors park when the age of the park is less than 54.5 years, the type of park is a wildlife centre, and the distance from the park to the nearest city is less than 22 km. |
3. | The park is a medium foreign visitors park when the age of the park is less than 54.5 years and the type of park is a nature reserve. |
4. | The park is a medium foreign visitors park when the age of the park is less than 54.5 years, the type of park is a wildlife centre, and the distance from the park to the nearest city is 33.2 km. |
5. | The park is a medium foreign visitors park when the age of the park is less than 54.5 years, the type of park is a national park, and the size of the park is more than 549 km². |
6. | The park is a medium foreign visitors park when the age of the park is less than 54.5 years, the type of park is a national park, the size of the park is between 15.1 and 79.7 km², and the park is accessible by road, water, and air. |
7. | The park is a low foreign visitors park when the age of the park is less than 54.5 years, the type of park is a national park, and the size of the park is between 79.7 and 549 km². |
8. | The park is a low foreign visitors park when the age of the park is less than 54.5 years, the type of park is a national park, and the size of the park is less than 15.1 km². |
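The eight rules above can be written as nested conditions. The sketch below is an illustration of the rules as listed, under several assumptions: the function and parameter names are hypothetical, park type and connectivity use the integer codes from the attribute table (1 = national park, 2 = nature reserve, 3 = wildlife centre; 5 = road, water, and air), boundary values are assigned to one side arbitrarily, and rule 4 is read literally as a distance equal to 33.2 km. Cases that no listed rule covers return `None`.

```python
import math
from typing import Optional

NATIONAL_PARK, NATURE_RESERVE, WILDLIFE_CENTRE = 1, 2, 3
ROAD_WATER_AIR = 5


def classify_foreign(age_years: float, park_type: int, size_km2: float,
                     distance_km: float, connectivity: int) -> Optional[str]:
    """Apply the listed gain-ratio rules for foreign visitors."""
    # Rule 1: old parks.
    if age_years > 54.5:
        return "High"
    if park_type == WILDLIFE_CENTRE:
        if distance_km < 22:
            return "High"    # Rule 2
        if math.isclose(distance_km, 33.2):
            return "Medium"  # Rule 4, literal reading (assumption)
        return None
    if park_type == NATURE_RESERVE:
        return "Medium"      # Rule 3
    if park_type == NATIONAL_PARK:
        if size_km2 > 549:
            return "Medium"  # Rule 5
        if 79.7 <= size_km2 <= 549:
            return "Low"     # Rule 7
        if 15.1 <= size_km2 < 79.7:
            # Rule 6 only covers parks reachable by road, water, and air.
            return "Medium" if connectivity == ROAD_WATER_AIR else None
        return "Low"         # Rule 8: smaller than 15.1 km²
    return None


# Hypothetical parks, for illustration only.
old_park = classify_foreign(60, NATIONAL_PARK, 100, 50, 1)
near_wildlife = classify_foreign(20, WILDLIFE_CENTRE, 5, 10, 1)
reserve = classify_foreign(20, NATURE_RESERVE, 5, 10, 1)
mid_size_full_access = classify_foreign(20, NATIONAL_PARK, 50, 50, ROAD_WATER_AIR)
mid_size_road_only = classify_foreign(20, NATIONAL_PARK, 50, 50, 1)
```

The last call returns `None` because the listed rules only describe mid-sized national parks that are reachable by road, water, and air.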
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abang Abdurahman, A.Z.; Wan Yaacob, W.F.; Md Nasir, S.A.; Jaya, S.; Mokhtar, S. Using Machine Learning to Predict Visitors to Totally Protected Areas in Sarawak, Malaysia. Sustainability 2022, 14, 2735. https://doi.org/10.3390/su14052735