1. Introduction
Habitat loss, human–elephant conflict, and poaching threaten Bornean elephants (Elephas maximus) [1]. Despite global anti-poaching efforts, the illegal ivory trade continues to drive poaching, reducing the population to fewer than 1500 [2,3,4]. In Sabah, Malaysian Borneo, over 200 elephants died between 2010 and 2021, many through poisoning near oil palm plantations [5,6]. High-profile incidents, such as the 2013 poisoning of 14 pygmy elephants, highlight the escalating conflict between expanding agriculture and wildlife conservation.
Other species, including Bornean orangutans (Pongo pygmaeus), proboscis monkeys (Nasalis larvatus), and Sunda pangolins (Manis javanica), face similar threats from habitat loss and trafficking for the illegal wildlife trade [6]. Poaching also endangers human lives, with rangers facing armed poachers [7].
Scarce and inconsistent poaching data complicate reliable predictions [8], prompting researchers to leverage environmental data for insights. For example, GPS sensors used by the World Wildlife Fund in Sabah help monitor elephant behaviour and mitigate human–elephant conflicts. Advanced machine learning models built on GPS and environmental data can predict wildlife movements, assisting targeted anti-poaching efforts.
Studies such as those by Chibeya et al. [9] and Fang et al. [10] have demonstrated the potential of combining sparse data with computational techniques to predict poaching hotspots. However, challenges remain in integrating diverse datasets with advanced algorithms to improve prediction accuracy.
This paper introduces PoachNet, a predictive tool designed to address these challenges by integrating wildlife data with advanced algorithms. PoachNet combines deep learning with an ontology-based knowledge graph, creating a dynamic, hybrid model for poaching prediction. Elephant GPS observations are processed through a sequential neural network to predict geo-locations, which are semantically modelled and incorporated into the knowledge graph. Rules expressed in the Semantic Web Rule Language (SWRL) then assert poaching events that are not explicitly stated in the data held in the ontology-based knowledge graph. PoachNet was benchmarked against state-of-the-art methods and consistently achieved higher accuracy.
The remainder of this paper is structured as follows:
Section 2 reviews related work.
Section 3 describes the methodology for developing the ontology-based knowledge graph.
Section 4 introduces the elephant geo-location prediction model.
Section 5 describes the rule-based reasoning for poaching prediction.
Section 6 presents the results.
Section 7 discusses the results.
Section 8 concludes the paper.
2. Related Work
This section surveys existing literature on knowledge graphs in predictive modelling and crime prediction, and then discusses recent wildlife crime prediction methods.
2.1. Knowledge Graphs for Data Modelling
Knowledge graphs (KGs) have seen significant advancements, particularly in enhancing predictive models. Pahuja et al. [11] addressed challenges in traditional Graph Neural Networks (GNNs), such as over-smoothing and scalability, by proposing a “retrieve-and-read” framework using a Transformer-based GNN to improve contextual predictions. Similarly, Duan and Chiang [12] developed an ontology-driven system to streamline predictive tasks by integrating diverse data into KGs.
In urban planning, Ning et al. [13] introduced the Unified Urban Knowledge Graph (UUKG), containing millions of data triplets for two cities, to enhance spatiotemporal predictions. The study revealed complex structural patterns in UrbanKGs, tested KG embedding methods in predictive models, and made the dataset publicly available. Yan et al. [14] applied virtual KGs for predictive analytics in hydraulic systems, aligning with Industry 4.0 needs, particularly in predictive maintenance.
In healthcare, Feng et al. [15] introduced DKADE, combining deep learning and KGs to detect adverse drug events (ADEs) while addressing gaps in clinical narratives. Zeng et al. [15] reviewed KG applications in drug discovery and ADE prediction, while Wang et al. [16] developed KG-DTI, a KG-driven deep learning model for drug-target interaction predictions in Alzheimer’s disease treatment. These methods demonstrate the transformative potential of KGs across fields such as urban planning, industrial systems, and healthcare.
2.2. Knowledge Graphs for Crime Prediction
Tompson et al. [17] highlighted the value of integrating Open Data for crime prediction, emphasizing the role of knowledge graphs in organizing and enriching data with semantics for better decision-making, as explored by Sikos [18]. Deepak et al. [19] used a Bi-LSTM neural network to classify crimes based on Google News and Twitter data, integrating ontologies dynamically crafted from weighted graphs of news and social media sources.
Wang et al. [20] developed HAGEN, a graph convolutional recurrent network that predicts crime across regions by leveraging homophily-aware constraints for similar crime patterns. Iqbal et al. [21] employed Naïve Bayes and Decision Tree classifiers, assessed with confusion matrices, to classify crimes using U.S. crime datasets, while Bogomolov et al. [22] combined mobile network, demographic, and crime data to forecast hotspots in London using Logistic Regression, Neural Networks, and Random Forests.
In other studies, Almanie et al. [23] analyzed crime datasets from Denver and Los Angeles with Decision Trees and Bayesian methods, and Chen et al. [24] used Twitter and weather data with linear models to predict crimes. Kang et al. [25] integrated crime, demographic, weather, and image data using a deep neural network (DNN) with feature-level data fusion, enhancing predictions with environmental context.
2.3. Wildlife Crime Prediction
Efforts in wildlife conservation have been bolstered by major data platforms such as the Global Biodiversity Information Facility (GBIF) [26], the Encyclopedia of Life (EOL) [27], Wikidata [28], and eBird [29], which aggregate and provide accessible biodiversity records. These resources play a vital role in addressing wildlife crime, a transnational issue involving syndicates exploiting biodiversity for illegal gains.
Research and data science have addressed various aspects of wildlife crime, including wildfire prediction, by employing advanced techniques such as deep learning and data compression [30] to enhance prediction accuracy and reduce computational costs [31]. For poaching crimes, Hofer et al. [32] examined poaching from an economic point of view, while Bakana et al. [33] focused on multimedia data mining for poacher detection. Haas et al. [34] employed federated databases to disrupt wildlife trafficking networks and later developed a political-ecological model to guide conservation decisions in poaching-prone areas [35]. Critchlow et al. [36] used ranger patrol data with Bayesian models to map illegal activities, emphasizing the predictive value of past crime locations over ecological covariates.
Technological advancements have strengthened anti-poaching efforts. Gore et al. [37] emphasized geospatial data standards to enhance data sharing and wildlife crime analysis. Yang et al. [38] introduced PAWS, a game theory-based ranger patrol optimization tool, while Nguyen et al. [39] improved upon it with CAPTURE, which addressed observational and temporal uncertainties in poaching prediction. CAPTURE’s limitations, such as interpretability and learning efficiency, were addressed by Kar et al. [40] with INTERCEPT, which employed spatially aware decision trees and ensemble models to improve prediction accuracy.
Gholami et al. [41] further advanced the field by combining CAPTURE and INTERCEPT into a hybrid spatiotemporal model, enabling precise hotspot prediction and more effective ranger deployment. Their iWare-E1 model, trained on 14 years of wildlife crime data, achieved remarkable success in detecting poaching activities in protected areas.
Figure 1 illustrates various past research efforts addressing poaching and compares them with our approach.
Building on these foundations, this work introduces PoachNet, a novel approach that integrates deep learning and an ontology-based knowledge graph. By capturing complex animal behaviours and incorporating reasoning rules, PoachNet outperforms traditional and state-of-the-art models.
3. Methodology
This section discusses how PoachNet was created. The process begins with constructing a knowledge graph using the Forest Observatory Ontology (FOO) (w3id.org/def/foo#). Following this, spatiotemporal predictions were performed and subsequently incorporated into the ontology-based knowledge graph, enhancing its capacity for reasoning about poaching.
3.1. Ontology-Based Knowledge Graph
The Forest Observatory Ontology (FOO) integrates and links heterogeneous Resource Description Framework (RDF) graphs to create ontology-based knowledge graph(s).
Figure 2 illustrates the generation process of the ontology-based knowledge graph for this study, while Figure 3 shows a lightweight representation of the resulting knowledge graph. FOO was developed using the Protégé platform and enriched with the GPS observation variables of the elephant Seri. To construct the knowledge graph, we employed YARRRML, a mapping tool described by Van Assche et al. [46], which uses FOO’s URI to connect external data entities as fragments (i.e., appending them to the URI with a hash (#)). The mapper processed GPS collar data from a CSV file to create an RDF knowledge graph using FOO. It parsed fields such as local and GMT dates, times, latitude, longitude, temperature, speed, and activity, mapped them to semantic entities such as sensor observations, and linked each observation to its sensor (the GPS collar) and its feature of interest (i.e., the elephant).
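The mapping step can be illustrated with a minimal sketch in plain Python. It is not the YARRRML mapping used in the study: it only shows how one CSV row might become RDF triples whose subject is a fragment appended to the FOO URI. The CSV column names, the `observation_` naming scheme, and the sample coordinates are assumptions made for illustration; the class and property IRIs are the ones that appear in Listings 1 to 3.

```python
import csv
import io

FOO = "https://w3id.org/def/foo#"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical CSV layout and made-up values; the real collar export may differ.
SAMPLE_CSV = """id,localDate,localTime,lat,long
1,2012-03-01,06:00:00,5.4101,118.0402
2,2012-03-01,08:00:00,5.4125,118.0431
"""


def literal(value, datatype=None):
    """Format an N-Triples literal, optionally with an XSD datatype."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    if datatype:
        return f'"{escaped}"^^<{XSD}{datatype}>'
    return f'"{escaped}"'


def row_to_triples(row):
    """Map one CSV row to N-Triples lines describing a GPS observation."""
    # The subject is a fragment appended to the FOO URI with a hash.
    subject = f"<{FOO}observation_{row['id']}>"
    return [
        f"{subject} <{RDF_TYPE}> <{FOO}gPSObservation> .",
        f"{subject} <{GEO}latitude> {literal(row['lat'], 'float')} .",
        f"{subject} <{GEO}longitude> {literal(row['long'], 'float')} .",
        f"{subject} <{FOO}localDate> {literal(row['localDate'])} .",
        f"{subject} <{FOO}localTime> {literal(row['localTime'])} .",
    ]


def csv_to_ntriples(text):
    """Convert the whole CSV text into a flat list of N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        lines.extend(row_to_triples(row))
    return lines


triples = csv_to_ntriples(SAMPLE_CSV)
```

Each row yields five triples here; a fuller mapping would also link every observation to its sensor and feature of interest, as described above.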
3.2. Study Hub
Our study modelled data from the Lower Kinabatangan Wildlife Sanctuary in Sabah, Malaysian Borneo. This sanctuary, spanning about 270 km² and situated between E 118°00′–118°50′ and N 5°20′–5°50′, features a tropical rainforest climate and is home to a variety of endangered species, e.g., Bornean elephants (Elephas maximus), orangutans (Pongo pygmaeus), and Sunda pangolins (Manis javanica). The data obtained for this area comprised animal tracking sensor data.
3.3. Elephant Tracker Data
Elephant tracker data were sourced from Danau Girang Field Centre [47] and came from Global Positioning System (GPS) collars fitted to adult Bornean elephants (Elephas maximus). The collars, provided by Africa Wildlife Tracking (awt.co.za), recorded data every two hours from 2012 to 2018, including location, time, temperature, and speed. These 14 kg collars, each equipped with a GPS receiver and a VHF transmitter, were fitted by researchers, trackers, and wildlife practitioners.
Table 1 describes 9168 observations from a GPS tracking collar attached to an elephant named Seri. Metrics include latitude (lat), longitude (long), temperature (Temperature), external temperature (ExtTemp), activity, speed, direction, covariance (Cov), horizontal dilution of precision (HDOP), distance, and count. The latitude and longitude data show minimal variation, with averages of 5.20° and 118.66°, respectively, indicating a specific geographic area. The mean temperature is 29.20 °C, with a wide range from −37.00 °C to 60.50 °C, suggesting potential anomalies or extreme environmental conditions. Metrics such as external temperature, activity, speed, and direction show uniformity, with all values at 0, possibly indicating static conditions or missing data. Covariance and HDOP have average values of 1.23 and 2.21, respectively, highlighting variable GPS signal quality. Distance values range widely, with an average of 273.89 units, and the Count metric spans from 2199 to 11,366, indicating diverse data recording frequencies. The historical elephant Seri data were first converted into RDF format and merged with FOO in a triplestore (Stardog) to form the ontology-based knowledge graph containing 202,885 triples.
4. PoachNet Geo-Location Prediction
We developed a neural network model to predict geo-location attributes based on data extracted from the ontology-based knowledge graph. The selected features—localDate, localTime, latitude, and longitude—were chosen for their critical role in capturing both temporal and spatial dimensions of movement. Temporal features such as localDate and localTime provide essential information about when an event or movement occurred, while latitude and longitude define the exact geographic location, making them key predictors for modelling movement patterns. These features were extracted from the knowledge graph using the rdflib library and SPARQL queries.
4.1. Data Extraction and Preprocessing
We employed SPARQL Protocol and RDF Query Language [
48] (Listing 1) to retrieve relevant attributes from the RDF graph. The query extracted geo-location data points representing observations for latitude, longitude, date, and time. The extracted data were converted into numerical values for further preprocessing.
Listing 1. Query to extract elephant geo-location data including latitude, longitude, date, and time.
    SELECT * {
      ?observation a <https://w3id.org/def/foo#gPSObservation> ;
        <http://www.w3.org/2003/01/geo/wgs84_pos#latitude> ?lat ;
        <http://www.w3.org/2003/01/geo/wgs84_pos#longitude> ?long ;
        <https://w3id.org/def/foo#localDate> ?localDate ;
        <https://w3id.org/def/foo#localTime> ?localTime .
    }
The retrieved features were then processed to generate feature vectors (‘day’, ‘month’, ‘year’, ‘hour’) and label vectors (‘latitude’, ‘longitude’). The NumPy library was used to convert these vectors into floating-point arrays suitable for the neural network. The dataset was split into training, validation, and testing subsets: 20% of the data was first reserved for testing, and the remaining 80% was then divided so that 60% of the full dataset was used for training and 20% for validation.
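The preprocessing described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the study's code: dates are assumed to be ISO strings (`YYYY-MM-DD`), times `HH:MM:SS`, the records are made up, and the split is done on a seeded random shuffle, which the text does not specify. The function names `to_features` and `split_60_20_20` are hypothetical.

```python
import random
from datetime import datetime


def to_features(local_date, local_time):
    """Build the (day, month, year, hour) feature vector as floats."""
    stamp = datetime.strptime(f"{local_date} {local_time}", "%Y-%m-%d %H:%M:%S")
    return [float(stamp.day), float(stamp.month), float(stamp.year), float(stamp.hour)]


def split_60_20_20(rows, seed=0):
    """Reserve 20% for testing, then divide the rest into training and validation.

    The three parts are 60%, 20% and 20% of the full dataset.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # shuffling is an assumption
    n_test = int(round(0.2 * len(rows)))
    n_val = int(round(0.2 * len(rows)))
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


# Made-up records: (localDate, localTime, latitude, longitude).
records = [
    ("2012-03-%02d" % (1 + i % 28), "%02d:00:00" % (2 * (i % 12)),
     5.4 + i * 1e-4, 118.0 + i * 1e-4)
    for i in range(100)
]

# Pair each feature vector with its (latitude, longitude) label vector.
samples = [(to_features(d, t), [lat, lon]) for d, t, lat, lon in records]
train, val, test = split_60_20_20(samples)
```

For time-ordered data, a chronological split (no shuffle) may be preferable, since it keeps future observations out of the training set.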
4.2. Architecture and Training
A sequential neural network model was developed using TensorFlow and Keras to predict continuous target variables. The model architecture consisted of an input layer that accepted four features (‘date’, ‘time’, ‘longitude’, and ‘latitude’), followed by two hidden layers with 128 and 64 neurons, respectively. Each hidden layer used the Rectified Linear Unit (ReLU) activation function to learn non-linear patterns in the data. The output layer employed a linear activation function, enabling precise predictions of the target values (‘date’, ‘time’, ‘longitude’, and ‘latitude’).
The model was compiled using the Adam optimizer with Root Mean Squared Error (RMSE) as the loss function. Training ran for 500 epochs with a batch size of 32, using validation data to monitor performance and mitigate overfitting. The geo-location predictions of the trained model were evaluated using RMSE. The predictions were then transformed into RDF graphs and integrated into the ontology-based knowledge graph, enriching it for rule-based reasoning.
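For reference, RMSE is the square root of the mean squared difference between observed and predicted values. The sketch below computes it per output column in plain Python; the helper names and the sample coordinates are made up for illustration and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_per_target(rows_true, rows_pred):
    """RMSE computed separately for each output column."""
    columns_true = zip(*rows_true)
    columns_pred = zip(*rows_pred)
    return [rmse(list(t), list(p)) for t, p in zip(columns_true, columns_pred)]


# Made-up (latitude, longitude) pairs, for illustration only.
observed = [(5.40, 118.00), (5.42, 118.04)]
predicted = [(5.41, 118.00), (5.41, 118.06)]
lat_rmse, lon_rmse = rmse_per_target(observed, predicted)
```

Because the error is expressed in the units of the target, an RMSE on latitude or longitude is in decimal degrees.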
Figure 4 illustrates the predictive framework, and the pseudocode in Algorithm 1 outlines the entire workflow.
Algorithm 1: Data Extraction and Regression Model Training
5. Reasoning for Poaching Prediction
Our experiment created a specific rule (Listing 2) to anticipate poaching activities. This rule predicts poaching based on an elephant’s proximity to a designated hazardous area, such as an oil palm plantation. The criterion states that if a GPS-collared elephant (here, the elephant named Seri) is near oil palm plantations, which are marked as hazardous owing to previous poaching and poisoning incidents, then there is an increased likelihood that the elephant will be poached. Rule-based semantic reasoning added a new binary poaching indicator to the knowledge graph database (triple store), where ‘1’ indicates potential poaching and ‘0’ indicates its absence.
To determine whether an elephant is within a 5 km radius of the oil palm location (see Figure 5), we created buffer zones with a radius of 5 km around the oil palm plantation and checked the elephant geo-location against them. Figure 6 visualises the 5 km buffer zones used to formulate the semantic rule.
The haversine formula was applied to calculate the distance between the two points, and the elephant is marked as potentially poached if the calculated distance is less than or equal to 5 km. The haversine formula is given by:
$$d = 2R \arcsin\left(\sqrt{\sin^{2}\left(\frac{\Delta\varphi}{2}\right) + \cos(\varphi_1)\cos(\varphi_2)\sin^{2}\left(\frac{\Delta\lambda}{2}\right)}\right)$$
where $\Delta\varphi$ is the difference in latitude, $\Delta\lambda$ is the difference in longitude, $\varphi_1$ and $\varphi_2$ are the latitudes of the two points, $R$ is the radius of the Earth (mean radius = 6371 km), and $d$ is the distance between the two points.
Listing 2. Poaching rule.
    PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

    INSERT {
      ?s a <https://w3id.org/def/foo#gPSObservation> ;
         <https://w3id.org/def/foo#poaching> ?poaching .
    }
    WHERE {
      ?s a <https://w3id.org/def/foo#gPSObservation> ;
         <http://www.w3.org/2003/01/geo/wgs84_pos#latitude> ?lat ;
         <http://www.w3.org/2003/01/geo/wgs84_pos#longitude> ?long .

      # Retrieve plantation details
      <https://w3id.org/def/foo#plantation> a <https://w3id.org/def/foo#OilPalmPlantation> ;
         <http://www.w3.org/2003/01/geo/wgs84_pos#latitude> ?plantationLat ;
         <http://www.w3.org/2003/01/geo/wgs84_pos#longitude> ?plantationLong .

      # Convert coordinates to float (if stored as literals)
      BIND(xsd:float(?lat) AS ?latitude)
      BIND(xsd:float(?long) AS ?longitude)
      BIND(xsd:float(?plantationLat) AS ?oilpalmLat)
      BIND(xsd:float(?plantationLong) AS ?oilpalmLong)

      # Calculate distance using the Haversine formula
      BIND(6371 * 2 * ASIN(SQRT(
          POW(SIN((?latitude - ?oilpalmLat) * PI() / 180 / 2), 2) +
          COS(?oilpalmLat * PI() / 180) * COS(?latitude * PI() / 180) *
          POW(SIN((?longitude - ?oilpalmLong) * PI() / 180 / 2), 2)
      )) AS ?distance)

      # Determine poaching based on the calculated distance
      BIND(IF(?distance <= 5, 1, 0) AS ?poaching)
    }
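The distance test in Listing 2 can be mirrored outside the triple store with a short Python sketch. It uses the same mean Earth radius (6371 km) and the same 5 km threshold; the function names and the sample coordinates are hypothetical and were chosen only to exercise both outcomes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, as in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def poaching_flag(elephant, plantation, radius_km=5.0):
    """Return 1 if the elephant is within radius_km of the plantation, else 0."""
    return 1 if haversine_km(*elephant, *plantation) <= radius_km else 0


# Hypothetical (latitude, longitude) pairs in decimal degrees.
plantation = (5.50, 118.30)
near = (5.52, 118.31)  # roughly 2.5 km from the plantation
far = (5.80, 118.30)   # roughly 33 km from the plantation
```

Calling `poaching_flag(near, plantation)` yields 1 and `poaching_flag(far, plantation)` yields 0, matching the binary indicator the rule writes into the graph.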
6. Results
This section presents the results of PoachNet. The foundation of this approach is the ontology-based knowledge graph, which is now publicly accessible online. However, the elephant Seri GPS observations dataset used in this research is kept confidential because of its sensitive nature. The graph injected with Semantic Web Rule Language (SWRL) enabled semantic reasoning about poaching and introduced new triples that assert the poaching likelihood (Listing 3). The query results were compatible with machine learning input formats, and the deep learning model trained on them achieved high accuracy. The Root Mean Square Error (RMSE) was used to evaluate the accuracy of the geo-location predictions. The deep learning model was trained using TensorFlow in Google Colab. The host computer was a Dell Latitude 4520 with an 11th Gen Intel(R) Core(TM) i7-1165G7 processor @ 2.8 GHz (8 CPUs) and 16 GB of DDR4 RAM, sourced from the United Kingdom.
Listing 3. Query to retrieve the poaching status as a Turtle graph with the geo-location coordinates, local date, and poaching likelihood.
    CONSTRUCT WHERE {
      ?Observation a <https://w3id.org/def/foo#gPSObservation> ;
        <https://w3id.org/def/foo#localDate> ?LocalDate ;
        <http://www.w3.org/2003/01/geo/wgs84_pos#latitude> ?lat ;
        <http://www.w3.org/2003/01/geo/wgs84_pos#longitude> ?long ;
        <https://w3id.org/def/foo#poaching> ?poaching .
    }
6.1. PoachNet Geo-Locations Prediction Result
The proposed neural network model for geo-location prediction is a sequential model built using the TensorFlow and Keras frameworks. The model contains an input layer that accommodates four features (date, time, longitude, and latitude) related to the geographical positioning of elephant Seri. The network also includes two dense layers containing 128 and 64 neurons, respectively, each using the Rectified Linear Unit (ReLU) activation function to capture non-linear patterns in the data. The final layer is an output layer with two neurons and a linear activation function, a configuration well suited to regression tasks with continuous outputs (i.e., longitude and latitude). The data contained 9168 observations, and their distribution is shown in Figure 4 (step 3). The model was trained with a batch size of 32 and achieved its highest accuracy at 500 epochs, registering an average geospatial RMSE of 0.0166 for the elephant Seri GPS observations dataset.
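To make the layer sizes concrete, the following plain-Python sketch runs a forward pass through a network with four inputs, hidden layers of 128 and 64 ReLU units, and two linear outputs, and counts its trainable parameters. It is an illustration only: the weights are random and untrained, the input vector is made up, and the study itself used TensorFlow and Keras rather than this hand-written code.

```python
import random


def dense_params(n_in, n_out):
    """Number of trainable parameters (weights plus biases) in a dense layer."""
    return n_in * n_out + n_out


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, relu):
    """Apply one dense layer, with ReLU if requested, otherwise linear."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


# Input, two hidden layers, output (sizes as described in the text).
LAYER_SIZES = [4, 128, 64, 2]
rng = random.Random(0)
layers = [init_layer(a, b, rng) for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:])]


def predict(features):
    """Forward pass: ReLU on the hidden layers, linear activation on the output."""
    x = features
    for i, layer in enumerate(layers):
        x = dense(x, layer, relu=(i < len(layers) - 1))
    return x


total_params = sum(dense_params(a, b) for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:]))
output = predict([1.0, 3.0, 2012.0, 6.0])  # made-up input vector
```

With these sizes the network has 640 + 8256 + 130 = 9026 trainable parameters, which is small enough to train on a laptop-class machine.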
6.2. PoachNet Evaluation
To evaluate PoachNet against other predictive methods on the elephant Seri GPS collar data, we used the dataset in CSV format with the features date, time, longitude, and latitude. The goal was to predict spatiotemporal coordinates (date, time, longitude, latitude) and compare the performance of three baseline models: linear regression, polynomial regression, and Vector Autoregression (VAR). Performance was assessed using the average Root Mean Square Error (RMSE).
6.3. Data Preprocessing
The independent variables in this analysis are the input features used to make predictions. These include the day, month, and year, all of which are extracted from the ‘LocalDate’. These temporal features provide the contextual information necessary for the models to make accurate predictions.
The dependent variables, or targets, are the outputs the models aim to predict. These include geospatial coordinates such as latitude (‘lat’) and longitude (‘long’), as well as temporal features like the day, represented as a numeric value indicating the day of the month, and time, converted into a numeric representation of seconds since midnight (e.g., ‘12:34:56’ becomes ‘45,296’ s). Together, these outputs capture both spatial and temporal dynamics.
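The two temporal conversions described above are simple enough to show directly. The sketch assumes times formatted as `HH:MM:SS` and dates as `YYYY-MM-DD`; the date format and both function names are assumptions, since the source does not state them.

```python
def seconds_since_midnight(clock):
    """Convert an 'HH:MM:SS' string into seconds elapsed since midnight."""
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def day_of_month(local_date):
    """Extract the day of the month from a 'YYYY-MM-DD' string (format assumed)."""
    return int(local_date.split("-")[2])
```

For the example in the text, `seconds_since_midnight("12:34:56")` gives 12 × 3600 + 34 × 60 + 56 = 45296.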
The data were divided into training and testing sets to ensure fair evaluation, and models were trained and tested sequentially to respect the temporal structure of the data.
6.4. Models and Results
1. Linear Regression: Applied simple linear regression using day as the predictor for both lat and long. Evaluated using 5-fold time-series cross-validation.
2. Polynomial Regression: Incorporated polynomial features (degree 4) to account for non-linear relationships. Similarly validated with 5-fold time-series cross-validation.
3. Vector Autoregression (VAR): Used both latitude and longitude as a multivariate time series for temporal forecasting. Reserved a portion of the dataset for out-of-sample prediction and RMSE calculation.
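The 5-fold time-series cross-validation used for the regression baselines can be sketched as an expanding-window split, in which each test block follows its training block in time. The source does not say which splitter was used, so the function name `time_series_folds` and the fold sizing below are assumptions; the sample count of 9168 is the number of observations reported for the dataset.

```python
def time_series_folds(n_samples, n_splits=5):
    """Yield (train_indices, test_indices) pairs with an expanding training window.

    Every test block comes strictly after its training block, so no future
    observation leaks into training.
    """
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        # The training window grows by one fold at each step.
        train_end = n_samples - (n_splits - k + 1) * fold
        yield list(range(train_end)), list(range(train_end, train_end + fold))


folds = list(time_series_folds(9168, n_splits=5))
```

With 9168 samples each test block holds 1528 observations, and the training window grows from 1528 to 7640 observations across the five folds.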
The RMSE results for all models are summarized in Table 2. Because RMSE is negatively oriented (a lower score indicates better performance), the linear regression model demonstrated strong performance for predicting latitude and longitude, with RMSE values of 0.123 and 0.164, respectively. Notably, the linear regression model also performed exceptionally well for predicting the day, achieving an RMSE close to zero, but struggled with time predictions, yielding an RMSE of approximately 25,186 s.
The polynomial regression model, while offering a more complex representation, exhibited higher RMSE values for latitude (2.396) and longitude (1.050) predictions. It achieved a near-perfect prediction for the day (RMSE: 7.2 × ) but produced the highest error for time predictions, with an RMSE of 483,988 s. This suggests that the added complexity of the polynomial model may have led to overfitting or inefficiency for these particular features.
In comparison, the VAR model excelled in predicting longitude, with the lowest RMSE of 0.089. It also performed reasonably well for latitude predictions (RMSE: 0.222) but struggled with day and time predictions, with RMSE values of 8.69 and 25,093 s, respectively. These results indicate that the VAR model effectively captures temporal dependencies for geospatial coordinates but may require additional feature refinement for accurate temporal predictions.
However, PoachNet (a neural network built with TensorFlow and Keras), trained on the same elephant Seri data but drawn from the ontology-based knowledge graph, outperformed all other models. PoachNet achieved test RMSE values of 0.0247 for longitude, 0.0084 for latitude, 0.0123 for ‘localDate’, and 0.0086 for ‘localTime’. These results demonstrate the effectiveness of combining a knowledge graph representation with deep learning for accurate geospatial and temporal predictions.
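The average geospatial RMSE of 0.0166 reported in Section 6.1 appears to be the mean of the longitude and latitude test RMSE values given above. This reading is an inference from the numbers, not something the text states explicitly:

```latex
\overline{\mathrm{RMSE}}_{\mathrm{geo}}
  = \frac{\mathrm{RMSE}_{\mathrm{long}} + \mathrm{RMSE}_{\mathrm{lat}}}{2}
  = \frac{0.0247 + 0.0084}{2}
  = 0.01655 \approx 0.0166
```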
Figure 7 presents a comparison of the RMSE scores of PoachNet against those of the other prediction models. The code is available on GitHub at https://github.com/Naeima/PoachNet (accessed on 11 December 2024).
7. Discussion
The loss of forest elephants and their dispersal from poaching or habitat loss and fragmentation [2] could lead to reduced forest diversity, the inability of elephants to colonise new or deforested areas, and potentially reduced carbon stocks. Combating poaching in Sabah is a priority, and various organizations, including the Sabah Wildlife Department and Sabah Forestry Department with the support of Danau Girang Field Centre and WWF-Malaysia, are working to protect the Bornean elephant and many other species. The Bornean Elephant Action Plan for Sabah 2020–2029 is a ten-year plan approved by the state government of Sabah to conserve the Bornean elephant population and many other species. The plan has four main objectives: improve protection and reduce elephant deaths, improve landscape connectivity and permeability, ensure the best ex-situ practices for elephant management and conservation, and monitor and predict elephant population trends.
This research differs from existing work by integrating heterogeneous wildlife data with deep learning on an ontology-based knowledge graph. While prior approaches have primarily focused on specific aspects, such as social network analysis, multimedia data mining, or hierarchical models on ranger patrol data, this methodology offers an interconnected understanding of wildlife dynamics. The results show that linear regression is well suited to simple relationships in this dataset and that the VAR model shows promise for geospatial predictions, but PoachNet surpasses both by a wide margin, showcasing the potential of neural networks combined with knowledge graph techniques. Polynomial regression, despite its theoretical flexibility, did not outperform the simpler models and may require better feature engineering to improve its effectiveness.
PoachNet predictions can assist in the strategic resource allocation for anti-poaching efforts. It can also guide the decision to deploy ground truth sensors and motion-activated camera traps in areas most likely to have anticipated poaching crimes.
Research challenges include semantic heterogeneity among diverse data sources, which threatens the consistent representation of information in the knowledge graph. Scalability issues may also emerge as the knowledge graph expands, necessitating careful resource management. Several strategies can address them. Partitioning the graph into manageable subgraphs and using distributed graph databases such as Stardog, Neo4j, or Amazon Neptune can enhance processing efficiency. Incremental updates minimise reprocessing, while graph compression and summarisation reduce storage demands. Scalable cloud-based storage, optimized query processing with indexing, and the use of high-performance graph algorithms further improve performance. Edge computing can preprocess data near collection points, reducing bandwidth and latency.
Optimizing the deep learning algorithms in PoachNet to enhance predictive performance while minimising computational costs can be achieved through model compression techniques such as pruning and quantisation [51]. These approaches reduce the size of deep neural networks while maintaining accuracy, enabling faster inference, reduced storage requirements, and lower training costs. Techniques such as low-rank decomposition, knowledge distillation, and lightweight model design can further streamline model deployment, making models more efficient for use in resource-constrained environments [52].
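As a rough illustration of the two compression techniques named above, the sketch below applies magnitude pruning and uniform 8-bit quantisation to a flat list of weights. It is a toy example under stated assumptions: neither function comes from PoachNet, the names are hypothetical, and production frameworks apply these ideas per layer with retraining or calibration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the given fraction of weights with the smallest absolute value."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_prune])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantise_8bit(weights):
    """Uniform symmetric quantisation to signed 8-bit integers plus a scale."""
    # Fall back to a scale of 1.0 when every weight is zero.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale


def dequantise(quantised, scale):
    """Recover approximate floating-point weights from their integer codes."""
    return [q * scale for q in quantised]
```

Pruning half of `[0.5, -0.1, 0.03, -0.8]` keeps only the two largest-magnitude weights, and quantised weights can be mapped back with `dequantise` at the cost of a small rounding error.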
PoachNet can be expanded by integrating additional wildlife data sources such as acoustic sensors, satellite imagery, and crime intelligence. Acoustic sensors can detect gunshots, elephant vocalisations, or vehicle noises associated with poaching crimes. Satellite imagery can monitor changes in habitat, detect unauthorised human activity, and assess landscape connectivity. Crime intelligence data can add historical context, identifying patterns in poaching incidents and aiding in predicting future hotspots.
8. Conclusions
This study introduced PoachNet, a novel tool integrating Semantic Web technologies and deep learning to predict wildlife dynamics and poaching crime. By combining diverse wildlife data into an ontology-based knowledge graph enriched with rule-based reasoning, PoachNet provides a dynamic, hybrid predictive solution for conservation. Using a custom-built dataset, the neural network model accurately predicted elephant geo-locations, achieving an average geospatial RMSE of 0.0166 and surpassing the methods it was compared against. The approach predicts future elephant geo-locations and uses this information to infer poaching risks based on proximity to identified hazardous areas. PoachNet equips biologists and conservationists with tools for spatiotemporal poaching predictions, offering a new paradigm for wildlife crime prevention. While challenges such as semantic heterogeneity, data sensitivity, and ecosystem dynamics persist, the public release of the ontology-based knowledge graph and source code reflects our commitment to transparency and invites the research community to build upon this work.
Author Contributions
Conceptualisation, N.H., O.R., P.O.-t., B.G. and C.P.; methodology, N.H.; software, N.H.; validation, N.H., B.G., P.O.-t. and O.R.; formal analysis, N.H.; investigation, N.H.; resources, N.H., O.R., P.O.-t., B.G. and C.P.; data curation, N.H.; writing—original draft preparation, N.H.; writing—review and editing, N.H., O.R., P.O.-t. B.G. and C.P.; visualization, N.H.; supervision, C.P., O.R., P.O.-t. and B.G.; project administration, C.P. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data are shared in github.com/Naeima/PoachNet.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Evans, L.J.; Goossens, B.; Davies, A.B.; Reynolds, G.; Asner, G.P. Natural and anthropogenic drivers of Bornean elephant movement strategies. Glob. Ecol. Conserv. 2020, 22, e00906. [Google Scholar] [CrossRef]
- Goossens, B.; Sharma, R.; Othman, N.; Kun-Rodrigues, C.; Sakong, R.; Ancrenaz, M.; Ambu, L.N.; Jue, N.K.; O’Neill, R.J.; Bruford, M.W.; et al. Habitat fragmentation and genetic diversity in natural populations of the Bornean elephant: Implications for conservation. Biol. Conserv. 2016, 196, 80–92. [Google Scholar] [CrossRef]
- Abram, N.; Skara, B.; Othman, N.; Ancrenaz, M.; Mengersen, K.; Goossens, B. Understanding the spatial distribution and hot spots of collared Bornean elephants in a multi-use landscape. Sci. Rep. 2022, 12, 12830. [Google Scholar] [CrossRef] [PubMed]
- Cheah, C.; Yoganand, K. Recent estimate of Asian elephants in Borneo reveals a smaller population. Wildl. Biol. 2022, 2022, e01024. [Google Scholar] [CrossRef]
- Nuwer, R.L. Poached: Inside the Dark World of Wildlife Trafficking; Hachette: Edinburgh, UK, 2018. [Google Scholar]
- Department, S.W. Bornean Elephant Action Plan for Sabah 2020–2029; Sabah Wildlife Department: Sabah, Malaysia, 2020. [Google Scholar]
- Mukwazvure, A.; Magadza, T.B. A Survey on Anti-Poaching Strategies. Int. J. Sci. Res. 2014, 3, 1064–1066.
- Xu, L.; Gholami, S.; McCarthy, S.; Dilkina, B.; Plumptre, A.; Tambe, M.; Singh, R.; Nsubuga, M.; Mabonga, J.; Driciru, M.; et al. Stay Ahead of Poachers: Illegal Wildlife Poaching Prediction and Patrol Planning Under Uncertainty with Field Test Evaluations (Short Version). In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1898–1901.
- Chibeya, D.; Wood, H.; Cousins, S.; Carter, K.; Nyirenda, M.A.; Maseka, H. How do African elephants utilize the landscape during wet season? A habitat connectivity analysis for Sioma Ngwezi landscape in Zambia. Ecol. Evol. 2021, 11, 14916–14931.
- Fang, F.; Nguyen, T.H.; Pickles, R.; Lam, W.Y.; Clements, G.R.; An, B.; Singh, A.; Tambe, M.; Lemieux, A. Deploying PAWS: Field Optimization of the Protection Assistant for Wildlife Security. In Proceedings of the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference, Phoenix, AZ, USA, 12–17 February 2016.
- Pahuja, V.; Wang, B.; Latapie, H.; Srinivasa, J.; Su, Y. A retrieve-and-read framework for knowledge graph link prediction. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 1992–2002.
- Duan, W.; Chiang, Y.Y. Building knowledge graph from public data for predictive analysis: A case study on predicting technology future in space and time. In Proceedings of the 5th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, San Francisco, CA, USA, 31 October 2016; pp. 7–13.
- Ning, Y.; Liu, H.; Wang, H.; Zeng, Z.; Xiong, H. UUKG: Unified Urban Knowledge Graph Dataset for Urban Spatiotemporal Prediction. arXiv 2023, arXiv:2306.11443.
- Yan, W.; Shi, Y.; Ji, Z.; Sui, Y.; Tian, Z.; Wang, W.; Cao, Q. Intelligent predictive maintenance of hydraulic systems based on virtual knowledge graph. Eng. Appl. Artif. Intell. 2023, 126, 106798.
- Feng, Z.Y.; Wu, X.H.; Ma, J.L.; Li, M.; He, G.F.; Cao, D.S.; Yang, G.P. DKADE: A novel framework based on deep learning and knowledge graph for identifying adverse drug events and related medications. Brief. Bioinform. 2023, 24, bbad228.
- Wang, S.; Du, Z.; Ding, M.; Rodriguez-Paton, A.; Song, T. KG-DTI: A knowledge graph based deep learning method for drug-target interaction predictions and Alzheimer’s disease drug repositions. Appl. Intell. 2022, 52, 846–857.
- Tompson, L.; Johnson, S.; Ashby, M.; Perkins, C.; Edwards, P. UK open source crime data: Accuracy and possibilities for research. Cartogr. Geogr. Inf. Sci. 2015, 42, 97–111.
- Sikos, L.F. Cybersecurity knowledge graphs. Knowl. Inf. Syst. 2023, 65, 3511–3531.
- Deepak, G.; Rooban, S.; Santhanavijayan, A. A knowledge centric hybridized approach for crime classification incorporating deep bi-LSTM neural network. Multimed. Tools Appl. 2021, 80, 28061–28085.
- Wang, C.; Lin, Z.; Yang, X.; Sun, J.; Yue, M.; Shahabi, C. Hagen: Homophily-aware graph convolutional recurrent network for crime forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 22 February–1 March 2022; Volume 36, pp. 4193–4200.
- Iqbal, R.; Murad, M.A.A.; Mustapha, A.; Panahy, P.H.S.; Khanahmadliravi, N. An experimental study of classification algorithms for crime prediction. Indian J. Sci. Technol. 2013, 6, 4219–4225.
- Bogomolov, A.; Lepri, B.; Staiano, J.; Oliver, N.; Pianesi, F.; Pentland, A. Once upon a crime: Towards crime prediction from demographics and mobile data. In Proceedings of the ICMI 2014—2014 International Conference on Multimodal Interaction, Istanbul, Turkey, 12–16 November 2014; pp. 427–434.
- Almanie, T.; Mirza, R.; Lor, E. Crime Prediction Based on Crime Types and Using Spatial and Temporal Criminal Hotspots. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 1–19.
- Chen, X.; Cho, Y.; Jang, S.Y. Crime prediction using Twitter sentiment and weather. In Proceedings of the 2015 Systems and Information Engineering Design Symposium, SIEDS 2015, Charlottesville, VA, USA, 24 April 2015; pp. 63–68.
- Kang, H.W.; Kang, H.B. Prediction of crime occurrence from multi-modal data using deep learning. PLoS ONE 2017, 12, e0176244.
- Lane, M.A.; Edwards, J.L. The global biodiversity information facility (GBIF). Syst. Assoc. Spec. Vol. 2007, 73, 1.
- Parr, C.S.; Wilson, M.N.; Leary, M.P.; Schulz, K.S.; Lans, M.K.; Walley, M.L.; Hammock, J.A.; Goddard, M.A.; Rice, M.J.; Studer, M.M.; et al. The encyclopedia of life v2: Providing global access to knowledge about life on earth. Biodivers. Data J. 2014, 2, e1079.
- Vrandečić, D.; Krötzsch, M. Wikidata: A free collaborative knowledgebase. Commun. ACM 2014, 57, 78–85.
- Sullivan, B.L.; Aycrigg, J.L.; Barry, J.H.; Bonney, R.E.; Bruns, N.; Cooper, C.B.; Damoulas, T.; Dhondt, A.A.; Dietterich, T.; Farnsworth, A.; et al. The eBird enterprise: An integrated approach to development and application of citizen science. Biol. Conserv. 2014, 169, 31–40.
- Cheng, S.; Prentice, I.C.; Huang, Y.; Jin, Y.; Guo, Y.K.; Arcucci, R. Data-driven surrogate model with latent data assimilation: Application to wildfire forecasting. J. Comput. Phys. 2022, 464, 111302.
- Zhong, C.; Cheng, S.; Kasoar, M.; Arcucci, R. Reduced-order digital twin and latent data assimilation for global wildfire prediction. Nat. Hazards Earth Syst. Sci. 2023, 23, 1755–1768.
- Hofer, H.; Campbell, K.L.; East, M.L.; Huish, S.A. Modeling the spatial distribution of the economic costs and benefits of illegal game meat hunting in the Serengeti. Nat. Resour. Model. 2000, 13, 151–177.
- Bakana, S.R.; Zhang, Y. Mitigating Wild Animals Poaching Through State-of-the-art Multimedia Data Mining Techniques: A Review. In Proceedings of the IPMV ’20: 2020 2nd International Conference on Image Processing and Machine Vision, Bangkok, Thailand, 5–7 August 2020.
- Haas, T.C.; Ferreira, S.M. Federated databases and actionable intelligence: Using social network analysis to disrupt transnational wildlife trafficking criminal networks. Secur. Inform. 2015, 4, 1–14.
- Haas, T.C.; Ferreira, S.M. Finding politically feasible conservation policies: The case of wildlife trafficking. Ecol. Appl. 2018, 28, 473–494.
- Critchlow, R.; Plumptre, A.J.; Driciru, M.; Rwetsiba, A.; Stokes, E.J.; Tumwesigye, C.; Wanyama, F.; Beale, C.M. Spatiotemporal trends of illegal activities from ranger-collected data in a Ugandan national park. Conserv. Biol. 2015, 29, 1458–1470.
- Gore, M.L.; Griffin, E.; Dilkina, B.; Ferber, A.; Griffis, S.E.; Keskin, B.B.; Macdonald, J. Advancing interdisciplinary science for disrupting wildlife trafficking networks. Proc. Natl. Acad. Sci. USA 2023, 120, e2208268120.
- Yang, R.; Ford, B.; Tambe, M.; Lemieux, A. Adaptive resource allocation for wildlife protection against illegal poachers. In Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2014, Paris, France, 5–9 May 2014; Volume 1, pp. 453–460.
- Nguyen, T.H.; Sinha, A.; Gholami, S.; Plumptre, A.; Joppa, L.; Tambe, M.; Driciru, M.; Wanyama, F.; Rwetsiba, A.; Critchlow, R.; et al. CAPTURE: A new predictive anti-poaching tool for wildlife protection. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, Singapore, 9–13 May 2016; pp. 767–775.
- Kar, D.; Ford, B.; Gholami, S.; Fang, F.; Plumptre, A.; Tambe, M.; Driciru, M.; Wanyama, F.; Rwetsiba, A.; Nsubaga, M.; et al. Cloudy with a chance of poaching: Adversary behavior modeling and forecasting with real-world poaching data. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, Sao Paulo, Brazil, 8–12 May 2017.
- Gholami, S.; McCarthy, S.; Dilkina, B.; Plumptre, A.; Tambe, M.; Driciru, M.; Wanyama, F.; Rwetsiba, A.; Nsubaga, M.; Mabonga, J.; et al. Adversary models account for imperfect crime data: Forecasting and planning against real-world poachers. In Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS, Stockholm, Sweden, 10–15 July 2018; Volume 2, pp. 823–831.
- Edemacu, K.; Kim, J.W.; Jang, B.; Park, H.K. Poacher detection in African game parks and reserves with IoT: Machine learning approach. In Proceedings of the 2019 International Conference on Green and Human Information Technology (ICGHIT), Kuala Lumpur, Malaysia, 15–17 January 2019; pp. 12–17.
- Ferber, A.; Griffin, E.; Dilkina, B.; Keskin, B.; Gore, M. Predicting Wildlife Trafficking Routes with Differentiable Shortest Paths. In Proceedings of the International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research, Nice, France, 29 May–1 June 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 460–476.
- Gholami, S.; Ford, B.; Fang, F.; Plumptre, A.; Tambe, M.; Driciru, M.; Wanyama, F.; Rwetsiba, A.; Nsubaga, M.; Mabonga, J. Taking It for a Test Drive: A Hybrid Spatio-Temporal Model for Wildlife Poaching Prediction Evaluated Through a Controlled Field Test. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2017; Volume 10536 LNAI, pp. 292–304.
- Fang, F.; Nguyen, T.H.; Sinha, A.; Gholami, S.; Plumptre, A.; Joppa, L.; Tambe, M.; Driciru, M.; Wanyama, F.; Rwetsiba, A.; et al. Predicting poaching for wildlife protection. IBM J. Res. Dev. 2017, 61, 3:1–3:12.
- Van Assche, D.; Delva, T.; Heyvaert, P.; De Meester, B.; Dimou, A. Towards a more human-friendly knowledge graph generation and publication. In Proceedings of the International Semantic Web Conference (ISWC) 2021: Posters, Demos, and Industry Tracks, Virtual, 24–28 October 2021.
- Lynn, M.S.; Jumail, A. The Danau Girang Field Centre: Field Station Profile. ECOTROPICA 2021, 23, 202103.
- Arenas, M.; Pérez, J. Querying semantic web data with SPARQL. In Proceedings of the Thirtieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Athens, Greece, 12–16 June 2011; pp. 305–316.
- English, M.; Gillespie, G.; Ancrenaz, M.; Ismail, S.; Goossens, B.; Nathan, S.; Linklater, W. Plant selection and avoidance by the Bornean elephant (Elephas maximus borneensis) in tropical forest: Does plant recovery rate after herbivory influence food choices? J. Trop. Ecol. 2014, 30, 371–379.
- Corlett, R.T. Frugivory and seed dispersal by vertebrates in tropical and subtropical Asia: An update. Glob. Ecol. Conserv. 2017, 11, 1–22.
- Paranayapa, T.; Ranasinghe, P.; Ranmal, D.; Meedeniya, D.; Perera, C. A Comparative Study of Preprocessing and Model Compression Techniques in Deep Learning for Forest Sound Classification. Sensors 2024, 24, 1149.
- Li, Z.; Li, H.; Meng, L. Model Compression for Deep Neural Networks: A Survey. Computers 2023, 12, 60.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).