Image Geo-Site Estimation Using Convolutional Auto-Encoder and Multi-Label Support Vector Machine

Jain, Arpit; Verma, Chaman; Kumar, Neerendra; Raboaca, Maria Simona; Baliya, Jyoti Narayan; Suciu, George

doi:10.3390/info14010029

¹ Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Andhra Pradesh 515001, India

² Department of Media and Educational Informatics, Faculty of Informatics, Eötvös Loránd University, 1053 Budapest, Hungary

³ Department of Computer Science & IT, Central University of Jammu, Jammu 181143, India

⁴ Doctoral School, University Politehnica of Bucharest, Splaiul Independentei Street No. 313, 060042 Bucharest, Romania

⁵ Faculty of Building Services Engineering, Technical University of Cluj-Napoca, C-tin Daicoviciu Street, No. 15, 400020 Cluj-Napoca, Romania

⁶ ICSI Energy National Research and Development Institute for Cryogenic and Isotopic Technologies, 240050 Ramnicu Valcea, Romania

⁷ Department of Educational Studies, Central University of Jammu, Jammu 181143, India

⁸ R&D Department Beia Consult International Bucharest, 041386 Bucharest, Romania

* Authors to whom correspondence should be addressed.

Information 2023, 14(1), 29; https://doi.org/10.3390/info14010029

Submission received: 3 November 2022 / Revised: 26 December 2022 / Accepted: 29 December 2022 / Published: 3 January 2023

(This article belongs to the Special Issue Trends in Computational and Cognitive Engineering)

Abstract

The estimation of an image geo-site solely based on its contents is a promising task. Compelling image labelling relies heavily on contextual information, which is not as simple as recognizing a single object in an image. An Auto-Encoder-based support vector machine approach is proposed in this work to estimate the image geo-site and to address the issue of misclassified estimations. The proposed method is evaluated on a dataset consisting of 125 classes of images captured within 125 countries. A convolutional Auto-Encoder is used for training and dimensionality reduction, and the resulting reduced-dimension data are then classified by a multi-label support vector machine. Performance is assessed using accuracy, sensitivity, specificity, and F1-score as evaluation parameters. The proposed approach for image geo-site estimation outperforms the Auto-Encoder-based K-Nearest Neighbor and Auto-Encoder-Random Forest methods.

Keywords:

SVM classification; convolutional Auto-Encoder; Auto-Encoder-KNN algorithm; random forest algorithm

1. Introduction

The concept of estimating geo-location is an intrinsically rich yet under-explored topic that has received little attention to date [1]. Within computer vision and recognition, it remains distinct from many other topics because it requires the identification of key, abstract features that remain constant across large spatial and temporal scales, which makes it a subtle and novel task for researchers. The capacity to extract contextual information from our environment and make inferences about our current position and surroundings is also a fundamental element of the human experience [2]. It is consequently essential to comprehend and achieve visual geo-recognition in order to construct a more refined and sensitive Artificial Intelligence framework.

In computing, geo-site refers to any sort of technology that may be used to identify a geographic position, whether it is a large expanse of land or something as little as a needle in a haystack. You can find a critical asset, like a trailer, a container, a pallet, or any other similar item, by identifying a linked device in real time. Frequently, the gadget is a mobile phone or a device that can connect to the internet [3].

1.1. Bluetooth Low Energy

Bluetooth is a technical standard for wireless short-range communication that was created in the 1990s [4]. It is primarily intended for short-distance communication, meaning the signals do not propagate very far; thus, the connecting devices must be in the range of around ten meters of one another.

Although Bluetooth has been available to us for more than twenty years, its latest advancement, Bluetooth Low Energy (BLE), is leading to compelling advances in geo-site and positioning technology. Bluetooth functionality is already available on the majority of smartphones and other gadgets [5]. As a result, when you place Bluetooth Low Energy beacons in well-known areas, the beacons will broadcast their unique identification to any Bluetooth-enabled devices that are nearby.

1.2. Network-Based Geo-Site

To determine the location of a user, the network architecture of a service provider can also be utilized. The accuracy of network-based approaches may differ depending on the methodology; it depends on the number of ground stations and on the network's implementation of recently updated timing algorithms. Network triangulation is a method that is utilized by a variety of network providers [6]. As indicated in the picture, you may identify the location of a position by drawing triangles pointing to it from known coordinates. The service provider's network infrastructure will equip a module that will be used by your tracking device.

Network-based geo-site is the least energy-intensive of all the geo-site technologies that have been considered. The precision of this locating approach is dependent on the density of accessible base stations and networks and may vary significantly from one location to another.

1.3. Wi-Fi

Wi-Fi positioning [7] relies on wireless local area networks (WLANs), which operate on a specific radio frequency, generally 2.4 GHz or 5.0 GHz, to provide location services. The device transmits data via radio waves up to a range of a hundred meters, making Wi-Fi suitable for both indoor and outdoor use.

Wi-Fi uses very little energy and may work precisely to a range of ten meters, depending on the availability of Wi-Fi networks. Apart from that, no additional infrastructure is required. As previously indicated, your Wi-Fi device is capable of receiving data from other nearby networks. Note that this could require the use of a paid service or knowledge of the regional network infrastructure.

1.4. GPS

In brief, the Global Positioning System (GPS) is a well-known satellite-based radio navigation system that consists of around thirty satellites orbiting the Earth and delivering positioning data. At one time, it was only available to military personnel, but now anybody with a GPS device may receive the radio signals emitted by these satellites [8].

If there are no obstructions or interference and at least three GPS satellites are accessible, this worldwide satellite system can provide timely data and geo-site to a GPS module anywhere on or near the ground surface.

There are different types of geo-site as mentioned in Table 1 [9]. They include:

Data can be created and retrieved in a variety of ways using any of these types. Geo-site is evolving with time in such a way that it is making everyday life more comfortable. Many applications make use of this geo-site to identify the location of customers or users [10].

Travelling applications [11] like Uber and Airbnb [12] use this location feature to detect the location of the user and provide the service.
Facebook makes use of the geo-location feature to provide users with information relevant to their place, which is more exciting and interesting to them [13].
Delivery applications such as Zomato, Swiggy, Amazon [14], and Flipkart also use this geo-site to deliver products to the right place on time.
Dating apps like Tinder also use the geo-location feature to show nearby people.
Fitness apps like Nike Running Club make use of GPS route tracking for estimating the distance covered by the user [15,16].

These geo-site tags offer a real-time opportunity for effectively exploring many issues. Many researchers have explored geo-site and expanded its application in different fields. The usefulness of integrating geographical information, such as latitude and longitude, with infrared (IR) information, as well as the use of a U-Net-based convolutional neural network for boosting the efficiency of retrieval algorithms, was investigated in [15] to improve the accuracy of retrieval methodologies. According to the findings of that study, applying a suitable CNN architecture to geographical and infrared information gives a chance to increase the quality of satellite-derived precipitation products. The authors in [16] suggested developing an indoor positioning estimation system based on the unique and quasi-stationary magnetic field of rooms. This method was used to examine the spectral evolution of magnetic field data, reducing temporal dependencies. In [17], a Fourier transform is combined with a convolutional neural network (CNN) to predict the user's location within a building from bi-dimensional magnetic field data. When installing an indoor locating system (ILS), the ROC curve is utilized to assess the CNN model's sensitivity and specificity.

Using UAV multispectral images, a convolutional neural network (CNN) approach was developed to estimate the number of citrus plants in dense orchards, with each pixel assigned a probability of containing a plant [18]. To test the approach's resilience, the authors used two modern object detection CNN methods [19]. In [20], the authors proposed using representation learning and label propagation to infer Twitter user location. First, a heterogeneous connection relation graph is formed using connections between Twitter users and location-specific phrases, and non-geographic interactions are filtered out. Then the user vectors are learned from the connection relation graph [19]. Finally, using the vector representations, iterative label propagation predicts the positions of unknown users. The proposed technique can properly compute label propagation probabilities based on vector representations and increases location inference accuracy on two Twitter datasets. Existing research on microblog user geo-site is reviewed in [20], which provides a framework for geo-locating microblog users. This framework presents location-related data and an overview of microblogging services, and its essential phases are explained in depth [21]. Modern geo-site techniques fall into three categories: network-based, text-based, and multiview-based. The results showed that multiview-based approaches outperform text-based and network-based methods.

In this research, an Auto-Encoder-based multi-class classification algorithm is proposed to detect the geo-site of images. A dataset of around 125 countries is considered, consisting of pictures captured within the respective countries. An Auto-Encoder is used to reduce the dimensionality of the images, and a multi-class SVM classification algorithm [22] classifies the images according to their labelled classes. The proposed algorithm is compared with KNN [23] and the random forest algorithm [24] to validate its efficiency.

This paper is organized as follows: Section 2 deals with the literature review, Section 3 describes the methodology used, and Section 4 presents and interprets the results. Finally, Section 5 concludes the work.

2. Literature Review

Ref. [25] presents an Ultra Wide Band (UWB) multi-antenna array Step Frequency Radar (SFR) system, together with its signal processing, constructed for subterranean utility network identification, localization, and classification. Linear antenna arrays cover the road or survey channel at an ideal high speed for highways and small urban roads, and produce several parallel B-scans (radar image representations). The thesis was driven by government requirements for geomatics databases of subsurface utility networks and for placement guidelines to minimize damage during excavations. In France's NF-S70-003-2 standard, the three-dimensional position of class A sensitive subterranean networks is specified to 11.25 cm. This application requires a robust solution in a highly dynamic, complicated, unpredictable, dispersive subsurface environment. The authors divided the large objective into two subtasks: an SFR system prototype with data collection software, and signal processing algorithms for automated pipe identification and for depth and diameter estimation with Class A localization accuracy. Future signal processing approaches are expected to focus on machine learning and computer vision-based AI algorithms and on other physical signal processing techniques.

Ref. [26] proposes a hyperspectral image categorization algorithm. The method first extracts neighboring spatial regions using a suitable statistical support vector machine (SVM-Linear) architecture, SVM-RBF, and a Deep Learning (DL) architecture that includes principal component analysis (PCA) and convolutional neural networks (CNN), and then applies a softmax classifier. PCA reduces input image noise, high spectral dimensionality, and redundant information. The SVM-Linear, SVM-RBF, and CNN models automatically extract high-level features for hyperspectral image categorization. However, because the CNN and SVM models alone may fail to extract features at different scales and to tolerate the large-scale variance of image objects, the presented methodology uses PCA optimization for spatial regions to construct features that the SVM and CNN models can use to classify hyperspectral images. The Hyperspec-VNIR Chikusei dataset classification results reveal that the provided model performs comparably to other DL and classic machine-learning approaches. The SVM-RBF model has the greatest accuracy on the Hyperspec-VNIR Chikusei dataset, at 98.84%.

Ref. [27] notes that during the COVID-19 epidemic, cyber assaults on the healthcare, education, and financial sectors increased, drawing attention to distributed denial of service (DDoS) attacks. Virtualization, softwarization, and IoT devices enlarge the network attack surface and the effect of attacks. The work introduces an Auto-Encoder-based time-based anomaly detection method and investigates how various time windows affect the flow-based detection of numerous DDoS assaults. The Auto-Encoder achieves an anomaly detection F1-score of over 99% for most assaults and over 95% for all attacks on the CICDDoS2019 dataset.

Ref. [28] observes that, due to its potential for future epidemics and worldwide concerns, the new coronavirus (COVID-19) outbreak, which was found in late 2019, demands particular attention. Since Artificial Intelligence (AI) offers a new paradigm for healthcare, a variety of AI tools built upon Machine Learning (ML) algorithms are used for data analysis and decision-making processes, in addition to clinical procedures and treatments. This implies that AI-driven techniques assist in both forecasting the global spread of COVID-19 outbreaks and identifying them when they occur. Contrary to other healthcare challenges, COVID-19 detection requires AI-driven tools to use cross-population train/test models based on active learning that uses multitudinal and multimodal data, which is the main goal of the article.

Integrating several data sources could provide researchers with additional information for spotting anomalous COVID-19 trends. Therefore, the authors of [29] developed a Deep Neural Network (DNN) adapted to Convolutional Neural Networks (CNN) that can jointly train and assess CXRs and CT scans. The overall accuracy in their studies was 96.28% (AUC = 0.9808, false negative rate = 0.0208). To identify COVID-19-positive patients, major current DNNs also integrated CT scans and CXRs and produced coherent findings.

Table 2 summarizes the previous related works.

3. Methods

In this research, the Python 3 programming language is used for implementation [32]. A dataset containing 125 classes is considered for estimating the geo-site. These classes are the countries, and the data in each class are images of objects, monuments, roads, etc. [33] captured within that country. An Auto-Encoder with a binary classifier algorithm is proposed to estimate the location. As the dataset consists of high-dimensional images, their dimensionality is reduced using an Auto-Encoder. A multi-label SVM classifier [31] is then trained on the reduced images to identify their geo-location. The proposed multi-label SVM classifier is compared with other classification algorithms, namely KNN and Random Forest, to demonstrate its strength and validity; the overall workflow is depicted in Figure 1.

Because the data in this dataset is sparse and simple to categorize, SVM operates more quickly and produces better results. Although Random Forest and KNN also produce strong results, they fall short of SVM for this specific dataset. The intended result determines the algorithm to use.

Convolutional Auto-Encoder (CAE)

An Auto-Encoder is an unsupervised neural network that compresses the data into a smaller number of dimensions (the bottleneck layer, or code) and then decodes it to recover the original input. The compressed representation of the input is stored in the bottleneck layer. Because the aim is to recreate the input data, the number of output units in an Auto-Encoder must be identical to the number of input units. An Auto-Encoder commonly consists of an encoder and a decoder [32]. The encoder compresses the input data into a reduced dimension equal to the size of the bottleneck layer, while the decoder restores the original data.

As we add more layers to the encoder, the number of neurons in the layers will decrease, whilst the number of neurons in the layers of the decoder will increase. We will extract the bottleneck layer and utilize it to lower the dimensions when we employ Auto-Encoders for dimensionality reduction. This is referred to as feature extraction [33].
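The shrinking of the encoder toward the bottleneck can be illustrated with simple shape arithmetic. The sketch below is only an illustration under stated assumptions: the source does not give the layer configuration, so the four stride-2 stages and the filter counts (8, 8, 4, 4) are hypothetical, as are the helper names `encoder_shapes` and `n_units`. It starts from the 224 × 224 × 3 input size used in this work.

```python
def encoder_shapes(height, width, channels, filters):
    """Return the tensor shape after each encoder stage.

    Assumption: every stage is a stride-2 convolution with "same" padding,
    so height and width become ceil(size / 2) at each stage.
    """
    shapes = [(height, width, channels)]
    for n_filters in filters:
        height = (height + 1) // 2  # ceil division by 2
        width = (width + 1) // 2
        shapes.append((height, width, n_filters))
    return shapes


def n_units(shape):
    """Number of values (neurons) in a layer of the given shape."""
    h, w, c = shape
    return h * w * c


# Hypothetical filter counts, chosen only so that every stage is smaller
# than the previous one.
shapes = encoder_shapes(224, 224, 3, filters=[8, 8, 4, 4])
for shape in shapes:
    print(shape, n_units(shape))

bottleneck = shapes[-1]
```

With these assumed stages the input of 150,528 values is reduced to a 14 × 14 × 4 bottleneck of 784 values, which is the kind of compact feature vector a downstream classifier would receive.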

Dimensionality reduction lowers the number of dimensions in the data, either by removing less valuable elements [34] or by transforming the data into a smaller number of dimensions. Reducing the dimensionality helps to avoid overfitting and makes the procedure easier [35]. A multi-label SVM classifier is then trained on these dimensionally reduced images to detect the classes of the pictures, and thereby their geo-location.

In multi-label classification, zero or more class labels are predicted for each sample. Unlike classic classification tasks, where class labels are mutually exclusive, multi-label classification needs specialized machine learning algorithms that can predict multiple, mutually non-exclusive classes. SVM is one of the machine learning algorithms that can deal with non-exclusive classes [36]. The advantages and limitations of the proposed multi-label SVM algorithm are discussed below:

Pros:

o: It performs well when there is a distinct margin of separation.
o: It works well in high-dimensional spaces.
o: It works well in situations where there are more dimensions than samples.
o: It is also memory-efficient, since it only needs a small fraction of the training points for the decision function (known as support vectors).

Cons:

o: It does not work well on large data sets, since the required training time is longer.
o: It also does not perform well when the data set is noisy, i.e., when target classes overlap.

The algorithm is available as the SVC class of the Python scikit-learn library.
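As a rough illustration of how an SVM from scikit-learn can be applied to several classes, the following sketch trains a one-vs-rest `SVC` on synthetic data. Everything here is an assumption made for the example: the synthetic blobs stand in for bottleneck features, and the kernel, `C`, `gamma`, the one-vs-rest wrapper, and the 80/20 split are not taken from the source, which does not report its SVM settings.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic stand-in for encoded image features: 5 classes, 32 features each.
X, y = make_blobs(
    n_samples=500, centers=5, n_features=32, cluster_std=1.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# One binary SVM per class (one-vs-rest); hyperparameters are illustrative.
clf = OneVsRestClassifier(SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

test_accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

The accuracy printed here says nothing about the Geoguessr results; it only shows that the pipeline runs end to end on well-separated toy data.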

4. Results and Discussion

To perform the tests using the Geoguessr dataset, which holds street view map images from around the globe, we made use of Jupyter, a free online cloud-based notebook that helped in training our deep learning and machine learning methods on CPUs, GPUs, and TPUs.

4.1. Dataset Description

The Geoguessr dataset contains 126 subfolders in total, each of which contains the street views of its respective country. Some of them are visualized in Table 3.

4.2. Preprocessing

4.2.1. Scaling of the Image

In practice, the data is acquired from various sources. To resolve mismatches in image dimensions, all the images are resized to a 224 × 224 × 3 array.

4.2.2. Dimensionality Reduction

Dimensionality reduction [35,36,37] is a method for shrinking the feature space to build a stable [34] and statistically sound machine learning model while avoiding the curse of dimensionality. The Auto-Encoder is utilized to decrease the data's dimensionality in this study. The SVM classifier uses the data produced after dimensionality reduction as features.

4.2.3. Label Binarization

The data are associated with 126 classes whose labels are binarized, i.e., converted into 1's and 0's, to train the classifier.
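Label binarization can be sketched in a few lines of plain Python. This is a minimal illustration, assuming each image carries exactly one country label; the helper name `binarize_labels` and the three example countries are hypothetical and are not the dataset's actual class list.

```python
def binarize_labels(labels):
    """Convert class names into rows of 0's and 1's (one column per class)."""
    classes = sorted(set(labels))
    column = {name: i for i, name in enumerate(classes)}
    rows = []
    for name in labels:
        row = [0] * len(classes)
        row[column[name]] = 1  # mark the column of this sample's class
        rows.append(row)
    return classes, rows


# Hypothetical labels for four images.
classes, encoded = binarize_labels(["India", "Hungary", "Romania", "India"])
print(classes)
for row in encoded:
    print(row)
```

With the full dataset, each row would have one column per country folder instead of three.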

4.2.4. Train Test Split

To train, validate, and test the classifier, the data set is divided in the ratio 7:2:1. This data set is used to train the multi-label SVM classifier, and its performance is compared with the KNN and random forest classifiers using accuracy, specificity, sensitivity, and F1-score as metrics.
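A 7:2:1 split can be sketched as follows. This is only an illustration: the source does not say whether the split was shuffled or stratified by country, so the seeded shuffle and the helper name `split_7_2_1` are assumptions.

```python
import random


def split_7_2_1(items, seed=0):
    """Shuffle items and split them into train/validation/test in a 7:2:1 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    train_part = shuffled[:n_train]
    val_part = shuffled[n_train:n_train + n_val]
    test_part = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train_part, val_part, test_part


# 1000 hypothetical sample indices.
train_part, val_part, test_part = split_7_2_1(range(1000))
print(len(train_part), len(val_part), len(test_part))
```

For a dataset with very uneven class sizes, a per-class (stratified) split would be a reasonable alternative, so that every country appears in all three parts.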

4.3. Evaluation of the Proposed Algorithm

Figure 2 represents the performance parameter analysis of three chosen algorithms. The first group of bars depicts the accuracy of the algorithms.

Here, Auto-Encoder-SVM has an accuracy of 95.46%, Auto-Encoder-KNN has an accuracy of 89.07%, and Auto-Encoder-RF has an accuracy of 84.71%. From Figure 2a it can be seen that the proposed algorithm is more accurate than the other algorithms: it is 6.39 percentage points more accurate than Auto-Encoder-KNN, and Auto-Encoder-RF is 4.36 percentage points less accurate than Auto-Encoder-KNN.

The second group of bars (Figure 2b) depicts the sensitivity values of the three algorithms. The sensitivity of the proposed algorithm is 91.72%, that of KNN is 74.79%, and that of Random Forest is 71.61%. These values show that the proposed algorithm has a higher sensitivity than the other algorithms. The sensitivity of Auto-Encoder-KNN is 16.93 percentage points lower than that of the proposed algorithm, and that of Auto-Encoder-RF is 3.18 percentage points lower than that of Auto-Encoder-KNN.

The third group of bars (Figure 2c) shows the specificity values of the three algorithms. Ideally, the specificity should be equal to one. The specificity of the Auto-Encoder-SVM algorithm is 81.67%, that of the Auto-Encoder-KNN algorithm is 79.02%, and that of the Auto-Encoder-RF algorithm is 65.09%. The specificity of the proposed algorithm is therefore closer to one than that of the other two algorithms. The specificity of Auto-Encoder-KNN is 2.65 percentage points lower than that of the proposed algorithm, and that of Auto-Encoder-RF is 13.93 percentage points lower than that of Auto-Encoder-KNN.

From all these evaluations of the performance parameters of the three algorithms, it can be seen that the proposed Auto-Encoder-SVM algorithm is more accurate and effective than the other two algorithms, Auto-Encoder-KNN and Auto-Encoder-RF.
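For reference, the four evaluation parameters can be computed from confusion-matrix counts as in the sketch below. The standard binary definitions are used; the source does not state how the per-class values were aggregated over the countries, so this is not necessarily the exact procedure behind the reported figures. The counts in the example are made up for illustration.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and F1-score from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up counts for one class treated as "positive" against all others.
m = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

In a multi-class setting, these values are typically computed per class in a one-vs-rest fashion and then averaged.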

The numerical values of the performance metrics of the three algorithms are given in Table 4.

Table 4 also presents the F1-score of the three algorithms. The F1-score of Auto-Encoder-SVM is 0.9693, that of Auto-Encoder-KNN is 0.8934, and that of Auto-Encoder-RF is 0.8276. The proposed Auto-Encoder-SVM algorithm therefore has a higher F1-score than the other algorithms. The comparison of these F1-scores is shown in Figure 3.

The comparison of the selected algorithms' performance metrics with the proposed algorithm is shown in Figure 4. The improvement in accuracy of the SVM over KNN (7.17413) is smaller than over RF (12.6904); in terms of accuracy, the KNN method is therefore closer to the proposed approach. In terms of sensitivity, the improvement is 22.6367 over KNN and 28.0827 over RF, so KNN is again closer to the proposed method. In terms of specificity, KNN and the proposed algorithm are very similar (3.35358), whereas the RF algorithm differs significantly from the proposed algorithm (25.4724). Lastly, KNN also differs less from the proposed algorithm than RF does in terms of the F1-score (18.4245 vs. 23.0287).

The numerical values of the improvements of the proposed algorithm over the chosen algorithms are given in Table 5.

Table 5 shows the percentage improvement in the performance parameters obtained by comparing the proposed algorithm with the other chosen algorithms. The proposed Auto-Encoder-SVM algorithm is 7.17413% more accurate than Auto-Encoder-KNN and 12.6904% more accurate than Auto-Encoder-RF. In terms of sensitivity, Auto-Encoder-SVM has 22.6367% more sensitivity than Auto-Encoder-KNN and 28.0827% more sensitivity than Auto-Encoder-RF. The last column, specificity, shows that the proposed Auto-Encoder-SVM algorithm has 3.35358% more specificity than Auto-Encoder-KNN and 25.4724% more specificity than Auto-Encoder-RF. From all the above evaluations, it has been determined that the proposed Auto-Encoder-SVM algorithm is more accurate and effective than the other algorithms.
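The accuracy, sensitivity, and specificity improvements quoted above are consistent with a relative improvement, (proposed − baseline) / baseline × 100, applied to the percentages reported in Section 4.3. The sketch below reproduces them under that assumption; the formula is inferred from the numbers rather than stated in the source, and the helper name `relative_improvement` is hypothetical.

```python
# Percentages reported for the three algorithms.
reported = {
    "accuracy": {"SVM": 95.46, "KNN": 89.07, "RF": 84.71},
    "sensitivity": {"SVM": 91.72, "KNN": 74.79, "RF": 71.61},
    "specificity": {"SVM": 81.67, "KNN": 79.02, "RF": 65.09},
}


def relative_improvement(proposed, baseline):
    """Improvement of the proposed value over the baseline, in percent."""
    return (proposed - baseline) / baseline * 100.0


improvements = {
    metric: {
        baseline: relative_improvement(values["SVM"], values[baseline])
        for baseline in ("KNN", "RF")
    }
    for metric, values in reported.items()
}

for metric, row in improvements.items():
    print(metric, {name: f"{value:.6g}" for name, value in row.items()})
```

The same arithmetic is not applied to the F1-score here, because the F1 improvement figures are quoted separately in the text.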

5. Conclusions

In this work, Jupyter Notebook is used to implement the estimation of the geo-site of images. An image dataset covering 125 different countries is considered; the 125 countries are taken as 125 classes, and the images captured within each country are placed in the respective class. The Auto-Encoder-SVM classification is proposed to classify the geo-site images. The Auto-Encoder is trained with the images to reduce dimensionality, and a multi-label SVM classification algorithm is then trained on the reduced images to classify them according to their classes. The proposed algorithm is compared with the multi-label KNN and multi-label Random Forest classification algorithms, and the performance of the three algorithms is evaluated. In terms of accuracy, Auto-Encoder-SVM has an accuracy of 95.46%, Auto-Encoder-KNN has an accuracy of 89.07%, and Auto-Encoder-RF has an accuracy of 84.71%, which shows that the proposed algorithm is highly accurate. Other performance metrics, namely specificity, sensitivity, and F1-score, are also evaluated to demonstrate the effectiveness of the algorithms. From all the evaluations, it can be inferred that the proposed Auto-Encoder-SVM algorithm is more accurate and effective than the other chosen algorithms.

Author Contributions

Conceptualization, A.J., C.V. and N.K.; methodology, A.J., C.V. and N.K.; software, M.S.R., J.N.B. and G.S.; validation, M.S.R., J.N.B. and G.S.; formal analysis, M.S.R., J.N.B. and G.S.; investigation, A.J., C.V. and N.K.; resources, A.J., C.V. and N.K.; data curation, A.J., C.V. and N.K.; writing—original draft preparation, A.J., C.V., N.K., M.S.R., J.N.B. and G.S.; writing—review and editing, A.J., C.V., N.K., M.S.R., J.N.B. and G.S.; visualization, M.S.R., J.N.B. and G.S.; supervision, C.V. and J.N.B.; project administration, M.S.R. and G.S.; funding acquisition, M.S.R. and G.S. All authors have read and agreed to the published version of the manuscript.

Funding

The authors received no specific funding for this study.

Data Availability Statement

Not applicable.

Acknowledgments

This paper was partially supported by UEFISCDI Romania and MCI through BEIA projects AutoDecS, SOLID-B5G, T4ME2, DISAVIT, PIMEO-AI, AISTOR, MULTI-AI, ADRIATIC, Hydro3D, PREVENTION, DAFCC, EREMI, ADCATER, MUSEION, FinSESCo, iPREMAS, IPSUS, U-GARDEN, CREATE and by European Union's Horizon 2020 research and innovation program under grant agreements No. 101073932 (RITHMS). The results were obtained with the support of the Ministry of Investments and European Projects through the Human Capital Sectoral Operational Program 2014-2020, Contract no. 62461/03.06.2022, SMIS code 153735. This work is supported by Ministry of Research, Innovation, Digitization from Romania by the National Plan of R & D, Project PN 19 11, Subprogram 1.1. Institutional performance-Projects to finance excellence in RDI, Contract No. 19PFE/30.12.2021 and a grant of the National Center for Hydrogen and Fuel Cells (CNHPC) - Installations and Special Objectives of National Interest (IOSIN).

Conflicts of Interest

The authors declare that they have no conflict of interest to report regarding the present study.

References

Elgui, K.; Bianchia, P.; Portiera, F.; Issonb, O. Learning Methods for RSSI-based Geo-site: A Comparative Study. Pervasive Mob. Comput. 2020, 67, 101199.
Jain, A.; Kumar, A. Desmogging of still smoggy images using a novel channel prior. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 1161–1177.
Agrawal, S.; Sharma, A.; Bhatnagar, C.; Chauhan, D.S. Modelling and analysis of emitter geo-site using satellite tool kit. Def. Sci. J. 2020, 70, 440–447.
Tosi, J.; Taffoni, F.; Santacatterina, M.; Sannino, R.; Formica, D. Performance Evaluation of Bluetooth Low Energy: A Systematic Review. Sensors 2017, 17, 2898.
Hamrouni, L.; Kherfi, M.L.; Aiadi, O.; Benbelghit, A. Plant Leaves Recognition Based on a Hierarchical One-Class Learning Scheme with Convolutional Auto-Encoder and Siamese Neural Network. Symmetry 2021, 13, 1705.
Gouel, M.; Vermeulen, K.; Fourmaux, O.; Beverly, R. IP Geo-site Database Stability and Implications for Network Research. In Proceedings of the Network Traffic Measurement and Analysis Conference, Virtual Event, 14–15 September 2021; pp. 1–10.
Song, K.; Zhou, L.; Wang, H. Deep Coupling Recurrent Auto-Encoder with Multi-Modal EEG and EOG for Vigilance Estimation. Entropy 2021, 23, 1316.
El Kader, I.A.; Xu, G.; Shuai, Z.; Saminu, S.; Javaid, I.; Ahmad, I.S.; Kamhi, S. Brain Tumor Detection and Classification on MR Images by a Deep Wavelet Auto-Encoder Model. Diagnostics 2021, 11, 1589.
Battey, C.; Ralph, P.L.; Kern, A.D. Predicting geographic location from genetic variation with deep neural networks. eLife 2020, 9, e54507.
Mishra, P. Geo-site of Tweets with a BiLSTM Regression Model. In Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, Barcelona, Spain, 13 December 2020; pp. 283–289.
Sarasa-Cabezuelo, A. The Use of Geolocation to Manage Passenger Mobility between Airports and Cities. Computers 2020, 9, 73.
Umakantha, A.; Morina, R.; Cowley, B.R.; Snyder, A.C.; Smith, M.A.; Yu, B.M. Bridging neuronal correlations and dimensionality reduction. Neuron 2021, 109, 2740–2754.
Giese, D.; Noubir, G. Amazon echo dot or the reverberating secrets of IoT devices. In Proceedings of the 14th ACM Conference on Security and Privacy in Wireless and Mobile Networks, Abu Dhabi, United Arab Emirates, 28 June–1 July 2021; pp. 13–24.
Struminskaya, B.; Lugtig, P.; Keusch, F.; Höhne, J.K. Augmenting Surveys with Data from Sensors and Apps: Opportunities and Challenges. Soc. Sci. Comput. Rev. 2020, 12, 15–29.
Agrawal, N.; Jain, A.; Agarwal, A. Simulation of network on chip for 3D router architecture. Int. J. Recent Technol. Eng. 2019, 8, 58–62.
Peddada, A.V.; Hong, J. Geo-Location Estimation with Convolutional Neural Networks. 2016, pp. 1–9. Available online: http://cs231n.stanford.edu/reports/2015/pdfs/CS231N_Final_Report_amanivp_jamesh93.pdf (accessed on 1 November 2022).
Jain, A.; Kumar, A.; Sharma, S. Comparative Design and Analysis of Mesh, Torus and Ring NoC. Procedia Comput. Sci. 2015, 48, 330–337.
Yan, X.; Xu, Y.; She, D.; Zhang, W. Reliable Fault Diagnosis of Bearings Using an Optimized Stacked Variational Denoising Auto-Encoder. Entropy 2021, 24, 36.
Tian, H.; Zhang, M.; Luo, X.; Liu, F.; Qiao, Y. Twitter User Location Inference Based on Representation Learning and Label Propagation. In Proceedings of the Web Conference, Taipei, Taiwan, 20–24 April 2020.
Jain, A.; Gahlot, A.K.; Dwivedi, R.; Kumar, A.; Sharma, S.K. Fat Tree NoC Design and Synthesis. Intell. Commun. Control Devices 2018, 624, 1749–1756.
Luo, X.; Qiao, Y.; Li, C.; Ma, J.; Liu, Y. An overview of microblog user geolocation methods. Inf. Process. Manag. 2020, 57, 102375.
Ghai, D.; Gianey, H.; Jain, A.; Uppal, R.S. Quantum and dual-tree complex wavelet transform-based image watermarking. Int. J. Mod. Phys. B 2020, 34, 2050009.
Binbusayyis, A.; Vaiyapuri, T. Unsupervised deep learning approach for network intrusion detection combining convolutional Auto-Encoder and one-class SVM. Appl. Intell. 2021, 51, 7094–7108.
Yang, Y.; Whinston, A. Identifying Mislabeled Images in Supervised Learning Utilizing Auto-Encoder. In Proceedings of the Future Technologies Conference, Vancouver, BC, Canada, 5–6 November 2020; pp. 266–282.
Li, X.; Chen, W.; Zhang, Q.; Wu, L. Building Auto-Encoder Intrusion Detection System based on random forest feature selection. Comput. Secur. 2020, 95, 101851.
Han, B.; Cook, P.; Baldwin, T. Geo-site prediction in social media data by finding location indicative words. In Proceedings of the 24th International Conference on Computational Linguistics (COLING), Mumbai, India, 8–15 December 2012; pp. 1045–1062.
Miura, Y. A Simple Scalable Neural Networks based Model for Geo-site Prediction in Twitter. In Proceedings of the 2nd Workshop on Noisy User-Generated Text (WNUT), Osaka, Japan, 5–16 January 2016; pp. 235–239.
Alizadeh, J.; Bogdan, M.; Classen, J.; Fricke, C. Support Vector Machine Classifiers Show High Generalizability in Automatic Fall Detection in Older Adults. Sensors 2021, 21, 7166.
Shetty, A.; Gao, G.X. UAV pose estimation using cross-view geolocalization with satellite imagery. In Proceedings of the International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 1827–1833.
Zola, P.; Ragno, C.; Cortez, P. A Google Trends spatial clustering approach for a worldwide Twitter user geolocation. Inf. Process. Manag. 2020, 57, 102312.
Weyand, T.; Kostrikov, I.; Philbin, J. PlaNet - Photo Geo-site with Convolutional Neural Networks. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 37–55.
Espadoto, M.; Hirata, N.S.T.; Telea, A.C. Self-supervised dimensionality reduction with neural networks and pseudo-labeling. In Proceedings of the 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP), Okaska, Japan, 15–16 January 2021; pp. 27–37.
Hurtik, P.; Molek, V.; Perfilieva, I. Novel dimensionality reduction approach for unsupervised learning on small datasets. Pattern Recognit. 2020, 103, 107291.
Ferreira, C.B.R.; Soares, F.A.A.M.N.; Pedrini, H.; Bruce, N.; Ferreira, W.D.; Da Cruz, G. A Study of Dimensionality Reduction Impact on an Approach to People Detection in Gigapixel Images. Can. J. Electr. Comput. Eng. 2020, 43, 122–128.
Jain, A.; Singh, J.; Kumar, S.; Florin-Emilian, Ț.; Candin, M.T.; Chithaluru, P. Improved Recurrent Neural Network Schema for Validating Digital Signatures in VANET. Mathematics 2022, 10, 3895.
Wang, D.; Zhang, S. Unsupervised Person Re-Identification via Multi-Label Classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10978–10987.
Becker, M.; Lippel, J.; Stuhlsatz, A.; Zielke, T. Robust dimensionality reduction for data visualization with deep neural networks. Graph. Model. 2020, 108, 101060.

Figure 1. Schematic Representation of the Auto-Encoder.

Figure 2. Flow chart of the proposed algorithm. (a) Auto-Encoder and KNN. (b) Auto-Encoder and SVM. (c) Auto-Encoder and RF.

Figure 3. Performance parameters of the algorithms.

Figure 4. Improvement in performance parameters between algorithms.

Table 1. Comparisons of different types of geo-locations.

	Bluetooth Low Energy	Network-Based Location	Wi-Fi	GPS
Environment	Indoor/outdoor	Indoor/outdoor	Indoor/outdoor	Outdoor only
Accuracy dependency	Infrastructure	Everywhere	Availability of network	Up to five meters
Infrastructure	Required	Not required	Not required	Not required
Accuracy in positioning	Less than 25 m	Up to 5 km	Less than 25 m	5 m

Table 2. A summary of the most relevant related work.

Reference	Features	Database	Findings	Observations	Shortcomings
[15]	Recursive neural network	Washington RGBD dataset	Presents the primary outcome of a learning-based method that uses convolutional neural networks (CNNs) to approximate the GPS position of a drone [28].	High accuracy	Large training datasets may be required, resulting in a protracted training procedure.
[16]	Convolutional neural network	Magnetic field dataset	Presents an indoor location estimator whose input data come from the magnetic field of the rooms, which has been shown to be quasi-stationary and unique [29].	Better accuracy	Fitness on the blind test still needs to improve without risking overfitting, and longer training to update the neural network weights remains to be evaluated. The model should also be tested in varied situations and setups to see how the results behave.
[30]	Convolutional neural network	Unmanned aerial vehicle multispectral images	Proposes a convolutional neural network (CNN) approach for estimating the number of citrus trees in densely planted orchards from UAV multispectral images.	Better accuracy	Increasing the number of stages raises the computational cost, so the speed/accuracy trade-off must be examined when choosing the number of stages [30].
[18]	Twitter user location inference method	Twitter datasets	Presents a Twitter user location inference method based on representation learning and label propagation.	More accurate	The method often predicts user coordinates from the data of a single social network rather than from several social networks.
[20]	Geo-localization methods	Real-world dataset	Provides location-related information together with an analysis of micro-blogging platforms [31].	Good results	The geo-localization model is not dynamically updated, so it cannot capture and reflect a change in the user's home location.
[22]	One-dimensional convolutional Auto-Encoder (1D CAE) and a one-class support vector machine (OCSVM)	NSL-KDD and UNSW-NB15 datasets	Proposes intrusion detection using an unsupervised deep learning method.	Promising results	Longer training time than the other ablations.

Table 3. Samples of the dataset.

Sample street map images of Aland
Sample street map images of Canada
Sample street map images of the United States
Sample street map images of India
Sample street map images of Thailand

Table 4. Performance parameters of algorithms.

Models	Accuracy (%)	Sensitivity (%)	Specificity (%)	F1-Score
Auto-Encoder-SVM	95.46	91.72	81.67	0.9693
Auto-Encoder-KNN	89.07	74.79	79.02	0.8934
Auto-Encoder-RF	84.71	71.61	65.09	0.8276

Table 5. Improvement in accuracy, sensitivity, and specificity, in percent.

Models	Accuracy (%)	Sensitivity (%)	Specificity (%)
Auto-Encoder-SVM vs. Auto-Encoder-KNN	7.17413	22.6367	3.35358
Auto-Encoder-SVM vs. Auto-Encoder-RF	12.6904	28.0827	25.4724
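
The values in Table 5 are consistent with a relative gain of the Auto-Encoder-SVM over each baseline, computed from Table 4 as (SVM value - baseline value) / baseline value × 100. The following is a minimal Python sketch under that assumption. The formula is inferred from the tabulated numbers rather than stated with the tables, and the names `TABLE_4`, `relative_improvement`, and `improvement_table` are illustrative only.

```python
# Metric values copied from Table 4 (all in percent).
TABLE_4 = {
    "Auto-Encoder-SVM": {"accuracy": 95.46, "sensitivity": 91.72, "specificity": 81.67},
    "Auto-Encoder-KNN": {"accuracy": 89.07, "sensitivity": 74.79, "specificity": 79.02},
    "Auto-Encoder-RF": {"accuracy": 84.71, "sensitivity": 71.61, "specificity": 65.09},
}


def relative_improvement(proposed, baseline):
    """Percentage gain of `proposed` over `baseline` (assumed formula)."""
    return (proposed - baseline) / baseline * 100.0


def improvement_table(scores, proposed="Auto-Encoder-SVM"):
    """Relative improvement of the proposed model over every other model."""
    best = scores[proposed]
    return {
        name: {
            metric: relative_improvement(best[metric], value)
            for metric, value in metrics.items()
        }
        for name, metrics in scores.items()
        if name != proposed
    }


if __name__ == "__main__":
    for name, gains in improvement_table(TABLE_4).items():
        row = ", ".join(f"{metric}: {gain:.4f}" for metric, gain in gains.items())
        print(f"Auto-Encoder-SVM vs. {name}: {row}")
```

Because the gain is expressed relative to the baseline, the 6.39-point accuracy difference between the SVM and KNN variants appears as roughly 7.17% in Table 5 rather than 6.39.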

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Jain, A.; Verma, C.; Kumar, N.; Raboaca, M.S.; Baliya, J.N.; Suciu, G. Image Geo-Site Estimation Using Convolutional Auto-Encoder and Multi-Label Support Vector Machine. Information 2023, 14, 29. https://doi.org/10.3390/info14010029

