Urban Multi-Category Object Detection Using Aerial Images

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Urban Remote Sensing".

Deadline for manuscript submissions: closed (1 July 2022) | Viewed by 30397

Special Issue Editor


Dr. Jan Platoš
Guest Editor
Department of Computer Science, VSB-Technical University of Ostrava, Ostrava, Czech Republic
Interests: machine learning; data compression; data mining; optimization

Special Issue Information

Dear Colleagues,

The detection of urban objects from aerial images has become a prevalent and useful task, as aerial images may be used for surveillance, tracking, mapping, or search and rescue operations. Moreover, the prices of satellite and aerial images have decreased significantly in the past few years. With UAVs and drones now available for real-time monitoring of a defined area or for exploratory flights, precise detection of the captured objects is required, as many kinds of objects may be present in a picture simultaneously and must be detected and classified.

Many approaches that utilize deep neural networks have been developed recently. Many so-called standard algorithms based on convolutional, residual, or recurrent networks have been modified to fulfill the task. Nevertheless, new architectures that can deal with noisy images, complex backgrounds, and complex environments are still required. The transformer approach, which has become immensely popular in text processing, is also auspicious in image processing and may lead to revolutionary results. Moreover, tracking objects across a set of consecutive images requires modified algorithms with memory that are resistant to camera movement and to the partial or complete occlusion of objects by one another. However, everything starts with an excellent image preprocessing phase that deals with different lighting conditions, day- and night-time, weather conditions, and other aspects.

Any high-quality, novel, and efficient approaches that address any or all aspects of urban multi-object detection from aerial images are welcome.

Dr. Jan Platoš
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Aerial images
  • Urban object classification
  • Aerial image preprocessing
  • Object detection
  • Aerial image enhancement
  • Multi-object localization
  • Multi-object detection and classification
  • Weather condition resistance

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (9 papers)


Research


17 pages, 23251 KiB  
Article
Built-Up Area Change Detection Using Multi-Task Network with Object-Level Refinement
by Song Gao, Wangbin Li, Kaimin Sun, Jinjiang Wei, Yepei Chen and Xuan Wang
Remote Sens. 2022, 14(4), 957; https://doi.org/10.3390/rs14040957 - 16 Feb 2022
Cited by 11 | Viewed by 2703
Abstract
The detection and monitoring of changes in urban buildings, which are a major site of human activity, have been studied extensively in the field of remote sensing. In recent years, compared with traditional methods, deep learning-based methods have become the mainstream for urban building change detection due to their strong learning ability and robustness. Unfortunately, it is often difficult and costly to obtain sufficient samples for developing a change detection method, which limits the practical application of deep learning-based building change detection. In our work, we proposed a novel multi-task network based on the idea of transfer learning, which is less dependent on change detection samples thanks to an appropriate selection of high-dimensional features for sharing and a unique decoding module. Different from other multi-task change detection networks, with the help of a high-accuracy building mask, our network can fully utilize the prior information from the building detection branches and further improve the change detection result through the proposed object-level refinement algorithm. To evaluate the proposed method, experiments were conducted on the publicly available WHU Building Change Dataset. The experimental results show that the proposed method achieves F1 values of 0.8939, 0.9037, and 0.9212 when 10%, 25%, and 50% of the change detection training samples, respectively, are used for network training under the same conditions, thus outperforming other methods.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
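
As a rough illustration of the evaluation reported above (not the authors' code), the sketch below computes precision, recall, and F1 for a binary change mask against a ground-truth mask; the random arrays stand in for real prediction and label rasters.

    import numpy as np

    def change_detection_f1(pred: np.ndarray, truth: np.ndarray):
        """pred, truth: boolean arrays of the same shape (True = changed)."""
        tp = np.logical_and(pred, truth).sum()
        fp = np.logical_and(pred, ~truth).sum()
        fn = np.logical_and(~pred, truth).sum()
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1

    # Placeholder masks; in practice these would be the predicted and
    # reference change rasters of the test tiles.
    rng = np.random.default_rng(0)
    pred = rng.random((256, 256)) > 0.7
    truth = rng.random((256, 256)) > 0.7
    print(change_detection_f1(pred, truth))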

20 pages, 5744 KiB  
Article
Location and Extraction of Telegraph Poles from Image Matching-Based Point Clouds
by Jingru Wang, Cheng Wang, Xiaohuan Xi, Pu Wang, Meng Du and Sheng Nie
Remote Sens. 2022, 14(3), 433; https://doi.org/10.3390/rs14030433 - 18 Jan 2022
Cited by 5 | Viewed by 2296
Abstract
The monitoring of telegraph poles, as essential features supporting overhead distribution network lines, is the primary subject of this work. This paper proposes a method for locating and extracting telegraph poles from an image matching-based point cloud. Firstly, the pole point cloud is extracted using a planar grid segmentation clustering algorithm and a region-growing connected component analysis algorithm, exploiting the isolated, ground-perpendicular character of the poles. Secondly, candidate telegraph poles are located based on the suspension points within a buffer, considering that the top of a pole is connected to the power suspension line. Thirdly, the horizontal projection of the backbone area is used to eliminate the interference of vegetation in the buffer area. Finally, the point cloud of each telegraph pole is extracted with the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The experimental results demonstrate that the average Recall, Precision, and F1-score in telegraph pole detection reach 91.09%, 90.82%, and 90.90%, respectively. The average RMSE of the location deviation is 0.51 m. The average F1-score of telegraph pole extraction is 91.83%, and the average extraction time of a single pole is 0.27 s. Accordingly, the method adapts well to areas with lush vegetation, can automatically locate and extract telegraph pole point clouds with high accuracy, and still achieves very high accuracy even when there are holes in the data.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
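
For readers unfamiliar with the final clustering step, here is a minimal sketch of isolating a pole-like cluster with scikit-learn's DBSCAN; the eps and min_samples values and the synthetic points are assumptions for illustration, not the parameters or data used in the paper.

    import numpy as np
    from sklearn.cluster import DBSCAN

    def cluster_pole_points(points_xyz, eps=0.3, min_samples=20):
        """points_xyz: (N, 3) candidate points around one located pole."""
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xyz)
        valid = labels[labels >= 0]                # label -1 marks noise points
        if valid.size == 0:
            return np.empty((0, 3))
        pole_label = np.bincount(valid).argmax()   # keep the largest cluster
        return points_xyz[labels == pole_label]

    # Synthetic example: a thin vertical "pole" plus scattered noise.
    rng = np.random.default_rng(1)
    pole = np.column_stack([rng.normal(0, 0.05, 500),
                            rng.normal(0, 0.05, 500),
                            rng.uniform(0, 8, 500)])
    noise = rng.uniform(-5, 5, (200, 3))
    print(cluster_pole_points(np.vstack([pole, noise])).shape)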

24 pages, 2397 KiB  
Article
Selecting Post-Processing Schemes for Accurate Detection of Small Objects in Low-Resolution Wide-Area Aerial Imagery
by Xin Gao, Sundaresh Ram, Rohit C. Philip, Jeffrey J. Rodríguez, Jeno Szep, Sicong Shao, Pratik Satam, Jesús Pacheco and Salim Hariri
Remote Sens. 2022, 14(2), 255; https://doi.org/10.3390/rs14020255 - 6 Jan 2022
Cited by 9 | Viewed by 2732
Abstract
In low-resolution wide-area aerial imagery, object detection algorithms fall into feature extraction and machine learning approaches, where the former often requires a post-processing scheme to reduce false detections and the latter demands multi-stage learning followed by post-processing. In this paper, we present an approach for selecting post-processing schemes for aerial object detection. We evaluated the combinations of each of ten vehicle detection algorithms with each of seven post-processing schemes, and determined the best three schemes for each algorithm using the average F-score. The performance improvement is quantified using basic information retrieval metrics as well as the classification of events, activities and relationships (CLEAR) metrics. We also implemented a two-stage learning algorithm using a hundred-layer densely connected convolutional neural network for small object detection and evaluated its degree of improvement when combined with the various post-processing schemes. The highest average F-scores after post-processing are 0.902, 0.704, and 0.891 for the Tucson, Phoenix, and online VEDAI datasets, respectively. The combined results show that our enhanced three-stage post-processing scheme achieves a mean average precision (mAP) of 63.9% for the feature extraction methods and 82.8% for the machine learning approach.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
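
The seven post-processing schemes compared in the paper are not reproduced here; as a generic illustration of what such a filtering step looks like, the sketch below applies plain non-maximum suppression to scored detections, with made-up boxes, scores, and threshold.

    import numpy as np

    def iou(a, b):
        """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
        return inter / (union + 1e-9)

    def nms(boxes, scores, iou_thresh=0.5):
        """Greedily keep the highest-scoring box and drop overlapping ones."""
        order = np.argsort(scores)[::-1]
        keep = []
        while order.size:
            best = order[0]
            keep.append(int(best))
            order = np.array([i for i in order[1:] if iou(boxes[best], boxes[i]) < iou_thresh])
        return keep

    boxes = np.array([[10, 10, 30, 30], [12, 12, 32, 32], [100, 100, 120, 120]], float)
    scores = np.array([0.9, 0.6, 0.8])
    print(nms(boxes, scores))   # -> [0, 2]: the near-duplicate of box 0 is suppressed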

19 pages, 11641 KiB  
Article
An Automatic Conflict Detection Framework for Urban Intersections Based on an Improved Time Difference to Collision Indicator
by Qing Li, Zhanzhan Lei, Jiasong Zhu, Jiaxin Chen and Tianzhu Ma
Remote Sens. 2021, 13(24), 4994; https://doi.org/10.3390/rs13244994 - 8 Dec 2021
Cited by 1 | Viewed by 2582
Abstract
Urban road intersections are one of the key components of road networks. Due to complex and diverse traffic conditions, traffic conflicts occur there frequently. Accurate traffic conflict detection makes it possible to improve traffic conditions and decrease the probability of traffic accidents. Many time-based conflict indicators have been widely studied, but they ignore the sizes of the vehicles, which is a very important factor for conflict detection at urban intersections. Therefore, in this paper we propose a novel time-difference conflict indicator that incorporates vehicle sizes instead of viewing vehicles as particles. Specifically, we designed an automatic framework for recognizing conflicts between vehicles at urban intersections. The vehicle sizes are automatically extracted with a sparse recurrent convolutional neural network, and the vehicle trajectories are obtained with a fast tracking algorithm based on the intersection-over-union ratio. Given the tracked vehicles, we improved the time-difference-to-collision metric by incorporating vehicle size information. Extensive experiments demonstrate that the proposed framework can accurately recognize vehicle conflicts.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
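
The indicator defined in the paper is not reproduced here; the sketch below only illustrates the underlying idea of accounting for vehicle length, rather than treating vehicles as particles, when comparing arrival windows at a conflict point. All functions and numbers are hypothetical.

    def occupancy_interval(dist_to_conflict, speed, length):
        """Time window [enter, leave] during which the vehicle body occupies
        the conflict point, given the distance of its front to the point (m),
        its speed (m/s), and its length (m)."""
        return dist_to_conflict / speed, (dist_to_conflict + length) / speed

    def time_difference(veh_a, veh_b):
        """Each vehicle is a (distance, speed, length) tuple. Returns the gap
        between the two occupancy windows; a gap <= 0 means the windows
        overlap, i.e. a potential conflict."""
        a_in, a_out = occupancy_interval(*veh_a)
        b_in, b_out = occupancy_interval(*veh_b)
        return max(a_in, b_in) - min(a_out, b_out)

    # A 4.5 m car 20 m from the conflict point at 10 m/s vs. a 12 m bus
    # 30 m away at 12 m/s; as particles their arrival times differ by 0.5 s.
    gap = time_difference((20.0, 10.0, 4.5), (30.0, 12.0, 12.0))
    print("potential conflict" if gap <= 0 else f"safe gap of {gap:.2f} s")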

24 pages, 3116 KiB  
Article
Street-Level Image Localization Based on Building-Aware Features via Patch-Region Retrieval under Metropolitan-Scale
by Lanyue Zhi, Zhifeng Xiao, Yonggang Qiang and Linjun Qian
Remote Sens. 2021, 13(23), 4876; https://doi.org/10.3390/rs13234876 - 1 Dec 2021
Cited by 3 | Viewed by 2511
Abstract
The aim of image-based localization (IBL) is to determine the real-world location of a query image by matching it against GNSS-tagged reference images in a database. Popular IBL methods commonly use street-level images, which have high value in practical applications. Using street-level images to tackle the IBL task poses two primary challenges: existing works have not been optimized specifically for urban IBL tasks, and the matching result is over-reliant on the quality of image features. Methods should also address their practicality and robustness in engineering applications at metropolitan scale. In response, this paper makes the following contributions. Firstly, given the critical role of buildings in distinguishing urban scenes, we contribute a feature called the Building-Aware Feature (BAF). Secondly, in view of the negative influence of complex urban scenes on the retrieval process, we propose a retrieval method called Patch-Region Retrieval (PRR). To prove the effectiveness of BAF and PRR, we established an image-based localization experimental framework. Experiments show that BAF retains the feature points that fall on buildings and selectively discards the feature points that fall on other objects; while this effectively compresses the storage size of the feature index, it also improves the recall of the localization results. Implemented in the geometric verification stage, PRR compares the matching results of regional features and selects the best ranking as the final result, enhancing the effectiveness of patch-regional features. In addition, we fully confirmed the superiority of the proposed methods on a metropolitan-scale street-level image dataset.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
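
As a loose illustration of the building-aware idea (not the BAF implementation), the sketch below keeps only the local feature points that fall on building pixels of a segmentation mask; the mask and keypoints are synthetic placeholders.

    import numpy as np

    def filter_building_keypoints(keypoints_xy, building_mask):
        """keypoints_xy: (N, 2) array of (x, y) pixel coordinates.
        building_mask: (H, W) boolean array, True where a building is present."""
        xs = keypoints_xy[:, 0].round().astype(int)
        ys = keypoints_xy[:, 1].round().astype(int)
        h, w = building_mask.shape
        inside = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
        keep = np.zeros(len(keypoints_xy), dtype=bool)
        keep[inside] = building_mask[ys[inside], xs[inside]]
        return keypoints_xy[keep]

    mask = np.zeros((240, 320), dtype=bool)
    mask[40:200, 60:260] = True                  # pretend this block is a building
    pts = np.array([[100.2, 80.7], [10.0, 10.0], [270.0, 120.0]])
    print(filter_building_keypoints(pts, mask))  # only the first point survives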

19 pages, 10525 KiB  
Article
Moving Car Recognition and Removal for 3D Urban Modelling Using Oblique Images
by Chong Yang, Fan Zhang, Yunlong Gao, Zhu Mao, Liang Li and Xianfeng Huang
Remote Sens. 2021, 13(17), 3458; https://doi.org/10.3390/rs13173458 - 31 Aug 2021
Cited by 12 | Viewed by 4272
Abstract
With the progress of photogrammetry and computer vision technology, three-dimensional (3D) reconstruction using aerial oblique images has been widely applied in urban modelling and smart city applications. However, state-of-the-art image-based automatic 3D reconstruction methods cannot effectively handle the unavoidable geometric deformation and incorrect texture mapping caused by moving cars in a city. This paper proposes a method to address this situation and prevent the influence of moving cars on 3D modelling by recognizing moving cars and combining the recognition results with a photogrammetric 3D modelling procedure. Through car detection with a deep learning method and multiview geometry constraints, we analyse the movement state of each car and apply an appropriate preprocessing method to the geometric model generation and texture mapping steps of the 3D reconstruction pipeline. First, we apply the standard Mask R-CNN object detection method to detect cars in the oblique images. Then, a detected car and its corresponding image patches, calculated from the geometry constraints in the other views, are used to identify the moving state of the car. Finally, the geometry and texture information corresponding to the moving car is processed according to its moving state. Experiments on three different urban datasets demonstrate that the proposed method is effective in recognizing and removing moving cars and can repair the geometric deformation and erroneous texture mapping caused by moving cars. In addition, the proposed methods can be applied to eliminate other moving objects in 3D modelling applications.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
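
A minimal sketch of the first step, car detection with an off-the-shelf Mask R-CNN from torchvision, is given below; the pretrained COCO weights, score threshold, and image path are assumptions, and the paper's training data and multiview-geometry analysis of the moving state are not reproduced.

    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor
    from PIL import Image

    CAR_CLASS_ID = 3  # "car" in the COCO label map used by torchvision detection models

    # Requires torchvision >= 0.13; older versions use pretrained=True instead.
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    def detect_cars(image_path, score_thresh=0.7):
        image = to_tensor(Image.open(image_path).convert("RGB"))
        with torch.no_grad():
            output = model([image])[0]
        keep = (output["labels"] == CAR_CLASS_ID) & (output["scores"] > score_thresh)
        return output["boxes"][keep], output["masks"][keep]

    # boxes, masks = detect_cars("oblique_view.jpg")   # path is a placeholder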

28 pages, 41409 KiB  
Article
An Inverse Node Graph-Based Method for the Urban Scene Segmentation of 3D Point Clouds
by Bufan Zhao, Xianghong Hua, Kegen Yu, Xiaoxing He, Weixing Xue, Qiqi Li, Hanwen Qi, Lujie Zou and Cheng Li
Remote Sens. 2021, 13(15), 3021; https://doi.org/10.3390/rs13153021 - 1 Aug 2021
Viewed by 2230
Abstract
Urban object segmentation and classification are critical data processing steps in scene understanding, intelligent vehicles, and 3D high-precision maps. Semantic segmentation of 3D point clouds is the foundational step in object recognition. To identify intersecting objects and improve the accuracy of classification, this paper proposes a segment-based classification method for 3D point clouds. The method first divides the points into multi-scale supervoxels and groups them through the proposed inverse node graph (IN-Graph) construction, which does not require prior information about the nodes; instead, supervoxels are grouped by judging the connection state of the edges between them. The method reaches a global energy minimum by graph cutting, obtains structural segments as completely as possible, and retains boundaries at the same time. A random forest classifier is then utilized for supervised classification. To deal with the mislabeling of scattered fragments, a higher-order CRF with small-label-cluster optimization is proposed to refine the classification results. Experiments were carried out on a mobile laser scanning (MLS) point dataset and a terrestrial laser scanning (TLS) point dataset, and the results show that overall accuracies of 97.57% and 96.39% were obtained on the two datasets. The boundaries of objects were retained well, and the method achieved good results in the classification of cars and motorcycles. Further experimental analyses verified the advantages of the proposed method and proved its practicability and versatility.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
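
As a generic illustration of the supervised stage only (not the paper's feature set), the sketch below trains a scikit-learn random forest on synthetic per-segment feature vectors and reports overall accuracy.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(42)
    segment_features = rng.normal(size=(1000, 12))   # placeholder geometric/height descriptors per segment
    segment_labels = rng.integers(0, 5, size=1000)   # placeholder labels for 5 urban classes

    X_train, X_test, y_train, y_test = train_test_split(
        segment_features, segment_labels, test_size=0.3, random_state=0)

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_train, y_train)
    print(f"overall accuracy: {clf.score(X_test, y_test):.3f}")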

17 pages, 3141 KiB  
Article
PointNet++ Network Architecture with Individual Point Level and Global Features on Centroid for ALS Point Cloud Classification
by Yang Chen, Guanlan Liu, Yaming Xu, Pai Pan and Yin Xing
Remote Sens. 2021, 13(3), 472; https://doi.org/10.3390/rs13030472 - 29 Jan 2021
Cited by 48 | Viewed by 7440
Abstract
Airborne laser scanning (ALS) point clouds have been widely used in ground powerline surveying, forest monitoring, urban modeling, and other fields because of the great convenience they bring to people's daily lives. However, the sparsity and uneven distribution of point clouds increase the difficulty of setting uniform parameters for semantic classification. The PointNet++ network is an end-to-end learning network for irregular point data that is highly robust to small perturbations and corruption of the input points. It eliminates the need to calculate costly handcrafted features and provides a new paradigm for 3D understanding. However, each local region in its output is abstracted by its centroid and a local feature that encodes the centroid's neighborhood. Because of random sampling, the feature learned at the centroid point may not contain relevant information about the centroid itself, especially in large-scale neighborhood balls. Moreover, the centroid point's global-level information is not marked in each sampling layer. Therefore, this study proposes a modified PointNet++ network architecture that adds the centroid's own point-level and global features to the local features to facilitate classification. The proposed approach also utilizes a modified focal loss function to handle the extremely uneven category distribution of ALS point clouds. An elevation- and distance-based interpolation method is also proposed for objects in ALS point clouds that exhibit discrepancies in elevation distribution. Experiments on the Vaihingen dataset of the International Society for Photogrammetry and Remote Sensing and the GML(B) 3D dataset demonstrate that the proposed method, which provides additional contextual information to support classification, achieves high accuracy with simple discriminative models and new state-of-the-art performance in the power line category.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
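
The paper's modified focal loss is not reproduced here; the sketch below shows a standard multi-class focal loss in PyTorch, the kind of imbalance-aware objective the abstract refers to, with an arbitrary gamma and random inputs.

    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, gamma=2.0):
        """logits: (N, C) raw scores; targets: (N,) integer class labels."""
        log_probs = F.log_softmax(logits, dim=-1)
        ce = F.nll_loss(log_probs, targets, reduction="none")   # per-point cross-entropy
        pt = torch.exp(-ce)                                     # probability of the true class
        return ((1.0 - pt) ** gamma * ce).mean()

    # Random per-point predictions for 6 classes, standing in for network output.
    logits = torch.randn(4096, 6)
    targets = torch.randint(0, 6, (4096,))
    print(focal_loss(logits, targets))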

Other


14 pages, 5128 KiB  
Technical Note
Small-Sized Vehicle Detection in Remote Sensing Image Based on Keypoint Detection
by Lijian Yu, Xiyang Zhi, Jianming Hu, Shikai Jiang, Wei Zhang and Wenbin Chen
Remote Sens. 2021, 13(21), 4442; https://doi.org/10.3390/rs13214442 - 4 Nov 2021
Cited by 2 | Viewed by 2400
Abstract
Vehicle detection in remote sensing images is a challenging task due to the small size of the objects and the interference of complex backgrounds. Traditional methods require a large number of anchor boxes, and the intersection rate between these anchor boxes and an object's real bounding boxes needs to be high enough. Moreover, the size and aspect ratio of each anchor box need to be designed manually, and more anchor boxes need to be set for small objects. To solve these problems, we regard a small object as a keypoint on the relevant background and propose an anchor-free vehicle detection network (AVD-kpNet) to robustly detect small-sized vehicles in remote sensing images. The AVD-kpNet framework fuses features across layers with a deep layer aggregation architecture, preserving the fine features of small objects. First, considering the correlation between an object and its surrounding background, a 2D Gaussian distribution strategy is adopted to describe the ground truth, instead of a hard-label approach. Moreover, we redesign the corresponding focal loss function. Experimental results demonstrate that our method achieves higher accuracy on the small-sized vehicle detection task in remote sensing images compared with several advanced methods.
(This article belongs to the Special Issue Urban Multi-Category Object Detection Using Aerial Images)
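
As an illustration of the soft-label idea (not the paper's exact formulation), the sketch below builds a 2D Gaussian ground-truth heatmap around a keypoint location; the sigma value is an arbitrary assumption.

    import numpy as np

    def gaussian_heatmap(height, width, center_xy, sigma=2.0):
        """Returns an (H, W) map that peaks at 1.0 on the keypoint location."""
        xs = np.arange(width)
        ys = np.arange(height)[:, None]
        cx, cy = center_xy
        return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

    heatmap = gaussian_heatmap(128, 128, center_xy=(40, 70))
    print(heatmap.shape, round(heatmap[70, 40], 3))   # peak of 1.0 at row 70, column 40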
