Segment-before-Detect: Vehicle Detection and Classification through Semantic Segmentation of Aerial Images
1. Introduction
2. Related Work
3. Proposed Method
- Semantic segmentation to infer pixel-level class masks using a fully convolutional network;
- Vehicle detection by regressing the bounding boxes of connected components;
- Object-level classification with a traditional convolutional neural network.
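The second step of the list above can be illustrated with a minimal sketch. It assumes the segmentation output has already been reduced to a binary vehicle mask, uses 4-connectivity, and takes the tight axis-aligned extent of each connected component as its box. The source does not specify how boxes are regressed, so the function name `component_boxes`, the `min_area` parameter, and the toy mask are illustrative assumptions, not the authors' implementation.

```python
from collections import deque


def component_boxes(mask, min_area=1):
    """Return (row_min, col_min, row_max, col_max) for each 4-connected
    component of truthy pixels that has at least `min_area` pixels.

    Hypothetical helper: connectivity and box definition are assumptions.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited vehicle pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((r0, c0, r1, c1))
    return boxes


# Toy mask: one 2x2 blob, two isolated pixels, one 1x3 blob.
vehicle_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 1],
]
print(component_boxes(vehicle_mask))
print(component_boxes(vehicle_mask, min_area=3))
```

Each returned box could then be cropped from the image and passed to the classifier of the third step. Raising `min_area` discards components too small to be plausible vehicles.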
3.1. SegNet for Semantic Segmentation
3.2. Small Object Detection
3.3. CNN-Based Vehicle Classification
4. Experiments
4.1. Datasets
4.1.1. VEDAI
4.1.2. ISPRS Potsdam
4.1.3. NZAM/ONERA Christchurch
4.2. Semantic Segmentation
4.2.1. ISPRS Potsdam
4.2.2. NZAM/ONERA Christchurch
4.3. Detection Results
4.4. Learning a Vehicle Classifier
4.5. Transfer Learning for Vehicle Classification
4.6. Traffic Density Estimation
5. Conclusions
Author Contributions
Conflicts of Interest
AA | Average Accuracy |
CNN | Convolutional Neural Network |
COCO | Common Objects in Context |
CRF | Conditional Random Field |
DTIS | Département de Traitement de l’Information et Systèmes |
DtMM | Discriminatively-trained Mixture of Models |
FCN | Fully Convolutional Network |
GRSS | Geoscience & Remote Sensing Society |
HOG | Histogram of Oriented Gradients |
IEEE | Institute of Electrical and Electronics Engineers |
ILSVRC | ImageNet Large Scale Visual Recognition Challenge |
IR | Infrared |
IRRGB | Infrared-Red-Green-Blue |
ISPRS | International Society for Photogrammetry and Remote Sensing |
NZAM | New Zealand Assets Management |
OA | Overall Accuracy |
ONERA | Office national d’études et de recherches aérospatiales |
RGB | Red-Green-Blue |
ReLU | Rectified Linear Unit |
VEDAI | Vehicle Detection in Aerial Imagery |
VGG | Visual Geometry Group |
VHR | Very High Resolution |
VOC | Visual Object Classes |
SGD | Stochastic Gradient Descent |
SVM | Support Vector Machine |
Dataset/Class | Car | Truck | Van | Pickup | Boat | Camping Car | Other | Plane | Tractor |
VEDAI | 1340 | 300 | 100 | 950 | 170 | 390 | 200 | 47 | 190 |
NZAM/ONERA Christchurch | 2267 | 73 | 120 | 90 | - | - | - | - | - |
ISPRS Potsdam | 1990 | 33 | 181 | 40 | - | - | - | - | - |
Dataset | Method | Imp. Surfaces | Building | Low veg. | Tree | Cars | OA |
Validation 12.5 cm/px | SegNet RGB | 92.4% ± 0.6 | 95.8% ± 1.9 | 85.8% ± 1.3 | 83.0% ± 2.1 | 95.7% ± 0.3 | 90.6% ± 0.6 |
Test 5 cm/px | SegNet IRRG | 92.4% | 95.8% | 86.7% | 87.4% | 95.1% | 90.0% |
Test 5 cm/px | FCN + CRF [17] | 91.8% | 95.9% | 86.3% | 87.7% | 89.2% | 89.7% |
Test 5 cm/px | ResNet-101 [CASIA] | 92.8% | 96.9% | 86.0% | 88.2% | 94.2% | 89.6% |
Source | Background | Building | Vegetation | Vehicle | OA |
RGB | 75.6% ± 8.9 | 91.7% ± 1.3 | 55.2% ± 11.6 | 61.9% ± 2.4 | 84.4% ± 2.6 |
Dataset | Preprocessing | mIoU | Precision | Recall |
NZAM/ONERA Christchurch | ∅ | 60.0% | 0.597 | 0.797 |
NZAM/ONERA Christchurch | Opening | 69.8% | 0.817 | 0.791 |
NZAM/ONERA Christchurch | Opening + remove small objects | 70.7% | 0.833 | 0.791 |
ISPRS Potsdam | ∅ | 70.1% | 0.748 | 0.842 |
ISPRS Potsdam | Opening | 73.3% | 0.866 | 0.842 |
ISPRS Potsdam | Opening + remove small objects | 74.2% | 0.907 | 0.841 |
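The "Opening" preprocessing in the table above is a morphological opening (erosion followed by dilation) of the predicted vehicle mask. The following is a minimal sketch under stated assumptions: a square (2k+1) × (2k+1) structuring element with k = 1, and pixels outside the image treated as background. The source gives neither the structuring element nor the border handling, and the helper names `erode`, `dilate`, and `opening` are illustrative.

```python
def erode(mask, k=1):
    """Keep a pixel only if its whole (2k+1) x (2k+1) neighbourhood is set.

    Pixels outside the image count as background (an assumption).
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(all(
                0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                for dr in range(-k, k + 1)
                for dc in range(-k, k + 1)
            ))
    return out


def dilate(mask, k=1):
    """Set a pixel if any pixel in its (2k+1) x (2k+1) neighbourhood is set."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(any(
                0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                for dr in range(-k, k + 1)
                for dc in range(-k, k + 1)
            ))
    return out


def opening(mask, k=1):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(mask, k), k)


# Toy prediction: a 3x3 vehicle blob plus one isolated false-positive pixel.
noisy = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
for row in opening(noisy):
    print(row)
```

On this toy mask the 3 × 3 blob survives unchanged while the isolated pixel disappears, which is consistent with the table: precision rises after opening while recall barely moves. Removing small objects would then amount to discarding any remaining connected component below an area threshold.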
Dataset | Method | Precision | Recall |
NZAM/ONERA Christchurch | HOG + SVM [20] | 0.402 | 0.398 |
NZAM/ONERA Christchurch | DtMM (5 models) [24] | 0.743 | 0.737 |
NZAM/ONERA Christchurch | Ours | 0.833 | 0.791 |
ISPRS Potsdam | Ours | 0.907 | 0.841 |
Model | Car | Truck | Ship | Tractor | Camping Car | Van | Pickup | Plane | Vehicle | OA | Time (ms) |
LeNet | 74.3 | 54.4 | 31.0 | 61.1 | 85.9 | 38.3 | 67.7 | 13.0 | 47.5 | 66.3 ± 1.7 | 2.1 |
AlexNet | 91.0 | 84.8 | 81.4 | 83.3 | 98.0 | 71.1 | 85.2 | 91.4 | 77.8 | 87.5 ± 1.5 | 5.7 |
VGG-16 | 90.2 | 86.9 | 86.9 | 86.5 | 99.6 | 71.1 | 91.4 | 100.0 | 77.2 | 89.7 ± 1.5 | 31.7 |
Model | Car | Truck | Ship | Tractor | Camping Car | Van | Pickup | Plane | Vehicle | OA | AA |
Baseline | 90.4 | 66.7 | 80.4 | 89.5 | 96.6 | 63.3 | 78.7 | 92.6 | 75.0 | 83.9 ± 2.7 | 81.5 ± 1.9 |
DA | 88.2 | 82.2 | 78.4 | 82.5 | 97.4 | 63.3 | 85.1 | 66.7 | 73.3 | 85.6 ± 1.4 | 77.3 ± 8.7 |
R | 87.9 | 71.1 | 86.3 | 84.2 | 97.4 | 73.3 | 87.2 | 100.0 | 75.0 | 86.1 ± 0.9 | 84.7 ± 1.7 |
DA + R | 91.4 | 85.6 | 88.2 | 87.6 | 97.4 | 70.0 | 87.2 | 100.0 | 81.7 | 89.0 ± 0.5 | 87.7 ± 1.5 |
Dataset | Classifier | Car | Van | Truck | Pickup | OA | AA |
Potsdam | Cars only | 100% | 0% | 0% | 0% | 94% | 25% |
Potsdam | AlexNet | 98% | 66% | 67% | 0% | 95% | 58% |
Potsdam | VGG-16 | 92% | 66% | 75% | 33% | 89% | 67% |
Christchurch | Cars only | 100% | 0% | 0% | 0% | 94% | 25% |
Christchurch | AlexNet | 94% | 40% | 67% | 89% | 93% | 73% |
Christchurch | VGG-16 | 97% | 80% | 67% | 78% | 96% | 80% |
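The gap between OA and AA in the table above follows from class imbalance: AA is the unweighted mean of the per-class accuracies, whereas OA weights each class by its number of test samples. The sketch below reproduces the AA column from the per-class values in the table. The per-class test counts are not given in the source, so the `counts` list is a hypothetical split chosen only to show how a "cars only" predictor can reach a high OA.

```python
def average_accuracy(per_class_acc):
    """Unweighted mean of per-class accuracies (in percent)."""
    return sum(per_class_acc) / len(per_class_acc)


def overall_accuracy(per_class_acc, class_counts):
    """Accuracy over all samples: per-class accuracies weighted by class size."""
    correct = sum(acc / 100 * n for acc, n in zip(per_class_acc, class_counts))
    return 100 * correct / sum(class_counts)


# Per-class accuracies (Car, Van, Truck, Pickup) taken from the table.
cars_only = [100, 0, 0, 0]
potsdam_alexnet = [98, 66, 67, 0]

print(average_accuracy(cars_only))        # 25.0, matching the 25% AA entry
print(average_accuracy(potsdam_alexnet))  # 57.75, reported as 58%

# Hypothetical test split in which 94 of 100 samples are cars
# (the real per-class test counts are an assumption, not from the source).
counts = [94, 2, 2, 2]
print(overall_accuracy(cars_only, counts))  # 94.0
```

Under this assumed split, always predicting "car" yields 94% OA but only 25% AA, which is why AA is the more informative column for the minority classes.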
Dataset | ISPRS Potsdam | NZAM/ONERA Christchurch |
Absolute error (average error/ground truth total) | 3/52 | 6/66 |
Relative error | 7.9% | 9.1% |
