#### *3.5. Discussion*

In this section, both the framework and the details of the proposed approach are introduced. The overall architecture is similar to FPN, except that the positive/negative candidate sampling method is adjusted. First, the framework is introduced, including the network architecture, pre-trained backbone, object detection pipeline, and network details. Second, the matching mechanism for multi-scale objects is presented: the IoU thresholds for small and medium anchors are reduced so that more small objects are matched successfully. Third, a sampling algorithm is introduced to ensure the diversity of the sampling results: all samples are divided into intervals, and the algorithm tries to draw a balanced number of samples from each interval. Finally, the loss function of the proposed approach is introduced, comprising the cross-entropy loss for classification and the smooth L1 loss for localization.
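As an illustration of the interval-based sampling idea summarized above, the following is a minimal sketch rather than the exact algorithm of the proposed approach. It assumes that each candidate carries a scalar score in a known range (for example, its IoU with the nearest ground-truth box) and that the intervals are equal-width bins over that range. The function name `balanced_interval_sample` and the parameters `num_bins`, `lo`, `hi`, and `seed` are hypothetical and chosen only for this example.

```python
import random


def balanced_interval_sample(scores, num_samples, num_bins=3, lo=0.0, hi=1.0, seed=0):
    """Pick up to num_samples indices, spread as evenly as possible over score intervals.

    Assumption: scores lie in [lo, hi] and the intervals are equal-width bins.
    """
    rng = random.Random(seed)
    width = (hi - lo) / num_bins

    # Group candidate indices by the interval their score falls into.
    bins = [[] for _ in range(num_bins)]
    for idx, score in enumerate(scores):
        b = int((score - lo) / width)
        b = max(0, min(b, num_bins - 1))  # clamp out-of-range scores and score == hi
        bins[b].append(idx)

    # Shuffle inside each interval so the picks within an interval are random.
    for b in bins:
        rng.shuffle(b)

    # Round-robin over the intervals: each interval contributes one sample per
    # pass, and intervals that run out are skipped while the others fill the quota.
    picked = []
    while len(picked) < num_samples and any(bins):
        for b in bins:
            if b and len(picked) < num_samples:
                picked.append(b.pop())
    return picked


# Example: many low- and high-score candidates, few in the middle interval.
scores = [0.05] * 10 + [0.5] * 2 + [0.9] * 10
chosen = balanced_interval_sample(scores, num_samples=6)
```

With six samples requested over three intervals, each interval contributes two candidates in this example, whereas uniform random sampling would tend to under-represent the sparse middle interval. When an interval holds fewer candidates than its share, the remaining intervals make up the difference.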

#### **4. Experiments**

#### *4.1. Benchmarks*

The proposed approach is evaluated on two datasets: Object Detection in Aerial Images (DOTA) and the Unmanned Aerial Vehicle Benchmark (UAVB) [15,16]. Their details are as follows:



**Table 2.** Quantitative performance (AP%) of our model on the DOTA benchmark dataset compared with other approaches. The best performance in each category is colored in red.

Both the DOTA and UAVB datasets contain objects at all scales, and small objects account for a large proportion of them. A detector must therefore both detect small objects well and handle objects across the full range of scales.

**Figure 5.** Object distribution of the DOTA and UAVB datasets.
