#### *5.1. Datasets*

As leaf recognition attracts growing interest, many open-source leaf datasets are available for research, such as Flavia [5], ICL [37], Swedish [38,39], and MEW2012 [32].

The Flavia dataset (http://flavia.sourceforge.net/) contains 1907 leaf images of 32 species and is the most widely used dataset for leaf recognition. Most leaves in the Flavia dataset, as shown in Figure 7, come from plants common in the Yangtze Delta, China. Each species has at least 50 leaves, which is sufficient for training and testing. The images show single leaves with the petiole removed, captured against a plain background.

**Figure 7.** Standard leaf images of the Flavia dataset.
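For the experiments in Section 5.2, a fixed number of training leaves is drawn from each species and the rest are held out for testing. The sketch below shows one way such a per-species split could be implemented; the directory-per-species layout and the `split_dataset` helper are illustrative assumptions, not the layout of the actual Flavia archive.

```python
# A minimal per-species train/test split, assuming the images are stored one
# directory per species (the actual Flavia archive is organized differently,
# so adapt the loader to your copy of the data).
import os
import random

def split_dataset(root, n_train=30, seed=0):
    """Return {species: (train_paths, test_paths)}, taking n_train images
    per species for training and keeping the remainder for testing."""
    rng = random.Random(seed)
    splits = {}
    for species in sorted(os.listdir(root)):
        folder = os.path.join(root, species)
        if not os.path.isdir(folder):
            continue
        images = sorted(os.path.join(folder, f) for f in os.listdir(folder))
        rng.shuffle(images)
        splits[species] = (images[:n_train], images[n_train:])
    return splits
```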

The ICL dataset (http://www.intelengine.cn/English/dataset) was collected by the Intelligent Computing Laboratory of the Chinese Academy of Sciences. It contains 16,848 leaf images from 220 plant species, with a varying number of images per species. Some examples are shown in Figure 8.

**Figure 8.** Standard leaf images of the ICL dataset.

The Swedish leaf dataset (http://www.cvl.isy.liu.se/en/research/datasets/sw) contains leaf images of 15 species, with 75 samples per species, for a total of 1125 images. Figure 9 shows some example leaves from the Swedish dataset.

**Figure 9.** Standard leaf images of the Swedish dataset.

The MEW (Middle European Woods) dataset is a large dataset covering 153 species of Central European woody plants, with a total of 9745 samples. Some examples are shown in Figure 10.

**Figure 10.** Standard leaf images of the MEW dataset.

#### *5.2. Length of DPCNN*

In DPCNN, the number of iterations is a significant parameter that influences the quality of the extracted features. In Ref. [28], the iteration number is set to 47. Generally, most PCNN models used for feature extraction, e.g., ICM and SCM, run for more than 30 iterations. To some degree, the iterative process is itself a feature-extraction process driven by a dynamic threshold, which is the most prominent characteristic of PCNN models.
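The role of the dynamic threshold can be illustrated with a much simplified pulse-coupled iteration. The sketch below is a generic PCNN-style loop, not the exact DPCNN of Ref. [28]; the 3x3 linking kernel and the `decay` and `vt` parameters are illustrative assumptions. Each iteration produces a binary firing map whose entropy is recorded, yielding one feature value per iteration.

```python
# A simplified, illustrative pulse-coupled iteration (not the exact DPCNN of
# Ref. [28]): the dynamic threshold decays at each step and is raised wherever
# neurons fire, and the entropy of each binary firing map is recorded.
import numpy as np

def pcnn_entropy_signature(img, n_iter=20, decay=0.7, vt=20.0):
    """img: 2-D float array in [0, 1]; returns one entropy value per iteration."""
    theta = np.ones_like(img)          # dynamic threshold
    link = np.ones((3, 3)) / 9.0       # simple 3x3 linking kernel (assumed)
    y = np.zeros_like(img)             # firing map of the previous iteration
    entropies = []
    for _ in range(n_iter):
        # internal activity: the stimulus modulated by neighbouring pulses
        pad = np.pad(y, 1, mode="edge")
        nbr = sum(link[i, j] * pad[i:i + img.shape[0], j:j + img.shape[1]]
                  for i in range(3) for j in range(3))
        u = img * (1.0 + nbr)
        y = (u > theta).astype(float)   # fire where activity exceeds threshold
        theta = decay * theta + vt * y  # decay everywhere, raise at fired pixels
        p1 = y.mean()                   # fraction of fired neurons
        h = 0.0
        for p in (p1, 1.0 - p1):
            if p > 0:
                h -= p * np.log2(p)
        entropies.append(h)
    return np.array(entropies)
```

With `n_iter = 20`, the returned vector corresponds to a 20-dimensional entropy signature of the kind whose length is tuned below.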

Since the iterative process is also essential for BOF\_DP, the key question in this part is how many iterations it requires. To find the best iteration number for DPCNN, the number was varied from 5 to 45. If the iteration number were 45 or more, feature extraction would take too long, so the maximum was set at 45. Moreover, for a given image, the entropy vector is approximately periodic, so too many iterations are not helpful; on the contrary, they reduce the discriminability of the features. The Flavia dataset was selected for testing: 30 sample images per species were used for training, and the remaining images were used for testing. The average recognition rates are shown in Figure 11. The accuracy rises sharply to a peak at 20 iterations, then falls steadily and never rises again. Hence, the best iteration number is around 20.
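The sweep itself can be expressed compactly. In the sketch below, `feature(path, n)` and `classify(train, test)` are hypothetical stand-ins for the BOF\_DP feature extractor (with `n` DPCNN iterations) and the classifier of the earlier sections; `splits` is a per-species split like the one sketched in Section 5.1.

```python
# A sketch of the iteration sweep described above. feature() and classify()
# are hypothetical stand-ins, not functions defined in the paper.
def sweep_iterations(splits, feature, classify, candidates=range(5, 50, 5)):
    """Return {n_iter: accuracy} for each candidate iteration number."""
    results = {}
    for n in candidates:
        train = [(feature(path, n), species)
                 for species, (tr, _) in splits.items() for path in tr]
        test = [(feature(path, n), species)
                for species, (_, te) in splits.items() for path in te]
        results[n] = classify(train, test)
    return results
```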

**Figure 11.** Relationship between the iteration number of DPCNN and accuracy.

BOF\_DP performs best with 20 iterations, whereas the traditional DPCNN yields its best features at 47 iterations, so the iterative process is shortened considerably. To some extent, this may be attributed to the sub-block processing: when images are divided into smaller pieces, the local features stand out more in each block, but an oversized iteration number produces unnecessary data that can be regarded as noise. In fact, even with fewer than 20 iterations, some redundancy and noise remain. Hence, an effective feature selection method would help improve the efficiency of the proposed feature.
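As one concrete possibility for such feature selection (a generic variance-based technique, not a method proposed in this paper), low-variance components of the feature vectors could be discarded, as sketched below.

```python
# A generic low-variance filter for feature selection (an assumption for
# illustration, not the paper's method): keep only the feature dimensions
# with the highest variance across the training set.
import numpy as np

def select_by_variance(features, keep_ratio=0.5):
    """features: (n_samples, n_dims) array; return indices of retained dims."""
    var = features.var(axis=0)
    k = max(1, int(keep_ratio * features.shape[1]))
    idx = np.argsort(var)[-k:]      # k highest-variance dimensions
    return np.sort(idx)
```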
