**4. Conclusions**

This study proposes a Transformer-based method for monitoring the whole flowering process of complete apple trees in the open world. The non-pure convolutional S-YOLO model was used to accurately detect the four growth stages of apple blossoms and to analyze the changes in the number and percentage of blossoms at each growth stage, in order to estimate the peak flowering time and flowering intensity and thereby complete the monitoring process. The main conclusions are as follows.

1. The S-YOLO model was formed by combining YOLOX with Swin Transformer and adding the SAHI algorithm. Compared to the original YOLOX-s, S-YOLO-s improved the precision by 7.94%, 8.05%, 3.49%, and 6.96% for the four flowering states and improved the *mAP*ALL, *mAP*S, *mAP*M, and *mAP*L by 10.00%, 9.10%, 13.10%, and 7.20%, respectively. S-YOLO-l achieved 88.18%, 88.95%, 89.50%, and 91.95% precision for the four flowering states and 39.00%, 32.10%, 50.60%, and 64.30% for each type of *mAP*, respectively. Without the boost from the SAHI algorithm, the non-pure convolutional S-YOLO-l model still slightly outperformed the YOLOX-l model, which has similar parameters and FLOPs, on the original dataset, with improvements of 3.30%, 1.98%, 0.26%, and 1.88% in detection precision. In addition, using a larger Swin Transformer as the backbone, choosing an appropriate proportion of structural parameters, and collecting more training data may further improve the experimental results.
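The slicing idea behind SAHI-style inference can be illustrated with a minimal sketch: a large image is cut into overlapping tiles, the detector runs on each tile, and tile-level boxes are shifted back into full-image coordinates. The code below is not the SAHI library's API and not the authors' implementation; the function names, the 640-pixel tile size, and the 128-pixel overlap are assumptions chosen only for illustration.

```python
def slice_boxes(img_w, img_h, tile=640, overlap_px=128):
    """Return (x1, y1, x2, y2) tile windows covering the image.

    Hypothetical helper: tile size and overlap are illustrative
    assumptions, not values taken from the paper.
    """
    if overlap_px >= tile:
        raise ValueError("overlap_px must be smaller than tile")
    step = tile - overlap_px
    boxes = []
    y = 0
    while True:
        # Clamp the last row of tiles to the image border.
        y2 = min(y + tile, img_h)
        y1 = max(0, y2 - tile)
        x = 0
        while True:
            # Clamp the last column of tiles to the image border.
            x2 = min(x + tile, img_w)
            x1 = max(0, x2 - tile)
            boxes.append((x1, y1, x2, y2))
            if x2 >= img_w:
                break
            x += step
        if y2 >= img_h:
            break
        y += step
    return boxes


def to_full_image(box, tile_origin):
    """Shift a tile-relative box (x1, y1, x2, y2) by the tile's top-left corner."""
    ox, oy = tile_origin
    x1, y1, x2, y2 = box
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)


tiles = slice_boxes(1000, 640)
shifted = to_full_image((10, 20, 30, 40), tiles[1][:2])
```

In a full pipeline, the shifted boxes from all tiles would still need to be merged (for example with non-maximum suppression) to remove duplicates in the overlap regions; that step is omitted here.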


The apple flower monitoring method proposed in this study is applicable to orchard environments in the open world. By detecting tiny flowers at four growth stages in images of complete fruit trees, the method enables quantitative analysis of the detection data and assessment of blossom intensity, and thus the monitoring of flowering information. It is important to note that diverse viewing angles, illumination fluctuations, occlusions, uncertain poses, low pixel ratios, complicated backgrounds, etc., make it challenging for models trained on the source dataset to achieve high performance. This method establishes a foundation for the proper use of IoT technology for the remote monitoring of flowering information in modern orchards.
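The quantitative analysis described above can be sketched as follows: per-stage detection counts for each observation date are converted to percentages, and the peak flowering date is taken as the date on which one chosen stage has the largest share. The stage names, the sample counts, and the "largest share of fully open flowers" criterion are assumptions made for illustration; they are not the authors' exact definitions or data.

```python
def stage_percentages(counts):
    """Convert per-stage blossom counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        # No detections on this date: report 0% for every stage.
        return {stage: 0.0 for stage in counts}
    return {stage: 100.0 * n / total for stage, n in counts.items()}


def estimate_peak(series, open_stage="fully_open"):
    """Return the date whose share of `open_stage` blossoms is highest.

    `series` maps a date label to a {stage: count} dict. The criterion
    used here is an illustrative assumption, not the paper's definition.
    """
    return max(series, key=lambda d: stage_percentages(series[d])[open_stage])


# Hypothetical counts for three observation dates (not data from the study).
series = {
    "day1": {"bud": 80, "half_open": 15, "fully_open": 5, "faded": 0},
    "day2": {"bud": 20, "half_open": 30, "fully_open": 45, "faded": 5},
    "day3": {"bud": 5, "half_open": 10, "fully_open": 35, "faded": 50},
}
peak = estimate_peak(series)
```

Tracking the same percentages across dates also shows how the stage distribution shifts over time, which is the kind of trend the monitoring method relies on.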

**Author Contributions:** Conceptualization, X.Z. (Xinzhu Zhou); methodology, X.Z. (Xinzhu Zhou) and G.S.; software, X.Z. (Xinzhu Zhou) and X.Z. (Xiaolei Zhang); validation, X.Z. (Xinzhu Zhou) and N.X.; formal analysis, X.Z. (Xinzhu Zhou), Y.Y. and J.C.; investigation, X.Z. (Xinzhu Zhou), J.C. and Y.H.; writing—original draft preparation, X.Z. (Xinzhu Zhou), G.S. and N.X.; writing—review and editing, X.Z. (Xinzhu Zhou), Y.Y. and J.C.; project administration, G.S.; funding acquisition, G.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the Key R&D Program of Jiangsu Province (No. BE2022363), the High-end Foreign Experts Recruitment Plan of China (No. G2021145009L), and the Jiangsu Agricultural Science and Technology Innovation Fund (Nos. CX(22)3097 and CX(21)2006).

**Institutional Review Board Statement:** Not applicable.

**Data Availability Statement:** The data supporting this study's findings are available from the corresponding author upon reasonable request.

**Acknowledgments:** The authors would like to thank the editors and reviewers for their comments, which improved the quality of this work, and MDPI for its English language revisions.

**Conflicts of Interest:** The authors declare no conflict of interest.
