Context-Aware Multi-Scale Aggregation Network for Congested Crowd Counting
Abstract
1. Introduction
- We propose a new multi-scale aggregation module (MSAM), which uses different dilation rates to obtain multi-scale feature information. In addition to the multi-scale sampling branches, a global receptive field (GRF) branch is added to help the other branches sample features more accurately.
- We propose a new context-aware module (CAM), which uses an attention mechanism to identify crowd features in an image by exploiting contextual information.
- We propose a novel context-aware multi-scale aggregation network named CMSNet for dense crowd counting, which utilizes a weighted attention method to strengthen the expression of crowd information and a multi-scale sampling method to obtain information at different scales, thus improving the counting accuracy.
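The MSAM and GRF ideas above can be sketched in plain NumPy: parallel branches share a 3×3 kernel but sample the input with different dilation rates, while an extra branch supplies a global receptive field. This is an illustrative single-channel sketch, not the paper's implementation; the averaging kernel and the global-average GRF approximation are assumptions, and the rates (1, 2, 3, 6) follow the best row of the dilation-rate ablation table.

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """'Same'-padded 2D convolution of a single-channel map with a dilated 3x3 kernel."""
    k = kernel.shape[0]
    pad = rate * (k // 2)                      # padding that preserves the spatial size
    xp = np.pad(x, pad, mode="constant")
    out = np.zeros_like(x, dtype=float)
    for i in range(k):                         # accumulate the k*k dilated taps
        for j in range(k):
            di, dj = i * rate, j * rate
            out += kernel[i, j] * xp[di:di + x.shape[0], dj:dj + x.shape[1]]
    return out

def msam_branches(x, rates=(1, 2, 3, 6)):
    """One dilated branch per rate, plus a global-receptive-field branch
    (approximated here by the global average broadcast back to the input size)."""
    kernel = np.full((3, 3), 1.0 / 9.0)        # simple averaging kernel for the sketch
    branches = [dilated_conv2d(x, kernel, r) for r in rates]
    branches.append(np.full_like(x, x.mean(), dtype=float))  # GRF branch
    return np.stack(branches)                  # shape: (len(rates) + 1, H, W)
```

In a real network the branch outputs would be produced by learned convolutions and fused (e.g., concatenated and reweighted); here the stack merely illustrates how the same kernel footprint covers progressively larger receptive fields as the dilation rate grows.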
2. Related Work
2.1. Traditional Methods
2.2. CNN-Based Methods
2.3. Multi-Scale Feature Learning
3. Proposed Method
3.1. Overview
3.2. Context-Aware Multi-Scale Aggregation Module
3.2.1. Multi-Scale Aggregation Module
3.2.2. Context-Aware Module
3.3. Density Map Generation and Loss Function
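A ground-truth density map is typically produced by centering a normalized Gaussian kernel on each annotated head position, so the map integrates to the image's head count; training then commonly minimizes a pixel-wise Euclidean loss between predicted and ground-truth maps. A minimal sketch with a fixed kernel bandwidth (the paper may instead use geometry-adaptive kernels; `sigma` here is an illustrative value):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(shape, head_points, sigma=4.0):
    """Place a unit impulse at each annotated head (y, x) and blur with a Gaussian.
    The resulting map sums (approximately) to the head count, so integrating a
    predicted map yields the estimated count."""
    dm = np.zeros(shape, dtype=float)
    for y, x in head_points:
        if 0 <= y < shape[0] and 0 <= x < shape[1]:
            dm[int(y), int(x)] += 1.0
    return gaussian_filter(dm, sigma, mode="constant")
```

Because the blur uses zero padding, a small amount of mass is lost for heads very close to the image border; in practice this is negligible for the kernel widths used.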
4. Experiments
4.1. Datasets
- ShanghaiTech [15]: This dataset consists of two parts, ShanghaiTech Part_A (SHHA) and ShanghaiTech Part_B (SHHB). SHHA contains 482 crowd images collected from Internet searches, of which 300 are used for training and 182 for testing. The crowd count per image ranges from 33 to 3139, a wide span that tests a network's ability to handle large variations in crowd size. SHHB includes 716 crowd images taken on Shanghai's busy streets and at scenic spots, of which 400 are used for training and 316 for testing.
- UCF_CC_50 [22]: This dataset contains images of extremely crowded scenes, mostly collected from Flickr. Although it holds only 50 images, the crowd count varies widely (up to 4543 people in a single image), which makes both training and testing challenging. Following common practice, we use 5-fold cross-validation for evaluation.
- UCF-QNRF [43]: This dataset contains 1535 crowd images, of which 1201 are used for training and 334 for testing. The crowd count per image ranges from 49 to 12,865, making it a demanding benchmark for evaluating network performance.
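The 5-fold protocol mentioned for UCF_CC_50 partitions the 50 images into five folds of 10, using each fold once as the test set and the rest for training; a minimal sketch (the round-robin assignment is illustrative, not the paper's exact split):

```python
def five_fold_splits(n_images=50, k=5):
    """Partition image indices into k folds; yield (train, test) index lists,
    with each fold serving exactly once as the test set."""
    idx = list(range(n_images))
    folds = [idx[i::k] for i in range(k)]      # round-robin assignment
    for t in range(k):
        test = folds[t]
        train = [i for f in folds if f is not folds[t] for i in f]
        yield train, test
```

The reported metric is then the average of the per-fold test results.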
4.2. Evaluation Metrics and Implementation Details
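Crowd counting papers conventionally report two image-level metrics over the test set; note that the quantity reported as "MSE" in this literature is the root of the mean squared error. A minimal sketch:

```python
import numpy as np

def counting_metrics(pred_counts, gt_counts):
    """MAE and MSE as conventionally reported in crowd counting:
    MAE = mean absolute count error; 'MSE' = root of the mean squared count error."""
    pred = np.asarray(pred_counts, dtype=float)
    gt = np.asarray(gt_counts, dtype=float)
    err = pred - gt
    mae = np.abs(err).mean()
    mse = np.sqrt((err ** 2).mean())
    return mae, mse
```

The per-image counts fed into these metrics are obtained by integrating (summing) the predicted density maps.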
4.3. Ablation Study
4.3.1. Ablation for the MSAM
4.3.2. Ablation for the CAM
4.3.3. Ablation for the CMSM
4.4. Comparison with State-of-the-Art Methods
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Yu, Y.; Huang, J.; Du, W.; Xiong, N. Design and analysis of a lightweight context fusion CNN scheme for crowd counting. Sensors 2019, 19, 2013.
2. Ilyas, N.; Lee, B.; Kim, K. HADF-crowd: A hierarchical attention-based dense feature extraction network for single-image crowd counting. Sensors 2021, 21, 3483.
3. Tong, M.; Fan, L.; Nan, H.; Zhao, Y. Smart camera aware crowd counting via multiple task fractional stride deep learning. Sensors 2019, 19, 1346.
4. Zhang, Y.; Zhao, H.; Duan, Z.; Huang, L.; Deng, J.; Zhang, Q. Congested Crowd Counting via Adaptive Multi-Scale Context Learning. Sensors 2021, 21, 3777.
5. Csönde, G.; Sekimoto, Y.; Kashiyama, T. Crowd counting with semantic scene segmentation in helicopter footage. Sensors 2020, 20, 4855.
6. Hsu, Y.W.; Chen, Y.W.; Perng, J.W. Estimation of the number of passengers in a bus using deep learning. Sensors 2020, 20, 2178.
7. Bai, S.; He, Z.; Qiao, Y.; Hu, H.; Wu, W.; Yan, J. Adaptive dilated network with self-correction supervision for counting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 4594–4603.
8. Wang, Q.; Gao, J.; Lin, W.; Li, X. NWPU-crowd: A large-scale benchmark for crowd counting and localization. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2141–2149.
9. Yang, J.; Zhou, Y.; Kung, S.Y. Multi-scale generative adversarial networks for crowd counting. In Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; pp. 3244–3249.
10. Thanasutives, P.; Fukui, K.i.; Numao, M.; Kijsirikul, B. Encoder-Decoder Based Convolutional Neural Networks with Multi-Scale-Aware Modules for Crowd Counting. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 2382–2389.
11. Shen, Z.; Xu, Y.; Ni, B.; Wang, M.; Hu, J.; Yang, X. Crowd counting via adversarial cross-scale consistency pursuit. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 5245–5254.
12. Zhao, M.; Zhang, J.; Zhang, C.; Zhang, W. Leveraging heterogeneous auxiliary tasks to assist crowd counting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 12736–12745.
13. Zhang, Y.; Zhou, C.; Chang, F.; Kot, A.C.; Zhang, W. Attention to head locations for crowd counting. In Proceedings of the International Conference on Image and Graphics, Beijing, China, 23–25 August 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 727–737.
14. Hossain, M.; Hosseinzadeh, M.; Chanda, O.; Wang, Y. Crowd counting using scale-aware attention networks. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1280–1288.
15. Zhang, Y.; Zhou, D.; Chen, S.; Gao, S.; Ma, Y. Single-image crowd counting via multi-column convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 589–597.
16. Sam, D.B.; Surya, S.; Babu, R.V. Switching convolutional neural network for crowd counting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4031–4039.
17. Li, Y.; Zhang, X.; Chen, D. CSRNet: Dilated convolutional neural networks for understanding the highly congested scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 1091–1100.
18. Amirgholipour, S.; He, X.; Jia, W.; Wang, D.; Liu, L. PDANet: Pyramid density-aware attention net for accurate crowd counting. arXiv 2020, arXiv:2001.05643.
19. Punia, S.K.; Kumar, M.; Stephan, T.; Deverajan, G.G.; Patan, R. Performance analysis of machine learning algorithms for big data classification: Ml and ai-based algorithms for big data analysis. Int. J. E-Health Med. Commun. IJEHMC 2021, 12, 60–75.
20. Viola, P.; Jones, M.J.; Snow, D. Detecting pedestrians using patterns of motion and appearance. Int. J. Comput. Vis. 2005, 63, 153–161.
21. Wang, M.; Wang, X. Automatic adaptation of a generic pedestrian detector to a specific traffic scene. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 20–25 June 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 3401–3408.
22. Idrees, H.; Saleemi, I.; Seibert, C.; Shah, M. Multi-source multi-scale counting in extremely dense crowd images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Portland, OR, USA, 23–28 June 2013; pp. 2547–2554.
23. Liu, B.; Vasconcelos, N. Bayesian model adaptation for crowd counts. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 4175–4183.
24. Pham, V.Q.; Kozakaya, T.; Yamaguchi, O.; Okada, R. Count forest: Co-voting uncertain number of targets using random forest for crowd density estimation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 3253–3261.
25. Lempitsky, V.; Zisserman, A. Learning to count objects in images. Adv. Neural Inf. Process. Syst. 2010, 23, 1324–1332.
26. Felzenszwalb, P.F.; Girshick, R.B.; McAllester, D.; Ramanan, D. Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 2009, 32, 1627–1645.
27. Chan, A.B.; Vasconcelos, N. Bayesian poisson regression for crowd counting. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Kyoto, Japan, 29 September–2 October 2009; pp. 545–551.
28. Zhang, C.; Li, H.; Wang, X.; Yang, X. Cross-scene crowd counting via deep convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 833–841.
29. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27.
30. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
31. Sindagi, V.A.; Patel, V.M. Generating high-quality crowd density maps using contextual pyramid cnns. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 1861–1870.
32. Sindagi, V.A.; Patel, V.M. Multi-level bottom-top and top-bottom feature fusion for crowd counting. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019; pp. 1002–1012.
33. Liu, N.; Long, Y.; Zou, C.; Niu, Q.; Pan, L.; Wu, H. ADCrowdNet: An attention-injective deformable convolutional network for crowd understanding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3225–3234.
34. Zhang, A.; Yue, L.; Shen, J.; Zhu, F.; Zhen, X.; Cao, X.; Shao, L. Attentional neural fields for crowd counting. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019; pp. 5714–5723.
35. Yang, S.D.; Su, H.T.; Hsu, W.H.; Chen, W.C. Class-agnostic Few-shot Object Counting. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; pp. 870–878.
36. Modolo, D.; Shuai, B.; Varior, R.R.; Tighe, J. Understanding the impact of mistakes on background regions in crowd counting. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; pp. 1650–1659.
37. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
38. Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
39. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
40. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818.
41. Huang, L.; Zhu, L.; Shen, S.; Zhang, Q.; Zhang, J. SRNet: Scale-Aware Representation Learning Network for Dense Crowd Counting. IEEE Access 2021, 9, 136032–136044.
42. Zhang, Y.; Zhao, H.; Zhou, F.; Zhang, Q.; Shi, Y.; Liang, L. MSCANet: Adaptive Multi-scale Context Aggregation Network for Congested Crowd Counting. In Proceedings of the International Conference on Multimedia Modeling, Prague, Czech Republic, 22–24 June 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 1–12.
43. Idrees, H.; Tayyab, M.; Athrey, K.; Zhang, D.; Al-Maadeed, S.; Rajpoot, N.; Shah, M. Composition loss for counting, density map estimation and localization in dense crowds. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 532–546.
44. Wang, Q.; Gao, J.; Lin, W.; Yuan, Y. Learning from Synthetic Data for Crowd Counting in the Wild. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 8198–8207.
45. Liu, W.; Salzmann, M.; Fua, P. Context-aware crowd counting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 5099–5108.
46. Xiong, H.; Lu, H.; Liu, C.; Liu, L.; Cao, Z.; Shen, C. From open set to closed set: Counting objects by spatial divide-and-conquer. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019; pp. 8362–8371.
47. Jiang, X.; Zhang, L.; Xu, M.; Zhang, T.; Lv, P.; Zhou, B.; Yang, X.; Pang, Y. Attention scaling for crowd counting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 4706–4715.
48. Zeng, L.; Xu, X.; Cai, B.; Qiu, S.; Zhang, T. Multi-scale convolutional neural networks for crowd counting. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 465–469.
49. Cao, X.; Wang, Z.; Zhao, Y.; Su, F. Scale aggregation network for accurate and efficient crowd counting. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 734–750.
50. Wu, X.; Zheng, Y.; Ye, H.; Hu, W.; Yang, J.; He, L. Adaptive scenario discovery for crowd counting. In Proceedings of the ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 2382–2386.
51. Shi, M.; Yang, Z.; Xu, C.; Chen, Q. Revisiting perspective information for efficient crowd counting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7279–7288.
52. Sam, D.B.; Peri, S.V.; Sundararaman, M.N.; Kamath, A.; Radhakrishnan, V.B. Locate, size and count: Accurately resolving people in dense crowds via detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2739–2751.
53. Shi, X.; Li, X.; Wu, C.; Kong, S.; Yang, J.; He, L. A real-time deep network for crowd counting. In Proceedings of the ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 2328–2332.
54. Oh, M.h.; Olsen, P.; Ramamurthy, K.N. Crowd counting with decomposed uncertainty. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 11799–11806.
55. Zhou, J.T.; Zhang, L.; Jiawei, D.; Peng, X.; Fang, Z.; Xiao, Z.; Zhu, H. Locality-Aware Crowd Counting. IEEE Trans. Pattern Anal. Mach. Intell. 2021.
56. Liu, Y.B.; Jia, R.S.; Liu, Q.M.; Zhang, X.L.; Sun, H.M. Crowd counting method based on the self-attention residual network. Appl. Intell. 2021, 51, 427–440.
57. Li, Y.C.; Jia, R.S.; Hu, Y.X.; Han, D.N.; Sun, H.M. Crowd density estimation based on multi scale features fusion network with reverse attention mechanism. Appl. Intell. 2022, 1–17.
58. Wang, W.; Liu, Q.; Wang, W. Pyramid-dilated deep convolutional neural network for crowd counting. Appl. Intell. 2021, 1825–1837.
Dataset | Resolution | Number of Images | Max Count | Min Count | Total Annotations
---|---|---|---|---|---
SHHA | different | 482 | 3139 | 33 | 241,677 |
SHHB | 768 × 1024 | 716 | 578 | 9 | 88,488 |
UCF_CC_50 | different | 50 | 4543 | 94 | 63,974 |
UCF-QNRF | different | 1535 | 12,865 | 49 | 1,251,642 |
Dilation Rates | MAE | MSE |
---|---|---|
1, 2, 3, 4 | 67.1 | 110.3 |
1, 2, 3, 5 | 66.5 | 113.4 |
1, 2, 3, 8 | 65.3 | 112.3 |
1, 4, 7, 9 | 65.5 | 108.1 |
1, 2, 3, 6 | 64.9 | 111.2 |
Model | MAE | MSE |
---|---|---|
w/o GRF | 66.8 | 115.9 |
w. GRF | 64.9 | 111.2 |
Model | MAE | MSE |
---|---|---|
w/o residual | 67.2 | 118.6 |
w. residual | 64.9 | 111.2 |
Model | MAE | MSE |
---|---|---|
w/o residual | 66.4 | 115.2 |
w. residual | 64.9 | 111.2 |
Model | MAE | MSE |
---|---|---|
w/o CAM | 68.2 | 118.8 |
w/o MSAM | 71.0 | 117.6 |
w. CAM & MSAM | 64.9 | 111.2 |
Network Model | Year | Parameters (K) | FLOPs (G) | MAE | MSE
---|---|---|---|---|---
CSRNet [17] | 2018 | 16,263 | 20.74 | 111.8 | 198.1 |
CAN [45] | 2019 | 18,103 | 21.99 | 107.0 | 183.0 |
S-DCNet [46] | 2019 | 14,979 | 15.36 | 104.4 | 176.1 |
ASNet [47] | 2020 | 30,398 | 31.33 | 91.6 | 159.7 |
Ours | 2022 | 9668 | 16.67 | 102.3 | 176.5 |
(Δ denotes the relative improvement over the CSRNet [17] baseline; negative values indicate worse performance.)

Network Model | Year | SHHA MAE | Δ | SHHA MSE | Δ | SHHB MAE | Δ | SHHB MSE | Δ | UCF_CC_50 MAE | Δ | UCF_CC_50 MSE | Δ | UCF-QNRF MAE | Δ | UCF-QNRF MSE | Δ
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
MCNN [15] | 2016 | 110.2 | −61.6% | 173.2 | −50.6% | 26.4 | −149.1% | 41.3 | −158.1% | 377.6 | −41.9% | 509.1 | −28.1% | 277 | −130.2% | 426 | −104.3% |
MSCNN [48] | 2017 | 83.8 | −22.9% | 127.4 | −10.8% | 17.7 | −67.0% | 30.2 | −88.8% | 363.7 | −36.7% | 468.4 | −17.8% | - | - | - | - |
Switching-CNN [16] | 2017 | 90.4 | −32.6% | 135.0 | −17.4% | 21.6 | −103.8% | 33.4 | −108.8% | 318.1 | −19.5% | 439.2 | −10.5% | 228 | −90.0% | 445 | −113.4% |
CSRNet [17] | 2018 | 68.2 | - | 115.0 | - | 10.6 | - | 16.0 | - | 266.1 | - | 397.5 | - | 120.3 | - | 208.5 | - |
SANet [49] | 2018 | 67 | 1.8% | 104.5 | 9.1% | 8.4 | 20.8% | 13.6 | 15.0% | 258.4 | 2.9% | 334.9 | 15.7% | - | - | - | - |
ASD [50] | 2019 | 65.6 | 3.8% | 98 | 14.8% | 8.5 | 19.8% | 13.7 | 14.4% | 196.2 | 26.3% | 270.9 | 31.8% | - | - | - | - |
PACNN [51] | 2019 | 66.3 | 2.8% | 106.4 | 7.5% | 8.9 | 19.1% | 13.5 | 15.6% | 267.9 | −0.7% | 357.8 | 10.0% | - | - | - | - |
LSC-CNN [52] | 2020 | 66.4 | 2.6% | 117.0 | −1.7% | 8.1 | 23.6% | 12.7 | 20.6% | 225.6 | 15.2% | 302.7 | 23.8% | 120.5 | −0.2% | 218.2 | −4.7% |
C-CNN [53] | 2020 | 88.1 | −29.2% | 141.7 | −23.2% | 14.9 | −40.6% | 22.1 | −38.1% | - | - | - | - | - | - | - | - |
DUBNet [54] | 2020 | 64.6 | 5.3% | 106.8 | 7.1% | 7.7 | 27.4% | 12.5 | 21.9% | 243.8 | 8.4% | 329.3 | 17.2% | 105.6 | 12.2% | 180.5 | 13.4% |
LA-Batch [55] | 2021 | 65.8 | 3.5% | 103.6 | 9.9% | 8.6 | 18.9% | 13.6 | 15.0% | 203 | 23.7% | 230.6 | 42.0% | 113 | 6.1% | 210 | −0.7% |
MSCANet [42] | 2021 | 66.5 | 2.5% | 102.1 | 11.2% | - | - | - | - | 242.8 | 8.8% | 329.8 | 17.0% | 104.1 | 13.5% | 183.8 | 11.8% |
SRNet [41] | 2021 | 66.0 | 3.2% | 96.7 | 15.9% | - | - | - | - | 184.1 | 30.8% | 232.7 | 41.5% | 108.2 | 10.0% | 177.5 | 14.9% |
HADF-Crowd [2] | 2021 | 71.1 | −4.3% | 111.6 | 3.0% | 9.7 | 9.3% | 15.7 | 1.9% | - | - | - | - | - | - | - | - |
SRN [56] | 2021 | 64.4 | 5.6% | 100.2 | 12.9% | 8.4 | 20.8% | 13.4 | 16.3% | 242.3 | 8.9% | 320.4 | 19.4% | - | - | - | - |
PDD-CNN [58] | 2021 | 64.7 | 5.1% | 99.1 | 13.8% | 8.8 | 17.0% | 14.3 | 10.6% | 205.4 | 22.8% | 311.7 | 21.6% | 115.3 | 4.2% | 190.2 | 8.8% |
IA-MFFCN [57] | 2022 | 62.9 | 7.8% | 100.8 | 12.3% | 9.8 | 7.5% | 13.2 | 17.5% | 242.7 | 8.8% | 320.4 | 19.4% | - | - | - | - |
Ours | 2022 | 64.9 | 4.8% | 111.2 | 3.3% | 8.5 | 19.8% | 13.3 | 16.8% | 203.9 | 23.4% | 259.9 | 34.6% | 102.3 | 15.0% | 176.5 | 15.3% |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Huang, L.; Shen, S.; Zhu, L.; Shi, Q.; Zhang, J. Context-Aware Multi-Scale Aggregation Network for Congested Crowd Counting. Sensors 2022, 22, 3233. https://doi.org/10.3390/s22093233