CNN-Based Temporal Video Segmentation Using a Nonlinear Hyperbolic PDE-Based Multi-Scale Analysis
Abstract
1. Introduction
2. Nonlinear PDE-Based Filtering Model for Scale-Space Representation
2.1. A Nonlinear Second-Order Hyperbolic PDE Model
2.2. Mathematical Treatment of PDE Model’s Validity
2.3. Numerical Approximation Algorithm
3. Multi-Scale Deep Learning-Based High-Level Frame Feature Extraction
3.1. PDE-Based Scale Space
3.2. Deep Learning-Based Feature Extraction
4. Automatic Video Frame Clustering Technique
5. Discussion
6. Conclusions
Funding
Data Availability Statement
Conflicts of Interest
Technique | Precision | Recall | F1 |
---|---|---|---|
The proposed technique | 0.984 | 0.991 | 0.987 |
Color histograms | 0.745 | 0.724 | 0.734 |
Edge histograms | 0.815 | 0.752 | 0.782 |
Pixel differences (SAD) | 0.775 | 0.763 | 0.769 |
Gabor 2D filter-based model | 0.941 | 0.840 | 0.887 |
Statistical features with LHR | 0.645 | 0.618 | 0.631 |
Pairwise pixel comparisons | 0.741 | 0.723 | 0.731 |
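The F1 column can be cross-checked against the other two columns. The sketch below assumes that F1 here is the standard harmonic mean of precision and recall, F1 = 2PR / (P + R); the table itself does not state the formula. The helper names `f1_score`, `max_f1_deviation`, and `REPORTED` are illustrative and do not come from the article.

```python
"""Recompute F1 from the precision and recall values reported in the table."""


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    if total == 0:
        return 0.0
    return 2.0 * precision * recall / total


# (precision, recall, reported F1), copied from the table above.
REPORTED = {
    "The proposed technique": (0.984, 0.991, 0.987),
    "Color histograms": (0.745, 0.724, 0.734),
    "Edge histograms": (0.815, 0.752, 0.782),
    "Pixel differences (SAD)": (0.775, 0.763, 0.769),
    "Gabor 2D filter-based model": (0.941, 0.840, 0.887),
    "Statistical features with LHR": (0.645, 0.618, 0.631),
    "Pairwise pixel comparisons": (0.741, 0.723, 0.731),
}


def max_f1_deviation(rows: dict) -> float:
    """Largest absolute gap between the recomputed and the reported F1."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    for name, (p, r, f1) in REPORTED.items():
        print(f"{name:32s} recomputed={f1_score(p, r):.4f} reported={f1:.3f}")
    print(f"max deviation: {max_f1_deviation(REPORTED):.4f}")
```

Under that assumption, every recomputed value agrees with the reported one to within 0.001. The small residual gaps are consistent with the precision and recall entries having been rounded to three decimals before publication.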
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Barbu, T. CNN-Based Temporal Video Segmentation Using a Nonlinear Hyperbolic PDE-Based Multi-Scale Analysis. Mathematics 2023, 11, 245. https://doi.org/10.3390/math11010245