Variational Multiscale Nonparametric Regression: Algorithms and Implementation
Abstract
1. Introduction
1.1. Variational Denoising
1.2. Statistical Multiscale Methods
1.3. Variational Multiscale Methods
1.4. Computational Challenges and Scope of the Paper
2. Theoretical Properties of Variational Multiscale Estimation Methods
2.1. Theoretical Guarantees
- Sobolev spaces: The authors of [14,16] analysed the MIND with a Sobolev-type regularisation functional and a dictionary consisting of indicator functions of rectangles at different locations and scales. They showed that, for a suitable choice of the threshold involving an explicit constant, the MIND is minimax optimal up to logarithmic factors for estimating functions in Sobolev spaces. This means that the MIND’s expected reconstruction error is, up to logarithmic factors, of the same order as the error of the best possible estimator.
- Bounded variation: In [29], the MIND with bounded variation regularisation was considered. It was shown to be minimax optimal up to logarithmic factors for estimating functions of bounded variation in the one-dimensional case. In higher dimensions, the discretisation matters further, and optimality could only be shown in a Gaussian white noise model. Such results hold for a variety of dictionaries, such as wavelet bases, mixed wavelet and curvelet dictionaries, as well as suitable systems of indicator functions of rectangles.
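The notion of minimax optimality up to logarithmic factors used in both items can be written schematically as follows. The symbols are generic placeholders chosen for this sketch and are assumptions rather than the paper's notation: $\hat f_n$ is the estimator built from $n$ observations, $\mathcal{F}$ is the function class (for instance a Sobolev or bounded variation ball), $\|\cdot\|$ is the loss norm, and $C, \kappa \ge 0$ are constants independent of $n$.

```latex
\sup_{f \in \mathcal{F}} \mathbb{E}\,\bigl\| \hat f_n - f \bigr\|
\;\le\;
C \,(\log n)^{\kappa}\,
\inf_{\tilde f_n} \, \sup_{f \in \mathcal{F}} \mathbb{E}\,\bigl\| \tilde f_n - f \bigr\|
```

The infimum on the right runs over all estimators $\tilde f_n$, so the right-hand side is the minimax risk over $\mathcal{F}$ inflated by a polylogarithmic factor.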
2.2. Practical Choice of the Threshold
3. Computational Methods
3.1. The Chambolle–Pock Method
Algorithm 1: Chambolle–Pock algorithm
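For orientation, the generic primal-dual iteration of Chambolle and Pock [41] can be sketched on a much simpler problem than the MIND. The example below is an illustration under stated assumptions, not the paper's Algorithm 1: it solves one-dimensional total variation denoising, min over u of 0.5·||u − y||² + lam·||Du||₁, where D is the forward difference operator. The function names, step sizes, and iteration count are hypothetical choices for this sketch.

```python
import numpy as np


def forward_diff(u):
    """Forward differences (D u)_i = u[i+1] - u[i]."""
    return np.diff(u)


def forward_diff_adjoint(p):
    """Adjoint of forward_diff: (D^T p)_j = p[j-1] - p[j], with zero padding."""
    out = np.zeros(len(p) + 1)
    out[:-1] -= p
    out[1:] += p
    return out


def chambolle_pock_tv1d(y, lam, n_iter=500, tau=0.45, sigma=0.45):
    """Illustrative Chambolle-Pock iteration for 1D TV denoising.

    Assumption: tau * sigma * ||D||^2 < 1, which holds here because
    ||D||^2 <= 4 for the 1D forward difference operator.
    """
    y = np.asarray(y, dtype=float)
    u = y.copy()
    u_bar = u.copy()
    p = np.zeros(len(y) - 1)
    for _ in range(n_iter):
        # Dual step: ascent in p, then projection onto {|p_i| <= lam}.
        p = np.clip(p + sigma * forward_diff(u_bar), -lam, lam)
        # Primal step: proximal map of the quadratic data term.
        u_new = (u - tau * forward_diff_adjoint(p) + tau * y) / (1.0 + tau)
        # Over-relaxation with parameter theta = 1.
        u_bar = 2.0 * u_new - u
        u = u_new
    return u
```

The three updates (dual proximal step, primal proximal step, extrapolation) are the ingredients of the method; in the MIND setting, the dual projection and the operator would be replaced by the ones induced by the multiscale constraint and the dictionary.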
3.2. ADMM Method
3.3. Semismooth Newton Method
Algorithm 3: Semismooth Newton method
4. Numerical Study
4.1. Comparison of Three Algorithms
4.2. Comparison of Different Dictionaries
4.3. Unknown Noise Level
5. Conclusions and Discussion
5.1. Conclusions
5.2. Extensions
5.2.1. Bump Signals and Inverse Problems
5.2.2. Different Noise Models
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
1. Draper, N.R.; Smith, H. Applied Regression Analysis, 3rd ed.; Wiley Series in Probability and Statistics: Texts and References Section; John Wiley & Sons, Inc.: New York, NY, USA, 1998.
2. Bowman, A.W.; Azzalini, A. Applied Smoothing Techniques for Data Analysis: The Kernel Approach with S-Plus Illustrations; OUP Oxford: Oxford, UK, 1997; Volume 18.
3. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Monographs on Statistics and Applied Probability; CRC Press: Boca Raton, FL, USA, 1996; Volume 66.
4. Stone, C.J. Optimal global rates of convergence for nonparametric regression. Ann. Stat. 1982, 10, 1040–1053.
5. Nadaraya, E.A. On estimating regression. Theory Probab. Appl. 1964, 9, 141–142.
6. Watson, G.S. Smooth regression analysis. Sankhya Indian J. Stat. Ser. A 1964, 26, 359–372.
7. Eggermont, P.; LaRiccia, V. Maximum likelihood estimation of smooth monotone and unimodal densities. Ann. Stat. 2000, 28, 922–947.
8. Phillips, D.L. A technique for the numerical solution of certain integral equations of the first kind. J. ACM 1962, 9, 84–97.
9. Morozov, V.A. Regularization of incorrectly posed problems and the choice of regularization parameter. Zhurnal Vychislitel’noi Mat. I Mat. Fiz. 1966, 6, 170–175.
10. Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D Nonlinear Phenom. 1992, 60, 259–268.
11. Daubechies, I. Ten Lectures on Wavelets; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 1992; Volume 61.
12. Donoho, D.L. De-noising by soft-thresholding. IEEE Trans. Inf. Theory 1995, 41, 613–627.
13. Tsybakov, A.B. Introduction to Nonparametric Estimation; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2008.
14. Nemirovski, A. Nonparametric estimation of smooth regression functions. Izv. Akad. Nauk. SSR Teckhn. Kibernet 1985, 3, 50–60.
15. Candès, E.J.; Tao, T. The Dantzig selector: Statistical estimation when p is much larger than n. Ann. Stat. 2007, 35, 2313–2351.
16. Grasmair, M.; Li, H.; Munk, A. Variational multiscale nonparametric regression: Smooth functions. In Annales de l’Institut Henri Poincaré, Probabilités et Statistiques; Institut Henri Poincaré: Paris, France, 2018; Volume 54, pp. 1058–1097.
17. Scherzer, O.; Grasmair, M.; Grossauer, H.; Haltmeier, M.; Lenzen, F. Variational Methods in Imaging; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009.
18. Burger, M.; Sawatzky, A.; Steidl, G. First Order Algorithms in Variational Image Processing. In Splitting Methods in Communication, Imaging, Science, and Engineering; Glowinski, R., Osher, S.J., Yin, W., Eds.; Springer International Publishing: Cham, Switzerland, 2016; pp. 345–407.
19. Hintermüller, M.; Rincon-Camacho, M. An adaptive finite element method in L2-TV-based image denoising. Inverse Probl. Imaging 2014, 8, 685–711.
20. Hintermüller, M.; Papafitsoros, K.; Rautenberg, C.N. Analytical aspects of spatially adapted total variation regularisation. J. Math. Anal. Appl. 2017, 454, 891–935.
21. Hintermüller, M.; Langer, A.; Rautenberg, C.N.; Wu, T. Adaptive Regularization for Image Reconstruction from Subsampled Data. In Imaging, Vision and Learning Based on Optimization and PDEs; Tai, X.C., Bae, E., Lysaker, M., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 3–26.
22. Dong, Y.; Hintermüller, M.; Rincon-Camacho, M.M. A multi-scale vectorial Lτ-TV framework for color image restoration. Int. J. Comput. Vis. 2011, 92, 296–307.
23. Dong, Y.; Hintermüller, M.; Rincon-Camacho, M.M. Automated regularization parameter selection in multi-scale total variation models for image restoration. J. Math. Imaging Vis. 2011, 40, 82–104.
24. Lenzen, F.; Berger, J. Solution-Driven Adaptive Total Variation Regularization. In Scale Space and Variational Methods in Computer Vision; Aujol, J.F., Nikolova, M., Papadakis, N., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 203–215.
25. Donoho, D.L.; Johnstone, J.M. Ideal spatial adaptation by wavelet shrinkage. Biometrika 1994, 81, 425–455.
26. Candès, E.J.; Donoho, D.L. Curvelets: A Surprisingly Effective Nonadaptive Representation for Objects with Edges; Technical Report; Department of Statistics Stanford University: Stanford, CA, USA, 2000.
27. Labate, D.; Lim, W.Q.; Kutyniok, G.; Weiss, G. Sparse multidimensional representation using shearlets. In Wavelets XI; International Society for Optics and Photonics: Bellingham, WA, USA, 2005; Volume 5914, p. 59140U.
28. Candès, E.J.; Guo, F. New multiscale transforms, minimum total variation synthesis: Applications to edge-preserving image reconstruction. Signal Process. 2002, 82, 1519–1543.
29. Del Alamo, M.; Li, H.; Munk, A. Frame-constrained total variation regularization for white noise regression. arXiv 2020, arXiv:1807.02038.
30. Malgouyres, F. Mathematical analysis of a model which combines total variation and wavelet for image restoration. J. Inf. Process. 2002, 2, 1–10.
31. Frick, K.; Marnitz, P.; Munk, A. Statistical multiresolution Dantzig estimation in imaging: Fundamental concepts and algorithmic framework. Electron. J. Stat. 2012, 6, 231–268.
32. Frick, K.; Marnitz, P.; Munk, A. Statistical multiresolution estimation for variational imaging: With an application in Poisson-biophotonics. J. Math. Imaging Vis. 2013, 46, 370–387.
33. Del Álamo, M.; Munk, A. Total variation multiscale estimators for linear inverse problems. Inf. Inference J. IMA 2019.
34. Plotz, T.; Roth, S. Benchmarking denoising algorithms with real photographs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1586–1595.
35. Munk, A.; Bissantz, N.; Wagner, T.; Freitag, G. On difference-based variance estimation in nonparametric regression when the covariate is high dimensional. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 19–41.
36. Frick, K.; Marnitz, P.; Munk, A. Shape-constrained regularization by statistical multiresolution for inverse problems: Asymptotic analysis. Inverse Probl. 2012, 28, 065006.
37. Lebert, J.; Künneke, L.; Hagemann, J.; Kramer, S.C. Parallel Statistical Multi-resolution Estimation. arXiv 2015, arXiv:1503.03492.
38. Kramer, S.C.; Hagemann, J.; Künneke, L.; Lebert, J. Parallel statistical multiresolution estimation for image reconstruction. SIAM J. Sci. Comput. 2016, 38, C533–C559.
39. Morken, A.F. An algorithmic Framework for Multiresolution Based Non-Parametric Regression. Master’s Thesis, NTNU, Trondheim, Norway, 2017.
40. Luke, D.R.; Shefi, R. A globally linearly convergent method for pointwise quadratically supportable convex-concave saddle point problems. J. Math. Anal. Appl. 2018, 457, 1568–1590.
41. Chambolle, A.; Pock, T. A first-order primal-dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 2011, 40, 120–145.
42. Hintermüller, M. Semismooth Newton Methods and Applications. In Lecture Notes for the Oberwolfach-Seminar on “Mathematics of PDE-Constrained Optimization”; Department of Mathematics, Humboldt-University of Berlin: Berlin, Germany, 2010.
43. Clason, C.; Kruse, F.; Kunisch, K. Total variation regularization of multi-material topology optimization. ESAIM Math. Model. Numer. Anal. 2018, 52, 275–303.
44. Lepskii, O.V. On a Problem of Adaptive Estimation in Gaussian White Noise. Theory Probab. Appl. 1991, 35, 454–466.
45. Donoho, D.L.; Johnstone, I.M.; Kerkyacharian, G.; Picard, D. Wavelet shrinkage: Asymptopia? J. R. Stat. Soc. Ser. B 1995, 57, 301–369. With discussion and a reply by the authors.
46. Donoho, D.L.; Johnstone, I.M. Adapting to unknown smoothness via wavelet shrinkage. J. Am. Stat. Assoc. 1995, 90, 1200–1224.
47. Weyrich, N.; Warhola, G.T. Wavelet shrinkage and generalized cross validation for image denoising. IEEE Trans. Image Process. 1998, 7, 82–90.
48. Härdle, W.; Kerkyacharian, G.; Picard, D.; Tsybakov, A. Wavelets, Approximation, and Statistical Applications; Lecture Notes in Statistics; Springer: New York, NY, USA, 1998; Volume 129, p. xviii+265.
49. Cai, T.T. On block thresholding in wavelet regression: Adaptivity, block size, and threshold level. Stat. Sin. 2002, 12, 1241–1273.
50. Zhang, C.H. General empirical Bayes wavelet methods and exactly adaptive minimax estimation. Ann. Stat. 2005, 33, 54–100.
51. Abramovich, F.; Benjamini, Y.; Donoho, D.L.; Johnstone, I.M. Adapting to unknown sparsity by controlling the false discovery rate. Ann. Stat. 2006, 34, 584–653.
52. Cai, T.T.; Zhou, H.H. A data-driven block thresholding approach to wavelet estimation. Ann. Stat. 2009, 37, 569–595.
53. Haltmeier, M.; Munk, A. Extreme value analysis of empirical frame coefficients and implications for denoising by soft-thresholding. Appl. Comput. Harmon. Anal. 2014, 36, 434–460.
54. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 2015.
55. Ekeland, I.; Témam, R. Convex Analysis and Variational Problems, English ed.; Volume 28, Classics in Applied Mathematics; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 1999; p. xiv+402.
56. Nesterov, Y.; Nemirovsky, A. Interior-Point Polynomial Algorithms in Convex Programming; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 1994; p. ix+396.
57. Chambolle, A. An Algorithm for Total Variation Minimization and Applications. J. Math. Imaging Vis. 2004, 20, 89–97.
58. Powell, M.J.D. A method for nonlinear constraints in minimization problems. In Optimization (Sympos., Univ. Keele, Keele, 1968); Academic Press: London, UK, 1969; pp. 283–298.
59. Hestenes, M.R. Multiplier and gradient methods. J. Optim. Theory Appl. 1969, 4, 303–320.
60. Dykstra, R.L. An algorithm for restricted least squares regression. J. Am. Stat. Assoc. 1983, 78, 837–842.
61. Boyle, J.P.; Dykstra, R.L. A method for finding projections onto the intersection of convex sets in Hilbert spaces. In Advances in Order Restricted Statistical Inference; Springer: Berlin/Heidelberg, Germany, 1986; pp. 28–47.
62. Deutsch, F.; Hundal, H. The rate of convergence of Dykstra’s cyclic projections algorithm: The polyhedral case. Numer. Funct. Anal. Optim. 1994, 15, 537–565.
63. Birgin, E.G.; Raydan, M. Robust stopping criteria for Dykstra’s algorithm. SIAM J. Sci. Comput. 2005, 26, 1405–1414.
64. Deng, W.; Yin, W. On the global and linear convergence of the generalized alternating direction method of multipliers. J. Sci. Comput. 2016, 66, 889–916.
65. Hintermüller, M.; Kunisch, K. Path-following methods for a class of constrained minimization problems in function space. SIAM J. Optim. 2006, 17, 159–187.
66. Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the IEEE 2010 20th International Conference on Pattern Recognition, New York, NY, USA, 23–26 July 2010; pp. 2366–2369.
67. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
68. Sheikh, H.R.; Bovik, A.C. Image information and visual quality. IEEE Trans. Image Process. 2006, 15, 430–444.
69. Pock, T.; Chambolle, A. Diagonal preconditioning for first order primal-dual algorithms in convex optimization. In Proceedings of the IEEE 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 1762–1769.
70. Giewekemeyer, K.; Krueger, S.P.; Kalbfleisch, S.; Bartels, M.; Salditt, T.; Beta, C. X-ray propagation microscopy of biological cells using waveguides as a quasipoint source. Phys. Rev. A 2011, 83.
71. Liu, X.; Tanaka, M.; Okutomi, M. Single-image noise level estimation for blind denoising. IEEE Trans. Image Process. 2013, 22, 5226–5237.
72. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155.
73. Frick, K.; Munk, A.; Sieling, H. Multiscale change point inference. J. R. Stat. Soc. Ser. B Stat. Methodol. 2014, 76, 495–580. With 32 discussions by 47 authors and a rejoinder by the authors.
74. Donoho, D.L. Nonlinear solution of linear inverse problems by wavelet–vaguelette decomposition. Appl. Comput. Harmon. Anal. 1995, 2, 101–126.
75. Brown, L.D.; Levine, M. Variance estimation in nonparametric regression via the difference sequence method. Ann. Stat. 2007, 35, 2219–2232.
76. König, C.; Munk, A.; Werner, F. Multidimensional multiscale scanning in exponential families: Limit theory and statistical consequences. Ann. Stat. 2020, 48, 655–678.
Algorithm | Dependence on Initialisation | Theor. Convergence Speed | Practical Performance |
---|---|---|---|
Chambolle–Pock | no | linear | Good for smooth and nonsmooth R |
ADMM | no | linear | Too slow |
Semismooth Newton | yes | superlinear | Good for smooth R; unstable otherwise |

Image | Metric | Dyadic Cubes | Small Cubes | Wavelets | Curvelets | Shearlets |
---|---|---|---|---|---|---|
“brain” | MISE | 0.00176 (5.4 ) | 0.00149 (1.9 ) | 0.00147 (2.9 ) | 0.0013 (2.5 ) | 0.000871 (3 ) |
“brain” | PSNR | 27.5 (0.013) | 28.3 (0.0055) | 28.3 (0.085) | 28.9 (0.086) | 30.6 (0.015) |
“brain” | SSIM | 0.871 (3.1 ) | 0.806 (0.002) | 0.872 (0.0023) | 0.852 (0.0091) | 0.715 (0.0043) |
“brain” | VIF | 0.726 (0.00058) | 0.823 (0.00036) | 0.767 (0.0021) | 0.809 (0.0022) | 0.852 (0.00086) |
“cell” | MISE | 0.000671 (1.3 ) | 0.000438 (8.5 ) | 0.000509 (5 ) | 0.000554 (1.3 ) | 0.000412 (2.6 ) |
“cell” | PSNR | 31.7 (0.0082) | 33.6 (0.0084) | 32.9 (0.0043) | 32.6 (0.11) | 33.9 (0.027) |
“cell” | SSIM | 0.912 (0.001) | 0.859 (0.008) | 0.924 (8.9 ) | 0.841 (0.018) | 0.636 (0.0091) |
“cell” | VIF | 0.86 (8.2 ) | 0.913 (0.0004) | 0.884 (0.0004) | 0.888 (0.00057) | 0.917 (0.00037) |
“BIRN” | MISE | 0.00301 (1.6 ) | 0.00266 (1.4 ) | 0.00269 (1 ) | 0.00237 (7.1 ) | 0.00167 (8.6 ) |
“BIRN” | PSNR | 25.2 (0.0023) | 25.7 (0.023) | 25.7 (0.016) | 26.2 (0.013) | 27.8 (0.022) |
“BIRN” | SSIM | 0.791 (0.00012) | 0.802 (0.00053) | 0.81 (6.5 ) | 0.82 (0.00075) | 0.858 (0.00076) |
“BIRN” | VIF | 0.718 (0.0018) | 0.811 (0.0018) | 0.739 (0.0024) | 0.762 (0.00096) | 0.838 (0.00058) |
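
The MISE and PSNR rows of the table above appear to be linked by the usual PSNR definition (see [66]). A minimal sketch of that conversion follows, assuming image intensities scaled to [0, 1] so that the peak value is 1; this scaling is an assumption inferred from the tabulated numbers, and the function name `psnr_from_mse` is hypothetical.

```python
import math


def psnr_from_mse(mse, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error.

    Assumption: `peak` is the maximal possible intensity (1.0 for images
    scaled to [0, 1]).
    """
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(peak ** 2 / mse)
```

Under this assumption, `psnr_from_mse(0.00176)` evaluates to roughly 27.5 dB and `psnr_from_mse(0.000871)` to roughly 30.6 dB, in line with the “brain” entries for dyadic cubes and shearlets.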
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
del Alamo, M.; Li, H.; Munk, A.; Werner, F. Variational Multiscale Nonparametric Regression: Algorithms and Implementation. Algorithms 2020, 13, 296. https://doi.org/10.3390/a13110296