Search Results (28)

Search Parameters:
Keywords = ℓ1 regression

32 pages, 594 KB  
Article
Design-Aware Predictive and Causal Modeling of Cardiovascular Risk in Chronic Kidney Disease Using Penalized and Double Machine Learning Approaches
by Fernando Rojas, Axa Tapia and Hilda Espinoza
Mathematics 2026, 14(9), 1554; https://doi.org/10.3390/math14091554 - 4 May 2026
Viewed by 219
Abstract
We develop a design-aware framework that combines penalized prediction and causal inference for finite populations observed through complex survey designs. The framework integrates survey-weighted pseudo-likelihoods, ℓ1-penalized estimation, Neyman-orthogonal moment functions, and a bootstrap procedure that resamples primary sampling units within strata. Methodologically, the contribution is an explicit pipeline that supports design-based inference while separating predictive associations from structurally adjusted effects in high-dimensional, clustered data. We illustrate the framework using data from the Chilean National Health Survey (ENS) 2016–2017 to study the relationship between chronic kidney disease (CKD) and high cardiovascular (CV) risk. In the ENS adult population, the survey-weighted prevalence of CKD was 3.1% (95% CI: 2.4–3.8), and the prevalence of high CV risk was 23.9% (95% CI: 21.5–26.3). High CV risk was markedly more frequent among individuals with CKD than among those without CKD (90.9% versus 21.5%). Predictive and associational analyses combined survey-weighted penalized logistic regression (LASSO) with refitted unpenalized models. In conventional survey-weighted logistic regressions, CKD showed a strong association with high CV risk (odds ratio = 5.66; 95% CI: 2.71–11.82; p < 0.001), and effect sizes remained stable after LASSO-based variable selection. To assess causal relevance under confounding and potential endogeneity, we implemented two endogeneity-aware estimators: two-stage residual inclusion (2SRI) and double/debiased machine learning (DML). The DML estimator, defined as the primary causal estimand, reports an orthogonalized estimate of the average treatment effect of CKD on the probability of high CV risk. After adjustment for age and major cardiometabolic comorbidities, the DML estimate was attenuated and statistically non-significant (average treatment effect = −0.094; 95% CI: [−0.409, 0.220]). The 2SRI approach yielded unstable estimates with wide confidence intervals, consistent with the limited effective sample size of CKD cases (n_CKD ≈ 190 in a sample with n ≈ 6233) and weak identification conditions under low-prevalence settings. Simulation experiments under ENS-like complex sampling suggest that naive predictive associations may overestimate the structural contribution of CKD under confounding, whereas orthogonalized estimators yield more conservative estimates when identification holds. The causal interpretation relies on a conditional mean independence assumption given observed covariates and survey design, while control-function specifications are treated as diagnostic sensitivity analyses due to the absence of credible exclusion-based instruments. Overall, the results demonstrate a fundamental divergence between predictive relevance and causal importance in finite-population settings, underscoring the need for design-aware and endogeneity-robust methods in statistical modeling. Full article
(This article belongs to the Special Issue Applied Probability and Statistics: Theory, Methods, and Applications)

24 pages, 36350 KB  
Article
Partial Multi-Label Feature Selection via Entropy-Weighted Multi-Scale Neighborhood Granular Label Distribution Learning
by Yifan Cao, Mao Li, Cong Wang, Shuyu Fan, Ziqiao Yin and Binghui Guo
Entropy 2026, 28(4), 422; https://doi.org/10.3390/e28040422 - 9 Apr 2026
Viewed by 342
Abstract
Partial multi-label feature selection aims to identify discriminative features from data where each instance is associated with an ambiguous candidate label set. Existing methods are typically built upon single-scale modeling assumptions and may fail to fully exploit the multi-granularity structure underlying instance–label relationships. To address this limitation, we propose a novel framework termed PML-FSMNG, which integrates entropy-weighted multi-scale neighborhood granules with label distribution learning. Specifically, multi-scale neighborhood systems are constructed to estimate label distinguishability at multiple structural scales, and Shannon entropy is employed to adaptively fuse scale-specific label distributions into a robust soft supervisory signal. Based on the learned label distribution, an embedded sparse regression model with ℓ2,1-norm regularization is developed for discriminative feature selection, together with an entropy-regularized adaptive graph learning mechanism to preserve intrinsic geometric structure. Extensive experiments on benchmark datasets demonstrate that the proposed method consistently outperforms several state-of-the-art approaches, validating the effectiveness of multi-scale modeling and entropy-guided adaptive learning under label ambiguity. Full article
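For readers unfamiliar with the ℓ2,1-norm mentioned in this abstract, the following is a minimal sketch of the penalty itself, not of the PML-FSMNG method. The function names `l21_norm` and `selected_rows` and the toy matrix are illustrative assumptions; each row of `W` is assumed to hold one feature's coefficients across all labels.

```python
import math

def l21_norm(W):
    """Sum over rows of the Euclidean (l2) norm of each row."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)

def selected_rows(W, tol=0.0):
    """Indices of rows (features) whose l2 norm exceeds tol."""
    return [i for i, row in enumerate(W)
            if math.sqrt(sum(w * w for w in row)) > tol]

# Hypothetical 3-feature, 2-label coefficient matrix.
W = [[3.0, 4.0],   # row norm 5
     [0.0, 0.0],   # row norm 0: this feature is dropped for every label
     [0.0, -2.0]]  # row norm 2
print(l21_norm(W))       # 7.0
print(selected_rows(W))  # [0, 2]
```

Because the penalty sums unsquared row norms, minimizing it tends to zero out whole rows, which is why it is commonly used to discard a feature jointly across all labels.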

35 pages, 20162 KB  
Article
An Efficient and Sparse Kernelized Grey RVFL Network for Energy Forecasting
by Wenkang Gong and Gaofeng Zong
Systems 2026, 14(3), 257; https://doi.org/10.3390/systems14030257 - 28 Feb 2026
Viewed by 335
Abstract
Reliable energy forecasting is essential for the planning and dispatch of power and fuel systems; however, energy series are often short and exhibit pronounced nonlinearity. To tackle this small sample setting, we propose a gray random vector functional link (GRVFL) framework and further derive a kernelized variant (KGRVFL). In GRVFL, an RVFL network is integrated into gray system modeling, and the parameters are learned via sparsity-regularized regression, enabling stable and reproducible training without backpropagation or evolutionary optimization. Hyperparameters are tuned using Bayesian optimization driven by a Top-k mean absolute percentage error (Top-k MAPE) criterion to improve robustness. To further promote compactness, we introduce a fractional ratio-type Fr-1 penalty and solve the resulting problem efficiently using a fractional coordinate descent (FCD) algorithm. The proposed methods are assessed on six real-world energy datasets using eight evaluation metrics. Comparisons with nine gray model baselines and six machine learning forecasters demonstrate that the sparse KGRVFL (SKGRVFL) achieves higher predictive accuracy and improved training stability under small sample conditions. Full article
(This article belongs to the Section Systems Engineering)

29 pages, 1017 KB  
Article
Bayesian Elastic Net Cox Models for Time-to-Event Prediction: Application to a Breast Cancer Cohort
by Ersin Yılmaz, Syed Ejaz Ahmed and Dursun Aydın
Entropy 2026, 28(3), 264; https://doi.org/10.3390/e28030264 - 27 Feb 2026
Viewed by 662
Abstract
High-dimensional survival analyses require calibrated risk and measurable uncertainty, but standard elastic net Cox models provide only point estimates. We develop a Bayesian elastic net Cox (BEN–Cox) model for high-dimensional proportional hazards regression that places a hierarchical global–local shrinkage prior on coefficients and performs full Bayesian inference via Hamiltonian Monte Carlo. We represent the elastic net penalty as a global–local Gaussian scale mixture with hyperpriors that learn the ℓ1/ℓ2 trade-off, enabling adaptive sparsity that preserves correlated gene groups; using HMC with the Cox partial likelihood, we obtain full posterior distributions for hazard ratios and patient-level survival curves. Methodologically, we formalize a Bayesian analogue of the elastic net grouping effect at the posterior mode and establish posterior contraction under sparsity for the Cox partial likelihood, supporting the stability of the resulting risk scores. On the METABRIC breast cancer cohort (n = 1903; p = 440 gene-level features after preprocessing, derived from an Illumina HT-12 array with ≈24,000 probes at the raw feature level), BEN–Cox achieves slightly lower prediction error, higher discrimination, and better global calibration than tuned ridge Cox, lasso Cox, and elastic net Cox baselines on a held-out test set. Posterior summaries provide credible intervals for hazard ratios and identify a compact gene panel that remains biologically plausible. BEN–Cox provides an uncertainty-aware alternative to tuned penalized Cox models with theoretical support, offering modest improvements in calibration and providing an interpretable sparse signature in highly correlated survival data. Full article

25 pages, 16590 KB  
Article
Adaptive Bayesian System Identification for Long-Term Forecasting of Industrial Load and Renewables Generation
by Lina Sheng, Zhixian Wang, Xiaowen Wang and Linglong Zhu
Electronics 2026, 15(3), 530; https://doi.org/10.3390/electronics15030530 - 26 Jan 2026
Viewed by 378
Abstract
The expansion of renewables in modern power systems and the coordinated development of upstream and downstream industrial chains are promoting a shift on the utility side from traditional settlement by energy toward operation driven by data and models. Industrial electricity consumption data exhibit pronounced multi-scale temporal structures and sectoral heterogeneity, which makes unified long-term load and generation forecasting while maintaining accuracy, interpretability, and scalability a challenge. From a modern system identification perspective, this paper proposes a System Identification in Adaptive Bayesian Framework (SIABF) for medium- and long-term industrial load forecasting based on daily freeze electricity time series. By combining daily aggregation of high-frequency data, frequency domain analysis, sparse identification, and long-term extrapolation, we first construct daily freeze series from 15 min measurements, and then we apply discrete Fourier transforms and a spectral complexity index to extract dominant periodic components and build an interpretable sinusoidal basis library. A sparse regression formulation with ℓ1 regularization is employed to select a compact set of key basis functions, yielding concise representations of sector and enterprise load profiles and naturally supporting multivariate and joint multi-sector modeling. Building on this structure, we implement a state-space-implicit physics-informed Bayesian forecasting model and evaluate it on real data from three representative sectors, namely, steel, photovoltaics, and chemical, using one year of 15 min measurements. Under a one-month-ahead evaluation, the proposed framework attains a Mean Absolute Percentage Error (MAPE) of 4.5% for a representative PV-related customer case and achieves low single-digit MAPE for high-inertia sectors, often outperforming classical statistical models, sparse learning baselines, and deep learning architectures. These results should be interpreted as indicative given the limited time span and sample size, and broader multi-year, population-level validation is warranted. Full article
(This article belongs to the Section Systems & Control Engineering)

14 pages, 891 KB  
Article
A Multi-Task Ensemble Strategy for Gene Selection and Cancer Classification
by Suli Lin, Zhizhe Lin, Jin Zhang and Man-Fai Leung
Bioengineering 2025, 12(11), 1245; https://doi.org/10.3390/bioengineering12111245 - 13 Nov 2025
Viewed by 779
Abstract
Gene expression-based tumor classification aims to distinguish tumor types based on gene expression profiles. This task is difficult due to the high dimensionality of gene expression data and limited sample sizes. Most datasets contain tens of thousands of genes but only a small number of samples. As a result, selecting informative genes is necessary to improve classification performance and model interpretability. Many existing gene selection methods fail to produce stable and consistent results, especially when training data are limited. To address this, we propose a multi-task ensemble strategy that combines repeated sampling with joint feature selection and classification. The method generates multiple training subsets and applies multi-task logistic regression with ℓ2,1 group sparsity regularization to select a subset of genes that appears consistently across tasks. This promotes stability and reduces redundancy. The framework supports integration with standard classifiers such as logistic regression and support vector machines. It performs both gene selection and classification in a single process. We evaluate the method on simulated and real gene expression datasets. The results show that it outperforms several baseline methods in classification accuracy and the consistency of selected genes. Full article
(This article belongs to the Section Biosignal Processing)

25 pages, 492 KB  
Article
Federated Logistic Regression with Enhanced Privacy: A Dynamic Gaussian Perturbation Approach via ADMM from an Information-Theoretic Perspective
by Jie Yuan, Yue Wang, Hao Ma and Wentao Liu
Entropy 2025, 27(11), 1148; https://doi.org/10.3390/e27111148 - 12 Nov 2025
Cited by 1 | Viewed by 737
Abstract
Federated learning enables distributed model training across edge nodes without direct raw data sharing, but model parameter transmission still poses significant privacy risks. To address this vulnerability, a Distributed Logistic Regression Gaussian Perturbation (DLGP) algorithm is proposed, which integrates the Alternating Direction Method of Multipliers (ADMM) with a calibrated differential privacy mechanism. The centralized logistic regression problem is decomposed into local subproblems that are solved independently on edge nodes, where only perturbed model parameters are shared with a central server. The Gaussian noise injection mechanism is designed to optimize the privacy–utility trade-off by introducing calibrated uncertainty into parameter updates, effectively obscuring sensitive information while preserving essential model characteristics. The ℓ2-sensitivity of local updates is derived, and a rigorous (ϵ,δ)-differential privacy guarantee is provided. Evaluations are conducted on a real-world dataset, and it is demonstrated that DLGP maintains favorable performance across varying privacy budgets, numbers of nodes, and penalty parameters. Full article
(This article belongs to the Section Information Theory, Probability and Statistics)
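As background for the (ϵ,δ) guarantee mentioned in this abstract, the sketch below calibrates noise with the classical Gaussian mechanism bound σ = Δ2 · sqrt(2 ln(1.25/δ)) / ϵ, which is valid for 0 < ϵ < 1. This is a textbook calibration, not the DLGP algorithm's dynamic perturbation scheme; the names `gaussian_sigma` and `perturb` and the example values are assumptions made for illustration.

```python
import math
import random

def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (needs 0 < epsilon < 1)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("this classical bound assumes 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

def perturb(params, l2_sensitivity, epsilon, delta, rng=None):
    """Add independent N(0, sigma^2) noise to every parameter before sharing."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [w + rng.gauss(0.0, sigma) for w in params]

# Hypothetical local update with l2-sensitivity 0.1, shared under (0.5, 1e-5)-DP.
noisy = perturb([0.3, -1.2, 0.7], 0.1, 0.5, 1e-5, random.Random(0))
```

A smaller ϵ or a larger sensitivity raises σ proportionally, which is the privacy–utility trade-off the abstract refers to.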

23 pages, 21197 KB  
Article
DLPLSR: Dual Label Propagation-Driven Least Squares Regression with Feature Selection for Semi-Supervised Learning
by Shuanghao Zhang, Zhengtong Yang and Zhaoyin Shi
Mathematics 2025, 13(14), 2290; https://doi.org/10.3390/math13142290 - 16 Jul 2025
Cited by 1 | Viewed by 1070
Abstract
In the real world, most data are unlabeled, which drives the development of semi-supervised learning (SSL). Among SSL methods, least squares regression (LSR) has attracted attention for its simplicity and efficiency. However, existing semi-supervised LSR approaches suffer from challenges such as the insufficient use of unlabeled data, low pseudo-label accuracy, and inefficient label propagation. To address these issues, this paper proposes dual label propagation-driven least squares regression with feature selection, named DLPLSR, which is a pseudo-label-free SSL framework. DLPLSR employs a fuzzy-graph-based clustering strategy to capture global relationships among all samples, and manifold regularization preserves local geometric consistency, so that it implements the dual label propagation mechanism for comprehensive utilization of unlabeled data. Meanwhile, a dual-feature selection mechanism is established by integrating orthogonal projection for maximizing feature information with an ℓ2,1-norm regularization for eliminating redundancy, thereby jointly enhancing the discriminative power. Benefiting from these two designs, DLPLSR boosts learning performance without pseudo-labeling. Finally, the objective function admits an efficient closed-form solution solvable via an alternating optimization strategy. Extensive experiments on multiple benchmark datasets show the superiority of DLPLSR compared to state-of-the-art LSR-based SSL methods. Full article
(This article belongs to the Special Issue Machine Learning and Optimization for Clustering Algorithms)

40 pages, 12261 KB  
Article
Integrating Reliability, Uncertainty, and Subjectivity in Design Knowledge Flow: A CMZ-BENR Augmented Framework for Kansei Engineering
by Haoyi Lin, Pohsun Wang, Jing Liu and Chiawei Chu
Symmetry 2025, 17(5), 758; https://doi.org/10.3390/sym17050758 - 14 May 2025
Viewed by 1477
Abstract
As a knowledge-intensive activity, the Kansei engineering (KE) process encounters numerous challenges in the design knowledge flow, primarily due to issues related to information reliability, uncertainty, and subjectivity. Bridging this gap, this study introduces an advanced KE framework integrating a cloud model with Z-numbers (CMZ) and Bayesian elastic net regression (BENR). In stage-I of this KE, data mining techniques are employed to process online user reviews, coupled with a similarity analysis of affective word clusters to identify representative emotional descriptors. During stage-II, the CMZ algorithm refines K-means clustering outcomes for market-representative product forms, enabling precise feature characterization and experimental prototype development. Stage-III addresses linguistic uncertainties in affective modeling through CMZ-augmented semantic differential questionnaires, achieving a multi-granular representation of subjective evaluations. Subsequently, stage-IV employs BENR for automated hyperparameter optimization in design knowledge inference, eliminating manual intervention. The framework’s efficacy is empirically validated through a domestic cleaning robot case study, demonstrating superior performance in resolving multiple information processing challenges via comparative experiments. Results confirm that this KE framework significantly improves uncertainty management in design knowledge flow compared to conventional implementations. Furthermore, by leveraging the intrinsic symmetry of the normal cloud model with Z-numbers distributions and the balanced ℓ1/ℓ2 regularization of BENR, CMZ–BENR framework embodies the principle of structural harmony. Full article
(This article belongs to the Special Issue Fuzzy Set Theory and Uncertainty Theory—3rd Edition)

21 pages, 4425 KB  
Article
The Prediction Performance Analysis of the Lasso Model with Convex Non-Convex Sparse Regularization
by Wei Chen, Qiuyue Liu, Hancong Li and Jian Zou
Algorithms 2025, 18(4), 195; https://doi.org/10.3390/a18040195 - 1 Apr 2025
Viewed by 1460
Abstract
The incorporation of ℓ1 regularization in Lasso regression plays a crucial role by inducing convexity to the objective function, thereby facilitating its minimization; when compared to non-convex regularization, the utilization of ℓ1 regularization introduces bias through artificial coefficient shrinkage towards zero. Recently, the convex non-convex (CNC) regularization framework has emerged as a powerful technique that enables the incorporation of non-convex regularization terms while maintaining the overall convexity of the optimization problem. Although this method has shown remarkable performance in various empirical studies, its theoretical understanding is still relatively limited. In this paper, we provide a theoretical investigation into the prediction performance of the Lasso model with CNC sparse regularization. By leveraging oracle inequalities, we establish a tighter upper bound on prediction performance compared to the traditional ℓ1 regularizer. Additionally, we propose an alternating direction method of multipliers (ADMM) algorithm to efficiently solve the proposed model and rigorously analyze its convergence property. Our numerical results, evaluated on both synthetic data and real-world magnetic resonance imaging (MRI) reconstruction tasks, confirm the superior effectiveness of our proposed approach. Full article
(This article belongs to the Section Analysis of Algorithms and Complexity Theory)
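The shrinkage bias of ℓ1 regularization described in this abstract can be seen in the soft-thresholding operator, the proximal map of λ|x|. The sketch below is a generic illustration, not the CNC model or the ADMM solver from the paper; `soft_threshold`, `hard_threshold`, and the sample values are assumed names and inputs.

```python
def soft_threshold(z, lam):
    """Proximal operator of lam * |x|: moves z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def hard_threshold(z, lam):
    """Non-convex counterpart: keeps z unchanged when |z| > lam, else zero."""
    return z if abs(z) > lam else 0.0

for z in (5.0, -3.0, 0.5):
    # Large coefficients lose lam under soft thresholding (the bias),
    # while hard thresholding leaves them untouched.
    print(z, soft_threshold(z, 1.0), hard_threshold(z, 1.0))
```

Both operators set small inputs to zero, but only the soft version subtracts λ from every surviving coefficient; reducing that bias while keeping the overall objective convex is the motivation the abstract gives for CNC penalties.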

16 pages, 470 KB  
Article
Distributed Estimation for ℓ0-Constrained Quantile Regression Using Iterative Hard Thresholding
by Zhihe Zhao and Heng Lian
Mathematics 2025, 13(4), 669; https://doi.org/10.3390/math13040669 - 18 Feb 2025
Cited by 1 | Viewed by 1183
Abstract
Distributed frameworks for statistical estimation and inference have become a critical toolkit for analyzing massive data efficiently. In this paper, we present distributed estimation for high-dimensional quantile regression with an ℓ0 constraint using iterative hard thresholding (IHT). We propose a communication-efficient distributed estimator which is linearly convergent to the true parameter up to the statistical precision of the model, despite the fact that the check loss minimization problem with an ℓ0 constraint is neither strongly smooth nor convex. The distributed estimator we develop can achieve the same convergence rate as the estimator based on the whole data set under suitable assumptions. In our simulations, we illustrate the convergence of the estimators under different settings and also demonstrate the accuracy of nonzero parameter identification. Full article
(This article belongs to the Section D1: Probability and Statistics)
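To make the idea of iterative hard thresholding on the check loss concrete, here is a minimal single-machine sketch: one subgradient step followed by keeping the s largest-magnitude coefficients. It is not the paper's communication-efficient distributed estimator, and the function names, the step size, and the toy data are all assumptions for illustration.

```python
def hard_threshold(beta, s):
    """Keep the s largest-magnitude entries of beta and zero the rest."""
    order = sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)
    keep = set(order[:s])
    return [b if j in keep else 0.0 for j, b in enumerate(beta)]

def check_loss_subgradient(X, y, beta, tau):
    """A subgradient of the average check loss rho_tau(y - x'beta) w.r.t. beta."""
    n, p = len(y), len(beta)
    grad = [0.0] * p
    for xi, yi in zip(X, y):
        r = yi - sum(a * b for a, b in zip(xi, beta))
        w = tau - (1.0 if r < 0 else 0.0)  # derivative of rho_tau in r
        for j in range(p):
            grad[j] -= w * xi[j] / n       # chain rule: dr/dbeta_j = -x_ij
    return grad

def iht_step(X, y, beta, tau, s, step):
    """One IHT update: subgradient step, then projection onto s-sparse vectors."""
    g = check_loss_subgradient(X, y, beta, tau)
    return hard_threshold([b - step * gj for b, gj in zip(beta, g)], s)

# Hypothetical toy data: two observations, two covariates, median regression.
X = [[1.0, 0.0], [1.0, 0.0]]
y = [1.0, 1.0]
beta = iht_step(X, y, [0.0, 0.0], tau=0.5, s=1, step=1.0)
```

Repeating `iht_step` gives a basic IHT loop; the hard-thresholding projection is what enforces the ℓ0 constraint at every iteration.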

16 pages, 293 KB  
Article
Adaptive CoCoLasso for High-Dimensional Measurement Error Models
by Qin Yu
Entropy 2025, 27(2), 97; https://doi.org/10.3390/e27020097 - 21 Jan 2025
Cited by 1 | Viewed by 1421
Abstract
A significant portion of theoretical and empirical studies in high-dimensional regression have primarily concentrated on clean datasets. However, in numerous practical scenarios, data are often corrupted by missing values and measurement errors, which cannot be ignored. Despite the substantial progress in high-dimensional regression with contaminated covariates, methods that achieve an effective trade-off among prediction accuracy, feature selection, and computational efficiency remain significantly underexplored. We introduce adaptive convex conditioned Lasso (Adaptive CoCoLasso), offering a new approach that can handle high-dimensional linear models with error-prone measurements. This estimator combines a projection onto the nearest positive semi-definite matrix with an adaptively weighted ℓ1 penalty. Theoretical guarantees are provided by establishing error bounds for the estimators. The results from the synthetic data analysis indicate that the Adaptive CoCoLasso performs strongly in prediction accuracy and mean squared error, particularly in scenarios involving both additive and multiplicative noise in measurements. While the Adaptive CoCoLasso estimator performs comparably or is slightly outperformed by certain methods, such as Hard, in reducing the number of incorrectly identified covariates, its strength lies in offering a more favorable trade-off between prediction accuracy and sparse modeling. Full article
(This article belongs to the Special Issue Information-Theoretic Methods in Data Analytics)
14 pages, 406 KB  
Article
On the Adaptive Penalty Parameter Selection in ADMM
by Serena Crisci, Valentina De Simone and Marco Viola
Algorithms 2023, 16(6), 264; https://doi.org/10.3390/a16060264 - 25 May 2023
Cited by 7 | Viewed by 5289
Abstract
Many data analysis problems can be modeled as a constrained optimization problem characterized by nonsmooth functionals, often because of the presence of ℓ1-regularization terms. One of the most effective ways to solve such problems is through the Alternate Direction Method of Multipliers (ADMM), which has been proved to have good theoretical convergence properties even if the arising subproblems are solved inexactly. Nevertheless, experience shows that the choice of the parameter τ penalizing the constraint violation in the Augmented Lagrangian underlying ADMM affects the method’s performance. To this end, strategies for the adaptive selection of such parameter have been analyzed in the literature and are still of great interest. In this paper, starting from an adaptive spectral strategy recently proposed in the literature, we investigate the use of different strategies based on Barzilai–Borwein-like stepsize rules. We test the effectiveness of the proposed strategies in the solution of real-life consensus logistic regression and portfolio optimization problems. Full article
(This article belongs to the Special Issue Recent Advances in Nonsmooth Optimization and Analysis)

24 pages, 418 KB  
Article
Robust Variable Selection and Regularization in Quantile Regression Based on Adaptive-LASSO and Adaptive E-NET
by Innocent Mudhombo and Edmore Ranganai
Computation 2022, 10(11), 203; https://doi.org/10.3390/computation10110203 - 21 Nov 2022
Cited by 2 | Viewed by 2654
Abstract
Although the variable selection and regularization procedures have been extensively considered in the literature for the quantile regression (QR) scenario via penalization, many such procedures fail to deal with data aberrations in the design space, namely, high leverage points (X-space outliers) and collinearity challenges simultaneously. Some high leverage points referred to as collinearity influential observations tend to adversely alter the eigenstructure of the design matrix by inducing or masking collinearity. Therefore, in the literature, it is recommended that the problems of collinearity and high leverage points should be dealt with simultaneously. In this article, we suggest adaptive LASSO and adaptive E-NET penalized QR (QR-ALASSO and QR-AE-NET) procedures where the weights are based on a QR estimator as remedies. We extend this methodology to their penalized weighted QR versions of WQR-LASSO, WQR-E-NET procedures we had suggested earlier. In the literature, adaptive weights are based on the RIDGE regression (RR) parameter estimator. Although the use of this estimator may be plausible at the ℓ1 estimator (QR at τ = 0.5) for the symmetrical distribution, it may not be so at extreme quantile levels. Therefore, we use a QR-based estimator to derive adaptive weights. We carried out a comparative study of QR-LASSO, QR-E-NET, and the ones we suggest here, viz., QR-ALASSO, QR-AE-NET, weighted QR-ALASSO penalized and weighted QR adaptive AE-NET penalized (WQR-ALASSO and WQR-AE-NET) procedures. The simulation study results show that QR-ALASSO, QR-AE-NET, WQR-ALASSO and WQR-AE-NET generally outperform their nonadaptive counterparts. 
At predictor matrices with collinearity inducing points under normality, the QR-ALASSO and QR-AE-NET, respectively, outperform the non-adaptive procedures in the unweighted scenarios, as follows: in all 16 cases (100%) with respect to correctly selected (shrunk) zero coefficients; in 88% with respect to correctly fitted models; and in 81% with respect to prediction. In the weighted penalized WQR scenarios, WQR-ALASSO and WQR-AE-NET outperform their non-adaptive versions as follows: in 75% of the time with respect to both correctly fitted models and correctly shrunk zero coefficients and in 63% with respect to prediction. At predictor matrices with collinearity masking points under normality, the QR-ALASSO and QR-AE-NET, respectively, outperform the non-adaptive procedures in the unweighted scenarios as follows: in prediction, in 100% and 88% of the time; with respect to correctly fitted models in 100% and 50% (while in 50% equally); and with respect to correctly shrunk zero coefficients in 100% of the time. In the weighted scenario, WQR-ALASSO and WQR-AE-NET outperform their respective non-adaptive versions as follows; with respect to prediction, both in 63% of the time; with respect to correctly fitted models, in 88% of the time while with respect to correctly shrunk zero coefficients in 100% of the time. At predictor matrices with collinearity inducing points under the t-distribution, the QR-ALASSO and QR-AE-NET procedures outperform their respective non-adaptive procedures in the unweighted scenarios as follows: in prediction, in 100% and 75% of the time; with respect to correctly fitted models 88% of the time each; and with respect to correctly shrunk zero 88% and in 100% of the time. Additionally, the procedures WQR-ALASSO and WQR-AE-NET and their unweighted versions result in the former outperforming the latter in all respective cases with respect to prediction whilst there is no clear "winner" with respect to the other two measures. 
Overall, WQR-ALASSO generally outperforms all other models with respect to all measures. For the predictor matrix with collinearity-masking points under the t-distribution, all adaptive versions outperformed their respective non-adaptive versions with respect to all metrics. In the unweighted scenarios, QR-ALASSO and QR-AE-NET dominate their non-adaptive versions as follows: in prediction, 63% and 75% of the time; with respect to correctly fitted models, 100% and 38% of the time (and equally in 62%); and with respect to correctly shrunk zero coefficients, 100% of the time. In the weighted scenarios, all adaptive versions outperformed their non-adaptive versions 62% of the time in both respective cases with respect to prediction, while it is vice versa with respect to correctly fitted models and correctly shrunk zero coefficients. In the weighted scenarios, WQR-ALASSO and WQR-AE-NET dominate their respective non-adaptive versions as follows: with respect to correctly fitted models, 62% of the time; and with respect to correctly shrunk zero coefficients, 100% of the time in both cases. For the design matrix with both collinearity and high leverage points under the heavy-tailed distributions (t-distributions with d(1;6) degrees of freedom), the dominance of the adaptive procedures over the non-adaptive ones is again evident. In the unweighted scenarios, QR-ALASSO and QR-AE-NET outperform their non-adaptive versions as follows: in prediction, 75% and 62% of the time; with respect to correctly fitted models, 100% and 88% of the time; and with respect to correctly shrunk zero coefficients, 100% of the time in both cases.
In the weighted scenarios, WQR-ALASSO and WQR-AE-NET dominate their non-adaptive versions as follows: with respect to prediction, 100% of the time in both cases; and with respect to both correctly fitted models and correctly shrunk zero coefficients, 88% of the time in both cases. Results from applying the suggested procedures to real-life data sets are broadly in line with the simulation results.
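As a rough illustration of the idea in this abstract, the sketch below evaluates an adaptive-LASSO penalized quantile regression objective in plain Python. The abstract does not state the exact form of the adaptive weights, so the common choice w_j = 1 / |pilot_j|^gamma is assumed here, with the pilot coefficients standing in for a QR-based estimate. The function names, the `gamma` exponent, and the `eps` guard are all hypothetical and not taken from the article.

```python
from typing import List, Sequence


def check_loss(residual: float, tau: float) -> float:
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def adaptive_weights(pilot_coefs: Sequence[float],
                     gamma: float = 1.0,
                     eps: float = 1e-8) -> List[float]:
    """Assumed weight form w_j = 1 / (|pilot_j| + eps)^gamma.

    pilot_coefs stands in for a QR-based pilot estimate; eps only guards
    against division by zero and is an illustrative addition.
    """
    return [1.0 / (abs(b) + eps) ** gamma for b in pilot_coefs]


def adaptive_lasso_qr_objective(X: Sequence[Sequence[float]],
                                y: Sequence[float],
                                beta: Sequence[float],
                                tau: float,
                                lam: float,
                                weights: Sequence[float]) -> float:
    """Sum of check losses plus the weighted l1 penalty lam * sum_j w_j |beta_j|."""
    fit = 0.0
    for row, target in zip(X, y):
        prediction = sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        fit += check_loss(target - prediction, tau)
    penalty = lam * sum(w * abs(b) for w, b in zip(weights, beta))
    return fit + penalty
```

Coefficients with a small pilot estimate receive a large weight and are therefore penalized more heavily, which is what lets an adaptive penalty shrink likely-zero coefficients harder than a plain LASSO penalty would. Minimizing this objective would additionally require a linear-programming or similar solver, which is outside this sketch.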
(This article belongs to the Section Computational Engineering)
18 pages, 2004 KB  
Article
A Generalized Linear Joint Trained Framework for Semi-Supervised Learning of Sparse Features
by Juan Carlos Laria, Line H. Clemmensen, Bjarne K. Ersbøll and David Delgado-Gómez
Mathematics 2022, 10(16), 3001; https://doi.org/10.3390/math10163001 - 19 Aug 2022
Cited by 3 | Viewed by 2277
Abstract
The elastic net is among the most widely used types of regularization algorithms, commonly associated with the problem of supervised generalized linear model estimation via penalized maximum likelihood. Its attractive properties, originating from a combination of ℓ1 and ℓ2 norms, endow this method with the ability to select variables while taking into account the correlations between them. In the last few years, semi-supervised approaches that use both labeled and unlabeled data have become an important component in statistical research. Despite this interest, few researchers have investigated semi-supervised elastic net extensions. This paper introduces a novel solution for semi-supervised learning of sparse features in the context of generalized linear model estimation: the generalized semi-supervised elastic net (s2net), which extends the supervised elastic net method with a general mathematical formulation that covers, but is not limited to, both regression and classification problems. In addition, a flexible and fast implementation of s2net is provided. Its advantages are illustrated in different experiments using real and synthetic data sets, which show how s2net improves on the performance of other techniques that have been proposed for both supervised and semi-supervised learning.
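To make the "combination of ℓ1 and ℓ2 norms" concrete, here is a minimal Python sketch of the elastic net penalty term. It assumes the widely used parameterization lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2); the article may parameterize the penalty differently, and the function name and arguments are illustrative only.

```python
from typing import Sequence


def elastic_net_penalty(beta: Sequence[float], lam: float, alpha: float) -> float:
    """Elastic net penalty under an assumed (glmnet-style) parameterization.

    alpha = 1 gives the pure l1 (LASSO) penalty; alpha = 0 gives the pure
    squared-l2 (ridge) penalty; values in between mix the two.
    """
    l1_norm = sum(abs(b) for b in beta)
    l2_norm_squared = sum(b * b for b in beta)
    return lam * (alpha * l1_norm + 0.5 * (1.0 - alpha) * l2_norm_squared)
```

The ℓ1 part is what drives coefficients exactly to zero (variable selection), while the squared ℓ2 part tends to spread weight across correlated predictors rather than picking one arbitrarily.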
(This article belongs to the Section E: Applied Mathematics)