Article

Sparse HJ Biplot: A New Methodology via Elastic Net

by Mitzi Cubilla-Montilla 1,2,*, Ana Belén Nieto-Librero 3,4, M. Purificación Galindo-Villardón 3,4 and Carlos A. Torres-Cubilla 5

1 Departamento de Estadística, Facultad de Ciencias Naturales, Exactas y Tecnología, Universidad de Panamá, Panama City 0824, Panama
2 Sistema Nacional de Investigación, Secretaría Nacional de Ciencia, Tecnología e Innovación (SENACYT), Panama City 0824, Panama
3 Department of Statistics, University of Salamanca, 37008 Salamanca, Spain
4 Institute of Biomedical Research of Salamanca, 37008 Salamanca, Spain
5 Department of Data Analytics, Banco General, Panama City 07096, Panama
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(11), 1298; https://doi.org/10.3390/math9111298
Submission received: 4 April 2021 / Revised: 2 June 2021 / Accepted: 2 June 2021 / Published: 5 June 2021

Abstract: The HJ biplot is a multivariate analysis technique that allows us to represent both individuals and variables in a space of reduced dimensions. To adapt this approach to massive datasets, it is necessary to implement new techniques that are capable of reducing the dimensionality of the data and improving the interpretation of the results. For this reason, we propose a modern approach to obtaining the HJ biplot, called the elastic net HJ biplot, which applies the elastic net penalty to improve the interpretation of the results. It is a novel algorithm in the sense that it is the first attempt within the biplot family to use regularisation methods to obtain modified loadings and optimise the results. As a complement to the proposed method, and to give it practical support, a package called SparseBiplots has been developed in the R language. This package fills a gap in the context of the HJ biplot with penalized techniques, since in addition to the elastic net it also includes the ridge and lasso penalties for obtaining the HJ biplot. To complete the study, a practical comparison is made with the standard HJ biplot and the disjoint biplot, and some results common to these methods are analysed.

1. Introduction

Recently, the variety and the rapid growth of datasets have led to an increase in the amount of information in many disciplines and fields of study. Due to this increase in the volume of data, a statistical approach based on dimension reduction is an essential tool to project the original data onto a subspace of lower dimensions, in such a way that it is possible to capture most of the variability. This representation can be approximated by applying multivariate techniques including principal component analysis (PCA).
PCA has its origins in the work of [1], but more substantial development was carried out by [2]. A more current reference can be found in [3]. PCA is traditionally undertaken through the singular value decomposition (SVD) technique of [4].
In PCA, data are projected onto orthogonal axes of maximum variability in a space of reduced dimensions, usually a plane. Thus, each principal component is a linear combination of the initial variables, and the contribution of each variable to each component is established. The coefficients of these combinations, called loadings, are usually all non-zero, which creates the main drawback of PCA: its interpretation.
Several alternatives have been proposed to improve the interpretation of the results, ranging from rotation techniques to the imposition of restrictions on factor loadings. Initially, Hausman [5] proposed restricting the values that can be assigned to the loadings of the principal components (PCs) to a set of integers {−1, 0, 1} to build simplified components. Subsequently, Vines [6] built on the idea of [5] and suggested the use of arbitrary integers. A different method was proposed by McCabe [7], which consisted of selecting a subset of variables, identified as principal variables, based on an optimisation criterion, without having to go through PCA. To address this problem, Cadima and Jolliffe [8] presented a method called simple thresholding, which consisted of setting to zero all factorial loadings with absolute values below a certain threshold.
Traditionally, rotation techniques have been used to simplify the structure of principal components (PCs) and facilitate their interpretation [9]. However, the reduction in the dimensionality of the data is not always sufficient to facilitate the interpretation of the PCs. An alternative approach is to use regularisation techniques; although these require a restriction parameter, they induce projection vectors with modified loadings (null or near zero), and the interpretation of the results improves significantly. Thus, Tibshirani [10] introduced the least absolute shrinkage and selection operator (lasso) method, which combined a regression model with a procedure for shrinking some parameters towards zero by imposing a penalty on the regression coefficients. A few years later, Jolliffe and Uddin [11] presented a solution that modified the traditional approach to PCs using two stages, PCA and rotation. They proposed the simplified component technique (SCoT), in which the original PCs and the VARIMAX rotation are combined to shrink loadings towards zero, at the cost of a decrease in the proportion of explained variance. Since the loadings obtained by SCoT achieve small, but not null, values, Jolliffe et al. [12] proposed the SCoTLASS (simplified component technique subject to lasso) algorithm, which imposes a restriction in such a way that some loadings are exactly null, while sacrificing some variance. In the same sense, Zou et al. [13] proposed a penalized algorithm called sparse PCA, which applies the elastic net technique in addition to the lasso penalty [14] and efficiently solves the problem using least angle regression [15]. Subject to cardinality restrictions (the number of zero loadings per component), Moghaddam et al. [16] built an algorithm for sparse components. Next, d'Aspremont et al. [17] formulated the cardinality restriction in terms of semidefinite programming. Taking advantage of some of the previous ideas, Ref. [18] connected PCA with the SVD of the data matrix and obtained sparse PCs through regularisation penalties (sPCA-rSVD). Subsequently, Ref. [19] unified the low-rank approach of [18] with the maximum-variance criterion of [12] and the sparse PCA method of [13] to give a general solution to the sparse PCA problem. In the same context, [20] suggested maximising the explained variance for a given cardinality of the sparse PCs. Through an extension of classical PCA, Ref. [21] built sparse components by replacing the $l_2$ norm in traditional eigenvalue problems with a new norm consisting of a combination of the $l_1$ and $l_2$ norms. A modification of PCA was presented by [22] that identifies components of maximum variance while guaranteeing that each variable contributes to only one of the factor axes obtained. Reference [23] formulated the CUR matrix decomposition, expressed in terms of a small number of rows and/or columns in a low-rank approximation of the original matrix. This is a different technique in the sense that it does not aim to obtain factorial axes as the SVD does.
A thorough review of the principal components analysis, from the traditional approach to the modern point of view through the sparse PCA, was conducted by [24]. Similarly, Ref. [25] provided an overview of sparse representation algorithms from the theory of mathematical optimisation.
Although PCA is probably the most popular technique in multivariate statistics [3,26], the structure of the observations is represented as points in the plane but is not described jointly with the variables, which further complicates the interpretation of the data. In this scenario, the biplot methods [27,28] have the advantage of allowing both variables and samples to be represented in a low-dimensional subspace. Hence, the present paper takes a step towards the analysis of large-scale data and, at the same time, makes a novel contribution to the biplot methods.
The main objective is to propose a new biplot methodology called the Sparse HJ biplot as an alternative method to improve the interpretation of information provided by high-dimensionality data. The suggested algorithm adapts the restrictions to penalise the loadings and produce sparse components; that is, each component is a combination of only the relevant variables. In addition to the proposed method, a package was implemented in the R language to give practical support to the new algorithm.
The paper is organised into the following sections. Section 2.1 describes the biplot approach proposed by [27] and the HJ biplot [28] and synthesises their properties. Next, Section 2.2 describes the disjoint biplot (DBiplot) [29], a recent adaptation of the HJ biplot to the study of disjoint axes. Section 2.3 presents our main contribution, a method called the Sparse HJ biplot, in which the elastic net penalty is applied to obtain zero loadings. Finally, in Section 3 this method is applied to a dataset and compared with the results obtained by the other methods.

2. Materials and Methods

In multivariate analysis, when there is interest in representing variables and individuals in the same coordinate system, biplot methods allow visually interpretable results to be obtained. Although biplot methods and classical methods of multivariate analysis work with two-way matrices, many scientific disciplines need techniques for handling large volumes of data, and the advance of science requires the design of advanced techniques to simplify the analysis of such data. In the case of biplot methods, the literature only reports the disjoint biplot technique (described in Section 2.2) for producing axes with zero factorial loadings that facilitate the interpretation of massive data. Therefore, to analyse this type of data, in this paper an innovative multivariate technique, named the Sparse HJ biplot, was developed. This is a new alternative to the HJ biplot, adapted to the analysis of large datasets, that improves the interpretation of the results. The mathematical structure of the algorithm is described in Section 2.3.
To illustrate the techniques used in this paper, we take a sample from the data published by The Cancer Genome Atlas [30]. The data subset used is "breast.TCGA", downloaded from the mixOmics R package [31]. We selected the proteomics information, which includes 150 breast cancer samples and the expression or abundance of 142 proteins. These samples are classified into three groups: Basal, Her2 and Luminal A.
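A minimal sketch for loading this subset in R is shown below; it assumes, following the mixOmics documentation, that the breast.TCGA object stores the training proteomics matrix in data.train$protein and the subtype labels in data.train$subtype.

library(mixOmics)

# Load the TCGA breast cancer subset distributed with mixOmics
data("breast.TCGA")
X <- breast.TCGA$data.train$protein        # proteomics block (150 samples x 142 proteins)
subtype <- breast.TCGA$data.train$subtype  # Basal / Her2 / LumA labels

dim(X)          # expected: 150 142
table(subtype)  # sample counts per cancer subtype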

2.1. Biplot and HJ-Biplot

The biplot methods [27] are a form of low-dimensional graphical representation of a multivariate data matrix (individuals × variables). The two most important biplot factorisations presented by [27] were the GH biplot and the JK biplot. The GH biplot achieves a high-quality representation of the variables, while the JK biplot achieves this high quality for the individuals (rows). Consequently, the biplot representation is not simultaneous for either of the two methods.
As an alternative to optimise the biplot methods described by [27], Galindo-Villardón [28] proposed a multivariate technique called the HJ biplot. This contribution maximises the representation quality of both rows and columns simultaneously [32] in the same coordinate system. In this way, it is possible to interpret the relationships between individuals and variables at the same time.
From the algebraic point of view, the biplot methods are based on the same principles of PCA and SVD of a data matrix. In a biplot, the data are projected onto orthogonal axes of maximum variability in a space of reduced dimension. The HJ biplot facilitates the interpretation of the positions of rows, columns and row-column relations through the axes, in the same way as correspondence analysis [33,34,35,36].
In the HJ biplot, the algorithm starts with the singular value decomposition of the data matrix $X_{n \times p}$ defined previously:

$X = U D V^T$

where,
  • $X$ is the data matrix;
  • $U$ is the matrix whose columns contain the eigenvectors of $X X^T$;
  • $V$ is the matrix whose columns contain the eigenvectors of $X^T X$;
  • $D$ is the diagonal matrix containing the singular values of $X$;
  • $U$ and $V$ must be orthonormal, that is, $U^T U = I$ and $V^T V = I$, to guarantee the uniqueness of the factorisation.
The best approximation to $X$ using the first $r$ axes is, therefore:

$X \approx U D V^T = \sum_{k=1}^{r} d_k u_k v_k^T$
Considering that the $X$ matrix is centred, the markers for the columns in the HJ biplot are matched with the column markers of the GH biplot; in turn, the markers for the rows are made to agree with the row markers of the JK biplot. Thus,

$E = U D \quad \text{and} \quad G = V D$

That is, the row markers in the HJ biplot correspond to the row markers of the JK biplot, and the column markers coincide with the column markers of the GH biplot, with respect to the factorial axes. Figure 1 shows the relationships between $U$ and $V$.
The rules for the interpretation of the HJ biplot are:
  • The proximity between the points that represent the row markers is interpreted as the similarity between them. Consequently, nearby points allow the identification of clusters of individuals with similar profiles.
  • The standard deviation of a variable can be estimated by the modulus (length) of the vector that represents it.
  • Correlations between variables can be captured from the angles between vectors. If two variables are positively correlated, they will form an acute angle; if the angle they form is obtuse, the variables will present a negative correlation; and a right angle indicates that the variables are not correlated.
  • The orthogonal projection of the points onto the vector representing a variable approximates the position of the sample values in that variable.
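As an illustration of these definitions, the following base R sketch computes the HJ biplot row and column markers from the SVD of a column-centred matrix; the function name hj_markers and the choice of k = 2 retained axes are ours, introduced only for the example.

# Sketch: HJ biplot markers E = UD (rows) and G = VD (columns),
# computed from the SVD of a column-centred data matrix.
hj_markers <- function(X, k = 2) {
  Xc <- scale(X, center = TRUE, scale = FALSE)  # centre by columns
  s  <- svd(Xc)
  U  <- s$u[, 1:k, drop = FALSE]
  V  <- s$v[, 1:k, drop = FALSE]
  D  <- diag(s$d[1:k], nrow = k)
  list(rows = U %*% D,   # row markers E = UD (individuals)
       cols = V %*% D)   # column markers G = VD (variables)
}

# Example: markers for the proteomics matrix X of Section 2
# m <- hj_markers(X, k = 2)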

2.2. Disjoint HJ Biplot

An algorithm for the HJ biplot representation with disjoint axes was proposed by [29,37]. The DBiplot allows for better interpretation of the extracted factorial axes: it constructs disjoint factor axes, guaranteeing that each variable of the original data matrix contributes to only one dimension. The algorithm starts with a random classification of the variables into the principal components, and the optimal classification is sought through an iterative procedure that maximises the explained variability.
The graphical representation in the subspace of reduced dimension is carried out through the HJ biplot. For this, a function called CDBiplot is hosted within the graphical interface biplotbootGUI [37]. The interface has three main functions: the CDBiplot function builds the DBiplot; the clustering biplot function (CBiplot) represents clusters; and a third alternative, the clustering disjoint biplot, allows the simultaneous representation of both ways, characterised by disjoint axes.
Concerning the biplot, we have not found any other evidence for the formulation of alternative algorithms that penalise and contract the loadings of factorial axes to enhance the interpretation of the results. Because of this, we propose a new methodology for the HJ biplot that adapts the restrictions on the principal components applying the elastic net penalty. This new approach is called the Sparse HJ biplot.

2.3. Sparse HJ Biplot

The elastic net method for regression analysis, which combines the ridge and lasso regularisation techniques, was presented by [14]. This method penalises the size of the regression coefficients based on the $l_1$ and $l_2$ norms, as follows:

$l_1 = \|\beta\|_1 = \sum_{j=1}^{p} |\beta_j| \quad \text{and} \quad l_2 = \|\beta\|_2^2 = \sum_{j=1}^{p} \beta_j^2$
The Sparse HJ biplot proposes a solution to the problem of obtaining a linear combination of variables, determined by a vector of sparse loadings, that maximises data variability or minimises the reconstruction error. We use the concept of minimisation of the reconstruction error (E):

$E = \mathbb{E}[\|X - \hat{X}\|^2] = \mathrm{Trace}(\mathbb{E}[(X - \hat{X})(X - \hat{X})^T])$
Based on this approach, the interpretability of the sparse axes obtained is greatly improved.
In this work, the elastic net regularisation method is implemented in the HJ biplot, combining the lasso and ridge techniques. The formulation of the HJ biplot as a regression problem imposes restrictions on factorial loadings to produce modified axes.
Since the HJ biplot does not reproduce the starting data, a factor is introduced to make this recovery possible. The following model is obtained:
$\hat{X} = A D^{-1} B^T + E$
From the elastic net regularisation method, modified loadings for the biplot are derived as follows:
$V_{elastic\,net} = \arg\min \|X - A D^{-1} B^T\|^2 + \lambda_2 \sum_{j=1}^{p} V_j^2 + \lambda_1 \sum_{j=1}^{p} |V_j|$
Using the first k factorial axes, the matrices are defined:
$A_{p \times k} = [\alpha_1, \alpha_2, \ldots, \alpha_k]$ and $B_{p \times k} = [\beta_1, \beta_2, \ldots, \beta_k]$.
For any $\lambda_2 > 0$, we have:

$(\hat{A}, \hat{B}) = \arg\min \sum_{i=1}^{n} \|x_i - A B^T x_i\|^2 + \lambda_2 \sum_{j=1}^{k} \|\beta_j\|^2 + \sum_{j=1}^{k} \lambda_{1,j} \|\beta_j\|_1$
subject to $A^T A = I_{k \times k}$, where $\lambda_{1,j}$ is the lasso penalty parameter that induces sparsity (sparsity: the condition of the penalty that refers to the automatic selection of variables, setting sufficiently small coefficients to zero), $\lambda_2$ is the ridge regularisation parameter that contracts the loadings, and $\|\cdot\|_2$ and $\|\cdot\|_1$ denote the $l_2$ and $l_1$ norms, respectively.
This problem can be solved by alternating the optimisation between A and B, using the LARS-EN algorithm [14].
For fixed A, B is obtained by solving the following problem:
$\hat{\beta}_j = \arg\min_{\beta_j} \|X \alpha_j - X \beta_j\|^2 + \lambda_2 \|\beta_j\|^2 + \lambda_{1,j} \|\beta_j\|_1 = \arg\min_{\beta_j} (\alpha_j - \beta_j)^T X^T X (\alpha_j - \beta_j) + \lambda_2 \|\beta_j\|^2 + \lambda_{1,j} \|\beta_j\|_1$
where each β ^ j is an elastic net estimator.
For fixed B, the penalty part can be ignored and we minimise

$\hat{A} = \arg\min_A \sum_{i=1}^{n} \|x_i - A B^T x_i\|^2 = \arg\min_A \|X - X B A^T\|^2$

subject to $A^T A = I_{k \times k}$.
This is a reduced-rank Procrustes problem, and its solution is provided by the SVD $(X^T X) B = U D V^T$, with $\hat{A} = U V^T$.
Recently, [38] proposed a solution to the optimisation problem using the variable projection method, whose main characteristic is to partially minimise over the orthogonally constrained variables.
The steps for the implementation of the elastic net regularisation method in the HJ biplot are detailed in the following algorithm (see Algorithm 1).
Algorithm 1 Sparse HJ biplot algorithm using elastic net regularisation.
1. Consider an $n \times p$ data matrix.
2. A tolerance value is set ($1 \times 10^{-5}$).
3. The data are transformed (centred or standardised).
4. The SVD of the original data matrix is computed.
5. A is taken as the loadings of the first k components, V[, 1:k].
6. $\beta_j$ is calculated by:
$\beta_j = \arg\min_{\beta_j} (\alpha_j - \beta_j)^T X^T X (\alpha_j - \beta_j) + \lambda_2 \|\beta_j\|^2 + \lambda_{1,j} \|\beta_j\|_1$
7. A is updated via the SVD of $X^T X B$:
$X^T X B = U D V^T, \quad A = U V^T$
8. The discrepancy between A and B is updated:
$dif_{AB} = \frac{1}{p} \sum_{i=1}^{p} \left( 1 - \frac{\sum_{j=1}^{m} \beta_{ij} \alpha_{ij}}{\sqrt{|\beta_i|^2 |\alpha_i|^2}} \right)$
9. Steps 6, 7 and 8 are repeated until $dif_{AB}$ < tolerance.
10. The columns are normalised using $\hat{V}_j^{EN} = \frac{\beta_j}{\|\beta_j\|}, \ j = 1, \ldots, k$.
11. We then calculate the row markers and column markers.
12. The elastic net HJ biplot obtained by the previous steps is plotted.
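For concreteness, the sketch below mirrors steps 3 to 12 in R. It is an illustration under two stated substitutions: glmnet is used as the elastic net solver instead of the LARS-EN algorithm of [14] (so its (alpha, lambda) penalty parameterisation only approximates the $(\lambda_1, \lambda_2)$ pair above), and convergence is monitored through the change in A rather than the $dif_{AB}$ criterion of step 8.

library(glmnet)

# Illustrative sketch of the alternating elastic net HJ biplot updates.
# Substitutions vs. the paper: glmnet replaces LARS-EN, and the stopping
# rule tracks the change in A instead of dif_AB.
sparse_hj_sketch <- function(X, k = 2, alpha = 0.5, lambda = 0.1,
                             tol = 1e-5, max_iter = 200) {
  X <- scale(X, center = TRUE, scale = FALSE)   # step 3: centre the data
  A <- svd(X)$v[, 1:k, drop = FALSE]            # steps 4-5: initial loadings
  B <- A
  for (it in 1:max_iter) {
    # Step 6: each beta_j is an elastic net regression of X alpha_j on X
    for (j in 1:k) {
      fit <- glmnet(X, X %*% A[, j], alpha = alpha, lambda = lambda,
                    intercept = FALSE, standardize = FALSE)
      B[, j] <- as.numeric(coef(fit))[-1]       # drop the (zero) intercept
    }
    # Step 7: Procrustes update of A via the SVD of X'X B
    s <- svd(t(X) %*% X %*% B)
    A_new <- s$u %*% t(s$v)
    # Steps 8-9 (simplified): iterate until A stabilises
    if (mean(abs(A_new - A)) < tol) { A <- A_new; break }
    A <- A_new
  }
  # Step 10: normalise the sparse loadings column by column
  V_en <- sweep(B, 2, pmax(sqrt(colSums(B^2)), 1e-12), "/")
  d <- svd(X %*% V_en)$d                        # scale factors for the markers
  # Steps 11-12: row and column coordinates (HJ-style scaling)
  list(loadings = V_en,
       rows = X %*% V_en,                       # row coordinates
       cols = V_en %*% diag(d, nrow = k))       # column coordinates
}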
The following scheme (Figure 2) presents the steps describing the application of the regularisation methods in the HJ biplot, which leads to modified axes being obtained.
Explained variance by the Sparse HJ biplot.
In the HJ biplot, orthogonal loadings and uncorrelated axes are obtained from the transformation of the original variables. Both conditions are met since, for the covariance matrix $\hat{D} = X^T X$, we have $V^T V = I$ and $V^T \hat{D} V$ is a diagonal matrix. Taking $\hat{Z}$ as the estimated sparse PCs, the total explained variance is determined by $\mathrm{tr}(\hat{Z}^T \hat{Z})$.
Although sparse PCs are capable of producing orthogonal loadings, their components are correlated [12]. Under these conditions, it is not appropriate to calculate the explained variance in the same way as in an ordinary biplot, since the true variance would be overestimated.
In this analysis, the aim is that each component should be independent of the previous ones; therefore, if linear dependence exists, it must be eliminated. Among the alternatives available to obtain the adjusted total variance and solve this problem for sparse components, [13] suggest using projection vectors to remove the linear dependence. These authors denote by $\hat{Z}_{j \cdot 1, \ldots, j-1}$ the residual of regressing $\hat{Z}_j$ on $\hat{Z}_1, \ldots, \hat{Z}_{j-1}$, as follows:

$\hat{Z}_{j \cdot 1, \ldots, j-1} = \hat{Z}_j - H_{1, \ldots, j-1} \hat{Z}_j$

where $H_{1, \ldots, j-1}$ is the projection matrix onto $\{\hat{Z}_i\}_{1}^{j-1}$.
Therefore, the adjusted variance of $\hat{Z}_j$ is $\|\hat{Z}_{j \cdot 1, \ldots, j-1}\|^2$ and the total explained variance is determined as $\sum_{j=1}^{k} \|\hat{Z}_{j \cdot 1, \ldots, j-1}\|^2$. When the estimated sparse components $\hat{Z}$ are uncorrelated, this formula matches $\mathrm{tr}(\hat{Z}^T \hat{Z})$.
The QR decomposition, in which $Q$ is an orthonormal matrix and $R$ an upper triangular matrix, is a simpler way to estimate the adjusted variance. Taking $\hat{Z} = Q R$, we have $\|\hat{Z}_{j \cdot 1, \ldots, j-1}\|^2 = R_{jj}^2$, and the total explained variance is then $\sum_{j=1}^{k} R_{jj}^2$.
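In R, this QR-based computation is a short sketch, under the assumption that Z is the $n \times k$ matrix of estimated sparse components:

# Adjusted explained variance of possibly correlated sparse components,
# via the QR decomposition Z = QR: the adjusted variance of component j is R[j,j]^2.
adjusted_variance <- function(Z) {
  R <- qr.R(qr(Z))
  diag(R)^2
}
# Total adjusted explained variance:
# sum(adjusted_variance(Z))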

2.4. Software

To give practical support to the new algorithm and to provide a new tool that enables the use of the proposed method, a package was developed in the R programming language (R Core Team, 2021), called SparseBiplots [39].
This package includes a collection of functions that allow multivariate data to be represented in a low-dimensional subspace using the HJ biplot methodology. The package implements the HJ biplot and three new techniques to reduce the size of the data, select variables, and identify patterns. It performs the HJ biplot adapting restrictions to shrink and/or produce zero loadings, applying the ridge, lasso, and elastic net regularisation methods to the loadings matrix.
For each of the techniques implemented, it is possible to obtain the eigenvalues, the explained variance, the loadings matrix, the coordinates of the individuals and the coordinates of the variables. The package also produces the graphical representation in two selected dimensions using the ggplot2 grammar [40].
This package combines an advanced methodology, the power of the R programming language and the elegance of ggplot2 to help explain the results clearly and improve the informative capacity of the data. It is a new alternative for analysing data from different sources, such as medicine, chromatography, spectrometry, psychology, and others.
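A usage sketch follows; the exported function name (ElasticNet_HJBiplot), its penalty arguments and the output slot names are assumptions based on the package description above and should be checked against the SparseBiplots documentation [39].

# install.packages("SparseBiplots")   # CRAN release [39]
library(SparseBiplots)

# Hypothetical call: elastic net HJ biplot of the proteomics matrix X
# (argument names for the lasso/ridge penalties are assumptions).
fit <- ElasticNet_HJBiplot(X, lambda = 0.2, alpha = 0.5)

fit$eigenvalues   # eigenvalues per retained axis (assumed slot name)
fit$explvar       # explained variance per axis (assumed slot name)
fit$loadings      # sparse loadings matrix (assumed slot name)
fit$coord_ind     # coordinates of the individuals (assumed slot name)
fit$coord_var     # coordinates of the variables (assumed slot name)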

3. Illustrative Example

In order to illustrate the new algorithm and to compare the results with the standard HJ biplot and the DBiplot, we consider the subset of data explained in Section 2.
The analysis of the data begins with the calculation of the loadings matrix associated with the factorial axes of each of the three methods, in order to compare the contributions of the proteins to the new axes (see Table 1). In this way a technical meaning is given to each axis, facilitating the interpretation of the results. No sparsity is induced in the HJ biplot loadings matrix: each factorial axis is obtained as a linear combination of all the proteins, making the characterisation of each axis difficult. The DBiplot, on the other hand, generates disjoint factorial axes, where each protein contributes to a single factorial axis; this facilitates the grouping of proteins with similar behaviour, making the axes mostly characterisable. However, the fact that each protein contributes its information entirely to a single axis severely limits the variability explained by the model. In our new proposal, the sparse biplot decreases the importance of the proteins that contribute the least information to each axis, producing sparse factor axes with some zero loadings through the imposed penalty. In contrast to the DBiplot, variables can contribute information to more than one axis; this does not restrict the grouping of variables across axes, but instead produces axes that may share some similarities while remaining distinct in their characterisation.
The three techniques were then plotted using the type of cancer as a grouping factor for the samples.
In the HJ biplot (Figure 3), a slight pattern can be observed between the three types of cancer and the proteins, but it is not entirely clear. Moreover, due to the high number of proteins, it is difficult to identify their contribution to each axis.
The disjoint biplot was produced with the help of the biplotbootGUI package; the data were analysed by performing 1000 iterations of the algorithm, obtaining the graph shown in Figure 4. The DBiplot shows a better structure in terms of interpretation, although the cancer–protein interaction is quite poor. A common characteristic of this technique is that most of the variables contribute mainly to the first axis.
Finally, the Sparse HJ biplot (Figure 5) shows an interaction between proteins and cancer types that is clearer and easier to interpret. It shows that the variables that contribute negatively to the second axis have higher values for the Luminal A cancer type. Conversely, the proteins that contribute positively to the second axis have higher values for the Basal cancer type, and average values for Her2-type cancers. Axis 1 is a highly informative gradient (45.8%) and, together with axis 2, captures approximately 75% of the variability of the data. The interpretation of the Sparse HJ biplot makes it possible to recognise a proteomic characterisation of each of the groups from a subset of the original proteins, since the rest have null loadings.

4. Conclusions and Discussion

The theoretical contribution of this research represents substantial progress for Big Data analysis and multivariate statistical methods. In this paper, we proposed an innovative algorithm to describe samples and variables in a low-dimensional space while keeping the most relevant variables. Additionally, the software developed is a valuable tool that enables the theoretical contribution to be applied to data analysis in every scientific field, as suggested by [41,42].
In contrast with the current dynamic development of the biplot method, no reference was found that applies regularisation techniques to the biplot. Therefore, the main contribution of this paper is to propose a new biplot version with penalized loadings, called the elastic net HJ biplot, using the ridge (based on the $l_2$ norm) and lasso (based on the $l_1$ norm) regularisation methods [43].
Applying penalisation to induce sparsity in the HJ biplot makes the loadings matrix of the resulting axes sparse, allowing us to omit the less important variables from the biplot. Consequently, the interpretability of the biplot's results becomes clearer, since we know which variables contribute the most information to each axis, providing efficient solutions to problems arising from the high dimension of the data [44].
A package, called SparseBiplots [39], was developed in the R programming language to perform the proposed elastic net HJ biplot algorithm. As mentioned by [45], in some study designs the sample size is smaller than the number of variables. In particular, our package has the advantage of being able to solve this type of problem in two-way tables.
The presented methodology opens a new avenue for future research in multivariate statistics, as it can serve as the basis for extensions to three-way data analysis techniques such as Statis and Statis Dual [46], partial triadic analysis [47], Tucker [48], Tucker3 [49], and Parallel Factor Analysis (PARAFAC) [50], among others.

Author Contributions

Conceptualization, M.C.-M.; methodology, C.A.T.-C., M.P.G.-V.; software, M.C.-M., C.A.T.-C.; validation, M.C.-M., A.B.N.-L.; formal analysis, M.C.-M., M.P.G.-V., C.A.T.-C.; investigation, M.C.-M., A.B.N.-L.; writing—original draft preparation, writing—review and editing, M.C.-M., M.P.G.-V., A.B.N.-L., C.A.T.-C.; funding acquisition, M.C.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This study was made possible thanks to the support of the Sistema Nacional de Investigación (SNI) of Secretaría Nacional de Ciencia, Tecnología e Innovación (Panamá).

Institutional Review Board Statement

Ethical review and approval were waived for this paper because the data used from The Cancer Genome Atlas (TCGA) are completely anonymised, so the privacy of patient data is not violated.

Informed Consent Statement

Patient consent was waived due to the full anonymity of the data used.

Data Availability Statement

The data analysed in this paper to compare the techniques performed can be found in https://portal.gdc.cancer.gov/ (accessed on 15 May 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Pearson, K. LIII. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1901, 2, 559–572.
  2. Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 1933, 24, 417.
  3. Jolliffe, I. Principal Component Analysis; Wiley Online Library: Hoboken, NJ, USA, 2002.
  4. Eckart, C.; Young, G. The approximation of one matrix by another of lower rank. Psychometrika 1936, 1, 211–218.
  5. Hausman, R.E. Constrained Multivariate Analysis. In Optimisation in Statistics; Zanakis, S.H., Rustagi, J.S., Eds.; North-Holland Publishing Company: Amsterdam, The Netherlands, 1982; pp. 137–151.
  6. Vines, S.K. Simple principal components. J. R. Stat. Soc. Ser. C Appl. Stat. 2000, 49, 441–451.
  7. McCabe, G.P. Principal variables. Technometrics 1984, 26, 137–144.
  8. Cadima, J.; Jolliffe, I.T. Loadings and correlations in the interpretation of principal components. J. Appl. Stat. 1995, 22, 203–214.
  9. Jolliffe, I.T. Rotation of principal components: Choice of normalization constraints. J. Appl. Stat. 1995, 22, 29–35.
  10. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
  11. Jolliffe, I.T.; Uddin, M. The simplified component technique: An alternative to rotated principal components. J. Comput. Graph. Stat. 2000, 9, 689–710.
  12. Jolliffe, I.T.; Trendafilov, N.; Uddin, M. A modified principal component technique based on the LASSO. J. Comput. Graph. Stat. 2003, 12, 531–547.
  13. Zou, H.; Hastie, T.; Tibshirani, R. Sparse principal component analysis. J. Comput. Graph. Stat. 2006, 15, 265–286.
  14. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320.
  15. Efron, B.; Hastie, T.; Johnstone, I.; Tibshirani, R. Least angle regression. Ann. Stat. 2004, 32, 407–499.
  16. Moghaddam, B.; Weiss, Y.; Avidan, S. Spectral bounds for sparse PCA: Exact and greedy algorithms. Adv. Neural Inf. Process. Syst. 2006, 18, 915.
  17. D'Aspremont, A.; El Ghaoui, L.; Jordan, M.; Lanckriet, G.R.G. A direct formulation for sparse PCA using semidefinite programming. SIAM Rev. 2007, 49, 434–448.
  18. Shen, H.; Huang, J.Z. Sparse principal component analysis via regularized low rank matrix approximation. J. Multivar. Anal. 2008, 99, 1015–1034.
  19. Witten, D.M.; Tibshirani, R.; Hastie, T. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 2009, 10, 515–534.
  20. Farcomeni, A. An exact approach to sparse principal component analysis. Comput. Stat. 2009, 24, 583–604.
  21. Qi, X.; Luo, R.; Zhao, H. Sparse principal component analysis by choice of norm. J. Multivar. Anal. 2013, 114, 127–160.
  22. Vichi, M.; Saporta, G. Clustering and disjoint principal component analysis. Comput. Stat. Data Anal. 2009, 53, 3194–3208.
  23. Mahoney, M.W.; Drineas, P. CUR matrix decompositions for improved data analysis. Proc. Natl. Acad. Sci. USA 2009, 106, 697–702.
  24. Trendafilov, N.T. From simple structure to sparse components: A review. Comput. Stat. 2014, 29, 431–454.
  25. Zhang, Z.; Xu, Y.; Yang, J.; Li, X.; Zhang, D. A survey of sparse representation: Algorithms and applications. IEEE Access 2015, 3, 490–530.
  26. Abdi, H.; Williams, L.J. Principal component analysis. Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 433–459.
  27. Gabriel, K.R. The biplot graphic display of matrices with application to principal component analysis. Biometrika 1971, 58, 453–467.
  28. Galindo-Villardón, P. Una alternativa de representación simultánea: HJ-Biplot. Qüestiió Quad. D'Estad. I Investig. Oper. 1986, 10, 13–23.
  29. Nieto-Librero, A.B.; Sierra, C.; Vicente-Galindo, M.; Ruíz-Barzola, O.; Galindo-Villardón, M.P. Clustering Disjoint HJ-Biplot: A new tool for identifying pollution patterns in geochemical studies. Chemosphere 2017, 176, 389–396.
  30. Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 2012, 490, 61.
  31. Rohart, F.; Gautier, B.; Singh, A.; Cao, K.-A.L. mixOmics: An R package for 'omics feature selection and multiple data integration. PLoS Comput. Biol. 2017, 13, e1005752.
  32. Galindo-Villardón, P.; Cuadras, C.M. Una extensión del método Biplot y su relación con otras técnicas. Publ. Bioestad. Biomatemática 1986, 17, 13–23.
  33. Greenacre, M.J. Correspondence analysis. Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 613–619.
  34. Cubilla-Montilla, M.; Nieto-Librero, A.-B.; Galindo-Villardón, M.P.; Galindo, M.P.V.; Garcia-Sanchez, I.-M. Are cultural values sufficient to improve stakeholder engagement human and labour rights issues? Corp. Soc. Responsib. Environ. Manag. 2019, 26, 938–955.
  35. Cubilla-Montilla, M.I.; Galindo-Villardón, P.; Nieto-Librero, A.B.; Galindo, M.P.V.; García-Sánchez, I.M. What companies do not disclose about their environmental policy and what institutional pressures may do to respect. Corp. Soc. Responsib. Environ. Manag. 2020, 27, 1181–1197.
  36. Murillo-Avalos, C.L.; Cubilla-Montilla, M.; Sánchez, M.; Ángel, C.; Vicente-Galindo, P. What environmental social responsibility practices do large companies manage for sustainable development? Corp. Soc. Responsib. Environ. Manag. 2021, 28, 153–168.
  37. Nieto-Librero, A.B.; Galindo-Villardón, P.; Freitas, A. Package biplotbootGUI: Bootstrap on Classical Biplots and Clustering Disjoint Biplot. Available online: https://CRAN.R-project.org/package=biplotbootGUI (accessed on 4 April 2021).
  38. Erichson, N.B.; Zheng, P.; Manohar, K.; Brunton, S.L.; Kutz, J.N.; Aravkin, A.Y. Sparse principal component analysis via variable projection. SIAM J. Appl. Math. 2020, 80, 977–1002.
  39. Cubilla-Montilla, M.; Torres-Cubilla, C.A.; Galindo-Villardón, P.; Nieto-Librero, A.B. Package SparseBiplots. Available online: https://CRAN.R-project.org/package=SparseBiplots (accessed on 4 April 2021).
  40. Wickham, H. Ggplot2. Wiley Interdiscip. Rev. Comput. Stat. 2011, 3, 180–185.
  41. Gligorijević, V.; Malod-Dognin, N.; Pržulj, N. Integrative methods for analyzing big data in precision medicine. Proteomics 2016, 16, 741–758.
  42. McCue, M.E.; McCoy, A.M. The scope of big data in one medicine: Unprecedented opportunities and challenges. Front. Vet. Sci. 2017, 4, 194.
  43. Montilla, M.I.C. Contribuciones al Análisis Biplot Basadas en Soluciones Factoriales Disjuntas y en Soluciones Sparse. Ph.D. Thesis, Universidad de Salamanca, Salamanca, Spain, 2019.
  44. González García, N. Análisis Sparse de Tensores Multidimensionales. Ph.D. Thesis, Universidad de Salamanca, Salamanca, Spain, 2019.
  45. Hernández-Sánchez, J.C.; Vicente-Villardón, J.L. Logistic biplot for nominal data. Adv. Data Anal. Classif. 2016, 11, 307–326.
  46. Lavit, C.; Escoufier, Y.; Sabatier, R.; Traissac, P. The ACT (Statis method). Comput. Stat. Data Anal. 1994, 18, 97–119.
  47. Jaffrenou, P.-A. Sur l'analyse des Familles Finies de Variables Vectorielles: Bases Algébriques et Application à la Description Statistique. Ph.D. Thesis, Université de Lyon, Lyon, France, 1978.
  48. Tucker, L.R. Some mathematical notes on three-mode factor analysis. Psychometrika 1966, 31, 279–311.
  49. Kroonenberg, P.M.; De Leeuw, J. Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika 1980, 45, 69–97.
  50. Harshman, R.A. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multi-modal factor analysis. Work. Pap. Phon. 1970, 16, 1–84.
Figure 1. Markers in the HJ biplot.
Figure 2. Scheme of the Sparse HJ biplot.
Figure 3. HJ biplot representation.
Figure 4. Disjoint biplot representation.
Figure 5. Elastic net HJ biplot representation.
Table 1. Loadings matrix for the first three principal components obtained from the HJ biplot, DBiplot and Sparse HJ biplot algorithms.
Proteins | HJ Biplot (D1, D2, D3) | Disjoint Biplot (D1, D2, D3) | Elastic Net HJ Biplot (D1, D2, D3)
14-3-3_epsilon | 9.835, −0.791, 0.698 | 1, 0, 0 | 6.330, 0, 0
4E-BP1 | −1.127, 3.408, −0.752 | 0, 0, 1 | 0, 0, 0
4E-BP1_pS65 | −2.074, 6.317, −2.116 | 0, 0, 1 | 0, 1.633, 0
4E-BP1_pT37 | −1.862, 2.997, −5.079 | 0, 0, 1 | 0, 0, 0
4E-BP1_pT70 | 0.486, 5.227, −1.753 | 0, 0, 1 | 0, 0.832, 0
53BP1 | −6.654, −3.875, 1.235 | 0, 0, 1 | −3.015, 0, 0
A-Raf_pS299 | −4.47, 2.681, −1.221 | 0, 1, 0 | 0, 0, 0
ACC1 | −4.007, −3.042, −0.894 | 0, 0, 1 | 0, 0, 0
ACC_pS79 | −4.094, −2.386, −2.147 | 0, 0, 1 | 0, 0, 0
AMPK_alpha | −1.103, −5.287, 1.69 | 1, 0, 0 | 0, −0.917, 0
AMPK_pT172 | −0.86, −6.362, 1.486 | 1, 0, 0 | 0, −1.075, 0
ANLN | 0.877, 6.213, 5.348 | 1, 0, 0 | 0, 0, 2.111
AR | −0.729, −6.68, 4.084 | 1, 0, 0 | 0, −4.209, 0
ARID1A | −3.643, 0.852, 1.409 | 1, 0, 0 | 0, 0, 0
ASNS | −4.067, 8.449, −1.908 | 0, 0, 1 | 0, 4.819, 0
ATM | −5.234, −1.222, −0.396 | 0, 1, 0 | −1.118, 0, 0
Akt | −5.513, −4.694, −0.288 | 0, 0, 1 | −1.664, 0, 0
Akt_pS473 | −1.049, 1.326, −7.186 | 0, 0, 1 | 0, 0, 0
Akt_pT308 | −1.782, 3.054, −5.25 | 0, 0, 1 | 0, 0, 0
Annexin_I | 6.102, −0.652, −4.703 | 1, 0, 0 | 1.919, 0, 0
B-Raf | −7.829, 0.995, 2.763 | 1, 0, 0 | −4.104, 0, 0
Bak | 9.633, −1.78, −1.534 | 1, 0, 0 | 6.042, 0, 0
Bax | 4.12, −1.972, −2.539 | 1, 0, 0 | 0, 0, 0
Bcl-2 | 1.021, −6.875, 4.678 | 1, 0, 0 | 0, −3.623, 0
Bcl-xL | 4.824, −0.207, 1.384 | 1, 0, 0 | 0.189, 0, 0
Beclin | −3.283, 4.485, 6.71 | 1, 0, 0 | 0, 0, 2.459
Bid | 9.885, 1.076, 1.13 | 1, 0, 0 | 6.612, 0, 0
Bim | 0.715, −2.899, 3.656 | 1, 0, 0 | 0, 0, 0
C-Raf | −7.355, −2.215, −0.384 | 1, 0, 0 | −3.880, 0, 0
C-Raf_pS338 | 5.312, 6.464, 3.593 | 1, 0, 0 | 1.686, 0, 2.060
CD31 | −2.116, 8.398, 7.088 | 1, 0, 0 | 0, 0, 4.871
CD49b | 3.632, 2.32, 3.552 | 0, 0, 1 | 0, 0, 0
CDK1 | 0.488, 9.008, −0.729 | 1, 0, 0 | 0, 3.351, 0.892
Caspase-7_cleavedD198 | 1.906, 6.466, −1.048 | 1, 0, 0 | 0, 2.174, 0
Caveolin-1 | 7.827, −6.415, −1.311 | 1, 0, 0 | 3.643, −0.742, −0.702
Chk1 | 6.792, 7.605, 1.289 | 1, 0, 0 | 3.358, 0.985, 0.563
Chk1_pS345 | 2.418, 8.966, 5.551 | 1, 0, 0 | 0, 0, 4.071
Chk2 | −6, 3.914, −2.176 | 1, 0, 0 | −1.815, 1.190, 0
Chk2_pT68 | −2.247, 10.009, 4.991 | 1, 0, 0 | 0, 0.914, 4.810
Claudin-7 | −4.187, 0.804, 4.021 | 1, 0, 0 | 0, 0, 0
Collagen_VI | 8.506, −2.722, −0.097 | 1, 0, 0 | 4.628, 0, 0
Cyclin_B1 | −4.571, 7.465, −2.745 | 1, 0, 0 | 0, 4.414, 0
Cyclin_D1 | 8.872, −2.678, 2.491 | 1, 0, 0 | 5.170, 0, 0
Cyclin_E1 | −1.927, 6.257, −3.637 | 1, 0, 0 | 0, 3.837, 0
DJ-1 | 3.216, −5.246, 2.356 | 1, 0, 0 | 0, −1.053, 0
Dvl3 | −7.369, −0.063, −0.467 | 1, 0, 0 | −3.462, 0, 0
E-Cadherin | −4.731, 1.337, 3.855 | 1, 0, 0 | −0.142, 0, 0
EGFR | 2.315, 4.231, −2.325 | 0, 0, 1 | 0, 0.244, 0
EGFR_pY1068 | −0.772, 1.862, −2.4 | 0, 0, 1 | 0, 0, 0
EGFR_pY1173 | 8.702, 1.484, 1.23 | 0, 1, 0 | 5.321, 0, 0
ER-alpha | −0.686, −8.918, 5.329 | 0, 1, 0 | 0, −6.509, 0
ER-alpha_pS118 | −3.542, −4.816, 6.177 | 1, 0, 0 | 0, −2.834, 0
ERK2 | −4.911, −4.53, −1.404 | 0, 1, 0 | −0.903, 0, 0
FOXO3a | 7.666, 3.783, 1.127 | 1, 0, 0 | 4.027, 0, 0
Fibronectin | 1.852, −2.596, −0.897 | 1, 0, 0 | 0, 0, 0
GAB2 | −3.465, 1.083, −0.098 | 1, 0, 0 | 0, 0, 0
GATA3 | −1.972, −8.738, 5.216 | 0, 1, 0 | 0, −6.058, 0
GSK3-alpha-beta | −9.243, 0.501, −0.787 | 1, 0, 0 | −5.924, 0, 0
GSK3-alpha-beta_pS21_S9 | −6.718, −0.029, −4.263 | 0, 0, 1 | −2.273, 0, 0
HER2 | −3.51, −0.115, 0.883 | 0, 0, 1 | 0, 0, 0
HER2_pY1248 | −0.973, 1.28, −0.442 | 0, 0, 1 | 0, 0, 0
HER3 | 3.995, −3.859, 0.899 | 0, 1, 0 | 0, 0, 0
HER3_pY1289 | 5.381, 0.692, −1.768 | 0, 0, 1 | 1.365, 0, 0
HSP70 | 8.525, 3.238, 0.181 | 0, 1, 0 | 5.127, 0, 0
IGFBP2 | 1.102, −3.158, 1.768 | 1, 0, 0 | 0, 0, 0
INPP4B | −2.524, −6.657, 6.597 | 1, 0, 0 | 0, −5.098, 0
IRS1 | 4.045, −2.6, 5.377 | 1, 0, 0 | 0, −0.741, 0
JNK2 | −0.584, −9.036, 1.984 | 1, 0, 0 | 0, −4.625, −0.041
JNK_pT183_pT185 | 2.476, −2.463, −0.011 | 1, 0, 0 | 0, 0, 0
K-Ras | 10.304, 0.949, 0.456 | 1, 0, 0 | 7.040, 0, 0
Ku80 | −8.303, 0.743, 0.447 | 1, 0, 0 | −4.768, 0, 0
LBK1 | −2.52, 1.968, 7.864 | 1, 0, 0 | 0, 0, 1.173
Lck | 4.528, 3.158, −3.052 | 0, 1, 0 | 0.055, 0, 0
MAPK_pT202_Y204 | −0.304, −2.731, −3.607 | 1, 0, 0 | 0, 0, 0
MEK1 | 2.993, −2.122, −2.706 | 1, 0, 0 | 0, 0, 0
MEK1_pS217_S221 | −5.209, −1.263, −2.992 | 0, 0, 1 | −0.514, 0, 0
MIG-6 | 4.206, 2.429, 1.513 | 0, 1, 0 | 0, 0, 0
Mre11 | 2.55, 7.729, 7.37 | 1, 0, 0 | 0, 0, 4.444
N-Cadherin | 10.669, 1.608, 0.799 | 1, 0, 0 | 7.616, 0, 0
NF-kB-p65_pS536 | −4.992, 0.915, −2.066 | 0, 0, 1 | −0.330, 0, 0
NF2 | −4.468, −1.197, 1.559 | 0, 0, 1 | −0.284, 0, 0
Notch1 | 4.22, 5.154, 0.049 | 0, 1, 0 | 0, 0.221, 0
P-Cadherin | 0.692, 5.044, −4.532 | 1, 0, 0 | 0, 2.793, 0
PAI-1 | 2.668, 0.836, −1.01 | 1, 0, 0 | 0, 0, 0
PCNA | 5.345, 2.069, −2.278 | 1, 0, 0 | 0.893, 0, 0
PDCD4 | −7.3, 4.39, 2.29 | 1, 0, 0 | −2.851, 0, 0.807
PDK1_pS241 | −4.328, −7.468, 1.103 | 0, 0, 1 | −0.406, −2.068, 0
PI3K-p110-alpha | −2.045, −1.893, −0.058 | 0, 1, 0 | 0, 0, 0
PKC-alpha | 4.107, −2.008, −2.34 | 1, 0, 0 | 0, 0, 0
PKC-alpha_pS657 | 2.787, −0.643, −0.587 | 1, 0, 0 | 0, 0, 0
PKC-delta_pS664 | −1.622, 3.93, 6.058 | 1, 0, 0 | 0, 0, 1.686
PR | −0.264, −6.858, 5.072 | 0, 0, 1 | 0, −4.465, 0
PRAS40_pT246 | −4.816, 6.243, −2.556 | 0, 1, 0 | 0, 0.549, 0
PRDX1 | 1.244, 2.609, −0.483 | 1, 0, 0 | 0, 0, 0
PTEN | −3.443, −4.129, 0.798 | 1, 0, 0 | 0, 0, 0
Paxillin | −3.448, −4.577, −2.127 | 1, 0, 0 | 0, 0, 0
Pea-15 | 3.573, −5.791, 0.144 | 1, 0, 0 | 0, −0.581, 0
RBM3 | −3.695, −3.794, 0.379 | 1, 0, 0 | 0, 0, 0
Rad50 | −0.172, −5.001, 2.657 | 0, 1, 0 | 0, −0.962, 0
Rb_pS807_S811 | −6.574, 1.905, −2.987 | 1, 0, 0 | −2.009, 0, 0
S6 | −7.718, 4.048, −0.745 | 1, 0, 0 | −3.914, 0.543, 0
S6_pS235_S236 | −2.383, 2.566, −6.953 | 1, 0, 0 | 0, 0.644, 0
S6_pS240_S244 | −2.97, 2.075, −7.177 | 1, 0, 0 | 0, 0.350, 0
SCD1 | −4.54, 6.624, 4.302 | 1, 0, 0 | 0, 0, 2.683
STAT3_pY705 | 4.967, −2.48, −2.268 | 1, 0, 0 | 0.564, 0, 0
STAT5-alpha | −4.779, −3.47, −2.199 | 0, 0, 1 | −0.608, 0, 0
Shc_pY317 | −4.083, 3.86, 2.936 | 0, 1, 0 | 0, 0, 0.740
Smad1 | 0.205, −0.579, 2.104 | 1, 0, 0 | 0, 0, 0
Smad3 | 4.219, −5.98, 2.297 | 1, 0, 0 | 0, −1.414, 0
Smad4 | 9.908, −0.979, −1.163 | 1, 0, 0 | 6.333, 0, 0
Src | 5.433, 1.57, −0.883 | 1, 0, 0 | 1.023, 0, 0
Src_pY416 | −2.997, 8.115, 2 | 1, 0, 0 | 0, 0.303, 2.653
Src_pY527 | 4.385, −0.67, −5.883 | 1, 0, 0 | 0.282, 0, 0
Stathmin | 7.149, 7.29, 4.099 | 1, 0, 0 | 3.795, 0, 1.895
Syk | −4.569, 2.55, −4.518 | 0, 0, 1 | −0.140, 0.699, 0
Transglutaminase | −2.552, 3.844, −0.091 | 0, 1, 0 | 0, 0, 0
Tuberin | −6.672, −6.243, −0.289 | 1, 0, 0 | −3.259, −0.161, 0
VEGFR2 | −4.498, −4.125, 2.757 | 0, 1, 0 | −0.236, −0.168, 0
XBP1 | 1.892, 1.07, 4.482 | 1, 0, 0 | 0, 0, 0
XRCC1 | 1.111, −0.622, 2.963 | 0, 0, 1 | 0, 0, 0
YAP_pS127 | 0.955, −1.745, −1.806 | 0, 1, 0 | 0, 0, 0
YB-1 | −3.686, −0.954, 1.81 | 0, 1, 0 | 0, 0, 0
YB-1_pS102 | −2.047, 2.807, −6.06 | 1, 0, 0 | 0, 0, 0
alpha-Catenin | −4.342, 4.573, 7.605 | 1, 0, 0 | 0, 0, 3.159
beta-Catenin | −6.877, −0.545, 0.895 | 1, 0, 0 | −2.921, 0, 0
c-Kit | 2.082, 3.856, −2.432 | 1, 0, 0 | 0, 0.042, 0
c-Met_pY1235 | 1.579, 8.612, 6.828 | 1, 0, 0 | 0, 0, 4.546
c-Myc | 1.778, 2.82, 2.362 | 1, 0, 0 | 0, 0, 0
eEF2 | −5.963, 3.474, −1.39 | 1, 0, 0 | −1.832, 0.245, 0
eEF2K | −5.193, −5.415, 2.727 | 0, 1, 0 | −1.232, −1.218, 0
eIF4E | 1.789, −0.64, −1.43 | 1, 0, 0 | 0, 0, 0
mTOR | −9.332, −2.094, 2.38 | 1, 0, 0 | −6.073, 0, 0
mTOR_pS2448 | −4.804, 0.348, −1.92 | 1, 0, 0 | 0, 0, 0
p27 | 3.678, −0.86, 2.963 | 1, 0, 0 | 0, 0, 0
p27_pT157 | −1.764, 7.218, 4.386 | 1, 0, 0 | 0, 0, 2.536
p27_pT198 | −2.494, 5.968, −1.246 | 1, 0, 0 | 0, 1.908, 0
p38_MAPK | 0.767, −5.788, −0.57 | 1, 0, 0 | 0, 0, −0.006
p38_pT180_Y182 | 0.215, −1.303, −3.117 | 1, 0, 0 | 0, 0, 0
p53 | −2.966, 9.064, 4.314 | 1, 0, 0 | 0, 0.846, 3.663
p70S6K | −4.922, −1.661, 1.338 | 1, 0, 0 | −0.838, 0, 0
p70S6K_pT389 | 5.575, 1.371, −1.5 | 1, 0, 0 | 1.619, 0, 0
p90RSK_pT359_S363 | −6.319, 1.839, −1.653 | 1, 0, 0 | −1.687, 0, 0