Article

A Jacobi–Davidson Method for Large Scale Canonical Correlation Analysis

College of Computer and Information Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
* Author to whom correspondence should be addressed.
Algorithms 2020, 13(9), 229; https://doi.org/10.3390/a13090229
Submission received: 14 August 2020 / Revised: 3 September 2020 / Accepted: 9 September 2020 / Published: 12 September 2020

Abstract

In the large scale canonical correlation analysis arising in multi-view learning applications, one needs to compute the canonical weight vectors corresponding to a few of the largest canonical correlations. For such a task, we propose a Jacobi–Davidson type algorithm that computes the canonical weight vectors by transforming the problem into the so-called canonical correlation generalized eigenvalue problem. Convergence results are established and reveal the accuracy of the approximate canonical weight vectors. Numerical examples are presented to support the effectiveness of the proposed method.

1. Introduction

Canonical correlation analysis (CCA) is one of the most representative two-view multivariate statistical techniques and has been applied to a wide range of machine learning problems including classification, retrieval, regression and clustering [1,2,3]. It seeks a pair of linear transformations of two views of high-dimensional features such that the associated low-dimensional projections are maximally correlated. Denote by $S_a \in \mathbb{R}^{m\times d}$ and $S_b \in \mathbb{R}^{n\times d}$ the data matrices from two data sets with $m$ and $n$ features, respectively, where $d$ is the number of samples. Without loss of generality, we assume $S_a$ and $S_b$ are centered, i.e., $S_a\mathbf{1}_d = 0$ and $S_b\mathbf{1}_d = 0$, where $\mathbf{1}_d \in \mathbb{R}^d$ is the vector of all ones; otherwise, we can preprocess them as $S_a \leftarrow S_a - \frac{1}{d}(S_a\mathbf{1}_d)\mathbf{1}_d^T$ and $S_b \leftarrow S_b - \frac{1}{d}(S_b\mathbf{1}_d)\mathbf{1}_d^T$, respectively. CCA aims to find a pair of canonical weight vectors $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$ that maximize the canonical correlation
$$\max\; x^T C y, \quad\text{subject to } x^T A x = 1 \ \text{and}\ y^T B y = 1, \tag{1}$$
where
$$A = S_a S_a^T \in \mathbb{R}^{m\times m}, \quad B = S_b S_b^T \in \mathbb{R}^{n\times n} \quad\text{and}\quad C = S_a S_b^T \in \mathbb{R}^{m\times n}, \tag{2}$$
and then projects the high-dimensional data $S_a$ and $S_b$ onto the low-dimensional subspaces spanned by $x$ and $y$, respectively, to achieve dimensionality reduction. In most cases [1,4,5], a single pair of canonical weight vectors is not enough, since it means the dimension of the low-dimensional subspaces is just one. When a set of canonical weight vectors is required, the single-vector CCA (1) is extended to compute a pair of canonical weight matrices $X \in \mathbb{R}^{m\times k}$ and $Y \in \mathbb{R}^{n\times k}$ by solving the optimization problem
$$\max\; \operatorname{trace}(X^T C Y), \quad\text{subject to } X^T A X = I_k \ \text{and}\ Y^T B Y = I_k. \tag{3}$$
Usually, both $A$ and $B$ are symmetric positive definite. However, there are cases, such as the under-sampled problem [6], in which $A$ and $B$ may only be positive semi-definite. In such a case, regularization techniques [7,8,9,10] that add a multiple of the identity matrix to them are applied, and one finds the optimal solution of
$$\max\; \operatorname{trace}(X^T C Y), \quad\text{subject to } X^T (A + \kappa_a I_m) X = I_k \ \text{and}\ Y^T (B + \kappa_b I_n) Y = I_k,$$
where $\kappa_a$ and $\kappa_b$ are called regularization parameters and are usually chosen to maximize the cross-validation score [11]. In other words, $A$ and $B$ are replaced by $A + \kappa_a I_m$ and $B + \kappa_b I_n$ to keep $A$ and $B$ invertible. Hence, in this paper, we assume by default that $A$ and $B$ are both positive definite and that $m \ge n$, unless explicitly stated otherwise.
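For concreteness, the following minimal sketch (not part of the paper; the data sizes, the random data and the values of $\kappa_a$, $\kappa_b$ are purely illustrative assumptions) shows how the centered data and the regularized matrices $A$, $B$ and $C$ of (2)–(3) might be formed. Later snippets in this article reuse these names.

```python
# A minimal sketch (not from the paper) of forming the centered data and the
# regularized matrices A, B, C of (2)-(3); the sizes, the random data and the
# values of kappa_a, kappa_b are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
m, n, d = 50, 30, 200                    # features of the two views, sample size
Sa = rng.standard_normal((m, d))
Sb = rng.standard_normal((n, d))

# Centering: Sa <- Sa - (1/d)(Sa 1_d)1_d^T, and likewise for Sb.
Sa -= Sa.mean(axis=1, keepdims=True)
Sb -= Sb.mean(axis=1, keepdims=True)

kappa_a = kappa_b = 1e-4                 # regularization parameters
A = Sa @ Sa.T + kappa_a * np.eye(m)      # A + kappa_a * I_m (kept positive definite)
B = Sb @ Sb.T + kappa_b * np.eye(n)      # B + kappa_b * I_n
C = Sa @ Sb.T                            # cross-covariance matrix
```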
As shown in [4], the optimization problem (3) can be equivalently transformed into solving the following generalized eigenvalue problem:
$$K z := \begin{bmatrix} 0 & C \\ C^T & 0 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \lambda \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \lambda M z, \tag{4}$$
where the positive definiteness of the matrices $A$ and $B$ implies that $M$ is positive definite. The generalized eigenvalue problem (4) is referred to as the Canonical Correlation Generalized Eigenvalue Problem (CCGEP) in this paper. Define $J := \begin{bmatrix} R_A^{-1} & 0 \\ 0 & R_B^{-1} \end{bmatrix}$, where
$$A = R_A^T R_A \quad\text{and}\quad B = R_B^T R_B \tag{5}$$
are the Cholesky decompositions of $A$ and $B$. It is easy to verify that
$$J^T \begin{bmatrix} 0 & C \\ C^T & 0 \end{bmatrix} J\, J^{-1}\begin{bmatrix} x \\ y \end{bmatrix} = \lambda\, J^T \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} J\, J^{-1}\begin{bmatrix} x \\ y \end{bmatrix}$$
gives rise to
$$\begin{bmatrix} 0 & \widetilde{C} \\ \widetilde{C}^T & 0 \end{bmatrix}\begin{bmatrix} p \\ q \end{bmatrix} = \lambda \begin{bmatrix} p \\ q \end{bmatrix}, \tag{6}$$
where
$$\widetilde{C} = R_A^{-T} C R_B^{-1}, \quad p = R_A x \quad\text{and}\quad q = R_B y, \tag{7}$$
which implies
$$\widetilde{C} q = \lambda p \quad\text{and}\quad \widetilde{C}^T p = \lambda q. \tag{8}$$
This means that the eigenpairs of (4) can be obtained by computing the singular values and the associated left and right singular vectors of $\widetilde{C}$. This approach works well when the sample size $d$ and the feature dimensions $m$ and $n$ are of moderate size, but it becomes slow and numerically unstable for the large-scale datasets that are ubiquitous in the age of "Big Data" [12]. For large-scale datasets, the equivalence between (4) and (6) makes it possible to adapt subspace-type algorithms for computing a partial singular value decomposition, such as Lanczos-type algorithms [13,14] and Davidson-type algorithms [15,16], and translate them for CCGEP (4). However, in practice, decompositions of the sample covariance matrices $A$ and $B$ are usually unavailable in the large-scale case: the decomposition is too expensive to compute explicitly for large-scale problems, and it may destroy sparsity and other structural information. Furthermore, sometimes the sample covariance matrices $A$ and $B$ should never be formed explicitly at all, such as in online learning systems.
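For moderate $m$ and $n$, the reduction (5)–(8) can be carried out directly. The following sketch (our own, continuing the previous snippet and not the paper's code) computes the leading canonical weight vectors via two Cholesky factorizations and one SVD, which is exactly the dense approach that becomes impractical at large scale; the number of pairs $k$ is an arbitrary choice.

```python
# A dense, moderate-size sketch (our own, continuing the previous snippet) of
# the direct approach (5)-(8): Cholesky-factor A and B, take the SVD of
# C_tilde = R_A^{-T} C R_B^{-1}, and map singular vectors back to weights.
import numpy as np
from scipy.linalg import cholesky, solve_triangular, svd

RA = cholesky(A)                                   # A = RA^T RA, RA upper triangular
RB = cholesky(B)                                   # B = RB^T RB

Ct = solve_triangular(RA, C, trans='T', lower=False)       # RA^{-T} C
Ct = solve_triangular(RB, Ct.T, trans='T', lower=False).T  # (RA^{-T} C) RB^{-1}

P, lam, QT = svd(Ct, full_matrices=False)          # lam: canonical correlations
k = 3
X = solve_triangular(RA, P[:, :k], lower=False)    # x_i = RA^{-1} p_i
Y = solve_triangular(RB, QT[:k].T, lower=False)    # y_i = RB^{-1} q_i
# Up to rounding: X^T A X = I_k, Y^T B Y = I_k and X^T C Y = diag(lam[:k]).
```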
Meanwhile, in [17], it is suggested to solve CCGEP (4) by considering the large scale symmetric positive definite pencil $\{K, M\}$. Some subspace-type numerical algorithms have also been generalized to compute partial eigenpairs of $\{K, M\}$; see [18,19]. However, these generic algorithms do not exploit the special structure in (4), and they are usually less efficient than custom-made algorithms. Therefore, existing algorithms either cannot avoid decomposing the covariance matrices or do not take the structure of CCGEP into account.
In this paper, we focus on a Jacobi–Davidson type subspace method for canonical correlation analysis. The Jacobi–Davidson algorithm proposed in [20] combines Jacobi's approach with a Davidson-type subspace method. Its essence is to construct a correction for a given eigenvector approximation in a subspace orthogonal to that approximation. Approximations are extracted from a given subspace in a Davidson manner, and the subspace is then expanded by solving a correction equation. Due to its significant improvement in convergence, the Jacobi–Davidson method has become one of the most powerful algorithms for matrix eigenvalue problems and has been generalized to almost all fields of matrix computation. For example, in [15,21], Hochstenbach presented Jacobi–Davidson methods for singular value problems and generalized singular value problems, respectively. In [22,23], Jacobi–Davidson methods were developed to solve nonlinear and two-parameter eigenvalue problems, respectively. Some other recent work on Jacobi–Davidson methods can be found in [24,25,26,27,28,29]. Motivated by these facts, we continue this effort by extending the Jacobi–Davidson method to canonical correlation analysis. The main contribution is that the algorithm tackles CCGEP (4) directly, without involving large matrix decompositions, and takes advantage of the special structure of $K$ and $M$; the transformation of (4) into (6) is used only in our theoretical developments.
The rest of this paper is organized as follows. Section 2 collects some notations and a basic result for CCGEP that are essential to our later development. Our main algorithm is given and analyzed in detail in Section 3. We present some numerical examples in Section 4 to show the behaviors of our proposed algorithm and to support our analysis. Finally, conclusions are made in Section 5.

2. Preliminaries

Throughout this paper, $\mathbb{R}^{m\times n}$ is the set of all $m\times n$ real matrices, $\mathbb{R}^m = \mathbb{R}^{m\times 1}$ and $\mathbb{R} = \mathbb{R}^1$. $I_n$ is the $n\times n$ identity matrix. The superscript "$\cdot^T$" denotes the transpose, and $\|\cdot\|_1$ denotes the 1-norm of a vector or matrix. For any matrix $N \in \mathbb{R}^{m\times n}$ with $m \ge n$, $\sigma_i(N)$ for $i = 1, \dots, n$ denotes the singular values of $N$ in descending order.
For vectors $x, y \in \mathbb{R}^n$, the usual inner product and its induced norm are defined by
$$\langle x, y\rangle := y^T x, \qquad \|x\|_2 := \sqrt{\langle x, x\rangle}.$$
With them, the usual acute angle $\angle(x, y)$ between $x$ and $y$ can be defined by
$$\angle(x, y) := \arccos\frac{|\langle x, y\rangle|}{\|x\|_2\,\|y\|_2}.$$
Similarly, given any symmetric positive definite $W \in \mathbb{R}^{n\times n}$, the $W$-inner product and its induced $W$-norm are defined by
$$\langle x, y\rangle_W := y^T W x, \qquad \|x\|_W := \sqrt{\langle x, x\rangle_W}.$$
If $\langle x, y\rangle_W = 0$, then we say $x \perp_W y$ or $y \perp_W x$. The $W$-acute angle $\angle_W(x, y)$ between $x$ and $y$ can then be defined by
$$\angle_W(x, y) := \arccos\frac{|\langle x, y\rangle_W|}{\|x\|_W\,\|y\|_W}.$$
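These quantities are straightforward to realize in code; the helpers below are a convenience of this write-up (not from the paper) and assume $W$ is symmetric positive definite.

```python
# Convenience helpers (ours, not from the paper) for the W-inner product,
# W-norm and W-acute angle; W is assumed symmetric positive definite.
import numpy as np

def w_inner(x, y, W):
    """<x, y>_W = y^T W x."""
    return float(y @ (W @ x))

def w_norm(x, W):
    return np.sqrt(w_inner(x, x, W))

def w_angle(x, y, W):
    """W-acute angle between x and y."""
    c = abs(w_inner(x, y, W)) / (w_norm(x, W) * w_norm(y, W))
    return np.arccos(min(c, 1.0))        # clip rounding noise slightly above 1
```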
Let the singular value decomposition of $\widetilde{C}$ be $\widetilde{C} = P\Lambda Q^T$, where $P = [p_1, \dots, p_m] \in \mathbb{R}^{m\times m}$ and $Q = [q_1, \dots, q_n] \in \mathbb{R}^{n\times n}$ are orthogonal, i.e., $P^T P = I_m$ and $Q^T Q = I_n$, and $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n) \in \mathbb{R}^{m\times n}$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$ is a leading diagonal matrix. The singular value decomposition of $\widetilde{C}$ is closely related to the eigendecomposition of the following symmetric matrix [30] (p. 32):
$$\begin{bmatrix} 0 & \widetilde{C} \\ \widetilde{C}^T & 0 \end{bmatrix},$$
whose eigenvalues are $\pm\lambda_i$ for $i = 1, \dots, n$ plus $m - n$ zeros, i.e.,
$$\lambda_1 \ge \cdots \ge \lambda_n \ge \underbrace{0 \ge \cdots \ge 0}_{m-n} \ge -\lambda_n \ge \cdots \ge -\lambda_1, \tag{9}$$
with associated eigenvectors
$$\begin{bmatrix} p_i \\ \pm q_i \end{bmatrix},\ i = 1, 2, \dots, n, \quad\text{and}\quad \begin{bmatrix} p_i \\ 0 \end{bmatrix},\ i = n+1, \dots, m,$$
respectively. The equivalence between (4) and (6) implies that the eigenvalues of CCGEP (4) are $\pm\lambda_i$ for $i = 1, \dots, n$ plus $m - n$ zeros, and the corresponding eigenvectors are
$$\begin{bmatrix} x_i \\ \pm y_i \end{bmatrix},\ i = 1, 2, \dots, n, \quad\text{and}\quad \begin{bmatrix} x_i \\ 0 \end{bmatrix},\ i = n+1, \dots, m,$$
respectively, where
$$x_i = R_A^{-1} p_i \quad\text{and}\quad y_i = R_B^{-1} q_i. \tag{10}$$
Let $X = [x_1, \dots, x_m]$ and $Y = [y_1, \dots, y_n]$. Then the $A$- and $B$-orthonormality constraints on $X$ and $Y$, respectively, i.e.,
$$X^T A X = I_m \quad\text{and}\quad Y^T B Y = I_n, \tag{11}$$
follow from $P^T P = I_m$ and $Q^T Q = I_n$. Here, the first few $x_i$ and $y_i$ for $i = 1, 2, \dots, k$ with $k < n$ are the wanted canonical weight vectors. Furthermore, their corresponding eigenvalues satisfy the following maximization principle, which is critical to our later developments. For the proof, see Appendix A.1.
Theorem 1.
The following equality holds for any positive integers $k \le m$ and $\ell \le n$:
$$\sum_{i=1}^{\min\{k,\ell\}}\lambda_i = \max_{U^T A U = I_k,\; V^T B V = I_\ell}\; \sum_{i=1}^{\min\{k,\ell\}}\sigma_i\!\left(U^T C V\right), \tag{12}$$
where the maximum is over $U \in \mathbb{R}^{m\times k}$ and $V \in \mathbb{R}^{n\times\ell}$, $A$, $B$ and $C$ are defined in (2), $\lambda_i$ are defined in (9), and $\sigma_i(U^T C V)$ for $1 \le i \le \min\{k,\ell\}$ are the singular values of $U^T C V$.
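The maximization principle can be checked numerically on the small example built earlier; the check below is our own illustration and reuses $A$, $B$, $C$, $R_A$, $R_B$, $X$, $Y$ and the singular values lam from the previous snippets.

```python
# A quick numerical check (our own illustration) of the maximization principle:
# any A-/B-orthonormal pair (U, V) gives a sum of singular values of U^T C V
# no larger than the sum of the leading canonical correlations, with equality
# at the optimal pair (X, Y) built from the canonical weight vectors.
import numpy as np
from scipy.linalg import solve_triangular, svdvals

k = X.shape[1]
rng = np.random.default_rng(1)
Q1, _ = np.linalg.qr(rng.standard_normal((A.shape[0], k)))
Q2, _ = np.linalg.qr(rng.standard_normal((B.shape[0], k)))
U = solve_triangular(RA, Q1, lower=False)          # U^T A U = I_k
V = solve_triangular(RB, Q2, lower=False)          # V^T B V = I_k

print(svdvals(U.T @ C @ V).sum(), "<=", lam[:k].sum())   # strict for a random pair
print(svdvals(X.T @ C @ Y).sum(), "==", lam[:k].sum())   # equality at the optimum
```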

3. The Main Algorithm

The idea of the Jacobi–Davidson method [20] is to iteratively construct approximations to certain eigenpairs. It expands the search subspace by solving a correction equation and finds approximate eigenpairs as the best approximations within that search subspace.

3.1. Subspace Extraction

Let $\mathcal{X} \subseteq \mathbb{R}^m$ and $\mathcal{Y} \subseteq \mathbb{R}^n$ be subspaces with $\dim(\mathcal{X}) = k$ and $\dim(\mathcal{Y}) = \ell$, respectively. As stated in [31], we call $\{\mathcal{X}, \mathcal{Y}\}$ a pair of deflating subspaces of CCGEP (4) if
$$C\mathcal{Y} \subseteq A\mathcal{X} \quad\text{and}\quad C^T\mathcal{X} \subseteq B\mathcal{Y}. \tag{13}$$
Let $X \in \mathbb{R}^{m\times k}$ and $Y \in \mathbb{R}^{n\times\ell}$ be $A$- and $B$-orthonormal basis matrices of the subspaces $\mathcal{X}$ and $\mathcal{Y}$, respectively, i.e.,
$$X^T A X = I_k \quad\text{and}\quad Y^T B Y = I_\ell.$$
The inclusion (13) implies that there exist $D_A \in \mathbb{R}^{k\times\ell}$ and $D_B \in \mathbb{R}^{\ell\times k}$ [32] (Equation (2.11)) such that
$$C Y = A X D_A \quad\text{and}\quad C^T X = B Y D_B. \tag{14}$$
These equalities imply $D_A = X^T C Y = (Y^T C^T X)^T = D_B^T$, so (14) is equivalent to
$$C Y = A X D_A \quad\text{and}\quad C^T X = B Y D_A^T.$$
Now, if $(\lambda, \hat{x}, \hat{y})$ is a singular triplet of $D_A$, then $(\lambda, z)$ is an eigenpair of (4), where $z = [x^T, y^T]^T$, $x = X\hat{x}$ and $y = Y\hat{y}$. This is because
$$C y = C(Y\hat{y}) = (C Y)\hat{y} = A X D_A\hat{y} = \lambda A X\hat{x} = \lambda A x,$$
and similarly $C^T x = \lambda B y$. That means
$$\begin{bmatrix} 0 & C \\ C^T & 0 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \lambda \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}.$$
Hence, we have shown that a pair of deflating subspaces $\{\mathcal{X}, \mathcal{Y}\}$ can be used to recover the eigenpairs of CCGEP (4) associated with it. In practice, pairs of exact deflating subspaces are usually not available, and one usually uses Lanczos-type methods [14] or Davidson-type methods [15] to generate approximate ones, such as the Krylov subspaces in the Lanczos method [33]. Next, we discuss how to extract the best approximate eigenpairs from a given pair of approximate deflating subspaces.
In what follows, we consider the simple case $k = \ell$. Suppose $\{\mathcal{U}, \mathcal{V}\}$ is an approximation to a pair of deflating subspaces $\{\mathcal{X}, \mathcal{Y}\}$ with $\dim(\mathcal{U}) = \dim(\mathcal{V}) = k$. Let $U \in \mathbb{R}^{m\times k}$ and $V \in \mathbb{R}^{n\times k}$ be $A$- and $B$-orthonormal basis matrices of the subspaces $\mathcal{U}$ and $\mathcal{V}$, respectively. Denote by $\theta_i$, $i = 1, 2, \dots, k$, the singular values of $U^T C V$ in descending order, with associated left and right singular vectors $\hat{u}_i$ and $\hat{v}_i$, respectively, i.e.,
$$(U^T C V)\hat{v}_i = \theta_i\hat{u}_i \quad\text{and}\quad (U^T C V)^T\hat{u}_i = \theta_i\hat{v}_i, \quad\text{for } 1 \le i \le k.$$
Even though $U$ and $V$, as $A$- and $B$-orthonormal basis matrices, are not unique, the $\theta_i$ are. Motivated by the maximization principle in Theorem 1, we seek the best approximations associated with the pair of approximate deflating subspaces $\{\mathcal{U}, \mathcal{V}\}$ to the eigenpairs $(\lambda_i, z_i)$ ($1 \le i \le j \le k$) in the sense of
$$\max\; \sum_{i=1}^{j}\sigma_i\!\left(\widetilde{U}_j^T C\widetilde{V}_j\right) \tag{15}$$
over all $\widetilde{U}_j \in \mathbb{R}^{m\times j}$ and $\widetilde{V}_j \in \mathbb{R}^{n\times j}$ satisfying $\operatorname{span}(\widetilde{U}_j) \subseteq \mathcal{U}$, $\operatorname{span}(\widetilde{V}_j) \subseteq \mathcal{V}$ and $\widetilde{U}_j^T A\widetilde{U}_j = \widetilde{V}_j^T B\widetilde{V}_j = I_j$. We claim that the quantity in (15) equals $\sum_{i=1}^{j}\theta_i$. To see this, notice that any $\widetilde{U}_j$ and $\widetilde{V}_j$ in (15) can be written as
$$\widetilde{U}_j = U\widehat{U}_j \quad\text{and}\quad \widetilde{V}_j = V\widehat{V}_j$$
for some $\widehat{U}_j \in \mathbb{R}^{k\times j}$ and $\widehat{V}_j \in \mathbb{R}^{k\times j}$ with $\widehat{U}_j^T\widehat{U}_j = \widehat{V}_j^T\widehat{V}_j = I_j$, and vice versa. Therefore, the quantity in (15) is equal to
$$\max_{\widehat{U}_j^T\widehat{U}_j = \widehat{V}_j^T\widehat{V}_j = I_j}\; \sum_{i=1}^{j}\sigma_i\!\left(\widehat{U}_j^T U^T C V\widehat{V}_j\right),$$
which is $\sum_{i=1}^{j}\theta_i$ by properties of the singular value decomposition of $U^T C V$ [30]. The maximum is attained at $\widehat{U}_j = [\hat{u}_1, \hat{u}_2, \dots, \hat{u}_j]$ and $\widehat{V}_j = [\hat{v}_1, \hat{v}_2, \dots, \hat{v}_j]$. Therefore, the best approximations to $(\lambda_i, z_i)$ ($1 \le i \le j$) in the sense of (15) are naturally given by
$$(\theta_i, \tilde{z}_i), \quad\text{where } \tilde{z}_i = \begin{bmatrix} \tilde{x}_i \\ \tilde{y}_i \end{bmatrix},\ \tilde{x}_i = U\hat{u}_i \ \text{and}\ \tilde{y}_i = V\hat{v}_i. \tag{16}$$
Define the residual
$$r_i := K\tilde{z}_i - \theta_i M\tilde{z}_i = \begin{bmatrix} 0 & C \\ C^T & 0 \end{bmatrix}\begin{bmatrix} \tilde{x}_i \\ \tilde{y}_i \end{bmatrix} - \theta_i\begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}\begin{bmatrix} \tilde{x}_i \\ \tilde{y}_i \end{bmatrix} = \begin{bmatrix} r_a^{(i)} \\ r_b^{(i)} \end{bmatrix}, \tag{17}$$
where $K$ and $M$ are defined in (4), $r_a^{(i)} = C\tilde{y}_i - \theta_i A\tilde{x}_i$ and $r_b^{(i)} = C^T\tilde{x}_i - \theta_i B\tilde{y}_i$. Note that
$$U^T r_a^{(i)} = U^T\!\left(C\tilde{y}_i - \theta_i A\tilde{x}_i\right) = U^T C V\hat{v}_i - \theta_i U^T A U\hat{u}_i = 0,$$
and similarly $V^T r_b^{(i)} = 0$. We summarize the results of this subsection in the following theorem.
Theorem 2.
Suppose $\{\mathcal{U}, \mathcal{V}\}$ is a pair of approximate deflating subspaces with $\dim(\mathcal{U}) = \dim(\mathcal{V}) = k$. Let $U \in \mathbb{R}^{m\times k}$ and $V \in \mathbb{R}^{n\times k}$ be $A$- and $B$-orthonormal basis matrices of the subspaces $\mathcal{U}$ and $\mathcal{V}$, respectively. Denote by $\theta_i$, $i = 1, 2, \dots, k$, the singular values of $U^T C V$ in descending order. Then, for any $j \le k$,
$$\sum_{i=1}^{j}\theta_i = \max_{\substack{\operatorname{span}(\widetilde{U}_j)\subseteq\mathcal{U},\ \operatorname{span}(\widetilde{V}_j)\subseteq\mathcal{V} \\ \widetilde{U}_j^T A\widetilde{U}_j = \widetilde{V}_j^T B\widetilde{V}_j = I_j}}\; \sum_{i=1}^{j}\sigma_i\!\left(\widetilde{U}_j^T C\widetilde{V}_j\right),$$
the best approximations to the eigenpairs $(\lambda_i, z_i)$ ($1 \le i \le j$) in the sense of (15) are $(\theta_i, \tilde{z}_i)$ ($1 \le i \le j$) given by (16), and the associated residuals defined in (17) satisfy $r_a^{(i)} \perp \mathcal{U}$ and $r_b^{(i)} \perp \mathcal{V}$.
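The extraction step of Theorem 2 is easy to exercise numerically. The sketch below (our own illustration, not the paper's code) works on the small matrices $A$, $B$, $C$ built earlier: it draws random $A$- and $B$-orthonormal bases $U$, $V$, takes the SVD of $U^T C V$, forms the approximations (16) and residuals (17), and verifies $U^T r_a^{(i)} = 0$ and $V^T r_b^{(i)} = 0$. The Cholesky-based helper is only for this small demo; a large-scale code would orthogonalize in the $A$-/$B$-inner products without factoring $A$ and $B$.

```python
# Extraction step of Theorem 2 on the small example (our own illustration).
import numpy as np
from scipy.linalg import cholesky, qr, solve_triangular, svd

def orthonormalize(W, Z):
    """Return a W-orthonormal basis of span(Z) (demo-only, uses a Cholesky factor)."""
    RW = cholesky(W)                                # W = RW^T RW
    Q, _ = qr(RW @ Z, mode='economic')
    return solve_triangular(RW, Q, lower=False)

k = 5
rng = np.random.default_rng(2)
U = orthonormalize(A, rng.standard_normal((A.shape[0], k)))
V = orthonormalize(B, rng.standard_normal((B.shape[0], k)))

Uh, theta, Vh = svd(U.T @ C @ V)                    # theta: the values of Theorem 2
Xt, Yt = U @ Uh, V @ Vh.T                           # columns x_tilde_i, y_tilde_i of (16)
Ra = C @ Yt - (A @ Xt) * theta                      # residual columns r_a^(i) of (17)
Rb = C.T @ Xt - (B @ Yt) * theta                    # residual columns r_b^(i)
print(np.linalg.norm(U.T @ Ra), np.linalg.norm(V.T @ Rb))   # both are ~ 0
```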

3.2. Correction Equation

In this subsection, we construct a correction equation for a given eigenpair approximation. Suppose $(\theta, [\tilde{x}^T, \tilde{y}^T]^T)$ with $\tilde{x}^T A\tilde{x} = \tilde{y}^T B\tilde{y} = 1$ is an approximation to the eigenpair $(\lambda, [x^T, y^T]^T)$ of CCGEP (4), and $[r_a^T, r_b^T]^T$ is the associated residual. We seek $A$- and $B$-orthogonal corrections of $\tilde{x}$ and $\tilde{y}$, respectively, such that
$$\begin{bmatrix} 0 & C \\ C^T & 0 \end{bmatrix}\begin{bmatrix} \tilde{x}+s \\ \tilde{y}+t \end{bmatrix} = \lambda\begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}\begin{bmatrix} \tilde{x}+s \\ \tilde{y}+t \end{bmatrix}, \tag{18}$$
where $s \perp_A \tilde{x}$ and $t \perp_B \tilde{y}$. Then, by (18), we have
$$\begin{bmatrix} -\lambda A & C \\ C^T & -\lambda B \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix} = -\begin{bmatrix} r_a \\ r_b \end{bmatrix} + (\lambda - \theta)\begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}\begin{bmatrix} \tilde{x} \\ \tilde{y} \end{bmatrix}.$$
Notice that $r_a \perp \tilde{x}$ and $r_b \perp \tilde{y}$ by Theorem 2, which gives rise to
$$\begin{bmatrix} I_m - A\tilde{x}\tilde{x}^T & 0 \\ 0 & I_n - B\tilde{y}\tilde{y}^T \end{bmatrix}\begin{bmatrix} r_a \\ r_b \end{bmatrix} = \begin{bmatrix} r_a \\ r_b \end{bmatrix}, \qquad \begin{bmatrix} I_m - A\tilde{x}\tilde{x}^T & 0 \\ 0 & I_n - B\tilde{y}\tilde{y}^T \end{bmatrix}\begin{bmatrix} A\tilde{x} \\ B\tilde{y} \end{bmatrix} = 0,$$
and
$$\begin{bmatrix} I_m - A\tilde{x}\tilde{x}^T & 0 \\ 0 & I_n - B\tilde{y}\tilde{y}^T \end{bmatrix}\begin{bmatrix} -\lambda A & C \\ C^T & -\lambda B \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix} = -\begin{bmatrix} r_a \\ r_b \end{bmatrix}. \tag{20}$$
Because $s \perp_A \tilde{x}$ and $t \perp_B \tilde{y}$, Equation (20) can be rewritten as
$$\begin{bmatrix} I_m - A\tilde{x}\tilde{x}^T & 0 \\ 0 & I_n - B\tilde{y}\tilde{y}^T \end{bmatrix}\begin{bmatrix} -\lambda A & C \\ C^T & -\lambda B \end{bmatrix}\begin{bmatrix} I_m - \tilde{x}\tilde{x}^T A & 0 \\ 0 & I_n - \tilde{y}\tilde{y}^T B \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix} = -\begin{bmatrix} r_a \\ r_b \end{bmatrix}. \tag{21}$$
However, $\lambda$ is not known. It is natural to replace $\lambda$ by $\theta$ in (21) to obtain the final correction equation, i.e.,
$$\begin{bmatrix} I_m - A\tilde{x}\tilde{x}^T & 0 \\ 0 & I_n - B\tilde{y}\tilde{y}^T \end{bmatrix}\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}\begin{bmatrix} I_m - \tilde{x}\tilde{x}^T A & 0 \\ 0 & I_n - \tilde{y}\tilde{y}^T B \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix} = -\begin{bmatrix} r_a \\ r_b \end{bmatrix}. \tag{22}$$
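As a small-scale illustration (our own, reusing theta, Xt, Yt, Ra and Rb from the extraction snippet above rather than anything from the paper), the coefficient matrix of (22) can be assembled densely for the leading approximate pair. The checks confirm that the projected matrix is symmetric, which is why symmetric solvers such as CG or MINRES mentioned below are applicable, and that the right-hand side is consistent.

```python
# Dense assembly of the correction-equation matrix of (22) (our own illustration).
import numpy as np

m, n = C.shape
th, xt, yt = theta[0], Xt[:, 0], Yt[:, 0]
ra, rb = Ra[:, 0], Rb[:, 0]

PL = np.block([[np.eye(m) - np.outer(A @ xt, xt), np.zeros((m, n))],
               [np.zeros((n, m)), np.eye(n) - np.outer(B @ yt, yt)]])
S = np.block([[-th * A, C], [C.T, -th * B]])        # K - theta * M
G = PL @ S @ PL.T                                   # projected operator of (22)
print(np.allclose(G, G.T))                          # True: G is symmetric

rhs = -np.concatenate([ra, rb])
w, *_ = np.linalg.lstsq(G, rhs, rcond=None)         # stand-in for an "exact" solve
print(np.linalg.norm(G @ w - rhs))                  # ~ 0: the system is consistent
# In Algorithm 1 below, this system is solved only approximately (e.g., by MINRES),
# and the computed correction is re-orthogonalized against the bases in step 2.
```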
We summarize the procedure developed so far in Algorithm 1 and make a few comments on it.
(1) At step 2, $A$- and $B$-orthogonalization procedures are applied to ensure that $U^T A\tilde{s} = 0$ and $V^T B\tilde{t} = 0$.
(2) At step 7, in most cases the correction equation need not be solved exactly. A few steps of an iterative method for symmetric linear systems, such as the linear conjugate gradient method (CG) [34] or the minimum residual method (MINRES) [35], are sufficient. Usually, more steps in solving the correction equation lead to fewer outer iterations; this will be shown in the numerical examples.
(3) For the convergence test, we use the relative residual norm
$$\eta(\theta_i, \tilde{z}_i) := \frac{\|r_a^{(i)}\|_1 + \|r_b^{(i)}\|_1}{\left(\|C\|_1 + \theta_i\|A\|_1\right)\|\tilde{x}_i\|_1 + \left(\|C\|_1 + \theta_i\|B\|_1\right)\|\tilde{y}_i\|_1} \tag{23}$$
to determine whether the approximate eigenpair $(\theta_i, \tilde{z}_i)$ has converged to the desired accuracy. In addition, in a practical implementation, once one or several approximate eigenpairs have converged to a preset accuracy, they should be deflated so that they are not re-computed in the following iterations. Suppose $\lambda_i$ for $1 \le i \le j$, $X_j = [x_1, \dots, x_j]$ and $Y_j = [y_1, \dots, y_j]$ have been computed, where $j \le k$. We can consider the generalized eigenvalue problem
$$\widetilde{K} z = \lambda M z, \tag{24}$$
where
$$\widetilde{K} = \begin{bmatrix} I_m - A X_j X_j^T & 0 \\ 0 & I_n - B Y_j Y_j^T \end{bmatrix}\begin{bmatrix} 0 & C \\ C^T & 0 \end{bmatrix}\begin{bmatrix} I_m - X_j X_j^T A & 0 \\ 0 & I_n - Y_j Y_j^T B \end{bmatrix}.$$
By (11), it is clear that the eigenvalues of (24) fall into two groups: those associated with the eigenvectors $[x_1^T, y_1^T]^T, \dots, [x_j^T, y_j^T]^T, [x_1^T, -y_1^T]^T, \dots, [x_j^T, -y_j^T]^T$ are shifted to zero, while the others remain unchanged. Furthermore, for the correction equation we seek $s$ and $t$ subject to additional $A$- and $B$-orthogonality constraints against $X_j$ and $Y_j$, respectively. By a derivation similar to that of (22), the correction equation after deflation becomes
$$\begin{bmatrix} I_m - A\tilde{x}\tilde{x}^T & 0 \\ 0 & I_n - B\tilde{y}\tilde{y}^T \end{bmatrix}\begin{bmatrix} -\theta_1 A & \widehat{C} \\ \widehat{C}^T & -\theta_1 B \end{bmatrix}\begin{bmatrix} I_m - \tilde{x}\tilde{x}^T A & 0 \\ 0 & I_n - \tilde{y}\tilde{y}^T B \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix} = -\begin{bmatrix} r_a^{(1)} \\ r_b^{(1)} \end{bmatrix},$$
where $\widehat{C} = (I_m - A X_j X_j^T)\, C\, (I_n - Y_j Y_j^T B)$. Notice that $s \perp_A X_j$ and $t \perp_B Y_j$ imply $U \perp_A X_j$ and $V \perp_B Y_j$ in Algorithm 1, respectively; it follows that $U^T\widehat{C} V = U^T C V$.
(4) At step 5, LAPACK's routine xGESVD can be used to solve the singular value problem of $U^T C V$ because of its small size. (An image in the original article shows the form of $U^T C V$: a diagonal matrix bordered by the most recently appended row and column.) This form is preserved in the algorithm while refining the bases $U$ and $V$ at step 8, where the new basis matrices $U\widehat{U}$ and $V\widehat{V}$ are reassigned to $U$ and $V$, respectively. Although a few extra costs are incurred, this refinement is necessary in order to obtain faster convergence of the eigenvectors, as stated in [36,37]. Furthermore, a restart is easily executed by keeping the first $s_{\min}$ columns of $U$ and $V$ when the dimension of the subspaces $\operatorname{span}\{U\}$ and $\operatorname{span}\{V\}$ exceeds $s_{\max}$. The restart is carried out at step 8 to keep the sizes of $U$, $V$ and $U^T C V$ small. There are many ways to specify $s_{\max}$ and $s_{\min}$; in our numerical examples, we simply take $s_{\max} = 3k$ and $s_{\min} = k$.
Algorithm 1 Jacobi–Davidson method for canonical correlation analysis (JDCCA)
Input: Initial vectors $u_0$, $v_0$; set $s = u_0$, $t = v_0$ and $U = V = [\,]$.
Output: Converged canonical weight vectors $\tilde{x}_i$ and $\tilde{y}_i$ for $i = 1, \dots, k$.
1: for $iter = 1, 2, \dots$, until convergence do
2:  $A$- and $B$-orthogonalize $s$ and $t$ against $U$ and $V$, respectively, to obtain $\tilde{s}$ and $\tilde{t}$.
3:  Compute $u = \tilde{s}/\|\tilde{s}\|_A$ and $v = \tilde{t}/\|\tilde{t}\|_B$. Let $U = [U, u]$ and $V = [V, v]$.
4:  Update the corresponding column and row of $U^T C V$.
5:  Compute the singular value decomposition of $U^T C V$, i.e., $U^T C V = \widehat{U}\Theta\widehat{V}^T$.
6:  Compute the wanted approximate eigenpairs $(\theta_i, [\tilde{x}_i^T, \tilde{y}_i^T]^T)$ by (16) and the corresponding residuals $r_a^{(i)}$ and $r_b^{(i)}$.
7:  Solve
$$\begin{bmatrix} I_m - A\tilde{x}_1\tilde{x}_1^T & 0 \\ 0 & I_n - B\tilde{y}_1\tilde{y}_1^T \end{bmatrix}\begin{bmatrix} -\theta_1 A & C \\ C^T & -\theta_1 B \end{bmatrix}\begin{bmatrix} I_m - \tilde{x}_1\tilde{x}_1^T A & 0 \\ 0 & I_n - \tilde{y}_1\tilde{y}_1^T B \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix} = -\begin{bmatrix} r_a^{(1)} \\ r_b^{(1)} \end{bmatrix}.$$
8:  Update $U = U\widehat{U}$ and $V = V\widehat{V}$. If the dimension of $U$ and $V$ exceeds $s_{\max}$, then replace $U$ and $V$ with their first $s_{\min}$ columns, respectively.
9: end for
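To make the flow of Algorithm 1 concrete, here is a heavily simplified sketch of one possible implementation for the leading canonical pair only ($k = 1$): it omits deflation, restart and the incremental update of $U^T C V$, recomputes the small projected matrix at every sweep, and solves the correction equation (22) with a fixed number of MINRES steps through a matrix-free operator. It is not the authors' implementation; $A$, $B$ and $C$ come from the earlier snippets, and the names jdcca_leading, ng, maxit and tol are our own.

```python
# Simplified JDCCA sketch for the leading pair (our own illustration).
import numpy as np
from scipy.linalg import svd
from scipy.sparse.linalg import LinearOperator, minres

def jdcca_leading(A, B, C, ng=20, maxit=20, tol=1e-8):
    m, n = C.shape                       # without restart, keep maxit < min(m, n)
    U, V = np.zeros((m, 0)), np.zeros((n, 0))
    s, t = np.ones(m), np.ones(n)        # u_0 = ones(m,1), v_0 = ones(n,1)
    for _ in range(maxit):
        # Step 2: A-/B-orthogonalize s, t against U, V (repeated Gram-Schmidt).
        for _ in range(2):
            s = s - U @ (U.T @ (A @ s))
            t = t - V @ (V.T @ (B @ t))
        # Step 3: normalize in the A-/B-norms and expand the bases.
        U = np.column_stack([U, s / np.sqrt(s @ (A @ s))])
        V = np.column_stack([V, t / np.sqrt(t @ (B @ t))])
        # Steps 4-5: projected matrix U^T C V and its SVD.
        Uh, theta, Vh = svd(U.T @ C @ V)
        # Step 6: leading approximate eigenpair (16) and its residual (17).
        th = theta[0]
        xt, yt = U @ Uh[:, 0], V @ Vh[0, :]
        ra = C @ yt - th * (A @ xt)
        rb = C.T @ xt - th * (B @ yt)
        # Convergence test with the relative residual norm (23).
        eta = (np.linalg.norm(ra, 1) + np.linalg.norm(rb, 1)) / (
              (np.linalg.norm(C, 1) + th * np.linalg.norm(A, 1)) * np.linalg.norm(xt, 1)
            + (np.linalg.norm(C, 1) + th * np.linalg.norm(B, 1)) * np.linalg.norm(yt, 1))
        if eta <= tol:
            break
        # Step 7: correction equation (22), solved approximately by ng MINRES steps.
        Ax, By = A @ xt, B @ yt
        def matvec(w):
            w = np.asarray(w).ravel()
            ws, wt = w[:m], w[m:]
            ws = ws - xt * (Ax @ ws)                 # right projector
            wt = wt - yt * (By @ wt)
            top = C @ wt - th * (A @ ws)             # (K - theta*M) block action
            bot = C.T @ ws - th * (B @ wt)
            top = top - Ax * (xt @ top)              # left projector
            bot = bot - By * (yt @ bot)
            return np.concatenate([top, bot])
        G = LinearOperator((m + n, m + n), matvec=matvec, dtype=float)
        w, _ = minres(G, -np.concatenate([ra, rb]), maxiter=ng)
        s, t = w[:m], w[m:]                          # expansion vectors for the next sweep
    return th, xt, yt, eta

th1, x1, y1, eta1 = jdcca_leading(A, B, C)
print("leading canonical correlation (regularized):", th1, "residual:", eta1)
```

Because the correction equation is solved fairly accurately with $n_g = 20$ MINRES steps, only a few outer sweeps should be needed on this small example; for genuinely large problems, $A$, $B$ and $C$ would be touched only through matrix-vector products, which is exactly the setting Algorithm 1 targets.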

3.3. Convergence

Convergence theories for the Jacobi–Davidson method applied to the eigenvalue problem and to the singular value problem are given in [15,38]. Here we prove a similar convergence result for the Jacobi–Davidson method for CCGEP, based on the following lemma. Specifically, if we solve the correction Equation (22) exactly and $\tilde{x}$ and $\tilde{y}$ are close enough to $x$ and $y$, respectively, then the approximate eigenvectors can be expected to converge cubically. For the proofs, see Appendix A.2 and Appendix A.3.
Lemma 1.
Let $\lambda$ be a simple eigenvalue of CCGEP (4) with corresponding eigenvector $[x^T, y^T]^T$. Then the matrix
$$G := \begin{bmatrix} I_m - A x x^T & 0 \\ 0 & I_n - B y y^T \end{bmatrix}\begin{bmatrix} -\lambda A & C \\ C^T & -\lambda B \end{bmatrix}\begin{bmatrix} I_m - x x^T A & 0 \\ 0 & I_n - y y^T B \end{bmatrix}$$
is a bijection from $\operatorname{span}(x)^{\perp_A} \times \operatorname{span}(y)^{\perp_B}$ onto itself, where $\operatorname{span}(x)^{\perp_A}$ and $\operatorname{span}(y)^{\perp_B}$ are the $A$- and $B$-orthogonal complements of $\operatorname{span}(x)$ and $\operatorname{span}(y)$, respectively.
Theorem 3.
Assume the conditions of Lemma 1, $\sin\angle_A(x, \tilde{x}) = O(\varepsilon)$ and $\sin\angle_B(y, \tilde{y}) = O(\varepsilon)$. Let $[s^T, t^T]^T$ be the exact solution of the correction Equation (22). Then,
$$\sin\angle_A(x, \tilde{x}+s) = O(\varepsilon^3) \quad\text{and}\quad \sin\angle_B(y, \tilde{y}+t) = O(\varepsilon^3).$$

4. Numerical Examples

In this section, we present some numerical examples to illustrate the effectiveness of Algorithm 1. Our goal is to compute the first few canonical weight vectors. A computed approximate eigenpair $(\theta_i, \tilde{z}_i)$ is considered converged when its relative residual norm satisfies
$$\eta(\theta_i, \tilde{z}_i) = \frac{\|r_a^{(i)}\|_1 + \|r_b^{(i)}\|_1}{\left(\|C\|_1 + \theta_i\|A\|_1\right)\|\tilde{x}_i\|_1 + \left(\|C\|_1 + \theta_i\|B\|_1\right)\|\tilde{y}_i\|_1} \le 10^{-8}.$$
All experiments in this paper are executed on a Ubuntu 12.04 (64-bit) desktop with an Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz and 32 GB of RAM, using MATLAB 2010a in double-precision floating point arithmetic with machine epsilon $2.22\times 10^{-16}$.
Example 1.
We first examine Theorem 3 by using two pairs of data matrices $S_a$ and $S_b$ drawn from a publicly available handwritten numerals dataset (https://archive.ics.uci.edu/ml/datasets/Multiple+Features). It consists of features of handwritten numerals ('0'–'9'), with 200 patterns per digit. Each pattern is represented by six different feature sets, i.e., Fou, Fac, Kar, Pix, Zer and Mor. Two pairs of feature sets, Fou-Zer and Pix-Fou, are chosen for $S_a$ and $S_b$, respectively, so that $S_a \in \mathbb{R}^{76\times d}$ and $S_b \in \mathbb{R}^{47\times d}$ for Fou-Zer, and $S_a \in \mathbb{R}^{240\times d}$ and $S_b \in \mathbb{R}^{76\times d}$ for Pix-Fou, with $d = 2000$. To make the numerical example repeatable, the initial vectors are set to
$$u_0 = x_1 + 10^{-3}\times\texttt{ones}(m, 1) \quad\text{and}\quad v_0 = y_1 + 10^{-3}\times\texttt{ones}(n, 1),$$
where $m$ and $n$ are the numbers of rows of $S_a$ and $S_b$, respectively, ones is a MATLAB built-in function, and $[x_1^T, y_1^T]^T$ is computed by MATLAB's function eig applied to (4) and regarded as the "exact" eigenvector for testing purposes. The correction Equation (22) in Algorithm 1 is solved by a direct method, such as Gaussian elimination, and the solution $[s^T, t^T]^T$ obtained in this way is regarded as "exact" in this example. Figure 1 plots $\sin\angle_A(x_1, \tilde{x}_1)$ and $\sin\angle_B(y_1, \tilde{y}_1)$ over the first three iterations of Algorithm 1 for computing the first canonical weight vectors of Fou-Zer and Pix-Fou. Figure 1 clearly shows that the convergence of Algorithm 1 is very fast when the initial vectors are close enough to the exact vectors, and the cubic convergence of Algorithm 1 appears in the third iteration.
Example 2.
As stated in Algorithm 1, the implementation of JDCCA involves solving the correction Equation (22) at every step. The direct solvers mentioned in Example 1 require $O((m+n)^3)$ operations and are prohibitively expensive for large sparse linear systems. In such cases, iterative methods, such as the MINRES method, which is simply GMRES [39] applied to symmetric linear systems, are usually preferred. In this example, we report the effect of the number of steps used to solve the correction equation, denoted by $n_g$, on the total number of matrix-vector products (denoted by "#mvp"), the number of outer iterations (denoted by "#iter"), and the CPU time in seconds for Algorithm 1 to compute the first 10 canonical weight vectors of the test problems listed in Table 1. Table 1 describes three face datasets: the ORL (https://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html), FERET (http://www.nist.gov/itl/iad/ig/colorferet.cfm) and Yale (https://computervisiononline.com/dataset/1105138686) datasets. The ORL database contains 400 face images of 40 distinct persons. For each individual, there are 10 different gray-scale images with $92\times 112$ pixels, collected at different times and under different lighting and facial expressions (blinking or closed eyes, smiling or not smiling, wearing glasses or not). To apply CCA, as in the numerical experiments in [40], the ORL dataset is partitioned into two groups: we select the first five images per individual as the first view to generate the data matrix $S_a$, and the remaining images for $S_b$. Similarly, we obtain data matrices $S_a$ and $S_b$ for the FERET and Yale datasets. The numbers of rows and columns of $S_a$ and $S_b$ are detailed in Table 1.
In this example, we set the initial vectors $u_0 = \texttt{ones}(m, 1)$ and $v_0 = \texttt{ones}(n, 1)$ with $s_{\max} = 30$ and $s_{\min} = 10$ for restarting, and simply take the regularization parameters $\kappa_a = \kappa_b = 10^{-4}$. We let the number of MINRES steps $n_g$ vary from 5 to 40 and collect the numerical results in Figure 2. As expected, the total number of outer iterations decreases as $n_g$ increases, while the total number of matrix-vector products does not change monotonically with $n_g$; it depends on how much the increase of $n_g$ reduces the number of outer iterations. In addition, Figure 2 shows that the total #mvp is not the only deciding factor for the total CPU time: when $n_g$ is larger, the significantly reduced #iter leads to smaller total CPU time. For these three test examples, MINRES step counts $n_g$ of around 15 to 25 appear to be cost-effective, and increasing $n_g$ beyond 40 usually does not have a significant effect. The least efficient case is when $n_g$ is too small.
Example 3.
In this example, we compare Algorithm 1, i.e., JDCCA, with the Jacobi–Davidson QZ type method [41] (JDQZ) applied to the large scale symmetric positive definite pencil $\{K, M\}$ defined in (4), computing the first 10 canonical weight vectors of the test problems in Table 1 with $n_g = 20$ MINRES steps. We take $u_0 = \texttt{ones}(m, 1)$ and $v_0 = \texttt{ones}(n, 1)$ in Algorithm 1 and the initial vector $\texttt{ones}(m+n, 1)$ for the JDQZ algorithm, and compute the same relative residual norms $\eta(\theta_i, \tilde{z}_i)$. The corresponding numerical results are plotted in Figure 3. For these three test problems, Figure 3 suggests that Algorithm 1 consistently outperforms the JDQZ algorithm. Other experiments with different test problems and MINRES step counts, not reported here, also support this conclusion.

5. Conclusions

Several numerical algorithms are available for finding the canonical correlations and the associated canonical weight vectors between two data sets; however, there is very little discussion of the large scale sparse and structured matrix case in the literature. In this paper, a Jacobi–Davidson type method, Algorithm 1, is presented for large scale canonical correlation analysis by computing a small portion of the eigenpairs of the canonical correlation generalized eigenvalue problem (4). The theoretical result established in Theorem 3 demonstrates the cubic convergence of the approximate eigenvectors when the correction equation is solved exactly and the approximate eigenvector of the previous step is close enough to the exact one. The numerical results confirm the asymptotic convergence rate provided by Theorem 3 and demonstrate that Algorithm 1 performs far better than the JDQZ method applied to the large scale symmetric positive definite pencil $\{K, M\}$.
Notice that the main computational task in every iteration of Algorithm 1 is solving the correction Equation (22). In our numerical examples, we only considered the plain version of MINRES, i.e., without any preconditioning. However, incorporating a preconditioner presents no difficulty and can improve the numerical performance if a preconditioner is available. In addition, multi-set canonical correlation analysis (MCCA) [42], proposed to analyze linear relationships among more than two data sets, can be equivalently transformed into the following generalized eigenvalue problem:
$$\begin{bmatrix} 0 & C_{12} & \cdots & C_{1t} \\ C_{21} & 0 & \cdots & C_{2t} \\ \vdots & \vdots & \ddots & \vdots \\ C_{t1} & C_{t2} & \cdots & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_t \end{bmatrix} = \lambda\begin{bmatrix} C_{11} & 0 & \cdots & 0 \\ 0 & C_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C_{tt} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_t \end{bmatrix},$$
where $C_{ij} = S_i S_j^T$ and $S_i$ is the data matrix of the $i$-th data set. The development of efficient Jacobi–Davidson methods for such large scale MCCA problems will be a subject of our future study.
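As a final illustration (our own sketch, not part of the paper), the block matrices of the MCCA eigenvalue problem above can be assembled from $t$ centered data views while keeping the block structure explicit; the view sizes below are arbitrary assumptions.

```python
# Assembling the MCCA block matrices (our own illustration).
import numpy as np
from scipy.sparse import bmat, csr_matrix

rng = np.random.default_rng(3)
views = [rng.standard_normal((sz, 200)) for sz in (40, 30, 20)]    # t = 3 views
views = [S - S.mean(axis=1, keepdims=True) for S in views]         # center each view
t = len(views)

K = bmat([[None if i == j else csr_matrix(views[i] @ views[j].T)   # C_ij, i != j
           for j in range(t)] for i in range(t)], format='csr')
M = bmat([[csr_matrix(views[i] @ views[i].T) if i == j else None   # C_ii on the diagonal
           for j in range(t)] for i in range(t)], format='csr')
print(K.shape, M.shape)        # both (40+30+20) x (40+30+20)
```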

Author Contributions

Writing—original draft, Z.T.; Writing—review and editing, X.Z. and Z.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported in part by National Natural Science Foundation of China NSFC-11601081 and the research fund for distinguished young scholars of Fujian Agriculture and Forestry University No. xjq201727.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1

Proof of Theorem 1.
To prove (12), for any $U \in \mathbb{R}^{m\times k}$ and $V \in \mathbb{R}^{n\times\ell}$ satisfying $U^T A U = I_k$ and $V^T B V = I_\ell$, respectively, we first consider the augmented matrices of $\widetilde{C}$ and $U^T C V$, i.e.,
$$\begin{bmatrix} 0 & \widetilde{C} \\ \widetilde{C}^T & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 & U^T C V \\ V^T C^T U & 0 \end{bmatrix},$$
whose eigenvalues are $\pm\lambda_i$ for $i = 1, \dots, n$ plus $m - n$ zeros, and $\pm\sigma_i(U^T C V)$ for $i = 1, \dots, \min\{k,\ell\}$ plus $k + \ell - 2\min\{k,\ell\}$ zeros, respectively. Notice that
$$\begin{bmatrix} 0 & U^T C V \\ V^T C^T U & 0 \end{bmatrix} = \begin{bmatrix} U^T R_A^T & 0 \\ 0 & V^T R_B^T \end{bmatrix}\begin{bmatrix} 0 & \widetilde{C} \\ \widetilde{C}^T & 0 \end{bmatrix}\begin{bmatrix} R_A U & 0 \\ 0 & R_B V \end{bmatrix},$$
where $R_A$ and $R_B$ are defined in (5), and $R_A U$ and $R_B V$ satisfy $(R_A U)^T R_A U = U^T A U = I_k$ and $(R_B V)^T R_B V = V^T B V = I_\ell$, respectively. Hence, applying Cauchy's interlacing inequalities [30] (Corollary 4.4) for the symmetric eigenvalue problem to the matrices $\begin{bmatrix} 0 & \widetilde{C} \\ \widetilde{C}^T & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & U^T C V \\ V^T C^T U & 0 \end{bmatrix}$, we get $\lambda_i \ge \sigma_i(U^T C V)$ for $1 \le i \le \min\{k,\ell\}$ and consequently
$$\sum_{i=1}^{\min\{k,\ell\}}\lambda_i \ge \max_{U^T A U = I_k,\; V^T B V = I_\ell}\; \sum_{i=1}^{\min\{k,\ell\}}\sigma_i(U^T C V), \tag{A1}$$
where the maximum is over all $U \in \mathbb{R}^{m\times k}$ and $V \in \mathbb{R}^{n\times\ell}$ such that $U^T A U = I_k$ and $V^T B V = I_\ell$.
On the other hand, let $U = [x_1, x_2, \dots, x_k]$ and $V = [y_1, y_2, \dots, y_\ell]$, where $x_i$ and $y_i$ are defined in (10). Then, by (11), we have $U^T A U = I_k$ and $V^T B V = I_\ell$. Furthermore,
$$U^T C V = [p_1, p_2, \dots, p_k]^T\, \widetilde{C}\, [q_1, q_2, \dots, q_\ell] = \operatorname{diag}(\lambda_1, \dots, \lambda_{\min\{k,\ell\}}) \in \mathbb{R}^{k\times\ell},$$
which gives $\sigma_i(U^T C V) = \lambda_i$ for $1 \le i \le \min\{k,\ell\}$ and thus
$$\sum_{i=1}^{\min\{k,\ell\}}\lambda_i \le \max_{U^T A U = I_k,\; V^T B V = I_\ell}\; \sum_{i=1}^{\min\{k,\ell\}}\sigma_i(U^T C V). \tag{A2}$$
Equation (12) is a consequence of (A1) and (A2). □

Appendix A.2

Proof of Lemma 1.
Let $[w_1^T, w_2^T]^T \in \operatorname{span}(x)^{\perp_A} \times \operatorname{span}(y)^{\perp_B}$ satisfy $G[w_1^T, w_2^T]^T = 0$. We will prove that $[w_1^T, w_2^T]^T = 0$. Since
$$G\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} I_m - A x x^T & 0 \\ 0 & I_n - B y y^T \end{bmatrix}\begin{bmatrix} -\lambda A & C \\ C^T & -\lambda B \end{bmatrix}\begin{bmatrix} w_1 \\ w_2 \end{bmatrix},$$
we have
$$\begin{bmatrix} -\lambda A & C \\ C^T & -\lambda B \end{bmatrix}\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{bmatrix} \gamma_1 A x \\ \gamma_2 B y \end{bmatrix},$$
where $\gamma_1 = x^T(C w_2 - \lambda A w_1)$ and $\gamma_2 = y^T(C^T w_1 - \lambda B w_2)$, which leads to
$$\begin{aligned} C w_2 &= \lambda A w_1 + \gamma_1 A x, & C^T w_1 &= \lambda B w_2 + \gamma_2 B y, \\ C w_2 &= \lambda R_A^T R_A w_1 + \gamma_1 R_A^T R_A x, & C^T w_1 &= \lambda R_B^T R_B w_2 + \gamma_2 R_B^T R_B y. \end{aligned} \tag{A3}$$
Let $\widetilde{w}_1 = R_A w_1$ and $\widetilde{w}_2 = R_B w_2$. Then the equality (A3) can be rewritten as
$$\widetilde{C}\widetilde{w}_2 = \lambda\widetilde{w}_1 + \gamma_1 p, \qquad \widetilde{C}^T\widetilde{w}_1 = \lambda\widetilde{w}_2 + \gamma_2 q, \tag{A4}$$
where $\widetilde{C}$, $p$ and $q$ are defined in (7). Multiplying the first and second equations of (A4) by $\widetilde{C}^T$ and $\widetilde{C}$ from the left, respectively, we get
$$\begin{aligned} \widetilde{C}^T\widetilde{C}\widetilde{w}_2 &= \lambda\widetilde{C}^T\widetilde{w}_1 + \gamma_1\widetilde{C}^T p = \lambda^2\widetilde{w}_2 + \lambda\gamma_2 q + \gamma_1\lambda q, \\ \widetilde{C}\widetilde{C}^T\widetilde{w}_1 &= \lambda\widetilde{C}\widetilde{w}_2 + \gamma_2\widetilde{C} q = \lambda^2\widetilde{w}_1 + \lambda\gamma_1 p + \gamma_2\lambda p, \\ (\widetilde{C}^T\widetilde{C} - \lambda^2 I_n)\widetilde{w}_2 &= (\lambda\gamma_2 + \lambda\gamma_1) q, \\ (\widetilde{C}\widetilde{C}^T - \lambda^2 I_m)\widetilde{w}_1 &= (\lambda\gamma_1 + \lambda\gamma_2) p. \end{aligned}$$
Therefore, both $\widetilde{w}_1$ and $p$ belong to the kernel of $(\widetilde{C}\widetilde{C}^T - \lambda^2 I_m)^2$, and both $\widetilde{w}_2$ and $q$ belong to the kernel of $(\widetilde{C}^T\widetilde{C} - \lambda^2 I_n)^2$. The simplicity of $\lambda$ implies that $\widetilde{w}_1$ and $\widetilde{w}_2$ are multiples of $p$ and $q$, respectively. Since $w_1 \perp_A x$ and $w_2 \perp_B y$, we have $\widetilde{w}_1 \perp p$ and $\widetilde{w}_2 \perp q$, which means $\widetilde{w}_1 = \widetilde{w}_2 = 0$. Therefore, $w_1 = w_2 = 0$, and the bijectivity follows by comparing dimensions. □

Appendix A.3

Proof of Theorem 3.
Let
$$F := \begin{bmatrix} I_m - A\tilde{x}\tilde{x}^T & 0 \\ 0 & I_n - B\tilde{y}\tilde{y}^T \end{bmatrix}.$$
Then the correction Equation (22) reads
$$F\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}F^T\begin{bmatrix} s \\ t \end{bmatrix} = -\begin{bmatrix} r_a \\ r_b \end{bmatrix}. \tag{A5}$$
Since $\|\tilde{x}\|_A = \|x\|_A = 1$, there exist a scalar $\alpha$ and $f \perp_A \tilde{x}$ such that $x = \alpha\tilde{x} + f$ with $\alpha^2 + \|f\|_A^2 = 1$. It follows that $\frac{x}{\alpha} = \tilde{x} + \tilde{f}$, where $\tilde{f} = \frac{f}{\alpha}$ and $\|\tilde{f}\|_A = \tan\angle_A(x, \tilde{x}) = O(\varepsilon)$. Similarly, there are $\tilde{g} \perp_B \tilde{y}$ and a scalar $\beta$ such that $\frac{y}{\beta} = \tilde{y} + \tilde{g}$, where $\|\tilde{g}\|_B = \tan\angle_B(y, \tilde{y}) = O(\varepsilon)$. Note that
$$\begin{aligned} 0 &= \begin{bmatrix} -\lambda A & C \\ C^T & -\lambda B \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -\lambda\alpha A & \beta C \\ \alpha C^T & -\lambda\beta B \end{bmatrix}\begin{bmatrix} x/\alpha \\ y/\beta \end{bmatrix} \\ &= \left(\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix} + \begin{bmatrix} (\theta-\lambda\alpha)A & (\beta-1)C \\ (\alpha-1)C^T & (\theta-\lambda\beta)B \end{bmatrix}\right)\begin{bmatrix} x/\alpha \\ y/\beta \end{bmatrix} \\ &= \begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}\begin{bmatrix} x/\alpha \\ y/\beta \end{bmatrix} - \begin{bmatrix} \omega_1 A x \\ \omega_2 B y \end{bmatrix}, \end{aligned} \tag{A6}$$
where $\omega_1 = \frac{\lambda\alpha-\theta}{\alpha} + \frac{\lambda(1-\beta)}{\beta}$ and $\omega_2 = \frac{\lambda\beta-\theta}{\beta} + \frac{\lambda(1-\alpha)}{\alpha}$. Since $\frac{x}{\alpha} = \tilde{x}+\tilde{f}$ and $\frac{y}{\beta} = \tilde{y}+\tilde{g}$, the equality (A6) leads to
$$\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}\begin{bmatrix} \tilde{f} \\ \tilde{g} \end{bmatrix} = -\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}\begin{bmatrix} \tilde{x} \\ \tilde{y} \end{bmatrix} + \begin{bmatrix} \omega_1 A x \\ \omega_2 B y \end{bmatrix} = -\begin{bmatrix} r_a \\ r_b \end{bmatrix} + \begin{bmatrix} \omega_1\alpha A(\tilde{x}+\tilde{f}) \\ \omega_2\beta B(\tilde{y}+\tilde{g}) \end{bmatrix}. \tag{A7}$$
Note that $F\begin{bmatrix} r_a \\ r_b \end{bmatrix} = \begin{bmatrix} r_a \\ r_b \end{bmatrix}$, $F\begin{bmatrix} A\tilde{x} \\ B\tilde{y} \end{bmatrix} = 0$, $F\begin{bmatrix} A\tilde{f} \\ B\tilde{g} \end{bmatrix} = \begin{bmatrix} A\tilde{f} \\ B\tilde{g} \end{bmatrix}$ and $F^T\begin{bmatrix} \tilde{f} \\ \tilde{g} \end{bmatrix} = \begin{bmatrix} \tilde{f} \\ \tilde{g} \end{bmatrix}$. Then, by (A7),
$$F\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}F^T\begin{bmatrix} \tilde{f} \\ \tilde{g} \end{bmatrix} = F\left(-\begin{bmatrix} r_a \\ r_b \end{bmatrix} + \begin{bmatrix} \omega_1\alpha A(\tilde{x}+\tilde{f}) \\ \omega_2\beta B(\tilde{y}+\tilde{g}) \end{bmatrix}\right) = -\begin{bmatrix} r_a \\ r_b \end{bmatrix} + \begin{bmatrix} \omega_1\alpha A\tilde{f} \\ \omega_2\beta B\tilde{g} \end{bmatrix}. \tag{A8}$$
Subtracting (A5) from (A8) gives
$$F\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}F^T\begin{bmatrix} \tilde{f}-s \\ \tilde{g}-t \end{bmatrix} = \begin{bmatrix} \omega_1\alpha A\tilde{f} \\ \omega_2\beta B\tilde{g} \end{bmatrix}. \tag{A9}$$
In addition, since $r_a \perp \tilde{x}$ and $r_b \perp \tilde{y}$, multiplying (A7) on the left by $\begin{bmatrix} \tilde{x} & 0 \\ 0 & \tilde{y} \end{bmatrix}^T$ leads to
$$\begin{aligned} \begin{bmatrix} \omega_1\alpha \\ \omega_2\beta \end{bmatrix} &= \begin{bmatrix} (\tilde{x}^T A\tilde{x})^{-1} & \\ & (\tilde{y}^T B\tilde{y})^{-1} \end{bmatrix}\begin{bmatrix} -\theta\tilde{x}^T A & \tilde{x}^T C \\ \tilde{y}^T C^T & -\theta\tilde{y}^T B \end{bmatrix}\begin{bmatrix} \tilde{f} \\ \tilde{g} \end{bmatrix} \\ &= \begin{bmatrix} -\theta\tilde{x}^T A\tilde{f} + \tilde{x}^T C\tilde{g} \\ \tilde{y}^T C^T\tilde{f} - \theta\tilde{y}^T B\tilde{g} \end{bmatrix} \qquad (\text{by } \tilde{x}^T A\tilde{x} = \tilde{y}^T B\tilde{y} = 1) \\ &= \begin{bmatrix} \tilde{x}^T C\tilde{g} \\ \tilde{y}^T C^T\tilde{f} \end{bmatrix} \qquad (\text{since } \tilde{x}^T A\tilde{f} = \tilde{y}^T B\tilde{g} = 0) \\ &= \begin{bmatrix} (\tfrac{x}{\alpha}-\tilde{f})^T C\tilde{g} \\ (\tfrac{y}{\beta}-\tilde{g})^T C^T\tilde{f} \end{bmatrix} = \begin{bmatrix} (\tfrac{x}{\alpha})^T C\tilde{g} - \tilde{f}^T C\tilde{g} \\ (\tfrac{y}{\beta})^T C^T\tilde{f} - \tilde{g}^T C^T\tilde{f} \end{bmatrix} = \begin{bmatrix} \tfrac{\lambda\beta}{\alpha}\,\tilde{g}^T B\tilde{g} - \tilde{f}^T C\tilde{g} \\ \tfrac{\lambda\alpha}{\beta}\,\tilde{f}^T A\tilde{f} - \tilde{g}^T C^T\tilde{f} \end{bmatrix}. \end{aligned} \tag{A10}$$
By Lemma 1, when $\tilde{x}$ and $\tilde{y}$ are close enough to $x$ and $y$, respectively, the matrix $F\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}F^T$ is invertible on $\operatorname{span}(\tilde{x})^{\perp_A} \times \operatorname{span}(\tilde{y})^{\perp_B}$. It follows from (A9) that
$$\begin{bmatrix} \tilde{f}-s \\ \tilde{g}-t \end{bmatrix} = \left(F\begin{bmatrix} -\theta A & C \\ C^T & -\theta B \end{bmatrix}F^T\right)^{-1}\begin{bmatrix} \omega_1\alpha A\tilde{f} \\ \omega_2\beta B\tilde{g} \end{bmatrix} \quad\text{and hence}\quad \left\|\begin{bmatrix} \tilde{f}-s \\ \tilde{g}-t \end{bmatrix}\right\|_M = O(\varepsilon^3).$$
The last estimate holds because of $\|\tilde{f}\|_A = O(\varepsilon)$, $\|\tilde{g}\|_B = O(\varepsilon)$ and (A10), and it means $\|\tilde{f}-s\|_A = O(\varepsilon^3)$ and $\|\tilde{g}-t\|_B = O(\varepsilon^3)$. Therefore,
$$|\sin\angle_A(x, \tilde{x}+s)| = \left|\sin\angle_A\!\left(x, \tfrac{x}{\alpha}+s-\tilde{f}\right)\right| = \frac{\left\|X^T A\!\left(\tfrac{x}{\alpha}+s-\tilde{f}\right)\right\|_2}{\|\tilde{x}+s\|_A} = \frac{\|X^T A(\tilde{f}-s)\|_2}{\|\tilde{x}+s\|_A} \le \frac{\|\tilde{f}-s\|_A}{\|\tilde{x}+s\|_A} = O(\varepsilon^3),$$
where $X = [x_2, \dots, x_m]$. Similarly, $|\sin\angle_B(y, \tilde{y}+t)| = O(\varepsilon^3)$. □

References

  1. Hardoon, D.R.; Szedmak, S.; Shawe-Taylor, J. Canonical correlation analysis: An overview with application to learning methods. Neural Comput. 2004, 16, 2639–2664. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Hotelling, H. Relations between two sets of variates. Biometrika 1936, 28, 321–377. [Google Scholar]
  3. Wang, L.; Zhang, L.H.; Bai, Z.; Li, R.C. Orthogonal canonical correlation analysis and applications. Opt. Methods Softw. 2020, 35, 787–807. [Google Scholar] [CrossRef]
  4. Uurtio, V.; Monteiro, J.M.; Kandola, J.; Shawe-Taylor, J.; Fernandez-Reyes, D.; Rousu, J. A tutorial on canonical correlation methods. ACM Comput. Surv. 2017, 50, 1–33. [Google Scholar] [CrossRef] [Green Version]
  5. Zhang, L.H.; Wang, L.; Bai, Z.; Li, R.C. A self-consistent-field iteration for orthogonal CCA. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 1–15. [Google Scholar] [CrossRef]
  6. Fukunaga, K. Introduction to Statistical Pattern Recognition; Elsevier: Amsterdam, The Netherlands, 2013. [Google Scholar]
  7. González, I.; Déjean, S.; Martin, P.G.P.; Gonçalves, O.; Besse, P.; Baccini, A. Highlighting relationships between heterogeneous biological data through graphical displays based on regularized canonical correlation analysis. J. Biol. Syst. 2009, 17, 173–199. [Google Scholar] [CrossRef]
  8. Leurgans, S.E.; Moyeed, R.A.; Silverman, B.W. Canonical correlation analysis when the data are curves. J. R. Stat. Soc. B. Stat. Methodol. 1993, 55, 725–740. [Google Scholar] [CrossRef]
  9. Cruz-Cano, R.; Lee, M.L.T. Fast regularized canonical correlation analysis. Comput. Stat. Data Anal. 2014, 70, 88–100. [Google Scholar]
  10. Vinod, H.D. Canonical ridge and econometrics of joint production. J. Econ. 1976, 4, 147–166. [Google Scholar] [CrossRef]
  11. González, I.; Déjean, S.; Martin, P.G.P.; Baccini, A. CCA: An R package to extend canonical correlation analysis. J. Stat. Softw. 2008, 23, 1–14. [Google Scholar] [CrossRef] [Green Version]
  12. Ma, Z. Canonical Correlation Analysis and Network Data Modeling: Statistical and Computational Properties. Ph.D. Thesis, University of Pennsylvania, Philadelphia, PA, USA, 2017. [Google Scholar]
  13. Golub, G.; Kahan, W. Calculating the singular values and pseudo-inverse of a matrix. SIAM J. Numer. Anal. 1965, 2, 205–224. [Google Scholar] [CrossRef]
  14. Jia, Z.; Niu, D. An implicitly restarted refined bidiagonalization Lanczos method for computing a partial singular value decomposition. SIAM J. Matrix Anal. Appl. 2003, 25, 246–265. [Google Scholar] [CrossRef]
  15. Hochstenbach, M.E. A Jacobi—Davidson type SVD method. SIAM J. Sci. Comput. 2001, 23, 606–628. [Google Scholar] [CrossRef] [Green Version]
  16. Zhou, Y.; Wang, Z.; Zhou, A. Accelerating large partial EVD/SVD calculations by filtered block Davidson methods. Sci. China Math. 2016, 59, 1635–1662. [Google Scholar] [CrossRef]
  17. Allen-Zhu, Z.; Li, Y. Doubly accelerated methods for faster CCA and generalized eigendecomposition. In Proceedings of the International Conference on Machine Learning, Sydney, NSW, Australia, 6–11 August 2017; pp. 98–106. [Google Scholar]
  18. Saad, Y. Numerical Methods for Large Eigenvalue Problems: Revised Edition; SIAM: Philadelphia, PA, USA, 2011. [Google Scholar]
  19. Stewart, G.W. Matrix Algorithms Volume II: Eigensystems; SIAM: Philadelphia, PA, USA, 2001; Volume 2. [Google Scholar]
  20. Sleijpen, G.L.G.; Van der Vorst, H.A. A Jacobi—Davidson iteration method for linear eigenvalue problems. SIAM Rev. 2000, 42, 267–293. [Google Scholar] [CrossRef]
  21. Hochstenbach, M.E. A Jacobi—Davidson type method for the generalized singular value problem. Linear Algebra Appl. 2009, 431, 471–487. [Google Scholar] [CrossRef] [Green Version]
  22. Betcke, T.; Voss, H. A Jacobi–Davidson-type projection method for nonlinear eigenvalue problems. Future Gen. Comput. Syst. 2004, 20, 363–372. [Google Scholar] [CrossRef] [Green Version]
  23. Hochstenbach, M.E.; Plestenjak, B. A Jacobi—Davidson type method for a right definite two-parameter eigenvalue problem. SIAM J. Matrix Anal. Appl. 2002, 24, 392–410. [Google Scholar] [CrossRef] [Green Version]
  24. Arbenz, P.; Hochstenbach, M.E. A Jacobi—Davidson method for solving complex symmetric eigenvalue problems. SIAM J. Sci. Comput. 2004, 25, 1655–1673. [Google Scholar] [CrossRef] [Green Version]
  25. Campos, C.; Roman, J.E. A polynomial Jacobi—Davidson solver with support for non-monomial bases and deflation. BIT Numer. Math. 2019, 60, 295–318. [Google Scholar] [CrossRef]
  26. Hochstenbach, M.E. A Jacobi—Davidson type method for the product eigenvalue problem. J. Comput. Appl. Math. 2008, 212, 46–62. [Google Scholar] [CrossRef]
  27. Hochstenbach, M.E.; Muhič, A.; Plestenjak, B. Jacobi—Davidson methods for polynomial two-parameter eigenvalue problems. J. Comput. Appl. Math. 2015, 288, 251–263. [Google Scholar] [CrossRef]
  28. Meerbergen, K.; Schröder, C.; Voss, H. A Jacobi–Davidson method for two-real-parameter nonlinear eigenvalue problems arising from delay-differential equations. Numer. Linear Algebra Appl. 2013, 20, 852–868. [Google Scholar] [CrossRef]
  29. Rakhuba, M.V.; Oseledets, I.V. Jacobi–Davidson method on low-rank matrix manifolds. SIAM J. Sci. Comput. 2018, 40, A1149–A1170. [Google Scholar] [CrossRef]
  30. Stewart, G.W.; Sun, J.G. Matrix Perturbation Theory; Academic Press: Boston, MA, USA, 1990. [Google Scholar]
  31. Teng, Z.; Wang, X. Majorization bounds for SVD. Jpn. J. Ind. Appl. Math. 2018, 35, 1163–1172. [Google Scholar] [CrossRef]
  32. Bai, Z.; Li, R.C. Minimization principles and computation for the generalized linear response eigenvalue problem. BIT Numer. Math. 2014, 54, 31–54. [Google Scholar] [CrossRef]
  33. Teng, Z.; Li, R.C. Convergence analysis of Lanczos-type methods for the linear response eigenvalue problem. J. Comput. Appl. Math. 2013, 247, 17–33. [Google Scholar] [CrossRef]
  34. Demmel, J.W. Applied Numerical Linear Algebra; SIAM: Philadelphia, PA, USA, 1997. [Google Scholar]
  35. Saad, Y. Iterative Methods for Sparse Linear Systems; SIAM: Philadelphia, PA, USA, 2003. [Google Scholar]
  36. Teng, Z.; Zhou, Y.; Li, R.C. A block Chebyshev-Davidson method for linear response eigenvalue problems. Adv. Comput. Math. 2016, 42, 1103–1128. [Google Scholar] [CrossRef]
  37. Zhou, Y.; Saad, Y. A Chebyshev–Davidson algorithm for large symmetric eigenproblems. SIAM J. Matrix Anal. Appl. 2007, 29, 954–971. [Google Scholar] [CrossRef] [Green Version]
  38. Sleijpen, G.L.G.; Van der Vorst, H.A. The Jacobi–Davidson method for eigenvalue problems as an accelerated inexact Newton scheme. In Proceedings of the IMACS Conference, Blagoevgrad, Bulgaria, 17–20 June 1995. [Google Scholar]
  39. Saad, Y.; Schultz, M.H. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Statist. Comput. 1986, 7, 856–869. [Google Scholar] [CrossRef] [Green Version]
  40. Lee, S.H.; Choi, S. Two-dimensional canonical correlation analysis. IEEE Signal Process. Lett. 2007, 14, 735–738. [Google Scholar] [CrossRef]
  41. Fokkema, D.R.; Sleijpen, G.L.G.; Van der Vorst, H.A. Jacobi–Davidson style QR and QZ algorithms for the reduction of matrix pencils. SIAM J. Sci. Comput. 1998, 20, 94–125. [Google Scholar] [CrossRef] [Green Version]
  42. Desai, N.; Seghouane, A.K.; Palaniswami, M. Algorithms for two dimensional multi set canonical correlation analysis. Pattern Recognit. Lett. 2018, 111, 101–108. [Google Scholar] [CrossRef]
Figure 1. Convergence behavior of Algorithm 1 for computing the first canonical weight vector of Fou-Zer and Pix-Fou.
Figure 2. Cost in computing the first 10 canonical weight vectors of ORL (top), FERET (middle) and Yale (bottom) datasets with MINRES steps for the correction equation varying from 5 to 40.
Figure 3. Convergence behavior of JDCCA and JDQZ for computation of the first 10 canonical weight vectors of ORL (top), FERET (middle) and Yale (bottom) datasets with MINRES steps $n_g = 20$.
Table 1. Test problems.

Problems    ORL       FERET    Yale
m           10,304    6400     10,000
n           10,304    6400     10,000
d           200       600      75
