Article

Robust Tensor Learning for Multi-View Spectral Clustering

1 School of Science and Information Science, Qingdao Agricultural University, Qingdao 266109, China
2 Institute of Microscale Optoelectronics, Shenzhen University, Shenzhen 518060, China
* Authors to whom correspondence should be addressed.
Electronics 2024, 13(11), 2181; https://doi.org/10.3390/electronics13112181
Submission received: 14 April 2024 / Revised: 17 May 2024 / Accepted: 29 May 2024 / Published: 3 June 2024
(This article belongs to the Special Issue Multi-Modal Learning for Multimedia Data Analysis and Applications)

Abstract

Tensor-based multi-view spectral clustering methods are promising in practical clustering applications. However, most of the existing methods adopt the $\ell_{2,1}$ norm to depict the sparsity of the error matrix, and they usually ignore the global structure embedded in each single view, compromising the clustering performance. Here, we design a robust tensor learning method for multi-view spectral clustering (RTL-MSC), which employs the weighted tensor nuclear norm to regularize the essential tensor for exploiting the high-order correlations underlying multiple views and adopts the nuclear norm to constrain each frontal slice of the essential tensor as the block diagonal matrix. Simultaneously, a novel column-wise sparse norm, namely, $\ell_{2,p}$, is defined in RTL-MSC to measure the error tensor, making it sparser than the one derived by the $\ell_{2,1}$ norm. We design an effective optimization algorithm to solve the proposed model. Experiments on three widely used datasets demonstrate the superiority of our method.

1. Introduction

Spectral clustering (SC), which clusters data points using the eigenvectors of matrices, is one of the most widely used methods for data clustering. SC and its variants have achieved empirical success in many applications [1,2,3]. Generally, these existing methods investigate either how to improve clustering performance with a fixed affinity matrix [4,5] or how to construct a more robust affinity matrix so as to improve clustering performance [6,7]. For multi-view data, many multi-view spectral clustering (MSC) methods have been developed by pursuing a robust affinity matrix [8,9,10] or by designing an MSC algorithm with a fixed affinity matrix [11,12,13,14]. Here, we mainly focus on how to learn a more robust affinity matrix from multiple views.
To learn an affinity matrix from multiple views, Nie et al. [15] developed an MSC method with adaptive neighbors (MSCAN) by performing clustering and local structure learning simultaneously. Based on this, a two-step graph learning method was proposed by introducing a rank constraint on the Laplacian matrix [16]. Although these methods achieve a promising clustering performance, they learn the similarity matrix directly from the original multi-view data, which usually contain noise. To address this problem, subspace-based methods have been widely investigated. For example, Xie et al. [17] proposed a method with joint latent representation and similarity learning, which learns the common similarity matrix from an adaptive, consistent latent representation. By using the Frobenius norm to measure the reconstruction error, the latent representation of each view, and the consistency between the common similarity matrix and the latent representations, a multi-graph fusion method was developed for MSC [18]. Although these methods learn the class indicator matrix directly, they usually perform k-means on the obtained class indicator matrix to produce the final predicted labels. To deal with this issue, some unified clustering methods have been designed. For example, Zhong et al. [19] provided an MSC method with simultaneous consensus graph learning and discretization, which learns the similarity matrix and the discrete cluster label matrix in a unified framework. Tang et al. [20] designed a method to directly obtain the discrete clustering indicator matrix from the unified graph. Meanwhile, a multi-view kernel spectral clustering method [21] was proposed by formulating a weighted kernel canonical correlation analysis in a primal–dual optimization setting typical of least squares support vector machines, where the clustering scores are calculated directly.
Apart from these traditional shallow methods, some deep MSC methods have also been widely investigated. For example, an MSC network [22] was designed by incorporating the local invariance within every single view and the consistency across different views into a unified framework, where the local invariance is defined by a deep metric learning network. Zhao et al. [23] proposed a deep MSC via an ensemble model that fuses the similarity graphs from different views by ensemble clustering, adopting a graph auto-encoder to learn the common indicator matrix directly. Wang et al. [24] used a set of encoders to obtain the latent representation of each view and then combined the local structure, the global structure, and a discriminative constraint into a unified framework to learn a shared affinity matrix. Although these deep MSC methods achieve a promising clustering performance by exploring deep features, they are usually limited by ignoring the high-order correlations underlying multiple views.
With the aim of exploring the high-order correlations between multiple views, many tensor-based MSC methods have been investigated. These methods can be roughly categorized into two types according to how the tensor is constructed. One way is to construct rank-1 tensors by calculating the outer products of all view-specific feature maps, which are combined with mth-order tensors of interconnection weights to process all views simultaneously, leading to tensor-based high-order couplings and information fusion [25]. The other way is to construct a third-order tensor by stacking the representation matrices of all single views. These works mainly focus on how to approximate the tensor low-rank constraint and how to learn the common affinity matrix. One way to depict the tensor low-rank property is to use the Tucker decomposition [26]. Based on this, Chen et al. [27] recovered the latent representation of each view and adaptively learned the affinity matrix from these representations. Meanwhile, the tensor singular value decomposition (t-SVD)-based tensor nuclear norm (TNN) is also a widely used approximation of the tensor low-rank constraint. Based on this, Wu et al. [28] proposed an essential tensor learning (ETL) method for MSC (ETL-MSC). Inspired by low-rank and sparse decomposition, they recovered a latent tensor from the input tensor stacked by the transition probability matrix (TPM) of each view. With the obtained tensor, the affinity matrix is computed by averaging all frontal slices of this tensor. By introducing a probability constraint, Zhang et al. [29] improved the clustering performance of ETL-MSC. Although these t-SVD-TNN-based methods achieve a promising clustering performance, they are limited by the fact that the TNN is simply the sum of singular values, causing a poor rank approximation. To address this problem, MSC methods based on the nonconvex Laplace function [30], the t-SVD-based weighted TNN (WTNN) [31], a tailored tensor low-rank norm [32], and the weighted tensor Schatten-p norm [33] have been developed. Unlike these methods that pursue a tensor low-rank approximation, Chen et al. [34] focused on learning a common non-negative affinity matrix directly. Differently, Wang et al. [35] aimed to capture the types of errors that vary with the behavior inconsistency of each view by imposing a group $\ell_1$ norm on the error tensor.
Inspired by the fact that the t-SVD WTNN shrinks different singular values with different weights, leading to a satisfactory clustering performance, we propose a novel robust tensor learning method for MSC (RTL-MSC). The proposed RTL-MSC aims to learn the latent tensor from the input transition probability tensor. Then, the affinity matrix, which is input to the Markov chain SC algorithm to obtain the final clustering result, is computed from the frontal slices of the derived latent tensor. Unlike the aforementioned tensor-based methods, we impose a low-rank constraint on each frontal slice of the latent tensor to emphasize the low-rank property of each latent TPM. Simultaneously, we adopt the $\ell_{2,p}$ norm to regularize the error tensor to depict the noise corresponding to each sample. The contributions of this work can be summarized as follows:
  • With a novel integrated strategy, the weighted tensor nuclear norm-based tensor low-rank constraint, the matrix nuclear norm-based low-rank regularization, and the $\ell_{2,p}$ regularization are integrated into a unified framework, where the WTNN regularization depicts the information among different samples and different views, and the matrix nuclear norm regularization makes each frontal slice of the learned tensor approximately block-diagonal, capturing the geometry of each single view. Thus, the affinity matrix calculated from the latent tensor depicts the intrinsic clustering structure of the multi-view data.
  • A column-wise sparse norm, namely, the $\ell_{2,p}$ norm, is introduced to enhance the robustness of our model. Compared with the $\ell_{2,1}$ norm, the $\ell_{2,p}$ norm, which is invariant, continuous, and differentiable, can better restrict the sparsity of noised samples.

2. Background and Motivation

The Markov chain method is one of the classical clustering approaches; it takes a transition probability matrix as the input of a Markov chain-based SC algorithm. Based on this, a robust multi-view spectral clustering (RMSC) method [36] was proposed to recover the consistent TPM by low-rank and sparse decomposition. Let $\{X^v\}_{v=1}^m$ be the multi-view data including $m$ different features. $X^v = \{x_1^v, x_2^v, \ldots, x_n^v\} \in \mathbb{R}^{d_v \times n}$ ($v = 1, 2, \ldots, m$) denotes the features of $n$ samples in the $v$-th view, and $d_v$ is the feature dimension of the $v$-th view. RMSC first computes the similarity matrix $A^v \in \mathbb{R}^{n \times n}$ by $A_{i,j}^v = \exp(-\|x_i^v - x_j^v\|_2^2 / \tau)$ and then calculates the TPM $P^v$ by $P^v = (D^v)^{-1} A^v$, where $D^v$ is a diagonal matrix with the $i$-th diagonal element $D_{i,i}^v = \sum_{j=1}^n A_{i,j}^v$. Considering that the features in each view might be corrupted by noise, RMSC separates the transition probability matrices $P^v$ into two parts by the low-rank and sparse decomposition: the common transition probability matrix $P$, which is shared by all views, and the error matrices $E^v$, $v = 1, \ldots, m$, which encode the noise of the transition probability matrices. By assuming that the common TPM $P$ tends to be low rank while the error matrices are sparse, RMSC is formulated as follows:
$$\min_{P, E^v} \|P\|_* + \lambda \sum_{v=1}^m \|E^v\|_1 \quad \text{s.t.} \quad P^v = P + E^v,\; v = 1, 2, \ldots, m,\; P \geq 0,\; P\mathbf{1} = \mathbf{1} \tag{1}$$
where λ is a non-negative parameter. RMSC explores the consistent information among multiple views while ignoring the unique information of each single view.
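For concreteness, the TPM construction described above can be sketched in a few lines of NumPy; this is a minimal sketch, and the Gaussian bandwidth `tau` is left as a free parameter since the section does not fix its value.

```python
import numpy as np

def transition_probability_matrix(X, tau=1.0):
    """X: d x n feature matrix of one view; returns the n x n TPM P = D^{-1} A."""
    sq = np.sum(X**2, axis=0)                        # squared norms of the n columns
    dist2 = sq[:, None] + sq[None, :] - 2 * X.T @ X  # pairwise squared distances
    A = np.exp(-np.maximum(dist2, 0) / tau)          # Gaussian similarity matrix
    d_inv = 1.0 / A.sum(axis=1)                      # inverse of the diagonal degree matrix
    return d_inv[:, None] * A                        # row-normalize: each row sums to 1
```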
To address this limitation of RMSC, a tensor-based TPM learning method, termed ETL-MSC, was proposed by extending RMSC to a third-order tensor form. ETL-MSC stacks $P^v$ as the $v$-th frontal slice of the tensor $\mathcal{P} \in \mathbb{R}^{n \times n \times m}$ and takes $\mathcal{P}$ as the input tensor. By assuming that the essential tensor satisfies the tensor low-rank constraint and that the noise within the transition probability vectors of corrupted samples is not sparse while the corrupted samples themselves are sparse, ETL-MSC formulates the following problem:
$$\min_{\mathcal{C}, \mathcal{E}} \|\hat{\mathcal{C}}\|_{\circledast} + \lambda \|\mathcal{E}\|_{2,1} \quad \text{s.t.} \quad \mathcal{P} = \mathcal{C} + \mathcal{E},\; \hat{\mathcal{C}} = \varphi(\mathcal{C}) \in \mathbb{R}^{n \times m \times n} \tag{2}$$
where the function $\varphi(\mathcal{C})$ rotates the tensor $\mathcal{C} \in \mathbb{R}^{n \times n \times m}$ to obtain a new tensor $\hat{\mathcal{C}}$ of size $n \times m \times n$. Correspondingly, $\mathcal{C} = \varphi^{-1}(\hat{\mathcal{C}})$. $\|\hat{\mathcal{C}}\|_{\circledast} = \sum_{j=1}^{n} \|\hat{C}_f^{(j)}\|_* = \sum_{j=1}^{n} \sum_{i=1}^{\min(n,m)} \sigma_i(\hat{C}_f^{(j)})$ is the t-SVD-based TNN of $\hat{\mathcal{C}}$, where $\hat{\mathcal{C}}_f = \mathrm{fft}(\hat{\mathcal{C}}, [\,], 3)$ represents the discrete Fourier transform (DFT) along the third dimension of $\hat{\mathcal{C}}$, $\hat{C}_f^{(j)}$ is the $j$-th frontal slice of $\hat{\mathcal{C}}_f$, and $\sigma_i(\hat{C}_f^{(j)})$ denotes the $i$-th largest singular value of $\hat{C}_f^{(j)}$. The $\ell_{2,1}$ norm of $\mathcal{E}$ is defined as $\|\mathcal{E}\|_{2,1} \triangleq \|\mathrm{unfold}(\mathcal{E})\|_{2,1}$, where $\mathrm{unfold}(\mathcal{E}) = [E^{(1)}; E^{(2)}; \ldots; E^{(m)}]$, enforcing jointly consistent magnitudes for the corresponding columns of all frontal slices of the error tensor.
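As a reference for the definition above, a minimal sketch of the t-SVD-based TNN of a rotated tensor: a DFT along the third mode, then the sum of the nuclear norms of all frontal slices in the Fourier domain.

```python
import numpy as np

def tsvd_tnn(C_hat):
    """C_hat: n x m x n rotated tensor; returns its t-SVD-based TNN."""
    C_f = np.fft.fft(C_hat, axis=2)          # DFT along the third dimension
    tnn = 0.0
    for j in range(C_f.shape[2]):            # loop over the n frontal slices
        s = np.linalg.svd(C_f[:, :, j], compute_uv=False)
        tnn += s.sum()                        # nuclear norm of the j-th Fourier slice
    return tnn
```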
Two advantages are brought by the t-SVD-based TNN regularization and the rotation operation. First, ETL-MSC can preserve the relationship among all views. Second, the computation complexity is reduced substantially. However, the TNN minimization problem shrinks each singular value by the same amount. That is, each singular value is considered to be equally important, which contradicts the following facts. As we know, the non-zero singular values of an arbitrary image differ greatly. Generally, the larger singular values are considered more important than the smaller ones because they are associated with the salient parts of the image. Thus, to preserve the salient parts, the larger singular values should be shrunk less. Conversely, for a noised image, the larger singular values may carry undesirable information in the presence of noise (such as illumination). In either case, different singular values should be shrunk differently. To overcome this limitation, a t-SVD-based WTNN was proposed and used to formulate the following model for clustering multi-view data [37]:
$$\min_{\mathcal{C}, \mathcal{E}} \|\hat{\mathcal{C}}\|_{w,\circledast} + \lambda \|\mathcal{E}\|_{2,1} \quad \text{s.t.} \quad \mathcal{P} = \mathcal{C} + \mathcal{E} \tag{3}$$
where $\|\hat{\mathcal{C}}\|_{w,\circledast} = \sum_{j=1}^{n} \|\hat{C}_f^{(j)}\|_{w,*} = \sum_{j=1}^{n} \sum_{i=1}^{\min(n,m)} w_i \, \sigma_i(\hat{C}_f^{(j)})$ is the t-SVD-based WTNN of $\hat{\mathcal{C}}$, and $w$ is the weight vector, with which different singular values can be shrunk differently.
After the essential tensor $\mathcal{C}$ is derived, the common TPM $P^*$, which is input to the Markov chain-based SC algorithm to obtain the final clustering result, is often computed by $P^* = \frac{1}{m}\sum_{v=1}^{m} C^{(v)}$. In other words, each frontal slice of the recovered tensor $\mathcal{C}$ is considered to be the intrinsic TPM of a single view, which tends to be a block-diagonal matrix. Such information is generally ignored by the aforementioned tensor methods.

3. The Proposed Method

With the aim of learning a robust TPM in a third-order tensor form, we design a novel robust tensor learning method for multi-view spectral clustering (RTL-MSC). Similar to MGL-WTNN, we also separate the input tensor $\mathcal{P}$ into two parts: the intrinsic tensor $\mathcal{C}$ and the error tensor $\mathcal{E}$. For the intrinsic tensor $\mathcal{C}$, we adopt the t-SVD-based WTNN to regularize its rotated tensor to capture the high-order correlations underlying multiple views. Meanwhile, we introduce a low-rank constraint on each frontal slice of $\mathcal{C}$ to emphasize its block-diagonal structure. Moreover, we utilize a novel $\ell_{2,p}$ norm to emphasize the sparsity of the error tensor. Accordingly, the mathematical model is designed as follows:
$$\min_{\mathcal{C}, \mathcal{E}} \|\hat{\mathcal{C}}\|_{w,\circledast} + \alpha \sum_{v=1}^m \|E^{(v)}\|_{2,p} + \beta \sum_{v=1}^m \|C^{(v)}\|_* \quad \text{s.t.} \quad \mathcal{P} = \mathcal{C} + \mathcal{E},\; \hat{\mathcal{C}} = \varphi(\mathcal{C}) \in \mathbb{R}^{n \times m \times n} \tag{4}$$
where $\alpha > 0$ and $\beta > 0$ are two balance parameters. The function $\varphi(\mathcal{C})$ rotates the tensor $\mathcal{C} \in \mathbb{R}^{n \times n \times m}$ to size $n \times m \times n$. Correspondingly, $\mathcal{C} = \varphi^{-1}(\hat{\mathcal{C}})$, where $\varphi^{-1}(\cdot)$ represents the inverse operation of $\varphi(\cdot)$. $\|\cdot\|_{2,p}$ denotes the $\ell_{2,p}$ norm, defined as $\|M\|_{2,p} = \sum_{i=1}^n \|M_{:,i}\|_2^p = \sum_{i=1}^n \big(\sum_{j=1}^n (M_{j,i})^2\big)^{p/2}$ with $0 < p \leq 1$. Obviously, when $p = 1$, the $\ell_{2,p}$ norm reduces to the $\ell_{2,1}$ norm. The solution obtained with the $\ell_{2,p}$ norm is sparser than the one obtained with the $\ell_{2,1}$ norm; thus, the second term makes the model more robust to noised samples. The third term in Equation (4) makes each frontal slice of the essential tensor a block-diagonal matrix, which captures the intrinsic clustering structure of the multi-view data.
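A direct transcription of the $\ell_{2,p}$ definition above, i.e., the sum of the $p$-th powers of the column-wise $\ell_2$ norms, might look as follows; at $p = 1$ it reproduces the $\ell_{2,1}$ norm.

```python
import numpy as np

def l2p_norm(M, p):
    """Column-wise l_{2,p} norm: sum of the p-th powers of the column l2 norms, 0 < p <= 1."""
    return np.sum(np.linalg.norm(M, axis=0) ** p)
```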
To solve Equation (4), the alternating direction method of multipliers (ADMM) is used. ADMM reformulates the constrained problem as an unconstrained one by constructing an augmented Lagrangian function. By introducing an intermediate tensor $\mathcal{T}$, the Lagrangian function is written as follows:
$$L(\mathcal{C}, \mathcal{E}, \mathcal{T}, \mathcal{Y}, \mathcal{W}) = \|\hat{\mathcal{T}}\|_{w,\circledast} + \alpha \sum_{v=1}^m \|E^{(v)}\|_{2,p} + \beta \sum_{v=1}^m \|C^{(v)}\|_* + \frac{\mu_1}{2} \left\| \mathcal{P} - \mathcal{C} - \mathcal{E} + \frac{\mathcal{Y}}{\mu_1} \right\|_F^2 + \frac{\mu_2}{2} \left\| \mathcal{T} - \mathcal{C} + \frac{\mathcal{W}}{\mu_2} \right\|_F^2 \tag{5}$$
where $\mu_1 > 0$ and $\mu_2 > 0$ are two penalty parameters, and $\mathcal{Y}$ and $\mathcal{W}$ are two Lagrange multipliers. Then, each variable is updated while fixing the others.
Solving $\mathcal{T}$: by fixing $\mathcal{C}$, $\mathcal{E}$, and $\mathcal{W}$, the sub-problem of $\mathcal{T}$ can be reformulated as follows:
$$\min_{\mathcal{T}} \|\hat{\mathcal{T}}\|_{w,\circledast} + \frac{\mu_2}{2} \left\| \mathcal{T} - \mathcal{C} + \frac{\mathcal{W}}{\mu_2} \right\|_F^2 \;\Leftrightarrow\; \min_{\hat{\mathcal{T}}} \|\hat{\mathcal{T}}\|_{w,\circledast} + \frac{\mu_2}{2} \left\| \hat{\mathcal{T}} - \varphi\!\left( \mathcal{C} - \frac{\mathcal{W}}{\mu_2} \right) \right\|_F^2 \tag{6}$$
Denoting the tensor singular value decomposition of $\varphi(\mathcal{C} - \frac{\mathcal{W}}{\mu_2})$ as $\mathcal{U} * \mathcal{S} * \mathcal{V}^T$ and $\mathcal{S}_f = \mathrm{fft}(\mathcal{S}, [\,], 3)$, the optimal solution of Equation (6) is $\mathcal{T}^* = \mathcal{U} * \mathrm{ifft}(\Gamma_{\frac{w}{\mu_2}}(\mathcal{S}_f)) * \mathcal{V}^T$, where $\Gamma_{\frac{w}{\mu_2}}(\mathcal{S}_f)$ is the tensor stacked by $\Gamma_{\frac{w}{\mu_2}}(\mathcal{S}_f^{(1)}), \Gamma_{\frac{w}{\mu_2}}(\mathcal{S}_f^{(2)}), \ldots, \Gamma_{\frac{w}{\mu_2}}(\mathcal{S}_f^{(n)})$, and $\Gamma_{\frac{w}{\mu_2}}(\mathcal{S}_f^{(k)})$, $k = 1, 2, \ldots, n$, is defined as $\Gamma_{\frac{w}{\mu_2}}(\mathcal{S}_f^{(k)}) = \mathrm{diag}(\theta_1, \theta_2, \ldots, \theta_{\min(n,m)})$ with $\theta_i = \max(\mathcal{S}_{f,ii}^{(k)} - \frac{w_i}{\mu_2}, 0)$ [37]. Following [37], the weight vector $w$ is adaptively computed by $w_i = \frac{\sqrt{mn}}{\sigma_i(\mathcal{S}_f^{(k)}) + \epsilon}$, where $\epsilon$ is set to be very small to avoid division by a zero singular value.
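A sketch of this $\mathcal{T}$-update: weighted singular value thresholding of every Fourier-domain frontal slice. The weight formula `w = sqrt(m*n)/(s + eps)` is our reading of [37] and should be treated as an assumption.

```python
import numpy as np

def solve_T(B, mu2, eps=1e-16):
    """B = phi(C - W/mu2), shape n x m x n; returns the updated rotated tensor T_hat."""
    n1, m, n3 = B.shape
    B_f = np.fft.fft(B, axis=2)
    T_f = np.zeros_like(B_f)
    for k in range(n3):
        U, s, Vh = np.linalg.svd(B_f[:, :, k], full_matrices=False)
        w = np.sqrt(m * n1) / (s + eps)           # adaptive weights (assumed form)
        s_shrunk = np.maximum(s - w / mu2, 0.0)   # weighted soft-thresholding of singular values
        T_f[:, :, k] = (U * s_shrunk) @ Vh
    return np.real(np.fft.ifft(T_f, axis=2))      # back to the original domain
```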
Solving $\mathcal{E}$: for each view, $E^{(v)}$, $v = 1, 2, \ldots, m$, is independent. By fixing $\mathcal{T}$, $\mathcal{C}$, and $\mathcal{Y}$, the sub-problem of $E^{(v)}$ is written as follows:
$$\min_{E^{(v)}} \alpha \|E^{(v)}\|_{2,p} + \frac{\mu_1}{2} \left\| P^{(v)} - C^{(v)} - E^{(v)} + \frac{Y^{(v)}}{\mu_1} \right\|_F^2 \;\triangleq\; \min_{E} \frac{\alpha}{\mu_1} \|E\|_{2,p} + \frac{1}{2} \|E - Q\|_F^2 = \min_{E} \sum_{i=1}^n \left( \frac{\alpha}{\mu_1} \|E_{:,i}\|_2^p + \frac{1}{2} \|E_{:,i} - Q_{:,i}\|_2^2 \right) \tag{7}$$
where $Q = [Q^{(1)}; Q^{(2)}; \ldots; Q^{(m)}]$ and $Q^{(i)}$ is the $i$-th frontal slice of the tensor $\mathcal{P} - \mathcal{C} + \frac{\mathcal{Y}}{\mu_1}$. Then, $E$ is solved column by column, and Equation (7) is rewritten as follows:
$$\min_{E_{:,i}} \frac{\alpha}{\mu_1} \|E_{:,i}\|_2^p + \frac{1}{2} \|E_{:,i} - Q_{:,i}\|_2^2 \tag{8}$$
Here, $E_{:,i}$ can be treated as a special matrix, so we can perform a thin SVD on it. As observed, $E_{:,i}$ has exactly one singular value, given by the following:
$$\sigma(E_{:,i}) = \sqrt{E_{:,i}^T E_{:,i}} = \|E_{:,i}\|_2 \tag{9}$$
Thus, Equation (8) can be rewritten as follows:
$$\min_{E_{:,i}} \frac{\alpha}{\mu_1} \left( \sigma(E_{:,i}) \right)^p + \frac{1}{2} \|E_{:,i} - Q_{:,i}\|_2^2 \tag{10}$$
According to the work in [38], the solution of Equation (10) is as follows:
$$E_{:,i} = u_i \, \sigma^*(E_{:,i}) \, v_i^T \tag{11}$$
where $u_i$ and $v_i$ are the left and right singular vectors of $Q_{:,i}$, respectively. Additionally,
$$\sigma^*(E_{:,i}) = \arg\min_{x \geq 0} \; \tau x^p + \frac{1}{2} \left( x - \sigma(Q_{:,i}) \right)^2 \tag{12}$$
where $\tau \triangleq \frac{\alpha}{\mu_1}$ and $\sigma(Q_{:,i}) = \|Q_{:,i}\|_2$ [39]. Equation (12) can be solved by Lemma 1 in [33]. Correspondingly, $E_{:,i}$ is derived.
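The paper solves Equation (12) via Lemma 1 in [33]; a standard solver for exactly this scalar problem is the generalized soft-thresholding (GST) iteration of Zuo et al., so the sketch below uses that substitution, together with the rank-one reconstruction of the column $E_{:,i}$.

```python
import numpy as np

def gst(sigma, tau, p, iters=10):
    """argmin_{x >= 0} tau*x^p + 0.5*(x - sigma)^2 for 0 < p <= 1 (GST iteration)."""
    # Below this threshold the minimizer is exactly zero (for p = 1 it reduces to tau).
    base = 2 * tau * (1 - p)
    thr = base ** (1 / (2 - p)) + tau * p * base ** ((p - 1) / (2 - p))
    if sigma <= thr:
        return 0.0
    x = sigma
    for _ in range(iters):               # fixed-point iteration on the stationarity condition
        x = sigma - tau * p * x ** (p - 1)
    return x

def solve_E_column(q, tau, p):
    """Update one column E_{:,i} given the corresponding column q = Q_{:,i}."""
    norm_q = np.linalg.norm(q)           # sigma(Q_{:,i}) = ||Q_{:,i}||_2
    if norm_q == 0:
        return np.zeros_like(q)
    return (gst(norm_q, tau, p) / norm_q) * q   # rescale q to the shrunk magnitude
```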
Solving $C^{(v)}$: by fixing $\mathcal{T}$, $\mathcal{E}$, $\mathcal{W}$, and $\mathcal{Y}$, Equation (5) w.r.t. $\mathcal{C}$ can be written as follows:
$$\arg\min_{\mathcal{C}} \; \beta \sum_{v=1}^m \|C^{(v)}\|_* + \frac{\mu_1}{2} \left\| \mathcal{C} - \left( \mathcal{P} - \mathcal{E} + \frac{\mathcal{Y}}{\mu_1} \right) \right\|_F^2 + \frac{\mu_2}{2} \left\| \mathcal{C} - \left( \mathcal{T} + \frac{\mathcal{W}}{\mu_2} \right) \right\|_F^2 \tag{13}$$
For each $v$, $v = 1, 2, \ldots, m$, $C^{(v)}$ is independent, so Equation (13) can be rewritten as follows:
$$\arg\min_{C^{(v)}} \; \beta \|C^{(v)}\|_* + \frac{\mu_1}{2} \|C^{(v)} - M_1^{(v)}\|_F^2 + \frac{\mu_2}{2} \|C^{(v)} - M_2^{(v)}\|_F^2 = \arg\min_{C^{(v)}} \; \beta \|C^{(v)}\|_* + \frac{\mu_1 + \mu_2}{2} \left\| C^{(v)} - \frac{\mu_1 M_1^{(v)} + \mu_2 M_2^{(v)}}{\mu_1 + \mu_2} \right\|_F^2 \tag{14}$$
where $M_1^{(v)}$ is the $v$-th frontal slice of $\mathcal{M}_1 = \mathcal{P} - \mathcal{E} + \frac{\mathcal{Y}}{\mu_1}$ and $M_2^{(v)}$ is the $v$-th frontal slice of $\mathcal{M}_2 = \mathcal{T} + \frac{\mathcal{W}}{\mu_2}$. According to [40,41], the solution is $C^{(v)} = U \Gamma_{\frac{\beta}{\mu_1 + \mu_2}}(\Sigma) V^T$, where $U \Sigma V^T$ is the singular value decomposition of $\frac{\mu_1 M_1^{(v)} + \mu_2 M_2^{(v)}}{\mu_1 + \mu_2}$, and $\Gamma_\theta(x) = \max(x - \theta, 0) + \min(x + \theta, 0)$ is the shrinkage operator.
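This step is plain singular value thresholding applied to a weighted average of the two targets; a short sketch:

```python
import numpy as np

def svt(M, theta):
    """Singular value thresholding: U * max(Sigma - theta, 0) * V^T."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - theta, 0.0)) @ Vh

def solve_C_slice(M1_v, M2_v, beta, mu1, mu2):
    """Update C^{(v)} from the weighted average of the targets M1^{(v)} and M2^{(v)}."""
    target = (mu1 * M1_v + mu2 * M2_v) / (mu1 + mu2)
    return svt(target, beta / (mu1 + mu2))
```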
In summary, Equation (4) can be solved by Algorithm 1.
Algorithm 1 RTL-MSC Algorithm
Input: Multi-view data $X^j$, $j = 1, 2, \ldots, m$; parameters $\alpha$ and $\beta$; sample number $n$; and cluster number $c$
1: Calculate the similarity matrix $A^j$ of the $j$-th view by $A_{i,k}^j = \exp(-\|x_i^j - x_k^j\|_2^2 / \tau)$.
2: Compute the degree matrix $D^j$ with the $(i,i)$-th element $D_{i,i}^j = \sum_{k=1}^n A_{i,k}^j$, $i = 1, 2, \ldots, n$, and all other elements set to 0.
3: Compute the TPM of the $j$-th view by $P^j = (D^j)^{-1} A^j$.
4: Construct the input tensor $\mathcal{P}$ by stacking $P^j$ as its $j$-th frontal slice.
5: Initialize $\mathcal{T} = \mathcal{C} = \mathcal{E} = \mathcal{Y} = \mathcal{W} = 0$; $\mu_1 = \mu_2 = 10^{-5}$, $\eta = 2$, $\mu_{\max} = 10^{10}$, $\epsilon = 10^{-7}$.
6: repeat
7:   Update $\mathcal{T}$ by solving Equation (6).
8:   Update $\mathcal{C}$ by solving Equation (14).
9:   Update the Lagrange multiplier $\mathcal{W}$ by $\mathcal{W} = \mathcal{W} + \mu_2(\mathcal{T} - \mathcal{C})$.
10:  Update $\mathcal{E}$ by solving Equation (8).
11:  Update the Lagrange multiplier $\mathcal{Y}$ by $\mathcal{Y} = \mathcal{Y} + \mu_1(\mathcal{P} - \mathcal{C} - \mathcal{E})$.
12:  Update the parameters $\mu_1$ and $\mu_2$ by $\mu_1 = \mu_2 = \min(\eta \mu_1, \mu_{\max})$.
13:  Check the convergence condition: $\|\mathcal{P} - \mathcal{C} - \mathcal{E}\|_{\infty} < \epsilon$.
14: until converged
15: Obtain the affinity matrix $P^*$ as $P^* = \sum_{v=1}^m \left( |C^{(v)}| + |C^{(v)T}| \right) / (2m)$.
16: Input $P^*$ and $c$ to the Markov chain-based SC algorithm.
Output: The predicted label.
For Algorithm 1, the computation is mainly consumed by the optimization of $\mathcal{T}$, $\mathcal{C}$, and $\mathcal{E}$. Solving $\mathcal{T} \in \mathbb{R}^{n \times m \times n}$ involves the FFT and inverse FFT of an $n \times m \times n$ tensor along the third dimension and the SVD of each of its frontal slices, leading to a complexity of $O(m^2 n^2 + m n^2 \log(n))$ at each iteration. $\mathcal{C} \in \mathbb{R}^{n \times n \times m}$ is solved slice by slice; for each frontal slice $C^{(v)}$, the SVD of an $n \times n$ matrix is calculated at a cost of $O(n^3)$, so solving $\mathcal{C}$ takes $O(m n^3)$ at each iteration. The $\ell_{2,p}$ norm minimization problem requires $O(m n^2)$ at each iteration. Consequently, denoting the number of iterations by $K$, the proposed algorithm takes $O(K m n^2 (m + n + \log(n)))$ in total.
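To tie the updates together, a compact driver mirroring Algorithm 1 might look as follows; `solve_T`, `solve_E_column`, and `solve_C_slice` refer to the sketches above, the rotation `phi` is assumed to be a simple axis permutation (the paper does not spell out $\varphi$ explicitly), and the default $p = 0.98$ follows the best value observed in Section 4.4.

```python
import numpy as np

def phi(C):      return C.transpose(0, 2, 1)   # n x n x m -> n x m x n (assumed rotation)
def phi_inv(Ch): return Ch.transpose(0, 2, 1)  # inverse of the assumed rotation

def rtl_msc(P, alpha, beta, p=0.98, eta=2.0, mu_max=1e10, tol=1e-7, max_iter=200):
    """P: n x n x m input TPM tensor; returns the affinity matrix P*."""
    n, _, m = P.shape
    C = np.zeros_like(P); E = np.zeros_like(P)
    Y = np.zeros_like(P); W = np.zeros_like(P)
    mu1 = mu2 = 1e-5
    for _ in range(max_iter):
        T = phi_inv(solve_T(phi(C - W / mu2), mu2))          # step 7: WTNN update
        for v in range(m):                                   # step 8: per-slice SVT
            M1 = P[:, :, v] - E[:, :, v] + Y[:, :, v] / mu1
            M2 = T[:, :, v] + W[:, :, v] / mu2
            C[:, :, v] = solve_C_slice(M1, M2, beta, mu1, mu2)
        W = W + mu2 * (T - C)                                # step 9
        Q = P - C + Y / mu1
        for v in range(m):                                   # step 10: column-wise l_{2,p}
            for i in range(n):
                E[:, i, v] = solve_E_column(Q[:, i, v], alpha / mu1, p)
        Y = Y + mu1 * (P - C - E)                            # step 11
        mu1 = mu2 = min(eta * mu1, mu_max)                   # step 12
        if np.max(np.abs(P - C - E)) < tol:                  # step 13
            break
    # Affinity matrix from symmetrized, averaged frontal slices (step 15)
    P_star = sum(np.abs(C[:, :, v]) + np.abs(C[:, :, v]).T for v in range(m)) / (2 * m)
    return P_star
```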

4. Experiments

To evaluate the performance of the proposed method, we conduct experiments on three widely used datasets. All experiments are performed in Matlab R2021a.

4.1. Competitors

The proposed method is compared with a single-view spectral clustering method and five state-of-the-art multi-view clustering methods. (1) SC$_{best}$ performs spectral clustering on each single view and reports the best result among all views. (2) RMSC learns a common low-rank transition probability matrix via low-rank and sparse matrix decomposition. (3) MSCAN incorporates common similarity matrix learning and predicted label matrix learning into a unified framework, which allocates a weight for each view automatically without any additional parameters. (4) ETL-MSC recovers an essential tensor via tensor RPCA, which utilizes the t-SVD-based TNN to approximate the tensor low rank. (5) ATPML-MSC [34] learns a consistent TPM directly under the tensor RPCA framework; the obtained TPM is non-negative and symmetric without any postprocessing. (6) MGL-WTNN [37] recovers a low-rank tensor by using the t-SVD-based WTNN to approximate the tensor low rank.

4.2. Datasets

We take Flower (https://www.robots.ox.ac.uk/~vgg/data/flowers), Coil20 (https://www.cs.columbia.edu/CAVE/software/softlib), and NUS-WIDE (https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html) to evaluate the clustering performance. The Flower dataset consists of 1360 samples from 17 different flower categories; these images have large scale, pose, and lighting variations. Following [34], three different features, 1360d-color, 1360d-texture, and 1360d-shape, are extracted. Coil20 is an object dataset containing 1440 images from 20 object categories; we employ the 4096d intensity features and 3304d LBP features as the two views. NUS-WIDE is a real-world web image dataset from the National University of Singapore. We adopt six types of low-level features extracted from 2400 images of 12 concepts, including a 64-D color histogram, 144-D color correlogram, 75-D edge direction histogram, 128-D wavelet texture, 225-D block-wise color moments, and 500-D bag of words based on SIFT descriptors.

4.3. Experimental Process

To quantitatively evaluate the performance of all these clustering methods, we adopt five widely used metrics: accuracy (ACC), normalized mutual information (NMI), Purity, F-score, and average entropy (AVE). For ACC, NMI, Purity, and F-score, larger values represent a better clustering performance; for AVE, lower values indicate a better clustering performance.
For the compared methods, we download their codes and perform experiments following the corresponding papers. For RMSC, ETL-MSC, and MGL-WTNN, one parameter is involved, selected from $[0.0001, 1]$. The two parameters in ATPML-MSC are varied in the range $[0.0001, 0.5]$. For each method, we fix the parameters to the values yielding the best accuracy and repeat the experiments ten times to obtain the mean value and standard deviation.

4.4. Experimental Results and Analysis

The clustering results on all three datasets are shown in Table 1, Table 2 and Table 3, where the best results are highlighted in bold and the second-best ones are underlined.
As can be seen from Table 1, Table 2 and Table 3, the proposed RTL-MSC method outperforms the other methods, including the recently proposed ATPML-MSC and MGL-WTNN, in almost all cases. For example, RTL-WTNN improves ACC, Purity, and F-score by about 2.56%, 1.90%, and 2.14%, respectively, over the most competitive method, ATPML-MSC, on the Coil20 dataset. This indicates that the learned affinity matrix is more effective for the multi-view spectral clustering task. Compared with MGL-WTNN, RTL-MSC improves ACC, NMI, and Purity by 11.16%, 5.66%, and 8.87% on the Coil20 dataset, which means that the global geometric structure in each single view and the $\ell_{2,p}$ norm-based error constraint are useful for multi-view clustering. ATPML-MSC outperforms ETL-MSC on all three datasets, mainly because its affinity matrix is learned in a one-step process with a probability constraint. MGL-WTNN outperforms ETL-MSC on all three datasets for almost all metrics, demonstrating that shrinking each singular value with an adaptively learned weight improves the clustering performance efficiently. On the NUS-WIDE dataset, the single-view SC, RMSC, and MSCAN obtain an unsatisfactory clustering performance; the major reason may be that these three methods do not well depict the intrinsic structure of each view and the noise.
Three parameters, $\alpha$, $\beta$, and $p$, are involved in Equation (4), and they directly affect the performance of the proposed RTL-MSC. To investigate the influence of $\alpha$ and $\beta$, we set $p = 1$ and select $\alpha$ and $\beta$ from $[0.01, 0.5]$ and $[0.0001, 0.01]$, respectively. We report the ACC and NMI versus different combinations of $\alpha$ and $\beta$ on the Flower and Coil20 datasets in Figure 1. When $\alpha$ varies in $[0.04, 0.5]$, both ACC and NMI fluctuate only slightly with $\beta$, and RTL-WTNN achieves promising clustering results on both datasets. This means that the proposed RTL-WTNN is insensitive to $\alpha$ and $\beta$ when they are varied within a suitable range.
Moreover, we fix $\alpha$ and $\beta$ and vary $p$ in the range $[0.9, 1]$ to investigate the influence of the $\ell_{2,p}$ norm. As shown in Figure 2, the highest values of ACC, NMI, and Purity are achieved at $p = 0.98$, which illustrates that the $\ell_{2,p}$ norm improves the clustering performance.

4.5. Ablation Study

To investigate the effect of exploiting the low-rank structure of each frontal slice of the essential tensor and the $\ell_{2,p}$ norm-based error constraint, we perform an ablation study. First, we remove the third term in Equation (4) and denote the derived model as RTL-WTNN-1. Then, we also perform experiments using the $\ell_{2,1}$ norm to regularize the error tensor, which is termed RTL-WTNN-2. The results are shown in Figure 3. It can be seen that RTL-WTNN outperforms RTL-WTNN-1 and RTL-WTNN-2 on all three datasets.

5. Conclusions

We develop a novel tensor learning method for multi-view spectral clustering, which depicts well the noise embedded in each view by using the $\ell_{2,p}$ norm to constrain the error tensor. Additionally, by adopting the tensor low-rank constraint and the low-rank regularization of each frontal slice of the essential tensor, the common affinity matrix computed from the essential tensor captures well the intrinsic clustering structure of the multi-view data. Experiments on three widely used datasets show that the proposed RTL-MSC improves ACC, NMI, Purity, and F-score by about 1.02–11.16%, 1.39–12.02%, 1.02–8.87%, and 1.99–11.42%, respectively, over the baseline MGL-WTNN, which illustrates the superiority of the proposed model. We plan to investigate a fast multi-view spectral clustering algorithm for clustering big data in future work.

Author Contributions

Software, Z.L.; Writing—original draft, D.X.; Writing—review & editing, Y.S.; Supervision, W.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China grant number 62175159, the National Key Research and Development Program of China grant number 2023YFF0715300, the Natural Science Foundation of Shandong Province grant number ZR202102180986, the Natural Science Foundation of Guangdong Province grant number 2023A1515012888, Qingdao Agricultural University grant number 665/1120051, and the Qingchuang Talents Induction Program of Shandong Higher Education Institution grant number 018-1622001. The APC was funded by the Natural Science Foundation of Shandong Province grant number ZR202102180986.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments and suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Shi, J.; Malik, J. Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 888–905. [Google Scholar]
  2. Ng, A.; Jordan, M.; Weiss, Y. On spectral clustering: Analysis and an algorithm. Adv. Neural Inf. Process. Syst. 2001, 14, 849–856. [Google Scholar]
  3. Fan, J.; Tu, Y.; Zhang, Z.; Zhao, M.; Zhang, H. A simple approach to automated spectral clustering. Adv. Neural Inf. Process. Syst. 2022, 35, 9907–9921. [Google Scholar]
  4. Yan, Y.; Shen, C.; Wang, H. Efficient semidefinite spectral clustering via Lagrange duality. IEEE Trans. Image Process. 2014, 23, 3522–3534. [Google Scholar] [CrossRef] [PubMed]
  5. Meila, M.; Shi, J. Learning segmentation by random walks. Adv. Neural Inf. Process. Syst. 2000, 13, 837–843. [Google Scholar]
  6. Roweis, S.T.; Saul, L.K. Nonlinear dimensionality reduction by locally linear embedding. Science 2000, 290, 2323–2326. [Google Scholar] [CrossRef] [PubMed]
  7. Elhamifar, E.; Vidal, R. Sparse subspace clustering: Algorithm, theory, and applications. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 2765–2781. [Google Scholar] [CrossRef] [PubMed]
  8. Nie, F.; Cai, G.; Li, X. Multi-view clustering and semi-supervised classification with adaptive neighbours. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31. [Google Scholar]
  9. Zhou, S.; Liu, X.; Liu, J.; Guo, X.; Zhao, Y.; Zhu, E.; Zhai, Y.; Yin, J.; Gao, W. Multi-view spectral clustering with optimal neighborhood Laplacian matrix. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 6965–6972. [Google Scholar]
  10. Alzate, C.; Suykens, J.A. Multiway spectral clustering with out-of-sample extensions through weighted kernel PCA. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 32, 335–347. [Google Scholar] [CrossRef] [PubMed]
  11. Xia, T.; Tao, D.; Mei, T.; Zhang, Y. Multiview spectral embedding. IEEE Trans. Syst. Man, Cybern. Part B Cybern. 2010, 40, 1438–1446. [Google Scholar]
  12. Lu, C.; Yan, S.; Lin, Z. Convex sparse spectral clustering: Single-view to multi-view. IEEE Trans. Image Process. 2016, 25, 2833–2843. [Google Scholar] [CrossRef]
  13. El Hajjar, S.; Dornaika, F.; Abdallah, F. Multi-view spectral clustering via constrained nonnegative embedding. Inf. Fusion 2022, 78, 209–217. [Google Scholar] [CrossRef]
  14. Zong, L.; Zhang, X.; Liu, X.; Yu, H. Weighted multi-view spectral clustering based on spectral perturbation. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
  15. Nie, F.; Li, J.; Li, X. Parameter-free auto-weighted multiple graph learning: A framework for multiview clustering and semi-supervised classification. In Proceedings of the IJCAI, New York, NY, USA, 9–15 July 2016; Volume 9, pp. 1881–1887. [Google Scholar]
  16. Zhan, K.; Zhang, C.; Guan, J.; Wang, J. Graph learning for multiview clustering. IEEE Trans. Cybern. 2017, 48, 2887–2895. [Google Scholar] [CrossRef] [PubMed]
  17. Xie, D.; Zhang, X.; Gao, Q.; Han, J.; Xiao, S.; Gao, X. Multiview clustering by joint latent representation and similarity learning. IEEE Trans. Cybern. 2019, 50, 4848–4854. [Google Scholar] [CrossRef] [PubMed]
  18. Kang, Z.; Shi, G.; Huang, S.; Chen, W.; Pu, X.; Zhou, J.T.; Xu, Z. Multi-graph fusion for multi-view spectral clustering. Knowl.-Based Syst. 2020, 189, 105102. [Google Scholar] [CrossRef]
  19. Zhong, G.; Shu, T.; Huang, G.; Yan, X. Multi-view spectral clustering by simultaneous consensus graph learning and discretization. Knowl.-Based Syst. 2022, 235, 107632. [Google Scholar] [CrossRef]
  20. Tang, C.; Li, Z.; Wang, J.; Liu, X.; Zhang, W.; Zhu, E. Unified one-step multi-view spectral clustering. IEEE Trans. Knowl. Data Eng. 2022, 35, 6449–6460. [Google Scholar] [CrossRef]
  21. Houthuys, L.; Langone, R.; Suykens, J.A. Multi-view kernel spectral clustering. Inf. Fusion 2018, 44, 46–56. [Google Scholar] [CrossRef]
  22. Huang, Z.; Zhou, J.T.; Peng, X.; Zhang, C.; Zhu, H.; Lv, J. Multi-view Spectral Clustering Network. In Proceedings of the IJCAI, Macao, 10–16 August 2019; Volume 2, p. 4. [Google Scholar]
  23. Zhao, M.; Yang, W.; Nie, F. Deep multi-view spectral clustering via ensemble. Pattern Recognit. 2023, 144, 109836. [Google Scholar] [CrossRef]
  24. Wang, Q.; Cheng, J.; Gao, Q.; Zhao, G.; Jiao, L. Deep multi-view subspace clustering with unified and discriminative learning. IEEE Trans. Multimed. 2020, 23, 3483–3493. [Google Scholar] [CrossRef]
  25. Tao, Q.; Tonin, F.; Patrinos, P.; Suykens, J.A. Tensor-based multi-view spectral clustering via shared latent space. Inf. Fusion 2024, 108, 102405. [Google Scholar] [CrossRef]
  26. Goldfarb, D.; Qin, Z. Robust low-rank tensor recovery: Models and algorithms. SIAM J. Matrix Anal. Appl. 2014, 35, 225–253. [Google Scholar] [CrossRef]
  27. Chen, Y.; Xiao, X.; Peng, C.; Lu, G.; Zhou, Y. Low-rank tensor graph learning for multi-view subspace clustering. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 92–104. [Google Scholar] [CrossRef]
  28. Wu, J.; Lin, Z.; Zha, H. Essential tensor learning for multi-view spectral clustering. IEEE Trans. Image Process. 2019, 28, 5910–5922. [Google Scholar] [CrossRef] [PubMed]
  29. Zhang, Y.; Yang, W.; Liu, B.; Ke, G.; Pan, Y.; Yin, J. Multi-view spectral clustering via tensor-SVD decomposition. In Proceedings of the 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), Boston, MA, USA, 6–8 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 493–497. [Google Scholar]
  30. Wang, S.; Chen, Y.; Cen, Y.; Zhang, L.; Wang, H.; Voronin, V. Nonconvex low-rank and sparse tensor representation for multi-view subspace clustering. Appl. Intell. 2022, 52, 14651–14664. [Google Scholar] [CrossRef]
  31. Gao, Q.; Xia, W.; Wan, Z.; Xie, D.; Zhang, P. Tensor-SVD based graph learning for multi-view subspace clustering. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 3930–3937. [Google Scholar]
  32. Jia, Y.; Liu, H.; Hou, J.; Kwong, S.; Zhang, Q. Multi-view spectral clustering tailored tensor low-rank representation. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 4784–4797. [Google Scholar] [CrossRef]
  33. Gao, Q.; Zhang, P.; Xia, W.; Xie, D.; Gao, X.; Tao, D. Enhanced tensor RPCA and its application. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2133–2140. [Google Scholar] [CrossRef] [PubMed]
  34. Chen, Y.; Xiao, X.; Hua, Z.; Zhou, Y. Adaptive transition probability matrix learning for multiview spectral clustering. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 4712–4726. [Google Scholar] [CrossRef]
  35. Wang, S.; Chen, Y.; Jin, Y.; Cen, Y.; Li, Y.; Zhang, L. Error-robust low-rank tensor approximation for multi-view clustering. Knowl.-Based Syst. 2021, 215, 106745. [Google Scholar] [CrossRef]
  36. Xia, R.; Pan, Y.; Du, L.; Yin, J. Robust multi-view spectral clustering via low-rank and sparse decomposition. In Proceedings of the AAAI Conference on Artificial Intelligence, Québec City, QC, Canada, 27–31 July 2014; Volume 28. [Google Scholar]
  37. Xie, D.; Gao, Q.; Deng, S.; Yang, X.; Gao, X. Multiple graphs learning with a new weighted tensor nuclear norm. Neural Netw. 2021, 133, 57–68. [Google Scholar] [CrossRef]
  38. Yang, M.; Luo, Q.; Li, W.; Xiao, M. Multiview clustering of images with tensor rank minimization via nonconvex approach. SIAM J. Imaging Sci. 2020, 13, 2361–2392. [Google Scholar] [CrossRef]
  39. Horn, R.A.; Johnson, C.R. Matrix Analysis; Cambridge University Press: Cambridge, UK, 2012. [Google Scholar]
  40. Liu, G.; Lin, Z.; Yu, Y. Robust subspace segmentation by low-rank representation. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 663–670. [Google Scholar]
  41. Cai, J.F.; Candès, E.J.; Shen, Z. A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 2010, 20, 1956–1982. [Google Scholar] [CrossRef]
Figure 1. Parameter analysis on (a,b) Flower and (c,d) Coil20 datasets.
Figure 2. Clustering result change versus p on the Coil20 dataset.
Figure 3. Ablation study on all the three datasets: (a) ACC, (b) NMI, and (c) Purity.
Table 1. Clustering results (mean ± standard deviation) on Flower.

Methods    | ACC            | NMI            | Purity         | F-score        | AVE
SC_best    | 0.3737 ± 0.020 | 0.3975 ± 0.017 | 0.3916 ± 0.017 | 0.2637 ± 0.008 | 2.4604 ± 0.027
RMSC       | 0.3796 ± 0.015 | 0.4167 ± 0.008 | 0.3982 ± 0.015 | 0.2685 ± 0.008 | 2.3890 ± 0.032
MSCAN      | 0.4585 ± 0.001 | 0.5070 ± 0.001 | 0.4852 ± 0.002 | 0.3164 ± 0.002 | 2.1405 ± 0.009
ETL-MSC    | 0.9108 ± 0.034 | 0.9114 ± 0.016 | 0.9167 ± 0.028 | 0.8626 ± 0.033 | 0.3693 ± 0.068
ATPML-MSC  | 0.9772 ± 0.001 | 0.9661 ± 0.002 | 0.9772 ± 0.001 | 0.9554 ± 0.004 | 0.1395 ± 0.010
MGL-WTNN   | 0.9811 ± 0.001 | 0.9703 ± 0.001 | 0.9811 ± 0.001 | 0.9628 ± 0.001 | 0.1215 ± 0.003
RTL-WTNN   | 0.9874 ± 0.001 | 0.9800 ± 0.001 | 0.9874 ± 0.001 | 0.9753 ± 0.001 | 0.0818 ± 0.003
Table 2. Clustering results (mean ± standard deviation) on Coil20.

Methods    | ACC            | NMI            | Purity         | F-score        | AVE
SC_best    | 0.6886 ± 0.026 | 0.8131 ± 0.010 | 0.7276 ± 0.019 | 0.6609 ± 0.025 | 0.8388 ± 0.048
RMSC       | 0.7026 ± 0.020 | 0.8022 ± 0.006 | 0.7101 ± 0.011 | 0.6628 ± 0.015 | 0.8668 ± 0.031
MSCAN      | 0.8061 ± 0.022 | 0.9299 ± 0.005 | 0.8481 ± 0.016 | 0.7441 ± 0.030 | 0.4266 ± 0.044
ETL-MSC    | 0.8730 ± 0.021 | 0.9204 ± 0.009 | 0.8811 ± 0.015 | 0.8510 ± 0.020 | 0.3518 ± 0.040
ATPML-MSC  | 0.9513 ± 0.038 | 0.9795 ± 0.016 | 0.9616 ± 0.031 | 0.9481 ± 0.041 | 0.0990 ± 0.078
MGL-WTNN   | 0.8853 ± 0.023 | 0.9300 ± 0.006 | 0.8919 ± 0.015 | 0.8645 ± 0.016 | 0.3098 ± 0.032
RTL-WTNN   | 0.9769 ± 0.028 | 0.9866 ± 0.011 | 0.9806 ± 0.020 | 0.9695 ± 0.027 | 0.0627 ± 0.053
Table 3. Clustering results (mean ± standard deviation) on NUS-WIDE.

Methods    | ACC            | NMI            | Purity         | F-score        | AVE
SC_best    | 0.2338 ± 0.008 | 0.1095 ± 0.004 | 0.2457 ± 0.007 | 0.1361 ± 0.002 | 3.1950 ± 0.016
RMSC       | 0.2962 ± 0.004 | 0.1556 ± 0.004 | 0.3035 ± 0.004 | 0.1758 ± 0.003 | 3.029 ± 0.013
MSCAN      | 0.2493 ± 0.005 | 0.1972 ± 0.004 | 0.2643 ± 0.005 | 0.1739 ± 0.001 | 3.0376 ± 0.012
ETL-MSC    | 0.6906 ± 0.001 | 0.7007 ± 0.001 | 0.6927 ± 0.001 | 0.6243 ± 0.001 | 1.078 ± 0.004
ATPML-MSC  | 0.9286 ± 0.001 | 0.8844 ± 0.001 | 0.9286 ± 0.001 | 0.8658 ± 0.002 | 0.4062 ± 0.005
MGL-WTNN   | 0.8787 ± 0.001 | 0.7774 ± 0.001 | 0.8787 ± 0.001 | 0.7791 ± 0.001 | 0.7990 ± 0.003
RTL-WTNN   | 0.9441 ± 0.001 | 0.8976 ± 0.001 | 0.9441 ± 0.001 | 0.8933 ± 0.001 | 0.3650 ± 0.001