1. Introduction
In real-world applications, samples are usually collected from diverse sources or described by various feature extraction methods, and are therefore often represented by multiple views. Taking images as an example, one color image can be represented by multiple types of descriptors [1,2,3,4], such as Gist [5], histogram of oriented gradient (HoG) [6], local binary patterns (LBP) [7], SIFT [8], etc. [9,10,11]. Since features from different views may characterize different aspects of the same sample, the complementarity among different views can be explored to improve the performance of single-view learning algorithms.
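To make the notion of "views" concrete, the following is a minimal sketch of building two descriptor views for one image with scikit-image; the specific parameter values (cell sizes, LBP radius, histogram bins) are illustrative choices, not those of any cited paper.

```python
import numpy as np
from skimage import data
from skimage.color import rgb2gray
from skimage.feature import hog, local_binary_pattern

# One image, two "views": a HoG descriptor and a uniform-LBP histogram
image = (rgb2gray(data.astronaut()) * 255).astype(np.uint8)
view_hog = hog(image, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
lbp = local_binary_pattern(image, P=8, R=1.0, method='uniform')
view_lbp, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
print(view_hog.shape, view_lbp.shape)  # two feature vectors for one sample
```

Each view is a different feature vector for the same underlying sample, which is exactly the setting multi-view learning exploits.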
In many fields, the constructed features are usually high-dimensional, although the underlying structure can often be described by a small number of parameters. Operating directly on such features is time-consuming and computationally expensive. Dimensionality reduction (DR) has therefore become a basic preprocessing technique for such data. In the past few decades, plenty of dimensionality reduction algorithms have been proposed to seek the optimal subspace for the original high-dimensional data based on different principles. One class of dimensionality reduction algorithms is based on graph embedding, which seeks a low-dimensional representation that preserves the graph affinity between samples as much as possible. The graph embedding-based methods can be divided into linear and nonlinear methods. Linear methods aim to seek an appropriate projection matrix that projects the high-dimensional data onto an optimal low-dimensional subspace. One of the most famous linear algorithms is principal component analysis (PCA) [12], which maximizes the global variance of the data to obtain the low-dimensional subspace. Linear discriminant analysis (LDA) [13] is a supervised method that seeks a projection matrix maximizing the separation between classes. In addition, there are many other representative and classical linear dimensionality reduction methods, such as locality preserving projections (LPP) [14], neighborhood preserving embedding (NPE) [15], marginal Fisher analysis (MFA) [16], etc. [15,17,18].
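For concreteness, the following is a minimal NumPy sketch of PCA as recalled above: center the data and take the top eigenvectors of the sample covariance as the projection directions of maximal global variance. Variable names and sizes are illustrative only.

```python
import numpy as np

def pca(X, n_components):
    """Project rows of X (n_samples x n_features) onto the
    n_components directions of maximal global variance."""
    X_centered = X - X.mean(axis=0)                  # remove the mean
    cov = X_centered.T @ X_centered / (len(X) - 1)   # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    W = eigvecs[:, ::-1][:, :n_components]           # top components
    return X_centered @ W                            # low-dimensional embedding

# Example: reduce 100 samples with 50 features to 5 dimensions
Z = pca(np.random.randn(100, 50), n_components=5)
```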
Different from linear dimensionality reduction methods, nonlinear methods rest on the well-known hypothesis that the observed high-dimensional data are actually mapped from a low-dimensional submanifold. Many representative nonlinear dimensionality reduction algorithms, such as locally linear embedding (LLE) [19], Isomap [20], Laplacian eigenmaps (LE) [21], etc. [22,23], have been well studied. However, these traditional DR methods construct a graph in advance and then find an optimal subspace that preserves this graph as much as possible. Since the graph construction and dimensionality reduction processes are separated, these methods may seek the subspace using a suboptimal graph. To address this issue, algorithms such as graph-optimized locality preserving projections (GoLPP) [24], dimensionality reduction with adaptive graph (DRAG) [25], and joint graph optimization and projection learning (JGOPL) [26] have been proposed to introduce graph optimization into the dimensionality reduction procedure. These methods simultaneously seek an optimal graph and subspace within one objective function.
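The two-stage separation criticized above is visible in a textbook Laplacian-eigenmaps pipeline, sketched below: the k-nearest-neighbor heat-kernel graph is fixed first, and only then is the embedding computed from it, so any deficiency of the graph propagates to the embedding. The neighborhood size k and kernel width sigma are assumed parameters, not values from the cited papers.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def laplacian_eigenmaps(X, n_components, k=5, sigma=1.0):
    """Two separated stages: (1) build a fixed kNN heat-kernel graph;
    (2) embed by solving the generalized problem L y = lam D y."""
    n = X.shape[0]
    D2 = cdist(X, X, 'sqeuclidean')
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D2[i])[1:k + 1]            # k nearest neighbors
        W[i, idx] = np.exp(-D2[i, idx] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                          # symmetrize the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                       # unnormalized Laplacian
    # smallest nontrivial eigenvectors; index 0 is the constant vector
    vals, vecs = eigh(L, D, subset_by_index=[1, n_components])
    return vecs                                     # n x n_components embedding
```

Graph-optimized methods such as GoLPP, DRAG, and JGOPL fold stage (1) into the objective so the graph itself is updated during learning.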
Although plenty of dimensionality reduction methods perform well on high-dimensional data, most of them cannot be directly extended to the multi-view setting, as they fail to effectively explore the inherent relations among the features of different views. In the past decade, multi-view learning has been well developed in various fields [27,28,29,30,31]. Canonical correlation analysis (CCA) [32] and its multi-view version, multi-view canonical correlation analysis (MCCA) [33], are well-known algorithms that are widely adopted as regularization terms in multi-view learning. Distributed spectral embedding (DSE) [34] aims to construct one common low-dimensional embedding based on the smoothness principle. However, since the original multi-view data are invisible to the final learning process, it cannot fully explore the complementary nature of different views. To overcome this problem, Xia et al. propose a nonlinear dimensionality reduction algorithm for multi-view data termed multi-view spectral embedding (MSE) [35], which effectively explores the complementary and compatible information from different views to construct one common low-dimensional embedding for all views. Kan et al. [36] extend LDA to the multi-view setting and propose multi-view discriminant analysis (MvDA) to project multi-view data into a common discriminative space. Ding et al. [37] propose the low-rank common subspace (LRCS) method to seek one common linear subspace with a low-rank constraint for each view, based on a compatibility principle that aims to reduce the semantic gap between different views. However, most multi-view dimensionality reduction algorithms construct the graph from the original high-dimensional data, independently of the subspace learning. Consequently, the DR results of these algorithms are sensitive to the graph construction: if the predefined graph is of low quality, the resulting dimensionality reduction may also be poor.
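As a reference point for the two-view case, the sketch below implements classical CCA by whitening each view's covariance and taking the top singular vectors of the whitened cross-covariance; the regularizer eps is an assumed parameter for numerical stability, not part of the original formulation.

```python
import numpy as np

def cca(X, Y, n_components, eps=1e-6):
    """Classical two-view CCA: whiten each view's covariance, then take
    the top singular vectors of the whitened cross-covariance."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + eps * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + eps * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # inverse matrix square root via eigendecomposition
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    K = inv_sqrt(Cxx) @ Cxy @ inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(K)
    Wx = inv_sqrt(Cxx) @ U[:, :n_components]   # projection for view 1
    Wy = inv_sqrt(Cyy) @ Vt[:n_components].T   # projection for view 2
    return Wx, Wy, s[:n_components]            # canonical correlations
```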
To deal with these issues, in this paper we incorporate graph optimization and low-dimensional subspace learning for multi-view data into one common framework and propose graph optimization multi-view projections learning (GoMPL) for dimensionality reduction. Since features from different views describe the same sample, they usually admit the same underlying similarity structure. Based on this hypothesis, GoMPL aims to learn one intrinsic graph structure of the samples and simultaneously seek, for each view, a subspace that preserves this graph. Throughout the learning procedure, the common graph is adaptively adjusted based on the low-dimensional representation of each view. Moreover, since the information contained in the original high-dimensional data is also important [25], we further regularize the target similarity graph as a centroid of the prespecified graph of each view, which introduces the information of the original data into the optimal graph. Specifically, the learned optimal graph is also employed to integrate the information from the multi-view data, which avoids co-regularizing all the views into a common subspace [29]. Because assigning an appropriate weight to each view is essential in multi-view learning, the proposed GoMPL provides a self-weighted scheme that automatically learns the weights during graph learning, removing the need to predefine them as hyperparameters. We provide an effective updating algorithm to solve the proposed GoMPL (a schematic sketch of this alternating procedure is given after the contribution list below). Extensive experiments on various datasets demonstrate the effectiveness of the proposed GoMPL. We summarize the contributions of our work as follows:
We propose a novel multi-view dimensionality reduction algorithm, GoMPL, which simultaneously seeks one common underlying manifold structure for samples described by multi-view features and an appropriate subspace for each view.
We adopt one optimal graph to learn the projections for all views. This graph is further optimized based on both the low-dimensional representation of each view and the affinity of the original high-dimensional data. Therefore, GoMPL integrates the multi-view information through the common graph rather than co-regularizing the low-dimensional representations of the views.
Since the information from the original affinity of each view is also important, GoMPL regularizes the target similarity graph as a centroid of the prespecified graph of each view. Moreover, different views may contribute differently to revealing the underlying manifold structure, so GoMPL adaptively allocates an appropriate weight to each view without predefined hyperparameters.
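The sketch below illustrates the kind of alternating scheme described above: fix the common graph and update each view's projection by graph embedding, then update the graph toward a weighted centroid of the predefined per-view graphs discounted by low-dimensional distances, and finally refresh the view weights. The concrete update rules and the parameter beta are schematic assumptions for illustration, not the paper's exact derivation.

```python
import numpy as np
from scipy.linalg import eigh

def gompl_sketch(views, predefined_graphs, n_components, n_iters=10, beta=1.0):
    """Schematic alternating optimization in the spirit of GoMPL.
    views: list of (n x d_v) matrices; predefined_graphs: list of symmetric
    (n x n) affinities built from each view's original data.
    All updates below are illustrative assumptions."""
    n_views = len(views)
    alpha = np.ones(n_views) / n_views            # self-learned view weights
    S = sum(predefined_graphs) / n_views          # initialize the common graph
    for _ in range(n_iters):
        # Step 1 (fix S): per-view projection by graph embedding
        embeddings = []
        for X in views:
            D = np.diag(S.sum(axis=1))
            L = D - S                             # Laplacian of the common graph
            A = X.T @ L @ X
            B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])
            _, W = eigh(A, B, subset_by_index=[0, n_components - 1])
            embeddings.append(X @ W)
        # Step 2 (fix projections): move S toward a weighted centroid of the
        # predefined graphs, penalizing pairs far apart in the embeddings
        S = sum(a * G for a, G in zip(alpha, predefined_graphs))
        for a, Z in zip(alpha, embeddings):
            d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
            S = S - beta * a * d2
        S = np.maximum((S + S.T) / 2, 0)          # symmetric, nonnegative
        S = S / (S.sum(axis=1, keepdims=True) + 1e-12)
        S = (S + S.T) / 2
        # Step 3: inverse-residual self-weighting (illustrative heuristic):
        # views whose embedding fits S better receive larger weight
        res = np.array([(S * ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)).sum()
                        for Z in embeddings])
        alpha = 1.0 / (res + 1e-12)
        alpha = alpha / alpha.sum()
    return S, alpha
```

Because each step decreases (or leaves unchanged) its own subproblem's objective, such alternating schemes typically converge in a few iterations in practice.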
5. Conclusions
By introducing graph optimization, we propose a novel dimensionality reduction algorithm for multi-view data, termed GoMPL, which integrates dimensionality reduction and graph optimization into a unified framework to jointly construct the projection matrices for the different views and optimize the graph. The proposed GoMPL therefore performs dimensionality reduction based on an optimal, high-quality graph for each view. Moreover, the optimal graph learned by GoMPL also integrates the information from the multi-view data without co-regularizing the low-dimensional representations. Furthermore, to incorporate the information of the original high-dimensional multi-view data, GoMPL regularizes the common target graph to approximate the graphs predefined on those data. Extensive experiments demonstrate that the proposed GoMPL can effectively explore the underlying intrinsic manifold structure of samples described by features from multiple views, and finds a more appropriate subspace for the features of each view than the compared algorithms.
The proposed GoMPL can explore the complementarity among multiple views for subspace learning. However, some issues still require further investigation. First, since the dimensionality reduction is built on graph embedding, it incurs considerable computational cost from matrix operations; in particular, the matrix inversion and eigendecomposition exploited in our method make the algorithm computationally expensive, whereas in many practical applications the datasets are very large. Therefore, to deal with large-scale datasets, we will utilize the DeepWalk [42] technique to accelerate the graph embedding for the proposed GoMPL. Second, GoMPL learns one common graph for all views to perform dimensionality reduction, which is not flexible enough. In future work, we will consider learning an optimal graph for each view to more flexibly explore the underlying geometric structure of samples in multi-view data.