Article

Harnessing Unsupervised Insights: Enhancing Black-Box Graph Injection Attacks with Graph Contrastive Learning

College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(20), 9190; https://doi.org/10.3390/app14209190
Submission received: 22 August 2024 / Revised: 4 October 2024 / Accepted: 8 October 2024 / Published: 10 October 2024

Abstract

Adversarial attacks on Graph Neural Networks (GNNs) have emerged as a significant threat to the security of graph learning. Compared with Graph Modification Attacks (GMAs), Graph Injection Attacks (GIAs) are considered more realistic attacks, in which attackers perturb GNN models by injecting a small number of fake nodes. However, most existing black-box GIA methods either require comprehensive knowledge of the dataset and the ground-truth labels or a large number of queries to execute the attack, which is often unfeasible in many scenarios. In this paper, we propose an unsupervised method that leverages the rich knowledge contained in the graph data themselves to enhance the success rate of graph injection attacks on the initial query. Specifically, we introduce the Graph Contrastive Learning-based Graph Injection Attack (GCIA), which consists of a node encoder, a reward predictor, and a fake node generator. The Graph Contrastive Learning (GCL)-based node encoder maps nodes to low-dimensional continuous embeddings, the reward predictor acts as a simplified surrogate for the target model, and the fake node generator produces fake nodes and edges based on several carefully designed loss functions, utilizing the node encoder and reward predictor. Extensive results demonstrate that the proposed GCIA method achieves a first query success rate of 91.2% on the Reddit dataset and improves the success rate to over 99.7% after 10 queries.

1. Introduction

Graph Neural Networks (GNNs) have been remarkably successful across a spectrum of graph-based learning tasks, including node classification [1], edge prediction [2], and graph classification [3]. Node classification, in particular, is a pivotal task in this domain. It involves GNNs in learning node representations by integrating information from adjacent nodes, which then aids in mapping these nodes to their respective categories. For example, in the context of social networks, information collectors can predict political inclinations [4,5], hobbies [6], or social bots [7]. Similarly, in citation networks, GNNs can categorize scholarly articles into various classes [8]. However, the rich and flexible structural information in graph data, as compared to Euclidean data types like images, audio, or text, makes it more susceptible to adversarial attacks. Adversarial attacks on graph data can be more detrimental due to the intricate connections and dependencies within the graph structure.
Prior research has established that Graph Modification Attacks (GMAs) [9,10,11] can significantly impair the efficacy of Graph Neural Networks (GNNs) by introducing adversarial perturbations to the graph structure or node attributes. In such scenarios, an attacker capable of executing GMA has the ability to either establish new edges or delete existing ones between nodes within the graph. Moreover, they can also manipulate the node attributes.
In the majority of scenarios, it is challenging for an attacker to possess such extensive operational privileges. Graph Injection Attack (GIA), which injects fake nodes into the graph without modifying the existing graph data, is gaining more attention. For instance, in social networks, it would be difficult for a GMA attacker to alter the attributes of a genuine user or to force a connection between the target and another real user. Conversely, for a GIA attacker, registering a fake user with specific attributes and creating a link with the victim real user is a task that can be achieved easily.
GIAs present a more complex challenge than GMAs, as they require the attacker to create fake nodes with unnoticeable adversarial features. Additionally, the attacker must carefully connect the fake nodes with the existing nodes, as well as establish edges among the fake nodes themselves, ensuring that their injection into the graph is both inconspicuous and aggressive. The majority of graph injection attacks are conducted under white-box conditions [12,13], where the attacker enjoys unrestricted access to both the parameters of the target model and the graph data. In this adversarial setting, attackers can significantly weaken the target model’s performance by introducing meticulously crafted fake nodes. However, in the context of real-world applications, it is exceedingly difficult for an attacker to gain access to the training datasets or to possess full knowledge of the target model’s parameters. Consequently, the black-box scenario is a more realistic adversarial setup for graph injection attacks, as it restricts attackers from accessing detailed information about the target model, limiting their interactions to querying certain data and the outputs received from the target model.
Black-box graph injection attacks can be categorized into two types, i.e., transfer-based and query-based strategies. The transfer-based approach initiates an attack by training a substitute model, in which adversarial nodes are crafted based on the surrogate model and subsequently transferred to the target black-box model. Drawing on extensive knowledge from the white-box attack domain, the transfer-based approach does not need to know the inner workings of the target model’s parameters. However, this approach demands a thorough comprehension of the graph dataset, encompassing the features and connections of the training nodes as well as the corresponding true labels. In contrast, the query-based approach starts with an initial set of fake nodes, enabling the attacker to optimize these nodes through a process of iteratively querying the target model, particularly focusing on the classification confidence scores of the target node. These methods eschew the need for ground-truth labels, requiring only the sub-graph that includes the victim node and its neighbors. Consequently, a query-based approach can be regarded as a more practical and better-suited approach to the constraints of real-world contexts.
In query-based attacks, the features of the fake nodes are usually randomly initialized, followed by a process of iterative optimization via repeated queries, which demands multiple queries to effectively perform the attack. To bridge this gap, we introduce a strategy that harnesses graph knowledge in an unsupervised manner, through Graph Contrastive Learning (GCL) [14,15,16], to guide the initial generation of the fake nodes. This approach significantly enhances the attack success rate in the initial query and reduces the overall number of queries required. Specifically, the proposed method involves training a GCL model to capture the embeddings of the target nodes. Leveraging this model, we fine-tune the features and edges of the fake nodes by maximizing the variation in the embeddings of the target nodes after injecting fake nodes. This technique employs unsupervised graph knowledge to sharpen the attack strategy. After obtaining responses from the target model, we utilize the classification confidence scores of the target nodes to train a simple classifier that stands in as a surrogate for the target model. By minimizing the confidence level in the genuine class, we are able to effectively optimize the fake nodes using the output from our classifier.
In this work, we concentrate on the query-based black-box graph injection attack, which closely resembles real-world scenarios. Within this framework, access is limited to the features of the target nodes and their neighbors, as well as the adjacency matrix that represents the connections between these nodes. Furthermore, the ability to query the target model is also available. To launch an evasion attack under these constrained conditions, we introduce a Graph Contrastive Learning-based Graph Injection Attack (GCIA) method. The GCIA methodology is comprised of three core components: a node encoder, a reward predictor, and a fake node generator. The node encoder in GCIA is a Graph Contrastive Learning (GCL) model equipped with two graph convolutional layers. It employs a strategy of randomly removing edges and masking node features to generate diverse graph views, thereby producing node embeddings in an unsupervised fashion. The reward predictor functions as a surrogate for the target model, learning the correlation between the node embeddings and the query results through a straightforward Multi-Layer Perceptron (MLP) architecture. The fake node generator is designed to optimize the fake nodes and their edges, with the dual objectives of maximizing the shift in the embedding space of the target node and minimizing the categorical confidence scores provided by the reward predictor. The proposed GCIA method initiates by training a feature encoder on the accessible graph data, which is followed by an iterative process of generating fake nodes and conducting queries. With each query result, the reward predictor is meticulously fine-tuned using the queried classification confidence scores to better grasp the decision boundaries of the target model. Thereafter, the fake node generator utilizes the insights from previous unsuccessful forays to produce fake nodes and edges that are more likely to lead to a successful attack. The key contributions of this work are as follows:
  • To the best of our knowledge, this is the first work that utilizes unsupervised learning for graph injection attacks. By maximizing the changes in the target node in the embedding space, we have significantly enhanced the success rate of the initial attack query.
  • The proposed reward predictor and fake node generator can fully leverage querying results. The reward predictor fine-tunes the model parameters using the query results, ensuring that predictions are closer to the actual query results. The fake node generator enhances the success rate of the attack by generating fake nodes that are dissimilar to those from previous unsuccessful cases.
  • Through extensive experiments conducted on various recognized benchmark datasets, the effectiveness of the proposed GCIA method has been demonstrated in comparison to state-of-the-art attack models across varying attack budgets. In the single fake node–single fake edge injection attack scenario, when the target black-box model is a GCN, after conducting 10 queries, the GCIA method achieved attack success rates of 43.5%, 60.3%, 100%, and 91.1% on the Cora, Citeseer, Reddit, and PubMed datasets, respectively.
This work is an extension of our previous study, presented at the ICASSP 2024 conference [17]. The current manuscript expands upon the preliminary findings reported in the conference paper by including multi-fake node attacks and multi-fake edge attacks. Extensive experiments have been conducted to test the impact of the fake node number and fake edge number on the attack performance.
The remainder of the paper is organized as follows: In Section 2, we review the relevant literature on graph injection attacks and graph contrastive learning. In Section 3, we introduce the preliminaries of graph contrastive learning and graph injection attacks. Then, in Section 4, we introduce the proposed Graph Contrastive Learning-based Graph Injection Attack (GCIA) method. The node encoder, the reward predictor, and the fake node generator are detailed in Section 4.2, Section 4.3, and Section 4.4, respectively. Section 5 presents our experimental results, in which we evaluate the performance of the proposed GCIA method across four adversarial settings. Finally, Section 6 concludes with a summary and an outline of promising directions for future work.

2. Related Work

In the following, we briefly introduce graph injection attack methods and graph contrastive learning models.

2.1. Graph Injection Attack

In the majority of research, adversarial attacks on Graph Neural Networks (GNNs) are executed by modifying node features or edges in the graph, which is known as the Graph Modification Attack (GMA). While GMA has shown remarkable efficacy in a multitude of attack scenarios, it is based on the premise that attackers can easily alter the original graph, an assumption that is unrealistic in most situations. To address this concern, researchers have proposed the Graph Injection Attack (GIA), which injects adversarial fake nodes into the graph instead of modifying the original graph.
In white-box graph injection attacks, the injected fake nodes and fake edges are typically generated based on gradient values. Sun et al. [18] propose the Node Injection Poisoning Attack (NIPA) method, which makes use of reinforcement learning to generate fake nodes and fake edges for poisoning attacks. Fang et al. [13] introduce the Global Attacks via Node Injections (GANI) method, which generates features and selects neighbors for the fake nodes based on the statistical information of features and evolutionary perturbations obtained from a genetic algorithm. Tao et al. [19] propose a Generalizable Node Injection Attack (G-NIA) method that injects only a single node during the test phase to dampen the performance of the target GNN model for evasion attacks. Chen et al. [20] find that the flexibility of GIA can severely damage the homophily distribution of the original graph, and suggest using the Harmonious Adversarial Objective (HAO) method to generate undetectable fake nodes.
In black-box attacks, the information available to the attackers differentiates two types of adversarial strategies: transfer-based attack approaches and query-based attack approaches. In transfer-based attack approaches, the training data and the ground-truth training labels are accessible. Attackers generate fake nodes on a pre-trained surrogate model and enhance their transferability to attack the target model. Zou et al. [21] propose the Topological Defective Graph Injection Attack (TDGIA), a black-box graph injection attack method that leverages topological vulnerabilities to generate fake edges and optimizes the features of adversarial nodes.
In query-based attack approaches, the adversary can submit the perturbed graph to the target model. By leveraging the classification confidence provided by the target model, the attacker then optimizes the fake nodes. Ju et al. [22] model the graph injection attack as a Markov decision process and propose a Gradient-free Graph Advantage Actor Critic (G2A2C) method to generate fake nodes and fake edges in a black-box setting. Wang et al. [23] formulate the GIA problem as a graph clustering problem and propose a black-box cluster attack method that measures the similarity between victim nodes using a metric of adversarial vulnerability and then optimizes fake nodes through queries.
This paper primarily focuses on query-based black-box attack. In this attack scenario, the attacker can access a subset of the dataset without ground-truth labels, and is able to obtain the target model’s classification confidence through queries.

2.2. Adversarial Attacks on Graph Contrastive Learning

Due to the remarkable achievements of contrastive learning in computer vision, researchers have begun to explore its application to graphs, achieving competitive results. Contrastive learning models are typically divided into three main components: data augmentation, contrastive modes, and contrastive optimization objectives. In node-level tasks, common data augmentation strategies include feature-based augmentation, structure-based augmentation, and sampling-based augmentation. After data augmentation, the model treats views obtained from the same subgraph as positive pairs, and views from different subgraphs as negative pairs. The optimization of the model is then guided by maximizing the mutual information between positive pairs and minimizing the mutual information between negative pairs. Zeng et al. [24] define augmentation operations such as edge deletion, node deletion, edge insertion, and node insertion to generate different views of the graph. You et al. [25] enhance the consistency between graph representations by introducing uniform perturbations such as node deletion, edge perturbation, and feature masking. Zhu et al. [16] generate different views by removing edges and masking node features. Based on GRACE [16], Zhu et al. [26] introduced innovative augmentation strategies designed to preserve important edges and features, while introducing perturbations to less significant edges and features.
Researchers have already focused on adversarial security in graph contrastive learning. Zhang et al. [27] employed discrete optimization to reformulate the attack problem as an optimization problem. This approach effectively directed the victim model to map the target node to the specific embedding. Zhang et al. [28] worked on the adversarial security of graph contrastive models. By calculating the gradients of the adjacency matrices of two views, they flipped the edges to maximize the contrastive loss. Li et al. [29] performed poisoning attacks on graph contrastive models through node injection. They computed the gradients of the contrastive loss on the adjacency matrix in an unsupervised manner, and flipped the edges with the greatest gradients, thereby significantly degrading the accuracy of downstream tasks.
These works treat graph contrastive models themselves as the targets of adversarial attacks. In contrast, the proposed GCIA method aims to perform evasion attacks on black-box GNN models. Within the GCIA method, the graph contrastive model is used for extracting node embeddings from unlabeled data. By maximizing the change in the target node embedding, GCIA can achieve a high success rate in the first query.

3. Preliminaries

In this section, we give some preliminaries on Graph Contrastive Learning (GCL) Models and Graph Injection Attacks (GIA) on graph data. Prior to delving into the specifics, we provide a summary of the frequently used notations in Table 1.

3.1. Graph Contrastive Model

Let $G = \{A, X\}$ represent a graph, with $A \in \{0,1\}^{N \times N}$ and $X \in \mathbb{R}^{N \times F}$ denoting the adjacency matrix and feature matrix, respectively. We denote $V = \{v_1, v_2, \ldots, v_N\}$ as the set of $N$ nodes, and $E \subseteq V \times V$ represents the edge set. Specifically, $x_i \in \mathbb{R}^F$ is the $F$-dimensional feature of node $v_i$, and $A_{ij} = 1$ if $(v_i, v_j) \in E$; otherwise, $A_{ij} = 0$. The objective of the graph contrastive model is to learn a feature encoder $E(A, X)$ in an unsupervised manner. The GCL model processes the input feature matrix and adjacency matrix to produce node embeddings. By aggregating the features of nodes with the features of their neighbors, the GCL model generates embeddings that are well-equipped for diverse downstream tasks, including node classification and link prediction.
A typical graph contrastive framework encompasses three principal stages. Initially, two augmentation processes are applied to the original graph to yield two views. Subsequently, these views are fed into the encoder to generate node embeddings. Ultimately, a contrastive loss is computed and the encoder is optimized to enhance the similarity among positive sample pairs while diminishing the similarity among negative sample pairs. In node-level tasks, the most prevalent data augmentation techniques are feature masking and edge dropping (or insertion). During the feature masking procedure, the model generates a masking vector $M_F \in \{0,1\}^F$ based on a predefined masking probability $p_F$. Each element denotes whether the corresponding feature is masked or not. The masked feature matrix can be expressed as
$$\tilde{X} = (\mathbf{1} - M_F) \odot X, \quad \text{s.t.} \quad \|M_F\| = p_F \cdot F.$$
Similarly, edge dropping (or insertion) is performed through a masking matrix $M_A \in \{0,1\}^{N \times N}$. The masked adjacency matrix can be expressed as
$$\tilde{A} = A + (J - I - 2A) \odot M_A, \quad \text{s.t.} \quad \|M_A\| = p_A \cdot N,$$
where $J$ is the all-ones matrix, $I$ is the identity matrix, and $p_A$ denotes the masking probability.
By employing different masking rates and masking matrices, we can obtain two graph views, $\tilde{G}_1$ and $\tilde{G}_2$, with their respective node embeddings $E(\tilde{A}_1, \tilde{X}_1)$ and $E(\tilde{A}_2, \tilde{X}_2)$. We denote the embedding of node $v_i$ in $\tilde{G}_1$ and $\tilde{G}_2$ as $e_i^1$ and $e_i^2$, respectively.
For a node $v_i$, its embedding $e_i^1$ in graph view $\tilde{G}_1$ is considered as the anchor, while its embedding $e_i^2$ in the other view forms a positive sample. The embeddings of other nodes in both views are regarded as negative samples. The GCL model is trained to minimize the classical InfoNCE [30] objective:
$$\mathcal{L}(e_i^1, e_i^2) = -\log \frac{e^{\psi(e_i^1, e_i^2)/\tau}}{e^{\psi(e_i^1, e_i^2)/\tau} + \sum_{j \neq i}\left(e^{\psi(e_i^1, e_j^1)/\tau} + e^{\psi(e_i^1, e_j^2)/\tau}\right)},$$
where ψ is the similarity function, such as cosine similarity, and τ is a temperature parameter.
The overall loss, averaged over all positive pairs in both views, is minimized during training:
$$\mathcal{L} = \frac{1}{2N}\sum_{i=1}^{N}\left[\mathcal{L}(e_i^1, e_i^2) + \mathcal{L}(e_i^2, e_i^1)\right].$$
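To make the augmentation and contrastive objective above concrete, the following PyTorch-style sketch shows one way to implement random feature masking, edge dropping, and the pairwise InfoNCE loss; the dense adjacency representation, masking probabilities, and temperature are illustrative assumptions rather than the exact GCIA configuration.

import torch
import torch.nn.functional as F

def mask_features(X, p_f):
    # Randomly zero out roughly a fraction p_f of the feature dimensions (feature masking).
    mask = (torch.rand(X.size(1), device=X.device) < p_f).float()
    return X * (1.0 - mask)

def drop_edges(A, p_a):
    # Randomly drop existing edges with probability p_a; a simplified edge-masking
    # variant that only removes edges and keeps the (float) adjacency matrix symmetric.
    drop = torch.triu((torch.rand_like(A) < p_a).float(), diagonal=1)
    drop = drop + drop.t()
    return A * (1.0 - drop)

def info_nce(e1, e2, tau=0.5):
    # e1, e2: (N, d) embeddings of the same N nodes in two graph views.
    e1, e2 = F.normalize(e1, dim=1), F.normalize(e2, dim=1)
    between = torch.exp(e1 @ e2.t() / tau)   # cross-view similarities exp(psi/tau)
    within = torch.exp(e1 @ e1.t() / tau)    # intra-view similarities exp(psi/tau)
    pos = between.diag()                     # positive pairs: same node, other view
    neg = between.sum(1) - pos + within.sum(1) - within.diag()
    return (-torch.log(pos / (pos + neg))).mean()

The symmetric overall objective is then obtained by averaging info_nce(e1, e2) and info_nce(e2, e1) over the two views.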

3.2. Graph Injection Attack

In the node classification task, a GNN model $f(G) \rightarrow Y$ classifies each node $v_i$ into a category $y_i \in Y$. A Graph Injection Attack (GIA) attacker modifies graph $G$ to $G' = (A', X')$ with
$$A' = \begin{bmatrix} A & A_I^T \\ A_I & B \end{bmatrix}, \quad X' = \begin{bmatrix} X \\ X_I \end{bmatrix},$$
where $A_I \in \{0,1\}^{N_I \times N}$ is the adjacency matrix that represents the edges between the fake nodes and the original graph $G$, $B$ is the adjacency matrix of the fake nodes, $X_I$ is the feature matrix of the fake nodes, and $N_I$ is the number of fake nodes.
By injecting fake nodes, the attacker aims for
$$\min_{G'} \; \mathbb{I}\big(f(T, G') = Y_T\big),$$
where $T$ denotes the set of target nodes, $Y_T$ contains the labels of the target nodes, and $\mathbb{I}(\cdot)$ represents an indicator function that returns the number of true conditions.
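As a concrete reading of Equation (5), the snippet below assembles the perturbed adjacency and feature matrices from the original graph and the injected nodes; the function name is ours, and the identity block used for B when it is not supplied matches the no-fake-edge assumption made later in Section 4.4.

import torch

def inject_nodes(A, X, A_I, X_I, B=None):
    # A: (N, N) original adjacency, X: (N, F) original features.
    # A_I: (N_I, N) edges between the fake nodes and the original nodes.
    # X_I: (N_I, F) fake node features, B: (N_I, N_I) adjacency among fake nodes.
    N_I = X_I.size(0)
    if B is None:
        B = torch.eye(N_I, device=A.device)   # no edges among fake nodes
    A_new = torch.cat([torch.cat([A, A_I.t()], dim=1),
                       torch.cat([A_I, B], dim=1)], dim=0)   # block matrix of Equation (5)
    X_new = torch.cat([X, X_I], dim=0)
    return A_new, X_new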

4. Methodology

4.1. Overview

In this paper, we propose a Graph Contrastive Learning-based Graph Injection Attack (GCIA) method. An overview of GCIA is shown in Figure 1. As Figure 1 shows, the proposed GCIA attack contains three phases: generating fake nodes, querying, and training the reward predictor. In query-based black-box attacks, the attacker is permitted multiple queries to the target model; thus, these three phases are executed iteratively. The GCIA method consists of three components: a node encoder, a reward predictor, and a fake node generator. The node encoder, which can be trained by any local–local contrastive learning method, maps the features of a target node $v_t$ to a low-dimensional embedding $z_t$ by aggregating the features of the target node and its neighboring nodes. After querying, the attacker retrieves the reward value $r_t$ of the target node, i.e., the classification confidence returned by the target model. The reward predictor aims to learn the mapping from the embedding $z_t$ to the reward $r_t$, serving as a surrogate for the target model. The fake node generator employs gradient optimization to generate fake nodes and fake edges based on information from the node encoder and the reward predictor.

4.2. Node Encoder

The node encoder $E(A, X)$ uses two graph convolutional layers to aggregate the node features and produce node embeddings:
$$Z = E(A, X) = \sigma\big(\hat{A}\,\sigma(\hat{A} X W_0) W_1\big),$$
where $W_0$ and $W_1$ are the parameters of the two graph convolution layers, $\hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$, $\tilde{A} = A + I_N$, with $I_N$ being the identity matrix; $\tilde{D} \in \mathbb{R}^{N \times N}$ is a diagonal matrix with $\tilde{D}_{i,i} = \sum_j \tilde{A}_{i,j}$; and $\sigma(\cdot)$ is the activation function.
In this paper, we propose training the node encoder with contrastive learning in an unsupervised way, since we have limited access to labeled data in the black-box setting and a model trained on other data has insufficient generalization ability. As in GRACE [16], we use both edge removal and node feature masking to generate contrastive graph views. We consider the embeddings of a node in different graph views as positive pairs, and those of different nodes as negative pairs. The model parameters are optimized by maximizing the similarity between positive pairs and minimizing the similarity between negative pairs.
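A minimal dense implementation of this two-layer encoder might look as follows; the hidden sizes and the ReLU activation are illustrative assumptions, and in GCIA the encoder would be trained on the accessible subgraph with the GRACE-style views and InfoNCE objective of Section 3.1.

import torch
import torch.nn as nn

def normalize_adj(A):
    # \hat{A} = \tilde{D}^{-1/2} (A + I_N) \tilde{D}^{-1/2}
    A_tilde = A + torch.eye(A.size(0), device=A.device)
    d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * A_tilde * d_inv_sqrt.unsqueeze(0)

class NodeEncoder(nn.Module):
    # Z = sigma(A_hat sigma(A_hat X W0) W1)
    def __init__(self, in_dim, hid_dim=128, out_dim=64):
        super().__init__()
        self.W0 = nn.Linear(in_dim, hid_dim, bias=False)
        self.W1 = nn.Linear(hid_dim, out_dim, bias=False)

    def forward(self, A, X):
        A_hat = normalize_adj(A)
        H = torch.relu(A_hat @ self.W0(X))      # first graph convolution layer
        return torch.relu(A_hat @ self.W1(H))   # second layer: node embeddings Z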

4.3. Reward Predictor

By leveraging graph contrastive learning, the reward predictor can take node embeddings as inputs and predict a reward value through a Multi-Layer Perceptron (MLP). The reward predictor can be expressed as
$$F(z_t, \theta) = a_1\,\sigma(a_0 z_t + b_0) + b_1,$$
where $z_t$ is the embedding of the target node, $\theta = \{a_0, b_0, a_1, b_1\}$ denotes the weights and biases, and $\sigma(\cdot)$ is the activation function.
We use the $\ell_1$ norm as the loss function to ensure that the output of the reward predictor matches the query results. It can be expressed as
$$\mathcal{L}_r(z_t, r_t) = \| F(z_t) - r_t \|_1,$$
where $r_t$ represents the query result of target node $v_t$.
Since the attacker has no knowledge of the label information before the initial query, the reward predictor is integrated into the GCIA after the first query. Before the q-th query, the reward predictor is trained using all the results from the previous queries:
$$\theta_q = \arg\min_{\theta} \sum_{p \in (1, q-1)} \sum_{v_t \in T} \mathcal{L}_r(z_t^p, r_t^p).$$
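A possible realization of the reward predictor and its training on the accumulated query results is sketched below; the layer sizes, optimizer, and number of epochs are assumptions, not settings reported here.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardPredictor(nn.Module):
    # F(z_t, theta) = a_1 sigma(a_0 z_t + b_0) + b_1
    def __init__(self, emb_dim, num_classes, hid_dim=64):
        super().__init__()
        self.fc0 = nn.Linear(emb_dim, hid_dim)
        self.fc1 = nn.Linear(hid_dim, num_classes)

    def forward(self, z):
        return self.fc1(torch.relu(self.fc0(z)))

def fit_reward_predictor(model, embeddings, rewards, epochs=200, lr=1e-3):
    # embeddings: (M, d) target-node embeddings collected over all previous queries.
    # rewards:    (M, C) classification confidences returned by the target model.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.l1_loss(model(embeddings), rewards)   # the l1 objective above
        loss.backward()
        opt.step()
    return model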

4.4. Fake Node Generator

Since the target model is not accessible in the black-box attack, optimizing fake nodes through Equation (6) directly is impossible. In the proposed GCIA method, we propose optimizing fake nodes and fake edges using two accessible surrogate tasks, i.e., first maximizing the change in the target node in the embedding space, and then minimizing the prediction of the reward predictor for the original class. We formulate the adversarial loss function based on these surrogate tasks, and subsequently introduce the generation steps for adversarial nodes and edges. Note that in our attack setting, there are no edges between adversarial nodes, i.e., matrix $B$ in Equation (5) is the identity matrix.

4.4.1. Adversarial Loss

During the $q$-th query, the embedding of the target node is modified from $z_t$ to $z_t'$ by injecting fake nodes with feature matrix $X_I$ and adjacency matrix $A_I$. Intuitively, for a perturbed target node, its embedding $z_t'$ is expected to be different from the initial embedding $z_t$, as well as from the embeddings $z_t^p$ ($p \in (1, q-1)$) that failed in the previous queries. We define $\mathcal{L}_z(z_t', q)$ as the difference loss of $z_t'$:
$$\mathcal{L}_z(z_t', q) = s(z_t', z_t) + \sum_{p \in (1, q-1)} s(z_t', z_t^p),$$
where $s(\cdot, \cdot)$ denotes the cosine similarity.
The reward predictor can be considered a simpler surrogate for the target model, as it learns to mimic query results. Thus, the impact of fake nodes on the target model can be approximated by the loss value of the reward predictor, which is called the reward loss:
$$\mathcal{L}_f(z_t', q) = \mathrm{CW}\big(F(z_t', \theta_q), y_t\big),$$
where $\mathrm{CW}(\cdot, \cdot)$ represents the Carlini and Wagner loss [31], which minimizes the distance between the logit values of the original class and the second-largest class:
$$\mathrm{CW}\big(F(z_t', \theta_q), y_t\big) = \max\Big(\max_{i \neq y_t} F(z_t', \theta_q)_i - F(z_t', \theta_q)_{y_t},\; \kappa\Big),$$
where $F(z_t', \theta_q)_i$ and $F(z_t', \theta_q)_{y_t}$ denote the $i$-th component and the component corresponding to the prediction label, respectively; $y_t$ is the prediction label of the target model, and $\kappa$ is the minimum expected confidence.
We expect the fake nodes to reduce the similarity between $z_t'$ and both the original embedding and the embeddings from failed queries, while also increasing the likelihood of misclassification by the reward predictor. By integrating the above losses, given the feature matrix $X_I$ and the adjacency matrix $A_I$ of the fake nodes, the adversarial loss is formulated as follows:
$$\mathcal{L}_{adv}(A_I, X_I, v_t, q) = \mathcal{L}_f(z_t', q) - \mathcal{L}_z(z_t', q), \quad z_t' = E(A', X')_t, \quad A' = \begin{bmatrix} A & A_I^T \\ A_I & B \end{bmatrix}, \quad X' = \begin{bmatrix} X \\ X_I \end{bmatrix}.$$
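The adversarial loss can then be assembled from the difference loss and the reward loss as sketched below; the helper names are ours, the sign conventions follow our reading of the equations above, and z_t_failed is assumed to collect the perturbed target-node embeddings from previous unsuccessful queries.

import torch
import torch.nn.functional as F

def cw_loss(logits, y_t, kappa=0.0):
    # Margin between the best non-true-class score and the true-class score.
    true = logits[y_t]
    other = torch.cat([logits[:y_t], logits[y_t + 1:]]).max()
    return torch.clamp(other - true, min=kappa)

def adversarial_loss(encoder, predictor, A_pert, X_pert, t,
                     z_t_orig, z_t_failed, y_t, kappa=0.0):
    # L_adv = L_f - L_z for target node index t on the perturbed graph (A_pert, X_pert).
    z_t_new = encoder(A_pert, X_pert)[t]
    # difference loss: similarity to the clean embedding and to failed attempts
    L_z = F.cosine_similarity(z_t_new, z_t_orig, dim=0)
    for z_p in z_t_failed:
        L_z = L_z + F.cosine_similarity(z_t_new, z_p, dim=0)
    # reward loss evaluated on the surrogate reward predictor
    L_f = cw_loss(predictor(z_t_new), y_t, kappa)
    return L_f - L_z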

4.4.2. Fake Feature Generation

The fake node generator optimizes the features of the fake nodes using gradient optimization. We first calculate the statistical information of the known node features and sample vectors from the feature distribution as the initial features. Subsequently, we iteratively calculate the gradient and optimize the features of the fake nodes. In the $k$-th iteration, the gradient can be calculated as follows:
$$g_i^k = \frac{\partial \mathcal{L}_{adv}(A_I, X_I^k, v_t, q)}{\partial x_i^k} = \frac{\partial \mathcal{L}_f(z_t^k, q)}{\partial x_i^k} - \frac{\partial \mathcal{L}_z(z_t^k, q)}{\partial x_i^k}, \quad z_t^k = E(A', X'^k)_t, \quad X'^k = \begin{bmatrix} X \\ X_I^k \end{bmatrix},$$
where $x_i^k \in X_I^k$ is the feature of a fake node in the $k$-th iteration.
We perform different update strategies for fake nodes with discrete and continuous features, respectively:
$$x_i^{k+1} = \begin{cases} x_i^k + \lambda\, g_i^k, & \text{if continuous}, \\ \mathrm{Flip}(x_i^k, \Delta_i^k), & \text{if discrete}, \end{cases}$$
where $\lambda$ is the optimization step, and $\mathrm{Flip}(x_i^k, \Delta_i^k)$ denotes flipping the $\Delta_i^k$-th element of the vector $x_i^k$. We define $\Delta_i^k$ as
$$\Delta_i^k = \operatorname*{argmax}\big((\mathbf{1} - 2 x_i^k) \odot g_i^k\big),$$
where $\odot$ is the Hadamard product. For each fake node, we perform $K$ iterations of gradient optimization and finally obtain the feature matrix of the fake nodes, $X_I^q$.
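One iteration of this feature update could be implemented as follows, with the gradient of the adversarial loss supplied externally (e.g., via automatic differentiation on the sketch above); the step size and the single-bit flip per iteration mirror the equations, while the concrete values of the step size and iteration count are assumptions.

import torch

def update_fake_features(x_i, grad, lam=0.1, discrete=False):
    # x_i: (F,) features of one fake node; grad: (F,) gradient of L_adv w.r.t. x_i.
    if not discrete:
        return x_i + lam * grad                 # gradient ascent step for continuous features
    scores = (1.0 - 2.0 * x_i) * grad           # (1 - 2x) elementwise g: gain of flipping each bit
    idx = torch.argmax(scores)
    x_new = x_i.clone()
    x_new[idx] = 1.0 - x_new[idx]               # Flip(x_i, Delta_i): flip the most beneficial bit
    return x_new

def optimize_fake_node(x_i, grad_fn, K=20, lam=0.1, discrete=False):
    # grad_fn(x) should return dL_adv/dx, e.g. via torch.autograd.grad on the adversarial loss.
    for _ in range(K):
        x_i = update_fake_features(x_i, grad_fn(x_i), lam, discrete)
    return x_i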

4.4.3. Fake Edge Generation

The fake node generator is capable of producing multiple fake nodes, each equipped with several fake edges. When the fake node has only one fake edge, we can connect the fake node to the target node directly to maximize the attack effectiveness. In instances where a fake node has multiple fake edges, we implement an optimization–selection–optimization strategy for the generation of fake edges.
Algorithm 1 outlines the process of generating a single fake node with multiple fake edges during the $q$-th query. We first connect the fake node, initialized with randomly sampled features, to the target node, and optimize the features with the gradient optimization described in Section 4.4.2. Subsequently, we connect the fake node to the neighbors of the target node and compute the adversarial loss for each candidate in sequence. The $\Delta_e - 1$ neighbors with the highest adversarial loss are chosen to form fake edges with the fake node. The adversarial adjacency matrix $A_I$ is then constructed to include these fake edges, as well as the edge between the fake node and the target node. Finally, we optimize the features of the fake node once more to maximize the adversarial loss. By injecting the fake node and its edges into the graph and iteratively executing Algorithm 1, the fake node generator can generate multiple fake nodes with multiple fake edges.
Algorithm 1 Fake edge generation.
Require: graph $G = \{A, X\}$, fake edge number $\Delta_e$, fake node $v_i$, target node $v_t$
Ensure: feature matrix of fake nodes $X_I$, adjacency matrix of fake nodes $A_I$
 1: Initialize the feature of fake node $v_i$ to $x_i^0$
 2: $X_I^0 \leftarrow x_i^0$
 3: $A_I \leftarrow (v_i, v_t)$
 4: $X_I^1 \leftarrow$ optimize the feature matrix according to Section 4.4.2
 5: for $v_{n_j} \in N(v_t)$ do
 6:     $A_I^{n_j} \leftarrow$ add $(v_{n_j}, v_i)$ to $A_I$
 7:     compute $\mathcal{L}_{adv}(A_I^{n_j}, X_I^1, v_t, q)$
 8: end for
 9: $V_{nei} \leftarrow$ choose the $\Delta_e - 1$ neighbors with the highest $\mathcal{L}_{adv}(A_I^{n_j}, X_I^1, v_t, q)$
10: for $v_{n_j} \in V_{nei}$ do
11:     $A_I \leftarrow$ add $(v_{n_j}, v_i)$ to $A_I$
12: end for
13: $X_I \leftarrow$ optimize the feature matrix according to Section 4.4.2
14: return $A_I$, $X_I$
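For illustration, the neighbor scoring and selection steps of Algorithm 1 (lines 5–12) could be realized with dense tensors as follows; loss_fn is assumed to wrap the adversarial loss of Section 4.4.1, and the fake node is taken to be the last row of the perturbed graph.

import torch

def select_fake_edges(A_pert, X_pert, neighbors, delta_e, loss_fn):
    # A_pert, X_pert: graph already containing the fake node (last index) linked to the target.
    # neighbors: indices of the target node's neighbors; delta_e: fake edges per fake node.
    # loss_fn(A, X) -> scalar adversarial loss L_adv for a candidate perturbation.
    fake = A_pert.size(0) - 1
    scores = []
    for n in neighbors:                          # lines 5-8: score each candidate fake edge
        A_try = A_pert.clone()
        A_try[fake, n] = A_try[n, fake] = 1.0
        scores.append(loss_fn(A_try, X_pert))
    order = torch.argsort(torch.stack(scores), descending=True)
    for i in order[:delta_e - 1]:                # lines 9-12: keep the top delta_e - 1 edges
        n = neighbors[int(i)]
        A_pert[fake, n] = A_pert[n, fake] = 1.0
    return A_pert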

5. Experiment

5.1. Settings

Datasets: We conducted our experiments on four acknowledged datasets. Cora [32] and Citeseer [32] are citation datasets with discrete features, Pubmed [32] is a citation dataset with continuous features, and Reddit [33] is a social network dataset with continuous features. For Reddit, we explored the subgraphs and splits shared by G-NIA [19].
Comparison Methods: We compared our work with four black-box graph injection attack methods. The G2A2C [22] method is a query-based GIA method that models the node injection attack as a Markov decision process. TDGIA [21] and G-NIA [19] methods are transfer-based black-box graph injection attacks which can access all data. They first train the surrogate model based on the whole dataset, and then optimize the fake nodes with the surrogate model as the target, and finally transfer the attack to the target model.
We also compared our method with a random attacker which randomly generates features for the fake node and connects it to the target node. In Random-1 attack, the attacker first calculates the mean and variance of the features across all nodes, and subsequently samples a fake node randomly and establishes a direct link to the target node. In Random-10 attack, the attacker attacks the target node 10 times, and once the target node has been misclassified, the attack is considered successful.
Detailed Settings: In our experiments, the attacker is restricted to acquiring the subgraph containing the target node and its two-hop neighbors, and to querying for the class probabilities of the target nodes. To make the attack unnoticeable, for the dataset with discrete features, we forced the mean of the injected features to be consistent with the original graph, and for the dataset with continuous features, we ensured that the injected features did not exceed the upper and lower bounds of the original graph. We relaxed the knowledge limitations of baseline methods due to their different experimental settings. Specifically, for TDGIA and G-NIA, which are transfer-based approaches, we allowed the attackers access to all data, as well as labels of the training set and validation set. In the G2A2C method, which is a generative-based approach, we allowed feature metrics of fake nodes to exceed the dataset’s average. In all attack methods, the proposed GCIA adhered to the most rigorous knowledge constraints.
Number of Fake Nodes: Considering that different attack methods have varying configurations for fake nodes and edges, we conducted comparative experiments under setups with different numbers of fake nodes and edges. The experimental setup is depicted in Table 2. For each target node, $\Delta_v$ represents the number of fake nodes employed to attack that target node, and $\Delta_e$ denotes the number of edges each fake node establishes with the nodes in the graph. The description of each setup is given in the following:
  • Setup 1 uses a single fake node directly connected to the target node, and thus this experimental setup primarily assesses the capability of optimizing the features of the fake node.
  • Setup 2 indicates the employment of multiple fake nodes directly connected to the target node. This experimental setup evaluates the capability in concurrently optimizing the features of multiple fake nodes.
  • Setup 3 uses a single fake node to attack the target node through multiple fake edges. This setup evaluates the capability of generating fake edges.
  • Setup 4 attacks the target node using multiple fake nodes, each containing multiple fake edges. This experimental setup evaluates the attack effectiveness that can be achieved under an ample attack budget.

5.2. Performance Comparison with Setup 1

We compared the proposed GCIA with the baseline methods in Setup 1 and present the results in Table 3 and Table 4. We can see that graph learning models are vulnerable to graph injection attacks even if the fake nodes are generated by random sampling. Despite having higher node degrees, Reddit and PubMed, with continuous features, are more susceptible to fake nodes than the datasets with discrete features. This is due to the fact that the continuous features of a fake node can be more easily manipulated to embed malicious information. Among all the attack methods, our proposed GCIA method achieves the highest misclassification rate in almost all cases. Compared with the state-of-the-art G2A2C method, GCIA improves the misclassification rate by 7.9%, 10.3%, 11.4%, and 22.0% on Cora, Citeseer, Reddit, and PubMed, respectively. It should be noted that the proposed GCIA method has a significantly higher initial query success rate, which demonstrates that the graph contrastive learning model can extract rich graph information. By increasing the changes in the embedding of the target node, the initial fake nodes generated by GCIA sometimes have even stronger attacking capabilities than those produced by the comparison methods.

5.3. Performance Comparison with Setup 2

We have selected the most representative GCN and GAT models as the targets for the adversarial attacks depicted in Figure 2. Additionally, we have chosen the Cora dataset, characterized by discrete features, and the PubMed dataset, with continuous features, for experimental comparisons. Since TDGIA only supports data with continuous features, the experiments on it were conducted solely on the PubMed dataset. In Setup 2, multiple fake nodes are directly connected to the target node. Consequently, during the classification process, the target model aggregates the features from the fake nodes into the representation of the target node, which makes target nodes with a higher number of fake nodes more likely to be misclassified. The experimental results indicate that all attack methods benefit from an increase in the number of fake nodes. With the escalation in the fake node number, the proposed GCIA methods achieve higher misclassification rates and consistently maintain a leading position. This experiment demonstrates that the proposed adversarial loss serves as an effective proxy for misclassification rates. When multiple fake nodes are injected into the graph, GCIA-1 can further enhance the attack performance, indicating that the GCIA method can fully leverage the unsupervised information extracted from the graph contrastive learning model to generate adversarial fake nodes.

5.4. Performance Comparison with Setup 3

Figure 3 shows the experimental results when selecting the GCN and GAT models as target models in Setup 3. The experiments were conducted on both the Cora and PubMed datasets. The results indicate that fake nodes with additional fake edges generally enhance the attack capability, particularly for the G-NIA method. When attacking the GCN model on the Cora dataset, the G-NIA method with $\Delta_e = 3$ achieved the optimal outcome. In contrast, an increase in $\Delta_e$ has a minimal impact on the attack performance of the proposed GCIA method. Nevertheless, the GCIA method secured the best attack performance in most scenarios. Notably, when attacking the GAT model on the PubMed dataset, the GCIA-1 method performed worse as the number of fake edges increased. This may be attributed to the GCIA-1 method’s reliance solely on unsupervised data information, rendering it incapable of selecting threatening fake edges when faced with the attention mechanism of the GAT model.

5.5. Performance Comparison with Setup 4

Figure 4 presents an extensive evaluation of the attack performance of our proposed GCIA method in response to variations in $\Delta_v$ and $\Delta_e$. Among the comparison methods, only the G2A2C method supports attacks with multiple fake nodes and multiple fake edges. Therefore, our comparative analysis focuses solely on the results obtained from the GCIA and G2A2C methods when attacking the GAT model on the Cora dataset. Note that in Figure 4, the data have been proportionately scaled without altering the inherent relative proportions, thereby optimizing the visual representation for clearer elucidation.
From the experimental results, it is evident that the fake node number $\Delta_v$ has a more significant impact on our proposed GCIA method than the fake edge number $\Delta_e$, with the method achieving peak performance at $\Delta_v = 3$, $\Delta_e = 3$. The G2A2C method benefits from both the fake node number and the fake edge number, and surpasses the attack capability of GCIA at $\Delta_v = 3$, $\Delta_e = 3$. Overall, the proposed GCIA method is adept at achieving heightened attack efficacy in scenarios that impose limitations on the attacker’s capabilities.

5.6. Ablation Study

In the proposed GCIA method, the node encoder is a Graph Contrastive Learning (GCL) model with two graph convolutional layers. The reward predictor, which serves as a proxy for the target model, is a two-layer Multilayer Perceptron (MLP) model. To verify the effectiveness of different components in GCIA, we implement them using various architectural configurations and conduct ablation studies.
To assess the impact of the complexity of the reward predictor on the effectiveness of the attack, we have implemented three reward predictors with different model complexities:
  • MLP: The two-layer Multilayer Perceptron (MLP), as demonstrated in the paper, which is the simplest model. Given that the node encoder can extract unsupervised knowledge from the graph to generate high-quality node embeddings, a simple reward predictor may suffice for GCIA.
  • MLP-head: This model utilizes a linear layer to extract features and then constructs a separate head for each target class to predict the classification confidence for that class. This model has moderate complexity. Since the CW loss requires the reward predictor to have a higher imitation ability for the classification confidence of the target model, using the MLP-head may improve the performance of the GCIA method.
  • MLP-attention: This model employs a multi-head attention mechanism to aggregate the features of the target node and its neighboring nodes, ultimately outputting the classification confidence for the node. This model has the highest complexity among the three models. Since the node encoder utilizes two graph convolutional layers to extract the embedding of the target node, which may differ from the feature aggregation process of the target model, incorporating an attention mechanism into the reward predictor may improve the performance of the attack.
In the 10th query, benefiting from the well-trained reward predictor, the fake node generator can generate effective fake nodes. To evaluate the effect of the complexity of the reward predictor on the attack effectiveness, we performed adversarial attacks on the GCN, SAGE, GAT, and APPNP models on the Cora dataset. The misclassification rate was used as the metric to assess the performance of the reward predictor. The experimental results are depicted in Figure 5 and Table 5, from which it is evident that the complexity of the reward predictor has a small impact on the effectiveness of the attack, with the simpler MLP reward predictor achieving the best results. Since the node encoder is capable of mining unsupervised information from the graph to obtain discriminative node embeddings, a simple reward predictor can serve as a proxy for the target model. In contrast, more complex reward predictors are prone to overfitting, which may lead to a decline in attack performance.
We evaluated the efficacy of the node encoder by employing various view generation strategies to test its ability to extract unsupervised knowledge from the graph data. In the initial query of a black-box attack using GCIA, there is no access to any information about the target model, and the attack performance depends on the feature extraction capabilities of the node encoder. Consequently, we used the misclassification rate of GCIA-1 as the performance metric for the node encoder and tested it on the GCN, SAGE, GAT, and APPNP models. Table 6 shows the misclassification rate of the proposed GCIA method during the initial query. It is evident that the combination of feature masking and edge removal achieves the most effective attack outcome.

5.7. Case Study

To further investigate how GCIA optimizes fake nodes using the embeddings from the node encoder, we visualized an attack on the GCN model on the Cora dataset, which failed on the first query and succeeded on the second query. Figure 5 shows the T-SNE [34] visualization results of the node embeddings extracted from the node encoder and the hidden embeddings from the GCN model, respectively. In the figure, circles of different colors represent the embeddings of nodes belonging to different categories. The squares denote the original embedding of the target node, the triangles indicate the embedding of the target node during the first attack, and the stars indicate the embedding of the target node in the second attack. The squares and triangles share the same color, indicating that the predicted category of the target node remains unchanged during the first attack. The stars have a different color, indicating that the target node is misclassified in the second attack.
We can see that target nodes of different categories are effectively separated in the embedding space produced by our node encoder. By optimizing the fake nodes, the embedding of the target node in the node encoder undergoes a significant change, consequently altering the embedding in the GCN model. It can be observed that when the embedding of the target node changes in the node encoder, its embedding in the target model also changes, ensuring the effectiveness of GCIA. Comparing attack 1 and attack 2 in Figure 5, although the embedding of the target node changed more significantly in attack 1, this change does not lead to misclassification of the target node. Therefore, in the second attack, the GCIA method learned from the failure, making the embedding of the target node different from the embeddings in both the original and attack 1, thus successfully executing the attack.

6. Conclusions

In this work, we investigate graph injection attacks in a black-box setting. We propose the Graph Contrastive Learning-based Graph Injection Attack (GCIA) method to fully exploit the unsupervised knowledge contained in graph data. This approach comprises three components: a node encoder, a reward predictor, and a fake node generator. Specifically, the node encoder is a graph contrastive learning model that extracts embedded representations of nodes from existing data. The reward predictor can be regarded as a simplified surrogate for the target model, with inputs being the embedded nodes and outputs being the classification confidence of nodes. The fake node generator effectively creates fake nodes by maximizing the embedding changes of target nodes in the node encoder while minimizing the confidence of the original class of target nodes in the reward predictor. When attackers construct multiple fake edges for a fake node, the GCIA method iteratively establishes appropriate fake edges for fake nodes through an optimization–selection–optimization strategy. Compared to other query-based attack methods, GCIA is capable of leveraging graph contrastive learning models to extract unsupervised knowledge from graphs, achieving a high initial misclassification rate. On the GCN model, the GCIA method achieved misclassification rates of 26.4%, 39.0%, 99.1%, and 68.8% in the first query on the Cora, Citeseer, Reddit, and PubMed datasets, respectively. After several queries, the GCIA method outperformed the state-of-the-art methods in most cases.
Compared to our preliminary work [17], this work introduces a fake edge generation algorithm, enabling attacks with multiple fake nodes and multiple fake edges. Extensive comparative and ablation experiments have demonstrated the effectiveness of the proposed GCIA method.
The GCIA method demonstrates its effectiveness for graph injection attacks, and it also points to promising directions for future work. From the perspective of an attacker, early queries could focus on probing the classification boundaries of the target model, thereby generating adversarial samples with higher attack capabilities in later queries. From the defender’s perspective, detecting adversarial samples during the attacker’s queries is of paramount importance in a black-box setting, especially when the distribution of fake nodes is similar to that of normal nodes.

Author Contributions

Conceptualization, X.L. and W.Z.; methodology, J.H.; software, X.L.; validation, Z.C., Y.P. and M.X.; formal analysis, J.H.; investigation, W.Z.; resources, Z.C.; data curation, Y.P.; writing—original draft preparation, X.L.; writing—review and editing, J.H. and W.Z.; visualization, M.X.; supervision, W.Z.; project administration, J.H.; funding acquisition, W.Z. and J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the NUDT Innovation Science Foundation, number 23-ZZCX-JDZ-08, the NUDT Research Project, number ZK22-56, and the National Natural Science Foundation of China, number 62201600.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The “Cora”, “CiteSeer” and “PubMed” datasets are openly available at https://github.com/kimiyoung/planetoid/raw/master/data (accessed on 1 August 2024), reference number [8]. The “Reddit” dataset is openly available in https://github.com/TaoShuchang/G-NIA (accessed on 1 August 2024), reference number [19].

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
GNN	Graph neural network
GCL	Graph contrastive learning
GIA	Graph injection attack
GMA	Graph modification attack
GCN	Graph convolutional network
SAGE	Graph sample and aggregate
GAT	Graph attention network
APPNP	Approximate personalized propagation of neural predictions
MLP	Multi-layer perceptron
GCIA	Graph contrastive learning-based graph injection attack
MSR	Misclassification rate

References

  1. Izadi, M.R.; Fang, Y.; Stevenson, R.; Lin, L. Optimization of graph neural networks with natural gradient descent. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 10–13 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 171–179. [Google Scholar]
  2. Wang, X.; He, X.; Wang, M.; Feng, F.; Chua, T.S. Neural graph collaborative filtering. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Paris, France, 21–25 July 2019; pp. 165–174. [Google Scholar]
  3. Zhang, M.; Cui, Z.; Neumann, M.; Chen, Y. An end-to-end deep learning architecture for graph classification. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; AAAI Press: Washington, DC, USA, 2018. AAAI’18/IAAI’18/EAAI’18. [Google Scholar]
  4. Peng, X.; Zhou, Z.; Zhang, C.; Xu, K. Online Social Behavior Enhanced Detection of Political Stances in Tweets. In Proceedings of the International AAAI Conference on Web and Social Media, New York, NY, USA, 3–6 June 2024; Volume 18, pp. 1207–1219. [Google Scholar]
  5. Benslimane, S.; Azé, J.; Bringay, S.; Servajean, M.; Mollevi, C. A text and GNN based controversy detection method on social media. World Wide Web 2023, 26, 799–825. [Google Scholar] [CrossRef]
  6. Duraisamy, P.; Parvathy, K.; Niranjani, V.; Natarajan, Y. Improved Recommender System for Kid’s Hobby Prediction using different Machine Learning Techniques. In Proceedings of the 2023 Fifth International Conference on Electrical, Computer and Communication Technologies (ICECCT), Erode, India, 22–24 February 2023; pp. 1–6. [Google Scholar] [CrossRef]
  7. Yang, Y.; Yang, R.; Li, Y.; Cui, K.; Yang, Z.; Wang, Y.; Xu, J.; Xie, H. RoSGAS: Adaptive Social Bot Detection with Reinforced Self-supervised GNN Architecture Search. ACM Trans. Web 2023, 17, 1–31. [Google Scholar] [CrossRef]
  8. Yang, Z.; Cohen, W.; Salakhudinov, R. Revisiting Semi-Supervised Learning with Graph Embeddings. In Proceedings of Machine Learning Research; Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; Balcan, M.F., Weinberger, K.Q., Eds.; PMLR: Cambridge, MA, USA, 2016; Volume 48, pp. 40–48. [Google Scholar]
  9. Zhou, F.; Cao, C.; Zhang, K.; Trajcevski, G.; Zhong, T.; Geng, J. Meta-GNN: On few-shot node classification in graph meta-learning. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, Beijing, China, 3–7 November 2019; pp. 2357–2360. [Google Scholar]
  10. Xu, K.; Chen, H.; Liu, S.; Chen, P.Y.; Weng, T.W.; Hong, M.; Lin, X. Topology attack and defense for graph neural networks: An optimization perspective. In Proceedings of the 28th International Joint Conference on Artificial Intelligence, Macao, China, 10–16 August 2019; pp. 3961–3967. [Google Scholar]
  11. Lin, X.; Zhou, C.; Wu, J.; Yang, H.; Wang, H.; Cao, Y.; Wang, B. Exploratory Adversarial Attacks on Graph Neural Networks for Semi-Supervised Node Classification. Pattern Recognit. 2023, 133, 109042. [Google Scholar] [CrossRef]
  12. Wang, J.; Luo, M.; Suya, F.; Li, J.; Yang, Z.; Zheng, Q. Scalable attack on graph data by injecting vicious nodes. Data Min. Knowl. Discov. 2020, 34, 1363–1389. [Google Scholar] [CrossRef]
  13. Fang, J.; Wen, H.; Wu, J.; Xuan, Q.; Zheng, Z.; Tse, C.K. GANI: Global Attacks on Graph Neural Networks via Imperceptible Node Injections. arXiv 2022, arXiv:2210.12598. [Google Scholar] [CrossRef]
  14. Wu, X.G.; Wu, H.J.; Zhou, X.; Zhao, X.; Lu, K. Towards Defense Against Adversarial Attacks on Graph Neural Networks via Calibrated Co-Training. J. Comput. Sci. Technol. 2022, 37, 1161–1175. [Google Scholar] [CrossRef]
  15. Liu, Y.; Yang, X.; Zhou, S.; Liu, X.; Wang, S.; Liang, K.; Tu, W.; Li, L. Simple contrastive graph clustering. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 13789–13800. [Google Scholar] [CrossRef] [PubMed]
  16. Zhu, Y.; Xu, Y.; Yu, F.; Liu, Q.; Wu, S.; Wang, L. Deep Graph Contrastive Representation Learning. arXiv 2020, arXiv:2006.04131. [Google Scholar]
  17. Liu, X.; Huang, J.J.; Zhao, W. GCIA: A Black-Box Graph Injection Attack Method Via Graph Contrastive Learning. In Proceedings of the ICASSP 2024—2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 6570–6574. [Google Scholar]
  18. Sun, Y.; Wang, S.; Tang, X.; Hsieh, T.Y.; Honavar, V. Non-target-specific node injection attacks on graph neural networks: A hierarchical reinforcement learning approach. In Proceedings of the Web Conference (WWW), Taipei, Taiwan, 20–24 April 2020; Volume 3. [Google Scholar]
  19. Tao, S.; Cao, Q.; Shen, H.; Huang, J.; Wu, Y.; Cheng, X. Single node injection attack against graph neural networks. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Queensland, Australia, 1–5 November 2021; pp. 1794–1803. [Google Scholar]
  20. Chen, Y.; Yang, H.; Zhang, Y.; Ma, K.; Liu, T.; Han, B.; Cheng, J. Understanding and improving graph injection attack by promoting unnoticeability. arXiv 2022, arXiv:2202.08057. [Google Scholar]
  21. Zou, X.; Zheng, Q.; Dong, Y.; Guan, X.; Kharlamov, E.; Lu, J.; Tang, J. Tdgia: Effective injection attacks on graph neural networks. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Singapore, 14–18 August 2021; pp. 2461–2471. [Google Scholar]
  22. Ju, M.; Fan, Y.; Zhang, C.; Ye, Y. Let Graph be the Go Board: Gradient-free Node Injection Attack for Graph Neural Networks via Reinforcement Learning. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023. [Google Scholar]
  23. Wang, Z.; Hao, Z.; Wang, Z.; Su, H.; Zhu, J. Cluster Attack: Query-based Adversarial Attacks on Graph with Graph-Dependent Priors. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, Vienna, Austria, 23–29 July 2022; Volume 7, pp. 768–775, IJCAI-22. [Google Scholar] [CrossRef]
  24. Zeng, J.; Xie, P. Contrastive self-supervised learning for graph classification. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 2–9 February 2021; Volume 35, pp. 10824–10832. [Google Scholar]
  25. You, Y.; Chen, T.; Sui, Y.; Chen, T.; Wang, Z.; Shen, Y. Graph contrastive learning with augmentations. Adv. Neural Inf. Process. Syst. 2020, 33, 5812–5823. [Google Scholar]
  26. Zhu, Y.; Xu, Y.; Yu, F.; Liu, Q.; Wu, S.; Wang, L. Graph contrastive learning with adaptive augmentation. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 2069–2080. [Google Scholar]
  27. Zhang, H.; Chen, J.; Lin, L.; Jia, J.; Wu, D. Graph contrastive backdoor attacks. In Proceedings of the International Conference on Machine Learning, PMLR, Honolulu, HI, USA, 23–29 July 2023; pp. 40888–40910. [Google Scholar]
  28. Zhang, S.; Chen, H.; Sun, X.; Li, Y.; Xu, G. Unsupervised graph poisoning attack via contrastive loss back-propagation. In Proceedings of the ACM Web Conference 2022, Lyon, France, 25–29 April 2022; pp. 1322–1330. [Google Scholar]
  29. Li, Q.; Wang, Z.; Li, Z. PAGCL: An unsupervised graph poisoned attack for graph contrastive learning model. Future Gener. Comput. Syst. 2023, 149, 240–249. [Google Scholar] [CrossRef]
  30. Lai, C.I. Contrastive Predictive Coding Based Feature for Automatic Speaker Verification. arXiv 2019, arXiv:1904.01575. [Google Scholar]
  31. Carlini, N.; Wagner, D. Towards evaluating the robustness of neural networks. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (sp), San Jose, CA, USA, 22–26 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 39–57. [Google Scholar]
  32. Sen, P.; Namata, G.; Bilgic, M.; Getoor, L.; Galligher, B.; Eliassi-Rad, T. Collective classification in network data. AI Mag. 2008, 29, 93. [Google Scholar] [CrossRef]
  33. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 2017, 30, 1025–1035. [Google Scholar]
  34. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
Figure 1. Flow chart of a single query in the Graph Contrastive Learning-based Graph Injection Attack (GCIA) method. The node encoder is trained on the graph data in an unsupervised manner using graph contrastive learning. Before querying, GCIA optimizes the fake nodes by (a) maximizing the change in the embedding representation of the target node and (b) minimizing the reward predictor's confidence in the target node's initial category. After querying, GCIA retrains the reward predictor on the query results so that its output stays well aligned with the target model. In the next iteration, GCIA generates new fake nodes based on the updated reward predictor.
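The per-query procedure summarized in Figure 1 can be made concrete with a short sketch. The code below is purely illustrative: the callable names (encode, reward, query_target), the use of dense matrices, and details such as the optimizer, step count, and loss weighting are our assumptions rather than the authors' implementation; it only instantiates the two optimization objectives (a) and (b) followed by a single black-box query.

import torch

def gcia_query(X, A, target, encode, reward, query_target, n_fake=1, steps=50, lr=0.1):
    # Illustrative sketch of one GCIA query; names and hyperparameters are assumptions.
    # X: (N, F) node features, A: (N, N) dense adjacency, target: index of the target node.
    N, F = X.shape
    X_fake = torch.randn(n_fake, F, requires_grad=True)   # fake-node features to optimize

    # Here every fake node is wired directly to the target node (as in Setup 1).
    A_inj = torch.zeros(n_fake, N)
    A_inj[:, target] = 1.0

    with torch.no_grad():
        z0 = encode(X, A)[target]          # clean embedding of the target node
        y0 = reward(z0).argmax()           # category predicted before the attack

    opt = torch.optim.Adam([X_fake], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Assemble the adversarial graph G' = {A', X'} by appending the fake nodes.
        X_adv = torch.cat([X, X_fake], dim=0)
        A_adv = torch.cat([torch.cat([A, A_inj.t()], dim=1),
                           torch.cat([A_inj, torch.zeros(n_fake, n_fake)], dim=1)], dim=0)
        z = encode(X_adv, A_adv)[target]
        loss_z = -torch.norm(z - z0)                   # (a) push the target embedding away
        loss_f = reward(z).log_softmax(dim=-1)[y0]     # (b) lower confidence in the initial class
        (loss_z + loss_f).backward()
        opt.step()

    # Single query to the black-box target model with the perturbed graph.
    return query_target(X_adv.detach(), A_adv.detach())

Because the optimization runs entirely against the unsupervised node encoder and the local reward predictor, only the final perturbed graph is submitted to the target model, which is what allows an attack attempt on the very first query.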
Figure 2. Misclassification rates of various attack methods in Setup 2. The horizontal axis lists the attack methods with different fake node numbers, while the vertical axis shows the associated misclassification rates. The blue, orange, and green bars denote the conditions in which the fake node number Δ v is 1, 2, or 3, respectively, with each fake node directly connected to the target node. (a) GCN model on the Cora dataset; (b) GCN model on the PubMed dataset; (c) GAT model on the Cora dataset; (d) GAT model on the PubMed dataset.
Figure 3. Misclassification rates of various attack methods in Setup 3. The horizontal axis lists the attack methods with different fake edge numbers, while the vertical axis shows the associated misclassification rates. The blue, orange, and green bars denote the conditions in which the fake edge number Δ e is 1, 2, or 3, respectively. (a) GCN model on the Cora dataset; (b) GCN model on the PubMed dataset; (c) GAT model on the Cora dataset; (d) GAT model on the PubMed dataset.
Figure 4. Misclassification rates of various attack methods in Setup 4. The experiments are conducted on the GAT model with the Cora dataset. The horizontal axis gives the number of fake nodes, while the vertical axis gives the number of fake edges per fake node. Orange circles denote the GCIA method and blue circles denote the G2A2C method. The radius of each circle is proportional to the misclassification rate induced by the attack, scaled for clearer visual presentation.
Figure 5. Visualization of the node embeddings from the node encoder and the GCN model. Circles denote the nodes of the graph, squares denote the target node, the triangle denotes the target node in the 1st query, stars denote the target node in the 2nd query, and color indicates node labels. (a) Node embeddings from the node encoder; (b) node embeddings from the GCN model.
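Projections such as those in Figure 5 are typically obtained by reducing the learned embeddings to two dimensions with t-SNE [34]. The sketch below shows one way to produce such a plot; the function and variable names are ours, and scikit-learn and matplotlib are assumed to be available.

from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_embeddings(Z, labels, target_idx, out_path="embeddings.png"):
    # Z: (N, d) array of node embeddings; labels: (N,) integer class labels.
    coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(Z)
    plt.figure(figsize=(5, 5))
    plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=8)   # ordinary nodes (circles)
    plt.scatter(coords[target_idx, 0], coords[target_idx, 1],
                c="black", marker="s", s=80)                               # target node (square)
    plt.savefig(out_path, dpi=300)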
Table 1. The definitions or descriptions of notations.
Notation | Description
G = {A, X} | Graph data
A | Adjacency matrix
X | Node attribute matrix
Z | Node embedding matrix
N | Number of nodes
F | Number of features
V | Set of nodes
E | Set of edges
Y | Labels of nodes
v | A node
y | Class label of a node
x_i | Feature vector of node v_i
z_i | Embedding of node v_i
r_i | Query result of node v_i
g | Gradient value
λ | Optimization step
θ | Model parameters
Δ n | Number of fake nodes
Δ e | Number of fake edges for each fake node
A_I | Adjacency matrix that represents edges between fake nodes and original nodes
X_I | Feature matrix of fake nodes
A′ | Adversarial adjacency matrix
X′ | Adversarial feature matrix
G′ = {A′, X′} | Adversarial graph
L_r | Loss function in the reward predictor
L_z | Difference loss
L_f | Reward loss
L_adv | Adversarial loss
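For concreteness, the injection-related quantities above combine into the adversarial graph through a block construction. The display below is our own formalization of that relationship; in particular, the convention that the Δ n fake nodes index the rows of A_I is an assumption of this sketch.

\[
A' = \begin{pmatrix} A & A_I^{\top} \\ A_I & 0 \end{pmatrix}, \qquad
X' = \begin{pmatrix} X \\ X_I \end{pmatrix}, \qquad
G' = \{A', X'\},
\]
where $A_I \in \{0,1\}^{\Delta n \times N}$ records the edges between the $\Delta n$ fake nodes and the $N$ original nodes, and $X_I \in \mathbb{R}^{\Delta n \times F}$ holds the fake-node features.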
Table 2. The number of fake nodes and fake edges used in the experiments, as well as the comparative methods under each experimental setup. Δ v denotes the number of fake nodes, and Δ e represents the number of fake edges for each fake node.
Setup | Δ v | Δ e | Compared Methods
1 | 1 | 1 | G-NIA, TDGIA, G2A2C
2 | 2, 3 | 1 | TDGIA, G2A2C
3 | 1 | 2, 3 | G-NIA, G2A2C
4 | 2, 3 | 2, 3 | G2A2C
Table 3. Misclassification rates on the Cora and Citeseer datasets under the different attack methods in Setup 1. The number following a method name indicates the number of queries (the best scores are in bold and the second-best scores are underlined).
Attack Methods | Cora GCN | Cora SAGE | Cora GAT | Cora APPNP | Citeseer GCN | Citeseer SAGE | Citeseer GAT | Citeseer APPNP
Clean | 0.177 | 0.21 | 0.187 | 0.154 | 0.298 | 0.329 | 0.306 | 0.291
Random-1 | 0.197 | 0.207 | 0.188 | 0.165 | 0.301 | 0.35 | 0.315 | 0.304
Random-10 | 0.251 | 0.267 | 0.241 | 0.207 | 0.363 | 0.44 | 0.374 | 0.345
G-NIA [19] | 0.423 | 0.37 | 0.396 | 0.309 | 0.648 | 0.537 | 0.580 | 0.483
TDGIA [21]
G2A2C [22] | 0.261 | 0.355 | 0.412 | 0.293 | 0.394 | 0.559 | 0.552 | 0.454
GCIA-1 (ours) | 0.264 | 0.35 | 0.322 | 0.193 | 0.408 | 0.517 | 0.394 | 0.357
GCIA-10 (ours) | 0.490 | 0.534 | 0.439 | 0.299 | 0.636 | 0.703 | 0.581 | 0.484
Table 4. Misclassification rates on the Reddit and PubMed datasets under the different attack methods in Setup 1. The number following a method name indicates the number of queries (the best scores are in bold and the second-best scores are underlined).
Attack Methods | Reddit GCN | Reddit SAGE | Reddit GAT | Reddit APPNP | PubMed GCN | PubMed SAGE | PubMed GAT | PubMed APPNP
Clean | 0.162 | 0.154 | 0.157 | 0.199 | 0.208 | 0.272 | 0.258 | 0.194
Random-1 | 0.195 | 0.181 | 0.169 | 0.228 | 0.221 | 0.253 | 0.266 | 0.197
Random-10 | 0.306 | 0.249 | 0.229 | 0.296 | 0.381 | 0.316 | 0.33 | 0.265
G-NIA [19] | 0.848 | 0.648 | 0.631 | 0.675 | 0.92 | 0.704 | 0.646 | 0.71
TDGIA [21] | 0.939 | 0.883 | 0.714 | 0.768 | 0.883 | 0.871 | 0.662 | 0.857
G2A2C [22] | 0.971 | 0.941 | 0.921 | 0.699 | 0.652 | 0.569 | 0.589 | 0.409
GCIA-1 (ours) | 0.991 | 0.97 | 0.706 | 0.983 | 0.688 | 0.581 | 0.652 | 0.534
GCIA-10 (ours) | 1 | 1 | 0.992 | 0.999 | 0.911 | 0.844 | 0.938 | 0.808
Table 5. Misclassification rates on Cora for the GCIA-10 method with different reward predictors.
Reward Predictor | GCN | SAGE | GAT | APPNP | Avg. Rank
MLP | 0.490 | 0.561 | 0.448 | 0.315 | 1.75
MLP-head | 0.469 | 0.533 | 0.454 | 0.317 | 2.00
MLP-attention | 0.474 | 0.53 | 0.461 | 0.309 | 2.25
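All three reward-predictor variants in Table 5 map a node embedding to class scores that imitate the target model's feedback. As a point of reference, a plain-MLP version might look like the sketch below; the hidden width and layer count are assumptions, and the head and attention variants are not reproduced here.

import torch.nn as nn

class MLPRewardPredictor(nn.Module):
    # Minimal sketch: maps a d-dimensional node embedding to class logits.
    def __init__(self, embed_dim, n_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, z):
        return self.net(z)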
Table 6. Misclassification rate of the GCIA-1 method with different augmentation methodologies on Cora, Citeseer, Reddit, and PubMed.
Data Set | Model | Feature Masking | Edge Removing | Both
Cora | GCN | 0.271 | 0.265 | 0.262
Cora | SAGE | 0.321 | 0.296 | 0.329
Cora | GAT | 0.288 | 0.275 | 0.299
Cora | APPNP | 0.198 | 0.199 | 0.208
Citeseer | GCN | 0.38 | 0.395 | 0.407
Citeseer | SAGE | 0.513 | 0.504 | 0.513
Citeseer | GAT | 0.444 | 0.444 | 0.454
Citeseer | APPNP | 0.359 | 0.355 | 0.355
Reddit | GCN | 0.966 | 0.984 | 0.971
Reddit | SAGE | 0.933 | 0.967 | 0.932
Reddit | GAT | 0.866 | 0.715 | 0.801
Reddit | APPNP | 0.905 | 0.937 | 0.933
PubMed | GCN | 0.642 | 0.547 | 0.701
PubMed | SAGE | 0.571 | 0.439 | 0.586
PubMed | GAT | 0.694 | 0.453 | 0.453
PubMed | APPNP | 0.459 | 0.382 | 0.501
Avg. Rank | | 2.000 | 2.344 | 1.656
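The two augmentations compared in Table 6 are the standard graph-contrastive views used to train the node encoder [16,25,26]. The sketch below gives minimal dense-matrix versions of both; the masking/removal probability p and the function names are our assumptions, following common graph contrastive learning practice rather than the authors' exact implementation.

import torch

def feature_masking(X, p=0.2):
    # Zero out each feature dimension with probability p (mask shared across nodes).
    mask = (torch.rand(X.size(1)) > p).float()
    return X * mask

def edge_removing(A, p=0.2):
    # Drop each undirected edge with probability p; the diagonal (self-loops) is kept.
    keep = (torch.rand_like(A) > p).float().triu(diagonal=1)   # one draw per node pair
    keep = keep + keep.t() + torch.eye(A.size(0))
    return A * keep

def both(X, A, p=0.2):
    # Apply both augmentations to obtain one corrupted view of the graph.
    return feature_masking(X, p), edge_removing(A, p)

During contrastive training, two such views are encoded and the loss pulls the two embeddings of the same node together while pushing apart those of different nodes.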