This experiment was performed on both complete and random graphs. For random graphs, we followed the method in reference [13] and randomly extracted 500 to 2000 nodes and their edges from the larger Ca-CondMat dataset to construct four random graph datasets for further experimental comparison with the GPPS scheme [16]. These datasets are denoted as CA-CM1, CA-CM2, CA-CM3, and CA-CM4, and contain 500, 1000, 1500, and 2000 nodes, respectively, together with their corresponding edges.
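The extraction step described above can be sketched as follows. This is a minimal illustration only: it assumes that "nodes and their edges" means the subgraph induced by a uniformly random node sample, which may differ from the exact procedure in [13]. The function name `sample_induced_subgraph` and the toy edge list are hypothetical stand-ins, not part of the original experiments.

```python
import random

def sample_induced_subgraph(edges, n_nodes, seed=0):
    """Pick n_nodes nodes uniformly at random and keep only edges between them.

    Assumption: the sampled dataset is the induced subgraph of the node sample.
    """
    rng = random.Random(seed)
    nodes = sorted({u for edge in edges for u in edge})
    chosen = set(rng.sample(nodes, n_nodes))
    kept = [(u, v) for u, v in edges if u in chosen and v in chosen]
    return chosen, kept

# Toy stand-in for the Ca-CondMat edge list (hypothetical data, not the real dataset).
toy_edges = [(i, (i + 1) % 50) for i in range(50)] + [(i, (i + 7) % 50) for i in range(50)]

# The paper samples 500, 1000, 1500, and 2000 nodes; smaller sizes are used here.
for size in (10, 20, 30, 40):
    nodes, sub_edges = sample_induced_subgraph(toy_edges, size, seed=size)
    print(size, len(nodes), len(sub_edges))
```

Fixing the seed makes each sampled dataset reproducible, which matters when the same subgraph must be anonymized by several schemes for comparison.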
Complete graphs
Figure 7a shows the effect of different schemes on the IL in the soc-wiki-Vote dataset. It is evident that as
decreases, the IL of TC
-PIA increases, indicating fewer modifications to the original graph and better preservation of the original data. In addition, for the same
, IL decreases as k increases, suggesting that higher privacy levels result in lower utility of the graph. The lower IL at
compared to
is due to dataset inhomogeneities causing experimental fluctuations. Compared to GPPS, TC
-PIA(
), TC
-PIA(
), and TC
-PIA(
) perform better, as the proposed scheme effectively limits the addition of fake nodes and edges, reducing their impact on the original graph. Unlike TC
-PIA, TC
-PIA
second considers the different
requirements of all users. Since not all users require weak privacy, TC
-PIA
second results in slightly lower IL compared to TC
-PIA(
). However, it outperforms TC
-PIA(
), TC
-PIA(
), and GPPS by considering different privacy requirements, rather than applying a uniform
. This reduces unnecessary modifications and makes anonymization more efficient, preserving more of the original graph structure.
Figure 7b,c show the results for the Email-Eu-core and Facebook datasets, respectively. In Figure 7b, TC
-PIA(
) significantly outperforms GPPS. For
, the IL of TC
-PIA(
) is comparable to that of GPPS, while TC
-PIA(
) performs slightly worse. The analysis for TC
-PIA
second is consistent with the findings in Figure 7a.
Overall, TC-PIA performs best on the soc-wiki-Vote dataset, where all four TC-PIA variants outperform GPPS under their different parameters, because the scheme is better suited to uniformly distributed, moderately sized graph networks.
Figure 8a shows the effect of different schemes on ACC in the soc-wiki-Vote dataset. TC-PIA consistently has a lower rate of change in ACC than GPPS, indicating better performance. This is because TC-PIA considers the structure and number of triangles during structural anonymization, which helps to preserve more of the original edges and structures. As k increases, the impact of each scheme on the ACC gradually increases, i.e., the difference between the anonymized graph and the original graph grows, because higher anonymity levels lead to more changes in the graph. In the TC
-PIA scheme, a smaller
results in less disruption to the original graph structure, as indicated by a lower effect on the ACC. The performance of TC
-PIA
second follows a similar trend but outperforms TC
-PIA(
), TC
-PIA(
), and GPPS due to its more personalized approach to user privacy requirements.
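The ACC comparison can be illustrated with a small sketch. It assumes that ACC denotes the average clustering coefficient and that the "rate of change" is the relative difference |ACC' - ACC| / ACC between the anonymized and original graphs; neither definition is spelled out in this section, so both are assumptions. The helper names and the toy graphs are hypothetical.

```python
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list (self-loops ignored)."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges among the neighbours, i.e. triangles through this node.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0

def rate_of_change(original, anonymized):
    """Assumed metric: relative change with respect to the original value."""
    return abs(anonymized - original) / original

# Toy graphs: a triangle with one pendant node, then one hypothetical fake edge.
original_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
anonymized_edges = original_edges + [(2, 4)]

acc_before = average_clustering(build_adjacency(original_edges))
acc_after = average_clustering(build_adjacency(anonymized_edges))
print(acc_before, acc_after, rate_of_change(acc_before, acc_after))
```

In this toy case a single added edge closes a second triangle and shifts the coefficient noticeably, which is consistent with the observation that schemes adding fewer edges disturb ACC less.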
Figure 8b shows the experimental results for the Email-Eu-core dataset. Under the TC
-PIA scheme, the change in ACC decreases as
decreases, mirroring the trend observed for the soc-wiki-Vote dataset in Figure 8a. Therefore, no further details are needed. Compared to GPPS, TC
-PIA(
) consistently shows a lower change in ACC for all k values. At
, TC
-PIA(
) exhibits slightly higher fluctuations than GPPS, but remains lower for other k values. This fluctuation is due to the non-uniformity of the dataset, but overall TC
-PIA(
) outperforms GPPS. For
, TC
-PIA(
) shows a higher change in ACC compared to GPPS, due to the formation of additional triangles for a stable structure. However, for
, TC
-PIA(
) gradually outperforms GPPS.
Figure 8c compares the TC-PIA and GPPS schemes on the Facebook dataset. The analysis is similar to that of Figure 8a,b.
Figure 9a shows the APL changes for the soc-wiki-Vote dataset. For all k values, the APL changes for TC
-PIA(
), TC
-PIA(
), TC
-PIA(
), and TC
-PIA
second are consistently lower than those for GPPS. This is because TC-PIA considers edges that participate more frequently in the shortest paths during anonymization, thus preserving more connection paths between nodes in the original graph.
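To make the APL comparison concrete, the following sketch computes the change on a toy graph. It assumes that APL denotes the average shortest-path length over reachable node pairs and that the reported change is relative to the original graph; both are assumptions, since this section does not define them. The function names and the example graphs are hypothetical.

```python
from collections import deque

def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0

def apl_change(original_adj, anonymized_adj):
    """Assumed metric: relative APL change with respect to the original graph."""
    before = average_path_length(original_adj)
    after = average_path_length(anonymized_adj)
    return abs(after - before) / before

# Toy graphs: a 4-node path, then the same path closed into a cycle by one fake edge.
path_graph = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
cycle_graph = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}

print(average_path_length(path_graph), average_path_length(cycle_graph))
print(apl_change(path_graph, cycle_graph))
```

The added edge here acts as a shortcut and shortens the average path, so an anonymization step that avoids touching edges on many shortest paths would be expected to keep this value closer to the original.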
Figure 9b shows the APL changes for the Email-Eu-core dataset. TC
-PIA(
) and TC
-PIA(
) exhibit significantly lower APL changes compared to GPPS. For
, the results for TC
-PIA(
) are similar to those of GPPS. For
, the APL change for TC
-PIA(
) is slightly higher than GPPS. Additionally, as
decreases, the APL impact of TC
-PIA decreases accordingly.
Figure 9c shows the APL changes for the Facebook dataset. The results are similar to those in Figure 9a,b, so no further details are provided.
Figure 10a shows the error rate of EC on the soc-wiki-Vote dataset. All three TC-PIA schemes outperform GPPS on this metric, as TC-PIA better preserves graph structural features. Additionally, as
decreases, the EC error rate for TC
-PIA decreases significantly.
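One way such an error rate could be computed is sketched below. This rests on two loud assumptions: that EC denotes eigenvector centrality (the acronym is not expanded in this section), and that the error rate is the total absolute centrality change over the original nodes divided by the original total. The power iteration with an identity shift is a standard technique swapped in for illustration; it is not claimed to be the procedure used in the experiments, and all names are hypothetical.

```python
import math

def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Power iteration on (A + I); the shift avoids oscillation on bipartite graphs.

    Expects a non-empty adjacency dict of neighbour sets.
    """
    nodes = list(adj)
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: x[n] + sum(x[m] for m in adj[n]) for n in nodes}
        norm = math.sqrt(sum(v * v for v in nxt.values()))
        nxt = {n: v / norm for n, v in nxt.items()}
        if sum(abs(nxt[n] - x[n]) for n in nodes) < tol:
            return nxt
        x = nxt
    return x

def centrality_error_rate(original, anonymized):
    """Assumed metric: summed absolute change over original nodes, relative to the original sum."""
    diff = sum(abs(anonymized.get(n, 0.0) - c) for n, c in original.items())
    return diff / sum(original.values())

# Toy graphs: a star with centre 0, then one hypothetical fake edge between two leaves.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
star_plus_edge = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}

ec_before = eigenvector_centrality(star)
ec_after = eigenvector_centrality(star_plus_edge)
print(centrality_error_rate(ec_before, ec_after))
```

Restricting the sum to original nodes means fake nodes affect the result only through the changes they induce on real nodes, which is one possible design choice among several.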
Figure 10b shows the error rate of EC on the Email-Eu-core dataset. TC
-PIA(
) has a lower error rate than GPPS. For
, TC
-PIA(
) shows a slightly higher error rate than GPPS, but for
, it performs better. This is because TC-PIA uses triangle-structure-based similarity to minimize the impact on original nodes and controls the addition of fake nodes and edges, preserving more of the original graph’s structure. TC
-PIA(
) performs slightly worse than GPPS due to additional matrix elements that increase graph modifications.
Figure 10c shows the error rate of EC on the Facebook dataset. The experimental results are similar to those shown in Figure 10a,b, and no further elaboration is needed.
To clarify the impact of controlling the addition of fake nodes on the graph, we compared two approaches based on this metric. The first approach is the original TC
-PIA
second, which merges and removes nodes after adding fake nodes. The second approach, referred to as TC
-PIA
third, does not apply any post-processing after adding fake nodes and does not control their number.
Figure 11 compares the two approaches on three datasets and shows that TC
-PIA
second consistently outperforms TC
-PIA
third. This suggests that controlling the addition of fake nodes effectively reduces the impact on the original graph.