Model accuracy reflects the model's classification ability on the test dataset, i.e., the proportion of correctly classified samples; higher accuracy directly indicates better overall model performance. Analyzing model accuracy under different client numbers effectively assesses the robustness of an algorithm: since the total dataset size remains constant, changing the number of clients changes how the training data are allocated, so maintaining high accuracy as per-client training samples vary is a key challenge in federated learning. To evaluate the impact of the number of clients in federated learning on model accuracy, communication overhead, and strategies for handling attacks, the experiments in this paper are divided into two interconnected parts, analyzed from different perspectives.
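The inverse relation between client count and per-client training data noted above can be made concrete with a toy even split (an IID partition is an illustrative assumption; the paper does not specify its partitioning scheme):

```python
def samples_per_client(total_samples: int, num_clients: int) -> int:
    """Per-client shard size under a hypothetical even IID split."""
    return total_samples // num_clients

# With a fixed pool (e.g., MNIST's 60,000 training images), raising the
# client count from 5 to 500 shrinks each local training set 100-fold.
for n in (5, 50, 500):
    print(n, samples_per_client(60_000, n))
```

This is why accuracy under many clients is a robustness test: each client sees an ever smaller slice of the same total data.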
Part 1: We compare the SPM mechanism with the basic federated learning algorithm without differential privacy (NoDP-FL) [30] and three advanced differential privacy mechanisms: (1) the PM mechanism [27], (2) the mechanism of Sun et al. [28], and (3) the PNPM mechanism [29]. In Section 5.2.1 and Section 5.2.2, we verify the model accuracy of each mechanism under varying client numbers to comprehensively analyze its robustness. In the first part of Section 5.2.3, we conduct a usability analysis, comparing the privacy budget each mechanism consumes to reach the same model accuracy. In the second part of Section 5.2.3, we select representative client counts of 30 and 300 to analyze the usability of the SPM mechanism under different privacy budgets and to explore its maximum model accuracy under the same configuration.
These analyses help us gain a comprehensive understanding of the mechanism’s performance and applicability in real-world applications.
5.2.1. Model Accuracy Analysis in Scenarios with Few Clients
In this section of the experiment, we keep the parameter configuration and privacy budget of each mechanism consistent and select client counts from 5 to 50 as the limited-client scenario; two network models are used with the CIFAR-10 dataset. The specific settings are as follows: a sampling rate of 0.6 (1 for the two CIFAR-10 models), 3 local iterations (5 and 7, respectively, for the two CIFAR-10 models), a batch size of 64 (32 for CIFAR-10 Model 2), and 50 global communication rounds. These parameters were chosen after weighing communication and time overhead. In Section 5.2.4, we discuss in detail how to slightly improve model performance by increasing local computation time while keeping the same privacy budget, and simultaneously reducing global communication costs appropriately. Next, we analyze the model accuracy of the different mechanisms on each dataset.
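The schedule above (sampling rate 0.6, E = 3 local iterations, T = 50 global rounds) can be sketched as a minimal FedAvg-style loop. The scalar "model" and the local-update rule below are toy stand-ins, not the paper's networks or optimizer:

```python
import random

# Hypothetical skeleton of the training schedule in this section.
CONFIG = {"sampling_rate": 0.6, "local_iters": 3, "rounds": 50}

def local_update(model, data, iters):
    # Stand-in for E local SGD iterations: move toward the client's data mean.
    target = sum(data) / len(data)
    for _ in range(iters):
        model += 0.5 * (target - model)
    return model

def run_federated(client_data, model=0.0, cfg=CONFIG, seed=0):
    rng = random.Random(seed)
    for _ in range(cfg["rounds"]):
        k = max(1, int(cfg["sampling_rate"] * len(client_data)))
        selected = rng.sample(client_data, k)       # client sampling
        updates = [local_update(model, d, cfg["local_iters"])
                   for d in selected]
        model = sum(updates) / len(updates)         # FedAvg-style mean
    return model
```

The same loop structure underlies every mechanism compared here; the DP mechanisms differ only in how each client perturbs its update before upload.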
The MNIST dataset uses a privacy budget of 0.3. As shown in Table 2, the algorithms perform very differently across client counts. In terms of overall accuracy, the NoDP algorithm is best at every client count, consistently maintaining over 90% accuracy with minimal fluctuation as the number of clients increases. Because it introduces no noise perturbation and thus offers no privacy protection, it serves as the baseline for this section. The SPM mechanism follows closely, with accuracies near those of NoDP; with 10 and 20 clients in particular, SPM reaches 89.48% and 89.66%, nearly matching NoDP. This indicates that SPM can provide privacy protection without significantly reducing model accuracy, which is its advantage over the other privacy-protection mechanisms. The accuracies of PNPM and of Sun et al. [28] are relatively low, particularly with few clients; the mechanism of Sun et al. [28] performs poorly with few clients but improves once the client count reaches 30. The PM mechanism clearly performs worst: with 5 clients it reaches only 52.73%, significantly below the other algorithms, indicating that it is unsuitable for scenarios with such a small privacy budget.
Figure 3 illustrates the trend of model accuracy under different mechanisms on the MNIST dataset. From the figure, it can be observed that the NoDP mechanism maintains a relatively stable model accuracy as the number of clients increases, while other mechanisms exhibit varying degrees of decline, with SPM showing the slowest decrease, indicating that this mechanism is more suitable for limited client scenarios. In contrast, other mechanisms perform poorly when the number of clients reaches 50. This is due to the inverse relationship between the number of clients and the amount of training data allocated to each client; the reduction in the training dataset leads to a decline in the model accuracy of these mechanisms.
The Fashion-MNIST dataset uses a privacy budget of 0.6. As shown in Table 3, the NoDP mechanism maintains stable model accuracy across client counts, while SPM, although slightly lower, stays close to NoDP; with 5 and 10 clients in particular, SPM reaches 83.78% and 83.99%, a gap of less than 1%. In contrast, the PNPM and Sun et al. [28] mechanisms decline significantly as the number of clients increases, PNPM especially, dropping to 80.71% at 50 clients. The PM mechanism performs poorly with few clients, reaching only 56.05%; although it improves subsequently, its performance remains unstable.
Figure 4 illustrates the trend of model accuracy under different mechanisms on the Fashion-MNIST dataset. It can be observed that the model accuracy of all mechanisms shows a slight decline as the number of clients increases, but remains relatively stable at client counts of 5 and 10. Notably, the SPM mechanism, when the client count reaches 50, has an accuracy that is 1.96% lower than NoDP, yet still surpasses the accuracy of other mechanisms, demonstrating the advantages of the SPM mechanism.
In the CIFAR-10 dataset (Model 1), a privacy budget of 1.9 is used (NaN indicates gradient explosion under this privacy budget, preventing convergence). As shown in Table 4, the NoDP mechanism performs steadily with 5 to 10 clients, consistently maintaining over 81% accuracy. The SPM mechanism declines slightly as the number of clients increases, yet still maintains 75.14% accuracy at 50 clients, higher than the other mechanisms. In contrast, the PNPM and Sun et al. [28] mechanisms exhibit lower accuracies and a noticeable decline as the number of clients increases, and the PM mechanism performs poorly with few clients. This indicates that the SPM mechanism maintains high performance even under a low privacy budget.
Figure 5 illustrates the trend of model accuracy under different mechanisms on the CIFAR-10 dataset (Model 1). With 5 to 30 clients, SPM maintains accuracy similar to NoDP; with 40 to 50 clients, however, all mechanisms decline significantly. This may be related to CIFAR-10 consisting of three-channel color images, which increases training complexity, and the trend differs from that observed on the other datasets. To investigate further, we employed a second network architecture for this dataset, analyzed comprehensively under Model 2.
As shown in Table 5, in the CIFAR-10 dataset (Model 2), we set the privacy budget to 3.9: in the more complex network (ResNet18), the lower budget of 1.9 could not yield optimal performance for any mechanism, so we increased the budget moderately. With 5 clients, all mechanisms achieve high model accuracy, the lowest being 78.33%. This can be attributed to the limited number of clients, which gives each client a more sufficient training set, together with the ability of the deep network to extract more useful information. However, as the client count grows to 40 to 50, the shrinking local training sets reduce the accuracy of all mechanisms, while the SPM mechanism still maintains a high accuracy of 84.53%, 2.93% below NoDP.
We further analyzed why Model 1 declines sharply at 40 to 50 clients. Compared with Model 1, Model 2 is deeper and can therefore extract more information even across many clients (e.g., 50). The lower privacy budget used in Model 1 is also a contributing factor: because deep networks absorb more information when learning sample features, a very low privacy budget (such as 1.9 in Model 1) may lead to overfitting in the early stages of training, preventing convergence, and the heavier perturbation can also alter the features the model has learned early on during aggregation. In Section 5.2.4, we explore how to enhance model accuracy by other means while keeping the privacy budget unchanged.
Figure 6 illustrates the trend of model accuracy under different mechanisms on the CIFAR-10 dataset (Model 2).
5.2.2. Model Accuracy Analysis in Scenarios with Many Clients
In the previous experiment, we explored the model accuracy of the mechanisms in the limited-client scenario; in real-world applications, however, the number of clients often varies dynamically with communication costs and resource constraints. The robustness and adaptability of a mechanism in multi-client scenarios is therefore particularly important to ensure good performance across environments, so our mechanism must also be validated in multi-client settings.
In this section, we selected a client count ranging from 100 to 500 as the multi-client scenario. The CIFAR-10 dataset also employs two network models. The specific settings are as follows: a sampling rate of 0.6 (the sampling rate cannot be 1 in multi-client scenarios to prevent excessive local noise from interfering with global model aggregation), local iterations set to 3 (with 15 and 10 iterations for the two models in the CIFAR-10 dataset, and 10 iterations for the Fashion-MNIST dataset), a batch size of 64 (with a batch size of 32 for CIFAR-10 Model 2), and 50 global communication rounds (with 20 for the Fashion-MNIST dataset). Next, we will analyze the performance of different mechanisms in terms of model accuracy for each dataset.
The privacy budget for the MNIST dataset is set to 0.3. As shown in Table 6, in the multi-client scenario the model accuracy of NoDP on MNIST does not decline, because the dataset is relatively simple and easy to train. With 100 clients, SPM reaches 88.37%, 1.63% below NoDP's 90.00%; with 500 clients, SPM reaches 84.31%, 5.85% below NoDP. Nevertheless, SPM's accuracy remains higher than that of the other mechanisms, further validating the effectiveness of our algorithm.
Figure 7 illustrates the trend of model accuracy under different mechanisms on the MNIST dataset.
The privacy budget for the Fashion-MNIST dataset is set to 0.9. Compared with the limited-client scenario, we slightly increased the budget because the larger number of clients shrinks each client's training set and hence the available feature information; a moderate increase in the privacy budget effectively reduces the noise and prevents gradient explosion from noise interference. As shown in Table 7, across all client counts the accuracy of the SPM mechanism differs from that of the NoDP mechanism by at least 0.06% and at most 1.87%. Except for the PM mechanism, the other mechanisms all achieve relatively high accuracy, indicating that in the multi-client scenario the SPM, PNPM, and Sun et al. [28] mechanisms can all perform well.
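The stated effect of raising the budget, i.e., less injected noise, follows the standard ε-DP scaling. As a generic illustration only (SPM's own perturbation is not the Laplace mechanism), the Laplace noise scale is sensitivity/ε, so a larger budget means proportionally less noise:

```python
import math
import random

def laplace_scale(sensitivity: float, epsilon: float) -> float:
    # Laplace mechanism: noise ~ Lap(0, b) with b = sensitivity / epsilon,
    # so a larger privacy budget epsilon means proportionally less noise.
    return sensitivity / epsilon

def laplace_sample(b: float, rng: random.Random) -> float:
    # Inverse-CDF sampling of the Laplace(0, b) distribution.
    u = rng.random() - 0.5
    return -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

# With sensitivity 1, raising the budget from 0.6 to 0.9 cuts the
# noise scale from ~1.67 to ~1.11.
```

This inverse budget-noise relation is what the paragraph above exploits: a modest budget increase trades a little privacy for noticeably less gradient perturbation.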
Figure 8 illustrates the trend of model accuracy under different mechanisms on the Fashion-MNIST dataset.
The privacy budget for the CIFAR-10 dataset (Model 1) is set to 2.9; for this relatively complex dataset, the budget is slightly higher than in the limited-client scenario. As shown in Table 8, with 100 clients the SPM mechanism is 1.74% below NoDP, outperforming the other mechanisms. As the client count increases, however, the shrinking training sets widen the gap between SPM and NoDP, with similar trends for the other mechanisms. With 500 clients, NoDP reaches 67.27%, while SPM, PNPM, Sun et al. [28], and PM reach 57.83%, 54.01%, 50.71%, and 45.16%, respectively. Nevertheless, SPM still maintains the highest accuracy among the privacy-preserving mechanisms.
This indicates that in multi-client scenarios, shallow network models struggle to extract enough useful information from complex datasets. To improve model accuracy, a moderate increase in the privacy budget may be necessary, though this also weakens the privacy-protection capability of the mechanism. In this regard, we analyze the usability of our algorithm under different privacy budgets in Section 5.2.3.
Figure 9 illustrates the trend of model accuracy under different mechanisms on the CIFAR-10 dataset (Model 1).
The privacy budget for the CIFAR-10 dataset (Model 2) is set to 3.9, the same as in the limited-client scenario. As shown in Table 9, the SPM and PNPM mechanisms perform almost identically across client counts, with only minor differences: with 100 clients, SPM reaches 83.58% versus 82.47% for PNPM, a slight edge for SPM, while with 500 clients PNPM (77.56%) surpasses SPM (76.64%). The other mechanisms achieve relatively low accuracy. From this analysis we conclude that in deeper network models the accuracy of all mechanisms generally improves over shallow models; even with 500 clients, the accuracy of SPM and PNPM stays close to NoDP. Deep models can thus learn more useful information from relatively small training sets, enhancing accuracy, although their training time cost is significantly higher than that of shallow models.
Figure 10 illustrates the variation trend of model accuracy for different mechanisms under the CIFAR-10 dataset (Model 2).
5.2.3. Usability Analysis
In this section, we will analyze the usability of the mechanisms, with the experimental scenario specifically designed in two parts, which we will elaborate on below.
Part One: Usability Analysis of Mechanisms.
We compare the performance of the SPM mechanism with the other federated learning privacy-protection mechanisms on the three datasets, primarily examining the privacy budget needed to achieve maximum model accuracy; a smaller budget indicates stronger privacy protection at the same performance level. For the low- and high-client scenarios we selected median client counts, 30 and 300, as representatives. The parameters of each mechanism remain consistent with the previous two sections, with specific values detailed in Table 10 and Table 11.
In the low-client scenario, the privacy budgets required by the mechanisms to achieve similar model accuracies differ significantly. The SPM mechanism requires the lowest budget, followed closely by PNPM, indicating that both suit small-budget scenarios. The Sun et al. [28] mechanism also performs well on MNIST and Fashion-MNIST, while the PM mechanism cannot reach the accuracy of the other mechanisms on CIFAR-10 and is therefore recorded as NaN; on the other datasets its required budget is excessively high, indicating that it is unsuitable for low-budget scenarios. Although the performance differences among the mechanisms are minor under small budgets, the required budgets still reveal significant gaps, further demonstrating the privacy-protection advantage of the SPM mechanism.
In the multi-client scenario, we find that the privacy budgets required by each mechanism are generally larger, especially on the CIFAR-10 dataset. This is primarily because this dataset is more complex than the others, necessitating smaller noise perturbations to maintain model performance. For CIFAR-10 with Model 1, the SPM mechanism requires a budget of 2.9, while the minimum for the other mechanisms is 5.5. With Model 2, the budgets are closer: 3.9 for SPM, 4.5 for PNPM, and 8.5 for Sun et al. [28]. This indicates that, compared with budgets below 3, the differences among budgets exceeding 3 matter less, since the privacy protection they provide is already relatively weak. This further validates the effectiveness of our algorithm.
Part Two: Usability Analysis of the SPM Mechanism.
In this section, we explore the usability of the SPM mechanism. As described in Section 5.2.1 and Section 5.2.2, the parameters used there were chosen with privacy-protection capability in mind. Here we examine how the SPM mechanism compares with the noise-free NoDP mechanism when the privacy budget is appropriately increased, to demonstrate SPM's usability. We reuse the NoDP values from the previous sections and keep the client counts consistent with Part One, again selecting 30 and 300 as representatives. By comparing SPM and NoDP under different privacy budgets, we aim to reveal the balance between privacy protection and model accuracy in the SPM mechanism. We analyze each dataset separately below.
For the MNIST dataset, we used a privacy budget of 0.3 in Section 5.2.1 and Section 5.2.2. Here, the budgets are set to 0.1 to 0.4 for the low-client scenario and 0.3 to 2.4 for the multi-client scenario. In the multi-client scenario, small budget increases do not yield significant performance improvements, so the maximum budget we selected is the smallest value that lets the model accuracy approach that of NoDP.
As shown in Figure 11, in the low-client scenario a privacy budget of 0.4 brings the performance of the SPM mechanism close to NoDP, whereas in the multi-client scenario the budget must be raised to 2.4. We infer that 2.4 is a reasonable choice in the multi-client case, but it also implies weaker privacy protection. We further evaluate the privacy-protection capability of the budget values used in this section in Section 5.2.4.
In the Fashion-MNIST dataset, as shown in Figure 12, we selected a privacy budget range of 0.6 to 2.1 for the low-client scenario. At a budget of 2.1, the accuracy of the SPM mechanism still trails NoDP by 1%, but it is significantly higher than at 0.6, which is acceptable. Although we did not increase the budget further, a moderate increase does improve accuracy; for this scenario, a budget of 2.1 yields better performance at the cost of weaker privacy protection.
In the multi-client scenario, we selected a privacy budget range of 0.6 to 1.8. At a budget of 1.8, the performance of the SPM mechanism approaches that of NoDP, and overall accuracy trends upward as the budget increases, indicating that 1.8 is a good choice for this scenario. Note that, compared with the low-client scenario, each client holds a smaller local dataset here, so the number of local iterations (set to 10) was adjusted to accommodate this change.
In CIFAR-10 (Model 1), as shown in Figure 13, because the model is a shallow network, we set the privacy budget ranges to 1.3 to 2.1 for the low-client scenario and 2.9 to 4.9 for the multi-client scenario. In the low-client scenario, budgets of 1.9 and 2.1 bring the model accuracy close to NoDP, suggesting that a budget around 2.0 is a good choice, ensuring high performance while still providing good privacy protection.
In the multi-client scenario, even with a higher privacy budget of 4.9, the accuracy of the SPM mechanism still shows an approximate 3% gap compared to NoDP. We believe this is primarily due to the model being shallow, along with an increase in the number of clients leading to a reduction in the local training set, which consequently decreases the content that can be learned. Even with an increased privacy budget, the model struggles to learn more useful information without being disrupted. Therefore, we decided not to continue increasing the privacy budget for further investigation.
In CIFAR-10 (Model 2), as shown in Figure 14, we conducted additional experiments to address the shallow network's inability to learn more information in the multi-client scenario. With the deeper ResNet18 architecture, we found that at the same privacy budget (4.4) the model performs better, with a gap of less than 0.2% from NoDP. The privacy budgets were set to 2.9 to 4.9 for the low-client scenario and 2.4 to 4.4 for the multi-client scenario.
In the low client scenario, when the privacy budget is 4.4 and 4.9, the gap between SPM and NoDP can also be maintained within 1%. This further indicates that adopting a deeper network architecture is necessary for more complex datasets. However, this also implies that the time and privacy overhead incurred may increase.
Finally, based on the above observations, we conclude that appropriately increasing the privacy budget helps improve model accuracy to meet practical needs, but may also weaken privacy protection. To this end, in Section 5.2.4 we introduce DLG attacks to test the privacy-protection capability of the budgets used in this section and comprehensively analyze the optimal privacy budget for the SPM mechanism.
5.2.4. Privacy-Protection Capability Evaluation
In this section, we select three privacy budget values for the SPM mechanism in each client scenario to assess its usability and privacy-protection capability. The first two values are taken from the budgets in Section 5.2.1 and Section 5.2.2, to evaluate the level of protection when balancing privacy and performance; the third is the maximum budget for each dataset in Section 5.2.3, to measure the protection remaining at maximum accuracy. We run the classic DLG attack on the datasets corresponding to each budget, recording the number of rounds to attack convergence as Attack T (AT) and selecting three phases (1/3 AT, 2/3 AT, 3/3 AT) to observe how complete the reconstructed ("disguised") images are during the attack. Two metrics evaluate the results: (1) the Structural Similarity Index (SSIM), which assesses the difference between the attack-reconstructed image and the original image, and (2) the DLG attack loss, which measures the difference between the gradients of the original sample and the reconstructed sample.
The SSIM value ranges from 0 to 1, with values closer to 1 indicating higher image similarity, and 1 representing identical images. SSIM is calculated based on three aspects: luminance, contrast, and structure, which respectively measure the differences in mean luminance, contrast, and local structural similarity. These metrics help comprehensively evaluate the effectiveness of the attack and the performance of privacy protection. Next, we will independently analyze the experimental results of each dataset.
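The three-term SSIM described above can be written out directly. This single-window version over a flattened grayscale image is a simplification (practical SSIM averages the index over local sliding windows); the stabilizer constants follow the common C1 = (0.01·L)², C2 = (0.03·L)² convention:

```python
def ssim(x, y, data_range=1.0):
    """Global SSIM between two equal-length grayscale images (flat lists).

    Combines luminance (means), contrast (variances), and structure
    (covariance) into one index in the usual simplified two-term form.
    """
    n = len(x)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images score 1.0; a noisy reconstruction of the same image scores strictly lower, which is how the tables below rank attack quality.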
In the MNIST dataset, the privacy budgets adopted are 0.3, 1.5, and 2.4. From Table 12 and Figure 15, it can be seen that as the number of attack rounds increases, the loss decreases while the SSIM value rises. At a budget of 0.3, the protection is strongest: the loss remains very high at 119.6575 and the SSIM relatively low at 0.4582, and the final reconstructed image shows almost no features of the original, indicating that a budget of 0.3 effectively protects privacy in the low-client scenario. At a budget of 1.5, the reconstructed image displays only a few features, with no obvious characteristics distinguishable, so the multi-client scenario also retains strong protection. At a budget of 2.4, however, the reconstructed images exhibit numerous features that, with further processing, could be matched to the original training images, especially since this dataset is small and contains only ten classes. As concluded in the previous section, increasing the privacy budget reduces privacy-protection capability, which also indirectly reflects the privacy-protection capability of the SPM mechanism.
Figure 15 shows the variation trend of disguised images under different privacy budgets in the MNIST dataset.
In the Fashion-MNIST dataset, as shown in Figure 16, the protection of the SPM mechanism is relatively unstable. In the final iteration, the reconstructed images under all budgets display some features, and their prominence increases with the budget. As shown in Table 13, except at a budget of 0.6, where the loss does not converge, the losses under the other budgets converge well and the SSIM values reach around 80%. This indicates that, on this dataset, strengthening the privacy protection of the SPM mechanism requires either reducing the privacy budget or employing a deeper neural network.
Figure 16 shows the variation trend of disguised images under different privacy budgets in the Fashion-MNIST dataset.
In the CIFAR-10 dataset (Model 1), the privacy budgets were increased to 1.9, 2.9, and 4.9 because the dataset consists of color images. As shown in Table 14, the losses and SSIM values converged under every budget. At a budget of 1.9, however, the reconstructed images displayed almost no features; as the budget increased, some features gradually emerged, but considerable noise interference kept the images blurry. This indicates that training the SPM mechanism on this dataset with Model 1 is advisable and provides good privacy protection. Furthermore, the optimal budget choices align with those selected in Section 5.2.1 and Section 5.2.2.
Figure 17 shows the variation trend of disguised images under different privacy budgets in the CIFAR-10 dataset (Model 1).
In the CIFAR-10 dataset (Model 2), as shown in Table 15, we used the deeper network structure with privacy budgets of 3.9, 4.4, and 4.9. At a budget of 3.9, we observed that the colors of the reconstructed images were completely opposite to those of the originals, indicating a reversal of the color channels in the attacked images. This reflects the perturbation method of our mechanism: applying directional perturbations after desensitizing the gradient weights effectively disrupts the direction of DLG attacks, protecting the original data while maintaining high accuracy. Under the other budget settings, we likewise found it difficult to obtain useful information. As Table 15 shows, the SSIM values under every budget are relatively low, the highest being only around 70%; despite the higher budgets, the metric spans multiple dimensions, and the actual pixel similarity is only about one-third of that value.
We believe that even with higher privacy budgets, the combination of the SPM mechanism and deep neural networks maintains privacy-protection capability. Therefore, for complex datasets, although shallower networks can provide smaller privacy budget settings, deep neural networks can still maintain similar privacy-protection capabilities even with larger privacy budgets, albeit with increased time costs.
Figure 18 shows the variation trend of disguised images under different privacy budgets in the CIFAR-10 dataset (Model 2).
In conclusion, we summarize as follows. The SPM mechanism implements positive and negative perturbations by taking the absolute value of the model gradient parameters and multiplying by a perturbation coefficient, achieving a desensitization effect that satisfies strict differential privacy. Combined with the characteristics of DLG attacks, this effectively reduces the likelihood of a successful attack. First, the positive and negative perturbations increase the randomness of the model gradients, making it difficult for attackers to extract useful information. Second, dynamic adjustment of the perturbation coefficient strengthens resistance to specific attack patterns, adapting to different attack environments. Furthermore, the design of the SPM mechanism ensures that leakage of sensitive information during gradient updates is significantly reduced, suppressing the effectiveness of DLG attacks. Overall, the SPM mechanism not only protects data privacy but also effectively mitigates the risk of DLG attacks by enhancing the fuzziness and randomness of the gradients.
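The absolute-value-and-coefficient idea summarized above can be sketched loosely as follows. The coefficient and flip probability here are illustrative placeholders, not SPM's budget-calibrated values, and this sketch makes no formal ε-DP claim:

```python
import random

def sign_perturb(grads, coeff, p_neg, rng):
    """Loose sketch of positive/negative perturbation on gradient magnitudes.

    Each gradient is reduced to its absolute value ("desensitized") and then
    scaled by +coeff or -coeff at random, hiding both the true sign and the
    exact magnitude from a gradient-matching (DLG) attacker. coeff and p_neg
    are hypothetical stand-ins for SPM's calibrated parameters.
    """
    return [(-coeff if rng.random() < p_neg else coeff) * abs(g)
            for g in grads]
```

Since every reported value has magnitude coeff·|g| with a random sign, the attacker's gradient-matching loss no longer identifies the true update direction, which is consistent with the channel-reversed reconstructions seen in Figure 18.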
Furthermore, we analyzed how the SPM mechanism reduces the risk of membership inference attacks. Such attacks attempt to infer whether a specific data point was used in training by analyzing the model outputs; the SPM mechanism applies noise and processes gradients by absolute value, making the outputs more random and clear membership information harder to extract. Notably, the SPM mechanism is particularly suitable for multi-client scenarios: each client's dataset is usually smaller and less diverse, so the features learned by attack models trained on the shadow datasets used in membership inference match the features of our model less well, reducing the attacks' effectiveness.
The design of the SPM mechanism not only protects user data privacy but also significantly reduces the risk of the model facing various types of attacks by enhancing the fuzziness of the model outputs.
5.2.5. Impact of Local Iteration Count on Communication Overhead
In this section, we conducted an internal study of the SPM mechanism to investigate whether increasing local iterations can reduce global communication costs at the expense of local time overhead. Using the parameter settings of Section 5.2.1 and Section 5.2.2 as a control, we appropriately reduced the number of global communication rounds T while increasing the number of local iterations E, keeping the privacy budget unchanged. We then compared model accuracy against the original settings and analyzed the changes in the other parameters while maintaining similar accuracy and effective privacy protection. The client counts were set to 30 and 300, respectively. Next, we analyze the experimental results one by one.
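The trade-off being tested can be made explicit with simple accounting: total upload volume grows linearly in T, while extra local iterations E cost only local compute. The parameter count below is a hypothetical placeholder; the paper reports rounds and accuracy, not byte volumes:

```python
def total_uploads(rounds_t: int, clients: int, sampling_rate: float,
                  params_per_model: int) -> int:
    """Model parameters uploaded over a full run: T rounds times the
    sampled clients per round times the per-client update size."""
    sampled = round(sampling_rate * clients)
    return rounds_t * sampled * params_per_model

# Cutting T from 50 to 20 (while raising E locally) at 30 clients,
# sampling rate 0.6, and a hypothetical 1e6-parameter model:
base = total_uploads(50, 30, 0.6, 1_000_000)
cut = total_uploads(20, 30, 0.6, 1_000_000)
```

The relative saving is 1 − 20/50 = 60%, exactly the round reduction, which is why the experiments below hold the privacy budget fixed and vary only T and E.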
In the MNIST dataset, as shown in Table 16, we increased the local iterations E from 3 to 5, 7, 9, and 11 in the small-client scenario, and to 11, 13, 15, and 17 in the multi-client scenario, while reducing the global communication rounds T to 20. The table shows that appropriately increasing the local iterations not only preserves the original model accuracy but in some cases improves it, while significantly reducing global communication costs. For the SPM mechanism on this dataset, we therefore recommend the parameter settings shown in the table to achieve higher communication efficiency and model performance.
In the Fashion-MNIST dataset, as shown in Table 17, reducing the local iterations, for example setting E = 2 while also decreasing the global communication rounds T, lowered model accuracy from the original 83.28% to 82.80%. With E = 7 or higher, however, accuracy remains stable while communication costs drop significantly; this holds in both client scenarios for this dataset. Note that as the number of local iterations grows, the computational time overhead also rises quickly, which is another important issue requiring careful consideration.
In the CIFAR-10 dataset (Model 2), as shown in Table 18, trading rounds for iterations did not yield significant results for the complex network model. For example, increasing the iterations from E = 5 to E = 8 brought negligible accuracy improvement, while reducing the global communication rounds T decreased accuracy, contradicting the purpose of this experiment. We believe the deeper network's internal connections already learn the feature information fully under the current iteration settings, so this approach cannot effectively reduce communication overhead on this dataset. We will explore more strategies for optimizing communication overhead in future research.