Robust Federated Learning Mitigates Client-side Training Data Distribution Inference Attacks

Recent studies have revealed that federated learning (FL), once considered secure due to clients not sharing their private data with the server, is vulnerable to attacks such as client-side training data distribution inference, where a malicious client can recreate the victim's data. While various countermeasures exist, they are not practical, often assuming server access to some training data or knowledge of label distribution before the attack. In this work, we bridge the gap by proposing InferGuard, a novel Byzantine-robust aggregation rule aimed at defending against client-side training data distribution inference attacks. In our proposed InferGuard, the server first calculates the coordinate-wise median of all the model updates it receives. A client's model update is considered malicious if it significantly deviates from the computed median update. We conduct a thorough evaluation of our proposed InferGuard on five benchmark datasets and perform a comparison with ten baseline methods. The results of our experiments indicate that our defense mechanism is highly effective in protecting against client-side training data distribution inference attacks, even against strong adaptive attacks. Furthermore, our method substantially outperforms the baseline methods in various practical FL scenarios.


INTRODUCTION
Federated learning (FL) [14] is a distributed machine learning paradigm that has gained significant attention recently. It allows individuals to train a global machine learning model collaboratively without sharing their private training data with others. FL usually consists of one server and multiple clients. During the FL training process, each client performs local training using the current global model and its local training data, then sends its local model update to the server. Upon receiving model updates from all clients, the server applies a specific aggregation rule to combine the received model updates and update the global model. The updated global model is then distributed to all clients for the next round of training. For instance, the FedAvg [14] aggregation rule calculates the average of the model updates to obtain the global model and is commonly employed in non-adversarial scenarios.
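As a rough illustration of the averaging step described above, the following minimal sketch treats each model update as a flat list of floats. The function names `fedavg` and `apply_update` are hypothetical, the average is unweighted (FedAvg as originally proposed weights clients by local dataset size), and we assume an update is a delta that is added to the global model.

```python
from typing import List

Vector = List[float]


def fedavg(updates: List[Vector]) -> Vector:
    """Unweighted coordinate-wise average of the client model updates."""
    if not updates:
        raise ValueError("need at least one client update")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all updates must have the same dimension")
    n = len(updates)
    return [sum(u[j] for u in updates) / n for j in range(dim)]


def apply_update(global_model: Vector, aggregated: Vector) -> Vector:
    """ASSUMPTION: updates are deltas that are added to the global model."""
    return [w + g for w, g in zip(global_model, aggregated)]


if __name__ == "__main__":
    client_updates = [[1.0, 2.0], [3.0, 4.0]]
    print(apply_update([0.0, 0.0], fedavg(client_updates)))
```

Because every client contributes equally to the average, a single crafted update can shift the result, which is why this rule is mainly used in non-adversarial settings.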
A benefit of FL compared with centralized learning is that clients no longer need to send their private training data to the server. Some cryptographic approaches have been proposed to protect the FL system from information leakage at the network security level [5]. However, methods reliant on cryptography often incur large computation and communication overhead [27]. Recent studies [2,8] have shown that FL is vulnerable to poisoning attacks, where malicious clients send carefully crafted model updates to the server to manipulate the final learned global model. Moreover, some works studied the privacy of FL and found that it offers weaker privacy protection than expected, and in some cases very poor protection. These works explored privacy leakage on both the server and client sides. On the server side, they found that the server can reconstruct images or properties of data from a specific client [26]. The most influential client-side inference attack is the GAN attack proposed by Hitaj et al. [12]. Taking advantage of the dynamic nature of the learning process, this attack enables adversaries to use a Generative Adversarial Network (GAN) [11] to produce samples similar to those in the targeted training set, which was intended to remain undisclosed. The samples generated by the GAN aim to follow the same data distribution as the training data.
Client-side inference attacks are considerably more feasible than server-side ones, as the attacker in a client-side attack is a participating client in FL and does not need to compromise the server to carry out its malicious actions. Certain defense mechanisms have been specifically designed to defend against client-side inference attacks. For instance, Netzer et al. [23] developed a method that utilizes differential privacy mechanisms to mitigate such attacks. Nevertheless, our experiments show that such approaches are ineffective in defending against inference attacks.
Our work: We first observe that existing Byzantine-robust aggregation rules [4,6,9,15,25] can mitigate client-side inference attacks only to some extent. They remain vulnerable to inference attacks in certain circumstances, as demonstrated in our experiments. Motivated by this observation, we propose a new Byzantine-robust aggregation rule called InferGuard to defend against inference attacks. In InferGuard, after receiving model updates from all clients, the server calculates the Median [25] of these updates. If a received model update deviates substantially from the computed Median, it is identified as malicious.
Our contributions can be summarized as follows:
• We find that existing Byzantine-robust aggregation rules can mitigate client-side inference attacks on FL to some extent. However, their effect is not optimal because the content of the generated images is still recognizable.
• We introduce InferGuard, a defense framework designed to protect against inference attacks on the client side of FL. InferGuard effectively mitigates malicious clients' influence while preserving the FL system's utility.
• We thoroughly evaluate our defense framework using five benchmark datasets. Our results demonstrate that InferGuard effectively safeguards against client-side inference attacks and outperforms baseline methods.

THREAT MODEL AND DEFENSE GOALS
Attacker's Goal: Following [12], we consider a client-side data reconstruction attack. We assume that an attacker controls some malicious clients. The attacker's goal is to reconstruct images of a specific label that it did not initially possess.
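Regarding fidelity and efficiency, the global model's accuracy under the aggregation rule should be as close as possible to that of FedAvg without attacks, and the rule should be computationally efficient, incurring no additional cost.
Subsequently, the server determines whether a client's local model update should be considered benign based on the condition in Eq. (2), which checks whether the update deviates substantially from the coordinate-wise median of all received updates.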
We experiment on the MNIST dataset to verify our idea (refer to Section 4.1 for experimental settings). We aim to check whether the model update from the malicious client is chosen in each global round. Multi-Krum is adopted as our baseline. The results are shown in Figure 1. In Figure 1, an "Indicator" value of one means that the server mistakenly selects the malicious model update in that training round, while zero means that it does not. From Figure 1, we observe that for Multi-Krum, the server chooses the malicious model update in almost every training round. In our method, however, once the attacker starts to attack at the 50th training round (following prior work [12], we assume that the attacker starts to attack from a specific training round), the server no longer selects the malicious model update.
4.1.4 Parameter Settings. We train for 300 rounds on the MNIST, Fashion-MNIST, AT&T, and GTSRB datasets, and for 150 rounds on SVHN, with each client training locally for one epoch per round. We use an FL setup with 10 clients, one of which is malicious, following the approach in [12]. We simulate non-i.i.d. data by uniformly distributing each label to 5 clients, leading to distinct class distributions per client. In all datasets, we suppose that the last client is the malicious client. In the MNIST, Fashion-MNIST, GTSRB, and SVHN datasets, the malicious client steals images of label 3, which it does not own. In AT&T, the malicious client steals images of label 11. The malicious client initiates data distribution inference attacks targeting specific labels that it does not own, starting from round 50 for MNIST, Fashion-MNIST, AT&T, and GTSRB, and from round 20 for SVHN. We set InferGuard's parameter to 2.0 for MNIST, 1.8 for Fashion-MNIST, 2.8 for AT&T, 3.0 for SVHN, and 1.2 for GTSRB.
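To make the label assignment concrete, the sketch below gives each label to 5 of the 10 clients and keeps the target label away from the malicious client. It is only one plausible reading of the setup: the function name `assign_labels`, the random sampling of owners, and the seed are assumptions, not details taken from the experiments.

```python
import random
from typing import Dict, List, Optional


def assign_labels(
    num_labels: int,
    num_clients: int = 10,
    owners_per_label: int = 5,
    malicious_client: Optional[int] = None,
    target_label: Optional[int] = None,
    seed: int = 0,
) -> Dict[int, List[int]]:
    """Map each label to the sorted list of clients that hold its data.

    ASSUMPTION: owners are sampled uniformly at random without replacement.
    The malicious client is never chosen as an owner of the target label.
    """
    rng = random.Random(seed)
    owners: Dict[int, List[int]] = {}
    for label in range(num_labels):
        candidates = [
            c
            for c in range(num_clients)
            if not (label == target_label and c == malicious_client)
        ]
        owners[label] = sorted(rng.sample(candidates, owners_per_label))
    return owners


if __name__ == "__main__":
    # Last client (index 9) is malicious and targets label 3, as in the text.
    for label, clients in assign_labels(10, malicious_client=9, target_label=3).items():
        print(label, clients)
```

Each client then draws its local training set only from the labels it owns, so class distributions differ across clients.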

Experimental Results
Our InferGuard is effective: From Table 1, we observe that our method outperforms the baselines on nearly all of the three evaluation metrics across the five datasets. For instance, the SSIM value on MNIST under our defense drops to 0.22, while the SSIM values of the baselines are at least 0.4. Our visualization result in Figure 2 supports the same conclusion.
Impact of InferGuard's parameter: We experiment with the parameter taking values of 0.5, 2, 2.5, 3, 5, 7, and 10. As shown in Figure 3, the defensive efficacy starts to decline when the parameter surpasses 2. We therefore choose 2 as the default setting in our experiments.
Non-iid settings: Table 3 shows the results when each client owns three labels. Under such non-iid settings, our method still outperforms the baselines.
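For readers unfamiliar with SSIM, the sketch below computes a simplified, single-window version over two flattened grayscale images; a lower score means the reconstructed image is less similar to the real one. This is an illustrative simplification rather than the evaluation code behind the tables: standard SSIM averages the same statistic over local sliding windows, and the name `global_ssim` and the pixel range of [0, 1] are assumptions.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float], data_range: float = 1.0) -> float:
    """Single-window SSIM between two equally sized, flattened images."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants with the commonly used K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    real = [0.0, 0.5, 1.0, 0.25]
    inverted = [1.0 - p for p in real]
    print(global_ssim(real, real), global_ssim(real, inverted))
```

Under this metric, a successful defense pushes the score of the attacker's generated images toward zero.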
Results on adaptive attacks: The adaptive attack is formalized as an optimization problem for the malicious client, defined over its local model weights, the loss function used when training the local model, and the malicious client's training data distribution, together with a constant. Table 2 shows the evaluation results of our adaptive attack on the MNIST dataset with this constant set to 0.0016. On MNIST, the effect of our defense is weakened, but it remains comparatively robust: the SSIM score under our method is still lower than the SSIM scores under the baselines.
Results on membership inference attack: We consider the membership inference attack proposed in [29]. The attacker has 300 samples and needs to determine whether each sample is in the training set of the other 9 clients. Other settings are the same as our default setting. Table 4 displays the results on the Location30 [20] dataset, as recommended in [29]. In Table 4, "Model acc" denotes the global model's testing accuracy, and "Attack acc" represents the attacker's success rate. InferGuard achieves the best defense effect among all defenses while sustaining a high level of model performance.

CONCLUSION
Our research demonstrated that current strategies for defending against client-side inference attacks fall short in practice, highlighting the need for a more robust defense mechanism in FL. Interestingly, we discovered that existing Byzantine-robust aggregation rules, although not originally designed to combat inference attacks, provide a degree of protection. Building on these insights, we developed a novel Byzantine-robust aggregation rule named InferGuard, which effectively counters client-side inference attacks. Comparative analysis across five datasets revealed that this new defense strategy markedly outperforms existing mechanisms. Future work could focus on providing a formal theoretical guarantee of the robustness of InferGuard against client-side training data distribution inference attacks.

Figure 1: MNIST dataset, whether a malicious model update is chosen in each round.
Attacker's Capabilities: The attacker achieves its goal by sending carefully crafted model updates to the server. Malicious clients can exchange local model updates among themselves, and each malicious client knows the other malicious clients' local training data. Malicious clients can initiate the attack at any global round. Note that the server is trustworthy, and there is no collusion between the attacker and the server.
Defense Goals: Our goal is to create an aggregation rule for FL systems that is secure against client-side inference attacks and maintains robustness, fidelity, and efficiency. This rule should prevent attackers from accessing clients' local data, keep the global model's accuracy close to that of FedAvg without attacks, and be computationally efficient.
Suppose we have $n$ clients. In global training round $t$, each client $i$ submits its local model update $g_i^t$ to the server, where $1 \le i \le n$. Upon receiving local model updates from all clients, the server first computes the coordinate-wise median of the $n$ local model updates, denoted as $g_{\text{med}}^t$, as follows:
$$g_{\text{med}}^t = \text{Median}\{g_1^t, g_2^t, \dots, g_n^t\}. \tag{1}$$
During global training round $t$, the set of clients whose model updates satisfy Eq. (2) is denoted as $\mathcal{H}^t$. The final aggregated local model update is then calculated as the average of the local model updates of all the clients in $\mathcal{H}^t$, that is, $\frac{1}{|\mathcal{H}^t|} \sum_{i \in \mathcal{H}^t} g_i^t$. If $|\mathcal{H}^t| = 0$, then we choose the model update whose $\| g_i^t - g_{\text{med}}^t \|_2$ is the smallest.
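The aggregation steps described in this section (coordinate-wise median, filtering of deviating updates, averaging of the remaining ones, and the fallback to the closest update) can be sketched as follows. This is a minimal sketch, not the reference implementation. In particular, the exact form of the test in Eq. (2) is not reproduced here, so the sketch assumes an update is kept when its L2 distance to the median is at most `rho` times the L2 norm of the median; both that test and the names `rho` and `inferguard_aggregate` are hypothetical.

```python
import math
from statistics import median
from typing import List

Vector = List[float]


def coordinate_wise_median(updates: List[Vector]) -> Vector:
    """Median of each coordinate across all client updates."""
    return [median(u[j] for u in updates) for j in range(len(updates[0]))]


def l2_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inferguard_aggregate(updates: List[Vector], rho: float) -> Vector:
    """Median-based filtering followed by averaging of the kept updates."""
    g_med = coordinate_wise_median(updates)
    dists = [l2_distance(g, g_med) for g in updates]
    # ASSUMPTION: stand-in for Eq. (2); the threshold form is hypothetical.
    bound = rho * math.sqrt(sum(x * x for x in g_med))
    kept = [g for g, d in zip(updates, dists) if d <= bound]
    if not kept:
        # No update passes the test: fall back to the one closest to the median.
        closest = min(range(len(updates)), key=dists.__getitem__)
        return updates[closest]
    return [sum(g[j] for g in kept) / len(kept) for j in range(len(g_med))]


if __name__ == "__main__":
    client_updates = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [100.0, -100.0]]
    print(inferguard_aggregate(client_updates, rho=1.0))
```

In the usage example, the last update lies far from the median and is excluded, so the result is the average of the first three updates. The cost is dominated by the per-coordinate median, which requires no extra data or model evaluations on the server.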

Table 1: Results of different FL methods under attack.

Table 2: Results of adaptive attack on MNIST dataset.

Table 3: Impact of data distribution. Each client owns three labels of data.

Table 4: Results of different FL methods under membership inference attack on Location30 dataset.