Blink: Link Local Differential Privacy in Graph Neural Networks via Bayesian Estimation

Graph neural networks (GNNs) have gained an increasing amount of popularity due to their superior capability in learning node embeddings for various graph inference tasks, but training them can raise privacy concerns. To address this, we propose using link local differential privacy over decentralized nodes, enabling collaboration with an untrusted server to train GNNs without revealing the existence of any link. Our approach spends the privacy budget separately on links and degrees of the graph for the server to better denoise the graph topology using Bayesian estimation, alleviating the negative impact of LDP on the accuracy of the trained GNNs. We bound the mean absolute error of the inferred link probabilities against the ground truth graph topology. We then propose two variants of our LDP mechanism complementing each other in different privacy settings, one of which estimates fewer links under lower privacy budgets to avoid false positive link estimates when the uncertainty is high, while the other utilizes more information and performs better given relatively higher privacy budgets. Furthermore, we propose a hybrid variant that combines both strategies and is able to perform better across different privacy budgets. Extensive experiments show that our approach outperforms existing methods in terms of accuracy under varying privacy budgets.


INTRODUCTION
Graph neural networks (GNNs) achieve state-of-the-art performance in many domains, such as graph mining [26], recommender systems [52] and bioinformatics [18]. However, training GNNs can raise privacy concerns because the graph data used for training, such as social networks, may contain sensitive information that must be kept confidential as required by law [41]. Thus, the security and privacy of GNNs have recently garnered significant attention from the research community [20,25,36,37,44]. Research has shown that neural networks can unintentionally leak information about training data [38], and there have been recent demonstrations of link inference attacks in GNNs [20,44]. Hence, it is of particular significance to design privacy-preserving GNN frameworks.

[Figure 1: The problem of link local differential privacy over decentralized nodes. Each node first perturbs its adjacency list before sending it to the server for privacy protection.]
Local differential privacy (LDP) [11,14,42] is a rigorous privacy notion for collecting and analyzing sensitive data from decentralized data owners. Specifically, LDP ensures privacy by having each data owner perturb their data locally before sending it to the server, often through noise injection [16]. The focus of our work is to design LDP mechanisms to protect graph topology (i.e., links) over decentralized nodes. In this setting, the server has access to the features and labels of all nodes, but not to any links among them. The server must infer the graph topology from the noisy adjacency lists transmitted by the nodes, as shown in Figure 1.
To illustrate the importance of link LDP in graph topology protection, consider a contact-tracing application installed on end devices for infectious disease control. The on-device application records interactions with other devices via Bluetooth, and the server trains a GNN to identify individuals at higher risk of virus exposure using the collected data. While local features, such as age and pre-existing conditions, can be voluntarily submitted by users and directly used by the server, this is not the case for contact history (i.e., links) due to the risk of revealing sensitive information such as users' whereabouts and interactions with others. Hence, it is crucial for end devices to perturb their links to achieve LDP before transmitting the information to the server.
Our focus on link local privacy is driven by the following considerations. To start with, links represent the relationships between nodes, which data owners are often unwilling to disclose. Moreover, the issue of link LDP in GNNs over decentralized nodes as clients has yet to be sufficiently addressed in the literature, and there is currently a lack of effective mechanisms to balance privacy and utility. [36] first propose locally differentially private GNNs, but only provide protection for node features and labels, while assuming the server has full access to the graph topology. Current differential privacy techniques for protecting graph topology while training GNNs, such as those described in [21,27,44], are limited by poor performance and are often outperformed by MLPs trained without any link information at all (which naturally provides full link privacy). This issue with [44] has been investigated in [25], and we also demonstrate similar behaviors of other baselines in this paper. On a separate line of research, there have been recent works on privacy-preserving graph synthesis and analysis with link local privacy guarantees [23,32,51]. Although some of these works do provide valid mechanisms to train GNNs with link LDP protection, these mechanisms are usually designed to estimate aggregate statistics of the graph, such as subgraph counts [23], graph modularities and clustering coefficients [32,51], which are not useful for training GNNs. Hence, these works are not directly applicable to our setting, and we will later show in this paper that they perform poorly in terms of GNN test accuracy. As such, there is a clear need for novel approaches to alleviate the performance loss of GNNs caused by enforcing privacy guarantees and to achieve link privacy with acceptable utility.

[Figure 2: Comparison between GCN [24] and MLP (GCN after removing links) on various graph datasets. Significant performance degeneration caused by removing links indicates the importance of graph topology in GNN training.]
Challenges. First, local DP is a stronger notion than central DP (CDP), and the magnitude of noise required grows with the number of nodes. This creates an issue for real-world graph datasets, where the number of vertices is typically large. Moreover, as shown in Figure 2, removing links leads to a significant drop in GNN performance, indicating that graph topology is crucial to training effective graph neural networks. GNN training is very sensitive to link alterations, as every single wrong link leads to the aggregation of information from neighboring nodes that should have been irrelevant. When the server adopts local differential privacy, it only has access to graph topology that has been perturbed for privacy protection, making it very challenging to train effective GNNs. Additionally, conventional LDP mechanisms such as randomized response [43] flip too many bits in the adjacency matrix and render the noisy graph too dense, making it difficult to train any useful GNNs. In summary, it is challenging to alleviate the negative effects of local differential privacy on GNN performance.
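To make the densification effect concrete, the following sketch computes the expected density of a sparse graph after naive randomized response. The numbers are hypothetical and chosen for illustration only; they are not taken from the paper's experiments.

```python
# Illustration: why naive randomized response (RR) over-densifies a sparse graph.
import math

n = 10_000          # number of nodes (hypothetical)
m = 50_000          # number of true undirected edges (hypothetical)
eps = 1.0           # privacy budget
p_flip = 1.0 / (1.0 + math.exp(eps))   # RR flip probability

pairs = n * (n - 1) // 2               # candidate node pairs
# Expected edge count after RR: kept true edges + falsely flipped non-edges.
expected_edges = m * (1 - p_flip) + (pairs - m) * p_flip
print(f"density before: {m / pairs:.4%}, after RR: {expected_edges / pairs:.4%}")
```

At eps = 1, the flip probability is about 0.27, so a graph with 0.1% density comes out roughly 27% dense, which is why the noisy adjacency matrix cannot be used for GNN training directly.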
Contribution. In this paper, we propose Blink (Bayesian estimation for link local privacy), a principled mechanism for link local differential privacy in GNNs. Our approach separately and independently injects noise into each node's adjacency list and degree, which guarantees LDP due to the basic composition theorem of differential privacy [17]. On the server side, our proposed mechanism uses Bayesian estimation to denoise the received information in order to alleviate the negative effects of local differential privacy on GNN performance.
Upon receiving the noisy adjacency lists and degrees, the server first uses maximum likelihood estimation (MLE) in the β-model [9] to estimate the existence probability of each link based solely on the collected noisy degree sequence. Then, the server uses the estimated link probabilities as priors and the noisy adjacency lists as evidence to evaluate posterior link probabilities in which both pieces of information are taken into consideration. We theoretically explain the rationale behind our mechanism and provide an upper bound on the expected absolute error of the estimated link probabilities against the ground truth adjacency matrix. Finally, the posterior link probabilities are used to construct the denoised graph, and we propose three variants of such a construction: hard thresholding, soft thresholding, and a hybrid approach. Hard thresholding ignores links with small posterior probabilities; it performs better when the privacy budget is low and uncertainty is high, because the discarded noisy information would not significantly help GNN training. The soft variant keeps all the inferred information and uses the posterior link probabilities as edge weights in the GNN model; it performs better than the hard variant when the privacy budget is relatively higher, thanks to the extra information. The hybrid approach combines the hard and soft variants and performs well for a wide range of privacy budgets. Extensive experiments demonstrate that all three variants of Blink outperform existing baseline mechanisms in terms of the test accuracy of trained GNNs. The hard and soft variants complement each other at different privacy budgets, and the hybrid variant consistently performs well across varying privacy budgets.
Paper organization. The rest of this paper is organized as follows. Section 2 introduces preliminaries for GNNs and LDP, and Section 3 formally formulates our problem statement. We describe our proposed solution, Blink, in Section 4 and explain its rationale and properties theoretically. We report and discuss extensive experimental results with all Blink variants and other existing methods in Section 5. In Section 6, we conduct a literature review on related topics and give a brief introduction to relevant prior work. At last, Section 7 concludes our work and discusses possible future research directions. The appendix includes complete proofs and experimental details.

PRELIMINARIES

Graph neural networks

The adjacency matrix A ∈ {0, 1}^{n×n} represents all the links in the graph, where A_{u,v} = 1 if and only if a link exists between v_u and v_v. The feature matrix of the graph is X ∈ R^{n×d}, where d is the number of features on each node and, for each v, the row vector x_v is the feature of node v_v. Finally, Y ∈ {0, 1}^{n×c} is the label matrix, where c is the number of classes. In the semi-supervised setting, if vertex v_v ∈ V_L, then its label vector y_v is a one-hot vector, i.e., y_v · 1⃗ = 1, where 1⃗ is an all-ones vector of compatible dimension. Otherwise, when the vertex is unlabeled, i.e., v_v ∈ V_U, its label vector y_v is the zero vector 0⃗.

A GNN learns high-dimensional representations of all nodes in the graph by aggregating node embeddings of neighbor nodes and mapping the aggregated embedding through a parameterized non-linear transformation. More formally, let

x_v^{(k)} = f^{(k)}( Aggregate( { x_u^{(k-1)} : v_u ∈ N(v_v) } ) ),   (1)

where N(v_v) is the set of neighboring nodes of v_v, Aggregate(·) is a differentiable, permutation-invariant aggregation function such as sum or mean, and f(·) is a differentiable transformation such as a multi-layer perceptron (MLP). Note that the neighbor set N(v_v) may contain the node v_v itself, depending on the GNN architecture.
When initialized, all node embeddings are the node features, i.e., x_v^{(0)} = x_v for each v_v ∈ V. At the last layer, the model outputs vectors of dimension c followed by a softmax layer to be compared against the ground truth, so that the parameters in f can be updated via back-propagation to minimize a pre-defined loss function such as the cross-entropy loss.
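As an illustration of the message-passing scheme above, the following is a minimal sketch of a single GNN layer with mean aggregation (including self-loops) in plain NumPy. The function name and the choice of ReLU for f are ours, not the paper's.

```python
import numpy as np

def gnn_layer(A, X, W, b):
    """One message-passing layer: mean-aggregate neighbor embeddings
    (including the node itself via a self-loop), then apply a linear
    map followed by ReLU. A: (n, n) {0,1} adjacency, X: (n, d) embeddings."""
    A_hat = A + np.eye(A.shape[0])            # include the node itself
    deg = A_hat.sum(axis=1, keepdims=True)
    H = (A_hat @ X) / deg                     # mean aggregation over N(v)
    return np.maximum(H @ W + b, 0.0)         # f: linear + ReLU

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # a 3-node path
X = rng.normal(size=(3, 4))
W, b = rng.normal(size=(4, 2)), np.zeros(2)
out = gnn_layer(A, X, W, b)                   # shape (3, 2)
```

Stacking such layers and ending with a softmax over the c classes recovers the semi-supervised node classification setup described above.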

Local differential privacy
Differential privacy (DP) is the state-of-the-art mathematical framework to quantify and reduce information disclosure about individuals [15,17,49]. DP bounds the influence of any individual tuple in the database to guarantee that one cannot infer the membership of any tuple from the released data, in a probabilistic sense. Usually, this is achieved by injecting noise into the data samples or the algorithm itself [16,17,46]. Mathematically, the most commonly used DP notion, ε-differential privacy, is defined as follows.
Definition 2.1 (ε-DP). Let D be the space of all possible databases and O be the output space. A randomized algorithm A : D → O is said to be ε-differentially private if, for any two databases D, D′ ∈ D that differ in exactly one record, and for any possible output o ∈ O,

Pr[A(D) = o] ≤ e^ε · Pr[A(D′) = o].

In a central DP (CDP) setting, a data curator (server) applies a randomized algorithm A to a given database D known to the curator, and ε-central DP is achieved if this central algorithm A satisfies Definition 2.1. In the local model of DP [42], on the other hand, the data curator is untrusted and can only collect individual data from each data owner without being given the complete central database. Therefore, to preserve privacy, each data owner must apply a randomized algorithm to privatize its own data before transmitting it to the server, and ε-local DP (LDP) is achieved if each such local randomizer satisfies Definition 2.1. LDP is a strictly stronger privacy notion in which the server is no longer trusted, and more noise needs to be injected under LDP to achieve the same privacy budget as CDP.
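As a concrete check of Definition 2.1, the snippet below verifies numerically that binary randomized response with flip probability 1/(1 + e^ε) satisfies the ε-DP likelihood-ratio bound on neighboring inputs 0 and 1. The helper function name is ours.

```python
import math

def rr_output_prob(bit, out, eps):
    """Pr[RR(bit) = out] for binary randomized response with
    flip probability 1 / (1 + e^eps)."""
    p_flip = 1.0 / (1.0 + math.exp(eps))
    return p_flip if out != bit else 1.0 - p_flip

eps = 1.0
# For every output, the likelihood ratio between neighboring inputs
# is bounded by e^eps; RR meets the bound with equality for out = 1.
for out in (0, 1):
    ratio = rr_output_prob(1, out, eps) / rr_output_prob(0, out, eps)
    assert ratio <= math.exp(eps) + 1e-12
```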

PROBLEM STATEMENT
We aim to protect the graph topology over decentralized nodes from an untrusted server. In our setting, each node stores information about itself and nothing about other nodes except the existence of links, i.e., v_v locally stores its feature vector x_v, its adjacency list a_v and its label vector y_v, and nothing else. Additionally, we assume that a server S has access to X, Y and V_L, but not the adjacency matrix A, which is kept private by the nodes. Collaborating with the nodes, the server tries to train a GNN model to correctly classify the unlabeled nodes in V_U, without revealing the existence of any link in A. More specifically, we aim to design a local randomizer R to privatize the adjacency lists a_v such that R achieves ε-link LDP as defined below.

Definition 3.1 (ε-link LDP). A randomized algorithm R : {0, 1}^n → O is said to be ε-link differentially private if, for any two adjacency lists a, a′ ∈ {0, 1}^n that differ by exactly one bit, i.e., ∥a − a′∥_1 = 1, and for any possible outcome o ∈ O,

Pr[R(a) = o] ≤ e^ε · Pr[R(a′) = o].

Remark 3.2. Note that two adjacency lists are said to be neighbors if they differ by exactly one bit, which corresponds to adding or removing exactly one edge in the graph. Therefore, if a mechanism satisfies ε-link LDP as defined in Definition 3.1, the influence of any single link on the released output is bounded and thus link privacy is preserved.
After the nodes send the privatized adjacency lists R(a_1), ..., R(a_n) to the server, we also aim to design a server-side algorithm to denoise the received data, yielding an estimated adjacency matrix Â. Finally, the server uses (X, Y, V_L, Â) to train a GNN to perform node classification as described in Equation (1). Additionally, note that although we assume the server has access to X, Y and V_L, it can be seen in Section 4 that our proposed method does not involve the server utilizing node features or labels to denoise the graph topology. Hence, our method is compatible with existing LDP mechanisms that protect node features and labels, such as LPGNN [36], and can serve as a convenient add-on to provide full local differential privacy on A, X and Y.

OUR APPROACH
To train GNNs with link local differential privacy over decentralized nodes, we propose Blink (Bayesian estimation for link local privacy), a new framework that injects noise into the graph topology on the client side to preserve privacy and denoises it on the server side to train better GNN models. The key idea is to independently inject noise into the adjacency matrix and the degree sequence such that the degree of each node can be utilized by the server to better denoise the graph structure. More specifically, as shown in Figure 3, the server uses the received noisy degree sequence to form priors and the noisy adjacency matrix as evidence to calculate posterior probabilities for each potential link. We describe our method in more detail in the following subsections.

Client-side noise injection
As suggested by previous studies [25,27,44], simply flipping the bits in adjacency lists at random renders the noisy graph too dense. Therefore, node degrees and graph density must be encoded in the private messages as well. Our main idea is to independently inject noise into the adjacency list and the degree of a node, and let the server estimate the ground truth adjacency matrix based on the gathered information. Based on this idea, we let the nodes send privatized adjacency lists and their degrees separately to the server, such that degree information can be preserved and utilized by the server to better denoise the graph topology. Specifically, for each node v_v, we spend the total privacy budget ε separately on the adjacency list a_v and the degree d_v, controlled by a degree privacy parameter δ ∈ [0, 1], such that we spend a privacy budget ε_d = δε on the degree and the remaining ε_a = (1 − δ)ε on the adjacency list. This is possible because of the basic composition theorem of differential privacy [17]. For the real-valued degree d_v, we use the widely adopted Laplace mechanism [16] to inject unbiased noise drawn from Laplace(0, 1/ε_d). For the bit sequence a_v, we use randomized response [43] to flip each bit independently with probability 1/(1 + exp(ε_a)). This procedure is described in Algorithm 1. By basic composition and the privacy guarantees of the Laplace mechanism and randomized response, we have the following theorem, stating that Algorithm 1 achieves ε-link LDP.

Theorem 4.1. Algorithm 1 satisfies ε-link LDP.

The detailed proof, together with the proofs of subsequent results, is included in Appendix A.
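The client-side step described above can be sketched as follows. This is an illustrative implementation under our own naming (the paper's Algorithm 1 may differ in detail); the budget split, Laplace scale and flip probability follow the text.

```python
import math
import numpy as np

def client_perturb(adj_list, eps, delta, rng):
    """Sketch of the client-side step: split budget eps into
    eps_d = delta * eps (degree, Laplace mechanism) and
    eps_a = (1 - delta) * eps (adjacency bits, randomized response)."""
    eps_d, eps_a = delta * eps, (1.0 - delta) * eps
    # Laplace mechanism: changing one link changes the degree by 1,
    # so sensitivity is 1 and the noise scale is 1 / eps_d.
    noisy_degree = adj_list.sum() + rng.laplace(0.0, 1.0 / eps_d)
    # Randomized response: flip each bit with prob 1 / (1 + e^{eps_a}).
    p_flip = 1.0 / (1.0 + math.exp(eps_a))
    flips = rng.random(adj_list.shape) < p_flip
    noisy_bits = np.where(flips, 1 - adj_list, adj_list)
    return noisy_bits, noisy_degree

rng = np.random.default_rng(0)
a = np.array([0, 1, 1, 0, 0, 1])          # one node's adjacency list
bits, d = client_perturb(a, eps=2.0, delta=0.1, rng=rng)
```

Each node runs this locally and transmits only (noisy_bits, noisy_degree); basic composition then gives the ε-link LDP guarantee stated above.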

Server-side denoising
After receiving the noisy adjacency lists ã_1, ã_2, ..., ã_n and degrees d̃_1, d̃_2, ..., d̃_n from the nodes, the server first assembles them into a noisy adjacency matrix Ã ∈ {0, 1}^{n×n} and a noisy degree sequence d̃ ∈ R^n. The server then uses d̃ to estimate link probabilities to be used as priors, and uses Ã as evidence to calculate the posterior probability of each potential link existing in the ground truth graph. At last, the server constructs graph estimations based on the posterior link probabilities and uses the estimated graph to train GNNs. These steps are described in greater detail below.

Estimation of link probability given degree sequence
Given the noisy degree sequence d̃, the server aims to estimate the probability of each link existing, which is then used as the prior probability in the next step. To estimate link probabilities, we adopt the β-model, which is widely used in social network analysis [5,9,33] and closely related to the well-known Bradley-Terry-Luce (BTL) model for ranking [7,22,28,39]. Given a vector β = (β_1, β_2, ..., β_n) ∈ R^n, the model assumes that a random undirected simple graph of n vertices is drawn as follows: for each 1 ≤ u < v ≤ n, an edge between nodes v_u and v_v exists with probability

exp(β_u + β_v) / (1 + exp(β_u + β_v)),

independently of all other edges. Hence, the probability of observing the (true) degree sequence d = (d_1, ..., d_n) from a random graph drawn according to the β-model satisfies

P_d(β) ∝ exp( Σ_u β_u d_u ) / Π_{u<v} (1 + exp(β_u + β_v)).

As a result, one can estimate the value of β by maximizing the likelihood P_d(β) of observing d. The maximum likelihood estimate (MLE) β̂ of β must satisfy the system of equations

d_u = Σ_{v≠u} exp(β̂_u + β̂_v) / (1 + exp(β̂_u + β̂_v)),  u = 1, ..., n.

Chatterjee et al. [9] show that, with high probability, there exists a unique MLE solution β̂ as long as the ground truth sequence (β_u)

Algorithm 2 MLE of link probability given degree sequence
Input: d ∈ R^n (degree sequence)
Output: P ∈ [0, 1]^{n×n} (link probability matrix, where P_{u,v} is the estimated probability that an edge exists between v_u and v_v)
1: function MLELinkProbability(d)
2:   initialize β ∈ R^n as a zero vector
3:   while not converging do
4:     β ← φ_d(β)   ⊲ MLE solution is a fixed point of function φ_d [9]
5:   end while
6:   for (u, v) ∈ {1, 2, ..., n}² ∧ u ≠ v do
7:     P_{u,v} ← exp(β_u + β_v) / (1 + exp(β_u + β_v))
8:   end for
9:   P.SetDiagonal(0)   ⊲ β-model does not consider self loops
10:  return P
11: end function

is bounded from above and below, and the authors also provide an efficient algorithm for computing the MLE when it exists. Consider the following function φ_d : R^n → R^n, where

φ_d(β)_u = log(d_u) − log( Σ_{v≠u} 1 / (exp(−β_v) + exp(β_u)) ).

Chatterjee et al. [9] prove that the MLE solution is a fixed point of the function φ_d and hence can be found iteratively, as in Algorithm 2. Therefore, if the degree sequence d were released to and observed by the server, the server could model the graph using the β-model and estimate link probabilities via MLE. However, the actual degree sequence d must be kept private from the server for the privacy guarantee. As per Algorithm 1, d is privatized through the Laplace mechanism and only the noisy d̃ can be observed by the server. Although the server cannot directly maximize the likelihood P_d(β) of observing d, the following theorem shows that the observable log-likelihood ℓ_d̃(β) = log(P_d̃(β)) is a lower bound of the unobservable ℓ_d(β) = log(P_d(β)), up to a gap.

Theorem 4.2. For any β ∈ R^n that is bounded from above and below, let M = max_{1≤u≤n} |β_u|. For any given constant c, with probability at least 1 − 1/c, ℓ_d̃(β) lower-bounds ℓ_d(β) up to an additive gap.

Consequently, the server instead maximizes ℓ_d̃(β), whose maximizer will be a fixed point of the function φ_d̃. However, Eq. (8) has a solution only if d̃_u ∈ (0, n − 1) for all u = 1, 2, ..., n. Therefore, the server first clips the values of d̃ to d̃⁺ ∈ [1, n − 2]^n and then calls the function MLELinkProbability(d̃⁺) from Algorithm 2 to find the link probabilities that maximize ℓ_d̃⁺(β). This step is described in Lines 2 and 3 of Algorithm 3.
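The fixed-point iteration of Algorithm 2 can be sketched as follows. This is a minimal, unoptimized illustration under our own naming, using the fixed-point map of Chatterjee et al. [9] and a fixed iteration count in place of a convergence test.

```python
import numpy as np

def mle_link_probability(d, iters=200):
    """Sketch of Algorithm 2: iterate the beta-model MLE fixed-point map,
    then form pairwise link probabilities. Degrees are clipped to [1, n-2]
    first, since the fixed-point equations need d in (0, n-1)."""
    n = len(d)
    d = np.clip(d, 1.0, n - 2.0)
    beta = np.zeros(n)
    for _ in range(iters):
        # phi(beta)_u = log d_u - log sum_{v != u} 1 / (e^{-beta_v} + e^{beta_u})
        denom = 1.0 / (np.exp(-beta)[None, :] + np.exp(beta)[:, None])
        np.fill_diagonal(denom, 0.0)
        beta = np.log(d) - np.log(denom.sum(axis=1))
    s = beta[:, None] + beta[None, :]
    P = np.exp(s) / (1.0 + np.exp(s))        # Pr[edge (u, v)] under the beta-model
    np.fill_diagonal(P, 0.0)                 # beta-model has no self loops
    return P

P = mle_link_probability(np.array([2.0, 3.0, 1.0, 2.0, 2.0]))
```

At the fixed point, each row of P sums to the corresponding (clipped) degree, which is what makes the result usable as a degree-consistent prior.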

Estimation of posterior link probabilities
The noisy degree sequence d̃ enables the server to estimate link probabilities to be used as priors, so that the server can use the received noisy adjacency matrix as evidence to evaluate posterior probabilities. For each potential link between v_u, v_v ∈ V, the server receives two bits, Ã_{u,v} and Ã_{v,u}, related to its existence. Because the privacy budget ε_a used in randomized response (Algorithm 1) is known to the server, the server can use the flip probability p_flip = 1/(1 + exp(ε_a)) to calculate the likelihood of observing the received bits (Ã_{u,v}, Ã_{v,u}) conditioned on whether a link exists between v_u and v_v in the actual graph. More specifically, we have

q_{u,v} = Pr[Ã_{u,v}, Ã_{v,u} | A_{u,v} = 1]  and  q′_{u,v} = Pr[Ã_{u,v}, Ã_{v,u} | A_{u,v} = 0].

Here, q_{u,v} is the likelihood of observing (Ã_{u,v}, Ã_{v,u}) given the existence of the link (v_u, v_v), and q′_{u,v} is the likelihood of observing (Ã_{u,v}, Ã_{v,u}) given its non-existence. Hence, together with the link probability p_{u,v} (without taking evidence into consideration) estimated solely from the noisy degree sequence, one can apply Bayes' rule to evaluate the posterior probability: for each 1 ≤ u ≠ v ≤ n,

P_{u,v} = q_{u,v} · p_{u,v} / ( q_{u,v} · p_{u,v} + q′_{u,v} · (1 − p_{u,v}) ).

For each 1 ≤ u ≠ v ≤ n, P_{u,v} is the posterior probability that a link exists between v_u and v_v conditioned on the evidence (Ã_{u,v}, Ã_{v,u}).
We will show the accuracy of this estimation of the graph topology in Section 4.3 by bounding the mean absolute error between P and the ground truth A.
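The Bayes step above can be sketched as follows. This is an illustrative implementation under our own naming, assuming the two RR bits for each pair are independent given the true link and that the prior matrix is symmetric.

```python
import math
import numpy as np

def posterior_link_probs(A_tilde, P_prior, eps_a):
    """Sketch of the Bayes step: combine the degree-based prior P_prior
    with the two RR bits (A_tilde[u, v], A_tilde[v, u]) for each pair."""
    p = 1.0 / (1.0 + math.exp(eps_a))          # RR flip probability
    n = A_tilde.shape[0]
    post = np.zeros((n, n))
    for u in range(n):
        for v in range(n):
            if u == v:
                continue
            bits = (A_tilde[u, v], A_tilde[v, u])
            # Likelihoods of the two observed bits given A_uv = 1 / A_uv = 0.
            q1 = math.prod((1 - p) if b == 1 else p for b in bits)
            q0 = math.prod(p if b == 1 else (1 - p) for b in bits)
            pr = P_prior[u, v]
            post[u, v] = q1 * pr / (q1 * pr + q0 * (1 - pr))
    return post
```

For example, with a large ε_a and both reported bits equal to 1, the posterior is pulled close to 1 even from a neutral prior of 0.5, since the evidence is then nearly noiseless.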

Graph estimation given posterior link probabilities
After obtaining P, we propose three variants of Blink for the server to construct the graph estimation used for GNN training.
Blink-Hard. The simplest and most straightforward approach is to keep only links whose posterior probability of existence exceeds that of absence, i.e., keep a link between v_u and v_v in the estimated graph Â if and only if P_{u,v} > 0.5.
It is clear that hard thresholding loses much of the information contained in P by simply rounding all entries to 0 or 1. However, when the privacy budget is low and uncertainty is high, the information provided by the nodes is usually too noisy to be useful for GNN training, and may even corrupt the GNN model [25]. Therefore, Blink-Hard is expected to perform better when the privacy budget is low, while at larger privacy budgets it is likely to be outperformed by the other variants of Blink.
Blink-Soft. Instead of hard thresholding, the server can keep all the information in P by using the posterior probabilities as edge weights. In this way, the GNN formulation in Equation (1) is modified as follows to adopt weighted aggregation:

x_v^{(k)} = f^{(k)}( Aggregate( { P_{u,v} · x_u^{(k-1)} : v_u ∈ N(v_v) } ) ),   (9)

where Aggregate(·) is a permutation-invariant aggregation function such as sum or mean. Detailed modifications of specific GNN architectures are included in Appendix B.
The soft variant utilizes extra information in P compared to Blink-Hard and is hence expected to achieve better performance as long as the information is not too noisy to be useful. We therefore hypothesize that Blink-Soft and Blink-Hard complement each other: the former is preferred when the privacy budget is relatively higher, while the latter is preferred at lower privacy budgets.
Blink-Hybrid. At last, we combine the hard and soft variants such that the server can eliminate unhelpful noisy information while utilizing the more confident information in P via weighted aggregation. The server keeps the highest ∥P∥_{1,1} entries of P and filters out the remaining ones by setting them to zero. This keeps only the top ∥P∥_{1,1} possible links in the graph, as ∥P∥_{1,1} is an estimate of the graph density ∥A∥_{1,1} (suggested by Theorem 4.4 and Corollary 4.6). This step is inspired by the idea of keeping only the top ∥A∥_{1,1} links from DpGCN [44]. Then, the server uses the remaining entries of P as edge weights and trains the GNN as in Equation (9). Blink-Hybrid is expected to incorporate the advantages of both Blink-Hard and Blink-Soft and perform well across all privacy levels.
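The three constructions can be sketched as follows. This is a simplified illustration with our own function name; details such as enforcing symmetry of the selected entries are omitted.

```python
import numpy as np

def estimate_graph(post, variant="hybrid"):
    """Sketch of the three Blink variants for turning posterior link
    probabilities `post` into the graph used for GNN training."""
    if variant == "hard":
        return (post > 0.5).astype(float)       # unweighted 0/1 edges
    if variant == "soft":
        return post.copy()                      # probabilities as edge weights
    # hybrid: keep the top-k entries, k = round(||post||_{1,1}), an
    # estimate of the graph density; use them as edge weights.
    k = int(round(post.sum()))
    flat = post.flatten()
    keep = np.zeros_like(flat)
    if k > 0:
        idx = np.argpartition(flat, -k)[-k:]    # indices of the k largest entries
        keep[idx] = flat[idx]
    return keep.reshape(post.shape)

post = np.array([[0.0, 0.9, 0.1],
                 [0.9, 0.0, 0.2],
                 [0.1, 0.2, 0.0]])
A_hard = estimate_graph(post, "hard")
A_hybrid = estimate_graph(post, "hybrid")
```

On this toy posterior, both the hard and hybrid variants keep only the confident pair (the two 0.9 entries); the hybrid variant additionally retains them as weights rather than rounding to 1.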

Theoretical analysis for utility
We have described our proposed approach, Blink, in detail in the previous sections. While its privacy guarantee has been shown in Theorem 4.1, we now theoretically demonstrate its utility guarantees.
Choice of utility metric. To quantify the utility of Blink, we first need to identify a metric to be bounded that reflects the quality of the estimated graph P. In the related literature, many metrics have been used to demonstrate the utility of differentially private mechanisms for graph analysis. For example, Hidano and Murakami [21] show that their estimated graph topology preserves the graph density; Imola et al. [23] bound the error in triangle count and k-star count; Ye et al. [51] evaluate the error in any aggregate graph statistic that can be estimated without bias from both degrees and neighbor lists, such as the clustering coefficient and modularity. However, none of these metrics can directly reflect the quality of the estimated graph topology or represent the performance of the GNN trained on the estimated graph, because they only involve aggregate, high-level graph statistics. In contrast, the performance of GNNs for node classification is very sensitive to link information from a microscopic or local perspective, as node information propagates along links and any perturbation of links leads to aggregating other nodes' information that should have been irrelevant, or to missing the information of neighboring nodes. This is one of the reasons that many prior works on privacy-preserving GNNs for node classification [25,27,44] only provide empirical evidence of the utility of their approaches. Although no metric can directly reflect the performance of the trained GNNs, the closer the estimated adjacency matrix P is to the ground truth A, the better the GNNs trained on P are expected to perform, and the closer they would come to the performance of GNNs trained with the accurate graph topology. Therefore, we evaluate the utility of Blink as a statistical estimator of the ground truth adjacency matrix A by bounding the expectation of the ℓ_1-distance between P and A, i.e., E[∥P − A∥_{1,1}]. If P is a binary matrix like A, this metric measures the number of edges in the ground truth graph that are missing or falsely added in the estimated graph; if P is a matrix of link probabilities, this metric measures to what extent the links in P are perturbed. This metric, just like GNN performance, is sensitive to link perturbations from a local perspective, and it reflects the overall quality of the estimated graph topology. It is also closely related to the mean absolute error (MAE) between P and A, defined as (1/n²) Σ_{u,v} |P_{u,v} − A_{u,v}|, which is a commonly used metric in empirical evaluation. Therefore, we use the expected ℓ_1-distance between P and A as the utility metric to quantify the utility of Blink, and we present an upper bound on it in the following Theorem 4.4.
Theorem 4.4. Assume that β̂ found by MLE in Algorithm 3 is the optimal solution that maximizes ℓ_d̃⁺(β). Then E[∥P − A∥_{1,1}] is bounded, where the expectation is taken over the randomness of RR (i.e., Ã) and the Laplace mechanism (i.e., d̃).

Remark 4.5 (Implications of Theorem 4.4). Theorem 4.4 is significant since it shows that P is a reasonable estimate of A in the sense that its ℓ_1-distance from A is of the same order of magnitude as ∥A∥_{1,1} itself, and A is usually sparse. For comparison, a random guess P whose entries are all 1/2 yields ∥P − A∥_{1,1} = n²/2, which is much larger than ∥A∥_{1,1} when A is sparse. This is reflected in Corollary 4.6 below. Since our approach, Blink, is developed based on randomized response, we also compare the given bound with the estimation error of randomized response. Since the flip probability of randomized response is 1/(1 + exp(ε)), the expected estimation error of RR, in terms of the ℓ_1-distance from A, is n²/(1 + exp(ε)), which is much larger than the bound given in Theorem 4.4 for sparse graphs. This shows that our approach successfully utilizes Bayesian estimation to denoise the noisy adjacency lists perturbed by randomized response and achieves a significant improvement over naïve approaches.

Empirical bound tightness. To empirically evaluate the estimation accuracy of the posterior link probabilities P against the ground truth graph topology A, and to inspect the tightness of the upper bound on the estimation error given in Theorem 4.4, we report the average mean absolute error (MAE) between P and A and its theoretical upper bound (as given in Theorem 4.4) on four well-known graph datasets in Figure 4.
Figure 4 shows that in all datasets, the MAE between P and the ground truth A is very small, and decreases to almost zero (on the order of 10⁻⁶ when ε = 8) as the privacy budget ε increases. This demonstrates that the inferred link probability matrix P is a close estimate of the unseen private adjacency matrix A, and thus can be used for GNN training. Furthermore, Figure 4 shows that the upper bound on the expected MAE given by Theorem 4.4 is very close to the empirical average MAE when the total privacy budget ε is small. However, the empirical results also suggest that the bound given in Theorem 4.4 is not tight when ε is large: our upper bound converges to 2∥A∥_{1,1} instead of 0 as ε → ∞, while the empirical estimation error converges to zero as ε grows. This has inspired us to prove Theorem 4.7 below, which states that the estimated graph from noisy messages becomes identical to the actual one when ε → ∞.

Theorem 4.7. As P is a random function of the total link privacy parameter ε, we write P = P_ε. Then, lim_{ε→∞} P_ε = A, i.e., when ε → ∞, the estimated graph from noisy messages converges to the ground truth.

Remark 4.8 (Implications of Theorem 4.7). Theorem 4.7 demonstrates that as ε goes to infinity, the estimated graph from noisy messages converges to the ground truth graph, and hence the trained GNN will match its theoretical upper bound, namely the performance of a GNN trained with the accurate graph topology. This is a desirable property of any differentially private mechanism, yet not one enjoyed by all existing mechanisms. For example, LDPGen [32] clusters structurally similar nodes together and generates a synthetic graph from noisy degree vectors via the Chung-Lu model [2]. Even when no noise is injected, the generated graph is not guaranteed to be identical to the ground truth graph, since only degree vectors are used to construct it. Theorem 4.7 shows that Blink achieves this desirable property; together with Theorem 4.4, which has been shown to be quite tight when ε is small, this shows that the estimation error of Blink is well controlled for all ε, as demonstrated empirically in Figure 4.
Remark 4.9. Note that Theorem 4.4, Corollary 4.6 and Theorem 4.7 are not violations of privacy. They indicate how well the server can estimate the ground truth, conditioned on the fact that the released information is theoretically guaranteed to satisfy ε-link LDP (as shown in Theorem 4.1). These are known as privacy-utility bounds, and it is standard practice in the local differential privacy literature for the server to denoise the received noisy information to aggregate useful information.
Remark 4.10 (Limitations). Note that Theorems 4.4 and 4.7 only capture the estimation errors of P and Â, and are not direct indicators of the performance of the GNNs trained on them. As discussed at the beginning of this section, there is no metric that directly reflects the performance of the trained GNNs. In general, however, the closer the estimated adjacency matrix Â is to the ground truth A, the better the GNNs trained on Â are expected to perform. Still, Theorems 4.4 and 4.7 alone are not sufficient to demonstrate the superior performance of the proposed approach, Blink, over existing approaches. For example, for L-DpGCN (to be introduced in Section 5.1), the mechanism retains around the same number of links in the estimated graph as the ground truth graph, and hence its estimation error is also approximately bounded by 2∥A∥₁,₁, similar to what we have proved in Theorem 4.4. This is also the case for the degree-preserving randomized response proposed in [21]. Therefore, we provide extensive empirical evaluations of the performance of Blink in Section 5 and show that Blink outperforms existing approaches in terms of utility at the same level of privacy.

Technical novelty
Splitting the privacy budget between degree information and adjacency lists has appeared in the literature [21,51]. However, in both works, the noisy degrees and adjacency lists are denoised or calibrated so that a target aggregate statistic can be estimated more accurately. Hidano and Murakami [21] use the noisy degree to sample from the noisy adjacency lists such that the overall graph density is preserved. Ye et al. [51] combine two estimators of the target aggregate statistic, one from noisy degrees and the other from noisy adjacency lists, and calibrate them for a better estimate of the target statistic, such as the clustering coefficient or modularity. As discussed previously, guarantees on the estimation error of these aggregate statistics are not sufficient to train useful GNNs due to their sensitivity to link perturbations. In contrast, our approach, Blink, uses the noisy degree information to estimate posterior link probabilities conditioned on the evidence of the noisy RR outputs for all possible links, via Bayesian estimation. To the best of our knowledge, this approach has not been explored in the literature.
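The Bayesian estimation step can be sketched for a single link as follows. This is a simplified illustration, not the paper's exact implementation: the `prior` argument stands in for the degree-based prior that Blink computes, and the posterior combines it with the randomized-response likelihoods via Bayes' rule.

```python
import math

def posterior_link_prob(prior, noisy_bit, eps_link):
    """P(link exists | RR output), by Bayes' rule.

    prior     -- degree-based prior P(link exists); an assumed input here
    noisy_bit -- the bit reported by randomized response (0 or 1)
    eps_link  -- the portion of the budget spent on the adjacency list
    """
    p_keep = math.exp(eps_link) / (1.0 + math.exp(eps_link))
    like_1 = p_keep if noisy_bit == 1 else 1.0 - p_keep      # P(observed bit | link)
    like_0 = (1.0 - p_keep) if noisy_bit == 1 else p_keep    # P(observed bit | no link)
    num = prior * like_1
    return num / (num + (1.0 - prior) * like_0)
```

At eps_link = 0 the RR output carries no information and the posterior collapses to the prior; as eps_link grows, the observed bit dominates.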

EXPERIMENTS
5.1 Experimental settings
Environment. To demonstrate the privacy-utility trade-off of our proposed mechanism, we run extensive experiments on real-world graph datasets with state-of-the-art GNN models. The experiments are conducted on a machine running Ubuntu 20.04 LTS, equipped with two Intel® Xeon® Gold 6326 CPUs, 256GB of RAM and an NVIDIA® A100 80GB GPU. We implement our mechanism and the baseline mechanisms using the PyTorch and PyTorch Geometric frameworks. To speed up execution, we use NVIDIA's TF32 tensor cores [10] during hyperparameter search at a slight cost of precision. All experiments other than the hyperparameter grid search use the more precise FP32 format to maintain precision.
Datasets. We evaluate Blink and the other mechanisms on real-world graph datasets, described as follows: • Cora and CiteSeer [50] are two well-known citation networks commonly used for benchmarking, where each node represents a document and links represent citation relationships. Each node has a bag-of-words feature vector and a category label.
• LastFM [35] is a social network collected from the music streaming service LastFM, where each node represents a user and links between them represent friendships. Each node also has a feature vector indicating the artists liked by the corresponding user and a label indicating the user's home country.
• Facebook [34] is a social network collected from Facebook, where each node represents a verified Facebook page and links indicate mutual likes. Each node is associated with a feature vector extracted from the site description and a label indicating the site category. This graph is significantly larger and denser than the previous datasets, and hence demonstrates the scalability and performance of our proposed method on larger graphs.
Table 1 summarizes the statistics of the datasets used in our experiments.
Baselines. To better present the performance of Blink, we implement the following baseline mechanisms for comparison.
(1) Randomized response (RR) [43] is included to demonstrate the effectiveness of our server-side denoising algorithm; the server directly uses the RR output of the adjacency matrix as the estimated graph without calibration. (2) Wu et al. [44] propose DpGCN, a central DP mechanism to protect graph links. It adds Laplacian noise to all entries of A and keeps the top ∥A∥₁,₁ entries as estimated links. However, in the LDP setting, ∥A∥₁,₁ is private and cannot be used directly by the server. Following the same idea, we propose an LDP variant, L-DpGCN, where each node adds Laplacian noise to the entries of its own adjacency list and sends the noisy list to the server. The server first estimates the number of links by ∥Ã∥₁,₁ and keeps the top ∥Ã∥₁,₁ entries of the collected noisy matrix Ã as estimated links.
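A minimal NumPy sketch of the L-DpGCN baseline as described above (simplified: it treats the whole matrix at once and ignores symmetrization):

```python
import numpy as np

def l_dpgcn(adj, eps, rng):
    """Add Laplace(1/eps) noise to every adjacency entry, then keep the
    top round(||A_tilde||_{1,1}) entries as the estimated links."""
    noisy = adj + rng.laplace(scale=1.0 / eps, size=adj.shape)
    k = int(np.clip(np.rint(noisy.sum()), 0, adj.size))  # estimated link count
    est = np.zeros_like(adj)
    if k > 0:
        est.flat[np.argsort(noisy, axis=None)[-k:]] = 1  # top-k noisy entries
    return est
```

With a large budget the noise is negligible, so the estimate recovers roughly the right number of links, matching the density-preserving behavior discussed later.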
(3) Solitude is proposed in [27] as an LDP mechanism to protect the features, labels and links of the training graph. Its link LDP setting is identical to ours, and we only use the link privacy component of the mechanism. In Solitude, each node perturbs its adjacency list via randomized response, and the server collects the noisy matrix Ã. However, the RR output is usually too dense to be useful for GNN training. Hence, Solitude learns a sparser adjacency matrix by replacing the original GNN learning objective with min_{Θ,Â} L(Â|Θ) + λ₁∥Â − Ã∥₁,₁ + λ₂∥Â∥₁,₁, where Θ is the set of GNN trainable parameters and L(Â|Θ) is the original GNN training loss under parameters Θ and graph topology Â. To optimize Equation (10), Solitude alternately optimizes over both variables. (4) Hidano and Murakami [21] propose DPRR (degree-preserving randomized response) to achieve ε-link local differential privacy when training GNNs for graph classification tasks. The algorithm denoises the randomized response output by sampling from the links reported by RR such that the density of the sampled graph is an unbiased estimate of the ground truth density. We implement DPRR as a baseline to compare with the Blink variants. (5) We also implement baselines designed for privacy-preserving graph synthesis and analysis. Qin et al. [32] propose LDPGen, a mechanism to generate synthetic graphs by collecting link information from decentralized nodes with link LDP guarantees, similar to ours. The key idea is to cluster structurally similar nodes together (via K-means [4]) and use the noisy degree vectors reported by the nodes to generate a synthetic graph via the Chung-Lu random graph model [2]. (6) Imola et al.
[23] propose locally differentially private mechanisms for graph analysis tasks, namely triangle counting and k-star counting. Their main idea is to use randomized response to collect noisy adjacency lists from nodes, and then derive an estimator of the target graph statistics from the noisy lists. We adopt the first part of their mechanism, i.e., randomized response, to derive a noisy graph topology to be used for GNNs. The RR mechanism in [23] injects noise only into the lower triangular part of the adjacency matrix, i.e., node v_i only perturbs and sends the bits a_{i,1}, ..., a_{i,i−1}, which forces the noisy adjacency matrix to be symmetric. Hence, we denote this baseline as SymRR. More discussion of these baseline methods and other related works can be found in Section 6.

Experimental setup. For all models and datasets, we randomly split the nodes into train/validation/test sets with a ratio of 2:1:1. To better demonstrate the performance of Blink and the baseline methods, we apply them to multiple state-of-the-art GNN architectures, including graph convolutional networks (GCN) [24], GraphSAGE [19] and graph attention networks (GAT) [40] (details of the model configurations can be found in Appendix B.1). Note that we do not experiment with Blink-Soft or Solitude on the GAT architecture because it is not reasonable to let all nodes attend over all others [40] (even in a weighted manner). To compare the DP mechanisms, we also experiment on all datasets with non-private GNNs, whose performance serves as a theoretical upper bound for all DP mechanisms. Moreover, following [25], we include the performance of multi-layer perceptrons (MLPs) for each dataset, trained after removing all links from the graph, which is considered fully link private. We experiment with all mechanisms under all architectures and datasets with ε ∈ {1, 2, ..., 8}. To showcase the full potential of our proposed method, for each combination of dataset, GNN architecture, privacy budget and mechanism, we run a grid search and select the hyperparameters with the best average performance on validation data over 5 trials, and report the mean and standard deviation of model accuracy (equivalently, micro F1-score, since each node belongs to exactly one class) on test data over 30 trials for statistical significance. Similar to previous works [27,36], we do not consider the potential privacy loss during hyperparameter search. The hyperparameter spaces used in the grid search are described in detail in Appendix B.2.
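The SymRR perturbation used by the last baseline above can be sketched as follows (illustrative, not the implementation from [23]): only the lower triangle is perturbed, then mirrored, so the noisy matrix is symmetric by construction.

```python
import numpy as np

def sym_rr(adj, eps, rng):
    """Randomized response on the lower triangle only, mirrored to the upper
    triangle so the noisy adjacency matrix is symmetric."""
    p_keep = np.exp(eps) / (1.0 + np.exp(eps))
    rows, cols = np.tril_indices(adj.shape[0], k=-1)
    bits = adj[rows, cols]
    flip = rng.random(bits.shape) >= p_keep
    noisy = np.where(flip, 1 - bits, bits)
    out = np.zeros_like(adj)
    out[rows, cols] = noisy
    out[cols, rows] = noisy  # mirror to the upper triangle
    return out
```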

Privacy-utility of the proposed Blink mechanisms
We report the average test accuracy of all three variants of Blink and the baseline methods over all datasets and GNN architectures in Figure 5. For all methods, the test accuracy increases as the total privacy budget increases, reflecting the privacy-utility trade-off common to differential privacy mechanisms. At all privacy budgets, L-DpGCN outperforms RR because the former takes the graph density into consideration and preserves the number of links in the estimated graph, while the latter produces too many edges after randomly flipping bits in the adjacency matrix, rendering the graph too dense. This is consistent with Remark 4.5, where we show the large improvement in estimation error of Blink over RR. We notice that the performance of SymRR and of our implementation of Solitude is on par with RR, which makes sense because SymRR is essentially RR applied only to the lower triangular part of the adjacency matrix, while Solitude denoises the graph topology based on RR outputs. Note that Lin et al. [27] had not made their implementation of Solitude publicly available by the time this paper was written, and they only performed experiments at large privacy budgets (ε ≥ 7); our results under similar privacy budgets agree with or outperform those presented in their paper. Additionally, LDPGen performs worse than the other mechanisms, which is expected because it is designed for graph synthesis rather than GNN training. Its performance also does not improve as the privacy budget increases, which is likewise expected: at all privacy budgets, the synthetic graph produced by LDPGen is always generated by a random graph model given noisy degree vectors. This has been discussed in Remark 4.8.

(Figure 7: GCN test accuracy on LastFM with all three variants at ε ∈ {1, 8}; a closer look at the results of Figure 5 on LastFM with GCN.)
It is evident from Figure 5 that at all levels of ε, the Blink variants generally outperform all baseline methods, because they also take individual degrees into consideration when estimating the prior probabilities, and exploit their confidence in links via hard thresholding, soft weighted aggregation, or both. Notably, only the Blink variants (especially Blink-Hard and Blink-Hybrid) consistently perform on par with the fully link-private MLP baselines, because these variants eliminate noisy, non-confident link predictions at low privacy budgets and high uncertainty. Additionally, among the three GNN architectures, the baseline methods perform better on GraphSAGE, with accuracy closer to MLP. This is because GraphSAGE convolutional layers have a separate weight matrix to transform the embedding of the root node and hence can learn not to be distracted by the embeddings of false-positive neighbors. See Appendix B.1 for details on the GNN architectures. Finally, for ε ∈ [4, 8], a range widely adopted in real-world industrial LDP practice [3,13], the Blink variants on different GNN architectures outperform the MLP and the baselines significantly in most cases, indicating their utility in real-world scenarios. Also, when ε ≥ 6, the Blink variants achieve test accuracy on par with the theoretical upper bound on all datasets and architectures. In the following paragraphs, we describe in greater detail the performance of and trade-offs among the Blink variants.
Performance of Blink-Hard. As demonstrated in Figure 5, one main advantage of Blink-Hard is that it is almost never outperformed by an MLP trained only on node features, which is not the case for the baseline methods. Existing approaches [21,44] to (central or local) link privacy on graphs aim to preserve the graph density in the estimated graph, i.e., to make ∥Â∥₁,₁ ≈ ∥A∥₁,₁; however, when ε is small, identifying the same number of links in the estimated graph as in the actual graph works against the promise of differential privacy. As Kolluri et al. [25] point out, 100% of the top ∥A∥₁,₁ links selected by DpGCN [44] at ε = 1 are false positives, corrupting the GNN results when aggregating neighbor embeddings. By keeping only links whose posterior probability of existence exceeds 0.5, Blink-Hard takes an alternative approach to estimating graph density under tight privacy budgets and high uncertainty. As shown in Figure 6, ∥Â∥₁,₁ estimated by Blink-Hard is much lower than the ground truth density ∥A∥₁,₁ when ε is small, and gradually increases to a level similar to ∥A∥₁,₁ as ε increases. In this way, Blink-Hard eliminates information that is too noisy to be useful at low privacy budgets, reducing false-positive link estimates and preventing them from corrupting the GNN model. As shown in Figure 6, among the much fewer links estimated by Blink-Hard, the true positive rates are much higher than those of DpGCN as reported in [25]. Therefore, Blink-Hard consistently outperforms the fully link-private MLP and the other baselines.

Performance of Blink-Soft. Although Blink-Hard outperforms the MLP and the baseline mechanisms, the elimination and rounding of link probabilities cause a significant amount of information loss. Blink-Soft aims to improve over the hard variant at moderate privacy budgets by utilizing this extra information while it is not too noisy. As described in Section 4.2.3, Blink-Soft uses the inferred link probabilities as weights in the GNN aggregation step (see Appendix B.1 for more details), which feeds the GNN more information and lets it perform better as long as the extra information is useful. As reflected in Figures 5 and 7, Blink-Soft outperforms Blink-Hard at moderate privacy budgets (i.e., ε ∈ [4, 6]) under almost all dataset and GNN architecture combinations. For higher privacy budgets, as both variants perform very well and are on par with the non-private upper bound, the performance gap is not significant. However, at lower privacy budgets (ε ∈ [1, 3]), Blink-Soft sometimes performs much worse than Blink-Hard and the fully private MLP baseline, for example on LastFM with the GCN model, at which we take a closer look in Figure 7. This is caused by the low information-to-noise ratio of the inferred link probabilities at low privacy budgets. Here, we confirm the hypothesis proposed in Section 4.2.3 that Blink-Hard and Blink-Soft complement each other: the hard variant performs better at low privacy budgets while the soft variant performs better at higher ones.
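A schematic of how the three variants turn the posterior probability matrix into a graph fed to the GNN. This is our illustrative reading of the descriptions above; the precise constructions are in Section 4.2.3:

```python
import numpy as np

def blink_hard(post):
    """Keep only links whose posterior probability exceeds 0.5."""
    return (post > 0.5).astype(float)

def blink_soft(post):
    """Feed the posterior probabilities to the GNN as aggregation weights."""
    return post

def blink_hybrid(post):
    """Drop non-confident links, keep posterior weights on the survivors."""
    return np.where(post > 0.5, post, 0.0)
```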
Performance of Blink-Hybrid. The hybrid variant combines the previous two, aiming to enjoy the benefits of both across all privacy settings. As shown in Figure 5, at low privacy budgets, Blink-Hybrid outperforms Blink-Soft by a significant margin and achieves test accuracy on par with Blink-Hard (and is thus not outperformed by the MLP), due to its elimination of noisy, useless information, which prevents false-positive links from poisoning the model. At higher privacy budgets, Blink-Hybrid is often able to perform better than the hard variant, thanks to keeping the link probabilities as aggregation weights. For example, in the configuration of LastFM with GCN in Figure 5, at which we take a closer look in Figure 7, Blink-Hybrid achieves accuracy close to Blink-Hard at ε ∈ [1, 3] while performing on par with Blink-Soft at ε ∈ [4, 8], achieving the best of both worlds. Although Blink-Hybrid is seldom able to outperform both the hard and the soft variants, it enjoys the benefits of Blink-Hard at low privacy budgets and those of Blink-Soft at higher ones.

On the effects of 𝛿
The degree privacy budget parameter δ is an important hyperparameter that affects the performance of the trained GNNs. In the previous experiments, we chose the value of δ by grid search, using the one associated with the best validation accuracy. To better understand the effect of different choices of δ on GNN performance, we report the test accuracy of a graph convolutional network on CiteSeer with Blink-Soft over varying δ values at ε ∈ {1, 8} in Figure 8. Different privacy budgets have different implications for the choice of δ. At a small privacy budget, i.e., ε = 1, GNN performance increases as δ increases, while at a larger privacy budget, e.g., ε = 8, lower δ values clearly yield better performance. This is because at very tight privacy budgets, the noisy adjacency matrix given by randomized response is too noisy to be useful; hence, the prior probabilities estimated from the noisy degree sequence become more important, and it is optimal to allocate more privacy budget to degrees. At much higher privacy budgets, the flip probability in randomized response becomes so small that the noisy adjacency matrix itself provides the information needed to effectively train the GNN; hence, it is preferable to allocate more privacy budget to the adjacency lists. If we denote by δ*(ε) the optimal δ at total privacy budget ε that achieves the best-performing GNNs (for instance, δ*(1) = 0.9 and δ*(8) = 0.1 in Figure 8), we conjecture that δ* decreases as ε increases, i.e., dδ*(ε)/dε < 0, and our experiments are consistent with this conjecture.
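The budget split that δ controls can be written down directly. By sequential composition, releasing the Laplace-noised degree under δε-DP and the RR bits under (1−δ)ε-DP together satisfies ε-link LDP. A sketch, with δ assumed to lie strictly between 0 and 1:

```python
def split_budget(eps_total, delta):
    """Split the total link privacy budget between degrees and adjacency bits."""
    assert 0.0 < delta < 1.0
    eps_degree = delta * eps_total        # spent on the Laplace-noised degree
    eps_link = (1.0 - delta) * eps_total  # spent on randomized response
    return eps_degree, eps_link
```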

Ablation studies
Naturally, one would be curious about which component of the proposed Blink mechanisms, the prior or the evidence, contributes more to the final link estimates P. To answer this question, we conduct ablation studies on the proposed methods. Figure 9 reports the MAE of the estimated link probabilities P against the ground truth A under Blink, its prior component alone, and its evidence component alone. Blink with the prior component only is equivalent to taking δ = 1, where the flip probability of RR becomes 1/2 and hence the noisy adjacency matrix provides no information as evidence. Blink with evidence only sets all prior probabilities to 1/2 so that the prior provides no extra information.
First, as shown in Figure 9, all mechanisms, complete and partial alike, have their MAE decrease as the privacy budget increases. More importantly, at tighter privacy budgets, the prior-only mechanism produces better estimates than its evidence-only counterpart, indicating that the prior contributes more to the final estimate in that regime. As ε grows (i.e., ε ≥ 7 in Figure 9), the noisy adjacency matrix becomes less noisy, so the evidence-only method starts to produce better estimates, playing the more important role. This agrees with the findings in Section 5.2.2 that it is optimal to allocate more privacy budget to degrees (i.e., the prior component) at smaller ε and vice versa. It is important to note that at all privacy budgets, the full Blink method significantly outperforms both single-component methods, indicating that our proposed method effectively utilizes both components to make better estimates and that both components are irreplaceable in Blink.

RELATED WORK
Graph neural networks. Recent years have witnessed a growing amount of work on graph neural networks for many graph tasks, such as node classification, link classification and graph classification. Many novel GNN models have been proposed, including GCN [24], GraphSAGE [19], GAT [40] and Graph Isomorphism Networks [47]. As our proposed mechanism estimates the graph topology and then feeds it into GNN models without interfering with the model architecture, we do not survey recent advances in GNNs in great detail here, but refer the audience to available surveys [8,45,54] for detailed discussions of GNN models, performance and applications.
Differentially private GNNs. There have been recent attempts in the literature to incorporate the notion of differential privacy into GNNs. Wu et al. [44] study adversarial link inference attacks on GNNs and propose DpGCN, a central DP mechanism to protect edge-level privacy, which can be easily modified to adopt a stronger LDP guarantee. Daigavane et al. [12] extend the well-celebrated DP-SGD algorithm [1] to GNNs and achieve stronger node-level central differential privacy. More recently, Kolluri et al. [25] propose a new GNN architecture to achieve edge-level central DP, where they separate out the edge structure and use only MLPs to model both node features and graph structure information. Following a similar intuition, Sajadmanesh et al. [37] propose a mechanism where the aggregation step is decoupled from the GNN and executed as a pre-processing step to save privacy budget. Combined with DP-SGD, they achieve stronger node-level central DP on the altered GNN architecture. For local differential privacy, Sajadmanesh and Gatica-Perez [36] propose an LDP mechanism that protects node features but not the graph topology. Lin et al. [27] extend [36] and propose Solitude to also protect edge information in an LDP setting. The link LDP notion of [27] is identical to ours. However, their link DP mechanism is not principled, and their estimated graph structure is learned by minimizing a loss function of the form L(Â|Θ) + λ₁∥Â − Ã∥₁,₁ + λ₂∥Â∥₁,₁ to encourage the model to choose less dense graphs. Hidano and Murakami [21] propose a link LDP mechanism for graph classification tasks and take an approach similar to ours, separately injecting noise into the adjacency matrix and the degrees. However, they aim to preserve node degrees in the estimated graph, like DpGCN, which is not suitable for node classification tasks and performs worse than our method, as shown in Section 5.
Privacy-preserving graph synthesis. Privacy-preserving graph publication is also closely related to our work: one aims to publish a sanitized, privacy-preserving graph given an input graph. Blocki et al. [6] utilize the Johnson-Lindenstrauss transform to achieve graph publication with edge differential privacy. Qin et al. [32] consider local edge differential privacy, where an untrusted data curator collects information from each individual user about their adjacency list and constructs a representative synthetic graph of the underlying ground truth graph with edge LDP guarantees, achieved by incrementally clustering structurally similar users together. More recently, Yang et al. [48] achieve differentially private graph generation by injecting noise into a graph generative adversarial network (GAN) such that the output of the GAN model is privacy-preserving. It is worth noting that in the settings of [6,48], a privacy-preserving synthetic graph is generated in a centralized way, i.e., the curator has access to the ground truth graph and perturbs it for privacy-preserving publication, which is a weaker threat model than ours. Qin et al. [32] consider a threat model similar to ours with local differential privacy, where the curator does not need access to the actual graph, but there is no theoretical upper bound on the distance from the synthetic graph to the ground truth graph, which we provide in Theorem 4.4.

Privacy-preserving graph analysis.
There exist prior works in the literature on graph analysis tasks with local differential privacy. Imola et al. [23] propose mechanisms to derive estimators for triangle counts and k-star counts in graphs with link LDP. Ye et al. [51] propose a general framework for graph analysis with local differential privacy that estimates an arbitrary aggregate graph statistic, such as the clustering coefficient or modularity. Their approach combines two estimators of the target aggregate statistic, one from noisy neighbor lists and one from noisy degrees, and derives a better estimator for the target statistic. However, it does not produce an estimated graph topology that can be used for GNN training; hence, we do not include it as a baseline in our experiments in Section 5.
Link inference attacks in GNNs. With the popularization of GNNs in research and practice in recent years, their privacy and security have garnered increasing attention in the research community, and several privacy attacks have been proposed that allow an attacker to infer links in the training graph. He et al. [20] propose multiple link stealing attacks by which an adversary can infer links in the training graph given black-box access to the trained GNN model, guided by the heuristic that two nodes are more likely to be linked if they share more similar attributes or embeddings. Wu et al. [44] consider a scenario where a server with full access to the graph topology trains a GNN by querying node features and labels from node clients (who do not host graph topology), and demonstrate that the nodes can infer links from the server by designing adversarial queries via influence analysis. Zhang et al. [53] propose graph reconstruction attacks where an adversary examines the trained graph embeddings and aims to reconstruct a graph similar to the ground truth graph used in GNN training. All these attacks share the same threat model, in which the GNN is trained with complete and accurate information and an adversary aims to infer links by examining the trained model. Our proposed solution, Blink, naturally defends against this kind of attack at its source, as a local differential privacy mechanism under a more severe threat model where even the server that trains the GNN has no non-private access to any links in the training graph.
Estimation of link probabilities given a degree sequence. The modeling and estimation of random graphs given a degree sequence is a common topic in network science and probability theory. Chatterjee et al. [9] discuss the maximum likelihood estimation of parameters in the β-model, which is closely related to the BTL model for ranking [7,28]. Parameters in the BTL model can be estimated via MM algorithms [22] or MLE [39]. Alternatively, the configuration model [29] can also be used to model random graphs given a degree sequence; it generates multigraphs that allow multiple edges between two vertices. In the configuration model, the expected number of edges between two nodes v_i and v_j conditioned on the degree sequence d is given by d_i d_j/(2m − 1), where 2m is the sum of all degrees. When this value is ≪ 1, it can be treated as the probability that there is (at least one) edge between v_i and v_j. We also attempted Blink with the configuration model instead of the β-model, but the link probabilities fail to stay consistently below 1.
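The configuration-model quantity above is easy to state, and stating it also makes the failure mode visible: for high-degree pairs the value can exceed 1, so it cannot be read as a probability. A sketch:

```python
def config_model_expected_edges(d_i, d_j, two_m):
    """Expected number of edges between nodes i and j in the configuration
    model: d_i * d_j / (2m - 1), where two_m is the sum of all degrees."""
    return d_i * d_j / (two_m - 1)
```

For a 2-regular triangle (degrees [2, 2, 2], 2m = 6) the value is 4/5; but for a pair of degree-5 nodes in a graph with 2m = 12 it exceeds 1, which is why this model was not usable in place of the β-model.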

CONCLUSION
Overall, the presented framework, Blink, is a step towards making GNNs locally privacy-preserving while retaining their accuracy. It injects noise separately into adjacency lists and node degrees, and uses the latter as the prior and the former as evidence to compute posterior link probabilities that estimate the ground truth graph. We propose three variants of Blink based on different ways of constructing the graph estimate from the inferred link probabilities. Theoretical and empirical evidence supports the state-of-the-art performance of Blink against existing link LDP approaches.
The area of differentially private GNNs is still young, with many open challenges and potential directions. There are several future research directions and improvements for this work. First, one may wish to tighten the bound in Theorem 4.4, which would require careful inspection of the β parameters found by MLE. An interesting direction is to design algorithms such that each node can optimally decide its own privacy parameter δ, avoiding the hyperparameter search over δ, which may potentially leak information [30]; we leave the investigation of this potential risk to future work. Additionally, one could explore different models for graph generation from the posterior link probabilities, or extend the proposed framework to other types of graphs, such as directed or weighted graphs. Furthermore, exploring the scalability of Blink to large-scale graph data is an important future direction. Finally, one may also want to incorporate other LDP mechanisms that protect features and labels (such as [36]) into Blink to provide complete local privacy protection over decentralized nodes.

A COMPLETE PROOFS
A.1 Proof of Theorem 4.1
Proof. Let ε be the privacy budget and δ be the degree privacy parameter. Assume a, a′ ∈ {0, 1}ⁿ are adjacency lists that differ only in the i-th bit, and (r₁, r₂) ∈ {0, 1}ⁿ × ℝ is an arbitrary output of the function LinkLDP described in Algorithm 1. For ease of presentation, we denote the function LinkLDP as mechanism M, the randomized response step in Lines 4-6 of Algorithm 1 as mechanism R, and the addition of Laplacian noise to the degree in Lines 7-9 of Algorithm 1 as mechanism L.
where (11) and (12) are due to independence, (13) holds because a_j = a′_j for j ≠ i, and (14) follows from the randomized response guarantee and the triangle inequality.
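The displayed derivation did not survive extraction; its key step has the following shape (our reconstruction under the notation above, where R denotes the randomized-response step spending (1−δ)ε and L the Laplace step spending δε; the exact inequality chain is in the paper):

```latex
\frac{\Pr[\mathcal{M}(a) = (r_1, r_2)]}{\Pr[\mathcal{M}(a') = (r_1, r_2)]}
  = \frac{\Pr[\mathcal{R}(a) = r_1]}{\Pr[\mathcal{R}(a') = r_1]}
    \cdot \frac{\Pr[\mathcal{L}(a) = r_2]}{\Pr[\mathcal{L}(a') = r_2]}
  \le e^{(1-\delta)\varepsilon} \cdot e^{\delta\varepsilon}
  = e^{\varepsilon}.
```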

Hence, by Chebyshev's inequality, the stated bound holds with probability at least 1 − 1/t². Note that the expectation is taken over the randomness of both Ã and the noisy degrees. For now, we only consider the expectation over the randomness of Ã and assume that the noisy degrees are fixed (as a result, the estimated priors are fixed), until further specified.
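For reference, Chebyshev's inequality in the form used here: for any random variable X with finite variance and any t > 0,

```latex
\Pr\left[\,\left|X - \mathbb{E}[X]\right| \ge t\sqrt{\operatorname{Var}(X)}\,\right] \le \frac{1}{t^2},
```

so the complementary event holds with probability at least 1 − 1/t².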

B EXPERIMENTAL DETAILS
B.1 GNN architectures
For all GNN architectures we experiment on, we use the same model structure: a convolutional layer with 16 units, followed by a ReLU operator and a dropout layer with dropout rate p_dropout, and finally another convolutional layer whose number of units equals the number of classes of the input graph. We experiment with different convolutional layers, including GCN, GraphSAGE and GAT, described in more detail below.
GCN. In a graph convolutional network [24], the node embedding is updated through h′_i = Σ_{j ∈ N(i) ∪ {i}} (1/√(d̂_i d̂_j)) W h_j + b, where W is a learnable weight matrix, b is a learnable additive bias, and d̂_i = 1 + |N(i)|. Note that we include self-loops and symmetric normalization coefficients such that the new node embedding also depends on the previous embedding of the node itself.
In a weighted graph setting with edge weight matrix P, we use h′_i = Σ_j (P̂_ij/√(d̂_i d̂_j)) W h_j + b, where P̂ = P + I adds self-loops to the GNN and d̂_i = Σ_j P̂_ij.
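A dense NumPy sketch of the weighted GCN propagation described above (P̂ = P + I with symmetric normalization; illustrative, not the PyTorch Geometric implementation used in the experiments):

```python
import numpy as np

def weighted_gcn_layer(H, P, W, b):
    """One GCN-style propagation on a weighted graph.

    H -- node embeddings (n x f), P -- edge weight matrix (n x n),
    W -- weight matrix (f x f'), b -- bias (f',)
    """
    P_hat = P + np.eye(P.shape[0])            # add self-loops
    d = P_hat.sum(axis=1)                     # weighted degrees (>= 1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    norm = d_inv_sqrt[:, None] * P_hat * d_inv_sqrt[None, :]
    return norm @ H @ W + b
```

With P = 0 the layer reduces to a per-node linear transform, which is exactly the fully link-private MLP behavior discussed in the experiments.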

GraphSAGE.
In GraphSAGE [19], the node embedding is updated through h′_i = W₁h_i + W₂ · mean_{j ∈ N(i)} h_j + b, where W₁, W₂ are learnable weight matrices and b is a learnable additive bias. The key difference from our configuration of GCN in Eq. (19) is that the transformation of the root node embedding h_i is learned separately from that of its neighbors. This enables GraphSAGE to weight the root node embedding more heavily than those of its neighbors, achieving better performance when the links are noisy and neighbors may not contribute positively to model performance.
In a weighted graph setting with edge weight matrix $\hat{A}$, we use

$$\mathbf{h}_i' = \mathbf{W}_1 \mathbf{h}_i + \mathbf{W}_2 \cdot \frac{\sum_j \hat{A}_{ij} \mathbf{h}_j}{\sum_j \hat{A}_{ij}} + \mathbf{b},$$

i.e., the mean aggregation over neighbors is replaced by a weighted mean.
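A corresponding NumPy sketch of the weighted GraphSAGE layer (the small constant guarding against zero-degree nodes is an implementation assumption):

```python
import numpy as np

def sage_conv(A_hat, X, W_self, W_neigh, b):
    """Weighted GraphSAGE layer (sketch): the root embedding is
    transformed by W_self, separately from the weighted mean of the
    neighbor embeddings, which is transformed by W_neigh."""
    deg = A_hat.sum(axis=1, keepdims=True)
    # weighted mean aggregation; guard against isolated (zero-degree) nodes
    neigh_mean = (A_hat @ X) / np.maximum(deg, 1e-12)
    return X @ W_self + neigh_mean @ W_neigh + b
```

Because the self and neighbor transforms are separate, an isolated node still receives a meaningful embedding from `W_self` alone, which matches the robustness-to-noisy-links argument above.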

Figure 1: The problem of link local differential privacy over decentralized nodes. Each node first perturbs its adjacency list before sending it to the server for privacy protection.

Local differential privacy (LDP) [11, 14, 42] is a rigorous privacy notion for collecting and analyzing sensitive data from decentralized data owners. Specifically, LDP ensures privacy by having each data owner perturb their data locally before sending it to the server, often through noise injection [16]. The focus of our work is to design LDP mechanisms to protect graph topology (i.e., links) over decentralized nodes. In this setting, the server has access to the features and labels of all nodes, but not to any links among them. The server must infer the graph topology from the noisy adjacency lists transmitted by the nodes, as shown in Figure 1. To illustrate the importance of link LDP in graph topology protection, consider a contact-tracing application installed on end devices for infectious disease control. The on-device application records interactions with other devices via Bluetooth, and the server trains a GNN to identify individuals at higher risk of virus exposure using the collected data. While local features, such as age and pre-existing conditions, can be voluntarily submitted by users and directly used by the server, this is not the case for contact history (i.e., links) due to the risk of revealing sensitive information such as users' whereabouts and interactions with others. Hence, it is crucial for end devices to perturb their links to achieve LDP before transmitting the information to the server. Our focus on link local privacy is driven by the following considerations. To start with, links represent the relationships between

Figure 2: Test accuracy of GCN [24] and MLP (GCN after removing links) on various graph datasets. The significant performance degradation caused by removing links indicates the importance of graph topology in GNN training.

nodes, which data owners are often unwilling to disclose. Moreover, the issue of link LDP in GNNs over decentralized nodes as clients has yet to be sufficiently addressed in the literature, and there is currently a lack of effective mechanisms to balance privacy and utility. [36] first proposes locally differentially private GNNs, but provides protection only for node features and labels, while assuming the server has full access to the graph topology. Current differential privacy techniques for protecting graph topology while training GNNs, such as those described in [21, 27, 44], are limited by poor performance and are often outperformed by MLPs trained without any link information at all (which naturally provides full link privacy). This issue with [44] has been investigated in [25], and we also demonstrate similar behaviors of other baselines in this paper. On a separate line of research, there have been recent works on privacy-preserving graph synthesis and analysis with link local privacy guarantees [23, 32, 51]. However, although some of these works do provide valid mechanisms to train GNNs with link LDP protection, these mechanisms are usually designed to estimate aggregate statistics of the graph, such as subgraph counts [23], graph modularities and clustering coefficients [32, 51], which are not useful for training GNNs. Hence, these works are not directly applicable to our setting, and we later show in this paper that they perform poorly in terms of GNN test accuracy. As such, there is a clear need for novel approaches to alleviate the performance loss of GNNs caused by enforcing privacy guarantees and to achieve link privacy with acceptable utility.

Figure 3: Structure of the proposed Blink framework.

The set V = {v_i : i ∈ {1, 2, . . . , n}} is the set of all n nodes, consisting of labelled and unlabeled nodes. Let V_L, V_U be the sets of labelled and unlabeled nodes, respectively; then V_L ∩ V_U = ∅ and V_L ∪ V_U = V. The adjacency matrix A ∈ {0, 1}^{n×n} represents all the links in the graph, where A_{i,j} = 1 if and only if a link exists between v_i and v_j.

Figure 5: Performance of Blink and other mechanisms. The x-axis represents the privacy budget ε and the y-axis represents test accuracy (%).

Figure 6: Density of estimated Â against ground truth A in Blink-Hard.

Figure 9: The MAE of the estimated link probabilities against the ground truth adjacency matrix A for full Blink, Blink with the prior component only, and Blink with the evidence component only on CiteSeer. The latter figure is a closer look at the prior component, whose trend is unclear in the former figure.