Reducing Access Disparities in Networks using Edge Augmentation

In social networks, a node's position is a form of social capital. Better-positioned members not only benefit from (faster) access to diverse information, but innately have more potential influence on information spread. Structural biases often arise from network formation, and can lead to significant disparities in information access based on position. Further, processes such as link recommendation can exacerbate this inequality by relying on network structure to augment connectivity. We argue that one can understand and quantify this social capital through the lens of information flow in the network. We consider the setting where all nodes may be sources of distinct information, and a node's (dis)advantage is determined by its ability to access all information available on the network. We introduce three new measures of advantage (broadcast, influence, and control), which are quantified in terms of position in the network using access signatures: vectors that represent a node's ability to share information. We then consider the problem of improving equity by making interventions to increase the access of the least-advantaged nodes. We argue that edge augmentation is most appropriate for mitigating bias in the network structure, and frame a budgeted intervention problem for maximizing minimum pairwise access. Finally, we propose heuristic strategies for selecting edge augmentations and empirically evaluate their performance on a corpus of real-world social networks. We demonstrate that a small number of interventions significantly increase the broadcast measure of access for the least-advantaged nodes (over 5 times more than random), and also improve the minimum influence. Additional analysis shows that these interventions can also dramatically shrink the gap in advantage between nodes (by over 82%) and reduce disparities between their access signatures.


Introduction
One of the promises of a highly-connected world is an impartial spread of opinions driven by free and unbiased sources of information, leading to an equitable exposure of opinion to the wide public. On the contrary, the social network platforms currently governing news diffusion, while offering many seemingly-desired features like search, personalization, and recommendation, are reinforcing the centralization of information spread and the creation of so-called echo chambers and filter bubbles [3]. A person's position within these networks often determines their access to information and opportunities such as jobs, education, and health information [10,20] and can confer advantage via influence on others [29]. Network position can therefore be viewed as a form of social capital [9,12]: a function of social structure that produces advantage [19].
The dynamics of how social networks are formed (including organic growth and recommendations) can lead to skews in network position based on demographics, gender, or other attributes. Experiments show that introducing even slight demographic bias to network formation processes can exacerbate differences in network structure between groups [52]. This becomes even more problematic when seen in light of boyd, Levy, and Marwick's argument [7] that position in the network is itself a feature that can lead to discrimination separately from individual demographic attributes, and modern social networks might be vehicles for a more direct propagation of (dis)advantage. Social networks' topology can cause better-positioned users to benefit more from the privileges of their position, leading to even better connections. On the other hand, less well-connected individuals (because of demographics, class, wealth, or other factors that drive network position) will find it much harder to improve their network status. As a result, the gap in power between the most and least advantaged users can lead to a cascading cycle where those with more capital have better opportunities for additional improvement, creating increased inequality.
In order to mitigate the differential accumulation of social capital, one could consider intervening in the network to change the spread of information. However, in order to do this in an automated fashion, we need ways to measure social capital based on network position. Fish et al. [25] first introduced the notion of information access as a resource and used it to propose a formal description for an individual's access to information. Beilinson et al. [5] expanded on this concept and defined an access signature to encode the "view" from a node of its access to information sent from other nodes in the network. We build on these approaches to model structural access advantage and formulate appropriate metrics for its evaluation. We design intervention strategies that use these metrics to achieve our main goal of ensuring equitable information access.
Our setup differs from prior work in a significant way. In influence maximization, a single piece of information is being spread in the network, and one can improve access for disadvantaged nodes by augmenting the set of initial sources. In contrast, we consider a setting such as those that occur on LinkedIn, where each node is the source of a unique piece of information, and access to all pieces is equally important. Given this key difference, we argue that instead of trying to select additional seeds for some or all of the pieces to improve dispersal, the natural intervention is adding edges to the network, representing the idea of purposefully strengthening weak ties [29] to mitigate bias in the structure and increase connectivity.
In this work, we have three primary contributions:
(1) Using a normative framework and drawing on prior work, we formulate three measures (broadcast, influence, and control) to model structural advantage with respect to access.
(2) We focus on intervening in the network using budgeted edge augmentation to improve the structural position of least-advantaged nodes, reduce the advantage gap, and ensure that nodes have similar "views" of the network (as measured via their access signature). At the core of our approach is the idea that to mitigate inequality, we should maximize the minimum access of the least-advantaged node, which in turn reduces to maximizing the minimum access between all pairs of nodes in the network.
(3) We introduce heuristic algorithms for selecting edge augmentations and empirically evaluate them on a corpus of social network data. We further show experimentally that while this process directly maximizes the broadcast measure of access advantage, it also simultaneously improves influence and control disparities among nodes, as well as making node access signatures more uniform.

Related Work and Preliminaries
Motivated by the design of viral marketing strategies, Domingos and Richardson [24] introduced an algorithmic problem for social networks in which one wished to convince an initial subset of individuals to adopt a new product or innovation in order to maximize the cascade of further adoptions. This model can be generalized to many types of information spread beyond adoption and was formalized as the discrete optimization problem of influence maximization by Kempe et al. [34], leading to an extensive literature on the subject (see the survey [40]), including many applications in public health awareness [57,61,62,63].

Structural Advantage
Information propagation in networks has been studied for decades in social and computing sciences [10,20], and network position is known to dramatically impact a node's access to other network members [29]. It has been repeatedly argued that one's position in a network is itself a form of wealth or social capital [9,12,19,30], enabling better and faster access to circulating information and important individuals. This translates into better access to opportunities (such as jobs and scholarships) and enables well-positioned people to be more effective brokers, make better decisions, and innovate more efficiently [9]. Further, in public health scenarios, people rarely act on mass-media information unless it is also transmitted through personal ties [33,48], leading to well-connected nodes having improved outcomes in crises.

Bias in Network Structure
The network itself can act as a transmitter for bias when the structural advantages described above interact with network formation mechanisms that encourage homophily and clustering of demographic groups. Schelling demonstrated how local neighborhood-based decisions could lead to segregation [50], and recent work has explored how bias in localized decisions about new connections can result in networks that have significant skew [32,37]. Sociologists have extensively studied the role of social status in shaping network structure, showing in small-scale experiments that it significantly influences whether individuals end up in central vs. peripheral network positions [15,41].
More recently, studies in network science have extended these ideas to large-scale networks by developing computational methods for characterizing the structural influence of social status at scale [2,38]. For example, Clauset et al. quantify the ways in which institutional reputation (and the auxiliary features of demographics and productivity) shapes the structure of faculty hiring networks among academic departments [17,60] and subsequently the differential spread of ideas [44].

Algorithmic Fairness in Information Propagation
In the setting of information access, natural questions of fairness arise in the problem of ensuring similar allocation among demographic groups, which are often represented as disjoint subsets of nodes. Inspired by the literature on social position initiated by Granovetter's strength of weak ties [29] and framed in the context of online social networks by boyd, Levy, and Marwick [7], there has been a rash of recent work on computational questions around fairness in access on social networks [1,4,25,31,45,51,55,59]. The key underlying idea is that information access is a resource, and Fish et al. [25] argued that access based on network position is a form of privilege, which they used to define a notion of individual fairness.
Much of the work on defining and applying fairness has been undertaken in the influence maximization framework. One important thrust has been improving equity among demographic groups within a network, typically defined based on protected classes (e.g., race, gender) [1,31,45,51,55]. These works develop metrics and algorithms to ensure that roughly equal amounts of information reach each demographic group while optimizing influence maximization. In all cases, a single piece of information is being spread in the network, and they intervene by augmenting the seed set. The one exception is Jalali et al. [31], who add edges instead of seeds. We note that while a few papers have considered edge augmentation to maximize the influence of a given group [4,22], they inherently define advantage to be access to the seed set.
Several other recent papers in the space consider variants of the basic access problem. Becker et al. [3] consider multiple sources of diverse information in a network and maximize the expected number of nodes receiving at least a prescribed number of types of information. Ramachandran et al. [46] use a diffusion model of mobility dynamics and try to achieve equity in group-level access in the facility location problem.

Preliminaries
As in the discrete optimization setting of [34], we use a stochastic information flow model describing how information might transmit from one node to another along the edges of $G$ (for example, Independent Cascade, Linear Threshold, or an infection flow model from epidemiology [34]). These models all work by assuming that at time zero, an initial seed set of nodes possesses the information to be spread. For each seed $v_i$ in the seed set, there is then a (potentially hard to compute) probability $p_{ij}$, which we call the access distance, that node $v_j \in V$ possesses $v_i$'s information once the spread process has terminated. Inversely, $p_{ij}$ is called the reach of $v_i$ with respect to $v_j$. Since we restrict our attention to the undirected setting (as social network links require mutual consent and typically create a giant connected component; Facebook's has 99.9% of users [56]), $p_{ij} = p_{ji}$ and we use them interchangeably.

Independent Cascade Model
In this work, we utilize the standard probabilistic model of influence propagation, Independent Cascade (IC) [34], with a uniform transmission probability $\alpha$. In this model, a node exists in one of three states: ready to receive, ready to transmit, or dormant. Initially (at time zero), all nodes are ready to receive information, while the seed nodes also possess the information and are ready to transmit. At each time step, a node that is ready to transmit sends its information to neighbors by transmitting along each incident edge independently with probability $\alpha$. All such transmissions are imagined to happen simultaneously, after which the transmitting node goes dormant. Computing the access probabilities for Independent Cascade is #P-hard [16], so we use standard Monte Carlo simulations to estimate them when needed.
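As a rough illustration of the estimation step, the following sketch simulates Independent Cascade from every node and averages the outcomes. The function names, the adjacency-dictionary representation, and the default number of trials are assumptions made for this example and are not taken from the implementation evaluated later.

```python
import random
from collections import deque

def simulate_ic(adj, source, alpha, rng):
    """Run one Independent Cascade from `source`; return the set of informed nodes."""
    informed = {source}
    frontier = deque([source])  # nodes that are ready to transmit
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            # one independent transmission attempt per incident edge
            if v not in informed and rng.random() < alpha:
                informed.add(v)
                frontier.append(v)
        # u is now dormant and never transmits again
    return informed

def estimate_access(adj, alpha, trials=2000, seed=0):
    """Estimate p[i][j], the probability that j ends up with i's information."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    counts = {i: {j: 0 for j in nodes} for i in nodes}
    for i in nodes:
        for _ in range(trials):
            for j in simulate_ic(adj, i, alpha, rng):
                counts[i][j] += 1
    return {i: {j: counts[i][j] / trials for j in nodes} for i in nodes}

if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1]}  # hypothetical 3-node path
    p = estimate_access(path, alpha=0.5)
    print(p[0][2])  # should be near alpha**2 for this path
```

Processing transmitting nodes one at a time rather than in synchronized rounds does not change which nodes are eventually informed, because each edge is still attempted at most once with probability $\alpha$.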
Access signatures
Since we view a piece of information as being uniquely identified by its originator, describing the access of a node requires a vector of $n - 1$ probabilities, which is standardized to length $n$ (by setting $p_{ii} := 1$) to facilitate easy indexing and comparison across nodes. These vectors are called information access signatures, and were introduced by Beilinson et al. [5], who argued that nodes that have similar "status" based on network position receive similar information. The signature encodes the "view" from a node of its access to information sent from the other nodes in $G$; people who are likely to receive information from the same part(s) of the network will have similar signatures.
Definition 1 (Access Signature [5]). The access signature $s : V \to \mathbb{R}^n$ of a node $v_i \in V$ in graph $G$ on $n$ nodes is:
$$s(v_i) = (p_{i1}, p_{i2}, \ldots, p_{in}).$$
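To make the definition concrete, here is a small sketch that reads signatures off a matrix of pairwise access probabilities and compares two of them with the Manhattan distance. The matrix below is a hand-computed example for a 3-node path with transmission probability 0.5; the helper names are hypothetical.

```python
def access_signature(p, i):
    """Row i of the pairwise access matrix, with the self-entry fixed to 1."""
    row = list(p[i])
    row[i] = 1.0
    return row

def manhattan(sig_a, sig_b):
    """L1 distance between two access signatures of equal length."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))

# Path 0 - 1 - 2 with transmission probability 0.5:
# adjacent nodes have access 0.5, the two endpoints have access 0.5 * 0.5.
P = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.5],
    [0.25, 0.5, 1.0],
]

if __name__ == "__main__":
    s0, s1, s2 = (access_signature(P, i) for i in range(3))
    print(manhattan(s0, s1), manhattan(s0, s2))
```

Here the two endpoints have the most dissimilar "views" of the network, even though their signatures are permutations of one another, because a signature is indexed by the identity of the source.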

Structural Advantage
How does network position impact access and influence? In social networks, structural advantage can manifest in many ways.
Inspired by prior work, we formalize three distinct notions of advantage arising from network position and propose measures for quantifying each.

Access-based Definitions
We begin by defining analogues of graph-theoretic distance, diameter, and betweenness centrality, highlighting when the access-based variants diverge from their traditional counterparts.
Access Distance
In graph theory, the distance between nodes $v_i$ and $v_j$ is the number of edges in a shortest $v_i v_j$-path. To adapt this to an information flow setting, we let the access distance be $\mathrm{dist}^*(v_i, v_j) = p_{ij}$, the probability that $v_j$ receives $v_i$'s information after the completion of Independent Cascade. We observe that these measures can diverge in even simple networks. Consider two nodes connected by an edge; they have distance 1 and access distance $\alpha$. If instead these nodes were connected by $t$ disjoint paths of length 2, they would have distance 2, but access distance $1 - (1 - \alpha^2)^t$, since each path independently transmits with probability $\alpha^2$. While the nodes are graph-theoretically closer in the first scenario, in the information access setting they are closer in the second once $t$ is large enough that $1 - (1 - \alpha^2)^t > \alpha$.
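The comparison between a single edge and several disjoint two-edge paths can be checked numerically. The sketch below assumes a uniform transmission probability `alpha` strictly between 0 and 1, and paths that share only their endpoints, so that they transmit independently; the function names are illustrative only.

```python
def access_via_edge(alpha):
    """Access between two adjacent nodes with no other connection."""
    return alpha

def access_via_disjoint_paths(alpha, t, length=2):
    """Access between two nodes joined only by t disjoint paths of `length` edges."""
    # a single path transmits with probability alpha**length; paths are independent
    return 1.0 - (1.0 - alpha ** length) ** t

def paths_needed_to_beat_edge(alpha, length=2):
    """Smallest t for which the disjoint paths give more access than one edge."""
    t = 1
    while access_via_disjoint_paths(alpha, t, length) <= access_via_edge(alpha):
        t += 1
    return t

if __name__ == "__main__":
    for alpha in (0.1, 0.5):
        print(alpha, paths_needed_to_beat_edge(alpha))
```

With `alpha = 0.5`, two paths give access $1 - 0.75^2 = 0.4375$, still below 0.5, while three paths give $1 - 0.75^3 \approx 0.578$.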
Access Diameter
For large networks, we often rely on summary statistics as indicators of network structure. One such metric is the diameter, defined to be the maximum distance between any two nodes (equivalently, the length of a longest shortest path). The analogous notion in the information access setting is then the smallest access distance between two nodes (equivalently, the lowest probability of pairwise information transmission). We call this the access diameter:
$$\mathrm{diam}^*(G) = \min_{v_i, v_j \in V} p_{ij}.$$
Access Centrality
Finally, since we are interested in assessing influence or control with respect to information flow, we consider the betweenness centrality, which measures how often a node appears on the shortest paths between others. Specifically, if we let $\sigma_{ij}$ be the number of shortest $v_i v_j$-paths, and $\sigma_{ij}(k)$ the number of shortest $v_i v_j$-paths passing through vertex $v_k$, we can define the betweenness centrality of $v_k$ as
$$C_B(v_k) = \sum_{i < j;\; i, j \neq k} \frac{\sigma_{ij}(k)}{\sigma_{ij}}.$$
One can think of this as measuring the brokerage ability of a node in a world where information flows along the shortest paths. To adapt to the Independent Cascade model, we want to measure the fraction of other nodes' pairwise access that depends on $v_k$. In other words, the access centrality is
$$C_B^*(v_k) = \sum_{i < j;\; i, j \neq k} \frac{p_{ij}(k)}{p_{ij}},$$
where $p_{ij}(k) = p_{ij} - p'_{ij}$ can be computed using the access distance $p'_{ij}$ in $G' = G \setminus v_k$. We note this is computationally expensive, as one must re-estimate access distances in $G \setminus v$ for each vertex $v$.
To see where these two notions diverge, consider nodes $u, v$ connected with a path of length two through node $w$. The betweenness and access centrality of $w$ are both 1. Now augment this graph by adding $t$ disjoint $uv$-paths of length 3; the betweenness centrality of $w$ remains 1, but the access centrality tends to 0 as $t$ increases, as the fraction of information passing through $w$ becomes insignificant.
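The divergence in this example can be traced in closed form for the pair of endpoints. The sketch below computes only the contribution of that single pair to the access centrality of the middle node, assuming the added length-3 paths share nothing but their endpoints; the names `alpha`, `t`, and the function itself are illustrative.

```python
def endpoint_pair_term(alpha, t):
    """Share of the endpoints' pairwise access that depends on the middle node.

    The endpoints are joined by one 2-edge path through the middle node plus
    t disjoint 3-edge paths that avoid it.
    """
    miss_long = (1.0 - alpha ** 3) ** t      # every 3-edge path fails
    miss_short = 1.0 - alpha ** 2            # the 2-edge path fails
    p_with = 1.0 - miss_short * miss_long    # access in the full graph
    p_without = 1.0 - miss_long              # access once the middle node is removed
    return (p_with - p_without) / p_with

if __name__ == "__main__":
    for t in (0, 1, 5, 20, 100):
        print(t, endpoint_pair_term(0.5, t))
```

With `t = 0` the term equals 1, matching the betweenness centrality, and it shrinks toward 0 as more alternative paths are added, while the shortest path (and hence the betweenness contribution) is unchanged.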

Measures of Advantage
We now formalize three different notions of structural advantage, arising from various perspectives on fairness and information flow.

Broadcast Advantage:
From a fairness point of view, Fish et al. [25] argued that the performance of a source should be measured by how effectively it reaches least-advantaged nodes. In this vein, we propose our first advantage function, broadcast, to measure how difficult it is for a node to disseminate its information to all others in the network.
Definition 2 (Broadcast Advantage). The broadcast advantage of a node is the worst-case probability that its information is received (equivalently, the minimum entry in its access signature):
$$\mathrm{broadcast}(v_i) = \min_{v_j \in V} p_{ij}.$$
In some sense, this represents how "loud" the node is: a larger broadcast means a better probability that everyone else in the network will receive your information. Consider the case of recruiters using a network like LinkedIn, wanting to spread information about a job opportunity. In order to ensure a diverse candidate pool and broad reach, the employer wants a high probability the ad will reach all suitable nodes in the network. Since well-connected users receive many such ads, the measure of recruiting effectiveness will depend on how well they can disseminate the information to the least-advantaged members of the network. Better-positioned recruiters will have higher broadcast.
Further, social media is often used in public health epidemiologic monitoring and surveillance for early detection of disease outbreaks. Staff responsible for dispelling misinformation and identifying high-risk or affected groups need access not only to the majority of people, but especially to those who are poorly-connected (and thus at risk of being neglected in treatment [27,49]), motivating us to improve their broadcast.
From another perspective, the broadcast is a lower bound on the probability that $v_i$ will get information from $v_j$, regardless of which $v_j$ is selected! Increasing $\mathrm{broadcast}(v_i)$ necessarily improves information flow to/from the parts of the network that are currently least accessible from $v_i$, increasing the novelty and diversity of its information. Novel information often represents a resource or opportunity due to local scarcity, and users with access to it enjoy social and economic advantages, including more success in wages, promotion, job placement, and creativity [9,29].

Influence Advantage:
Network prominence has been studied as a type of advantage [8,35]. A central or well-connected node is more likely to have high visibility, which Jackson's friendship paradox argues can lead to over-representation and increased influence [30]. This type of advantage does not require the ability to reach all nodes in the network, just many of them.
Being able to disseminate information to a large set of other members enables a user to build their social reputation, express and diffuse their opinion, and discover novel content and information [22], which can be viewed as media power or celebrity capital. This may also lead to opportunities for revenue from advertisement [14]. Consider the example of collaborations in a scientific community. If someone can reach more people to share her research, she gets more recognition and feedback, which enables improvement, collaboration opportunities, and directions or ideas for future work [21,53]. We propose influence advantage as a measure of this form of structural advantage, drawing on influence maximization [34] in choosing a quantification.
Definition 3 (Influence Advantage). The influence advantage of a node is the average probability that its information is received (equivalently, the mean of the entries in its access signature):
$$\mathrm{influence}(v_i) = \frac{1}{n} \sum_{j=1}^{n} p_{ij}.$$

Control Advantage:
Burt [9] introduced the idea of brokerage advantage. Individuals in networks with many "structural holes" may derive information and control benefits from the lack of external connectivity among people they can reach. Burt introduced this form of social capital as an information benefit or vision advantage that improves performance by providing early access to diverse and novel perspectives, ideas, and information. Hence, a person's reach is a form of power as it enables her to broker favors and consolidate strength by being uniquely positioned to coordinate the actions of others. We call this type of structural advantage control.
While Burt proposed several ways to measure structural holes, including bridge count [13] and network constraint/redundancy [9], in more recent work Jackson [30] used betweenness centrality [26] to measure brokerage advantage. This generic measure of importance in a network captures a node's ability to act as an intermediary to coordinate others, where nodes rely on it in order to reach other users along shortest paths. Higher centrality corresponds to more control over information flow in the network. In turn, we use access centrality to measure the control advantage.
Definition 4 (Control Advantage). The control advantage of a node is given by its access centrality:
$$\mathrm{control}(v_i) = \sum_{j < k;\; j, k \neq i} \frac{p_{jk} - p'_{jk}}{p_{jk}},$$
where $p'_{jk}$ is the access distance between $v_j$ and $v_k$ in $G \setminus v_i$.
We observe that control can be rewritten as a nested sum over nodes, revealing a useful finer-grained notion of advantage. For example, suppose the node $v_i$ has one neighbor $v_j$, which is a leaf, and another neighbor which is a member of a large clique. Clearly, $v_i$ has a large degree of control over $v_j$, as it is an intermediary to all access to the clique, yet $\mathrm{control}(v_i)$ might remain small, as $v_i$ plays little role in access between clique nodes. We use $\mathrm{control}_{ij}$ to denote the brokerage $v_i$ has over information reaching node $v_j$, which aggregates the terms of the nested sum in which $v_j$ is an endpoint.
Our measure can then be written as $\mathrm{control}(v_i) = \sum_{j} \mathrm{control}_{ij}$. When trying to mitigate inequity in access, we would like to see the control values decrease for better-positioned nodes. Additionally, we argue that in an ideal network, no node has a monopoly over others' access to information, and we would like to prevent situations where $\mathrm{control}_{ij}$ is close to 1 for any pair $(i, j)$.
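The three measures can be illustrated on a tiny graph where access probabilities are computed exactly by enumerating every set of "live" edges. This is a sketch under stated assumptions: exhaustive enumeration is only feasible for a handful of edges, control is taken to be the unnormalized sum over unordered pairs of the share of access lost when the node is removed, and all function names are hypothetical.

```python
from itertools import combinations

def _find(parent, v):
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def exact_access(nodes, edges, alpha):
    """Exact pairwise access by summing over all 2**len(edges) live-edge subsets."""
    nodes, m = list(nodes), len(edges)
    p = {i: {j: 0.0 for j in nodes} for i in nodes}
    for mask in range(1 << m):
        live = [edges[e] for e in range(m) if (mask >> e) & 1]
        weight = alpha ** len(live) * (1 - alpha) ** (m - len(live))
        parent = {v: v for v in nodes}
        for a, b in live:
            parent[_find(parent, a)] = _find(parent, b)
        for i in nodes:
            for j in nodes:
                if _find(parent, i) == _find(parent, j):
                    p[i][j] += weight
    return p

def broadcast(p, i):
    return min(p[i].values())                 # minimum entry of the signature

def influence(p, i):
    return sum(p[i].values()) / len(p[i])     # mean entry of the signature

def control(nodes, edges, alpha, k):
    """Share of other pairs' access that is lost when node k is removed."""
    rest = [v for v in nodes if v != k]
    p = exact_access(nodes, edges, alpha)
    p_removed = exact_access(rest, [e for e in edges if k not in e], alpha)
    return sum((p[i][j] - p_removed[i][j]) / p[i][j]
               for i, j in combinations(rest, 2) if p[i][j] > 0)

if __name__ == "__main__":
    nodes, edges = ["u", "w", "v"], [("u", "w"), ("w", "v")]
    p = exact_access(nodes, edges, 0.5)
    print(broadcast(p, "u"), influence(p, "w"), control(nodes, edges, 0.5, "w"))
```

The `control` function recomputes all access probabilities on the graph with one node deleted, which is the source of the computational cost noted for access centrality.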

Edge Intervention & Welfare
In contrast to the standard framework of influence maximization, we argue that when considering information flow in a network, it is important to have access to information from all individuals, not just a seed set. Further, given this shift in objective, adjustments to the model of intervention are warranted, and we propose edge augmentation as the natural candidate. We support our argument from three perspectives: variety, structure, and voice.
Variety
Since ideas travel a variety of paths from many sources [28,42,58], access to more diverse information and a greater number of individuals is important [29] and can provide a vision advantage that translates into social capital [9]. Key functionalities of social networks like LinkedIn rely on the fact that important information is frequently being disseminated from a multitude of constantly-changing sources. Traditional influence maximization is insufficient for assessing access and proposing equity-improving interventions in this setting, as we no longer know the seed set, nor can we afford to try and augment sources for each new announcement.
Structure
Granovetter introduced the idea of network manipulation to achieve specific goals [29]. Since network position is a critical form of social capital in information access, and positional disparities arise from biases in the network structure, we argue that interventions which change the underlying connectivity of the network are necessary. The natural candidate is to increase access through edge augmentation. This approach is further supported when one thinks of these edges as representing the addition of weak ties to the transmission network, as research shows that information can traverse greater social distance and reach more people when diffused along weak ties instead of strong ones [29].
Voice
While it is easy to focus on improving access for poorly-positioned nodes, it is also important to consider the effect of interventions on already-advantaged users. Specifically, node interventions increase the reach (and thus influence) of selected individuals [30], essentially amplifying their information within the network. To give voice to all participants, we argue that edge augmentation improves fairness by increasing the reach of all nodes.
Now that we have argued for using edge augmentation to intervene in the network, we turn to the question of which structural measure of advantage to optimize. We use a normative framework to select one of broadcast, influence, and control, and draw on the Rawlsian Maximin argument [47] in proposing that we should maximize the advantage of the least advantaged node(s).
To choose a notion of advantage, we begin by observing that optimizing influence encourages the formation of edges to well-positioned nodes. Therefore, nodes with better connections become more attractive to connect to [30], leading to a rich-get-richer phenomenon and potentially increasing the advantage gap instead of equalizing access [11]. These peripheral-central connections also increase the control of central nodes over others, especially the disadvantaged. On the other hand, using broadcast as the objective prioritizes connectivity for the most disadvantaged nodes. As John Stuart Mill noted, "it is hardly possible to overrate the value . . . of placing human beings in contact with persons dissimilar to themselves and with modes of thought and action unlike those with which they are familiar . . . Such communication has always been and is peculiarly in the present age, one of the primary sources of progress" [43]. Optimizing for control, on the other hand, prioritizes the brokerage ability of nodes over their access to diverse information, which could lead to polarization and centralized information distribution. We argue that increasing broadcast, which tends to also reduce the control of other nodes, is preferable since depending on powerful information-brokers reduces one's chance of unbiased access to diverse opinions.
Several other normative reasons underlie our preference for broadcast to measure structural advantage, when one considers outcomes in a network containing several (mostly-disjoint) minority groups. First, while these groups may have common interests, they will not individually have enough influence to accomplish them. Connecting disadvantaged nodes directly (instead of through a central node) will enable them to support one another and access important information, while countering the ever-increasing power of the majority. In support of this argument, we note that Kogan et al. show that geographically vulnerable (disadvantaged) users propagate more information during disasters, and are more likely to propagate tweets from other geographically vulnerable users [36]. A final argument arises from work on mitigating polarization in social networks by increasing the similarity of users' exposure to a broad diversity of news and ideas. Since minimizing diameter can speed up communication [23] and increase the uniformity of exposure times, we argue that optimizing broadcast is the natural analogue in the information access setting.
To formalize a discrete optimization problem, we must now transform our advantage measure into an objective function. Following the Rawlsian Maximin Principle that one should maximize the welfare of the worst-off person [47], we seek to maximize broadcast for the least-advantaged nodes, and formalize this as the welfare.
Definition 5. The welfare of a graph $G = (V, E)$ is
$$\mathrm{welfare}(G) = \min_{v_i \in V} \mathrm{broadcast}(v_i) = \min_{v_i, v_j \in V} p_{ij}.$$
Our central problem is to find a budgeted intervention optimizing welfare.

Heuristics
In this section, we introduce several heuristics for MaxWelfare-Augmentation which greedily select new edges using advantage-based criteria. We employ two basic strategies: connecting disadvantaged nodes to a central one, and adding a chord between two peripheral nodes. We will compare these with a baseline (rand) which chooses both ends of each new edge uniformly at random.
We begin by defining the center of the network to be the node with maximum broadcast. In our greedy algorithms, we select this node in the un-augmented network and fix it for the duration of the edge selection process. As we iteratively make interventions, it is possible that a new central node emerges (one with higher broadcast than the selected center). While we could update at every step, this incurs a high computational cost. In order to evaluate the likelihood and impact of a shifting center, we re-ran the experiments on the three smallest networks and recorded how often the maximum broadcast increased, along with the $\ell_1$ norm of the access signature difference between initial and new centers. The initial center node remained central more than 99% of the time, and the signature difference was less than 0.01 in the other 1% of cases. Based on this and the significant computational cost, we choose to fix a center node based on the initial network.
Before proceeding to the heuristics, we need two additional observations. First, computing the access distances is known to be #P-hard [16]; as such, whenever our strategies use $p_{ij}$, we rely on simulation to estimate the access distances using Reverse Influence Sampling (RIS) [6,54]. Second, greedy heuristics may select a pair of vertices to connect which already have an edge in the graph. When this happens, we select an alternative augmentation in one of two ways: (1) if the heuristic was trying to connect a node $u$ to the center, we instead connect $u$ to the node with second-highest broadcast, continuing down the broadcast order as needed until we find a non-neighbor of $u$; (2) if the heuristic was adding a chord or random edge, we "randomly replace an endpoint." We can now define our strategies for reducing the access diameter of a network.
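The two fallback rules for an already-present edge might look as follows. This is a sketch only: the adjacency structure (a dictionary of neighbor sets), the `broadcast` dictionary, and both function names are assumptions, and the second function is one possible reading of "randomly replace an endpoint" (keep one endpoint chosen at random and redraw the other among its non-neighbors).

```python
import random

def fallback_to_next_center(u, broadcast, adj):
    """Rule (1): walk down the broadcast order until a non-neighbor of u is found."""
    ranked = sorted((v for v in broadcast if v != u),
                    key=lambda v: broadcast[v], reverse=True)
    for v in ranked:
        if v not in adj[u]:
            return v
    return None  # u is already adjacent to every other node

def replace_random_endpoint(u, v, adj, rng):
    """Rule (2), as interpreted here: keep one endpoint and redraw the other."""
    keep = rng.choice([u, v])
    options = [w for w in adj if w != keep and w not in adj[keep]]
    if not options:
        return None
    return keep, rng.choice(options)

if __name__ == "__main__":
    # hypothetical 4-node example: "c" is the center and already adjacent to "a"
    adj = {"c": {"a", "b"}, "a": {"c"}, "b": {"c"}, "d": set()}
    broadcast = {"c": 0.9, "b": 0.5, "a": 0.2, "d": 0.1}
    print(fallback_to_next_center("a", broadcast, adj))
    print(replace_random_endpoint("a", "c", adj, random.Random(0)))
```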

Broadcast-based Strategies
To reduce the access diameter of the network we must affect at least one node with minimum broadcast. If $v_i, v_j$ is a pair of nodes so that $p_{ij}$ is minimum, we call them diameter-defining. Our first heuristic, bc-chord, finds a diameter-defining pair and adds the edge between them. A natural alternative strategy is to connect one or both of the pair to the center; we do this in bc-one and bc-both, respectively. Note that bc-both adds pairs of edges, and runs for only $k/2$ steps, where $k$ is the edge budget; we constrain $k$ to even values in experiments to ensure fair comparisons.
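A minimal sketch of the bc-chord idea is given below. It is not the evaluated implementation: access is estimated by sampling live-edge graphs and checking connectivity (a simple stand-in for RIS on undirected graphs), access is re-estimated before every added edge, and adjacent pairs are simply skipped rather than handled by endpoint replacement. All names and default parameters are assumptions.

```python
import random
from itertools import combinations

def _find(parent, v):
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v

def estimate_pairwise_access(nodes, edges, alpha, samples, rng):
    """Fraction of sampled live-edge graphs in which each pair is connected."""
    counts = {pair: 0 for pair in combinations(nodes, 2)}
    for _ in range(samples):
        parent = {v: v for v in nodes}
        for a, b in edges:
            if rng.random() < alpha:          # edge is live in this sample
                parent[_find(parent, a)] = _find(parent, b)
        for i, j in counts:
            if _find(parent, i) == _find(parent, j):
                counts[(i, j)] += 1
    return {pair: c / samples for pair, c in counts.items()}

def welfare(p):
    """Minimum pairwise access, i.e. the access diameter."""
    return min(p.values())

def bc_chord(nodes, edges, alpha, budget, samples=500, seed=0):
    """Greedily add `budget` chords, each joining a current lowest-access pair."""
    rng = random.Random(seed)
    edges = list(edges)
    present = {frozenset(e) for e in edges}
    for _ in range(budget):
        p = estimate_pairwise_access(nodes, edges, alpha, samples, rng)
        candidates = [pair for pair in p if frozenset(pair) not in present]
        if not candidates:
            break                              # graph is already complete
        u, v = min(candidates, key=lambda pair: p[pair])
        edges.append((u, v))
        present.add(frozenset((u, v)))
    return edges

if __name__ == "__main__":
    nodes = list(range(5))
    path = [(i, i + 1) for i in range(4)]
    print(bc_chord(nodes, path, alpha=0.5, budget=2))
```

On a path, the first chord joins the two endpoints, which is the unique diameter-defining pair; later choices depend on the sampled estimates.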
Influence-based Strategies
Another reasonable approach to improving access in the network is to equalize influence. Similar to broadcast, we connect the node with minimum influence to the center, and call this heuristic infl.
Diameter-based Strategies
Finally, we consider a measure that can be computed without simulation, the diameter of the underlying network. While the shortest-path distances and access distances may diverge, they are not independent, and creating short paths between nodes will improve their pairwise access. Similar to bc-chord, diam-chord adds an edge between a pair of nodes with maximum shortest-path distance $d(u, v)$.

Experiments
We implemented the heuristics from Section 5 in C++ and compiled with gcc 8.1.0;all experiments were run on identical hardware equipped with 40 CPUs (Intel Xeon Gold 6230 @ 2.10GHz) and 190 GB of memory, running CentOS Linux release 7.9.2009.To evaluate the effectiveness of our intervention strategies, we used a corpus of real-world networks sourced from the SNAP [39] and ICON [18] repositories, as described in Table 1.We treated all data as undirected, and used the largest connected component for each.
As briefly mentioned in Section 5, we use Reverse Influence Sampling (RIS) [6] to estimate access distances; we generate 10,000 instances per simulation. To evaluate the accuracy, we ran each estimation 10 times and measured the fluctuations in access distances. In all cases, pairwise accesses varied by less than 0.03 (3% of the range), and the average difference was at most 0.004 (0.4% of the range). The heuristics bc-chord, bc-both, bc-one, and infl use RIS, requiring ( + ) time and ( 2 + ) space.
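The estimate-and-repeat procedure can be illustrated with a simplified stand-in. The sketch below is not RIS: it uses plain live-edge Monte Carlo sampling, which is easier to read but less efficient. It assumes an undirected graph with a uniform transmission probability, where each edge is kept independently with that probability in every sample, and the access between two nodes is estimated as the fraction of samples in which they share a connected component. The function names, the `alpha` parameter name, and the toy graph are hypothetical.

```python
import random

def sample_components(n, edges, alpha, rng):
    """Keep each edge with probability alpha; return a component label per node."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        if rng.random() < alpha:
            parent[find(u)] = find(v)
    return [find(x) for x in range(n)]

def estimate_access(n, edges, alpha, samples, seed):
    """Estimate pairwise access as the frequency of two nodes being connected."""
    rng = random.Random(seed)
    counts = [[0] * n for _ in range(n)]
    for _ in range(samples):
        comp = sample_components(n, edges, alpha, rng)
        for i in range(n):
            for j in range(n):
                if comp[i] == comp[j]:
                    counts[i][j] += 1
    return [[c / samples for c in row] for row in counts]

def max_fluctuation(runs):
    """Largest spread (max minus min) of any pairwise estimate across repeated runs."""
    n = len(runs[0])
    return max(
        max(r[i][j] for r in runs) - min(r[i][j] for r in runs)
        for i in range(n) for j in range(n)
    )

# Path graph on 4 nodes (hypothetical data), re-estimated 5 times with different seeds.
edges = [(0, 1), (1, 2), (2, 3)]
runs = [estimate_access(4, edges, 0.5, 2000, seed) for seed in range(5)]
print("largest fluctuation across runs:", max_fluctuation(runs))
```

Repeating the estimation with different seeds and reporting the largest spread mirrors the accuracy check described above, at a much smaller scale.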
In each experiment, we used even numbers of interventions from 0 to 200, aiming for a practical intervention size relative to the network (less than a tenth of a percent of ||). In the Independent Cascade model, the spread of information depends on an input parameter, the probability of transmission along an edge in a time step. For each network in our corpus, we computed the distribution of access distances for varied transmission probabilities and selected four (network-specific) values: one each to represent poorly-spreading and well-spreading scenarios, and two in the critical region of moderate spread.

Summary of Experimental Results
The primary objective of this work is to intervene in a network to improve access for the most disadvantaged nodes and reduce disparities in advantage by making access signatures more similar.
To assess whether our strategies achieve these goals, we employ several methods for evaluating the outcome of interventions. First, we directly measure the improvement in the minimum values of broadcast and influence realized in the network. Next, we shift our attention to the access signatures, where we evaluate whether our interventions have increased the similarity among nodes' views of the network using Manhattan distance. Finally, we consider whether our approaches improve disparity by reducing the advantage gap between the most- and least-privileged nodes.
In Figure 1, we present a comprehensive view of all three evaluations for a single network across its four transmission probabilities. From the first row, we observe that the heuristics bc-chord, bc-both and infl are most effective at improving broadcast and influence, with the latter two performing almost identically. Further, bc-chord surpasses the other approaches as information spreads more easily. These results are qualitatively replicated by the other networks in our corpus (see Section 6.2). Given this, we restrict our attention to the bc-chord and infl approaches in subsequent figures, with infl favored over bc-both to increase the diversity among our strategies. Further, we note that the behavior with respect to the transmission probability remained consistent across all networks, and is well-represented by considering only the low-moderate-spread and well-spreading values (2nd and 4th columns). Due to space constraints, plots for the entire corpus (Figures 2 and 3) only show these two transmission probabilities.
In the second row of Figure 1, we use violin plots to show the distribution of access distances for all pairs before (init) and after (rand, infl, bc-chord) intervention. We observe that while randomized augmentation has little effect, both heuristics significantly reduce the maximum pairwise access distance, with bc-chord again out-performing infl as the transmission probability increases. While the distributions for other networks vary in initial shape, the pattern of improvement was consistent (see Section 6.3).
Finally, the third row of Figure 1 illustrates our success in increasing the uniformity among each node's view of the network as measured by reducing the maximum distance between access signatures. Results for other networks are summarized in Section 6.3.
To round out our evaluation, we also computed how our interventions affected the advantage gaps for broadcast, influence, and control, as discussed in Section 6.4.For the network featured in Figure 1, these results are in the second row of Tables 2 and 3.
One surprising result was that while the absolute broadcast gap increased, the relative one decreased. We believe this is caused by interventions increasing access by a larger additive amount for central nodes than peripheral ones. Over the entire corpus, bc-chord shrank the broadcast/influence gaps by over 85%/82%, respectively.
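A small worked example shows how the two gaps can move in opposite directions. It uses the absolute gap (max − min) and the relative gap ((max − min)/min) as computed in Section 6.4. The advantage values below are made up for illustration and are not taken from any network in the corpus; the function name is hypothetical.

```python
def advantage_gaps(values):
    """Return (absolute gap, relative gap) for a list of per-node advantage values."""
    hi, lo = max(values), min(values)
    if lo <= 0:
        raise ValueError("relative gap is undefined when the minimum is not positive")
    return hi - lo, (hi - lo) / lo

# Hypothetical values: the most central node gains 0.4, the most peripheral gains 0.2.
before = [0.5, 0.3, 0.1]
after = [0.9, 0.6, 0.3]

abs_before, rel_before = advantage_gaps(before)
abs_after, rel_after = advantage_gaps(after)
print("absolute gap:", abs_before, "->", abs_after)
print("relative gap:", rel_before, "->", rel_after)
```

Here the absolute gap grows from 0.4 to 0.6 because the central node gains more in additive terms, while the relative gap falls from 4 to 2 because the minimum has tripled.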
Overall, we observe that our interventions are most effective when the network is better-connected, whether because the transmission probability is higher or because the underlying graph is denser (e.g., in EU and Fb). Additionally, our analysis showed that bc-both and infl perform almost identically (Figure 2), suggesting that the nodes with minimum broadcast and influence may have similar access signatures. To further investigate this phenomenon, we measured the signature difference between the nodes selected by each of these heuristics at each intervention step and found them to be consistently in the bottom 10% of all pairs, with the average falling in the bottom 1%. This leads us to hypothesize that the sets of least-advantaged nodes with respect to broadcast and influence are almost identical.

Improving Minimum Broadcast / Influence
The broadcast and influence measures quantify a node's structural advantage as a function of its signature. Here we evaluate whether edge interventions can improve these measures for the most disadvantaged nodes in the network. Figure 2 plots the trajectory of the minimum broadcast and influence as the number of interventions increases, with low-moderate- and well-spreading transmission probabilities for each network in the corpus. We observe that infl and bc-both consistently show the most improvement for both advantage measures.

Making Distances & Signatures Closer
One goal of intervention is to increase access for nodes that have the lowest probability of receiving some types of information. In Figure 3, we plot the distribution of pairwise access distances before and after intervention; we again consider two transmission probabilities (low-moderate-spread and well-spreading) for each of the 6 networks in the corpus. We observe that while the median value does not move significantly, the lower tail of the distribution gets much shorter and thinner. The amount of improvement increases with the transmission probability, and is more pronounced in the denser networks (EU, Irvine, and Fb). In some cases, with only 200 interventions, we are able to increase the minimum pairwise access distance by 0.7, more than doubling the probability of information transmission!
Another of our objectives is increasing similarity among access signatures so that all nodes have a similar "view" of the network. We use the Manhattan distance (L1 metric) to measure the distance between two signatures (see footnote 1). The third row of Figure 1 shows violin plots of the distribution of these distances for Email-Arenas; those for other networks are omitted in the interest of space. The maximum signature difference was consistently reduced (at least 43% for well-spreading transmission probabilities), and while the median was relatively stable, the tail of the distributions shifted noticeably downward.
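The signature comparison can be sketched in a few lines. The sketch assumes that a node's access signature is its vector of pairwise access values to every node in the network, that is, one row of the access matrix. The function names and the toy signatures are hypothetical.

```python
def manhattan(sig_a, sig_b):
    """L1 distance between two access signatures of equal length."""
    return sum(abs(a - b) for a, b in zip(sig_a, sig_b))

def max_signature_distance(signatures):
    """Largest L1 distance over all unordered pairs of signatures."""
    n = len(signatures)
    return max(
        manhattan(signatures[i], signatures[j])
        for i in range(n) for j in range(i + 1, n)
    )

# Hypothetical signatures for three nodes (rows of a symmetric access matrix).
signatures = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.5],
    [0.25, 0.5, 1.0],
]
print(max_signature_distance(signatures))  # 1.5, between the first and last node
```

Tracking this maximum before and after intervention gives the single number summarized by the upper end of each violin plot.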

Measuring the Gap
The final central premise of this work is that improving equity requires reducing access disparities between nodes. To evaluate this, we measure the advantage gap for broadcast and influence, as well as the maximum amount of control achieved in the network (which can be viewed as a gap, since there are always nodes on the periphery with control value essentially zero).

Broadcast/Influence Gaps
We begin by calculating both the absolute (max − min) and relative ((max − min)/min) advantage gaps for broadcast and influence on each network in the corpus; Table 2 shows these for well-spreading transmission probabilities. As mentioned in Section 6.1, the absolute broadcast gap often increases with intervention, while the influence gap is typically reduced. However, the relative advantage gap behaves quite differently, consistently decreasing significantly with bc-chord, yet increasing in most cases for infl. This supports our argument that infl may contribute to a rich-get-richer phenomenon by increasing advantage for central nodes, and is an important distinction between two otherwise well-performing heuristics.
1 Using Euclidean distance (L2) results in similar trends and no qualitative differences.

Reducing Control
Finally, we consider how our interventions affect control. In Table 3, we report the maximum values of not only the primary control measure of cent * but also the finer-grained pairwise control. Here, we must restrict our analysis to the three smallest networks in our corpus due to the exceptionally high cost of computing control for all nodes (which requires removing each node from the network and re-estimating access distances); we use the same well-spreading transmission probabilities as in our gap analysis. The results are encouraging, as they show that intervention can increase the independence of nodes in the network when accessing information and prevent better-positioned nodes from having a monopoly over others. It is noteworthy that bc-chord not only uniformly achieves more than 53% reduction in pairwise control, but also never increases the control (whereas infl can cause a 10-fold jump).

Conclusion
In this work, we propose a novel method for quantifying social capital through the lens of information flow in a network when all nodes have unique, equally-important information to disseminate. We introduce three new measures of structural advantage quantified in terms of network position, argue for intervening through edge augmentation to reduce bias in network structure, and formalize the budgeted intervention problem of MaxWelfare-Augmentation for mitigating structural inequity in information access. Finally, we propose heuristic strategies that improve access for the least-advantaged nodes, reduce advantage disparities, and increase the similarity in access signatures. We perform a case study on a corpus of social networks and demonstrate that our bc-chord heuristic improves the minimum broadcast and influence, dramatically shrinks advantage gaps, and reduces variance among access signatures.
Our work is inherently limited by our use of a uniform transmission probability in the Independent Cascade model, and by ignoring the time at which information is received (as we know that early access plays an important role in social capital).Further, the quantification of control is computationally infeasible for large networks, limiting our empirical evaluation.
We leave open many directions for future work, including the adaptation of these ideas to directed networks where access and reach may differ (   ≠   ) and optimizing for one may lead to trade-offs for the other. It would also be interesting to adapt this problem to the group fairness setting by defining and optimizing advantage measures on groups. Finally, we note that our measures and strategies can be applied to any probabilistic models of information flow, and may improve many existing diameter-based approaches.

Figure 1: Results for Email-Arenas with transmission probabilities {0.2, 0.3, 0.4, 0.5} (L to R). At top, we plot improvement in minimum broadcast and influence; the violin plots show the distribution of pairwise access distances (middle) and L1 signature distances (bottom).

Figure 2: For each network, we plot the improvement in minimum broadcast and influence for low-moderate- and well-spreading transmission probabilities.

Figure 3: For each network, we plot the distribution of pairwise access distances for low-moderate- and well-spreading transmission probabilities.

Table 1: Summary of Datasets

Table 3: Maximum Control Values