Measuring and Mitigating Group Inequalities In Resource Allocation

Resource allocation, an integral part of socio-economic governance, profoundly influences individual prosperity and has the potential to mitigate or exacerbate socioeconomic disparities. This paper addresses the challenge of equitably allocating finite resources among individuals by answering two fundamental questions: (1) how to accurately measure and test group disparities and (2) how to optimally distribute resources while ensuring group fairness. We propose the Group Beneficiary Disparity (GBD) metric, an evaluation tool engineered to systematically gauge inequalities in a binary beneficiary/non-beneficiary context. The GBD provides decision-makers and planners with a powerful tool to audit social programs and optimize policies through a lens of group equality. We argue that utilitarian decision-makers cannot fully eliminate group disparities even when operating under social welfare constraints. To address this issue, we propose a new resource allocation optimization model, called A-FARM (Asymptotically Fair Allocation of Resources Model), with asymptotic group fairness guarantees. A-FARM partitions individuals into distinct, non-overlapping units and distributes resources among these units based on a utility-based allocation mechanism. Finally, we evaluate the performance of our proposed algorithm using both simulated and real-world data. Our results demonstrate that A-FARM enables decision-makers to (1) achieve maximum efficiency under group fairness constraints and (2) perform a fairness-efficiency trade-off.


INTRODUCTION
Over the past decades, closing racial gaps in wages, investment, education, and homeownership alone could have added $16 trillion to GDP [48]. Furthermore, Peterson and Mann [48] showed that eliminating the racial gap in access to higher education could have added up to $113 billion in income for saving, investing, and consumption. Economists have repeatedly shown that narrowing group disparities in access to such core services not only helps individuals from underserved communities flourish but also generates a net benefit to society as a whole by stimulating economic growth [8]. Additionally, group parity is a desired outcome, or even required by law, in many social settings. However, traditional social welfare functions used for policy optimization and planning have limitations in capturing and addressing these disparities, as they often do not explicitly account for protected attributes such as race, gender, and age, among others [5, 60].
Recent breakthroughs in artificial intelligence (AI) have made it possible to use computational methods for policy optimization and resource allocation that account for multiple objectives and complex constraints. As a result, these models are increasingly being utilized for critical social and infrastructure planning and resource allocation, such as urban planning [41], infrastructure management [2], and critical care management [20]. The main objectives of these models are often the optimization of utility (the greatest good for the greatest number) or efficiency (maximizing the overall benefit). However, without equity constraints, these models are likely to have profound ethical implications [52], as evidenced by real-world cases [7, 9]. Therefore, there is an urgent need for computational methods that enable hypothesis testing, quantifying, and mitigating group biases [17].
These observations raise a fundamental question that motivates this work: how can we use computational methods to hypothesis-test and quantify group inequalities? In response, we draw on the statistics and hypothesis testing literature to introduce the Group Beneficiary Disparity (GBD). GBD is a measure of group inequality with applications in evaluating and hypothesis-testing the outcomes of resource allocation systems (e.g., health care), urban planning (e.g., homeownership), and priority setting (e.g., college admission) tasks. It is worth noting that making a social argument to change the status quo of a utilitarian decision-maker requires solid empirical evidence to justify the allocation of public resources in interventions that may redress the perceived disparities. We, therefore, take the stand that the burden of proof is on system designers to demonstrate that a situation is unfair. To this end, the null hypothesis is that groups with similar characteristics have equal chances of access. GBD equips end-users to perform such hypothesis testing.
Fairness in resource allocation has typically been addressed using social welfare constraints. Social welfare functions, including proportional fairness, envy-free fairness, and max-min fairness [50, 54, 58], have their roots in the economics literature and aim to distribute resources equitably among different groups [60]. In resource allocation mechanisms, social welfare functions often attempt to account for fairness by constraining the space of possible solutions for a utilitarian decision-maker [5]. However, since they often ignore group attributes, they are not guaranteed to equalize access among different segments of society; therefore, group inequalities may be perpetuated or exacerbated rather than alleviated.
Our Contributions. The contributions of this work are threefold. (i) We introduce a measure of group inequality, dubbed Group Beneficiary Disparity, and a hypothesis testing procedure (Theorem 3.3). We demonstrate the superiority of GBD over potential alternatives, making it a robust and reliable tool for evaluating and testing the outcomes of resource allocation systems, urban planning, and priority setting tasks. (ii) We provide a comprehensive study of the conditions under which a utilitarian decision-maker can effectively address group disparities (Theorem 5.6). Our findings highlight the limitations of a utilitarian decision-maker and illustrate the need for the explicit treatment of group inequalities in social welfare formulations in most real-world applications. (iii) Lastly, to address the challenge of balancing system-level utility with group inequality, we introduce A-FARM, the Asymptotically Fair Allocation of Resources Model. This model provides an optimization framework that maximizes efficiency while simultaneously reducing group inequality, enabling decision-makers to seamlessly control the trade-off between fairness and efficiency. Our contributions pave the way for the development of more equitable and efficient resource allocation mechanisms that consider the needs of all groups in society.

RELATED WORK
The main objectives of utilitarian decision-makers are to maximize utility and efficiency. Utilitarian decision-makers are mainstream in real-world applications, including revenue allocation for public transport, medication allocation in healthcare, and bandwidth allocation in wireless networks [38, 42, 56]. However, a key limitation of utilitarian decision-makers is the fact that they neglect any notions of fairness or group inequality that exist in many social and urban settings [4]. As suggested by real-world cases [7, 9], without fairness considerations, these models are likely to compromise social values and unfairly prioritize a subset of stakeholders at the cost of others [52]. The need to mitigate these negative consequences motivated a proliferation of work in the fair allocation literature.
The literature on algorithmic fairness is broadly split into two branches: "outcome fairness" and "allocation fairness." The AI literature has primarily focused on measuring and guaranteeing outcome fairness [11, 46], while allocation fairness mainly originated from the economics literature [60]. This work contributes to the latter. In fair resource allocation, there is no unique, broadly accepted definition of fairness that can be reasonably applied to all settings [12]. Fairness is often defined with respect to the distribution of benefits (social welfare functions) or with respect to disparities in access across socio-demographic groups (e.g., structural inequality).
In conjunction with efficiency, social welfare functions have been widely used to evaluate the fairness of a social program. One common social welfare function is proportional fairness. A proportionally fair allocation is one in which each unit or agent is allocated a proportionally fair share according to his or her own utility function [54]. A similar but stronger allocation is envy-free allocation. An allocation is envy-free if every individual likes his or her bundle of goods at least as much as the bundle of any other agent [57, 58]. Max-min fairness is another well-studied notion of fairness as a social welfare function. Max-min fairness maximizes the minimum allocation received by a unit in the system [21, 50]. These social welfare constraints, in their original form, assume all individuals are identical except in their benefit and do not necessarily distinguish between groups as defined by their group attributes. In the context of resource allocation, a decision-maker most often optimizes for efficiency and utility subject to social welfare constraints and ignores the joint distribution of utility and the group attributes [e.g., 1, 30, 39, 49]. As a result, social welfare functions fail to account for group inequalities.
There are extensions of social welfare functions that account for group benefit as opposed to individual benefit [see 43, for a review of group-based equity measures]. These measures require an estimation of the benefit, or effect, of a decision on each group. Additionally, generalization to continuous attributes, or to settings where the number of groups is large, is challenging [3, 26]. Attempts at fair resource allocation have been made by grouping various individuals together, however, not using group attributes [43, 53]. Unanimous fairness incorporates a notion of a group into an allocation problem [53]: all agents in each group must agree that their group share is fair, which suits small groups. Its more flexible version, democratic fairness, aims instead to satisfy a certain fraction of the agents in each group and is more suited to large groups [53]. In the above settings, voting within a group is assumed to be possible. These notions of fairness are not generalizable to the group inequalities considered in this work.
Kleinberg et al. [35] is the closest work to ours: it employs group attributes to group individuals before performing decision-making in a resource allocation setting. By considering a setup in which the population is split into minority and majority subsets, a utility-based algorithm is able to rank each subset independently while satisfying equity constraints. However, this work only considers group attributes on a binary scale. In realistic settings, attributes such as age or income level are often continuous. Thus, group disparity should be measurable on a continuous scale as well.

MEASURING AND HYPOTHESIS TESTING GROUP FAIRNESS
We consider a dynamical system in which the status of both beneficiary and non-beneficiary stakeholders can vary over time due to resource allocation and random factors beyond the control of the decision-maker. This binary setting finds its applications in a variety of fields, such as college admission, urban planning, and healthcare. For instance, a student may be admitted or not, a neighborhood may fall in a hospital desert or not depending on city planning policies [32], and an individual may have health insurance or not depending on the politics of the place they live in [51]. In this section, we explore ways to measure group disparity and conduct hypothesis testing.
Notation and Problem Setup. Let us define a system as a collection of n agents. An agent is represented by a tuple of three elements, {x, u, S}: (1) x ∈ X is a d-dimensional vector that describes the group attributes, (2) u ∈ R>0 is a utility value, and (3) S is a beneficiary state that can be either 0 or 1. The state of an agent specifies whether or not they have benefited from the system. A system is entirely characterized by the joint distribution P(x, u, S). The variable x specifies the group to which an individual belongs; hence, group disparity is defined with respect to x. The utility u is a context-dependent measure that quantifies the benefit an agent receives when S = 1. A beneficiary agent is an agent in the desired state, S = 1. It is important to distinguish between an individual's utility and their beneficiary state: while the utility specifies the potential benefit an agent may receive if they are a beneficiary of a program, the beneficiary state indicates whether they are a beneficiary (S = 1) or not (S = 0). For example, an uninsured individual may experience an improvement in life expectancy if they become insured, but this benefit is unrealized if they are a non-beneficiary.
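To make the setup concrete, the agent tuple {x, u, S} can be sketched as follows (the `Agent` class and the sampling distributions are illustrative assumptions, not part of the model):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Agent:
    x: np.ndarray  # d-dimensional group-attribute vector
    u: float       # utility realized if the agent becomes a beneficiary (S = 1)
    S: int         # beneficiary state: 1 if beneficiary, 0 otherwise

# A system is a collection of n agents; its behavior is characterized
# by the joint distribution P(x, u, S).
rng = np.random.default_rng(0)
system = [Agent(x=rng.normal(size=2), u=rng.uniform(0.1, 1.0), S=int(rng.integers(0, 2)))
          for _ in range(100)]
```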

Definitions
We introduce the concept of "group fair state," which serves as the foundation for our work. Group fairness is achieved when an agent's group attributes are statistically independent of their beneficiary state. We propose the term "group fair state" to operationalize the evaluation of group disparity.

Definition 3.1 (Group Fair State). Suppose P(x | S = 1) and P(x | S = 0) are the conditional distributions of group attributes x. A system is in a group fair state with respect to vector x if and only if P(x | S = 1) = P(x | S = 0).

When a system is in a group fair state with respect to x, the agent's group attributes and beneficiary state are independent, i.e., P(x, S) = P(x)P(S). This statistical state applies system-wide and ensures that the vector x is independent of an agent's current state. For example, in the health insurance example, a group fair state for the vector x = {age, income} implies that both the insured and uninsured have similar joint distributions over age and income. However, as we will show in Section 5.3, these fairness criteria cannot always be met without violating one or more social welfare counterparts.
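As an illustrative sanity check (the distributions below are hypothetical), drawing x and S independently produces a group fair state, so the empirical conditionals P(x | S = 1) and P(x | S = 0) agree up to sampling noise:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Group fair state by construction: x and S drawn independently, so P(x, S) = P(x)P(S).
x = rng.normal(loc=40.0, scale=12.0, size=n)   # e.g., age
S = rng.binomial(1, 0.6, size=n)               # beneficiary state

# In a group fair state the conditionals P(x | S = 1) and P(x | S = 0) coincide,
# so their empirical moments should agree up to sampling noise.
mean_gap = abs(x[S == 1].mean() - x[S == 0].mean())
print(round(mean_gap, 2))
```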
While this definition is mathematically similar to demographic parity in the outcome prediction literature, where a binary decision outcome is independent of the group attributes [28, 62], our application is allocation fairness: the agent's status is known rather than predicted, in contrast to the outcome prediction setting, where the goal is to estimate the outcome.

Definition 3.2 (Biased System). A biased system is a system that is not in a group fair state.

Definition 3.1 is based on the principle of equal access, which examines the relationship between group attributes and beneficiary state independent of the utility variable u. Demographic parity, in the outcome prediction literature, is closely related to this definition [28, 62]. However, applications of measures of demographic parity are narrower for the following reasons: (i) while not impossible [31], generalization to continuous variables is challenging [44]; (ii) its existing estimators are biased; and (iii) the hypothesis testing procedures proposed for outcome prediction [17] do not apply here. GBD extends demographic parity to resource allocation settings and broadens its applications.

Group Beneficiary Disparity
The concept of group fairness described in Definition 3.1 provides an intuitive framework under which we can construct a measure of Group Beneficiary Disparity (GBD). Specifically, the degree to which a social system is unfair may be quantified by the distance between the conditional distributions P(x | S = 0) and P(x | S = 1). A zero distance between the two distributions implies independence between x and the agent's state, indicating a group fair state. On the other hand, a non-zero distance indicates a disparity between the beneficiary and non-beneficiary groups. Hence, we define GBD as the distance between P(x | S = 0) and P(x | S = 1).
For convenience, we denote the non-overlapping sets of non-beneficiary and beneficiary agents with Y = {x_i ∈ X | S_i = 0} and Z = {x_i ∈ X | S_i = 1}, respectively; y and z denote elements from sets Y and Z. Deza and Deza [16] compiled a comprehensive list of distance measures, including distances between probability measures. We want a measure that can handle multivariate data vectors, is computationally efficient, and is equipped with theoretical guarantees for hypothesis testing. We employ a class of statistics known as the Maximum Mean Discrepancy (MMD), a class of integral probability metrics [47], defined as

MMD(F, P_y, P_z) = sup_{f ∈ F} ( E_{y∼P_y}[f(y)] − E_{z∼P_z}[f(z)] ).

We want to estimate MMD(F, P_y, P_z) where P_y and P_z are unknown and we have only samples from them. As a result, we employ a kernel-based method to compute MMD [27]. Let k(·, ·) be a kernel function in a universal reproducing kernel Hilbert space (RKHS) H. The universality condition requires that k(·, ·) be continuous and H be dense in the space of bounded continuous functions on X with respect to the L∞ norm. Squared MMD in this embedding space is

MMD²(k, P_y, P_z) = E_{y,y′∼P_y}[k(y, y′)] − 2 E_{y∼P_y, z∼P_z}[k(y, z)] + E_{z,z′∼P_z}[k(z, z′)],

where y′ and z′ are independent copies of y and z, respectively [23, 24]. MMD²(k, P_y, P_z) measures the distance between the Hilbert space embeddings of P_y and P_z [see Lemma 4 of 24]. Zawadzki and Lahaie [61] noticed that squared MMD is the divergence associated with the kernel score evaluated at points ω ∈ X; subject to the condition E_{x,x′∼P}[k(x, x′)] < ∞, this scoring rule is strictly proper [22]. Since we only have samples from each distribution, we need to estimate MMD²(·) in a finite-sample setting. Utilizing U-statistics, an unbiased estimator of MMD²(·) is

MMD²_u(k, Y, Z) = 1/(o(o−1)) Σ_{i≠j} k(y_i, y_j) − 2/(mo) Σ_{i,j} k(y_i, z_j) + 1/(m(m−1)) Σ_{i≠j} k(z_i, z_j),   (1)

where m is the number of agents with S = 1 and o = n − m is the number of agents with S = 0. This estimator has a computational complexity of O(m²), assuming m > o.
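A minimal numpy sketch of the unbiased U-statistic estimator of MMD² with a Gaussian RBF kernel (the function names and the fixed bandwidth sigma = 1 are illustrative choices):

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Gaussian RBF kernel matrix: k(a, b) = exp(-||a - b||^2 / (2 sigma^2)).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2_unbiased(Z, Y, sigma=1.0):
    """U-statistic estimator of MMD^2 between samples Z (S=1) and Y (S=0)."""
    m, o = len(Z), len(Y)
    Kzz = rbf_kernel(Z, Z, sigma)
    Kyy = rbf_kernel(Y, Y, sigma)
    Kzy = rbf_kernel(Z, Y, sigma)
    # Drop the diagonal terms so the within-sample averages are unbiased.
    term_zz = (Kzz.sum() - np.trace(Kzz)) / (m * (m - 1))
    term_yy = (Kyy.sum() - np.trace(Kyy)) / (o * (o - 1))
    return term_zz + term_yy - 2 * Kzy.mean()

rng = np.random.default_rng(2)
same = mmd2_unbiased(rng.normal(size=(300, 2)), rng.normal(size=(300, 2)))
diff = mmd2_unbiased(rng.normal(size=(300, 2)), rng.normal(1.5, 1, size=(300, 2)))
print(same, diff)  # identical distributions yield an estimate near zero
```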

Hypothesis Testing
Making a social argument to change the status quo requires solid empirical evidence. It is natural for a decision-maker to presume that a system is fair (see Theorem 5.3) unless data suggest otherwise. Hence, we take the stand that the burden of proof is on the decision-maker to demonstrate that a situation is unfair. We hypothesize that a system of interest has benefited the underlying population irrespective of group membership; hence, the null hypothesis is P_y = P_z. Given observed samples from P_y and P_z, a two-sample test determines whether to reject the null hypothesis. MMD is a convenient statistic with strong theoretical guarantees for performing such hypothesis testing [23].

Theorem 3.3. Let F be the class of unit-ball functions in a universal RKHS associated with a continuous kernel k(·, ·). Then MMD²(k, P_y, P_z) = 0 if and only if the system is in a group fair state.
The proof of this theorem mirrors that of Theorem 5 in Gretton et al. [24]. Following Definition 3.1, a system is in a group fair state when P_y = P_z, and Theorem 5 in Gretton et al. [24] proves that MMD²(k, P_y, P_z) = 0 if and only if P_y = P_z. Following Theorem 3.3, the null hypothesis is that MMD²(k, P_y, P_z) = 0. This theorem lends theoretical support for performing the above hypothesis testing. The reader may consult [24] for convergence bounds and other theoretical guarantees associated with the kernel embedding representation of MMD.
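One common way to calibrate such a two-sample test in finite samples is a permutation procedure: under the null P_y = P_z, beneficiary labels are exchangeable, so re-labeling the pooled sample traces out the null distribution of the statistic. The sketch below is illustrative only (the paper's exact testing procedure and kernel settings may differ; see [23, 24] for asymptotic alternatives):

```python
import numpy as np

def rbf(A, B, sigma=1.0):
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2(z, y, sigma=1.0):
    # Unbiased U-statistic estimate of MMD^2 for one-dimensional samples.
    m, o = len(z), len(y)
    Kzz, Kyy, Kzy = rbf(z, z, sigma), rbf(y, y, sigma), rbf(z, y, sigma)
    return ((Kzz.sum() - np.trace(Kzz)) / (m * (m - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (o * (o - 1))
            - 2 * Kzy.mean())

def permutation_pvalue(z, y, n_perm=500, seed=0):
    """Two-sample permutation test of H0: P_y = P_z using the MMD^2 statistic."""
    rng = np.random.default_rng(seed)
    observed = mmd2(z, y)
    pooled, m = np.concatenate([z, y]), len(z)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                       # exchange labels under H0
        count += mmd2(pooled[:m], pooled[m:]) >= observed
    return (count + 1) / (n_perm + 1)

rng = np.random.default_rng(3)
p_fair = permutation_pvalue(rng.normal(size=200), rng.normal(size=200))
p_biased = permutation_pvalue(rng.normal(size=200), rng.normal(1.0, 1, size=200))
print(round(p_fair, 3), round(p_biased, 3))
```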
Why MMD? In principle, we could use two-sample test statistics other than MMD (e.g., the Hellinger distance [37], the Wasserstein distance [6], the Kolmogorov-Smirnov statistic [K-S, 45], or the Kullback-Leibler divergence [K-L, 59]). The key advantages of MMD are that it generalizes to multi-dimensional settings, its evaluation is computationally fast compared with the Wasserstein distance and the K-L divergence, and it is distribution-free, unlike a t-test. We additionally demonstrate that MMD's statistical power in discriminating between two non-identical distributions is superior to these alternatives.
Hyper-parameter setup. In the present study, we utilize Gaussian Radial Basis Function (RBF) kernels, a category of universal kernels [55]. These kernels come with a single free hyper-parameter (the kernel bandwidth), whose choice affects the Type II error rate. Various algorithms have been proposed to minimize the Type II error [25].
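A widely used bandwidth choice, shown here only as an illustration of one option among the algorithms in [25], is the median heuristic, which sets sigma to the median pairwise distance of the pooled sample:

```python
import numpy as np

def median_heuristic(X):
    """Set the RBF bandwidth to the median pairwise distance of the pooled sample."""
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    return np.median(d[np.triu_indices_from(d, k=1)])  # off-diagonal pairs only

rng = np.random.default_rng(4)
pooled = rng.normal(size=(500, 2))   # pooled beneficiary + non-beneficiary sample
sigma = median_heuristic(pooled)
print(round(sigma, 2))
```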
Experiment. The conditional mean and variance of two groups are commonly used as measures of vertical equity [36]. However, these summary statistics have limitations in capturing nuanced differences between two distributions. Consider a social setting in which the means and variances of the group attribute x, e.g., age, for the beneficiary and non-beneficiary groups are matched; however, the beneficiary group has an extended tail that follows a log-normal distribution, while the distribution for the non-beneficiary group is more concentrated and follows a normal distribution. We ask which test statistic can better discriminate between the two distributions. Tests focused on the mean and variance, such as the t-test, are unable to distinguish between the two groups. Following Farahi and Chen [19], we compare the performance of the K-S, K-L, and MMD tests in Figure 1. We draw m samples from each distribution and compute the average p-value over 1,000 data realizations. To estimate the K-L divergence, we employ the kNN estimator of Wang et al. [59]. The MMD test (green line) achieves the highest rejection rate, and the K-L test (blue dotted-dashed line) has the poorest performance. This experiment illustrates that MMD can achieve good performance even when the mean and variance of two non-identical distributions are the same.
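The moment-matched construction behind this experiment can be sketched as follows (the parameter values are illustrative): the two groups share their first two moments, so mean/variance-based tests are blind to the difference, yet the distributions differ in their tails.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000

# Log-normal beneficiary group with parameters (mu, s).
mu, s = 0.0, 0.5
z = rng.lognormal(mu, s, size=n)

# Normal non-beneficiary group matched to the log-normal's mean and variance.
mean = np.exp(mu + s ** 2 / 2)
var = (np.exp(s ** 2) - 1) * np.exp(2 * mu + s ** 2)
y = rng.normal(mean, np.sqrt(var), size=n)

def skew(a):
    return ((a - a.mean()) ** 3).mean() / a.std() ** 3

# First two moments agree, so a t-test cannot separate the groups...
print(round(z.mean() - y.mean(), 2), round(z.var() - y.var(), 2))
# ...but higher moments differ: the log-normal has a heavy right tail.
print(round(skew(z), 2), round(skew(y), 2))
```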

CASE STUDIES
We illustrate two applications of our proposed hypothesis testing method by applying it to two real-world scenarios: examining age beneficiary disparity in access to health care insurance and income beneficiary disparity in homeownership. By employing Equation (1) to perform hypothesis testing and quantify disparity (Theorem 3.3), we investigate how disparity varies across different regions.

Fig. 1. The statistical power of the MMD, K-S, and K-L tests in distinguishing between a log-normal and a normal distribution with the same mean and variance. The mean p-value (left panel) and rejection rate (right panel) vs. sample size. MMD rejects the null with a rate of 95% with a sample size of 1,000 data points, while K-S requires a sample size of ∼2,000 points to achieve the same performance.
Another application is to evaluate the performance of a decision-making algorithm, presented in the next section.
Data Sources. We sourced our healthcare coverage data from the 2017 and 2018 Annual Social and Economic Supplement (ASEC) Split-Panel Sample of the Current Population Survey (CPS), which was collected by the U.S. Census Bureau in March of 2017 and 2018 [13, 14]. For our homeownership data, we utilized the 2019 American Housing Survey [AHS, 15]. Both data sets contain household and family characteristics with no personally identifiable information. A detailed description of the data is provided in the Supplementary Material.
Findings. The top panel in Figure 2 shows the age disparity between those covered by health insurance and those without coverage during the years 2017 and 2018. The U.S. states are ranked in ascending order by their median group beneficiary disparity in 2018. Our results reveal that age disparity in health care coverage is prevalent in the United States: we reject the null hypothesis for 17 out of 23 states with high statistical significance (p-value < 0.05) in both 2017 and 2018. Interestingly, age disparity improved in Idaho, New York, and Alaska from 2017 to 2018, while it worsened in Washington and North Carolina. Future research could investigate the potential causal relationship between changes in age disparity and state-level policies.
In our analysis of the homeownership data, we found that the income disparity between homeowners and renters is present in all cities examined, as we reject the null hypothesis for all of them. Boston and New York exhibit the greatest disparity, whereas Phoenix and Riverside exhibit the least. By ranking metropolitan and rural areas in ascending order of their median group beneficiary disparity, we can gain insight into how income disparity varies across different regions.
Overall, our findings highlight the importance of utilizing rigorous statistical methods to investigate disparities in real-world scenarios. The flexibility and robustness of our proposed hypothesis testing method allow for its application to a wide range of scenarios, making it a valuable tool for decision-makers seeking to quantify disparities and develop evidence-based policies.

A UTILITARIAN SYSTEM AND GROUP FAIRNESS
When equity considerations become important, decision-makers have to juggle who gets how much of the resources, a problem that often hinges on the efficiency-fairness trade-off [5, 29, 33]. Is there any system that does not require a trade-off? If a system asymptotically, and without extra constraints or trade-offs, converges to a group fair solution, then explicit treatment of group fairness in the computational algorithms is unnecessary. But if no such guarantee exists, then the decision-maker may want to consider group fairness as an extra cost or constraint in the optimization model [18, 34]. In Section 5.1, we establish conditions under which a system is guaranteed to converge to a group fair solution asymptotically. We then, in Section 5.3, ask whether, if these conditions are not satisfied, welfare constraints can alleviate group disparity. Since the answer is no, in Section 5.4 we propose a new optimization algorithm that maximizes efficiency while guaranteeing group fairness constraints.

Fig. 2. Evaluation of disparity across age (top) and income (bottom) groups in health insurance coverage and homeownership, respectively. The states/cities are ranked in ascending order of their median age and income beneficiary disparity.
Notations and Problem Setup. We consider a dynamical system where S_t, the state of an agent, changes over time while x and u are time-independent. To model the time evolution of the system, we treat time as a discrete quantity, and the subscript t specifies that a variable changes with time. Each beneficiary agent degrades with probability c(t, u) ≡ P(S_{t+1} = 0 | S_t = 1, u), its state decaying to S = 0 (e.g., a road's driving condition degrades to unsafe, or an insured person loses coverage and becomes uninsured). Simultaneously, the state of a fraction of non-beneficiary agents is upgraded with probability r(t, x, u) ≡ P(S_{t+1} = 1 | S_t = 0, x, u) (e.g., the unsafe road's surface is paved, or an uninsured person receives a public health insurance option). We refer to c(t, u) and r(t, x, u) as the decay and decision functions, respectively. r(t) denotes the fraction of agents with S_t = 0 that upgrade to S_{t+1} = 1, and c(t) the fraction of agents with S_t = 1 that degrade to S_{t+1} = 0. Collectively, r(t) and c(t) quantify the rate of change at time t. They can be computed as follows: r(t) = Σ_{u,x} r(t, x, u) P(x, u | S_t = 0) and c(t) = Σ_u c(t, u) P(u | S_t = 1). Often, in practice, the value of r(t) is influenced by exogenous constraints (e.g., budget), and the decision-maker then prioritizes the non-beneficiary agents who will transition from the undesired to the desired state. The decay rate is often not in the decision-maker's control and occurs at random (e.g., a city's roads gradually degrade due to weather conditions). Let r(t) < 1 and c(t) < 1, implying that not all states can be upgraded or decay in one time step.
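These dynamics amount to a two-state Markov chain per agent. A minimal simulation with constant, x-independent rates (an illustrative simplification of the general setup) shows the beneficiary share settling near r / (r + c):

```python
import numpy as np

rng = np.random.default_rng(6)
n, T = 50_000, 200
r, c = 0.10, 0.05           # constant upgrade and decay rates (r + c <= 1)

S = np.zeros(n, dtype=int)  # all agents start as non-beneficiaries
for _ in range(T):
    up = (S == 0) & (rng.random(n) < r)    # fraction r of S=0 agents upgrade
    down = (S == 1) & (rng.random(n) < c)  # fraction c of S=1 agents decay
    S[up], S[down] = 1, 0

# The beneficiary share approaches r / (r + c) = 2/3 at equilibrium.
print(round(S.mean(), 2))
```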
We assume that the agents are distributed over M units. The i-th unit consists of n_i agents. Let j be an index over agents in a unit such that j ∈ {1, ..., n_i}. Then, for the j-th agent in unit i, there exists a state s_ij ∈ {0, 1} and a spot utility u_ij. Denote the maximum utility of the i-th unit as u_{m,i} = Σ_j u_ij and its realized utility as u_i = Σ_j s_ij u_ij. The decision-maker only picks agents in an undesired state to upgrade. For notational convenience, for each agent with s_ij = 0 we introduce an optimization variable r_ij ∈ {0, 1}: r_ij = 1 when agent j from unit i is selected for an upgrade, and zero otherwise. If an upgrade occurs, s_ij updates to one in the next time step. We use these notations when we introduce the A-FARM algorithm (Section 5.4).

Group Fairness and Equilibrium State
Next, we introduce several useful notions of equilibrium.

Definition 5.1 (Equilibrium State).
An equilibrium state is a statistical state in which the conditional distribution of x on S does not vary with time, P(x | S_t) = P(x | S_{t+1}).

If the decision and decay functions are independent of x, then Definition 3.1 and Definition 5.1 are equivalent. This is implied by Lemma B.2 in the Appendices. This observation is useful when we explore the properties of different baseline decision-makers in Section 5.2.

Theorem 5.2. Suppose a system is in an arbitrary state at time t_0. This system asymptotically converges to a group fair state in the limit t → ∞ if the decision and decay functions are independent of x at all times.
This theorem indicates that a decision function, such as one that makes random decisions irrespective of the value of x, will invariably steer the system toward a group fair state.

Theorem 5.3 (Maximal Equilibrium State).
A maximal equilibrium state is a state in which the joint distribution of {x, S_t} does not change over time, i.e., P(x, S_t) = P(x, S_{t+1}). Suppose the decay and decision functions are independent of x; then the maximal equilibrium state has a unique solution. The solution is a group fair state where P(S_t = 1) / P(S_t = 0) = r(t) / c(t).

Proposition 5.4. If r(t) and c(t) do not vary over time and r + c ≤ 1, then the maximal equilibrium state is stationary and is an attractor.

Theorem 5.2 and Theorem 5.3 suggest that when the decision and decay functions are independent of the group attributes, the maximal equilibrium state is the natural state of the system. Proposition 5.4 implies that the maximal equilibrium state is an attracting fixed point. A system with a random initial condition asymptotically converges to the maximal equilibrium state unless there are systemic biases that repel the current state away from it. In many settings, the decision function legally cannot be a function of group attributes; for instance, Title VII of the Civil Rights Act of 1964 prohibits employment discrimination based on race, color, religion, sex, and national origin. Thus, detecting variation in the unemployment rate across different socio-economic groups with identical education levels can indicate a violation of equal employment opportunity. This observation has another profound implication: the fact that modern systems are not in the maximal equilibrium state suggests that group inequalities have historically favored or disfavored specific groups.
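Theorem 5.2 and Proposition 5.4 can be illustrated with a small simulation (all parameters below are hypothetical): even from a heavily biased initial condition, x-independent decision and decay functions pull the system toward a group fair state.

```python
import numpy as np

rng = np.random.default_rng(7)
n, T, r, c = 40_000, 300, 0.10, 0.05
group = rng.integers(0, 2, size=n)          # binary group attribute x

# Historically biased start: group 1 is heavily over-represented among beneficiaries.
S = np.where(group == 1, rng.random(n) < 0.9, rng.random(n) < 0.1).astype(int)
gap0 = S[group == 1].mean() - S[group == 0].mean()

for _ in range(T):
    up = (S == 0) & (rng.random(n) < r)     # decision function independent of x
    down = (S == 1) & (rng.random(n) < c)   # decay function independent of x
    S[up], S[down] = 1, 0

# Initial disparity vs. residual disparity after convergence to the attractor.
gap = S[group == 1].mean() - S[group == 0].mean()
print(round(gap0, 2), round(gap, 3))
```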

Decision-Maker Baselines
Next, we study the characteristics of three baseline decision functions -random, equalizer, and utilitarian.
Random Algorithm. A random decision function picks a random subset of non-beneficiary agents and upgrades them. Theorem 5.2 guarantees that under this decision algorithm, the system converges to a group fair state. However, one might justifiably ask whether there exists a more efficient algorithm that accelerates the convergence to a group fair state. While we do not intend to search for an optimal decision-maker, we propose a greedy-based algorithm that arguably boosts the rate of changes in favor of a group fair state.
Group Equalizer Algorithm. The group equalizer, or simply equalizer, is an algorithm that selects agents at random but prioritizes underserved groups by downsampling the overserved groups (see Appendix D for details). We prove that for a system that is not in a group fair state, an underserved group exists; hence, this algorithm outperforms the random algorithm.
Theorem 5.5.Let a system be biased at time t.There must exist an identifiable group with S t = 0 that is over-(under-)represented with respect to the similar groups with Utilitarian Decision Algorithm.So far, we have not considered the role of utility in decisionmaking.Most often, in real-world applications, not all decisions are considered equal in their utility.A random or equalizer decision function, while eliminating group inequalities, can lead to significant suppression of efficiency.In a utilitarian system, one seeks to maximize the sum of utilities [5].The traditional planning, machine learning, and decision theory literature typically seek to maximize the system-level expected utility, potentially subject to social welfare functions.The following Theorem identifies conditions under which it is guaranteed a utilitarian system converges to a group fair state.Theorem 5.6.Suppose a decision function that only uses an agent's utility u to pick upgrade candidates.If u is independent of protected attributes x, this system converges to a group fair state.
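The three baseline decision functions can be sketched as follows. This is our illustration: the equalizer here uses a simplified round-robin stand-in for the down-sampling procedure of the paper's Appendix D, which we do not reproduce.

```python
import random

def pick_upgrades(agents, budget, policy, rng):
    """Select `budget` non-beneficiary agents to upgrade.

    agents: list of dicts with keys 's' (state), 'u' (utility), 'g' (group).
    policy: 'random', 'equalizer', or 'utilitarian'. Assumes budget <= pool size.
    """
    pool = [a for a in agents if a["s"] == 0]
    if policy == "random":
        return rng.sample(pool, budget)
    if policy == "utilitarian":
        # Highest-utility candidates first, ignoring group membership.
        return sorted(pool, key=lambda a: a["u"], reverse=True)[:budget]
    if policy == "equalizer":
        # Round-robin over groups, least-served group first (a sketch of
        # down-sampling the over-served groups in favor of under-served ones).
        groups = {g: [a for a in pool if a["g"] == g] for g in {a["g"] for a in pool}}
        served = {g: sum(a["s"] for a in agents if a["g"] == g) for g in groups}
        order = sorted(groups, key=lambda g: served[g])
        picks, i = [], 0
        while len(picks) < budget and any(groups.values()):
            g = order[i % len(order)]
            if groups[g]:
                picks.append(groups[g].pop(rng.randrange(len(groups[g]))))
            i += 1
        return picks
    raise ValueError(policy)
```

Under Theorem 5.6, the utilitarian branch only eliminates group disparities when the utilities `u` are independent of the group labels `g`.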

Limitations of Welfare Constraints
In many practical applications, u and x are correlated, or the decay and upgrade functions are not independent of x. The strong conditions of Theorem 5.6 thus suggest that a utilitarian system cannot guarantee convergence to a group fair solution in such applications. We then ask whether imposing social welfare constraints can alleviate group disparities, i.e., whether a utilitarian system under social welfare constraints asymptotically converges to a group fair solution. First, the existence of a solution under social welfare constraints is not guaranteed in many applications [57]. Second, we show that even if a solution exists, it can differ from a group fair solution.
Example 1 (Proportional Fairness). Suppose a system with two groups, A and B, each with n agents. Every agent of group A has utility 0.5. In group B, half of the agents have utility 0.9 and the rest 0.1. Initially, all agents are in the non-beneficiary state, and only 10% of the population can be upgraded. Under a group fairness constraint, the decision-maker must pick 10% of group A and 10% of group B. After the upgrade, the realized utility of each group sums to 0.05n and 0.09n, respectively. This violates proportional fairness.
Example 2 (Rawlsian Min-Max Fairness). Suppose a system with two groups, A and B, each with n agents. The agents of groups A and B have utilities 0.1 and 0.9, respectively. Initially, all agents are in the non-beneficiary state, and the decision-maker can upgrade only 10% of the population. To satisfy group fairness, the decision-maker must pick 10% from group A and 10% from group B, and the realized utility of the system after the upgrade sums to 0.1n. A utilitarian decision-maker under a Rawlsian min-max fairness constraint instead picks 18% of group A and 2% of group B, so that the realized utility of each group is 0.018n and the system total after the upgrade becomes 0.036n. Hence, the solution under Rawlsian min-max fairness differs from a solution under the group fairness constraint.
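The arithmetic of both examples can be verified directly; the script below is our illustration of the allocations described above, with n an arbitrary group size.

```python
n = 1000  # agents per group (arbitrary illustrative size)

# Example 1: group A utilities are all 0.5; group B is half 0.9, half 0.1.
# Group-fair split upgrades 10% of each group; within a group, a utilitarian
# picks its highest-utility agents first.
ex1_A = 0.10 * n * 0.5          # realized utility of group A  -> 0.05n
ex1_B = 0.10 * n * 0.9          # top 10% of B all have u = 0.9 -> 0.09n

# Example 2: group A utilities 0.1, group B utilities 0.9;
# budget = 10% of the 2n agents, i.e. 0.2n upgrades in total.
fair_total = 0.10 * n * 0.1 + 0.10 * n * 0.9   # group-fair split -> 0.1n
rawls_A = 0.18 * n * 0.1                       # 18% of A -> 0.018n
rawls_B = 0.02 * n * 0.9                       # 2% of B  -> 0.018n
rawls_total = rawls_A + rawls_B                # -> 0.036n
```

The equal per-group realized utilities (0.018n each) are what the min-max constraint enforces; the two allocations clearly differ.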

A-FARM, an Optimization Model
Since a utilitarian algorithm under welfare constraints cannot satisfy group fairness, we propose A-FARM, an optimization model for resource allocation with asymptotic group fairness guarantees. We first set up the objective loss function using sigmoidal efficiency functions. The rationale behind this optimization model and its asymptotic guarantees is provided in Appendix A.
Optimization Setup. We consider a multiplicative loss function for resource allocation, max_{r_ij} ∏_{i=1}^{M} Q(u_i), where M is the number of units. The goal of this objective is to allocate resources across units so as to maximize the system-level realized utility while also reaching group fairness. We formulate our optimization problem as arg max_{r_ij} ∏_{i=1}^{M} Q(u_i) subject to Σ_{i=1}^{M} Σ_{j=1}^{n_i} r_ij = R, where r_ij ∈ {0, 1}, j ∈ {1, . . ., n_i}, and i ∈ {1, . . ., M}. The budget R, a constraint on the optimization problem, specifies the number of agents in an undesired state who will be upgraded to the desired state. Instead of solving arg max_{r_ij} ∏_{i=1}^{M} Q(u_i), we solve the equivalent problem arg max_{r_ij} Σ_{i=1}^{M} log Q(u_i). Here, α is the transition rate from low to high efficiency per unit, f specifies where this transition happens, u_i/u_{m,i} is the unit-level efficiency, and A is the normalization constant that forces Q(u_i = u_{m,i}) = 1. This efficiency function increases with unit efficiency and is zero when the efficiency is zero. In addition, this approach allocates more resources to under-served units where u_i/u_{m,i} < f, and fewer resources to over-served units where u_i/u_{m,i} > f, by penalizing solutions for which the unit efficiency is less than f. The model therefore attempts to balance the efficiencies u_i/u_{m,i} across all units. This optimization model is a relaxed version of proportional efficiency (see Appendix A for definitions and proofs), which is equivalent to a utilitarian solution under a group fairness constraint.
Specifically, in the limit α → ∞ the sigmoid function becomes a step function, and if f = e, then the solution to this problem satisfies the group fairness criteria (see Appendix A). Figure 3 shows the sigmoidal efficiency function for varying α and f. In a finite-sample setting, where this solution might not exist, we use the sigmoid function to relax the group fairness assumption, so that the solution is close to the ideal group fair solution. The approximation is reflected in the transition rate α and fractional utility f of the sigmoid function, where we allow a certain amount of deviation from the system-level efficiency. Later, in Section 6.2, we evaluate the performance of this model and show that the solution converges to a group fair solution and can achieve the maximum expected efficiency under the group fairness constraint.
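The paper's exact parameterization of Q lives in its Equation (2) and Appendix A; the sketch below uses a hypothetical shifted-and-rescaled logistic with the stated properties: monotone in the unit efficiency, zero at zero efficiency, one at full efficiency, and approaching a step at f as α → ∞.

```python
import math

def Q(e, alpha=20.0, f=0.5):
    """Hypothetical sigmoidal efficiency function of the unit efficiency
    e = u_i / u_{m,i}, normalized so that Q(0) = 0 and Q(1) = 1.
    A stand-in for the paper's Q, not its exact parameterization."""
    s = lambda z: 1.0 / (1.0 + math.exp(-z))      # logistic sigmoid
    lo, hi = s(-alpha * f), s(alpha * (1.0 - f))  # values at e = 0 and e = 1
    return (s(alpha * (e - f)) - lo) / (hi - lo)

# Larger alpha sharpens the transition at e = f toward a step function,
# penalizing units whose efficiency falls below the fractional utility f.
```

Units with e < f sit on the steep, low-value side of the curve, so the optimizer gains the most by directing resources toward them, which is the balancing behavior described above.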
Theorem 5.7. The optimization problem in Equation (2) is a convex optimization problem, and there exists a unique tractable solution.

We employ the method of Lagrange multipliers to solve the above optimization problem. In this dual formulation, λ is the Lagrange multiplier, and the original optimization problem becomes equivalent to a min-max problem over λ and the per-unit budgets R_i = Σ_{j=1}^{n_i} r_ij. It is understood that u_i and R_i are functions of r_ij. In Equation (4), we employ the fact that the first term is separable in R_i to get to the second line. This is a min-max optimization problem, where the goal is to minimize over λ while maximizing over R_i. Since this is a convex optimization problem, we can solve it iteratively, as presented in Algorithm 1. The full algorithm can be found in Appendix J.
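Algorithm 1 itself is in Appendix J; as a rough illustration of the min-max structure, the sketch below solves a generic separable budgeted problem by bisecting on the multiplier λ, with each unit best-responding independently (the separability used above). The function `unit_value` and the bisection bounds are our assumptions, not the paper's.

```python
def solve_dual(unit_value, M, R_total, lam_lo=0.0, lam_hi=10.0, iters=60):
    """Bisection on the Lagrange multiplier for a separable problem:
    max sum_i v_i(R_i) subject to sum_i R_i = R_total, each v_i concave.
    unit_value(i, R_i) returns v_i(R_i). Generic sketch, not Algorithm 1."""
    def best_response(lam):
        # For a fixed lambda, each unit independently picks the integer
        # R_i maximizing v_i(R_i) - lam * R_i (separability of the dual).
        alloc = []
        for i in range(M):
            scores = [unit_value(i, r) - lam * r for r in range(R_total + 1)]
            alloc.append(max(range(R_total + 1), key=scores.__getitem__))
        return alloc

    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(best_response(lam)) > R_total:
            lam_lo = lam          # price too low: demand exceeds the budget
        else:
            lam_hi = lam          # price high enough: demand fits the budget
    return best_response(lam_hi)  # feasible allocation at the converged price
```

With concave per-unit values, total demand is non-increasing in λ, so the bisection converges to the price at which the budget constraint binds.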
Summary. A-FARM approximately solves the utility maximization problem, max_{r_ij} Σ_{i=1}^{M} Σ_{j=1}^{n_i} r_ij u_ij, subject to the group fairness constraint, MMD²(k, P_y, P_z) = 0, and the budgetary constraint, Σ_{i=1}^{M} Σ_{j=1}^{n_i} r_ij = R, via the dual optimization problem in Equation (4).

EXPERIMENTS

6.1 Benchmarking Utilitarian Decision-Maker
Simulation Setup. We simulate four distinct systems, each comprising 2,000 agents with an even split between initial beneficiaries and non-beneficiaries, thereby achieving class balance. We consider two types of initial conditions and two utility models. For comprehensive details, we direct readers to Appendix E for the model description, Table 1 for a summary of the systems, and Figure 8 for the specific initial conditions used. In each system, a decision-making algorithm (Random, Equalizer, or Utilitarian) is employed to select non-beneficiary agents for upgrades. Our budget constraints allow for upgrading only 5% of the non-beneficiary agents annually, while beneficiary agents experience a random decay rate of 5% per year. We let x be a two-dimensional continuous variable and u a one-dimensional continuous variable. Each simulation setup explores two utility models: (1) a random utility model, where u is independent of x, and (2) a biased utility model, where u and x are correlated. For each utility model, two sets of initial conditions are considered. Altogether, this results in four distinct systems generated by combining two initial condition setups with two utility models. We run 1,000 independent realizations for each system and evolve them using each of the three decision-making algorithms over eight time steps. Additional details, including the choice of hyperparameters, can be found in the Supplementary Material.
Results. For each system, the top and bottom panels of Figure 4 show the time evolution of GBD and system-level efficiency. Systems 1 and 2 satisfy the assumptions of Theorem 5.2 and Theorem 5.6, while Systems 3 and 4 deviate from them. As expected from Theorem 5.2, under the Random decision algorithm GBD decays towards zero independent of the simulation setup. Under the Equalizer decision algorithm, the system converges to a group fair state at a faster rate in all simulation setups. The Random and Equalizer decision algorithms have similar efficiency performance. By definition, the Utilitarian algorithm outperforms the Random and Equalizer algorithms in terms of efficiency, regardless of setup. In Appendix F, we present experiments with class-imbalanced systems.
As implied by Theorem 5.6 and confirmed by our experiments, when the utility function is independent of x, the Random and Utilitarian decision algorithms have similar GBD performance (Systems 1 and 2). This finding suggests that a utilitarian decision algorithm might be preferred over its equity-based alternatives under the independence criterion: the system converges to a group fair state while maximizing efficiency. When the utility function is correlated with the protected attributes x (Systems 3 and 4), a utilitarian decision algorithm is not guaranteed to eradicate group inequalities; in fact, such an algorithm either perpetuates or exacerbates them. Although equity-friendly algorithms, such as A-FARM, alleviate group inequalities, they can suppress the efficiency of the system. This observation suggests that there must be a trade-off between efficiency and fairness; i.e., group parity cannot be achieved without sacrificing efficiency. By definition, social welfare constraints restrict the space of acceptable solutions considered by a utilitarian decision-maker but do not guarantee group parity [3]. Thus, they cannot address the group inequalities demonstrated in Systems 3 and 4. Next, we illustrate how well A-FARM can eliminate group inequalities.

Benchmarking A-FARM
In this section, we conduct a set of simulations to show the performance of our resource allocation optimization model, A-FARM. We make comparisons with the utilitarian and random solvers described above. Additionally, we illustrate the trade-off between fairness and efficiency by varying the number of units.
Simulation Setup. We conduct simulations involving 2, 10, and 20 intrinsic groups, each with a total of 2,000 agents. These intrinsic groups serve as natural clusters within the entire population. In each scenario, we examine two types of utility models: unbiased and biased. In the unbiased utility model with two intrinsic groups, both the beneficiary and non-beneficiary populations have their utility values randomly generated from a common log-normal distribution, characterized by a shared mean μ and standard deviation σ. Importantly, in this model, the utility function u is conditionally independent of the variable x. Conversely, in the biased utility model with two intrinsic groups, the utility values for the beneficiary and non-beneficiary agents come from log-normal distributions with different means and standard deviations, making u and x correlated. For a more detailed explanation of the generative model used in these simulations, readers are directed to Appendix K.
In the allocation algorithm, the upgrade and downgrade rates are taken to be 30% and 10% per year, respectively. The system is evolved for ten years with each decision algorithm. The hyperparameters of the sigmoidal utility function are taken as α = 20 and f = (Σ_{i}^{M} u_i)/(Σ_{i}^{M} u_{m,i}); the results are insensitive to 50% variation in these hyperparameters.
Results. Figure 5 shows the performance of A-FARM and the utilitarian and random solvers. The red dotted line shows the efficiency bound of Corollary A.8, the maximum achievable efficiency under the group fairness constraint. A utilitarian solver prioritizes items with the highest utilities for upgrades, leading to maximum efficiency compared to the other solvers. However, its exclusive focus on utility comes at the expense of group fairness, causing it to exhibit the largest group inequality among all solvers. In contrast, the random solver upgrades items arbitrarily without considering utility maximization. This approach naturally leads to reduced disparity but also compromises efficiency. A-FARM strikes a balance: it aims to maximize efficiency while maintaining group fairness. In doing so, it may give up some efficiency, yet it still attains the maximum achievable efficiency under group fairness constraints (red dotted line).
Figure 6 illustrates the trade-off between fairness and efficiency. Detailed initial conditions for all models are provided in the Supplementary Material. It is evident that achieving reduced disparity often entails a compromise in efficiency. In the limit of a single unit, A-FARM performs identically to the utilitarian solver. As the number of units grows, both the group disparity and the efficiency decline until the number of units matches the number of intrinsic groups. Once the unit count reaches the number of intrinsic groups, further separating the utility from the group attributes of agents becomes infeasible; by this point, the system has effectively reached a group fair solution. Thus, introducing more units beyond the number of intrinsic groups is not beneficial and only results in efficiency losses. These results emphasize that by incrementally increasing the number of units up to the number of intrinsic groups, A-FARM navigates the trade-off, transitioning from a utilitarian approach to one that champions group fairness.
A distinguishing characteristic of A-FARM is that it does not require protected attributes as an input. Instead, it relies on data that has been pre-partitioned into specific groups. The efficiency of this optimization strategy is contingent upon the accuracy of these groups. Consequently, the complexity or dimensionality of the protected attributes has no impact on the algorithm's performance. In Figure 6, A-FARM demonstrates superior performance over the utilitarian solver when the utility model is biased. The utilitarian solution (when M = 1) and the other A-FARM solutions displayed in the leftmost column, which pertains to the unbiased utility model, all exhibit comparable performance and efficiency outcomes. This is because the utility associated with non-beneficiary agents is independent of their protected attributes; the system naturally gravitates towards a group-fair equilibrium, as implied by Theorem 5.6.

Fig. 5. Time evolution of a biased system under A-FARM (blue), a utilitarian (black) decision-maker, and a random (orange) decision-maker. The shaded 68% confidence intervals are computed using 100 simulations. The red dotted line is the maximum achievable efficiency bound. Top panels: group beneficiary disparity. Bottom panels: efficiency.
Efficiency-Fairness Trade-off. The group fairness constraint puts an upper bound on the achievable efficiency (Corollary A.8); i.e., under group fairness constraints, a system may not achieve its utilitarian efficiency, as seen in Figure 6. Hence, being able to control the trade-off between efficiency and group fairness is a required feature in many practical applications [5]. A-FARM not only achieves the steady-state maximum achievable efficiency but also, if desired, can control the fairness-efficiency trade-off.
Limitations. Here, we assume the intrinsic number of groups and the group memberships are known. In practice, however, we might not know these in advance. The optimal strategy for defining groups is beyond the scope of this work and will be explored in future work.

Case Study: Infrastructure Upgrade
The application of A-FARM to the selection of roads for repair in Detroit is a prime example of how this algorithm can be used to solve real-world problems. The goal of this case study is to distribute funds for road repair in a way that maximizes the efficiency of the system while narrowing the group inequality by at least a factor of two. In this case study, we consider the roads in the city of Detroit, Michigan, specifically the subset of roads known as "higher function roads" that have associated pavement condition data. Suppose the city government has a budget for road repair, and the funds need to be distributed over roads in poor condition. The roads can be considered as agents, where a beneficiary state is a good-conditioned road and a non-beneficiary state is a poor-conditioned road. With each road having an associated utility, a decision-maker can upgrade the status of at most 40% of these poor-conditioned roads via repaving. Natural conditions over time will randomly downgrade the status of good-conditioned roads to poor at a 10% rate. The utility of each road is the number of people it serves, i.e., the population of the census block group it is located in. Therefore, efficiency is equivalent to the fraction of the city population with access to good-quality roads.

Fig. 6. With 2, 10, and 20 intrinsic groups, the trade-off between fairness and efficiency can be seen by increasing the number of units M up to the number of intrinsic groups. The shaded regions are 68% confidence intervals computed using 100 simulations. Right panel: application of A-FARM to road maintenance in the city of Detroit. Utility is the census tract population, and group disparity is computed with respect to the median household income and poverty rate. The A-FARM solution can reduce the group disparities by a factor of three while sacrificing only 2% in efficiency.
Detroit Data Set. Each road is mapped to the closest census tract. Group disparity is defined with respect to the median household income and poverty rate of the closest census tract, x = {median household income, poverty rate}. We use the 2018 PASER rating data provided by the city managers to determine whether a road is in good or poor condition. The optimization units are constructed based on neighborhood boundaries; current neighborhood boundaries are compiled by Detroit's Department of Neighborhoods staff in concert with community groups.
Results. Our primary objective is to reduce group inequality by at least half while optimizing system efficiency. Group inequality is defined with respect to the median household income and poverty rate of the closest census block group. We utilize neighborhood boundaries to define the A-FARM units, ensuring each road belongs to a single unit. It is essential to point out that the size of a neighborhood unit falls between that of tracts and zip codes. This intermediary size facilitates a degree of mixing among distinguishable groups.

Fig. 7. Application of A-FARM to road maintenance in the city of Detroit. Utility is the population of the census tract, and the group disparity is calculated with respect to the median household income and poverty rate. The A-FARM solution can reduce the group disparities by a factor of three while sacrificing only 2% in efficiency.
Figure 7 presents the system's progression under the random, utilitarian, and A-FARM problem-solving methods. Our primary focus is on the equilibrium state rather than the transition from the initial condition. The red-shaded area represents the 68% distribution range of the maximum efficiency attainable by A-FARM. Considering the intermingling of identifiable groups within neighborhoods, it is unsurprising that A-FARM does not settle into a perfectly group-fair solution. However, it still successfully meets our preliminary goal of mitigating group inequality. Specifically, A-FARM reduces group inequality by over two-thirds with a mere 2% efficiency trade-off.

CONCLUSION
The potential of AI-powered algorithms to revolutionize decision-making is immense, but it must be accompanied by a commitment to equity and fairness. This work takes a crucial step towards that goal by introducing Group Beneficiary Disparity, a measure that allows us to identify and quantify group inequalities. By using this measure in our case study on road repair in Detroit and implementing the A-FARM algorithm, we have demonstrated that it is possible to make decisions that not only optimize efficiency but also narrow group disparities.
This work is a significant contribution to the ongoing efforts toward creating a more just and equitable society. By incorporating Group Beneficiary Disparity into algorithmic decision-making, we can ensure equal access for all members of society, especially those most under-served and marginalized. Moving forward, this work can serve as a foundation for developing more equitable and fair algorithms that prioritize social welfare and minimize group disparities. By continuing to scrutinize and refine these computational models, we can build optimism about the potential of AI-powered algorithms to play a pivotal role in achieving our collective goal of a just and equitable future. It is worth emphasizing that while incorporating equity constraints may come with short-term efficiency costs, the long-term benefits of economic growth and societal progress far outweigh these costs [8].

APPENDICES

A PROPORTIONAL EFFICIENCY AND GROUP FAIRNESS
We first develop a theoretical framework that provides the rationale behind the proposed optimization model. This framework borrows from the proportional fairness literature [1,54,57] and proves under what conditions the A-FARM algorithm asymptotically converges to a group fair solution. Each unit competes over resources to achieve its maximum utility by upgrading its agents to a desired state. Since resources are scarce, not all agents in an undesired state can be upgraded. The decision-maker allocates resources across units to maximize the utility of the entire system, not just of one unit. Simultaneously, the decision-maker is concerned about the group fairness of a proposed allocation. As we will show, there is a trade-off between unit-level and system-level efficiency, which can be mapped to the trade-off between system-level efficiency and group fairness.
Following Definition A.1, a proportionally efficient state is one in which the efficiency of each unit is at least as large as the efficiency of the entire system. First, we make three assumptions.
Scaling assumption. Under this assumption, the utility probability density function of unit i, after appropriate scaling of u, is the same as the utility probability density function of the entire system for all units i ∈ {1, . . ., M}, i.e., P_i(γ_i u) du = P_b(u) du, where γ_i ∈ ℝ_{>0} and P_b(u) is a baseline distribution.
Unit utilitarian assumption. Under this assumption, the agents of unit i are ranked in descending order of utility, and only the top f_i = P_i(S = 1) percent are in a desired state.

Lemma A.2. Under the scaling and unit utilitarian assumptions, when n_i → ∞, the efficiency of

Proof. Suppose that for each unit i, agents are ranked by utility in descending order, and u(f_i) denotes the utility at the (1 − f_i)-th percentile of the utility distribution.
and β is independent of the unit. This implies that X and S are independent. Hence, P(S = 1 | X = x_i) = P(S = 1) = P(S = 1 | X = x_j) for all x_i, x_j ∈ X, and thus P_{c_1}(S = 1) = P_{c_2}(S = 1). This argument is reversible in the other direction.
Lemma A.4. Under the scaling and unit utilitarian assumptions, if P_i(S = 1) = P_j(S = 1) then β_i = β_j, and if P_i(S = 1) ≠ P_j(S = 1) then β_i ≠ β_j.

Proof. This is a straightforward consequence of the definition of β_i. If P_i(S = 1) = P_j(S = 1), then f_i = f_j; hence β_i = β_j. If P_i(S = 1) ≠ P_j(S = 1), then f_i ≠ f_j; hence β_i ≠ β_j.
No mixing assumption. Let C ⊆ 2^X be the collection of all identifiable groups as defined by X. Under this assumption, there are M = card(C) units such that the members of each unit belong to the same identifiable group as defined by x, i.e., x = x′ for all x, x′ ∈ c, for every unit c ∈ C.

The no mixing assumption says that agents with different attributes x are not mixed together in the same unit. For notational convenience, we denote

Lemma A.5. Under the scaling, no mixing, and unit utilitarian assumptions, P(x | S = 1) = P(x | S = 0) if and only if e_i = e_j for all i, j ∈ {1, . . ., M}, where M = card(C).
Proof. By the definition of proportional efficiency, e_i ≥ e for all i ∈ {1, . . ., M}. Therefore, either (1) e_i = e for all i ∈ {1, . . ., M}, or (2) e_i ≠ e_j for at least one pair i ≠ j. Case (1) trivially implies that e_i = e_j for all i, j ∈ {1, . . ., M}. In Case (2), we can re-order the e_i's such that e_1 ≤ e_2 ≤ · · · ≤ e_M with e_1 < e_M; since e is a weighted average of the unit efficiencies, this forces e_1 < e. Therefore, e_1 < e is a contradiction, and hence it must be the case that e_i = e_j for all i, j ∈ {1, . . ., M}.

In the other direction, if e_i = e_j for all i, j ∈ {1, . . ., M}, then e_i = e for all i, and proportional efficiency is satisfied.
The above lemmas provide the basis for the main theorem of this work, where we show under what conditions proportional efficiency and group fairness are equivalent.
Theorem A.7. Given the assumptions above, when n_i → ∞ for all i ∈ {1, . . ., M}, a system is proportionally efficient if and only if it is in a group fair state.

Proof. If a system is proportionally efficient, then e_i = e_j for all i, j ∈ {1, . . ., M} by Lemma A.6, and hence P(x | S = 1) = P(x | S = 0) by Lemma A.5. In the other direction, if a system is group fair, then e_i = e_j for all i, j ∈ {1, . . ., M} by Lemma A.5, and hence it is proportionally efficient by Lemma A.6.
Given that proportional efficiency and group fairness are asymptotically equivalent here, instead of solving for group fairness, we propose an optimization model that solves for proportional efficiency. Kleinberg et al. [35] show a result similar to Theorem A.7, in which each subgroup seeks to maximize utility and satisfy equity constraints.
Corollary A.8. Under group fairness constraints, the system-level efficiency is bounded, e ≤ β.

Proof. Suppose m_i = P_i(S = 1) n_i = Σ_{j=1}^{n_i} s_ij. It is straightforward to show Σ_{j=1}^{n_i} s_ij u_ij ≤ Σ_{j=1}^{m_i} u_ij for all i ∈ {1, . . ., M}: the utility sum of the first m_i agents, ranked in descending order of utility, is at least the utility sum of any other m_i agents. This implies Σ_{i=1}^{M} Σ_{j=1}^{n_i} s_ij u_ij ≤ Σ_{i=1}^{M} Σ_{j=1}^{m_i} u_ij, and therefore e ≤ β.

This Corollary is a result of relaxing the unit utilitarian assumption. First, we note that, under the group fairness constraint, β_i = β_j for all i, j ∈ {1, . . ., M}. This implies that relaxing the unit utilitarian assumption results in e ≤ β when the system is in a group fair state. The corollary suggests that under group fairness constraints, the maximum efficiency of the system is bounded, and no decision-maker that satisfies group fairness can achieve greater efficiency.
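Corollary A.8's bound can be illustrated numerically: for any per-unit beneficiary counts m_i, no selection of beneficiaries can realize more utility than taking each unit's top m_i. The data, unit sizes, and counts below are arbitrary choices for illustration.

```python
import random

rng = random.Random(2)
# Four units of 200 agents with log-normal utilities (arbitrary example data).
units = [[rng.lognormvariate(0.0, 1.0) for _ in range(200)] for _ in range(4)]
m = [40, 60, 20, 80]  # number of beneficiaries m_i per unit

total_u = sum(sum(u) for u in units)
# beta: each unit's m_i beneficiaries are its top-m_i utilities
# (the unit utilitarian selection).
beta = sum(sum(sorted(u, reverse=True)[:mi]) for u, mi in zip(units, m)) / total_u
# e: an arbitrary (here random) choice of m_i beneficiaries per unit.
e = sum(sum(rng.sample(u, mi)) for u, mi in zip(units, m)) / total_u
# Corollary A.8: the top-m_i selection bounds any other selection, so e <= beta.
```

The inequality holds per unit (a top-m_i sum dominates any other m_i-element sum), so it holds for the system total as well.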
In a finite-sample setting, where the number of agents in each unit is finite, it is not guaranteed that a proportionally efficient solution exists. The existence of such a solution is not important in our application, as we are not strictly seeking proportional efficiency. While we use the proportional efficiency framework as motivation for our optimization model, in our actual optimization setup we relax the proportional efficiency requirement and find an approximate solution.

A.2 Fairness and Efficiency Trade-off
The goal of a decision-maker is to gradually converge to a group fair steady-state solution if the system is not initially in such a state. If the utility u and the group attribute x are dependent, there must be a trade-off between group fairness requirements and utility maximization (see Corollary A.8). Under this assumption, a utilitarian decision-maker, whose goal is to maximize the system-level utility, perpetuates group disparities. Meanwhile, focusing only on minimizing group unfairness leads to a sub-optimal solution in terms of efficiency. Hence, there will be a fairness-efficiency trade-off. A desirable decision-making model allows the user to control this trade-off [5,10,40].
The proportional efficiency formulation not only allows the incorporation of group fairness constraints into an optimization model but also can effortlessly handle the fairness-efficiency trade-off. Consider the following two extreme scenarios: (1) there is only one unit, and (2) there are at least card(X) units, which can be infinitely many, such that there is no mixing between the units.
Under the first scenario, where there is only one unit, a decision-maker that solves for proportional efficiency becomes equivalent to a utilitarian decision-maker. Under the second scenario, a decision-maker who solves for proportional efficiency ignores utility optimization between identifiable groups; its goal is to find a group fair solution. Therefore, by decreasing the number of units, a decision-maker transitions from one that finds a group fair solution to one that finds a utilitarian solution. This adjustment in the number of units controls the trade-off between fairness and efficiency.

A.3 Advantages and Limitations
The key advantages of this framework for resource allocation are twofold. First, it is straightforward to control the trade-off between fairness and efficiency by adjusting the coarseness of the units. Second, in order to consider group fairness, there is no need to access protected variables as long as the units can be used as proxies for group membership (the no mixing assumption). This property has the potential to alleviate privacy concerns in some settings. For example, in urban planning, geographical units such as census blocks, census tracts, zip codes, and counties can be used as proxies for socio-economic protected variables such as household income, poverty rate, race, and ethnicity. Adjusting the coarseness of the units from small geographical units to larger ones allows a decision-maker to control the fairness-efficiency trade-off.
It is important to note that if groups are constructed improperly, this model can simultaneously lead to inefficiencies and perpetuate structural disparities. Future work should study the optimal way of constructing groups.

B PROOF OF EQUILIBRIUM THEOREMS
We start by defining the equilibrium state and proving a few useful lemmas that shed light on the dynamical properties of the system U when the decision and decay functions are independent of the protected variable(s) x.

Lemma B.1. If the decision and decay functions are independent of x, then

Proof. We have a chain of equalities; the second equality follows from the fact that the decision function, when s_i = 0, and the decay function, when s_i = 1, are independent of x.
Lemma B.2. If the decision and decay functions are independent of x and the system is in a group fair state, then the system is in an equilibrium state.

Proof. When the decay and decision functions are independent of x, employing Lemma B.1 yields an expression for P(x | S_{t+1} = 1) in terms of P(x | S_t = 1) and P(x | S_t = 0). Since the system is in a group fair state, P(x | S_t = 1) = P(x | S_t = 0), and this expression simplifies to P(x | S_{t+1} = 1) = P(x | S_t = 1). The same argument applies to the S = 0 state; thus the system is in equilibrium.
This lemma implies that when the decay and decision functions are independent of x, the group fair state (see Definition 3.1) and the equilibrium state (see Definition 5.1) are equivalent.

Theorem B.3. Suppose a system is in an arbitrary state at time t_0. This system asymptotically converges to a group fair state in the limit t → ∞ if the decision and decay functions are independent of x at all times.
Proof. P(x | S_{t_0} = 1) may be decomposed into two terms, P(x) and q(x). Let P(x) be the probability density function of system U and let q(x) be a function such that P(x | S_{t_0} = 1) = P(x) + q(x). Rearranging P(x) = P(x | S_{t_0} = 1) P(S_{t_0} = 1) + P(x | S_{t_0} = 0) P(S_{t_0} = 0) yields

P(x | S_{t_0} = 0) = [P(x) − P(x) P(S_{t_0} = 1) − q(x) P(S_{t_0} = 1)] / P(S_{t_0} = 0) = P(x) − q(x) P(S_{t_0} = 1) / P(S_{t_0} = 0).

In the following, we show that the evolution of this system is controlled only by a scaling of q(x); that is, P(x) and q(x) are stationary and do not vary with time. Thus, at time t, up to a normalization factor,

P(x | S_t = 1) = P(x) + α(t) q(x).

If we show lim_{t→∞} α(t) = 0, then we have proved the theorem.
If the system is in a group fair state, then q(x) = 0. By Lemma B.2, the system stays in a group fair state; thus, the theorem holds in this case. Next, we assume that the system is not in a group fair state.
Let us evolve the system for one time step. Using Lemma B.1,

P(x | S_{t_0+1} = 1) = [(1 − c(t_0)) P(S_{t_0} = 1) P(x | S_{t_0} = 1) + r(t_0) P(S_{t_0} = 0) P(x | S_{t_0} = 0)] / P(S_{t_0+1} = 1).

Substituting P(x | S_{t_0} = 1) = P(x) + q(x) and P(x | S_{t_0} = 0) = P(x) − q(x) P(S_{t_0} = 1)/P(S_{t_0} = 0) into this expression and simplifying with Bayes' theorem yields

P(x | S_{t_0+1} = 1) = P(x) + (1 − r(t_0) − c(t_0)) [P(S_{t_0} = 1) / P(S_{t_0+1} = 1)] q(x).

In the last line, the factor P(S_{t_0} = 1)/P(S_{t_0+1} = 1) is absorbed into the normalization factor. The normalization factor affects both terms, P(x) and q(x), similarly and thus can be factored out. This shows that the evolution of the system, up to a normalization, is controlled only via the coefficient α(t), while P(x) and q(x) are time independent. At each time step, α(t) decays by a factor of (1 − r(t) − c(t)). After T time steps, up to a normalization, we have

α(t_0 + T) = ∏_{i=0}^{T−1} (1 − r(t_0 + i) − c(t_0 + i)).

Since r(t_0 + i) + c(t_0 + i) < 2 at all times, we have −1 < 1 − r(t_0 + i) − c(t_0 + i) ≤ 1, with strict inequality on the right whenever r(t_0 + i) + c(t_0 + i) > 0. Hence, lim_{T→∞} α(t_0 + T) = 0 and lim_{T→∞} P(x | S_{t_0+T} = 1) = P(x). The same argument gives lim_{T→∞} P(x | S_{t_0+T} = 0) = P(x). Hence, asymptotically, the system converges to an equilibrium with lim_{T→∞} P(x | S_{t_0+T} = 0) = lim_{T→∞} P(x | S_{t_0+T} = 1) = P(x).
This theorem indicates that a random decision function, assuming the decay function is also independent of x, asymptotically converges to a group fair state in the limit T → ∞. The case r(t) + c(t) = 2 occurs when the degrade and upgrade rates are both 100%, r(t) = c(t) = 1. In such a setting, item labels merely switch, and the system does not converge to a group fair state. In practice, we do not expect the upgrade and downgrade rates to be 100%.

Theorem B.4 (Maximal Equilibrium State). The maximal equilibrium state is a state at which the joint distribution of (x, S_t) does not change over time. Suppose the decision and decay functions are independent of x. Then the maximal equilibrium state has a unique solution. This state is a group fair state with P(S_t = 1)/P(S_t = 0) = r(t)/c(t).
Proof. Let P(x, S_t) be the joint distribution of {x, S_t}. If the system is in a maximal equilibrium state, then P(x, S_t) = P(x, S_{t+1}). If a joint distribution does not vary over time, neither do its conditional distributions: if P(x, S_t) = P(x, S_{t+1}), then P(x | S_t = 0) = P(x | S_{t+1} = 0) and P(x | S_t = 1) = P(x | S_{t+1} = 1). A maximal equilibrium state must satisfy the equilibrium state requirement (see Definition 5.1). According to Lemma B.2, the maximal equilibrium state is a group fair state, but the converse is not true.
The marginal dynamics satisfy P(S_{t+1} = 0) = c(t) P(S_t = 1) + (1 − r(t)) P(S_t = 0). Since we are in the maximal equilibrium state, P(S_{t+1} = 0) = P(S_t = 0); substituting this into the left-hand side and solving for the ratio β := P(S_t = 1)/P(S_t = 0) yields β = r(t)/c(t). Setting S_{t+1} = 1 leads to the same conclusion. The converse is straightforward. Hence, at time t, the solution for the maximal equilibrium state is unique.
Remark B.5. Not all group fair states are maximal equilibrium states.
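Theorems B.3 and B.4 can be sanity-checked numerically. The sketch below is an illustrative two-group simulation (not the paper's code; the initial masses and rates are assumptions): it iterates x-independent upgrade/decay dynamics and checks that P(x | S = 1) converges to P(x | S = 0) and that P(S = 1)/P(S = 0) converges to r/c.

```python
import numpy as np

# Mass of agents: rows = state (0 = non-beneficiary, 1 = beneficiary),
# columns = protected group. Start far from group fairness: group A is
# mostly beneficiary, group B mostly non-beneficiary.
m = np.array([[0.10, 0.40],   # S = 0
              [0.40, 0.10]])  # S = 1
r, c = 0.3, 0.1  # upgrade (decision) and decay rates, independent of x

for _ in range(500):
    up = r * m[0]    # non-beneficiaries upgraded this step
    down = c * m[1]  # beneficiaries whose status decays
    m = np.array([m[0] - up + down, m[1] + up - down])

cond1 = m[1] / m[1].sum()      # P(x | S = 1)
cond0 = m[0] / m[0].sum()      # P(x | S = 0)
ratio = m[1].sum() / m[0].sum()  # P(S = 1) / P(S = 0), should approach r/c
```

After 500 steps the deviation term has decayed by (1 − r − c)^500, so the conditionals coincide and the beneficiary/non-beneficiary ratio sits at r/c = 3, as Theorem B.4 predicts.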
Lemma B.6. Let a system be in a group fair state at time t_0. If c and r are constant and r + c ≤ 1, then this system asymptotically converges to the maximal equilibrium state in the limit t → ∞.
Proof. Let this system, consisting of n agents, be in a group fair state at time t_0, i.e., P(x | S_{t_0} = 0) = P(x | S_{t_0} = 1), but not (necessarily) in the maximal equilibrium state. Writing out the upgrade rate, from the first line to the second line we use the fact that the decision function is independent of x; to get the last line, we use the fact that u is independent of x. The last line is not a function of x, implying that r(t, x) = r(t) is independent of x. Now we need to show that these two conditions still hold after one time step, i.e., that the decision function is only a function of u and that the utility of the non-beneficiary agents is independent of the vector x. The former is obvious. For the latter, we need to show P(u | S_{t+1} = 0, x) = P(u | S_{t+1} = 0). The non-beneficiary agents at time t + 1 are a weighted average of two populations: those who did not get upgrades and those whose status decayed to zero. Since the utility of both populations is independent of x, a weighted average of them is also independent of x. The decision function r(t + 1, x) is therefore independent of x, and according to Theorem 5.3, the system converges to a group fair state as t → ∞.
C LIMITATIONS EXAMPLE

Definition 3.1 utilizes a concept of equal access; it asks whether there is a relation between the protected variables of the population and whether they are beneficiary or non-beneficiary. For some real-world applications, this definition of fairness might not be applicable, or its application may be inappropriate. Often, this fairness criterion cannot be satisfied without violating one or more social welfare counterparts. In the following, we compare the group fairness of this work with proportional fairness. Group proportional fairness requires that each group satisfy realized_i(u) ≥ max_i(u)/n, where max_i(u) is the maximum utility of group i if this group receives all the resources, n is the number of groups, and realized_i(u) is the realized utility of group i under a proportionally fair allocation. In other words, the realized utility of each group should be equal to or larger than its maximum utility divided by the number of groups.
Example 1 (Proportional fairness). Assume a utilitarian decision-maker and two groups, A and B, each with n agents. In group A, every agent has utility 0.5. In group B, half of the agents have utility 0.9 and the rest have utility 0.1. Initially, all agents are in the non-beneficiary state, and the decision-maker can upgrade only 10% of the population. To satisfy group fairness as defined in Definition 3.1, the decision-maker picks 10% of group A and 10% of group B. The average utility of both groups is 0.5. The sums of the realized utility of each group after the upgrade are then 0.05n and 0.09n, respectively. This violates proportional fairness; to achieve proportional fairness here, the sums of the realized utility of the two groups should be the same.
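Example 1's arithmetic can be verified with a few lines. The numbers below are the example's; the code and the concrete group size n = 1000 are ours:

```python
n = 1000  # agents per group (illustrative size)

u_A = [0.5] * n
u_B = [0.9] * (n // 2) + [0.1] * (n // 2)

# Group-fair allocation: upgrade 10% of each group. A utilitarian
# decision-maker picks the highest-utility agents within each group first.
k = n // 10
realized_A = sum(sorted(u_A, reverse=True)[:k])  # 0.1n agents at utility 0.5
realized_B = sum(sorted(u_B, reverse=True)[:k])  # 0.1n agents at utility 0.9
```

This reproduces the realized utilities 0.05n for group A and 0.09n for group B.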
Example 2 (Rawlsian min-max fairness). Assume a utilitarian decision-maker and two groups, A and B, each with n agents, where the utilities of the agents in groups A and B are 0.1 and 0.9, respectively. Initially, all agents are in the non-beneficiary state, and the decision-maker can upgrade only 10% of the population. To satisfy group fairness as defined in Definition 3.1, the decision-maker picks 10% of group A and 10% of group B; the sum of the realized utility of the system after the upgrade is then 0.01n + 0.09n = 0.1n. A utilitarian decision-maker under a Rawlsian min-max fairness constraint instead picks 2% of group B and 18% of group A, so that the sum of utility of each group is 0.018n, and the sum of realized utility of the system after the upgrade becomes 0.036n. Hence, the solution under Rawlsian min-max fairness differs from the solution under a group parity constraint.

D EQUALIZER DECISION MAKER ALGORITHM
To reach a group fair solution, the equalizer over-(under-)samples the over-(under-)represented non-beneficiary groups. We employ an importance sampling strategy to operationalize this idea. The importance sampling weights are w(x, t) = P(x | S_t = 1)/P(x | S_t = 0). Employing these weights, we can construct a more efficient decision function r(x_i, t) = r(t) w(x_i, t) / Σ_{x∈Z} w(x, t). The equalizer samples agents with S = 0 at a rate proportional to w(x, t). Since P(x | S_t = 0) and P(x | S_t = 1) are unknown, we employ a kernel density estimator (KDE) to compute the sampling weights. This estimator can become unstable when the denominator is very close to zero; therefore, we add a small value ϵ to the denominator.
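A minimal sketch of the weight computation, using scipy's `gaussian_kde` as the density estimator (the one-dimensional distributions, sample sizes, and ϵ value are illustrative assumptions, not the paper's configuration):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
x_ben = rng.normal(1.0, 0.5, 500)  # protected variable of beneficiaries (S = 1)
x_non = rng.normal(0.0, 0.5, 500)  # protected variable of non-beneficiaries (S = 0)

kde_num = gaussian_kde(x_ben)  # estimates P(x | S_t = 1)
kde_den = gaussian_kde(x_non)  # estimates P(x | S_t = 0)
eps = 1e-6                     # stabilizes the ratio when the denominator is near zero

# Importance weights for the non-beneficiary agents, then normalized
# sampling probabilities for the equalizer decision function.
w = kde_num(x_non) / (kde_den(x_non) + eps)
p = w / w.sum()
```

Non-beneficiary agents whose x resembles the beneficiary population receive larger weights, so the normalized p implements the w(x, t)-proportional sampling described above.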

E DATA PREPROCESSING AND MODEL PARAMETERS
Simulation Parameters. For each simulation setup, we consider two utility models: a random utility model and a biased utility model. In the random utility model, the utilities of the beneficiary and non-beneficiary populations are randomly drawn from a log-normal distribution with μ = 1 and σ = 0.5, where μ is the mean and σ the standard deviation of the logarithm of the utility. In the biased utility model, the utility values of the beneficiary agents are drawn from a log-normal distribution with μ = 0.5 and σ = 0.5, and the utility values of the non-beneficiary agents are drawn from a log-normal distribution with μ = 2.0 and σ = 0.5. The utility function of the random utility model is independent of x, while the biased utility model is designed such that u and x are correlated.

Initial Condition. Figure 8 shows a sample realization of the initial conditions for Systems 1 through 4. Here, the orange points represent the beneficiary agents, while the blue points signify the non-beneficiary agents. The figure is generated based on the following model for initial conditions. We designate half of the sample as beneficiaries and the other half as non-beneficiaries. Their two-dimensional protected variables are generated as outlined below.
- Two Gaussian Mixtures. The protected variables for the beneficiary group are shifted along both dimensions relative to those of the non-beneficiary group. These initial conditions are displayed in the top panels of Figure 8. Subsequently, we allocate a utility score to each group based on two scenarios.
- Random Utility. The utility scores for both the beneficiary and non-beneficiary groups are generated from a log-normal distribution, u ∼ logNormal(μ = 2.1, σ² = 0.4).
- Biased Utility. The utility scores for the beneficiary and non-beneficiary groups are distinctly distributed. Specifically, the utility for the beneficiary group is drawn from u ∼ logNormal(μ = 1.7, σ² = 0.2), while for the non-beneficiary group it is drawn from u ∼ logNormal(μ = 2.5, σ² = 0.2).
The specific combinations of utility scenarios and systems are detailed in Table 1. For instance, in System 1, we use two Gaussian mixtures and random utility for modeling the protected attributes and utility, respectively.
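The two utility models above can be drawn directly with numpy; note that `Generator.lognormal` takes the log-space mean and sigma, matching the μ and σ conventions used here. A sketch with an illustrative sample size:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000  # illustrative population size

# Random utility model: one distribution for everyone (independent of x)
u_random = rng.lognormal(mean=1.0, sigma=0.5, size=n)

# Biased utility model: beneficiaries and non-beneficiaries differ
u_beneficiary = rng.lognormal(mean=0.5, sigma=0.5, size=n)
u_non_beneficiary = rng.lognormal(mean=2.0, sigma=0.5, size=n)
```

The median of a logNormal(μ, σ²) draw is exp(μ), so the non-beneficiary utilities in the biased model sit well above the beneficiary utilities.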
Preprocessing. In all of our experiments, we apply the uniform quantile transformer implementation of Scikit-Learn to renormalize our data. A quantile transformer is a non-linear transformation that maps the probability density function of each covariate to a uniform distribution. All the data are mapped to the range [0, 1]^d, where d is the dimension of the covariate space. Sampling weights and MMD are also computed in the transformed space. The Gaussian RBF kernel has one free parameter, its bandwidth. To find a suitable bandwidth, Gretton et al. [25] proposed an algorithm for minimizing the Type II error for a given test level, an upper bound on the probability of making a Type I error. This hyper-parameter is important in ensuring optimal test power [see Figure 6 in 19].
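The quantile renormalization can be sketched without Scikit-Learn using empirical-CDF ranks. This is a simplified stand-in for `QuantileTransformer(output_distribution='uniform')`: it ignores ties and out-of-sample mapping, and is only meant to show the transformation applied to each covariate:

```python
import numpy as np

def uniform_quantile_transform(X):
    """Map each column of X to [0, 1] via its empirical CDF.

    A numpy analogue of sklearn's uniform quantile transformer for
    continuous, tie-free data: each value is replaced by its rank
    scaled to [0, 1], so every covariate becomes uniform.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    ranks = X.argsort(axis=0).argsort(axis=0)  # 0 .. n-1 per column
    return ranks / (n - 1)

rng = np.random.default_rng(1)
X = rng.lognormal(2.0, 0.5, size=(1000, 2))  # heavily skewed covariates
U = uniform_quantile_transform(X)            # uniform on [0, 1]^2
```

Because the transform is rank-based, it is monotone in each covariate, so orderings (and rank-based statistics) are preserved while the marginals become uniform.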

Experiment parameters.
In our experiments, we use a Gaussian RBF kernel and set the bandwidth as specified in Table 2, which is close to the optimum of Gretton et al. [25]. Our results are insensitive to a few percent variation in the kernel bandwidth. All other simulation and decision function parameters are summarized in Table 2. The simulation setup (initial condition and how the utility is defined) is the same as in the previous experiment. The only difference is that we allow the initial ratio P(S_{t=0} = 1)/P(S_{t=0} = 0) to vary. We assume that there are 2,000 agents with S_{t=0} = 0 and then compute the number of agents with S_{t=0} = 1 according to the prespecified ratio.

Results. Figure 9 shows the evolution of systems with imbalanced classes. Since r(t)/c(t) = 1, according to Theorem 5.3 these systems should converge to the maximal equilibrium state with lim_{t→∞} P(S_t = 1)/P(S_t = 0) = 1, which implies that the efficiency of the system under the random and equalizer decision functions asymptotically converges to 50%. Here, we notice a trade-off between efficiency and fairness similar to the one observed before when the utility and protected variables are not independent. When the number of non-beneficiary agents is larger than the number of beneficiary agents, the efficiency increases at a faster rate. This is an artifact of the difference in the initial condition: with more non-beneficiary agents, the decision-maker has more options to maximize utility, and the system is further from equilibrium; thus, the rate of change is larger than when the system is close to equilibrium.

F.2 Hyper-parameters
An analysis of the hyper-parameters reveals that our results are insensitive to small variations in moderately sized α and f. In general, 10 < α < 100 gives well-behaved results. Too large an α induces a sigmoidal efficiency function that is very close to a step function, leading to slow convergence. On the other hand, too small an α results in a sigmoidal efficiency function that is approximately linear, with an ill-behaved gradient. Also, since we want to achieve approximately proportional efficiency, we set f to the average efficiency at the system level.
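The qualitative effect of α can be illustrated with an assumed logistic parameterization of the sigmoidal efficiency function (consistent with the exp{−α(u_i/u_{m,i} − f)} factor in the proof of Theorem I.2, but the paper's exact definition may differ; treat this as a sketch):

```python
import numpy as np

def Q(u, u_m, alpha, f):
    """Assumed sigmoidal efficiency: rises from ~0 to ~1 around u/u_m = f."""
    return 1.0 / (1.0 + np.exp(-alpha * (u / u_m - f)))

u = np.linspace(0.0, 1.0, 101)
step_like = Q(u, 1.0, alpha=500, f=0.5)    # too large alpha: near step function
linear_like = Q(u, 1.0, alpha=1, f=0.5)    # too small alpha: nearly linear
moderate = Q(u, 1.0, alpha=30, f=0.5)      # well-behaved, 10 < alpha < 100
```

With α = 500 the function jumps from below 1% to above 99% across a 2% change in u/u_m (a near step, so gradients vanish almost everywhere), while with α = 1 it is almost linear over the whole range; moderate α keeps a smooth, strictly increasing transition.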

G CASE STUDIES: DATA SOURCE AND DATA PROCESSING

G.1 Hypothesis Testing
Health Care Coverage. The health care coverage data are from the 2017 and 2018 Annual Social and Economic Supplement (ASEC) Split-Panel Sample of the Current Population Survey (CPS). The U.S. Census Bureau collected these data in March of 2017 and 2018 [see 13, 14, for detailed data documentation]; they cover the following topics: household and family characteristics, marital status, geographic mobility, foreign-born population, income from the previous calendar year, poverty, work status/occupation, health insurance coverage, program participation, and educational attainment. These questions are asked of the civilian non-institutional population and of military personnel who live in households with at least one other civilian adult.
In this case study, the state of each participant is determined by their health insurance coverage. Each survey participant is either in a "have health insurance" or a "do not have health insurance" category. There are no missing observations in these data sets. Our goal is to study the age distribution disparity between those who have access to health insurance and those who do not; hence, x = {age}.
We first drop all observations with ages below 18 or above 80. We then keep the states for which there are at least 25 observations per health insurance category for the survey years 2017 and 2018; otherwise, the estimation uncertainty hinders drawing a meaningful conclusion. There are 25 states that satisfy these conditions; see Figure 2 for the list of states.
Homeownership. Our homeownership data are retrieved from the 2019 American Housing Survey [AHS, 15]. The AHS, the most comprehensive national housing survey in the U.S., is sponsored by the Department of Housing and Urban Development and conducted by the Census Bureau.
In this case study, the state of each participant is determined by their homeownership status. Each survey participant is either a homeowner or a renter. This study focuses on households that make more than $10k and less than $500k. After discarding non-respondents and households with income outside this range, we ended up with 48,660 households. The data are stratified by metropolitan area. Since we have enough samples from each metropolitan area, we do not discard any city. We then take the log-income and compute disparity with respect to x = {log-income} for each city in our data set.

General Information. The curated data from both data sets contain no personally identifiable information.
The uncertainties are estimated by a bootstrap procedure. For each state and survey year, the data are split into beneficiary and non-beneficiary subsets. Then, for each bootstrap iteration, we generate a random sample with replacement from each subset and compute the age or income beneficiary disparity.
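The bootstrap procedure can be sketched as follows. For simplicity the disparity statistic here is a mean difference on synthetic ages; the paper's actual statistic (the GBD, computed via MMD) and the real survey data are not reproduced:

```python
import numpy as np

def bootstrap_disparity(ben, non, n_boot=2000, seed=0):
    """Resample the beneficiary and non-beneficiary subsets independently,
    with replacement, and recompute the disparity each iteration.
    Returns the bootstrap mean and standard error of the statistic."""
    rng = np.random.default_rng(seed)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        rb = rng.choice(ben, size=ben.size, replace=True)
        rn = rng.choice(non, size=non.size, replace=True)
        stats[b] = rb.mean() - rn.mean()  # simple mean-difference disparity
    return stats.mean(), stats.std()

# Synthetic stand-ins for one state-year: insured skew older than uninsured.
rng = np.random.default_rng(3)
ages_insured = rng.normal(45, 12, 400)
ages_uninsured = rng.normal(38, 12, 300)
est, se = bootstrap_disparity(ages_insured, ages_uninsured)
```

Resampling the two subsets separately preserves the beneficiary/non-beneficiary sample sizes within each state and year, which is what the stratified procedure in the text requires.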

H A-FARM CASE STUDY
Detroit Data Set. Each road is mapped to the closest census tract. A census tract is a geographic region defined for the purpose of taking a census. Group disparity is defined with respect to the median household income and the poverty rate of the closest census tract, x = {median household income, poverty rate}. We use the 2018 PASER rating data provided by city managers to determine whether a road is in good or poor condition.
The units are constructed based on neighborhood boundaries. The current neighborhood boundaries are compiled by the Detroit Department of Neighborhoods staff in concert with community groups.

I PROOF OF CONVEX OPTIMIZATION THEOREM
Let u_i = Σ_{j=1}^{n_i} r_{ij} u_{ij} be the realized utility of the i-th unit, where i indexes units, j indexes items in the i-th unit, and R_i = Σ_{j=1}^{n_i} r_{ij}. Here r_{ij} is the optimization variable that can take the value 0 (does not change) or 1 (upgrade). The items within a unit are rank-ordered by their utility u_{ij} from highest to lowest; hence u_{i(j+1)} ≤ u_{ij}.
Lemma I.1. u_i(R_i) is a concave function of R_i.
Proof. Suppose that the top k − 1 items have r_{ij} = 1 and the rest are zero. We can approximate ∂u_i/∂R_i at R_i = k by switching r_{ik} from zero to one:

∂u_i/∂R_i |_{R_i = k} ≈ u_i(k) − u_i(k − 1) = u_{ik}.

Similarly, we can approximate the second derivative:

∂²u_i/∂R_i² |_{R_i = k} ≈ u_{i(k+1)} − u_{ik} ≤ 0.

The last inequality follows from the fact that the utilities are rank-ordered, so u_{i(k+1)} ≤ u_{ik}. Hence, the claim is proved.
Therefore, since log Q(u_i) is strictly concave for all i, the optimization problem in Equation 4 is a convex optimization problem. For a convex optimization problem, there exists a unique tractable global optimal solution.
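Lemma I.1's discrete concavity argument is easy to check numerically. In the sketch below (illustrative utilities, not the paper's data), u_i(R_i) is the cumulative sum of the rank-ordered u_{ij}; the discrete first differences equal u_{ik} and the second differences are non-positive:

```python
import numpy as np

rng = np.random.default_rng(7)
u = np.sort(rng.lognormal(1.0, 0.5, 50))[::-1]  # rank-ordered: u_{i(j+1)} <= u_{ij}

# u_i as a function of R_i: upgrading the top R_i items
U = np.concatenate(([0.0], np.cumsum(u)))

marginal = np.diff(U)          # discrete dU/dR_i, equals u_{ik}
curvature = np.diff(marginal)  # discrete second derivative
```

The marginal gains are exactly the sorted utilities, so they are positive and non-increasing, which is the discrete statement that u_i(R_i) is increasing and concave.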

ALGORITHM 1: A-FARM Optimization Algorithm
Require: {u_{ij}, R_i, α, R, f, δ}. Algorithm 1 solves the convex optimization problem of Equation 4. It is a max-min algorithm that minimizes over λ and maximizes over the R_i in an iterative manner while sharing those parameter values between the two steps. The output is the set of recommended items to upgrade to a beneficiary state.
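The max-min structure can be sketched as a subgradient method on the resource constraint. This is a toy instance, not the paper's implementation: Q is an assumed logistic efficiency function and all hyper-parameters (α, f, step size, budget) are illustrative:

```python
import numpy as np

def best_response(u_sorted, lam, alpha=30.0, f=0.5):
    """Max step for one unit: choose R_i maximizing log Q(u_i(R_i)) - lam * R_i,
    where u_i(R_i) is the sum of the top R_i utilities and Q is an assumed
    logistic efficiency function of u_i / u_{m,i}."""
    u_m = u_sorted.sum()
    U = np.concatenate(([0.0], np.cumsum(u_sorted)))
    Q = 1.0 / (1.0 + np.exp(-alpha * (U / u_m - f)))
    return int(np.argmax(np.log(Q) - lam * np.arange(U.size)))

rng = np.random.default_rng(11)
units = [np.sort(rng.lognormal(1.0, 0.5, 40))[::-1] for _ in range(5)]
R_total, lam, eta = 60, 0.0, 0.001  # budget, multiplier, step size

for _ in range(500):
    # Max step per unit at the current multiplier
    R = [best_response(u, lam) for u in units]
    # Min step: subgradient update on lambda toward the budget constraint
    lam = max(0.0, lam + eta * (sum(R) - R_total))
```

Because the R_i are integers, the budget binds only approximately: the multiplier oscillates around the threshold where the total allocation crosses R_total, which is the usual behavior of a constant-step subgradient method on a discrete inner problem.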

K INITIAL CONDITION SIMULATION SETUP
The following describes the generative model for the initial condition simulation setup from Section 6. Given n_g intrinsic groups in the population, we initially assign half of the n_g groups to the desired beneficiary state, with 2-dimensional protected variables drawn for each group as

[x_1, x_2] ∼ Normal(μ, V),

and, for the biased utility model, utilities drawn as

u ∼ logNormal(μ_u + 0.2, σ_u²), with μ_u ∼ Uniform(1, 2.5) and σ_u ∼ Uniform(0.1, 0.2).

Similarly, we assign the remaining half of the n_g groups to the undesired (non-beneficiary) state, with protected variables and utilities drawn for each group as

[x_1, x_2] ∼ Normal(μ, V), u ∼ logNormal(μ_u, σ_u²), with μ_u ∼ Uniform(1, 2.5) and σ_u ∼ Uniform(0.1, 0.2).

An unbiased utility model can be simulated by drawing the utility u in a way that is conditionally independent of the protected variables [x_1, x_2]. This can be done, for example, by fixing the mean and variance of the log-normal distribution from which u is drawn rather than randomly drawing them from uniform distributions.
Figure 10 shows a set of four different initial conditions based on the simulation setup described above. It assumes a 2-dimensional protected attribute vector with a varying number of intrinsic groups, in conjunction with a biased or unbiased utility model.
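The generative model above can be sketched as follows. It is partly reconstructed: the prior on the group means, the covariance of the protected variables, and the group sizes are illustrative assumptions not specified in the text:

```python
import numpy as np

rng = np.random.default_rng(5)
n_g, n_per_group = 4, 250  # illustrative number of intrinsic groups and sizes

def sample_group(beneficiary):
    """Draw one intrinsic group: Gaussian protected variables and a
    log-normal utility with randomly drawn hyper-parameters. The +0.2
    log-mean shift makes the utility model biased toward beneficiaries."""
    mu = rng.uniform(-1.0, 1.0, size=2)         # assumed prior on group means
    x = rng.normal(mu, 0.3, size=(n_per_group, 2))  # assumed isotropic spread
    mu_u = rng.uniform(1.0, 2.5)
    sigma_u = rng.uniform(0.1, 0.2)
    shift = 0.2 if beneficiary else 0.0
    u = rng.lognormal(mu_u + shift, sigma_u, size=n_per_group)
    return x, u

# First half of the groups beneficiary, second half non-beneficiary
groups = [sample_group(g < n_g // 2) for g in range(n_g)]
```

Fixing `mu_u` and `sigma_u` to constants instead of drawing them per group recovers the unbiased utility model described above.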
Fig. 10. Using the initial condition simulation setup in Appendix K, initial conditions with 2, 10, and 20 intrinsic groups, as well as biased and unbiased utility models, are shown.

Fig. 3. Resulting sigmoidal efficiency functions with respect to changes in α and f .

Fig. 4. Time evolution of four biased systems under random (yellow dotted line), equalizer (blue dashed line), and utilitarian (black line) decision algorithms. The shaded 68% confidence intervals are computed using 100 simulations. The red dotted line is the maximum achievable efficiency bound. Top panels: group beneficiary disparity. Bottom panels: efficiency. The equilibrium efficiency levels of the random and equalizer decision-makers are indistinguishable for System 1 and System 2, causing their performance metrics to overlap precisely in this figure.

Fig. 8. Initial conditions (top panels) and utility models (bottom panels). The orange points are the beneficiary agents and the blue points are the non-beneficiary agents.

Fig. 9. Same as Figure 4. Here, we vary the initial ratio of agents with S = 1 to agents with S = 0.

Theorem I.2. The optimization problem in Equation (2) is a convex optimization problem, and there exists a unique tractable global optimal solution.

Proof. By Lemma I.1, u_i(R_i) is a concave function, implying ∂u_i/∂R_i > 0 and ∂²u_i/∂R_i² ≤ 0. Also, we note that

exp{αf} − exp{−α(u_i/u_{m,i} − f)} = exp{αf} (1 − exp{−α u_i/u_{m,i}}) > 0.

Table 1. Summary of Systems Considered in this Work

Table 2. Simulation and Decision Function Parameters

F.1 Experiments with a Class Imbalance

Simulation Setup.

J A-FARM OPTIMIZATION ALGORITHM

Algorithm 1 shows the optimization algorithm used to solve the convex optimization problem of Equation 4. Its structure is:

Set optimization hyper-parameters.
while not converged do
    % Max optimization:
    for i ∈ {1, . . . , M} do
        Solve R_{i,t} = arg max_{R_i} Q(u_{i,t−1}) − λ_{t−1} R_i
        w_{i,t} ← λ_{t−1} × R_{i,t}
    end for
    % Min optimization: update λ_t
end while
Return {R_i}: the recommended items to upgrade in each unit.