Runtime Monitoring of Dynamic Fairness Properties

A machine-learned system that is fair in static decision-making tasks may have biased societal impacts in the long run. This may happen when the system interacts with humans and feedback patterns emerge, reinforcing old biases in the system and creating new biases. While existing works try to identify and mitigate long-run biases through smart system design, we introduce techniques for monitoring fairness in real time. Our goal is to build and deploy a monitor that will continuously observe a long sequence of events generated by the system in the wild, and will output, with each event, a verdict on how fair the system is at the current point in time. The advantages of monitoring are two-fold. Firstly, fairness is evaluated at run-time, which is important because unfair behaviors may not be eliminated a priori, at design-time, due to partial knowledge about the system and the environment, as well as uncertainties and dynamic changes in the system and the environment, such as the unpredictability of human behavior. Secondly, monitors are by design oblivious to how the monitored system is constructed, which makes them suitable to be used as trusted third-party fairness watchdogs. They function as computationally lightweight statistical estimators, and their correctness proofs rely on the rigorous analysis of the stochastic process that models the assumptions about the underlying dynamics of the system. We show, both in theory and experiments, how monitors can warn us (1) if a bank's credit policy over time has created an unfair distribution of credit scores among the population, and (2) if a resource allocator's allocation policy over time has made unfair allocations. Our experiments demonstrate that the monitors introduce very low overhead. We believe that runtime monitoring is an important and mathematically rigorous new addition to the fairness toolbox.


INTRODUCTION
A majority of works in the fairness literature have considered fairness in static decision making problems, such as classification, regression, etc. [10,13,17]. Recent results suggest that fairness itself is not static, but rather dynamic: a system that is fair in its static decision-making tasks may become biased in its overall societal impacts over time [9,20,24,25,29,38,39]. This happens when the system makes sequential decisions about humans, and every decision of the system is met with some human reaction in return, possibly changing the parameters and the future decisions of the system. Such feedback patterns often reinforce historical biases in the dataset and introduce new biases in the society in the long run as well. While there are many works that have proposed analysis and mitigation techniques for long-run biases, to our best knowledge, there does not exist any technique that could detect such biases in real-time.
We propose runtime monitoring, as a new addition to the fairness toolbox, for the real-time detection of dynamic social biases in deployed machine-learned decision makers, whose models are unknown and may change over time (e.g., due to retraining, changes in parameters, etc.).
The goal of runtime monitoring is to design a monitor which will observe the sequential interactions between the decision-maker and its environment, and, after each observation, will output a quantitative, statistically rigorous estimate of how fair or biased the system is at that point in time. Unlike most existing approaches [25,29,38], our monitors do not require any assumption or explicit knowledge of the system model.
Monitoring can help us in two ways. Firstly, by detecting biases in real-time, it can trigger corrective measures or retraining, whenever necessary. Statically designed fairness interventions are based on an assumed dynamic model of the system. In practice, models are rarely perfect due to imperfect knowledge of the systems and the involved uncertainties, making it often impossible to predict if a long-run fairness intervention is going to work in practice.
Moreover, the underlying environment conditions may change over time, making static interventions even harder.
Monitoring offers an additional, complementary tool that enables us to close this gap by warning us of the presence of biases in real-time, so that we can adapt our intervention techniques whenever necessary. There is an analogy to control theory, where it is well known that closed-loop (feedback) controllers fare much better against modeling uncertainties than open-loop (feed-forward) controllers [30, Sec. 1.3].
The other area where monitoring can help is in the creation of trusted third-party watchdogs for overseeing the fairness of decision-makers. Such watchdogs can work neutrally in the public interest, since they are by design independent of the implementation of the system.
Consider the following situation, in which fairness is dynamic and monitoring is useful. Consider a bank that gives loans to individuals based on their credit scores. The population is divided into two groups, with one group having a higher average credit score than the other. A policy of the bank that gives loans to the eligible individuals from each group with equal probabilities (equalized opportunity [17]) may seem fair and noble. However, in doing so, the bank may end up giving more loans to less eligible individuals from the disadvantaged group. If the credit score distribution of the disadvantaged group is heavily skewed towards scores with a higher default rate, then there will be many loan defaults, causing a further drop in the average credit score of the disadvantaged group [39]. For this example, we present a monitor which observes a single long sequence of lending events, each consisting of the sampling of an applicant, the decision made by the bank on this applicant, and, if the loan was granted, whether it was repaid or not. After each observation, the monitor computes a quantitative statistical estimate of the difference between the average credit scores of the two groups. It does so by being completely oblivious to the bank's policy and by not assuming any prior knowledge about the humans' behaviors (whether they repay or not). Now consider the following situation. It has been shown that voice assistants, such as Amazon Alexa and Google Home, are biased towards the English accents of native speakers, where the native speakers experience significantly higher quality service than the non-native speakers [18]. This happens when there is an imbalance between group representations in the dataset, with more data available for one demographic group than the other. If, over time, more and more non-native speakers stop using the service out of dissatisfaction, then the dataset gets more skewed towards the native speakers, intensifying the biases further [19]. Similar
representation-driven biases were reported in other areas as well, such as recommendation systems [7], credit markets [15], and crime prediction [12]. While, in theory, there are remedies that work if the reactions of the humans can be perfectly predicted, in practice, they may worsen the situation whenever the modeling assumptions do not align well with the true intentions of the humans [38]. This demonstrates that it is difficult to design a static fairness intervention that will always work in the long run. Monitoring can help us, firstly, to detect such dynamic biases and warn us in time, and, secondly, to change the interventions whenever necessary.
We consider time-varying social fairness properties, as a class of dynamic fairness properties. They can be written as the difference in expected values of a given function over unknown time-varying feature distributions across two demographic groups. Such properties can capture many existing aspects of long-run fairness properties in the society, such as the time-varying difference in expected credit scores across two groups [39], the time-varying difference in group representations [19], etc.
Our monitors perform statistical estimations to obtain a PAC-style estimate of the value of the social fairness properties in real-time. We do not make any assumptions about the policies of the already deployed machine-learned agent and the human users (i.e., the environment). The only assumption we make is that the monitor can observe the features of the selected individual, the actions of the agent, and the reactions of the individual. Moreover, we assume the availability of a change function, such that from each observation the monitor can infer the resulting change in the expected value of the unknown distribution. For instance, in the lending example, we assume the observability of the credit scores and the group memberships of the sampled individuals, the bank's decisions, and the reactions of repaying or defaulting of loans by the individuals. At any time, if the individual is selected from a group with size $N$, then the change function tells us that a repayment of the loan will increase the credit score of the individual by, say, 1 point, thereby increasing the average credit score of their group by $1/N$. Similarly, a loan default will decrease the credit score of the individual by, say, 1 point, thereby decreasing the average credit score of their group by $1/N$. Our monitor observes one long sequence of lending events, and, after each new observation and based on the given history of past lending events and the past valuations of the change function, computes an updated PAC-style estimate of the disparity in average credit scores across the two groups.
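To make the change-function assumption concrete, here is a minimal Python sketch for the lending example. The observation layout, the `grant`/`repaid` labels, and the one-point score shift are all illustrative assumptions, not the paper's concrete implementation:

```python
def lending_change(observation, group_sizes, unit=1.0):
    """Change in a group's average credit score implied by one lending event.

    observation: (score, group, action, reaction), where action is
    "grant"/"reject" and reaction is "repaid"/"default" (None if no loan).
    group_sizes: dict mapping each group to its (assumed known) size N.
    All names and the unit-point shift are illustrative assumptions.
    """
    score, group, action, reaction = observation
    if action != "grant":
        return 0.0                 # rejected applicant: no score change
    n = group_sizes[group]         # size N of the sampled individual's group
    if reaction == "repaid":
        return unit / n            # repayment: group average rises by unit/N
    return -unit / n               # default: group average drops by unit/N
```

A monitor would call such a function on every observed event to track the cumulative shift in each group's expected credit score.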
Computationally, our monitors are extremely lightweight, and their implementations required only a few lines of code. Yet, the mathematical analysis of their correctness is nontrivial. The difficulty stems from the fact that the samples observed on any given sequence are all statistically dependent on each other. For instance, the probability of sampling an individual with a certain credit score will depend on whether the previous individual who was from the same group and had the same credit score repaid the loan or not. As our monitor, we present an unbiased statistical estimator as well as PAC-style bounds for its estimates. The bounds are obtained by constructing a martingale from the estimates, analyzing the corresponding martingale difference sequence, and applying suitable concentration inequalities for martingale difference sequences.
We implemented our monitors in a prototype tool. Using this implementation, we designed monitors for two practical examples from the literature. The first example concerns the lending problem that we discussed earlier, where we monitored, in real-time, to what degree the lending policy of the bank has widened the disparity of average credit scores across the two demographic groups. The second example is an attention allocation problem [9], where incidents keep occurring at every step in multiple locations, and we have a machine-learned allocator for allocating its limited units of attention to the locations to discover the incidents. The rate at which incidents occur at each location is inversely proportional to the amount of attention allocated to that location in the previous step. Real-world applications of this example include child services, pest control, etc. We monitored, in real-time, to what degree the allocator's allocation policy has widened the disparity of discovery probability of incidents among two of the given locations.
Implementations of these systems were already available in the tool ml-fairness-gym [9]. We executed our monitors on the simulation traces of the systems as extracted from ml-fairness-gym. We demonstrate that our monitors are able to produce tight statistical estimates of the considered fairness properties in real-time.
We believe that runtime monitors will be an important new addition to the fairness toolbox. On the one hand, they will complement the existing model-based analysis and design tools by checking dynamic fairness in real-time, and helping us to trigger on-demand corrective measures. On the other hand, they will be useful in building trusted third-party fairness watchdogs.

Related Work
Fairness in automated decision making has become an active field of research in recent years. Early works only considered fairness in the static decision making settings, where the decision maker needs to be fair with respect to a time-invariant distribution. Several group fairness [13,17] and individual fairness [10] criteria were proposed, and measures for implementing them were developed. The proposed measures in this setting can be grouped into three categories: (a) ones which pre-process the training dataset to eliminate historical biases [6,16,23,37], (b) ones which design training algorithms that are more robust to biases (called in-processing) [1,4,35,36], and (c) ones which post-process the decision-maker's output to eliminate biases [17].
Later, it was observed by many authors that, surprisingly, decision policies that are statically fair may lead to unfair behavior in the sequential setting. In this regard, the simplest sequential setting studied in the literature is the two-stage one: in the first stage, the agent makes decisions on humans from two groups, which may cause the humans to take certain actions, and the resulting impact on the groups is then examined in the second stage [20,24]. In the more general long-term setting, the agent is allowed to retrain its decision policy over time, which may be affected by a change or bias in the dataset, caused by the reactions of humans to decisions made by the agent in the past. This closed feedback loop was shown to reinforce biases that were present in the dataset, as well as to introduce new biases. Relevant works on the sequential setting can be found in a recent survey [39]. While most of the existing works attempt to eliminate biases at design-time and assume information about the model [25,29,38], we detect them at runtime with little knowledge about the model. There are also simulation-based studies which study long-term impacts of static fairness measures [9]. They are also incomparable to our monitoring setup: in simulations, it is shown how bias changes over time for an assumed model of interactions, whereas we make almost no assumptions on the model and use the concrete measurements to estimate the bias in the system.
Our monitors are designed to operate in a dynamic setting. Hence, static systems, or systems where the decisions of the agent do not affect the parameters of the underlying population, which have been studied extensively in the literature (see Mehrabi et al. [28]), are a special case of our setting. Therefore, monitoring could be applied there as well. A natural setting would be the deployment of monitors to check whether an agent in a bandit setting is fair [8,22].
Runtime monitoring is a well-studied subject in the area of formal methods in computer science [3]. The goal is to check, at runtime, if an unknown system satisfies or violates a given safety property. For instance, a monitor may be used to detect traffic congestion in the roads of a smart city [26], or safety violations of autonomous vehicles [27]. The outputs of monitors are usually passed to a safety-supervisory control layer, which takes necessary actions to prevent damages, for example through a default fail-safe action [5].
Unfortunately, a majority of the existing works in runtime monitoring cannot handle statistical properties, such as fairness. Notable exceptions include the work by Ferrère et al. [14], which develops efficient techniques for monitoring statistical properties of systems. However, they do not consider fairness properties. Moreover, their monitors' outputs are correct only asymptotically, whereas our monitors output PAC-style error bounds for every observed sequence of finite length.
The closest to our work are the papers by Albarghouthi et al. [2] and a recent paper by us [21].Albarghouthi et al. [2] presented an approach for monitoring fairness in sequential decision-making tasks, which we generalized to monitoring fairness over Markov chains using techniques from both frequentist and Bayesian statistics [21].These works can be used to monitor only group fairness and individual fairness properties in static decision-making problems, whereas we monitor time-varying social fairness properties in dynamic decision-making problems.
There is a body of research that is ideologically similar to ours and develops sequential statistical tests to evaluate the performance of already deployed machine-learned systems at runtime. Podkopaev et al. [31] proposed an algorithm for monitoring the expected loss of a given classifier due to shift in the dataset distribution. The expected loss is based on the misclassification rate of a classifier, and is incomparable to the fairness properties that we consider in this work.
Waudby-Smith et al. [33] proposed a sequential estimation algorithm that was used to estimate the time-varying average treatment effect (ATE) in a randomized experiment, which gives a measure of the expected difference in outcome between an individual chosen from the population receiving a treatment (like a medical drug that is being tested) and not receiving the treatment. Although there are some structural similarities between ATEs and fairness properties, they estimate the average ATE up to the present time, whereas we estimate how fair the system is at the present time. Moreover, their estimates are asymptotically correct, whereas we provide finite-sample correctness guarantees.

The Sequential Agent-Environment-Interaction Model
We call a machine-learned decision maker an agent, and the population of the subjects of its decisions the environment.
For example, in a lending scenario, a bank's machine-learned lending policy is the agent, and the population of the loan applicants is the environment. We use a setup similar to the work of D'Amour et al. [9], where an agent engages in a sequential interaction with its environment, and as a result the parameters of the environment change. The environment contains a distribution over the individuals, where each individual is represented by a real-valued (scalar) feature of interest, such as their credit score, and a sensitive attribute, such as their ethnicity. In this work we only consider fairness properties that depend on the single available feature of the individuals; extension to fairness with respect to feature vectors is left open for future work. In general, we allow the individuals to have additional features, though they do not influence the fairness. For simplicity of notation, we suppress such additional unimportant features when considering the individuals.
At each step $t$, the environment samples a single individual with feature $X_t$ and group membership $A_t$, where $X_t$ is a real-valued random variable and $A_t$ is a random variable which is assumed to have a binary support $\{a, b\}$ for simplicity. We use the shorthand notations $P(X_t = x \mid A_t = a)$ and $P(X_t = x \mid A_t = b)$ to denote, respectively, the conditional feature distributions of the two groups. At time $t$, the agent performs an action $Y_t$, which is also treated as a random variable. Given the agent's action, the environment may react with its own reaction, which we denote using the random variable $Z_t$. The randomness in $Y_t$ and $Z_t$ captures the modeling uncertainties, such as unknown factors that influence the agent's actions and unpredictability in the environment's reactions. In the lending example, the agent's (i.e., the bank's) actions are granting or rejecting the loan to the selected individual, whereas the environment's reactions are repaying or defaulting of the loan by the same individual. Some problems, such as the attention allocation example, do not require the environment's reactions. (Although, in practice, $Z_t$ may lag behind $Y_t$, for simplicity, we assume that they happen at the same time step.) This completes one round of interaction between the agent and the environment, and a sequence contains many such interaction rounds.
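One interaction round can be sketched as follows. The group labels, the callable interfaces, and all names here are illustrative assumptions; the agent policy and environment reaction are deliberately opaque callables, since the monitor makes no assumption about them and only sees the resulting observation tuple:

```python
import random

def interaction_round(feature_dists, agent_policy, env_reaction, rng=random):
    """One round of the sequential agent-environment interaction (sketch).

    feature_dists: dict mapping each group to a 0-argument sampler for X_t.
    agent_policy and env_reaction are black boxes from the monitor's
    point of view. All names here are illustrative assumptions.
    """
    g = rng.choice(["a", "b"])       # group membership A_t of the sampled individual
    x = feature_dists[g]()           # feature X_t, e.g. a credit score
    y = agent_policy(x, g)           # agent's action Y_t, e.g. grant/reject
    z = env_reaction(x, g, y)        # environment's reaction Z_t, e.g. repay/default
    return (x, g, y, z)              # the observation w_t seen by the monitor
```

A long run of such rounds produces the observation sequence on which the monitor operates.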
The interactions between the agent and the environment form a sequence of (tuples of) random variables, i.e., a stochastic process $\vec{W} = ((X_t, A_t, Y_t, Z_t))_{t > 0}$. For every $t$, the tuple of concrete values that the random variables take is called an observation, denoted as $w_t = (x_t, a_t, y_t, z_t)$. The sequence $\vec{w}_{1..t}$ is called an observation sequence.
In the process $\vec{W}$, the feature distribution of $X_t$ is subject to changes over time; the distributions of $Y_t$ and $Z_t$ may also change, but that is irrelevant for us. We assume that the monitor can infer, from the observations, the resulting change in the current expected value of $X_t$ (given the history of observations). For instance, in the lending scenario, if at any time the selected individual fails to repay the loan, then the credit score of that individual goes down, and so the distribution of credit scores in the population shifts. We assume that we can infer the shift in expected credit scores from the lending decision of the bank and the event of repayment/default. We formalize this in the following.
Assumption 1. Runtime monitors have access to a function $\Delta$, called the change function, which maps every concrete observation $w_t$ to a change in the expected value of $X_{t+1}$, such that for each group $g \in \{a, b\}$, for every time $t$, and for every past sequence of observations $\vec{w}_{1..t-1}$, we have $\mathbb{E}^g(X_{t+1} \mid \vec{w}_{1..t}) = \mathbb{E}^g(X_t \mid \vec{w}_{1..t-1}) + \Delta(w_t)$. Moreover, $X_t$, when centered, is a sub-exponential random variable with parameters $(\nu^2, \alpha)$.
Assump. 1 imposes mild technical restrictions that are fulfilled by many real-world problems, including the lending example and the attention allocation example that we consider here. Whenever clear from the context, for simplicity, we write $\Delta_t$ instead of $\Delta(x_t, a_t, y_t, z_t)$.

Time-Varying Social Fairness Properties
Let $f : w_t = (x_t, a_t, y_t, z_t) \mapsto \mathbb{R}$ be a function, called the well-being function, which is a measure of the well-being of the individual $(x_t, a_t)$ who was subjected to the agent's action $y_t$ to which they reacted with $z_t$. In the lending example, $f$ maps an observation to the credit score of the selected individual. In the attention allocation example, $f$ maps an observation to the ratio of the attention (action) to the number of incidents (reaction).
For each group $g$, and for every observation sequence $\vec{w}_{1..t}$, we define the (group-specific) expected well-being as:

$\mu^g(\vec{w}_{1..t}) := \mathbb{E}^g\big(f(X_t, g, y_t, z_t) \mid \vec{w}_{1..t-1}\big). \quad (1)$

Observe that the expectation is with respect to the randomness in the feature distribution of $X_t$, which makes $\mu^g$ also random. We do not condition on the currently observed feature $x_t$, as it would make the expectation trivially equal to $f(w_t)$. In other words, the expectation in the well-being is only with respect to the past observations of credit scores in the lending example, and is only with respect to the past observations of incidents in the attention allocation example.
We consider a class of fairness properties, which we call the time-varying social fairness properties, defined as the difference in expected well-beings of the two groups for a given observation sequence $\vec{w}_{1..t}$:

$\varphi(\vec{w}_{1..t}) := \mu^a(\vec{w}_{1..t}) - \mu^b(\vec{w}_{1..t}). \quad (2)$

Time-varying social fairness properties capture many interesting properties that were already studied in the context of sequential decision-making, such as the time-varying disparity in average credit score [24], the time-varying disparity in the discovery probability of incidents [11,12], etc.
In (2), we present the general class of time-varying fairness properties that we consider, and the exact property will depend on the application and the definition of the well-being function $f$. For instance, in the lending example, $f(w_t)$ will be independent of $a_t$, $y_t$, $z_t$ and will give us the credit score of the individual sampled at time $t$.
We point out that we do not impose any assumption on the agent's and the environment's policies for choosing their respective actions and reactions. However, following Assumption 1, (re-)actions at each time influence the expected observation at the next step. Hence, it is impossible to statically predict the conditional expectation in advance. Intuitively, this means that, without observing the loan decisions of the bank and the subsequent repayment or default events, we cannot predict what the expected credit score will be at a particular point in the future.
As a result, we cannot statically predict the social fairness of the system in the long run, even if we knew its initial value. Therefore, social fairness can only be measured retrospectively, which is what we do using runtime monitoring. To our best knowledge, no prior work in the fairness literature has considered this problem.

The Monitoring Problem
A monitor is a function that maps every observation sequence to a real interval, where the output interval computed by the monitor is a PAC-style statistical estimate of the given social fairness property.We summarize the monitoring problem in the following.
Problem 1. Let $\vec{W}$ be a stochastic process, $\varphi$ be a social fairness property, and $\delta \in [0, 1]$ be a parameter. Design a monitor $\mathcal{M}$ such that for every time $t$, the following holds:

$\mathbb{P}\big(\varphi(\vec{w}_{1..t}) \in \mathcal{M}(\vec{w}_{1..t})\big) \geq 1 - \delta.$

The probabilistic uncertainty in the monitor's output is due to the non-availability of the parameters of the initial feature distribution: were the initial parameters known to the monitor, at every time, a precise value of the fairness property could be calculated from the net change in the parameters as deduced from the change function. On the other hand, a naïve PAC estimate of $\varphi(\vec{w}_{1..t})$ at each time step is also not feasible, since the feature distribution is constantly changing.
The estimate gets more precise as the error gets smaller and the confidence gets higher. For the lending example, Prob. 1 asks us to design a monitor which will observe a sequence of lending events, and, after each observation, will output a $(1 - \delta) \cdot 100\%$ confidence interval for the estimated disparity in average credit scores.
While our monitors output interval estimates of fairness properties in the form of confidence intervals, internally, they first compute point estimates of the expected feature of each group $g$ for a given observation sequence $\vec{w}_{1..t-1}$, defined as:

$\Theta^g(\vec{w}_{1..t-1}) := \mathbb{E}^g(X_t \mid \vec{w}_{1..t-1}). \quad (3)$

Note that the quantity $\Theta^g(\vec{w}_{1..t-1})$ gives us only the expected feature, which will be an intermediate step for estimating the well-being. Notice that the expected feature at time $t$ only depends on the past observations until time $t - 1$, whereas the well-being at time $t$ requires the action and the reaction, such as the units of attention allocated by the attention allocator at the current time $t$. A point estimator $\hat{\Theta}$ of $\Theta^g$ for a given $\vec{w}_{1..t}$ and a given group $g$ is a function $\hat{\Theta} : \vec{w}_{1..t} \mapsto \mathbb{R}$. Additionally, $\hat{\Theta}$ will be called unbiased if, for every observation sequence $\vec{w}_{1..t-1}$ that may occur with positive probability, we have:

$\mathbb{E}\big(\hat{\Theta}(\vec{W}_{1..t}) \mid \vec{w}_{1..t-1}\big) = \Theta^g(\vec{w}_{1..t-1}). \quad (4)$

Intuitively, unbiasedness guarantees that, for any given history $\vec{w}_{1..t-1}$ of loan events that may occur with positive probability, the expected credit score of a group at time $t$ will be equal to the expected output of the estimator at time $t$.
While unbiasedness guarantees that the estimator $\hat{\Theta}$'s output coincides with $\Theta^g(\vec{w}_{1..t-1})$ in expectation, we also require that the output error remains statistically bounded at all times. To this end, we bound the estimation error by computing confidence intervals for $\Theta^g(\vec{w}_{1..t-1})$, obtained through the application of concentration inequalities around the point estimate.
These confidence intervals for the group-specific expected features are then used to obtain confidence intervals for the group-specific expected well-beings (i.e., $\mu^g(\vec{w}_{1..t})$), which are then subtracted from each other to finally obtain the output confidence interval of the monitor for the time-varying social fairness property $\varphi(\vec{w}_{1..t})$. We illustrate this sequence of steps in Fig. 1, along with references to the sections where they are described.
While we provide a general procedure for estimating $\Theta^g(\vec{w}_{1..t-1})$ in the first step, a general overall estimation procedure for an arbitrary $\varphi(\vec{w}_{1..t})$ is difficult to derive and is left open. This is because the final confidence interval for $\varphi(\vec{w}_{1..t})$ depends on the structure of the well-being function, which is problem-specific.
As a convention, by "monitor" we will exclusively refer to the final interval estimator of $\varphi(\vec{w}_{1..t})$, though it is not the only interval estimator that we will use.

AN INTERVAL ESTIMATOR FOR THE TIME-VARYING EXPECTED FEATURE
For any group $g$ and an arbitrary observation sequence $\vec{w}_{1..t}$, in this section, we construct an interval estimator for the group-specific expected feature $\Theta^g(\vec{w}_{1..t-1}) = \mathbb{E}^g(X_t \mid \vec{w}_{1..t-1})$ (defined in (3)). As $\Theta^g(\vec{w}_{1..t-1})$ does not depend on the fairness metric, the estimator is not tied to the fairness monitoring problem and can have other uses. The estimator for $\Theta^g(\vec{w}_{1..t-1})$ will be the essential component of the monitors (i.e., the interval estimators for the social fairness property $\varphi(\vec{w}_{1..t})$), which will be presented later in the respective example sections. The interval estimate of $\Theta^g(\vec{w}_{1..t-1})$ is obtained by first computing a point estimate of it, and then using concentration inequalities to bound the estimation error. The first part, i.e., the point estimation part, is explained using a coin-toss analogy.

Warm-Up: A Coin-Toss Puzzle
Suppose we have a coin with unknown initial bias. After each toss, its bias changes in a predefined manner as a function of the outcome of the previous toss. How can we compute a point estimate of the bias of the coin at any given point in time, based on the sequence of observed past outcomes?
Let us formalize the problem first. Let the probability that the coin shows 1 (heads) at time $t$ be $p_t$, and the probability that it shows 0 (tails) be $1 - p_t$. The toss outcome at time $t$ is denoted by the random variable $X_t$.
Let $p_1$ be the initial bias, which is fixed but unknown. After every toss, the coin's bias shifts by a constant $\epsilon \in [0, 1]$, and the direction of the shift depends on the outcome of the previous toss: if we see $x_t = 1$ at time $t$, the bias shifts to $p_{t+1} = p_t + \epsilon$ in the next step, whereas if we see $x_t = 0$ at time $t$, the bias shifts to $p_{t+1} = p_t - \epsilon$. Let us assume for simplicity that the true initial bias is not too close to the boundaries 0 and 1, and moreover that $\epsilon$ is small enough and the observed sequence short enough that the true bias never reaches the boundaries. We can succinctly write $p_{t+1} = p_t + \epsilon(2 x_t - 1)$. Observe that even if we knew $p_1$, without seeing the observations $x_1, \ldots, x_{t-1}$, the value of $p_t$ would only be probabilistically known. Thus a static analysis would not be possible even in that case.
For the trivial case of $\epsilon = 0$, i.e., when we know that the bias remains fixed at $p_1$ throughout, we can compute an unbiased point estimate of $p_t = p_1$ by simply computing the empirical average of the observed sequence as $\hat{p} = \frac{1}{t}\sum_{i=1}^{t} x_i$. When $\epsilon > 0$, we show (see App. A.3) that, for the given observation sequence $x_1, \ldots, x_t$, the following is an unbiased point estimator of $p_1$:

$\hat{p}_1 = \frac{1}{t}\sum_{i=1}^{t}\Big(x_i - \epsilon\sum_{j=1}^{i-1}(2 x_j - 1)\Big). \quad (5)$

Once an estimate of $p_1$ is known, we can obtain an unbiased point estimate for $\mathbb{E}(X_t \mid x_1, \ldots, x_{t-1})$ by accounting for the observed changes up to time $t$:

$\hat{p}_t = \hat{p}_1 + \epsilon\sum_{j=1}^{t-1}(2 x_j - 1). \quad (6)$

While there are techniques to estimate the non-time-varying mean of a statistical process from sequential observations [34], to our best knowledge, the problem we consider and the solution we propose are completely novel.
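The drift-correcting point estimation described above can be sketched in a few lines of Python. This is a reconstruction from the text, not the paper's implementation: since the conditional mean of each toss equals the initial bias plus the accumulated shift, subtracting the shift from each outcome and averaging estimates the initial bias, and adding the shift back estimates the current bias:

```python
def estimate_bias(outcomes, eps):
    """Point estimates of the initial and current coin bias (sketch).

    E(X_i | x_1..x_{i-1}) = p_1 + eps * sum_{j<i} (2*x_j - 1), so de-drifted
    outcomes average to an unbiased estimate of p_1.
    """
    t = len(outcomes)
    shift = 0.0                          # eps * sum of (2*x_j - 1) so far
    shift_before_last = 0.0
    centered = []
    for x in outcomes:
        centered.append(x - shift)       # remove the drift accumulated so far
        shift_before_last = shift
        shift += eps * (2 * x - 1)
    p1_hat = sum(centered) / t           # estimate of the initial bias p_1
    pt_hat = p1_hat + shift_before_last  # estimate of p_t = E(X_t | x_1..x_{t-1})
    return p1_hat, pt_hat
```

For `eps = 0` this degenerates to the plain empirical average, matching the trivial case above.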

The Interval Estimator for Expected Features
Now we extend the coin-toss analogy to the point estimation, followed by the interval estimation, of the expected features $\Theta^g(\cdot)$. For simpler notation, in the rest of this section, we assume that there is only one group, so that all the past observations correspond to that group only, and we drop the superscript $g$ from $\Theta^g(\cdot)$.
Drawing a comparison with the coin-toss setting, the bias $p_t$ of the coin at every $t$ is now replaced by $\mathbb{E}(X_t \mid \vec{w}_{1..t-1})$, and the bias shift $\pm\epsilon$ of the coin at time $t$ is now replaced by the value of the change function $\Delta(w_t)$. We make these adjustments in the point estimator of the bias of the coin in (5), and obtain the following point estimator for $\Theta(\vec{w}_{1..t-1})$:

$\hat{\Theta}(\vec{w}_{1..t}) = \frac{1}{t}\sum_{i=1}^{t}\Big(x_i - \sum_{j=1}^{i-1}\Delta(w_j)\Big) + \sum_{j=1}^{t-1}\Delta(w_j).$

The interval estimator then computes an error bound $\epsilon > 0$ from a concentration inequality and returns $[\hat{\Theta} - \epsilon, \hat{\Theta} + \epsilon]$.
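Putting the point estimator together with a concentration bound gives the following sketch. The paper's analysis uses sub-exponential martingale differences with parameters $(\nu^2, \alpha)$; for simplicity, this sketch assumes a uniform bound `c` on the martingale differences and applies a plain Azuma-Hoeffding width, so the parameter `c` and all names are illustrative assumptions:

```python
import math

def estimate_expected_feature(observations, delta_fn, delta, c):
    """Interval estimate of Theta(w_1..t-1) = E(X_t | w_1..t-1) for one group.

    observations: list of tuples w_i = (x_i, a_i, y_i, z_i).
    delta_fn: the change function, mapping w_i to the shift in E(X).
    delta: confidence parameter (interval holds with prob. >= 1 - delta).
    c: assumed bound such that each Doob-martingale difference is at most
    c / t -- a simplification of the paper's sub-exponential analysis.
    """
    t = len(observations)
    shift, shift_before_last, centered = 0.0, 0.0, []
    for w in observations:
        centered.append(w[0] - shift)   # de-drift each observed feature
        shift_before_last = shift
        shift += delta_fn(w)
    theta1_hat = sum(centered) / t      # estimate of the initial mean E(X_1)
    theta_hat = theta1_hat + shift_before_last  # point estimate at time t
    # Azuma-Hoeffding: for t differences each bounded by c/t, deviation
    # exceeds eps with probability at most 2 * exp(-eps**2 * t / (2 * c**2)).
    eps = c * math.sqrt(2.0 * math.log(2.0 / delta) / t)
    return theta_hat - eps, theta_hat + eps
```

For instance, on a drift-free sequence the interval is centered at the empirical mean, and its width shrinks as $O(1/\sqrt{t})$.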

Soundness of the Interval Estimator
For soundness, we need to show that (a) the point estimator of $\Theta(\vec{w}_{1..t-1})$ is unbiased, and that (b) the interval estimate computed using the Azuma-style inequality is statistically sound. Claim (a) follows from the definition of unbiasedness.
For claim (b), we show that the sequence of expected point estimates, conditioned on increasingly longer prefixes of a given observation sequence, is a Doob martingale. Furthermore, we show that the difference between any two consecutive elements of the Doob martingale is a sub-exponential random variable, which enables us to use an Azuma-style concentration inequality to compute the desired confidence interval from the point estimate. Below, we present the highlights of the soundness proof (details in App. A), which can be skipped without any loss of continuity.
Unbiasedness: To demonstrate that the estimator $\hat{E}(\vec{o}_t)$ is unbiased, we first show: Lemma 3.1. The estimator $\hat{E}_1(\vec{o}_t)$ is an unbiased estimator of $\mathbb{E}(f_1)$. Then we utilize the change function $\Delta$ and the definition of conditional expectation to transfer the result to $\hat{E}(\vec{o}_t)$.
Proving concentration around the mean using martingales: To demonstrate that the estimator $\hat{E}(\vec{o}_t)$ concentrates around its mean, we construct the Doob martingale $M_i := \mathbb{E}(\hat{E}_1(\vec{o}_t) \mid \vec{o}_i)$ for $i = 0, \ldots, t$. Intuitively, this martingale is a step-by-step approximation process: it starts from the quantity we want to estimate, i.e., $M_0 = \mathbb{E}(\hat{E}_1(\vec{o}_t))$, which by Lemma 3.1 equals $\mathbb{E}(f_1)$, and it ends at $M_t = \mathbb{E}(\hat{E}_1(\vec{o}_t) \mid \vec{o}_t)$, which by the definition of conditional expectation is the complete random variable, i.e., our estimator $\hat{E}_1(\vec{o}_t)$.

Applying an Azuma-style concentration inequality:
To bound the distance between the first and last step of the martingale we want to apply some form of Azuma-style concentration inequality.However, this requires knowledge about the behavior of the difference between two consecutive martingale steps.
Hence, we can bound the probability that $\mathbb{E}(f_1)$ lies within an $\epsilon > 0$ interval around the estimator $\hat{E}_1(\vec{o}_t)$.
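For intuition, here is a simplified sketch of how an interval half-width can be derived from an Azuma-style bound, in the special case of bounded martingale differences $|M_i - M_{i-1}| \le c_i$; the paper's actual inequality is the more general sub-exponential version, so this is an illustration, not the monitor's exact bound:

```python
import math

def azuma_epsilon(cs, delta):
    """Half-width eps such that P(|M_t - M_0| >= eps) <= delta by the
    two-sided Azuma-Hoeffding inequality, where cs are the bounds on
    the consecutive martingale differences."""
    return math.sqrt(2.0 * sum(c * c for c in cs) * math.log(2.0 / delta))

def confidence_interval(point_estimate, cs, delta):
    """(1 - delta) confidence interval around the point estimate."""
    eps = azuma_epsilon(cs, delta)
    return point_estimate - eps, point_estimate + eps
```

Solving $2\exp\bigl(-\epsilon^2 / (2\sum_i c_i^2)\bigr) = \delta$ for $\epsilon$ gives the expression in `azuma_epsilon`.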

A DYNAMIC LENDING PROBLEM
Now we present a monitoring algorithm for the lending example taken from the literature [24]. Suppose a bank lends money to individuals by taking into account their credit score and group membership information.
Every individual has a credit score that may change over time; let $\{0, \ldots, c_{\max}\}$ be the set of all possible credit scores. The population is divided into groups $A$ and $B$, as usual, with $N_A$ and $N_B$ individuals, respectively. At every step, the bank receives the credit score and the group information of a randomly chosen individual, and decides whether to accept or reject the loan application. If the loan is granted and subsequently repaid, the individual's credit score increases by 1, provided it was smaller than $c_{\max}$. On the other hand, if the loan is granted but defaulted, the credit score decreases by 1, provided it was greater than 0. If the loan is rejected, the individual's credit score remains unchanged.
We want to monitor, after each observation, whether the bank's policy leads to an unfair distribution of the expected credit score between the two groups $A$ and $B$.

Problem Formulation
We assume a uniform distribution over the set of individuals at every time step. Given a (random) individual with feature $f_t$ and group $g_t$ sampled uniformly at time $t$, the bank's (random) action $a_t$ of rejecting or accepting the individual is given by $a_t = 0$ or $a_t = 1$, respectively. Once the bank has chosen an action, the individual reacts as follows: if $a_t = 0$ then the individual's action is immaterial, and if $a_t = 1$ then the individual performs a (random) action $r_t$, picking either $r_t = 1$ or $r_t = 0$ to denote, respectively, whether they repay the loan or not. The resulting change in the distribution is specified using the change function $\Delta$: if the loan is granted and the individual repays it, the group's expected credit score increases by $\frac{1}{N_{g_t}}$; if the loan is granted and the individual fails to repay it, the expected credit score decreases by $\frac{1}{N_{g_t}}$; otherwise the expected credit score remains the same. The well-being function $w$ maps to the credit score itself (i.e., the feature), and the time-varying social fairness criterion is given by the disparity in the expected credit scores of the two groups, i.e., $\mu^A_t - \mu^B_t$.

The Runtime Monitor
The monitor for the lending problem, called LendingMonitor, is shown in Alg. 2. The observed feature (the credit score) is bounded, which implies that it is sub-exponential (with the second parameter equal to 0). The output of LendingMonitor is the difference between the two interval estimates computed by the group-wise estimators for $A$ and $B$. Observe that each of the two estimators uses the higher confidence level $1 - \frac{\delta}{2}$, so that the final confidence of the output estimate becomes $1 - \delta$ after applying the union bound. For simplicity, we chose the union bound to compute the overall confidence; however, tighter interval estimates can be computed by observing that the group-specific stochastic processes are statistically independent, thereby allowing us to use the sharper bounds from Hoeffding's inequality (see [32, p. 24]).
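The final step of LendingMonitor, combining the two group-wise interval estimates by interval arithmetic, can be sketched as follows (our own helper, not the paper's pseudocode):

```python
def interval_difference(interval_a, interval_b):
    """Difference of two interval estimates. If each input interval holds
    with confidence 1 - delta/2, the output interval for mu_A - mu_B holds
    with confidence 1 - delta, by the union bound."""
    lo_a, hi_a = interval_a
    lo_b, hi_b = interval_b
    return (lo_a - hi_b, hi_a - lo_b)
```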

Experimental Outcome
We summarize the outputs of our monitor in Fig. 2. We observe that the monitor's outputs match the correct value of the bias in the system at every point in time. Moreover, LendingMonitor took on average 3 µs to compute a new estimate after each new observation.
Interestingly, we note that the MaxRwrd agent's policy becomes fairer in the long run than the EqOpp agent's policy. This phenomenon has already been demonstrated in the existing literature [9]. With our monitors, it is possible to detect such biases in real time, without any information or assumptions about which policy the bank is using or how the individuals will react (i.e., whether they repay or not).

A DYNAMIC ATTENTION ALLOCATION PROBLEM
Now we consider the attention allocation problem. Suppose there are $k \geq 2$ locations, and at each location and each time step a number of incidents take place. A machine-learned allocator needs to distribute its limited amount of attention among the locations in order to discover the incidents, where every incident needs one unit of attention to be discovered. We design a runtime monitor to check, in real time, whether the allocator is fair in allocating its limited attention between two particular locations $A$ and $B$. Suppose that in each location and at every time step, some number of incidents appear according to a Poisson distribution with unknown parameter. At any location and at any given time, the rate at which incidents appear is inversely proportional to the number of attention units allocated to that location at the last time step; the exact relationship is formalized in Sec. 5.1. We assume that the allocator has knowledge of this relationship. The fairness criterion that we wish to monitor requires that the probabilities with which any incident will be discovered at the two locations be close to each other.
We streamline our exposition on a simpler instance of the original problem studied by D'Amour et al. [9], where the fairness measure is the maximum pairwise disparity in discovery probabilities. We point out that this general property can also be handled using our monitors, by simply having a separate monitor for each pair of locations, and then aggregating the outputs of all the monitors using interval arithmetic and the union bound.
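A sketch of the aggregation step mentioned above, under the assumption that each pair's disparity interval was computed at confidence $1 - \delta/k$ for $k$ pairs (the helper is ours, for illustration):

```python
def aggregate_max_disparity(pair_intervals):
    """Combine per-pair (lo, hi) disparity intervals into one interval for
    the maximum pairwise disparity. The maximum of quantities, each lying
    in its own interval, lies between the max of the lower bounds and the
    max of the upper bounds, so the output is valid whenever all inputs
    are; the union bound turns k per-pair confidences of 1 - delta/k into
    an overall confidence of 1 - delta."""
    return (max(lo for lo, _ in pair_intervals),
            max(hi for _, hi in pair_intervals))
```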

Problem Formulation
The agent and the environment: Here, the two locations $A$ and $B$ are the two groups, as well as the only members of the respective groups. The feature of each location is a number of events (in $\mathbb{N}$). At every time step $t$, the environment samples a pair of (random) features $f^A_t$ and $f^B_t$ for the two locations, and the numbers of incidents at the two locations are $f^A_t + 1$ and $f^B_t + 1$, ensuring that the minimum number of incidents is 1, which is necessary for technical reasons (explained later). Observe that we slightly deviate from the setting introduced in Sec. 2, in that we obtain features of two individuals (i.e., the locations), one from each group, simultaneously at each time step. Technique-wise, this is not a problem, since we use two separate ExpEstimator monitors for the two locations. Notation-wise, this is simpler, as otherwise we would need vector-valued features. The agent's action is a (random) vector $(a^A_t, a^B_t) \in \mathbb{N}^2$ allocating its $N$ units of attention to the respective locations, i.e., $a^A_t + a^B_t \leq N$. The entry $a^g_t$ represents the number of attention units allocated to location $g$ at time $t$. In this example, the reaction of the environment to the agent's action plays no role, i.e., we set $r_t = \bot$ (a dummy symbol).
As usual, we assume that a change function $\Delta^g$ is given (defined below) that causes a shift in the expected value of $f^g_t$, i.e., in the rate at which incidents appear in the respective location in the next step. The features are Poisson-distributed, $f^g_t \sim \mathrm{Poisson}(\lambda^g_t)$, and since the expected value of $\mathrm{Poisson}(\lambda^g_t)$ is $\lambda^g_t$, a shift in the expected value corresponds to a change in the Poisson parameter. Given a fixed parameter $\gamma > 0$, which controls how dynamic the system is, the change function is defined accordingly, where we drop the fourth argument $r_t$ from $\Delta^g$ as it plays no role.

The fairness property: The well-being function $w$ in (1) is, in this example, called the discovery probability, and we want to monitor its disparity between the two locations. The discovery probability at time $t$ in location $g$ can be formalized as the expected value of the ratio of the discovered incidents $d^g_t = \min\{f^g_t + 1, a^g_t\}$ to the actual number of incidents $f^g_t + 1$. Notice that had we defined the number of incidents as $f^g_t$, the discovery probability would be undefined (since $f^g_t$ can be zero). The discovery probability at time $t$ for a given observation sequence can also be represented as the conditional expectation $\beta^g_t = \mathbb{E}\bigl(\frac{d^g_t}{f^g_t + 1} \,\big|\, \vec{o}_{t-1}\bigr)$. The time-varying social fairness criterion at every time $t$ is given by $\beta^A_t - \beta^B_t$.
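To sanity-check this definition, one can estimate the discovery probability by simulation; this is purely illustrative (the monitor itself never simulates), and the Poisson sampler below is our own helper:

```python
import math
import random

def discovery_probability_mc(lam, units, trials=200_000, seed=0):
    """Monte Carlo estimate of E[ min(f + 1, units) / (f + 1) ] with
    f ~ Poisson(lam): the probability that a random incident at the
    location is discovered when `units` attention units are allocated."""
    rng = random.Random(seed)

    def sample_poisson(l):
        # Knuth's multiplication method; adequate for small lambda.
        threshold = math.exp(-l)
        k, p = 0, 1.0
        while True:
            p *= rng.random()
            if p <= threshold:
                return k
            k += 1

    total = 0.0
    for _ in range(trials):
        incidents = sample_poisson(lam) + 1  # at least one incident
        total += min(incidents, units) / incidents
    return total / trials
```

For a single attention unit the expectation has the closed form $(1 - e^{-\lambda})/\lambda$, which the simulation reproduces; with many units relative to $\lambda$, the discovery probability approaches 1.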

The Runtime Monitor
We show that the discovery probability admits a closed form $\beta^g_t = \beta(a^g_t, \lambda^g_t)$. Using these auxiliary results, we construct the monitor as follows. We first use the general monitor ExpEstimator from Alg. 1 to estimate, for each location $g$, the expected value $\mathbb{E}(f^g_t)$, which is the same as $\lambda^g_t$ (this follows from the properties of the Poisson distribution). We make two mild assumptions. First, we assume that the Poisson parameters in both locations are bounded between two positive reals $m < M$, allowing us to establish that $f^g_t$ is a sub-exponential random variable with parameters (2, 2) (see App. B.1). Second, we assume that the sequence of observations is such that the parameter never reaches zero, no matter what its true initial value was; otherwise the function $\Delta$ would no longer reflect the differences in the expected values. This can be checked by the monitor at runtime, by checking whether the parameter would reach zero had it started from $m$ in the worst case (the value closest to zero); we omit the check for simplicity. From the interval estimates for the discovery probability of each location, we obtain the overall fairness estimate by computing the interval difference, as we did for the lending monitor. The pseudocode of the monitor is in Alg. 3.
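The interval propagation at the heart of the monitor can be sketched as follows. Purely for illustration, we use the single-unit discovery probability $(1 - e^{-\lambda})/\lambda$ as a stand-in for the paper's closed-form $\beta$ (Eq. (8)); what matters for the sketch is only that the function is strictly decreasing in $\lambda$:

```python
import math

def beta_single_unit(lam):
    """E[1/(f+1)] for f ~ Poisson(lam): the discovery probability when a
    single attention unit is allocated. Strictly decreasing in lam.
    (A hypothetical special case, standing in for the paper's Eq. (8).)"""
    return (1.0 - math.exp(-lam)) / lam

def propagate_interval(decreasing_fn, lam_interval):
    """Map an interval [lam_min, lam_max] for the Poisson parameter through
    a strictly decreasing function: the endpoints swap order."""
    lam_min, lam_max = lam_interval
    return decreasing_fn(lam_max), decreasing_fn(lam_min)
```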

Experimental Outcome
We summarize the outputs of our monitor in Fig. 3. We consider three types of agents: uniform, greedy, and fairness-constrained greedy (with parameter 0.75). While the uniform agent behaves randomly, without taking into account the actual incidence rates, the greedy agent tries to minimize the chance of missed discoveries by keeping an estimate of the incidence rates. The constrained greedy agent additionally needs to ensure a fairness criterion. Our experiments empirically show that, no matter the experimental conditions, our monitor provides real-time information about the time-varying biases in the system. Moreover, AttentionMonitor took on average 28 µs to compute a new estimate after each new observation. The experiments demonstrate the practical usefulness of our monitors.

CONCLUSION AND FUTURE WORK
We present an approach for real-time monitoring of the long-run fairness of machine-learned agents deployed in dynamic environments. Our monitors observe a long sequence of events generated from the interactions between the agent and its environment, and output, after each new observation, a quantitative PAC-style estimate of how fair or biased the agent's policy was up to that point in time. The strength of our monitors is their ability to compute interval estimates of fairness values in the face of frequent changes in the underlying distribution, a setting that prevents a static estimation of fairness at each time step. The presented method allows for the computation of confidence intervals when the monitored random variable is sub-exponential; by extension, sub-Gaussian random variables can be handled as well. Using a prototype implementation, we demonstrated the practical usefulness of the monitoring approach on examples taken from the literature.
We took great effort to ensure that the interval estimates computed by the monitor hold non-asymptotically.
Consequently, we avoided a direct comparison with methods that rely on the central limit theorem. However, we acknowledge that loosening this restriction would allow a wider range of applications. Furthermore, we showed computations of PAC estimates for fairness properties with specific well-being functions, such as the expected credit score in the lending problem and the discovery probability in the dynamic attention allocation problem.
Although an extension to general well-being functions is difficult, a generalization to restricted classes of well-being functions is conceivable: for instance, when the well-being function is in the form of arithmetic expressions, we can use interval arithmetic to deduce the overall PAC bound, or when the well-being function is convex, we can use convex optimization to deduce tight PAC estimates, etc.
We see several immediate future directions. Firstly, we considered only one particular class of fairness properties, namely those that can be written as a difference between the expected values of a given function of population parameters across two groups. Investigating other classes of properties will be an important goal; note that we can already extend our monitors to handle individuals with multiple features, i.e., feature vectors, without any additional technical machinery. Secondly, we assumed perfect observability of the actions of the agent and the environment, whereas in reality actions are often only partially observable, or the observations are noisy (e.g., the college admission example in D'Amour et al. [9]); extensions to partially observed and noisy models will thus be helpful. Thirdly, our monitors require, at least partially, information about the changes in the system; a natural extension would be to relax this requirement. Finally, we only considered monitoring, i.e., the problem of checking fairness in real time. The next step will be to combine monitoring and intervention, so as to obtain an automated procedure for controlling the dynamic fairness properties of a machine-learned agent.
Since $f_t$ is constant when summing over $r_t$, and since $\sum_{i=1}^{t-1} \Delta(o_i)$ is constant when summing over $f_t$, the internal sum can be rewritten accordingly.
By the law of total probability we obtain the next equality. Moreover, by repeated application of this equality we obtain that the expression equals $\mathbb{E}(f_1)$.

□
The following supporting lemma demonstrates a useful equality.
Lemma A.4. For the stochastic process $\vec{o}$ as in Definition A.1 and for any $i, j \in \mathbb{N}$ such that $i < j$, the following equalities hold. Proof. First let $j = i + 1$; the claim then follows from the properties of the stochastic process. Hence, by repeated substitution we obtain the claim for this case. Now let $j > i + 1$, and notice the equality that follows from the definition of conditional expectation. Hence, by repeated application of this equality we obtain the nested expectation expression.
First, we notice that we can decompose the innermost expectation using Lemma A.4. We then apply the tower property of conditional expectation and the linearity of expectation. The last step is possible because the expected value is a constant. □ The following lemma shows that the estimator $\hat{E}$ estimates the quantity we are interested in, i.e., $\mathbb{E}(f_t \mid \vec{o}_{t-1})$.
Lemma A.5. For the stochastic process $\vec{o}$ as in Definition A.1 and for any time step $t$, the stated equality holds. Proof. We utilize the properties of conditional expectation and Lemma A.4 to show the claim.

□
The following lemma provides insight into the martingale difference sequence used for the approximation of our estimator.
Lemma A.6. For the stochastic process $\vec{o}$ as in Definition A.1, the martingale difference at each step can be expressed via the change function $\Delta$.
Therefore, we now split the sums at $i$ and $i + 1$, and apply the previous observations to obtain the bound. With the martingale difference sequence bounded, we can use an Azuma-style concentration inequality to demonstrate that our estimator $\hat{E}_1$ concentrates well around its mean, and thus around $\mathbb{E}(f_1)$.
Lemma A.7. Let $\vec{o}$ be a stochastic process as in Definition A.1.

Fig. 1. The operational diagram of the monitor and the sections of this paper where the components are presented.

From this point estimator, we can obtain an interval estimator for $\mu(\vec{o}_{t-1})$ by applying a suitable version of the Azuma-style inequality (see Theorem 3.3) to compute a $(1 - \delta) \cdot 100\%$ confidence interval around $\hat{E}(\vec{o}_t)$ for any given $\delta$. We call the interval estimator of $\mu(\cdot)$ ExpEstimator, and present its pseudocode in Alg. 1. In the function Init, the monitor first initializes various internal registers. After each new observation $(f, g, a, r)$, the monitor invokes the function Compute to compute a new $(1 - \delta) \cdot 100\%$ confidence interval for the expected feature $\mu(\vec{o}_{t-1})$.

Fig. 2. Output estimates of the monitors at each time step on simulated trajectories for the lending example, obtained from ml-fairness-gym. The three plots are ordered from left to right in increasing order of initial bias. For each case, we considered two different policies of the agent: the MaxRwrd agent (blue) maximizes its own reward without trying to optimize any fairness criterion, whereas the EqOpp agent (red) also tries to ensure equalized opportunity statically (i.e., in its one-shot decisions). The shaded regions are the intervals computed by the monitor LendingMonitor, whereas the solid lines are the (unknown to the monitor) true values of the properties. The horizontal dashed line corresponds to the perfectly fair scenario (i.e., zero disparity).


The discovery probability admits a closed form $\beta(a^g_t, \lambda^g_t)$, given in Eq. (8). Furthermore, we show that, for a fixed $a$, the function $\beta(a, \cdot)$ is strictly decreasing everywhere on the positive reals (see App. B.3). This property of $\beta(a, \cdot)$ enables us to efficiently compute an interval estimate of $\beta(a, \cdot)$ and, in turn, of $\beta^g_t$, from an interval estimate of $\lambda^g_t$. Suppose $[\lambda_{\min}, \lambda_{\max}]$ is the interval output by ExpEstimator as the estimate of $\lambda^g_t$. Then from $\beta^g_t = \beta(a^g_t, \lambda^g_t)$ and the strictly decreasing property of $\beta$, we obtain the corresponding interval estimate for $\beta^g_t$ as $[\beta(a^g_t, \lambda_{\max}), \beta(a^g_t, \lambda_{\min})]$.

Fig. 3. Output estimates of the monitors at each time step on simulation traces for the attention allocation example, obtained from ml-fairness-gym. Left: 5 locations, 6 total units of attention, no change in the Poisson parameters ($\gamma = 0$). Middle: 5 locations, 6 units, $\gamma = 0.0025$. Right: 10 locations, 10 units, $\gamma = 0.0025$. For each case, we considered three different policies of the agent (described in the text). The shaded regions are the intervals computed by the monitor AttentionMonitor, whereas the solid lines are the (unknown) true values of the properties.