Scheduling "Last Minute" Updates for Timely Decision-Making

We consider a setting where requests for updates regarding time-varying processes are required prior to making a sequence of decisions. Each request has a finite length time window during which the update should be received. The end of the window reflects the time at which a decision is to be made, while the start of the window models the earliest possible time at which a useful update could be sent. An update scheduled as near to the end of the window as possible is deemed the best, i.e., reflects the most timely information about the process' state. This is modelled by a reward depending on the time difference between the decision point and the last scheduled update. Requests arrive arbitrarily and share a limited communication resource, e.g., a single request can be scheduled per time slot, hence not all decisions can be based on the latest possible update. We consider update scheduling policies which maximize the overall reward rate. In particular we consider an adversarial request model and evaluate proposed algorithms via their Competitive Ratio (CR). Specifically, we first derive a lower bound on the CR of any causal policy. We then propose two scheduling policies, denoted adversarial and greedy, and provide further analysis and insights on regimes where one might be superior to the other. We validate these observations via simulation for a setting with stochastic arrivals.


INTRODUCTION
The emergence of applications relying on networked systems has revolutionized the sensing industry and led the way towards modeling systems that rely on the timely sharing of information to support real-time decision making. Among these, a challenging set of examples is tied to automated vehicles, robots, UAVs, etc., that constantly travel through complex environments and require updates to navigate smoothly with a high degree of situational awareness. Such applications are often best supported when updates are delivered right on time. In the vehicular setting, for example, cars driving at different speeds and heading towards an obstructed intersection may express interest in accurate information on the state of the intersection right before they reach it, and thus generate requests for timely updates about such intersections. An update sent to a vehicle early on may not accurately represent the state of the intersection by the time the vehicle gets there, and will lead to poor decisions. On the other hand, scheduling an update transmission to the vehicle when it is close to the intersection is likely to be advantageous and to result in better decisions.
A major challenge in such systems is the dynamic aspect of requests for timely updates. Requests may not only arrive arbitrarily but may also be short-lived, i.e., a request can receive updates only for a finite-length time window before a critical decision is made at the end of the window, preferably based on the freshest information update.
A key step in this direction is to define appropriate metrics that capture the freshness and timeliness of the last received update, to ensure that it accurately represents the state of the time-varying process that the request is interested in. To that end, the Age of Information (AoI) has been proposed as a metric that measures the freshness of the updates at the receiver [6, 7, 17]. In contrast to prior work on AoI, we propose a novel setting where a request can only receive updates within a finite time window, which we refer to as the request's active window. The importance of scheduling an update transmission to a request close to the end of its active window is modelled through a reward function which depends on the time difference between the end of the request's active window and the time at which the last update was received. Such a reward model is therefore tied to the age of the last received update within the request's active window, which reflects the freshness of this update. In this paper, we study scheduling policies that aim to optimize the rewards associated with scheduling such updates.

Related work. There has been substantial work on the scheduling of requests with deadlines. The Earliest Deadline First (EDF) policy [12] is the most well-known policy for scheduling in real-time systems. It was proved to maximize the fraction of customers served prior to their respective deadlines when the service time is equal to a single slot. EDF requires that the customer with the earliest deadline be scheduled first and at most once. Meanwhile, [15] considers a real-time queuing system where packets have deadlines and the processing time of a packet is known upon its arrival. A predetermined fixed reward is associated with servicing each packet, and the goal is hence to design a scheduling policy that maximizes the cumulative reward. Other works, e.g., [11], [5], introduce scheduling policies in systems with strict bounds on the service delays. In contrast to prior work, we allow a request to be scheduled more than once before the end of its active window, and we define a reward function that is tied to the time difference between the request's decision time and its last scheduled update. More recent work addresses scheduling in collaborative sensing settings in an attempt to achieve real-time situational awareness and can be found in [1, 2, 13, 19]. We extend these prior works by assuming that requests for timely updates arrive arbitrarily and can only receive updates within a limited period of time before making a critical decision. Other recent works investigate scheduling sensing nodes to update a remote node under communication constraints with requirements on the AoI [3, 4, 8, 9, 10]. They consider applications where the AoI has to meet some freshness threshold, i.e., they impose either a hard or soft upper bound on the worst-case AoI that can be achieved by any sensing node, and devise policies that can schedule at most one node per slot to satisfy the constraints. On the other hand, [16] and [14] consider the setting where packets arrive arbitrarily over time and the algorithms only have access to information about packet arrivals. In particular, [16] devises a policy to minimize the energy consumption under a peak AoI constraint at all times, while [14] introduces a resource allocation problem that captures the trade-off between AoI, quality and energy associated with packet transmission, and proposes a policy to minimize the three costs. Finally, [18] develops and implements a scheduling algorithm that enables the customization of WiFi networks to the needs of time-sensitive applications. They propose a scheduling approach which makes use of the most up-to-date data as opposed to all past sampled data points, similarly to the one suggested in our work. We differentiate ourselves from [18] by considering a setting where requests arrive arbitrarily over time, as opposed to having a fixed number of nodes requesting information, and additionally consider that the requests can receive updates only in a short period of time before making decisions based on the last received update.

Contributions. We explore a new class of scheduling problems associated with delivering information updates "just in time". We consider a setting where requests for timely updates arrive arbitrarily, and we assume that requests can receive updates within a finite time window, which we refer to as the request's active window. Our goal is to ensure that requests have updates that are as fresh as possible by the end of their active windows, to enable accurate decisions to be made based on the states of the time-varying processes they are interested in. We hence define a reward function that captures the importance of scheduling an update transmission to a request as close to the end of its active window as possible. We propose to maximize the reward rate under the assumption that only a single request can be scheduled at a time. We investigate an adversarial setting where the number of new request arrivals as well as the length of a request's active window are unknown a priori and only revealed once requests arrive, and thus use the competitive ratio as our performance metric. We derive a lower bound on the reward rate achieved by any non-idling causal policy. We then propose two causal scheduling policies $\pi^a$ and $\pi^g$, referred to as the adversarial and greedy policies respectively, and further derive the competitive ratio of $\pi^g$ with respect to the optimal genie-based policy. Finally, we validate our theoretical analysis with numerical evaluations.

SYSTEM MODEL

Model for timely information requests
Consider requests for timely updates about time-varying processes that arrive arbitrarily to a time-slotted system. We let

Scheduling updates
We consider a time-slotted system where the length of a time slot corresponds to the duration it takes to transmit an update. Further, we consider for simplicity a setting where a policy can schedule a single update per time slot. That said, recall that multiple updates can be scheduled sequentially for the same request within its active window. We shall use the following notation.
• $x^{\pi}_{i,t}$ is an indicator variable that takes value 1 if an update for an active request $i$ is scheduled in slot $t$ under a policy $\pi$ and 0 otherwise.
• $T^{\pi}_{i,t}$ is the set of time slots in which updates for active request $i$ are scheduled prior to and including $t$ under a policy $\pi$.
• $S^{\pi}_{t}$ is the set of requests in $\mathcal{N}_t$ that have been serviced by policy $\pi$, where $\mathcal{N}_t$ denotes the set of requests that were active prior to slot $t$.

Reward and age of last update
We let $r^{\pi}_i$ denote the reward obtained for a request $i$ that was serviced under policy $\pi$. It depends on the last slot in which an update for $i$ was scheduled while it was active. If the last update for an active request $i$ is scheduled in a slot close to its start time $s_i$, then the "age" of the update grows by the end of the active window, and the update does not provide information as timely as an update scheduled in a slot close to $e_i$. Scheduling an update for request $i$ in a slot closer to its end time $e_i$ is therefore deemed advantageous. We let $d^{\pi}_i$ denote the time difference between the end time of active request $i$ and the last slot in which an update is scheduled for $i$ under policy $\pi$, i.e.,
$$d^{\pi}_i = e_i - \max_{t \in [s_i, e_i]} x^{\pi}_{i,t}\, t.$$
Definition 2. (Reward obtained from servicing $i$) The reward $r^{\pi}_i$ obtained from servicing request $i$ under a policy $\pi$ depends on the last slot in which an update for $i$ was scheduled and is collected when $i$ is no longer active, i.e., at the end of slot $e_i$. It is modelled as
$$r^{\pi}_i = f(d^{\pi}_i).$$
In the rest of the paper, we consider both linear and convex reward functions $f(\cdot)$.
Definition 3. (Linear reward function) A positive linear reward function $f(x) = \alpha + \beta(w_{\max} - x + c)$ associated with servicing request $i$ under a policy $\pi$ is given by
$$r^{\pi}_i = \alpha + \beta(w_{\max} - d^{\pi}_i + c),$$
where $\alpha, \beta, c \ge 0$.
The linear reward associated with servicing a request $i$ under a policy $\pi$ as defined above consists of two main components, $\alpha$ and $\beta(w_{\max} - d^{\pi}_i + c)$. Since $0 \le d^{\pi}_i \le w_{\max}$, the second component is bounded below and above as follows, $\forall i \in N$,
$$\beta c \le \beta(w_{\max} - d^{\pi}_i + c) \le \beta(w_{\max} + c).$$
Definition 4. (Convex reward function) A positive convex reward function $f(x) = \alpha + h(-\beta(x + c))$ models the reward associated with servicing a request $i$ under a policy $\pi$ if
$$r^{\pi}_i = \alpha + h(-\beta(d^{\pi}_i + c)),$$
where $\alpha, \beta, c \ge 0$ and $h(\cdot)$ is a positive convex function.
Paralleling the way we defined the linear reward, the convex reward has two main components, $\alpha$ and $h(-\beta(d^{\pi}_i + c))$, where, for a non-decreasing $h(\cdot)$, the second term is bounded above and below as follows, $\forall i \in N$,
$$h(-\beta(w_{\max} + c)) \le h(-\beta(d^{\pi}_i + c)) \le h(-\beta c).$$
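The two reward families can be sketched in a few lines of Python. This is a minimal illustration, assuming the linear reward has the form $\alpha + \beta(w_{\max} - d + c)$ and the convex reward the form $\alpha + h(-\beta(d + c))$; the function names, the argument `d` (slots between the last update and the decision) and the default `h = exp` are assumptions made for the example only.

```python
import math

def linear_reward(d, alpha, beta, c, w_max):
    """Assumed linear form: alpha + beta * (w_max - d + c)."""
    return alpha + beta * (w_max - d + c)

def convex_reward(d, alpha, beta, c, h=math.exp):
    """Assumed convex form: alpha + h(-beta * (d + c)); h = exp is illustrative."""
    return alpha + h(-beta * (d + c))

def linear_assumption_holds(alpha, beta, c, w_max):
    """Check alpha > beta * (w_max + c), the linear-case regime of interest."""
    return alpha > beta * (w_max + c)

def convex_assumption_holds(alpha, beta, c, h=math.exp):
    """Check alpha > h(-beta * c), the convex-case regime of interest."""
    return alpha > h(-beta * c)
```

In both families the reward shrinks as `d` grows, so an update scheduled closer to the decision time is worth more, while the constant `alpha` is collected as soon as the request is scheduled at least once.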

Characterization of the scheduling problem
Our objective is to maximize the reward rate obtained from servicing a sequence of requests $\boldsymbol{\rho}_T$ in a finite time window where, without loss of generality, $t = 1$ corresponds to the first slot in which new requests arrive and $T$ is the last slot after which there are no longer any active requests. For a sequence of requests $\boldsymbol{\rho}_T$, we let $g(\pi, \boldsymbol{\rho}_T)$ be the reward rate obtained under policy $\pi$, i.e., the cumulative reward normalized by $|\mathcal{N}_{T+1}|$, the number of requests that were active prior to time $T + 1$. Formally, the problem is defined as follows.
Problem 1.
$$\max_{\pi \in \Pi} \; g(\pi, \boldsymbol{\rho}_T) \quad \text{subject to} \quad \sum_{i} x^{\pi}_{i,t} \le 1, \;\; \forall t, \qquad (6)$$
where Equation (6) limits the number of users that can be scheduled at a time to at most 1, and where $\Pi$ is the set of causal non-idling policies defined as follows.
Definition 5. (Non-idling policy) A policy $\pi \in \Pi$ is said to be non-idling if it only idles when there are no active requests.
Our goal is to design a causal non-idling scheduling policy that maximizes the reward rate in Problem 1.
Assumption 1. In the remainder of the paper we consider the regime where $\alpha > \beta(w_{\max} + c)$ and $\alpha > h(-\beta c)$ for the linear and convex reward functions respectively.
Discussion. An interesting regime for the linear and convex reward functions in Definitions 3 and 4 is that where $\alpha > \beta(w_{\max} + c)$ and $\alpha > h(-\beta c)$, respectively. A large reward of value $\alpha$ is then obtained if an update to an active request $i$ is scheduled in any slot within its active window, in addition to a smaller reward (of value $\beta(w_{\max} - d^{\pi}_i + c)$ in the linear case) which depends on how close to $e_i$ the last update to $i$ was scheduled. Therefore, a policy whose target is to maximize the reward rate defined in Problem 1 is driven to first maximize the number of scheduled requests and then, if possible, to schedule these requests as close as possible to the ends of their associated update windows. In particular, if $\beta = 0$, the linear reward function in Definition 3 reduces to $r^{\pi}_i = \alpha$, and Problem 1 corresponds to maximizing the fraction of updates scheduled within their active windows. Hence, an optimal policy $\pi$ that solves this problem when $\beta = 0$ needs to schedule at most one update per active request, in any time slot within its active window. An optimal policy for this setting is the Earliest Deadline First (EDF) policy [12]. From Definition 2, the reward obtained from servicing a request $i$ under a policy $\pi$ depends on the last slot in which an update for $i$ is scheduled as well as on $i$'s release time $s_i$, and is hence independent of its arrival time $a_i$. Therefore, and without loss of generality, we assume in the rest of the paper that a request's release time is equal to its arrival time, i.e., for all $i \in N$ we have that $s_i = a_i$.
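For the special case $\beta = 0$ discussed above, the EDF rule can be sketched as follows. This is a hedged illustration rather than the implementation of [12]: requests are represented as hypothetical `(start, end)` tuples of inclusive slot indices, slots are numbered from 1, and each request is scheduled at most once.

```python
def edf_schedule(requests, horizon):
    """Earliest Deadline First with unit service time.

    requests: list of (start, end) inclusive active windows (assumed format).
    Returns a dict mapping slot -> index of the request scheduled in that slot.
    """
    served = set()
    schedule = {}
    for t in range(1, horizon + 1):
        # Active requests that have not received an update yet.
        pending = [i for i, (s, e) in enumerate(requests)
                   if s <= t <= e and i not in served]
        if pending:
            # Pick the pending request whose window ends first.
            i = min(pending, key=lambda k: requests[k][1])
            schedule[t] = i
            served.add(i)
    return schedule
```

For example, with windows `[(1, 3), (1, 1), (2, 2)]` the rule serves the request expiring in slot 1 first, then the one expiring in slot 2, and still reaches the longest-lived request in slot 3.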

OPTIMAL OFFLINE POLICY
In this section we characterize the optimal offline policy $\pi^*$ that solves Problem 1.
Definition 6. (Optimal genie-based offline policy $\pi^*$) A policy $\pi^*$ is optimal for Problem 1 if it maximizes the reward rate.
Policy $\pi^*$ has knowledge of the request arrivals over the entire timeline and can therefore optimally service requests. Optimal offline policies are useful since they provide an upper bound on the reward rate that can be achieved by causal policies. $\pi^*$ schedules at most a single update transmission per active request, since an active request requires at most a single update, scheduled as close as possible to its end time.
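For very small instances, an offline optimum can be found by exhaustive enumeration. The sketch below is only an illustration under stated assumptions: requests are hypothetical `(start, end)` tuples, `f` is any reward function of the gap between the end time and the last scheduled slot, and the total reward is maximized (the number of requests is the same for every policy, so this also maximizes the reward rate). Its cost grows exponentially with the horizon.

```python
from itertools import product

def brute_force_offline(requests, horizon, f):
    """Enumerate all one-request-per-slot schedules and keep the best one.

    Returns (best_total_reward, best_schedule), where best_schedule[t - 1]
    is the request index scheduled in slot t, or None when no request is active.
    """
    options = []
    for t in range(1, horizon + 1):
        active = [i for i, (s, e) in enumerate(requests) if s <= t <= e]
        options.append(active if active else [None])
    best_reward, best_schedule = float("-inf"), None
    for choice in product(*options):
        last = {}
        for t, i in enumerate(choice, start=1):
            if i is not None:
                last[i] = t  # later slots overwrite earlier ones
        # Only the last update of each serviced request earns a reward.
        reward = sum(f(requests[i][1] - t_last) for i, t_last in last.items())
        if reward > best_reward:
            best_reward, best_schedule = reward, choice
    return best_reward, best_schedule
```

Restricting the search to non-idling schedules loses nothing here, because an extra early update never lowers a reward that depends only on the last update.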

CAUSAL SCHEDULING POLICIES
The limitation of causal policies in $\Pi$ is that they have no knowledge of future request arrivals. Our goal is to devise online causal policies which have provable performance guarantees in terms of Competitive Ratio (CR) with respect to the optimal offline policy, defined as follows.
Definition 7. (Competitive ratio of a policy $\pi$) The competitive ratio of a policy $\pi$ is given by
$$\mathrm{CR}^{\pi} = \min_{\boldsymbol{\rho}_T \in \mathcal{P}_T} \frac{g(\pi, \boldsymbol{\rho}_T)}{g(\pi^*, \boldsymbol{\rho}_T)},$$
where $\mathcal{P}_T$ is the set of all possible request arrivals in a time window of length $T$, $\pi^*$ is the optimal offline policy that solves Problem 1, and $g(\pi, \boldsymbol{\rho}_T)$ and $g(\pi^*, \boldsymbol{\rho}_T)$ are the reward rates achieved by $\pi$ and $\pi^*$ respectively. An online algorithm is $q$-competitive for some $q \ge 1$ if it achieves at least $1/q$ of the optimal offline value in the worst case, i.e., for all $\boldsymbol{\rho}_T \in \mathcal{P}_T$, we have that $g(\pi, \boldsymbol{\rho}_T) \ge \frac{1}{q}\, g(\pi^*, \boldsymbol{\rho}_T)$.
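The $q$-competitive condition can be checked empirically on a finite collection of instances. The helper below is a hypothetical sketch: it takes the reward rates of a policy and of the offline optimum on the same instances and returns the smallest $q \ge 1$ for which the inequality holds on every listed instance. Since it only sees sampled instances, the value it returns can underestimate the true worst-case factor.

```python
def empirical_competitive_factor(policy_rates, optimal_rates):
    """Smallest q >= 1 with g(pi) >= g(pi*) / q on every listed instance."""
    if not policy_rates or len(policy_rates) != len(optimal_rates):
        raise ValueError("need two non-empty lists of equal length")
    ratios = []
    for g_pi, g_opt in zip(policy_rates, optimal_rates):
        if g_pi <= 0:
            raise ValueError("policy reward rate must be positive")
        ratios.append(g_opt / g_pi)
    # q is at least 1 by definition, even if the policy matches the optimum.
    return max(max(ratios), 1.0)
```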
We shall begin by providing a lower bound on the competitive ratio for any causal non-idling scheduling policy in $\Pi$.

Lower bound on the competitive ratio of any policy in $\Pi$
The following theorem states that any causal non-idling policy $\pi \in \Pi$ achieves a competitive ratio of at least $1/w_{\max}$.
Theorem 1. For any causal non-idling policy $\pi \in \Pi$, and with a reward function that satisfies the condition in Assumption 1, the competitive ratio of $\pi$ satisfies $\mathrm{CR}^{\pi} \ge \frac{1}{w_{\max}}$.
Proof. Consider a causal non-idling policy $\pi \in \Pi$. According to Assumption 1, the worst scheduling strategy that $\pi$ can follow is to schedule the same active request in every slot until it is no longer active. Let $[1, T]$ be a time window of length $T \ge w_{\max}$, where $T$ corresponds to the slot after which there are no longer any active requests. The cumulative reward obtained under $\pi$ over this window can then be lower bounded, while the cumulative reward obtained under the optimal offline policy $\pi^*$ is upper bounded by $T f(0)$. Therefore, $\forall T \ge w_{\max}$, the competitive ratio of $\pi$ is lower bounded as $\mathrm{CR}^{\pi} \ge \frac{1}{w_{\max}}$, which concludes the proof. □

Greedy policy $\pi^g$
The causal greedy policy $\pi^g$ presented in the Algorithm 1 panel schedules in every slot the request that would maximize the marginal increase in cumulative reward and, if need be, breaks ties arbitrarily (Line 5).
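The greedy rule described above can be sketched as follows. This is an illustrative reading, not the paper's Algorithm 1: requests are hypothetical `(start, end)` tuples, `f` maps the gap between the end time and the last scheduled slot to a reward, the marginal gain of a request is taken to be its reward if last updated now minus its reward from its previous last update (zero if never scheduled), and ties are broken by lowest index as one arbitrary choice.

```python
def greedy_step(t, requests, last_update, f):
    """Return the active request with the largest marginal reward gain in slot t.

    last_update maps a request index to the last slot it was scheduled in.
    Returns None when no request is active.
    """
    best_i, best_gain = None, float("-inf")
    for i, (s, e) in enumerate(requests):
        if not (s <= t <= e):
            continue
        current = f(e - last_update[i]) if i in last_update else 0.0
        gain = f(e - t) - current
        if gain > best_gain:  # strict comparison keeps the lowest index on ties
            best_i, best_gain = i, gain
    return best_i

def run_greedy(requests, horizon, f):
    """Run the greedy rule over slots 1..horizon; return last scheduled slots."""
    last_update = {}
    for t in range(1, horizon + 1):
        i = greedy_step(t, requests, last_update, f)
        if i is not None:
            last_update[i] = t
    return last_update
```

When the constant part of the reward dominates, a request that has never been scheduled always offers a larger gain than rescheduling one that has, which is why the rule behaves like EDF among unscheduled requests.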
Algorithm 1 (Greedy policy $\pi^g$): in every slot, schedule the active request that maximizes the marginal increase in cumulative reward; break ties arbitrarily.
Theorem 2. ($\pi^g$ maximizes the ratio of serviced requests) Policy $\pi^g$ maximizes the ratio of serviced requests when the linear and convex reward functions satisfy the conditions in Assumption 1.
Proof. The proof of this theorem follows from the optimality of the Earliest Deadline First (EDF) policy in maximizing the ratio of serviced requests for the discrete-time $G/D/1 + G$ queue where the service time is exactly one unit of time [12]. Following from Assumption 1, $\pi^g$ prioritizes scheduling requests that have not been scheduled yet. If in a time slot $t$ there is more than one active request that has not been scheduled prior to $t$, then $\pi^g$, similarly to EDF, schedules in slot $t$ an update transmission to the request with the earliest end time. Otherwise, in the case where all active requests at time $t$ have already been scheduled prior to $t$, $\pi^g$ schedules the request that maximizes the marginal increase in the cumulative reward, which does not affect the ratio of serviced requests. This concludes the proof. □
Let $N = |S^{\pi^g}_{T+1}|$ be the total number of requests that have been serviced under $\pi^g$ in a finite time window of length $T$. It follows that a lower bound on the cumulative reward $r^{\pi^g}(S^{\pi^g}_{T+1})$ obtained under $\pi^g$ is $r^{\pi^g}(S^{\pi^g}_{T+1}) \ge N f(w_{\max})$. The cumulative reward $r^{\pi^*}(S^{\pi^*}_{T+1})$ obtained under the optimal offline policy $\pi^*$ is upper bounded by $N f(0)$, since by Theorem 2 no policy services more requests than $\pi^g$. Therefore, for all $T \ge w_{\max}$, the competitive ratio of $\pi^g$ is lower bounded as follows,
$$\mathrm{CR}^{\pi^g} \ge \frac{N f(w_{\max})}{N f(0)} = \frac{f(w_{\max})}{f(0)}.$$
Note. As long as a request $i$ is active, no reward is yet collected for request $i$. A reward associated with servicing a request $i$ is only obtained when $i$ is no longer active. We introduce a specific notion for the reward associated with an active request $i$ under policy $\pi^a$, which we refer to as the tentative reward, defined as follows.
At any time $t$, an active request $i$ may have been actually scheduled one or many times prior to but not including $t$, and may have been tentatively scheduled after $t$. Hence, from Definition 9, the tentative reward depends on the latest time slot in which $i$ is either tentatively scheduled or has been actually scheduled. We similarly define the aggregate reward associated with a set of active requests.
Definition 10. (Requests' aggregate tentative reward under policy $\pi^a$) The aggregate tentative reward at slot $t$ associated with the set of active requests $Q_t$ under policy $\pi^a$ is the sum of the requests' tentative rewards.
We consider the setting where the reward obtained from servicing a request is either linear or convex, as introduced in Definitions 3 and 4 respectively. $\pi^a$ is presented in Algorithm 2, which proceeds as follows.
Algorithm 3 (Tentative schedule): ties are broken by selecting the request $i^*$ with the smallest end time $e_{i^*}$; remaining ties are broken arbitrarily.
For every slot $t$, $\pi^a$ operates in a backwards manner, starting from the last slot in which it can tentatively schedule an update, $\max_{i \in Q_t} e_i$, all the way back to $t$. For every slot $\tau$ between $t$ and $\max_{i \in Q_t} e_i$, $\pi^a$ first determines $\mathcal{A}_\tau$, the subset of active requests in $Q_t$ that can be tentatively scheduled in slot $\tau$. Let $i \in \mathcal{A}_\tau$ and $j \in \mathcal{A}_\tau \setminus \{i\}$ be two requests that will be active in slot $\tau$. In Algorithm 3, for every $\tau$, $\pi^a$ evaluates $m^{\pi^a}_{t,\tau}(i, j)$ (Line 7). The reasoning behind $m^{\pi^a}_{t,\tau}(i, j)$ is as follows. Through $m^{\pi^a}_{t,\tau}(i, j)$, $\pi^a$ evaluates whether it is more advantageous, in terms of aggregate tentative reward, to tentatively schedule request $i$ in slot $\tau$ instead of request $j$. Slots between $\tau + 1$ and $e_j$ in which updates to request $j$ are tentatively scheduled under $\pi^a$ are ignored when evaluating $m^{\pi^a}_{t,\tau}(i, j)$, in an attempt to characterize the importance of tentatively scheduling an update to request $j$ solely in slot $\tau$. A detailed derivation of $m^{\pi^a}_{t,\tau}(i, j)$ is provided as follows.
The numerator of $m^{\pi^a}_{t,\tau}(i, j)$ corresponds to the sum of the following two components:
• Tentative reward of request $i$ if $i$ is tentatively scheduled in any slots in $T^{\pi^a}_{i,t,\tau}$.
• Tentative reward of request $j$ if $j$ was only scheduled prior to slot $t$.
The denominator of $m^{\pi^a}_{t,\tau}(i, j)$ corresponds to the sum of the following two components:
• Tentative reward of request $j$ if $j$ is tentatively scheduled in slot $\tau$.
• Tentative reward of request $i$ if $i$ is tentatively scheduled in any slots in $T^{\pi^a}_{i,t,\tau+1}$ and/or scheduled in any slots in $T^{\pi^a}_{i,t-1}$.
Therefore, $m^{\pi^a}_{t,\tau}(i, j)$ evaluates the ratio between the aggregate tentative rewards resulting from (1) tentatively scheduling request $i$ at least once after and including slot $\tau$ while only considering slots in which request $j$ was actually scheduled prior to $t$, and (2) tentatively scheduling request $j$ only in slot $\tau$ while considering all slots strictly greater than $\tau$ in which $i$ was tentatively scheduled to receive updates and/or all slots prior to $t$ in which updates to request $i$ were scheduled.
$\pi^a$ finally tentatively schedules in slot $\tau$ the request $i^*$ such that
$$\min_{q \in \mathcal{A}_\tau \setminus \{i^*\}} m^{\pi^a}_{t,\tau}(i^*, q) \;\ge\; \max_{j \in \mathcal{A}_\tau \setminus \{i^*\}} \; \min_{q \in \mathcal{A}_\tau \setminus \{j\}} m^{\pi^a}_{t,\tau}(j, q)$$
(Line 8 of Algorithm 3), where $\mathcal{A}_\tau$ denotes the subset of active requests that can be tentatively scheduled in slot $\tau$. Figure 1 provides an example of requests scheduled and tentatively scheduled to receive updates under policy $\pi^a$. Requests $i_3$ and $i_4$ became active at the beginning of slot $t = 3$. We observe in the left subfigure of Figure 1 that requests $i_1$ and $i_2$ were scheduled to receive updates under $\pi^a$ in slots $t = 1$ and $t = 2$ respectively. Additionally, at $t = 3$, requests $i_4$, $i_3$ and $i_1$ are tentatively scheduled to receive updates under $\pi^a$ in slots $t = 5$, $t = 4$ and $t = 3$ respectively. Once all slots in the interval $[3, 5]$ have been reserved to tentatively schedule updates for active requests, $\pi^a$ then schedules an update to request $i_1$ in slot $t = 3$ (right subfigure of Figure 1). The same scheduling process is repeated as long as there are active requests.
The other direction of the condition can be proved similarly, which concludes the proof. □
It follows from Corollary 1 that $\pi^g$ schedules in every slot the request $i^*$ that attains the arg max over pairs $(i, j)$ of active requests, and $\pi^a$ becomes equivalent to $\pi^g$. In other words, if $\pi^a$ does not tentatively schedule requests, then it is equivalent to the greedy request scheduling policy $\pi^g$.
According to the above derivation, $\pi^g$ does not tentatively schedule active requests at time $t$ and takes a restrictive approach by prioritizing the scheduling of any active requests that have not been scheduled prior to $t$. On the other hand, $\pi^a$ allows for more flexibility in rescheduling active requests: an active request that has already been scheduled prior to some slot $t$ can be rescheduled in $t$, even if this comes at the expense of not scheduling other active requests that have not yet been scheduled prior to $t$.
It follows that $\pi^a$ does not make any assumptions about the future load of request arrivals and its implications on the tentative schedules. We therefore expect $\pi^a$ to achieve a higher reward rate than $\pi^g$ in systems with small loads and requests with long active windows. On the other hand, we expect $\pi^a$ and $\pi^g$ to achieve a similar reward rate in systems with high loads and requests with short active windows, balanced on the one hand by $\pi^g$'s urgency to schedule new requests as soon as they become active, and on the other hand by $\pi^a$ attempting to schedule or reschedule requests as close as possible to the end of their active windows.

NUMERICAL EVALUATIONS
We conducted numerical evaluations to explore the performance of our proposed policies $\pi^a$ and $\pi^g$ in maximizing the reward rate introduced in Problem 1, and compare their results against other baseline policies which we introduce below.

Model
We shall present results for a convex reward function that satisfies the conditions in Assumption 1. We let $h(x) = e^{x}$ in Definition 4 and set $\alpha = 1$, $\beta = 2$ and $c = -\frac{1}{2}\ln(0.9)$. It follows that the convex exponential reward obtained after servicing request $i$ under policy $\pi$ is
$$r^{\pi}_i = 1 + 0.9\, e^{-2 d^{\pi}_i}.$$
The maximal achievable reward in this setting is equal to 1.9, whereas the smallest reward obtained after servicing a request is $1 + 0.9\, e^{-2 w_{\max}}$. We consider the setting where the number of new request arrivals at the beginning of every slot is drawn from a Poisson distribution with intensity $\lambda$, whereas a request's active window length is generated from a discrete uniform distribution $\sim U[1, w_{\max}]$. We run simulations over a finite time window of length $T = 1000$, where $T$ is the last slot after which there are no longer any active requests in the system. Our simulation results represent averages over randomly generated request arrivals as well as request active window lengths. We ran 100 Monte-Carlo (MC) simulations and plotted both the mean ratio of serviced requests for any policy $\pi$, i.e., $|S^{\pi}_{T+1}| / |\mathcal{N}_{T+1}|$, where $\mathcal{N}_{T+1}$ is the set of requests that were active prior to slot $T + 1$, and the reward rate, as well as the confidence intervals corresponding to the standard deviation of the estimator resulting from the MC simulations.
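The arrival model and the exponential reward above can be sketched with the Python standard library. This is a minimal sketch, not the authors' simulator: the function names are hypothetical, Poisson draws use Knuth's multiplication method, and a window of length `w` is assumed to span slots `s` to `s + w`, so that the gap `d` can reach `w_max`.

```python
import math
import random

C = -0.5 * math.log(0.9)  # constant c from the setup above

def exp_reward(d, alpha=1.0, beta=2.0, c=C):
    """Convex reward alpha + exp(-beta * (d + c)), i.e. 1 + 0.9 * exp(-2 d)."""
    return alpha + math.exp(-beta * (d + c))

def sample_poisson(lam, rng):
    """Knuth's method: count uniform draws until their product drops below exp(-lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def generate_requests(arrival_slots, lam, w_max, seed=0):
    """Return (start, end) windows: Poisson(lam) arrivals per slot, uniform lengths."""
    rng = random.Random(seed)
    requests = []
    for t in range(1, arrival_slots + 1):
        for _ in range(sample_poisson(lam, rng)):
            length = rng.randint(1, w_max)  # inclusive on both ends
            requests.append((t, t + length))
    return requests
```

Fixing the seed makes each Monte-Carlo run reproducible; averaging over runs with different seeds would mirror the evaluation procedure described above.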

Scheduling policies
In addition to presenting the results for $\pi^a$ and $\pi^g$, we consider two baseline policies. $\pi^r$ is a causal policy that randomly schedules an active request in a slot. $\pi^{RR}$ schedules active requests in a round-robin-like fashion, which we describe as follows. At the beginning of every slot, the set of new requests is considered for scheduling right after the set of requests that arrived prior to this slot and have not been scheduled yet.
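The random baseline is simple enough to sketch directly. The snippet below is an assumption-laden illustration: requests are hypothetical `(start, end)` tuples, and "randomly" is read as a uniform choice among the requests active in the slot.

```python
import random

def random_policy_step(t, requests, rng):
    """Pick one active request uniformly at random, or None if none is active."""
    active = [i for i, (s, e) in enumerate(requests) if s <= t <= e]
    return rng.choice(active) if active else None
```

Passing a seeded `random.Random` instance as `rng` keeps runs reproducible across policies being compared on the same request sequence.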
Due to the brute-force nature of the optimal offline policy $\pi^*$, we provide instead an upper bound (UB) on the maximal achievable reward-rate, which is equal to the product between the maximal ratio of serviced requests (achieved by $\pi^g$) and the maximal achievable reward, which is equal to 1.9 in this setting, normalized by the total number of requests.

On the impact of the load of requests with fixed maximal window length
We fix $w_{\max} = 30$ and increase $\lambda$ from 0.1 to 2.5. A first interesting observation is that all of $\pi^a$, $\pi^g$ and $\pi^{RR}$ maximize the ratio of serviced requests in both regimes where $\lambda \le 0.4$ and $\lambda \ge 1.6$, whereas for $0.4 \le \lambda \le 1.6$, $\pi^g$ maximizes the ratio of serviced requests. Another interesting observation is that for $\lambda \le 0.7$, the ratio of serviced requests achieved by $\pi^g$ is constant and equal to 1, whereas the reward-rate achieved by $\pi^g$ decreases significantly in that range. This behavior is due to the fact that $\pi^g$ maximizes the number of serviced requests at the expense of scheduling requests closer to the end of their active windows. This explains why $\pi^g$ does not maximize the reward rate in low-load systems.
On the other hand, $\pi^a$ achieves the largest reward-rate among all proposed causal policies for $\lambda \le 2.1$, even though it decreases linearly as $\lambda$ increases. The linear decrease in the reward-rate is explained by the fact that $\pi^a$ may reschedule active requests in slots closer to their end times in an attempt to maximize the cumulative reward, which may come at the expense of servicing requests that have not been scheduled yet.
Finally, for $\lambda \ge 2.1$, both $\pi^a$ and $\pi^g$ achieve the largest reward-rate, close to the maximal achievable reward-rate. This is aligned with our previous analysis suggesting that while $\pi^a$ is superior in systems with low loads, $\pi^a$ and $\pi^g$ achieve similar performance in systems with high request arrival rates.

Figure 3: Ratio of serviced requests and reward-rate when the reward is convex, $\lambda = 1$ and $w_{\max}$ is increasing.

On the impact of increased update window flexibility
We fix $\lambda = 1$ and increase $w_{\max}$ from 2 to 40. The results are shown in Figure 3. A first observation is that as $w_{\max}$ increases, the ratio of serviced requests under $\pi^a$, $\pi^g$ and $\pi^{RR}$ increases because of the additional flexibility in scheduling requests in more slots. For $w_{\max} \le 3$, we observe that $\pi^a$ and $\pi^g$ achieve the same reward-rate. For $w_{\max} \ge 4$, $\pi^a$ achieves a higher reward-rate than any of the other proposed causal policies. An interesting observation is that the reward-rate achieved by $\pi^g$ clearly decreases as $w_{\max}$ increases. Two factors concurrently lead to this phenomenon: first, for $w_{\max} \ge 4$, the ratio of serviced requests under $\pi^g$ increases only slowly as $w_{\max}$ increases; second, the minimal reward obtained from servicing a request decreases as $w_{\max}$ increases. That said, and in line with our previous discussions, $\pi^a$ is superior to $\pi^g$ in such settings with a fixed arrival rate but requests with long active windows, since it allows for more flexibility in scheduling updates as close as possible to the requests' end times, whereas $\pi^g$ is driven towards maximizing the ratio of scheduled requests and gives less priority to scheduling those requests closer to their end times.

CONCLUSION
In this paper we have developed a model where updates are required by requests prior to making timely decisions regarding the time-varying processes they are interested in. Requests can receive updates only within time windows of finite length. A key aspect is the design of a reward function that captures the importance of scheduling the freshest update transmission to a request as close as possible to the decision time, as well as of scheduling policies that achieve high rewards in adversarial settings. A key part of our future work is to extend our model to a real-time information market that includes multiple servers and allows for multiple update transmissions per slot by matching requests to servers through efficient algorithms.

where $f(\cdot)$ is a non-increasing, upper-bounded function of $d^{\pi}_i$. Scheduling an update for an active request $i$ under a policy $\pi$ in a slot close to its end time $e_i$ results in a smaller $e_i - \max_{t \in [s_i, e_i]} x^{\pi}_{i,t}\, t$ and thus in a larger reward $r^{\pi}_i$. The cumulative reward at slot $t$ under policy $\pi$ is denoted by

Definition 9. (A request's tentative reward under policy $\pi^a$) The tentative reward $\hat{r}^{\pi^a}_{i,t,t'}$ of an active request $i \in Q_t$ on slot $t$ is a function of the slots in $T^{\pi^a}_{i,t-1}$ in which $i$ was actually scheduled prior to $t$ under $\pi^a$ and the slots $T^{\pi^a}_{i,t,t'}$ in which $i$ is tentatively scheduled after $t' \in [t, e_i]$ under $\pi^a$, and is given by

Figure 1: Scheduling and tentatively scheduling under policy $\pi^a$ of requests with maximal update window of $w_{\max} = 3$ and receiving a linear reward as defined in Definition 3, with $\alpha = 1$, $\beta = 0.1$, $c = 0$.

Figure 2: Ratio of serviced requests and reward-rate when the reward is convex, $w_{\max} = 30$ and $\lambda$ is increasing.
Figure 3 panels: ratio of serviced requests vs. $w_{\max}$; reward-rate vs. $w_{\max}$.
Let $\boldsymbol{\rho} = (\rho_i)_{i \in N}$ denote the sequence of request arrivals, where the tuple $\rho_i = (a_i, s_i, e_i)$ is the $i$-th request, characterized by its arrival time $a_i$, its release (start) time $s_i$, and its end time $e_i$.
Definition 1. (Servicing a request) We say that a policy $\pi$ has serviced a request $i$ if $i$ is no longer active and $\pi$ scheduled one or more updates for $i$ within its active window $[s_i, e_i]$.
We propose a causal scheduling policy $\pi^a$, which we shall refer to as the adversarial policy. During every slot $t$, $\pi^a$ assigns to every slot in the interval $[t, \max_{i \in Q_t} e_i]$ a single request that is active during this slot. We say that at time $t$, $\pi^a$ tentatively schedules updates for active requests in slots within the interval $[t, \max_{i \in Q_t} e_i]$. By the end of slot $t$, an update is sent to the request scheduled in $t$, while the remaining slots in $(t, \max_{i \in Q_t} e_i]$ are freed from any tentative schedules.
Definition 8. (Tentatively scheduling updates for an active request) Policy $\pi^a$ tentatively schedules updates to an active request $i \in Q_t$ if during slot $t$ it assigns slots within the interval $[t, e_i]$ to potentially transmit updates to request $i$ in those slots.
We define additional notation specific to $\pi^a$. $T^{\pi^a}_{i,t,t'}$ is the set of time slots within the interval $[t', e_i]$, for any $t' \in [t, e_i]$, in which active request $i$ is tentatively scheduled.