A Control Architecture for Entanglement Generation Switches in Quantum Networks

Entanglement between quantum network nodes is often produced using intermediary devices - such as heralding stations - as a resource. When scaling quantum networks to many nodes, requiring a dedicated intermediary device for every pair of nodes introduces high costs. Here, we propose a cost-effective architecture to connect many quantum network nodes via a central quantum network hub called an Entanglement Generation Switch (EGS). The EGS allows multiple quantum nodes to be connected at a fixed resource cost, by sharing the resources needed to make entanglement. We propose an algorithm called the Rate Control Protocol (RCP) which moderates the level of competition for access to the hub's resources between sets of users. We proceed to prove a convergence theorem for rates yielded by the algorithm. To derive the algorithm we work in the framework of Network Utility Maximization (NUM) and make use of the theory of Lagrange multipliers and Lagrangian duality. Our EGS architecture lays the groundwork for developing control architectures compatible with other types of quantum network hubs as well as system models of greater complexity.


I. INTRODUCTION
A quantum network enables radically new capabilities that are provably impossible to attain in any classical network [1]. Examples include applications such as secure communication [2], [3], secure quantum computing in the cloud [4], [5], and clock synchronization [6]. Users utilize the end nodes of a network to run applications. The key to unlocking widespread roll-out of these applications is the ability to produce entanglement between these end nodes.
Prevalent methods for generating entanglement between two quantum nodes that are directly connected by a quantum communication medium (e.g., optical fibers) involve an intermediate device. A prime example is heralded entanglement generation [7], [8], in which the intermediary device is a so-called heralding station. This method of producing entanglement has been demonstrated successfully on many experimental platforms, including color centers [9], [10], ion traps [11], [12], atomic ensembles [13], [14], and neutral atoms [15]. As quantum networks continue to scale, it becomes increasingly impractical to maintain direct fiber connections and dedicated heralding stations for every pair of end nodes.
To address this challenge, we propose a scalable quantum network architecture for an Entanglement Generation Switch (EGS), a central hub equipped with a limited number of intermediate devices called resources, a switch, and a processor responsible for managing a scheduling algorithm and sending classical messages to nodes. This central hub enables multiple nodes to share the intermediate devices, significantly reducing the complexity and total resources required for large-scale deployment. While our results apply to an EGS sharing any type of entanglement generation resource, a specific example illustrates how an EGS can operate. Consider quantum network nodes that generate entanglement between them using the so-called single-click bipartite entanglement generation protocol (see, e.g., [10]). In this case the resources to be shared are the heralding stations. Such a station consists of two input channels connected to a 50/50 beam splitter, which is connected by two output channels to a pair of photon detectors, each of which is connected to a device for processing the measurement outcomes, such as a Field Programmable Gate Array (FPGA). The basic principle of the single-click protocol requires that each network node of the pair locally generates entanglement between a qubit in its local memory and a travelling photon. Each photon is sent to a heralding station, at which an entanglement swap is attempted on the two photons received; if the entanglement swap is successful, the qubits of the two network nodes become entangled. An EGS aims to share one or more heralding stations amongst many connected network nodes. These nodes still run the single-click protocol, but are limited to using the heralding station in the time allocated to them by the EGS.
A crucial challenge in implementing such an architecture is the efficient allocation of the central hub's resources to different pairs of users in distinct time slots. Similar to classical networking, the allocation process should be driven by user demand for network resources. In the context of quantum networks, this translates to the demand of a user pair $(u_i, u_j)$ for entanglement generation at a specific rate or fidelity. Given a set of user demands, the EGS must compose a schedule for the allocation of resources in order to service those demands. In general, the total demand of users may exceed the available resources at the central hub, leading to scheduling and resource allocation challenges.
Here, we introduce the first algorithm for regulating user demand to an EGS, thereby solving this key challenge. Specifically, the algorithm takes as input a vector of rates of entanglement generation demanded by pairs of users and outputs an updated rate vector. The current set of user-originated demands is a measure of competition for EGS resources. We construct the algorithm within the Network Utility Maximization (NUM) framework, wherein the problem of demand regulation is cast as a constrained optimization problem. To solve the problem, we derive the algorithm using the theory of Lagrange multipliers and Lagrangian duality. These tools, respectively, make it possible to combine the constraints with the objective of the optimization problem and to solve for the unknown parameter vector of the combined problem. Regulating competition for the resources by modifying user demand makes it possible to enforce a notion of fairness in the allocation of resources and to maximize resource utilization. Since the algorithm regulates competition by calculating the rates demanded by users, we call it the Rate Control Protocol (RCP).

A. RESULTS SUMMARY
We make the following contributions:
• We characterize (Theorem II.1) the capacity region of the EGS, which is the maximal set of rates at which users can demand entanglement generation such that there exists a scheduling policy under which, on average, the demanded rates do not exceed the delivered rates. The impact of specifying the capacity region is that it delineates which rates can feasibly be serviced by the EGS.
• We prove (Theorem II.1) that under the Maximum Weight Scheduling policy (Definition II.6) for resource allocation it is possible for the EGS to deliver average rates of entanglement generation that match the requested rates, for any rate vector from within the capacity region. Therefore, an EGS operated with this scheduling policy can achieve throughput optimality as long as the rates demanded by users lie within the capacity region. To prove the theorem, we use the Lyapunov stability theory of Markov chains.
• We derive the RCP, an algorithm to regulate the rates of bipartite entanglement generation which pairs of users demand from an EGS. The RCP solves the problem of moderating user competition for EGS resources. The derivation is based on techniques from Network Utility Maximization (NUM) and its quantum network extension (QNUM), where resource allocation in a (quantum) network is modelled as an optimization problem that can be solved using methods from convex optimization theory.
• We prove (Theorem III.1) that the sequence of arrival rate vectors yielded by the RCP converges over time slots to an optimum value, given any feasible rate vector as initial condition. The significance of this result is that if the RCP is used to set the demand rates of entanglement generation over a series of time slots, the set of demanded rates will approach an optimal value, as long as the initial rate vector supplied to the algorithm is feasible. The proof relies on Lagrange multipliers and Lagrangian duality theory.
• Finally, we supply numerical results that support our analysis.

B. RELATED WORK
A quantum network hub that can locally store at least one qubit per linked node and distributes entanglement across these links has been studied in [16], [17]. We refer to such a hub as an Entanglement Distribution Switch (EDS). This system differs from ours because its central hub has qubits and/or quantum memories, whereas ours does not. In [16] the focus is on assessing the EDS performance in terms of the rate at which it creates $n$-partite entanglement, and in [17] the possible rate/fidelity combinations of GHZ states that may be supplied by an EDS are studied.

Maximum Weight scheduling is a type of solution to the problem of resource allocation which is based on assigning resources to sets of users with the largest service backlog. A Maximum Weight scheduling policy was originally presented in [18] for resource allocation in classical communication networks and was adapted to the analysis of a single switch for classical networking in [19], where it was shown that under this scheduling policy the set of request arrival rates matches the request departure rates (in other words, the policy stabilizes the switch for all feasible arrival rates). In [20] the capacity region of an EDS, defined as the set of arrival rates of requests for end-to-end multi-partite entanglement that stabilize the switch, is first characterized. Using the Lyapunov stability theory of Markov chains, a Maximum Weight scheduling policy is proposed and shown to stabilize the switch for all arrival rates within the capacity region. To summarize, in each of the classical network settings and in the EDS setting a Maximum Weight scheduling policy has the merit of achieving a specified performance metric. None of these results are immediately applicable to our system. We demonstrate that such a policy achieves the performance metric of throughput optimality when applied to the EGS by first characterizing the capacity region of the EGS, which has not been done before, and then proving that a Maximum Weight scheduling policy also achieves throughput optimality in our system.
These results on the analysis of EDS systems constitute the first analytic approaches to resource allocation by a quantum network hub. However, due to the assumption that an EDS locally controls some number of qubits per link, the system has a high technical implementation cost which may not be compatible with near-term quantum networks. Moreover, although these works assume that there is competition between multiple sets of users, the focus is purely on the capacity of the EDS system. Conversely, our analytic contributions apply to EGS quantum network hubs, which have a low technical implementation cost because the hub does not require local control of any qubits or quantum memory. Furthermore, our results extend beyond the analysis of the capacity of the EGS and we propose the RCP as a solution to the problem of moderating competition for the EGS resources.
In [21], a quantum network topology is studied where user-controlled nodes are connected through a hub known as a Qonnector. The Qonnector provides the necessary hardware for limited end nodes to execute applications in pairs or small groups. A potential configuration of the Qonnector is as an EGS. While [21] focuses on assessing the performance of certain applications in this topology, it does not address control policies for the system. In contrast, our work examines control policies for an EGS.
NUM was first introduced in [22] and has been widely used to develop and analyze control policies for classical networks [23]. It is a powerful framework for designing and analyzing communication protocols in classical networks wherein the problem of allocating resources amongst competing sets of users is cast as a constrained optimization problem. This framework was recently extended to QNUM by [24]. Therein, the authors first develop three performance metrics and use them to catalogue the utility of resource allocation in a quantum network model where each link is associated with a rate and fidelity of entanglement delivery to communicating users. This work does not immediately extend to control policies, as the resource allocations investigated are based on static numerical optimization and need to be recalculated in response to changes in the constraints or sets of users.
In classical networks, probabilistic failures such as loss of a message during transmission or irreconcilable distortion due to transmission over a noisy channel may occur. A serious challenge introduced in the analysis of quantum networks is that, in addition to the failure modes of a classical network, several new probabilistic failure modes arise that are independent of the state of the network but nevertheless affect its ability to satisfy demands. An example is the probabilistic success in practical realizations of heralded entanglement generation [9]-[15]. Due to this failure mode, scheduling access to a resource at a certain rate does not guarantee entanglement generation at that rate, thereby complicating the analysis of scheduling.
It is important to distinguish between the concept of rates in classical network control protocols and the notion of rate in the model of a quantum network hub presented here. In classical networks, users transmit data at some rate and classical network control protocols, such as the Transmission Control Protocol (TCP), regulate the rate at which users send their data [23]. In contrast, in our quantum network hub model, users demand a rate of entanglement generation. However, a significant challenge in developing a control protocol for the EGS is the difference between the rate of attempted entanglement generation and the rate at which entanglement delivery is demanded and delivered to users. Explicitly, in the RCP it is the desired rates of entanglement generation that serve as the controllable parameters moderated by the protocol.

II. PRELIMINARIES
Operation of the EGS requires interactions between the set of quantum network nodes $U$ and the EGS processor with control over $R$ resources. See Fig. 1 a) for an overview of the physical architecture. Below we delineate the process by which pairs of nodes may request (Fig. 1 d) and receive (Fig. 1 b and d) resource allocations from the processor. We assume:
• the EGS operates in a fixed-duration time-slotted system, where $t_n$ denotes the $n$th time slot;
• timing synchronization between the processor and each node is continuously managed by classical control electronics at the physical layer;
• allocation of a single resource to communication session $s$ for one time slot allows for the creation of a maximum of one entangled pair, with success probability $p_{gen}$.
A consistent physical model involves a batched sequence of attempts, which can be terminated upon the successful creation of an entangled pair or at the end of the time slot. See Fig. 1 c) for an example quantum communication sequence compatible with heralded entanglement generation.
The classical communication sequence repeated in each time slot $t_n$, which governs resource allocation, is summarized in Fig. 1 d). In what follows we introduce and explain each step of this communication sequence. The notation introduced throughout this section is summarized in Table 1.

A. DEMANDS FOR RESOURCE ALLOCATION FROM NODES TO THE EGS PROCESSOR
Definition II.1 (Target Rate, Communication Session). Each possible pair of nodes has the potential to require shared bipartite entanglement. To fulfill this need, a node pair $(u_i, u_j)$ requires the processor to allocate a resource. The node pair sets a target rate $\lambda_{(i,j)}(t_n)$ once per time slot, which represents the average number of entangled pairs per time slot they aim to generate using one or more EGS resources. A distinct pair of nodes with a non-zero target rate is referred to as a communication session and is associated with a unique communication session ID, $s$. The set of communication session IDs, $S$, is defined as follows:
$$S := \{ s = (u_i, u_j) : 1 \le i < j \le N,\ \lambda_{(i,j)} > 0 \},$$
where $N = |U|$ is the total number of network nodes with connections to the EGS.
Henceforth each pair of nodes will be identified by its communication session ID $s$. The target rates of all communication sessions in time slot $t_n$ can be written as a vector
$$\lambda(t_n) = (\lambda_s(t_n)\ \forall s \in S).$$
A rate of entanglement generation is the service demanded by each communication session from the EGS. To address the difference between the desired rate and the rate at which a communication session requires resource allocation to achieve that rate, we establish the following model for demand, which is compatible with a discrete-time scheduling policy.
Definition II.2 (Demand). Demands for resources are requests made by communication session $s$ to obtain a single entangled pair. The number of demands $a_s(t_n)$ submitted by session $s$ at time slot $t_n$ depends on its target rate $\lambda_s(t_n)$. If $\lambda_s(t_n) > 1$, then communication session $s$ first submits $\lfloor \lambda_s(t_n) \rfloor$ demands. For a communication session $s$ with $0 \le \lambda_s(t_n) \le 1$, or to account for the remaining part of the rate for any session with $\lambda_s(t_n) > 1$, each communication session randomly generates demands by sampling from a Bernoulli distribution with mean $\lambda_s(t_n) - \lfloor \lambda_s(t_n) \rfloor$, so that in general the submitted demands satisfy a (shifted) Bernoulli distribution,
$$a_s(t_n) - \lfloor \lambda_s(t_n) \rfloor \sim \mathrm{Bernoulli}\big(\lambda_s(t_n) - \lfloor \lambda_s(t_n) \rfloor\big).$$

Definition II.3 (Designated Communication Node, Secondary Node). One of the nodes of every communication session is marked as the designated communication node for communicating the entanglement requests to the switch. The terms designated communication node and secondary node are used to refer to the two nodes of a communication session.
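The demand model of Definition II.2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `sample_demands` and the use of Python's `random.Random` are assumptions made here, and a rate that is exactly an integer is treated as that many deterministic demands.

```python
import math
import random


def sample_demands(target_rate, rng):
    """Number of demands a session submits in one time slot.

    The integer part of the target rate is always submitted; one extra
    demand is added with probability equal to the fractional part, so the
    expected number of demands equals the target rate.
    """
    if target_rate < 0:
        raise ValueError("target rate must be non-negative")
    base = math.floor(target_rate)
    fractional = target_rate - base
    # rng.random() lies in [0, 1), so a zero fractional part never adds a demand.
    extra = 1 if rng.random() < fractional else 0
    return base + extra


rng = random.Random(0)  # seeded only so that the sketch is reproducible
samples = [sample_demands(1.3, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)  # close to the target rate 1.3
```

Averaged over many slots, the number of submitted demands approaches the target rate, which is the property the scheduling analysis relies on.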

B. PROCESSING DEMANDS FOR RESOURCE ALLOCATION
Definition II.4 (Queue). When the processor receives a demand, it is added to one of $|S|$ queues, one for each communication session. The set of demands received by the processor by time slot $t_n$ and not yet satisfied is captured by the queue vector $q(t_n) = (q_s(t_n)\ \forall s) \in \mathbb{N}^{|S|}$, where the component $q_s(t_n)$ is the queue of communication session $s$ at time $t_n$. Each queue processes demands in first-in, first-out order. As all demands are identical, we interchangeably use $q_s(t_n)$ to refer to both the queue length of communication session $s$ in time slot $t_n$ and the queue itself.
Definition II.5 ((Demand-Based) Schedule). A resource allocation schedule is a vector $M(t_{n+1}) \in \mathbb{N}^{|S|}$ calculated by the EGS processor in time slot $t_n$, determining the assignment of the resources for time slot $t_{n+1}$. A single session $s$ may be allocated the use of multiple resources, up to a maximum number $x_s$ set by the EGS which does not exceed $R$, the total number of resources controlled by the EGS. For every session $s \in S$ the entry $M_s(t_{n+1})$ corresponds to the number of resources assigned to $s$ for the entire duration of time slot $t_{n+1}$. A demand-based schedule is based on the vector of all queues, $q(t_n)$, as it stands before new demands are registered in $t_n$, and satisfies
$$\sum_{s \in S} M_s(t_{n+1}) \le R, \tag{2}$$
$$M_s(t_{n+1}) \le x_s \quad \forall s \in S, \tag{3}$$
$$M_s(t_{n+1}) \le q_s(t_n) \quad \forall s \in S. \tag{4}$$
Each node of a communication session $s$ requires a physical connection to the EGS switch. A single physical connection, such as an optical fiber, can be used for this purpose. To enable multiple connections between a node and the switch, options include the use of optical multiplexers over a single fiber or utilizing multiple fibers within a fiber bundle. The parameters $x_s\ \forall s$ are motivated by situations where the number of physical connections that can be dedicated to servicing communication session $s$ is limited.
Definition II.6 (Maximum Weight Scheduling). The set $\mathcal{M}$ of feasible demand-based schedules at time slot $t_n$ contains all vectors $M'(t_{n+1}) \in \mathbb{N}^{|S|}$ satisfying (2), (3), and (4). The EGS processor selects a maximum weight schedule $M(t_{n+1}) \in \mathcal{M}$ from the feasible schedules for the following time slot by solving for
$$M(t_{n+1}) \in \arg\max_{M' \in \mathcal{M}} \sum_{s \in S} q_s(t_n)\, M'_s(t_{n+1}).$$
In words, the schedule is selected from the set of feasible schedules by first solving for the subset of schedules that allocate resources to the sessions with the largest number of queued demands. If that subset contains more than one schedule, a schedule is randomly selected from the subset.
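As a rough illustration of Definition II.6, the following Python sketch computes one maximum weight schedule greedily. It assumes the feasibility constraints are a per-session cap $x_s$, a total of $R$ resources, and at most one resource per queued demand; under those assumptions, serving the longest queues first maximizes the queue-weighted sum. The function name `max_weight_schedule` and the shuffle-based tie-breaking are choices made for this sketch, and the tie-breaking is not claimed to be uniform over all maximizing schedules.

```python
import random


def max_weight_schedule(queues, x_max, num_resources, rng=None):
    """Greedy maximum weight schedule for one time slot.

    queues[s]      : number of queued demands q_s of session s
    x_max[s]       : per-session cap x_s on allocated resources
    num_resources  : total number of resources R
    Returns a list M with M[s] <= min(x_max[s], queues[s]) and sum(M) <= R
    that maximizes sum_s queues[s] * M[s].
    """
    rng = rng or random.Random()
    order = list(range(len(queues)))
    rng.shuffle(order)  # random order among sessions with equal queues
    # Python's sort is stable, so the shuffled order survives among ties.
    order.sort(key=lambda s: queues[s], reverse=True)

    schedule = [0] * len(queues)
    remaining = num_resources
    for s in order:
        if remaining == 0:
            break
        grant = min(x_max[s], queues[s], remaining)
        schedule[s] = grant
        remaining -= grant
    return schedule


one_each = max_weight_schedule([5, 0, 2, 7], [1, 1, 1, 1], 2, random.Random(0))
two_each = max_weight_schedule([5, 0, 2, 7], [2, 2, 2, 2], 3, random.Random(0))
```

With at most one resource per session, the two longest queues are served; with a cap of two, the longest queue takes two resources and the next longest takes the remaining one.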
By the end of $t_n$, the schedule for $t_{n+1}$ has been computed by the processor and broadcast to the nodes. If the schedule allocates use of a resource to communication session $s$ for $t_{n+1}$, the users of $s$ utilize the allocated resource to make a batch of entanglement generation attempts over the duration of $t_{n+1}$. The demand at the front of queue $s$ is only marked as served once both a resource has been allocated and the users of $s$ have successfully generated entanglement. Hence the dynamics of each queue are given by
$$q_s(t_{n+1}) = \big[ q_s(t_n) + a_s(t_n) - g_s(t_n) \big]^+, \tag{6}$$
where $[z]^+ = \max(z, 0)$, and $g_s(t_n)$ is the number of successfully generated entangled pairs by $s$ during $t_n$. In words, in every subsequent time slot the demands that arrived in the previous time slot are added to the queue and those that were scheduled and successfully resulted in the generation of an entangled pair are removed from the queue. The updated queue is always of non-negative length since the number of successfully generated entangled pairs is a sample of a binomial random variable where the number of trials is the number of resources allocated to $s$, $M_s(t_n) \le q_s(t_n)$, and the trial success probability is $p_{gen}$,
$$g_s(t_n) \sim \mathrm{Bin}\big(M_s(t_n), p_{gen}\big).$$

Definition II.7 (Supportable rate). The arrival rate vector $\lambda(t_n)$ is supportable if there exists a schedule under which
$$\lim_{c \to \infty} \lim_{n \to \infty} \Pr\big( |q(t_n)| \ge c \big) = 0,$$
where $|q(t_n)| = \sum_{s \in S} q_s(t_n)$ is the sum of the number of demands in the queue of each session in time slot $t_n$. That is, $\lambda(t_n)$ is supportable if the probability that the total queue length becomes infinite is zero.
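The queue update described above (arrivals added, successful generations removed, result clamped at zero) can be sketched as follows. This is a minimal sketch for a single session: the function name `queue_step` is hypothetical, the allocation for the slot is taken as an input rather than computed, and each allocated resource is assumed to succeed independently with probability $p_{gen}$.

```python
import random


def queue_step(queue, arrivals, allocated, p_gen, rng):
    """One time slot of the queue update for a single session.

    queue     : demands waiting at the start of the slot
    arrivals  : new demands registered during the slot
    allocated : number of resources assigned to the session for the slot
    p_gen     : probability that one allocated resource yields a pair
    Returns (next_queue, successes).
    """
    # Binomial sample: each allocated resource succeeds independently.
    successes = sum(1 for _ in range(allocated) if rng.random() < p_gen)
    # [z]^+ = max(z, 0) keeps the queue length non-negative.
    next_queue = max(queue + arrivals - successes, 0)
    return next_queue, successes


rng = random.Random(0)
always = queue_step(3, 1, 2, 1.0, rng)  # every attempt succeeds
never = queue_step(3, 1, 2, 0.0, rng)   # no attempt succeeds
```

Iterating this step together with a scheduling rule gives a simple way to observe whether the total queue length stays bounded for a given rate vector.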
Definition II.8 (Capacity Region). The capacity region of an EGS is the set of arrival rate vectors that are supportable by the EGS. For each rate vector $\lambda$ in the capacity region, there exists some scheduling routine such that an EGS operating under that scheduling algorithm can support the rate vector $\lambda$.
If the rate vector $\lambda$ falls outside the capacity region, the EGS cannot support it under any scheduling algorithm, leading to unpredictable performance. The goal of moderating the rate vector through the Rate Control Protocol (RCP) is twofold: first, to keep it within the capacity region, and second, to maximize resource utilization by saturating the capacity region, thus fully leveraging the potential of the EGS to facilitate entanglement generation.
Theorem II.1 (Capacity Region). Let $x_s$ be the maximum number of resources that can be allocated to a session $s$ per time slot. For each resource, $p_{gen}$ is the probability that a communication session allocated the resource for one time slot will successfully create an entangled pair. The capacity region of an EGS with $R$ resources is the set of rate vectors $\lambda \in \mathrm{Int}\, C$, where $C$ is defined as
$$C = \Big\{ \lambda : \lambda \ge 0,\ \sum_{s \in S} \lambda_s \le \lambda_{EGS},\ \lambda_s \le \lambda^{max}_{gen,s}\ \forall s \in S \Big\},$$
with $\lambda_{EGS} = R \cdot p_{gen}$ and $\lambda^{max}_{gen,s} = x_s \cdot p_{gen}$. Moreover, maximum weight scheduling (Definition II.6) is throughput optimal and supports any rate vector $\lambda \in \mathrm{Int}\, C$. For proof, see Section V-A2.
The first requirement of $C$ states that all request rate vectors must be non-negative, meaning every component of the rate vector must be positive or zero ($\lambda \ge 0 \Leftrightarrow \lambda_s \ge 0\ \forall s \in S$). The second requirement enforces that the total rate of entanglement requested from the EGS, $\sum_s \lambda_s$, cannot exceed the total average service rate of the EGS, $R \cdot p_{gen}$. The final requirement states that the request rate $\lambda_s$ of any communication session $s$ must not exceed the maximum average service rate that can be allocated to the communication session, $x_s \cdot p_{gen}$.
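The three requirements can be checked directly for a candidate rate vector. The Python sketch below is illustrative only; the function name `in_capacity_region` and the `interior` flag are assumptions of this sketch. With `interior=True` all inequalities are taken as strict, which corresponds to membership in the interior of $C$.

```python
def in_capacity_region(rates, x_max, num_resources, p_gen, interior=True):
    """Check the three requirements of C for a rate vector.

    rates[s] : requested rate of session s
    x_max[s] : per-session resource cap x_s
    With interior=True every inequality is strict (interior of C).
    """
    def below(a, b):
        return a < b if interior else a <= b

    lambda_egs = num_resources * p_gen
    non_negative = all(below(0.0, r) for r in rates)
    total_ok = below(sum(rates), lambda_egs)
    per_session_ok = all(below(r, x * p_gen) for r, x in zip(rates, x_max))
    return non_negative and total_ok and per_session_ok


# Example: R = 3 resources, p_gen = 0.05, at most one resource per session.
inside = in_capacity_region([0.04, 0.04, 0.04], [1, 1, 1], 3, 0.05)
too_fast = in_capacity_region([0.06, 0.01, 0.01], [1, 1, 1], 3, 0.05)
too_many = in_capacity_region([0.04] * 4, [1] * 4, 3, 0.05)
```

In the example, the first vector lies in the interior, the second violates the per-session limit $x_s \cdot p_{gen}$, and the third exceeds the total service rate $R \cdot p_{gen}$.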

C. CONSTRAINTS
We assume that there are two types of constraints on the sequence of target rates set by a session. The first is a minimum rate of entanglement generation $\lambda^{min}_s$; below this rate, session $s$ cannot obtain sufficient entangled pairs within a short enough period of time to enable its target application. The second constraint, $\lambda_u\ \forall u \in U$, is an upper limit on the rate at which each node $u$ can generate and/or make use of entanglement across all of the sessions that it is involved in. This parameter can capture a range of technical limitations of the quantum nodes, including a limited rate of entanglement generation or a limited speed of writing generated entanglement to memory, which temporarily decreases the availability of the node for engaging in further entanglement generation immediately following the successful production of a pair.

III. RCP ALGORITHM
An algorithm moderating competition for EGS resources makes it possible to introduce a notion of fairness in how resources are allocated amongst competing communication sessions and to ensure that the resources are fully utilized. We consider a situation where the rate vector produced by any such algorithm is constrained by the maximum service rate of the switch, as described by the capacity region $C$, as well as the node or user level constraints described by $\lambda_u\ \forall u$ and $\lambda^{min}_s\ \forall s$. In the framework of NUM, we pose an optimization problem where each communication session $s$ is associated with a utility function $f_s(\lambda_s(t_n)) : \mathbb{R} \to \mathbb{R}$, which encodes the benefit $s$ derives from the rate vector $\lambda(t_n)$. We apply the theory of Lagrange multipliers and Lagrangian duality (see [25] for detailed coverage) to formulate and analyze the optimization problem. We then derive the RCP (Algorithm 1) as the solution to this problem.
The primal problem is to maximize the aggregate utility, that is, the total benefit that users derive from the EGS, by maximizing the sum of the utility functions, with the constraints included through Lagrange multipliers. The dual problem is to determine an optimal vector of Lagrange multipliers. In the case where there is no duality gap [25], a solution to the dual problem is equivalent to a solution of the primal problem. The vector of Lagrange multipliers $p = (p_c, p_u\ \forall u) \in \mathbb{R}_+^{1+N}$, with components for the processor and each node, is denoted as the price vector in our algorithm and serves as a measure of the competition for resources amongst the communication sessions. Define $S(u) := \{s : u \in s\} \subseteq S$ to be the subset of communication sessions in which node $u$ participates. In each communication session one node is designated to communicate demand to the switch and the other node is secondary (see Definition II.3). Note that $u \in s \Leftrightarrow s \in S(u)$. The feasible rate region of the communication session $s$ is
$$\Lambda_s := [\lambda^{min}_s, \lambda^{max}_{gen,s}], \tag{9}$$
and the feasible region for a rate vector $\lambda$ is
$$\Lambda := \{ \lambda : \lambda_s \in \Lambda_s\ \forall s \in S \}. \tag{10}$$
We make the following two assumptions on the utility function $f_s$ of each communication session $s$:
A1: On the interval $\Lambda_s = [\lambda^{min}_s, \lambda^{max}_{gen,s}]$ the utility functions $f_s$ are increasing, strictly concave, and twice continuously differentiable;
A2: The curvatures of all $f_s$ are bounded away from zero on $\Lambda_s$. For some constant $\alpha_s > 0$,
$$-f_s''(\lambda_s) \ge \frac{1}{\alpha_s} > 0 \quad \forall \lambda_s \in \Lambda_s.$$
To ensure feasibility and satisfy the Slater constraint qualification [25], in addition to assumptions A1 and A2 it is necessary that the rate vector with components equal to the minimal rates of each communication session is an interior point of the constraint set,
$$\sum_{s \in S} \lambda^{min}_s < \lambda_{EGS}, \tag{11}$$
$$\sum_{s \in S(u)} \lambda^{min}_s < \lambda_u \quad \forall u \in U. \tag{12}$$

Algorithm 1: Rate Control Protocol (RCP)
Processor's Algorithm: At times $t_n = 1, 2, \dots$, the processor:
1) receives rates $\lambda_s(t_n)$ from all communication sessions $s \in S$;
2) computes a new central price,
$$p_c(t_{n+1}) = \Big[ p_c(t_n) + \theta_c \Big( \sum_{s \in S} \lambda_s(t_n) - \lambda_{EGS} \Big) \Big]^+,$$
where $\theta_c$ is a constant step-size for the central price;
3) broadcasts the new central price $p_c(t_{n+1})$ to all communication sessions $s \in S$.
Network Node $u$'s Algorithm:
1) marks the subset of communication sessions $\mathrm{COMM}(u) \subseteq S(u)$ involving node $u$ for which it is the designated communication node;
2) receives from every secondary node $u'$ the price $p_{u'}(t_n)$ for each communication session $s = (u, u') \in \mathrm{COMM}(u)$;
3) computes a new node price,
$$p_u(t_{n+1}) = \Big[ p_u(t_n) + \theta_u \Big( \sum_{s \in S(u)} \lambda_s(t_n) - \lambda_u \Big) \Big]^+,$$
where $\theta_u\ \forall u$ is a constant step-size for each node, which may be common to all nodes or differ from node to node;
4) communicates the new price $p_u(t_{n+1})$ to the communication node from every communication session $s \in S(u) \setminus \mathrm{COMM}(u)$ in which $u$ is a secondary node;
5) receives from the switch the central price $p_c(t_{n+1})$;
6) computes the new rate for every communication session $s \in \mathrm{COMM}(u)$,
$$\lambda_s(t_{n+1}) = \Big[ (f_s')^{-1}\Big( p_c(t_{n+1}) + \sum_{u \in s} p_u(t_{n+1}) \Big) \Big]^{\lambda^{max}_{gen,s}}_{\lambda^{min}_s},$$
where $[z]^M_m = \max(\min(z, M), m)$ and $p(t_i) = (p_c(t_i), p_u(t_i)\ \forall u)$ is the vector of prices pertaining to time slot $t_i$;
7) communicates the new rate $\lambda_s(t_{n+1})$ to the EGS processor, for every communication session $s \in \mathrm{COMM}(u)$.
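To make the message flow of Algorithm 1 concrete, here is a minimal centralized Python sketch of one RCP iteration. It rests on several assumptions that are not taken from the algorithm listing: log utilities $f_s = \log \lambda_s$ (so the inverse derivative at price $p$ is $1/p$), price updates of the standard projected-gradient form (old price plus step-size times excess demand, clamped at zero), and a rate update that uses the freshly computed prices. The names `rcp_step` and `clamp` and all numerical values are hypothetical.

```python
def clamp(z, low, high):
    """[z]^M_m = max(min(z, M), m)."""
    return max(min(z, high), low)


def rcp_step(rates, p_c, p_node, sessions, lam_egs, lam_node,
             lam_min, lam_max, theta_c, theta_node):
    """One RCP iteration for log utilities (assumed form of the updates).

    sessions : list of (u, v) node pairs, one per session
    p_node, lam_node, theta_node : dicts keyed by node
    Returns (new_rates, new_p_c, new_p_node).
    """
    # Processor: the central price tracks total demand minus R * p_gen.
    new_p_c = max(p_c + theta_c * (sum(rates) - lam_egs), 0.0)

    # Nodes: each price tracks the node's total demand minus its limit.
    load = {u: 0.0 for u in p_node}
    for (u, v), rate in zip(sessions, rates):
        load[u] += rate
        load[v] += rate
    new_p_node = {
        u: max(p_node[u] + theta_node[u] * (load[u] - lam_node[u]), 0.0)
        for u in p_node
    }

    # Rates: invert f_s'(lambda) = 1 / lambda at the summed price, then clamp.
    new_rates = []
    for s, (u, v) in enumerate(sessions):
        price = new_p_c + new_p_node[u] + new_p_node[v]
        unclamped = 1.0 / price if price > 0 else float("inf")
        new_rates.append(clamp(unclamped, lam_min[s], lam_max[s]))
    return new_rates, new_p_c, new_p_node


# Hypothetical example: 3 nodes, 3 sessions, R = 2, p_gen = 0.05, x_s = 1.
sessions = [(0, 1), (0, 2), (1, 2)]
nodes = [0, 1, 2]
lam_egs = 2 * 0.05
lam_min, lam_max = [0.001] * 3, [0.05] * 3
lam_node = {u: 1.0 for u in nodes}      # loose node limits
theta_node = {u: 50.0 for u in nodes}
rates, p_c, p_node = list(lam_min), 0.0, {u: 0.0 for u in nodes}
for _ in range(500):
    rates, p_c, p_node = rcp_step(rates, p_c, p_node, sessions, lam_egs,
                                  lam_node, lam_min, lam_max, 50.0, theta_node)
```

In this symmetric example the node limits never bind, so the node prices stay at zero and the three rates settle at an equal share of $\lambda_{EGS}$, which is the proportionally fair allocation one would expect for identical log utilities.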

A. DERIVATION
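Formally, the RCP yields rate vectors which solve the Primal Problem:
$$\max_{\lambda \in \Lambda} F(\lambda) = \sum_{s \in S} f_s(\lambda_s), \tag{16}$$
subject to
$$\sum_{s \in S} \lambda_s \le \lambda_{EGS}, \tag{17}$$
$$\sum_{s \in S(u)} \lambda_s \le \lambda_u \quad \forall u \in U. \tag{18}$$
The Lagrangian, which includes the constraints (17), (18) with a vector of Lagrange multipliers $p = (p_c, p_u\ \forall u) \ge 0$ together with the objective function (16), is given by
$$L(\lambda, p) = \sum_{s \in S} f_s(\lambda_s) - p_c \Big( \sum_{s \in S} \lambda_s - \lambda_{EGS} \Big) - \sum_{u \in U} p_u \Big( \sum_{s \in S(u)} \lambda_s - \lambda_u \Big).$$
We identify that the problem is separable in the communication sessions, $S$, and re-write the Lagrangian in separable form,
$$L(\lambda, p) = \sum_{s \in S} l_s(\lambda_s) + p_c \lambda_{EGS} + \sum_{u \in U} p_u \lambda_u,$$
where $l_s(\lambda_s)$ is defined as
$$l_s(\lambda_s) := f_s(\lambda_s) - \lambda_s \Big( p_c + \sum_{u \in s} p_u \Big),$$
and we make use of the equivalence
$$\sum_{u \in U} p_u \sum_{s \in S(u)} \lambda_s = \sum_{s \in S} \lambda_s \sum_{u \in s} p_u.$$
A rate vector $\lambda^*$ that is a local maximum of (16) satisfies the optimality condition [25],
$$\nabla F(\lambda^*)^{\top} (\lambda - \lambda^*) \le 0 \quad \forall \lambda \in \Lambda. \tag{21}$$
If moreover $F(\lambda)$ is concave over $\Lambda$, then (21) is also sufficient for $\lambda^*$ to maximize $F(\lambda)$ over $\Lambda$ [25] (in which case it is also a global maximum).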
To obtain a $\lambda^*$ satisfying both the optimality condition (21) and the constraints (17), (18), we set the gradient of the Lagrangian with respect to the rate of each communication session to zero,
$$\frac{\partial L}{\partial \lambda_s} = f_s'(\lambda_s) - \Big( p_c + \sum_{u \in s} p_u \Big) = 0 \quad \forall s \in S.$$
The maximization in the primal problem (16) is constrained to the feasible rate region defined by (9), (10). To restrict solutions to the problem domain, any $\lambda^* \notin \Lambda$ is projected component-wise so that $\lambda^*_s \in \Lambda_s\ \forall s$. With the assumptions in (11), (12) there exists at least one set of Lagrange multipliers [25]. In terms of a given vector of Lagrange multipliers $p$, an optimal rate vector $\lambda^*$ satisfies
$$\lambda^*_s(p) = \Big[ (f_s')^{-1}\Big( p_c + \sum_{u \in s} p_u \Big) \Big]^{\lambda^{max}_{gen,s}}_{\lambda^{min}_s} \quad \forall s \in S, \tag{22}$$
where $[z]^M_m = \max(\min(z, M), m)$. To obtain a $\lambda^*$, it remains to obtain a vector of Lagrange multipliers.
An optimal vector $p^*$ of Lagrange multipliers is a solution to the Dual Problem: Select $p = (p_c, p_u\ \forall u)$ so as to achieve
$$\min_{p \ge 0} D(p), \tag{23}$$
where the dual objective function $D(p)$ is defined as
$$D(p) := \max_{\lambda \in \Lambda} L(\lambda, p).$$
With assumptions A1, A2 and (11), (12), the problem satisfies the Slater constraint qualification and has no duality gap [25], meaning a solution to the dual problem is also a solution to the primal problem. Define $\lambda^*$ to be a rate vector that maximizes $L(\lambda, p)$. A vector of Lagrange multipliers $p^*$ is an optimal solution to the dual problem if it satisfies the optimality condition
$$\nabla D(p^*)^{\top} (p - p^*) \ge 0 \quad \forall p \ge 0.$$

Gradient projection is a type of algorithm where, in order to solve an optimization problem such as the dual problem (23) with respect to a vector $p$, one starts by selecting some initial vector $p(0)$ and iteratively adjusts $p(t_n) \to p(t_{n+1})$ by making steps in the opposite direction of the gradient of the objective function. We introduce a vector of step-sizes $\theta = (\theta_c, \theta_u\ \forall u)$. An implementation of the gradient projection algorithm is to iteratively adjust the Lagrange multipliers according to
$$p_c(t_{n+1}) = \Big[ p_c(t_n) + \theta_c \Big( \sum_{s \in S} \lambda^*_s(t_n) - \lambda_{EGS} \Big) \Big]^+, \tag{28}$$
$$p_u(t_{n+1}) = \Big[ p_u(t_n) + \theta_u \Big( \sum_{s \in S(u)} \lambda^*_s(t_n) - \lambda_u \Big) \Big]^+ \quad \forall u \in U, \tag{29}$$
where $\lambda^*_s(t_n) = \lambda^*_s(p(t_n))$ is given by inputting the vector of Lagrange multipliers in (22). An implementation of the algorithm necessitates identifying parameters in the system that correspond to the components of the vector of Lagrange multipliers. We note that the centralized price $p_c(t_n)$ and the user prices $p_u(t_n)\ \forall u$ have, respectively, the same dynamics as the total queue length and the sum total of the session queue lengths in which user $u$ participates (6). Therefore, we make the following identifications,
$$p_c(t_n) \propto \sum_{s \in S} q_s(t_n), \qquad p_u(t_n) \propto \sum_{s \in S(u)} q_s(t_n) \quad \forall u \in U.$$
Note that these identifications are not unique, since the only strict criterion on the identification is that the queue dynamics generated by (6) match the dynamics of (28) and (29), whereas the scaling is arbitrary. For more information on the interpretation of Lagrange multipliers as prices in communication networks, see [22], [23].

B. CONVERGENCE
The RCP is a gradient projection algorithm with constant step-sizes from the vector $\theta = (\theta_c, \theta_u\ \forall u) \in \mathbb{R}^{1+N}$. Establishing that the algorithm converges is crucial to ensure that it yields solutions that effectively address the problem it is designed to solve. To establish convergence, we follow a similar treatment as in [26].
Theorem III.1 (RCP Convergence).Suppose assumptions A1 and A2 and the constraints (11,12) are satisfied and each of the the step-sizes θ r ∈ {θ c , θ u ∀u} satisfies θ r ∈ (0, 2/α|S|), where α = max s∈S α s with α s the curvature bound of assumption A2, and |S| is the number of communication sessions.Then, starting from any initial rate λ(0) ∈ Λ and price p(0) ≥ 0 vectors, every accumulation point λ, p of the sequence over time slots { λ(t n ), p(t n ) } tn generated by the RCP is primal-dual optimal.For proof, see Section V-A3.
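As a small worked example, the upper end of the admissible step-size interval can be computed directly. The sketch below assumes the bound is read as 2/(α · |S|); the helper name `max_step_size` is hypothetical.

```python
from typing import Sequence


def max_step_size(alpha_s: Sequence[float]) -> float:
    """Upper end of the step-size interval, read as 2 / (alpha * |S|).

    alpha_s holds one curvature bound per communication session, so
    len(alpha_s) plays the role of |S| and max(alpha_s) that of alpha.
    """
    alpha = max(alpha_s)
    return 2.0 / (alpha * len(alpha_s))
```

For instance, three sessions that each have α_s = 0.0025 give a bound of 2/(0.0025 · 3), roughly 266.7; any constant step-size strictly between 0 and that value would fall inside the interval under this reading.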

IV. CASE STUDY
To illustrate use of the RCP we associate a log utility function with each session, Log utility functions are suitable when throughput is the target performance metric, and a set of sessions all employing log utility functions will have the property of proportional fairness. In such a system, if the proportion by which one session rate changes is positive, there is at least one other session for which the proportional change is negative [23]. For compatibility with Theorem III.1, note that log utility functions satisfy A1, and A2 is satisfied with α_s = (λ^max_{gen,s})^2 ∀s. Although the convergence theorem only guarantees asymptotic convergence of the sequence {(λ(t_n), p(t_n))}_{t_n} to an optimal rate-price pair (λ̄, p̄), in any realization of an EGS one expects that the convergence time ∆τ, the number of time slots that the RCP must run before convergence is attained, is finite. In addition, it is practically relevant to characterize the tightness of convergence δ, or the maximum size of fluctuations about the optima.
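To make the interplay of rates and prices concrete, the following is a deterministic toy sketch of a price-driven rate loop with log utilities. It is not the RCP of Algorithm 1: it ignores queues and probabilistic entanglement generation, and it assumes that each session responds to the sum of the central price and the prices of its two nodes, for which the maximizer of log(λ) − price · λ is 1/price clipped to the allowed rate interval. All names are hypothetical.

```python
def log_utility_rate(price, lam_min, lam_max):
    """Rate maximizing log(lam) - price * lam on [lam_min, lam_max]."""
    if price <= 0.0:
        return lam_max
    return min(lam_max, max(lam_min, 1.0 / price))


def run_toy_loop(sessions, lambda_egs, lambda_node, lam_min, lam_max,
                 theta, n_slots):
    """Alternate rate responses and projected price steps (fluid sketch)."""
    nodes = sorted({u for s in sessions for u in s})
    p_c = 0.0
    p_u = {u: 0.0 for u in nodes}
    history = []  # total demanded rate in each time slot
    for _ in range(n_slots):
        # Each session sets its rate from the prices it currently sees.
        rates = {
            s: log_utility_rate(p_c + p_u[s[0]] + p_u[s[1]], lam_min, lam_max)
            for s in sessions
        }
        total = sum(rates.values())
        history.append(total)
        # Prices move against the constraint violations, floored at zero.
        p_c = max(0.0, p_c + theta * (total - lambda_egs))
        for u in nodes:
            load = sum(r for s, r in rates.items() if u in s)
            p_u[u] = max(0.0, p_u[u] + theta * (load - lambda_node[u]))
    return history, p_c, p_u
```

With three sessions on three nodes, λ_EGS = 0.1, loose node caps of 1.0, a rate interval of [1e-4, 0.05] and step-size 50, the total rate in this sketch settles at λ_EGS with an equal share for every session, which is the proportionally fair split for this symmetric instance.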
If an EGS is connected to N nodes, there are |S|_max = \binom{N}{2} = N(N − 1)/2 possible sessions. We assume that in a real network not all pairs of users require shared entanglement. In Fig. 2 we numerically investigate the convergence time and tightness of convergence, (∆τ, δ), for an EGS with R = 3 resources and p_gen = 0.05 connected to N = 20, 50 and 100 users, where the number of sessions is restricted to |S| = 0.1 · |S|_max by randomly sampling 10% of the possible sessions. In these simulations we set x_s = 1 ∀s, and average over 1000 independent runs of the simulation, each using the same set of sessions.
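The session counts used in the simulations follow from a short calculation. The sketch below samples 10% of all node pairs; rounding the count up is an assumption (the text does not state the rounding rule), chosen because it reproduces |S| = 19, 123 and 495 for N = 20, 50 and 100. The function name `sample_sessions` and the seed are hypothetical.

```python
import random
from itertools import combinations


def sample_sessions(n_nodes, seed=0):
    """Randomly keep 10% of all possible node pairs as sessions."""
    all_pairs = list(combinations(range(n_nodes), 2))  # N(N - 1)/2 pairs
    n_keep = (len(all_pairs) + 9) // 10  # 10% of the pairs, rounded up
    rng = random.Random(seed)
    return rng.sample(all_pairs, n_keep)
```

For N = 50 there are 1225 possible pairs, so 10% is 122.5 and the rounded-up sample holds 123 sessions.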
The reported convergence times ∆τ are the number of time slots that occur before the sum of demand rates Σ_s λ_s(t_n) first crosses the optimal value λ_EGS. The reported tightness of convergence, δ, is the maximum size of fluctuations of Σ_s λ_s(t_n) about λ_EGS following ∆τ. As the number of sessions hosted by an EGS increases, we observe a trade-off between ∆τ and δ: when the number of sessions is lower, ∆τ is shorter but δ is larger. We have performed additional simulations which indicate that increasing the step size used in the RCP shortens ∆τ somewhat at the cost of a larger δ.
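The two reported metrics can be extracted from a simulated trace as sketched below. The sketch assumes that "crossing" means the first slot in which the total rate lies on the other side of λ_EGS from where it started, and that δ is an absolute deviation; neither detail is fixed by the text. The name `convergence_metrics` is hypothetical.

```python
def convergence_metrics(total_rates, target):
    """Return (delta_tau, delta) for a trace of total demanded rates.

    delta_tau: index of the first slot where the trace has crossed target.
    delta: largest absolute deviation from target from that slot onward.
    Returns (None, None) if the trace never crosses the target.
    """
    started_above = total_rates[0] >= target
    for n, value in enumerate(total_rates):
        if (value >= target) != started_above:
            delta = max(abs(v - target) for v in total_rates[n:])
            return n, delta
    return None, None
```

For the trace [0.30, 0.20, 0.16, 0.14, 0.155, 0.149, 0.151] with target 0.15, the first crossing is at index 3 and the largest later deviation is 0.01.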
If constraint changes occur slowly compared to ∆τ, Theorem III.1 implies that the RCP should re-establish convergence to a new optimal rate and price vector pair, (λ̄, p̄) → (λ̄′, p̄′). In a real EGS system it is possible that the number of available resources will not be static in time, as resources may require periodic downtime for calibration. A change in the number of resources R → R′ changes the maximum service rate to λ′_EGS = R′ · p_gen. To validate the robustness of the algorithm against such constraint changes we simulate EGS systems originally equipped with R = 3 resource nodes, where after every 10,000 time-slots one of the resources may either be taken offline for calibration or an offline resource may be returned to service. Fig. 3 demonstrates that the RCP successfully re-establishes convergence of Σ_s λ_s(t_n) about λ′_EGS following these constraint changes in an EGS system connected to N = 50 nodes, serving |S| = 123 communication sessions.
In Fig. 3 we record the sequence of convergence times, {∆τ′}, after each constraint change as the first time-steps where Σ_s λ_s(t_n) crosses λ′_EGS. To calculate the tightness of convergence, δ, we first calculate the sequence {δ′}, the size of the maximum fluctuations about λ′_EGS following each ∆τ′, and set δ = max({δ′}). Notably, every subsequent ∆τ′ is shorter than the initial ∆τ, and the achieved δ is equal to that observed when there are no changes to the constraint set in Fig. 2 (middle plot, δ_2) for an EGS with the same number of nodes, serving the same number of communication sessions. Additional simulations of EGS systems connected to various numbers of nodes ranging from 10 to 100, with random changes to the number of resources after every 10,000 time-steps, suggest that the data in Fig. 3 is representative. Specifically, in each case investigated the absolute relative difference, between the achieved tightness of convergence when there are (δ) and are not (δ̄) changes to the constraints is less than 1.
The constraints {λ_u}_u on the capabilities of nodes appear in (14) and therefore affect both the prices calculated by the nodes and the rates set by communication sessions in (15). Since these constraints limit the total rate at which a node can submit demands summed across all of the communication sessions in which it participates, it is expected that uniform settings of {λ_u}_u yield rate vectors under the RCP where {λ_s(t_n)}_s are approximately uniform. In contrast, if the node constraints are non-uniform amongst the nodes, it is expected that the RCP yields rate vectors with larger differences between the rates set by each communication session. In Fig. 4 we investigate the effect of different settings for these constraints by plotting the difference between the average maximum, max_s {λ_s(t_n)}_s, and average minimum, min_s {λ_s(t_n)}_s, communication session rates yielded by the RCP for two different settings of the constraints. In the first setting, node constraints are set uniformly as λ_u = (|S| − 1)/2 · p_gen ∀u so that in practice the algorithm functions as if the network node constraints have been removed. In the other setting there are three possible constraint values: a quarter of the nodes sampled at random have λ_u = 1.5 · p_gen, half of the nodes have λ_u = p_gen, and a quarter of the nodes have λ_u = 0.5 · p_gen. Fig. 4 confirms that the difference between the average maximum rate and the average minimum rate requested by any session at time-step t_n is one or more orders of magnitude larger when nodes are associated with the non-uniform constraint set. The uniform node constraint setting led to communication sessions updating their rates of demand submission to be nearly uniform across all communication sessions.
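The two constraint settings can be generated as in the sketch below. It assumes the uniform cap is read as ((|S| − 1)/2) · p_gen and that the quarter/half/quarter split is made by shuffling the nodes and rounding the quarters down; function names and the seed are hypothetical.

```python
import random


def uniform_node_caps(n_nodes, n_sessions, p_gen):
    """Same cap for every node, large enough to be effectively inactive."""
    cap = (n_sessions - 1) / 2 * p_gen
    return {u: cap for u in range(n_nodes)}


def nonuniform_node_caps(n_nodes, p_gen, seed=0):
    """A random quarter at 1.5*p_gen, half at p_gen, a quarter at 0.5*p_gen."""
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    rng.shuffle(nodes)  # random assignment of nodes to the three groups
    quarter = n_nodes // 4
    caps = {}
    for i, u in enumerate(nodes):
        if i < quarter:
            caps[u] = 1.5 * p_gen
        elif i < n_nodes - quarter:
            caps[u] = p_gen
        else:
            caps[u] = 0.5 * p_gen
    return caps
```

For N = 20, |S| = 19 and p_gen = 0.05 the uniform cap is 0.45, three times λ_EGS = 0.15 for R = 3, which is why it does not bind in practice.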

V. DISCUSSION
We have presented the first control architecture for an EGS. The architecture is tailored to a simple system model. As a natural extension of this work, a refined version of the control architecture can be developed to suit a more versatile physical model. In the following discussion, we explore considerations for the development of a second generation control architecture.
In this work we assume a demand model in which user generated demands are fully parameterized by a desired rate of entanglement generation. Specifically, every communication session s sets λ_s(t_n), updated once per time-slot, and specifies the constraint parameter λ^min_s, which defines the minimum rate of entanglement generation the communication session must receive in order to enable some target application. While this model is mathematically simple, it may not fully address real application requirements on a physical quantum network. Real applications may require the simultaneous existence of a number of entangled pairs, each with some minimum fidelity, and it is possible that applications need such packets of pairs to be supplied periodically over a longer application run-time. In the future, it may therefore be relevant to consider a demand model wherein communication sessions submit demands for packets of entanglement generation. A packet would be fully specified by the desired number of entangled pairs, a minimum fidelity for the pairs, some maximum window of time between the generation time of the first and last entangled pair of the request, and possibly some rate at which the demand with the preceding parameters should be repeatedly fulfilled.
The discussed model assumes that user controlled nodes can engage in multiple entanglement generation tasks in parallel. We do not impose restrictions on simultaneously scheduling communication sessions. Hence, it is possible for communication sessions s and s′ with node u ∈ s, s′ to be scheduled simultaneously. Additionally, we consider the option of assigning multiple resource nodes to a single communication session in any time-slot. Therefore, we consider nodes with an unrestricted number of qubits and independent physical connections to the EGS. A subtlety we do not address here is that allocating multiple resources to a single communication session may require temporal multiplexing in the scheduling of individual entanglement generation attempts, especially when the multiple qubits of a single node are coupled to the physical connection via a single output. Furthermore, for nodes consisting of a single quantum processor, it may not be possible to calibrate the node to simultaneously engage in entanglement generation attempts with multiple partner nodes, even if the node has unlimited qubits. To capture this physical feature, it will be interesting to include the restriction of scheduling only non-overlapping communication sessions in the design of scheduling routines for future EGS control architectures.
The control architecture for an EGS relies on precise timing synchronization. Our model assumes that at both the control and physical layers, all communication sessions can adhere to the time slots defined by the EGS processor. Tight synchronization of timing is possible at the physical layer, which controls the quantum devices and coordinates the exact timing of entanglement generation attempts. However, tight timing synchronization of any type of classical communication may be a considerable challenge in any real world application. In particular, such coordination is a serious challenge if there are non-uniform communication times between any of the nodes and the EGS or between any of the node pairs. To reduce the timing requirements and possibly make the control architecture delineated here executable on a real-world system, the processor of the EGS could simulate the actions of the nodes. To do so, the processor would locally run the RCP and simulate the generation of demands originating from the user operated nodes by simply adding demands to the queues based on the rates output by the RCP. Such an approach trades the difficulty of timing synchronization for the requirement of increased power of the classical processor at the EGS. To reduce the need for timing synchronization, a second generation architecture may be designed which does not rely on fixed, centrally defined time slots.

A. PROOFS
1) Outline of goals to prove
In this section we prove two theorems to establish the results quoted in the main body of the article. The results are as follows:
1) The capacity region of the EGS is the set of demand arrival rate vectors fully contained in the set C (8), and maximum weight scheduling (Definition II.6) supports any rate vector from within C (Theorem II.1). To establish the capacity region, we first prove a proposition stating that any rate vector λ ∉ C necessarily results in divergent queues. We then prove a second proposition establishing at once that any rate vector λ ∈ Int C is supportable under some scheduling algorithm and that maximum weight scheduling is such a scheduling algorithm. Therefore, we also demonstrate that maximum weight scheduling is throughput optimal.
2) The RCP, Algorithm 1, results in the calculation of a sequence of rate and price vector pairs (λ(t_n), p(t_n)) which converge to optimal solutions (λ̄, p̄) of the primal and dual problems, defined in Section III (Theorem III.1).
2) Proof of Theorem II.1
First it is to be shown that no rate vector λ ∉ C of an EGS with R resources is supportable under any scheduling algorithm.
Proof. There are three cases where λ ∉ C, 1) 3) λ is not non-negative (∃ λ_{s*} < 0 for at least some s* ∈ S). In the third case, the node pair corresponding to session s* has set a non-physical rate and the rate must be changed. The proof for case (2) is very similar to case (1), and equations from the first case are re-used or modified to complete the proof of case (2). The main strategy of the proof relies on where we use the distribution property of limits, which is possible because the individual limits (34) exist. Finally, by assumption (31) and (32, 33) and (35), Therefore, with probability 1, λ is not supportable, regardless of scheduling algorithm.
Proposition V.1 (2): Suppose that λ_{s*} > x_{s*} · p_gen for some s* ∈ S. Then, ∃ ϵ > 0 such that, In this case, we show that λ is not supportable by proving that the queue q_{s*}(t_i) of demands associated with communication session s* diverges for large t_i. Recall (34) and note M_s(t_i) ≤ x_s ∀ s, ∀ t_i. This inequality expresses that a maximum of x_s heralding stations can be allocated to any communication session s in t_i. With this restriction, (34) becomes, Combining assumption (37) using (33), (38), and making repeated use of (6), Therefore, with probability 1, q_{s*}(t_{n+1}) → ∞ as n → ∞. Hence λ is not supportable.
Proposition V.1 proved that rate vectors λ ∉ C are not in the capacity region of the EGS. To finish proving C is the capacity region of the EGS (Theorem II.1), it is necessary to prove that any rate vector λ ∈ C is supportable under some scheduling algorithm. To do so, we prove that the specific scheduling algorithm of Maximum Weight Scheduling (Definition II.6) supports all arrival rate vectors fully contained in C.
Modelling a queue vector as a Markov chain is a standard tool in queuing theory [23]. This approach makes it possible to take advantage of the many strong analytic results on the behaviour of Markov chains, which can then be used to make statements about the queue vector. The vector q(t_n) = (q_s(t_n) ∀ s) of queued demands from each communication session maintained in the processor at t_n can be modelled as a Markov chain, with transitions given by (6). An irreducible Markov chain has the property that any state i of the chain is reachable from any other state j. A positive recurrent Markov chain has the property that from any state i, the expectation value of the time it will take to re-visit any other state j is finite. A queue vector, with specified dynamics, that can be modelled as an irreducible Markov chain with the property of positive recurrence will not diverge (i.e. is guaranteed to remain a finite queue) [23]. The dynamics of such a queue vector are fixed by the arrival rate vector and the scheduling routine; therefore, if a queue vector can be modelled as a positive recurrent Markov chain, the arrival rate vector is supportable by the scheduling routine. To prove Proposition V.2 we demonstrate that the queue vector is an irreducible Markov chain and use the Foster-Lyapunov Theorem to prove that whenever λ lies strictly within C the Markov chain is also positive recurrent. An equivalent statement is that all rate vectors lying strictly within C are supportable by some scheduling algorithm.
Theorem V.1 (Foster-Lyapunov Theorem [23]). Let {X_k} be an irreducible Markov chain with a state space S. Suppose that there exists a function V : S → R_+ and a finite set B ⊆ S satisfying the following conditions: 1)
Proof of Proposition V.2. First we establish that the queue vector, q(t_i) ∀ t_i, is an irreducible Markov chain. The queue vector q(t_i) is a Markov chain with state space S = {q : q is reachable from 0 under the given scheduling algorithm}.
Assume that q(t_1) is finite and q(t_1) ∈ S. It follows from the definition of S that q(t_i) ∈ S ∀t_i if q(t_1) ∈ S. Irreducibility of q(t_i) ∀t_i requires that any state q(t_j) is reachable from any other state q(t_i). By the definition of the state space S, it suffices to demonstrate that from q(t_i), the Markov chain can always return to 0. Under Maximum Weight scheduling (Definition II.6), the processor always serves k(t_i) demands per time-slot, where k(t_i) depends on |q_s(t_i)|, the number of demands in the queue for session s in time-slot t_i, and on x_s, the maximum number of resource modules that can be allocated to communication session s per time-slot. Hence when q(t_i) is non-zero, at least one demand and up to R demands are served per time-slot. Therefore, from any q(t_i) ∈ S, q(t_{i+l}) = 0 is reachable from q(t_i) in l ∈ {⌈|q(t_i)|/R⌉, ⌈|q(t_i)|/R⌉ + 1, ..., |q(t_i)|} time steps, where |q(t_i)| := Σ_s |q_s(t_i)|. Since any other q(t_j) ∈ S is then reachable from 0, it follows that q(t_i) is irreducible. To prove that λ is supportable, it suffices to demonstrate that q(t_i) is positive recurrent.
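For reference, the drift conditions of the Foster-Lyapunov theorem invoked in this proof are usually stated as below. This is the common textbook form, written with the symbols V, B and X_k of Theorem V.1; the constants ϵ and A are notation assumed here, and the exact wording in [23] may differ.

```latex
% Textbook form of the Foster-Lyapunov drift conditions (notation assumed)
\begin{align}
  \mathbb{E}\left[ V(X_{k+1}) - V(X_k) \mid X_k = x \right] &\le -\epsilon
      && \text{if } x \in B^{c}, \text{ for some } \epsilon > 0, \\
  \mathbb{E}\left[ V(X_{k+1}) - V(X_k) \mid X_k = x \right] &\le A
      && \text{if } x \in B, \text{ for some } A < \infty .
\end{align}
```

When both conditions hold, the chain {X_k} is positive recurrent. The remainder of the proof bounds the conditional drift of a Lyapunov function of the queue vector in order to verify conditions of this kind.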
Define the Lyapunov function L(q) = \frac{1}{2} \sum_s q_s^2. To apply the Foster-Lyapunov theorem (Theorem V.1), the key quantity is the drift of L(q(t_i)). Using the queue update dynamics (6), the drift can be expanded as
L(q(t_{i+1})) − L(q(t_i)) = \frac{1}{2} \sum_s (q_s(t_i) + a_s(t_i) − g_s(t_i))^2 − \frac{1}{2} \sum_s q_s(t_i)^2.
Taking the conditional expectation of the Lyapunov drift with respect to the randomness of arrivals and the probabilistic success of scheduled demands,
E[L(q(t_{i+1})) − L(q(t_i)) | q(t_i) = q] ≤ \frac{1}{2} \sum_s E[(a_s(t_i) − g_s(t_i))^2 | q(t_i) = q] + \sum_s E[q_s(t_i) (a_s(t_i) − g_s(t_i)) | q(t_i) = q], (42)
where q ∈ S is a particular queue vector. Using (a_s − g_s)^2 ≤ a_s^2 + g_s^2 and the linearity of expectation, the first term of the conditional expectation can be rewritten,
Recall that the conditional expectation of the Lyapunov drift is taken with respect to the randomness of the arrival processes as well as the success of scheduled demands. The schedule selected for a given time-slot depends on the queues, but the success of any scheduled demand does not. The conditional expectation of pair production for communication session s can be re-written as,
E[g_s(t_i) | q(t_i) = q] = p_gen · E[M_s(t_i) | q(t_i) = q]. (47)
Recall that M denotes the schedule decided under the maximum weight scheduling policy, Definition II.6.
Consider a scheduling policy \tilde{M} which schedules each session at a rate of (λ_s + ϵ)/p_gen (this is possible since, by assumption, (1 + ϵ)λ ∈ C). Such a scheduling policy is aware of the demand arrival rates to each queue but is not demand based (i.e. it does not use queue information in deciding the schedule). Hence,
Assume ∇f satisfies the Lipschitz condition with Lipschitz constant L and consider the gradient projection iteration, with a constant step-size γ in the range (0, 2/L). Then every limit point x̄ of the generated sequence {x_k} satisfies the optimality condition: ∇f(x̄)^T (x − x̄) ≥ 0, ∀ x ∈ X.
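The gradient projection statement above can be illustrated on a toy problem. The sketch below is not part of the proof: it minimizes f(x) = ½‖x − c‖² over the non-negative orthant, for which ∇f(x) = x − c has Lipschitz constant L = 1, so any constant step-size in (0, 2) qualifies. The function names and the choice of problem are assumptions made for illustration.

```python
def projected_gradient(grad, project, x0, step, n_iters):
    """Iterate x <- project(x - step * grad(x)) with a constant step-size."""
    x = list(x0)
    for _ in range(n_iters):
        g = grad(x)
        x = project([xi - step * gi for xi, gi in zip(x, g)])
    return x


def nonneg_projection(x):
    """Euclidean projection onto the non-negative orthant."""
    return [max(0.0, xi) for xi in x]


# Toy instance: c has one negative coordinate, so the constraint is active.
c = [2.0, -1.0]
x_bar = projected_gradient(
    grad=lambda x: [xi - ci for xi, ci in zip(x, c)],
    project=nonneg_projection,
    x0=[0.0, 0.0],
    step=1.5,  # inside (0, 2/L) with L = 1
    n_iters=60,
)
```

The iterates approach x̄ = (2, 0). There ∇f(x̄) = (0, 1), so ∇f(x̄)ᵀ(x − x̄) equals the second coordinate of x, which is non-negative for every feasible x, in line with the stated optimality condition.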

FIGURE 1: EGS Architecture: a) EGS structure: An EGS with R = 4 resources connected to N = 9 nodes. The EGS is controlled by a classical processor and consists of a switch, resources, and physical connections. Nodes have quantum communication channels to the switch and classical communication channels to the processor. b) Resource Allocation: The switch opens connections to link nodes 1, 2 and resource 1. For example, the connections may consist of direct optical fiber paths from the nodes to the switch and from the switch to the resource, via an interface at the switch. This establishes the physical allocation of resource 1 to the communication session of nodes 1, 2 for time slot t_n. c) Quantum communication sequence: Node-to-processor communication in time slot t_n with a batch size of three entanglement generation attempts. d) Concurrent classical communication sequences: Nodes and the processor communicate in time slot t_n, governing resource allocation and the RCP (see Algorithm 1 for RCP details).

FIGURE 2: The RCP drives the sum of the demanded rates of entanglement generation across all communication sessions, Σ_s λ_s(t_n), to converge with respect to the sequence of time slots to the maximum average entanglement generation rate of the EGS, λ_EGS. The EGS has R = 3 resources, the probability of entanglement generation is p_gen = 0.05, and the EGS is connected to N = 20 (top), N = 50 (middle) and N = 100 (bottom) nodes. The total number of communication sessions served are |S| = 19, 123, and 495 in the top, middle, and bottom plots, respectively. Black dotted lines indicate the convergence times, ∆τ. The observed values for the tightness of convergence, δ, are δ_1 = 0.12, δ_2 = 0.035 and δ_3 = 0.012. Step-sizes (θ_c, θ_u ∀u) were all 1/(40 · λ_EGS).

FIGURE 3: In response to changes in the number of resources available at the EGS (R → R′), the RCP drives the sum of the demanded rates of entanglement generation across all communication sessions, Σ_s λ_s(t_n), to converge with respect to the sequence of time slots to the updated maximum average entanglement generation rate of the EGS, λ′_EGS = R′ · p_gen. In simulation, an EGS connected to N = 50 nodes, serving |S| = 123 communication sessions, is originally equipped with R = 3 resources. After every 10,000 time-slots, one of the resources may either be taken offline for calibration or an offline resource may be returned to service. Black dashed lines indicate the convergence times, ∆τ′, calculated for every R′ (initially R). We observe an overall tightness of convergence of δ = 0.035, identical to that observed in Fig. 2 for the EGS operated with fixed R = 3 and with the same N, |S|. Step-sizes (θ_c, θ_u ∀u) were all 1/(10 · λ_EGS).

FIGURE 4: Differences between the average maximum rate and average minimum rate requested by any communication session in time-slot t_n, for an EGS connected to N = 20 (top), N = 50 (middle) and N = 100 (bottom) nodes serving |S| = 19, 123, and 495 communication sessions, respectively. As described in the main text, nodes are either associated with a uniform and effectively unrestricted set of capabilities or a non-uniform and more restricted set of capabilities. Step-sizes (θ_c, θ_u ∀u) were all 1/(40 · λ_EGS).
Allow \tilde{M} to denote a schedule that is decided by any other scheduling policy. It follows from Definition II.6 that, \sum_s q_s · E[M_s(t_i) | q(t_i) = q] ≥ \sum_s q_s · E[\tilde{M}_s(t_i) | q(t_i) = q].

TABLE 1: Inventory of notation introduced in Section II
• • • , M_s(t_n)}: communication session s in t_n
C: the set which has the capacity region of the EGS as interior R_+
λ_EGS: the maximum total rate that can be delivered, on average, by the EGS, λ_EGS = R · p_gen
Define the variance in the arrivals to the queue of session s, σ_s^2 := Var[a_s(t_i)]. Then, noting that the arrivals are independent of the state of the queues, using the definition of variance and E[a_s(t_i)] = λ_s, E[a_s^2(t_i) | q(t_i) = q] = E[a_s^2(t_i)] = σ_s^2 + λ_s^2. (43) Recall that g_s(t_i) ≤ M_s(t_i) ≤ x_s ∀ s, ∀ t_i. Hence, \sum_s E[g_s^2(t) | q(t) = q] ≤ \sum_s x_s^2. (44)