Multi-dimensional State Space Collapse in Non-complete Resource Pooling Scenarios

The present paper establishes an explicit multi-dimensional state space collapse (SSC) for parallel-processing systems with arbitrary compatibility constraints between servers and job types. This breaks major new ground beyond the SSC results and queue length asymptotics in the literature, which are largely restricted to complete resource pooling (CRP) scenarios where the steady-state queue length vector concentrates around a line in heavy traffic. The multi-dimensional SSC that we establish reveals heavy-traffic behavior which is also far more tractable than the pre-limit queue length distribution, yet exhibits a fundamentally more intricate structure than in the one-dimensional case, providing useful insight into the system dynamics. In particular, we prove that the limiting queue length vector lives in a K-dimensional cone of which the set of spanning vectors is random in general, capturing the delicate interplay between the various job types and servers. For a broad class of systems we provide a further simplification which shows that the collection of random cones constitutes a fixed K-dimensional cone, resulting in a K-dimensional SSC. The dimension K represents the number of critically loaded subsystems, or equivalently, capacity bottlenecks in heavy traffic, with K = 1 corresponding to conventional CRP scenarios. Our approach leverages probability generating function (PGF) expressions for Markovian systems operating under redundancy policies.


Introduction
In the present paper we analyze the heavy-traffic behavior of parallel-processing systems with redundancy policies in scenarios that go beyond the conventional complete resource pooling (CRP) condition. In particular, we provide the first explicit characterization of a multi-dimensional state space collapse and the associated (scaled) steady-state queue length vector in fairly general non-CRP scenarios.
While there are several formulations in the literature, broadly speaking the CRP condition entails that the system does not experience any local capacity bottlenecks, and operates in heavy traffic as if there is only a single global resource constraint in force. Thus, the system behaves as a single-server queue with fully pooled resources in a critical-load regime, and the steady-state queue length vector typically concentrates around a line, exhibiting a state space collapse (SSC) which yields far greater tractability than the pre-limit queue length distribution.
As alluded to above, the CRP condition not only provides an appealing design objective but is also instrumental in facilitating a detailed analysis of the heavy-traffic behavior. Indeed, non-CRP scenarios have eluded an explicit derivation of the limiting queue length distribution in any degree of generality so far, yet such scenarios may naturally arise in various situations.
In order to illustrate this, let us focus for the moment on the simplest possible setting of a system with two job types and two servers. For convenience, suppose that type-$i$ jobs arrive as a Poisson process of rate $\lambda_i = (1 - \epsilon)\mu_i$ and have independent and exponentially distributed service requirements, with $\mu_i$ denoting the processing speed of server $i$, $i = 1, 2$. The parameter $\epsilon \in (0, 1)$ may be interpreted as a relative capacity slack in a baseline scenario where type-$i$ jobs can only be processed by server $i$, $i = 1, 2$, and both queues are dimensioned to operate at a relative load of $1 - \epsilon$. Denote by $(Q_1, Q_2)$ the steady-state queue length vector, with $Q_i$ counting the number of type-$i$ jobs, $i = 1, 2$. In this complete partitioning scenario, the system decomposes into two independent M/M/1 queues, and for any non-anticipating and non-preemptive service discipline it holds that $\epsilon (Q_1, Q_2) \xrightarrow{d} (U, U')$ as $\epsilon \downarrow 0$, with $\xrightarrow{d}$ denoting convergence in distribution, and $U$ and $U'$ representing two independent and exponentially distributed random variables with unit mean.
In contrast, in a complete sharing scenario where both job types can be processed by either server, the system operates as a single M/M/1 queue with arrival rate $\lambda_1 + \lambda_2 = (1 - \epsilon)(\mu_1 + \mu_2)$ and service rate $\mu_1 + \mu_2$. In that case, it holds for any non-anticipating and non-preemptive service discipline that does not distinguish between the job types that
\[
\epsilon (Q_1, Q_2) \xrightarrow{d} (p_1, p_2)\, U
\]
as $\epsilon \downarrow 0$, with $p_i = \frac{\lambda_i}{\lambda_1 + \lambda_2} = \frac{\mu_i}{\mu_1 + \mu_2}$ the fraction of type-$i$ jobs, $i = 1, 2$, and $U$ an exponentially distributed random variable with unit mean. Thus the (scaled) queue length vector concentrates around a line in the direction $(p_1, p_2)$ in heavy traffic, which is a manifestation of a (one-dimensional) SSC. Note that the (scaled) total number of jobs has the same unit exponential distribution as the number of jobs of each individual type in a complete partitioning scenario, and that the (scaled) number of type-$i$ jobs is now stochastically smaller by a factor $p_i \leq 1$, $i = 1, 2$, reflecting the performance gains in the complete sharing scenario.
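The exponential limit for the total number of jobs can be checked numerically. The sketch below is an illustration only and not part of the original analysis: it uses the standard fact that the stationary queue length of an M/M/1 queue with load $\rho = 1 - \epsilon$ is geometric, and the helper name `scaled_mm1_tail` is our own.

```python
import math


def scaled_mm1_tail(x: float, eps: float) -> float:
    """Return P(eps * Q > x) for an M/M/1 queue with load rho = 1 - eps.

    The stationary queue length Q is geometric: P(Q >= k) = rho**k, k >= 0.
    """
    rho = 1.0 - eps
    # Q is integer valued, so Q > x/eps is the same event as Q >= floor(x/eps) + 1.
    k = math.floor(x / eps) + 1
    return rho ** k


if __name__ == "__main__":
    x = 1.5
    for eps in (0.1, 0.01, 0.001):
        print(f"eps={eps}: P(eps*Q > {x}) = {scaled_mm1_tail(x, eps):.5f}")
    # Tail of a unit-mean exponential random variable, for comparison.
    print(f"exp(-{x}) = {math.exp(-x):.5f}")
```

As $\epsilon$ decreases, the printed tail probabilities should approach the unit exponential tail $e^{-x}$.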
In many situations, however, it may not be feasible for all job types to be handled by all servers. Indeed, some servers may either be able to only handle generic jobs, or be customized to handle highly specialized jobs, while other servers may be highly flexible but costly or scarcely available. In the above setup, suppose that type-1 jobs can be handled by either server 1 or 2, while type-2 jobs can be handled by server 2 only. In hospital environments, one could imagine that brain or heart surgery patients can only be assigned to a specialized ward with intensive care, while orthopedic treatment patients can be accommodated anywhere. This scenario is commonly referred to as the 'N-model' in the literature in view of the compatibility graph between the job types and the servers as depicted in Figure 1.
As a special case of our results for systems with arbitrary compatibility constraints operating under redundancy policies (as further specified below), it follows for a FCFS service discipline that
\[
\epsilon (Q_1, Q_2) \xrightarrow{d} (p_1 U,\; p_2 U + U') \quad \text{as } \epsilon \downarrow 0, \tag{1}
\]
with $U$ and $U'$ two independent and exponentially distributed random variables with unit mean. Observe that the queue length vector no longer concentrates around a line, but lives in a two-dimensional cone spanned by the vectors $(p_1, p_2)$ and $(0, 1)$. Thus, the two queue lengths are no longer perfectly correlated as in the complete sharing scenario, but are still coupled in a subtle manner, and not completely independent as in the complete partitioning scenario. Indeed, the system exhibits a SSC, since the cone is a strict subset of the full two-dimensional state space of the pre-limit queue length, even though it has the same dimension in the particular case of the N-model. Further note that the (scaled) total number of jobs is distributed as $U + U'$ in the limit, just like in the complete partitioning scenario, and that the (scaled) number of type-2 jobs is stochastically larger now, since they only have access to server 2 and face competition from type-1 jobs at that server. In contrast, the (scaled) number of type-1 jobs is stochastically reduced, since they still enjoy exclusive access to server 1, but can also compete for access to server 2.
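To make the coupling between the two limiting queue lengths concrete, the following sketch computes the first two moments of the vector $(p_1 U, p_2 U + U')$ described above. It relies only on $U$ and $U'$ being independent with unit mean and unit variance; the function name `n_model_limit_moments` is a hypothetical helper introduced for illustration.

```python
import math


def n_model_limit_moments(mu1: float, mu2: float):
    """Moments of (X1, X2) = (p1*U, p2*U + U') with U, U' iid unit exponentials."""
    p1 = mu1 / (mu1 + mu2)
    p2 = mu2 / (mu1 + mu2)
    mean = (p1, p2 + 1.0)            # E[U] = E[U'] = 1
    var = (p1 ** 2, p2 ** 2 + 1.0)   # Var(U) = Var(U') = 1 and independence
    cov = p1 * p2                    # only the shared variable U contributes
    corr = cov / math.sqrt(var[0] * var[1])
    return mean, var, cov, corr


if __name__ == "__main__":
    mean, var, cov, corr = n_model_limit_moments(1.0, 1.0)
    print("means:", mean, "correlation:", round(corr, 4))
```

The correlation equals $p_2 / \sqrt{p_2^2 + 1}$, which lies strictly between 0 (complete partitioning) and 1 (complete sharing).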
In the present paper we establish an explicit multi-dimensional SSC and heavy-traffic limits similar to (1) for Markovian parallel-processing systems with arbitrary compatibility constraints operating under so-called redundancy policies. Such compatibility constraints are prevalent in a wide variety of societal and technological systems due to heterogeneity in service demands and resources. Prominent examples include data centers and cloud networks (due to data locality issues), ride-sharing platforms and housing allocation agencies (due to spatial proximity considerations), multi-skill service systems with varying degrees of server flexibility, and organ transplantation networks.
We use probability generating function (PGF) expressions to provide an explicit characterization for the distribution of the limiting queue length vector, which reveals that this queue length vector lives in a K-dimensional cone spanned by a set of possibly random vectors, capturing the subtle interaction between the various job types and servers. For a broad class of systems we provide a further simplification, from which we deduce that the collection of random cones forms a fixed K-dimensional cone. The dimension K then represents the number of critical components, or equivalently, capacity bottlenecks in heavy traffic, with K = 1 corresponding to conventional CRP scenarios. If we denote the spanning vectors of the fixed K-dimensional cone by $\alpha_1, \dots, \alpha_K$, then the limiting (scaled) queue length vector can be represented as a linear combination of these vectors with K independent exponentially distributed random variables associated with the K critically loaded subsystems acting as scalar coefficients. Thus, the number of jobs of a particular type $i$ is a weighted sum of these random variables, with the weight factors $\alpha_{i,1}, \dots, \alpha_{i,K}$ representing the load fractions that type-$i$ jobs account for in each of the subsystems.
In the N-model discussed above, there are K = 2 critical components: the subsystem consisting of type-2 jobs and server 2, and the 'subsystem' comprising the entire system. Thus type-1 jobs only belong to the second 'subsystem' where they account for a fraction of the load $p_1$, translating into the term $p_1 U$ in (1), while type-2 jobs belong to both subsystems, making up the full load in the first one and accounting for a fraction of the load $p_2$ in the second one, yielding the term $p_2 U + U'$ in (1). Further observe that the complete sharing scenario satisfies the CRP condition with K = 1, while the complete partitioning scenario amounts to a somewhat degenerate case with K = 2 critical components which are in fact entirely independent.
Heavy-traffic analysis has a long and rich history, dating back to the pioneering work of Kingman [30,31] focusing on the single-server queue. Subsequent work pursued heavy-traffic limits of an increasingly wide range of stochastic systems, such as queueing networks [10,29,33,44], parallel-server systems under various static and dynamic assignment policies [8,9,19,20,35,39,45], stochastic processing networks [13], bandwidth-sharing networks [28] and generalized switches [22,23,37,38]. As testified by these papers, heavy-traffic theory provides a powerful approach to analyze (scaled) queue lengths and delays, yielding valuable insight into the key performance characteristics of a variety of complex systems which would be mostly intractable otherwise.
The above set of references is far from exhaustive and a detailed review of the huge literature on heavy-traffic limits is beyond the scope of the present paper. Broadly speaking, however, the bulk of the literature pertains to process-level limits and/or systems that satisfy the CRP condition mentioned earlier. To the best of our knowledge, explicit results for steady-state queue length distributions in non-CRP scenarios have remained extremely scarce so far.
For example, Kang et al. [28] derive process-level diffusion limits for Markovian bandwidth-sharing networks operating under α-fair rate allocation policies, which generally do not satisfy the CRP condition. For the special case of linear topologies and α = 1 (Proportional Fairness) it is shown that, under certain local traffic assumptions, the steady-state distribution of the limiting diffusion process exhibits a product form. This result was later extended by Vlasiou et al. [42], allowing for phase-type job size distributions and more general topologies. The authors in [28] conjecture that this product form carries over to the heavy-traffic limit of the steady-state queue length distribution of the original system. A rigorous proof of such an interchange of limits was later provided by Wang et al. [43].
A different strand of more recent research on generalized switch models operating under a MaxWeight scheduling algorithm does directly target steady-state heavy-traffic results in non-CRP scenarios. Hurtado-Lange & Maguluri [24] adopt techniques from [14] to prove a multi-dimensional SSC, and show how this property can be used to derive (scaled) first moments of certain linear combinations of queue lengths. They observe however that this approach cannot be readily extended to obtain distributional results for the joint queue length vector due to the presence of unknown cross terms. In order to address the latter challenge, Jhunjhunwala & Maguluri [26,27] extend the transform method developed in [23] to non-CRP scenarios to provide an implicit characterization of the limiting queue length distribution for the class of input-queued switches, along with an explicit characterization in the special case of the above-described N-model under additional symmetry and uniqueness assumptions. A further discussion will be provided in Subsection 4.1, after presenting our results. Varma & Maguluri [40] focus on the performance impact of the structure of the compatibility constraints in the above context, and present algorithms for selecting the compatibility relations so as to obtain the desired dimension of the SSC or for identifying job-server pairs that might decrease the dimension of the collapsed limiting distribution.
A further body of work that is targeted at steady-state heavy-traffic results pertains to Markovian parallel-server systems with a First-Come First-Served and Assign-to-the-Longest-Idle-Server (FCFS-ALIS) assignment policy, which can equivalently be thought of as a Join-the-Smallest-Workload (JSW) policy. Specifically, Afèche et al. [3] and Hillas et al. [21] show that in a heavy-traffic regime the system decomposes into several components or subsystems each experiencing a critical load, and present expressions for the expected waiting times of jobs in the various components.
The model set-up that we adopt strongly resembles that in [3,21] and also permits arbitrary compatibility constraints between servers and job types. Rather than deriving expected waiting times, however, we focus on establishing a multi-dimensional SSC which yields distributional results for the joint queue length as well as delays (along with higher moments).
We consider redundancy policies which provide a natural class of job assignment mechanisms in the presence of compatibility constraints. Their key feature is to make copies or replicas of every arriving job and to assign one copy to each of its compatible servers, with the aim to exploit the variability in queue lengths and/or service times encountered by the different copies of the same job. As soon as one copy either terminates or initiates service the remaining copies are discarded, which is referred to as the cancel-on-completion (c.o.c.) or cancel-on-start (c.o.s.) version of the redundancy policy, respectively. The latter version is equivalent to the JSW policy mentioned above.
Besides their natural fit to handle compatibility constraints, redundancy policies come with the remarkable luxury of mathematical tractability. Specifically, product-form expressions for the stationary distribution of the system occupancy in Markovian settings were derived by Gardner et al. [18] and Ayesta et al. [7], and are closely related to earlier results by Visschers et al. [41] and Adan & Weiss [2]. An overview of related models involving job-server compatibility constraints that yield product-form expressions is provided by Gardner & Righter [17].
At first sight, the availability of such product-form expressions may seem to defeat the entire rationale of a heavy-traffic analysis. However, the involved state description of the system occupancy is so detailed that the product-form expressions unfortunately cannot be directly used for analyzing key performance metrics like queue lengths or delays [1,16,41]. Our approach therefore leverages more convenient PGF expressions for the joint queue length vector [11] which pave the way for the analytical derivation of heavy-traffic limits and additionally lend themselves to a more insightful probabilistic interpretation in terms of geometrically distributed random variables. Closer inspection then not only allows for an alternative probabilistic derivation of the heavy-traffic behavior, but also illuminates that the K independent exponentially distributed random variables in the stochastic representation of the steady-state queue length vector reflect the contributions to the overall queue length from the K critically loaded subsystems. Building on the product-form expressions (as opposed to proving process-level heavy-traffic limits) offers the further advantage that convergence of the steady-state queue length distribution comes for free, without the need to prove an interchange of limits, which usually is a significant challenge.
The remainder of this paper is organized as follows. The detailed model description and preliminary results can be found in Sections 2 and 3, respectively. The main results for the c.o.c. policy are outlined in Section 4, while those for the c.o.s. policy are deferred to Appendix D. The pre-limit characterization of the queue length vector and an alternative proof of the main heavy-traffic results, using stochastic arguments and a probabilistic interpretation of the PGF expressions, are presented in Appendix E. We conclude with some topics for further research in Section 5.

Model description
Consider a system consisting of $N$ parallel servers, each with their own dedicated waiting line, where jobs arrive according to a Poisson process with rate $N\lambda$, $\lambda > 0$. The servers, indexed as $1, \dots, N$, process assigned jobs in order of arrival at speed $\mu_n > 0$, for $n = 1, \dots, N$. The average processing speed per server is denoted by $\mu := \frac{1}{N} \sum_{n=1}^{N} \mu_n$.

The jobs can be categorized into different classes or types based on the specific subset of servers they are compatible with. In particular, a job that is compatible with all servers in $S \subseteq \{1, \dots, N\}$ is labeled as a type-$S$ job. The fraction of jobs that receive a type-$S$ label is denoted by $p_S$, and the collection of all job types is referred to as $\mathcal{S} := \{S \subseteq \{1, \dots, N\} : p_S > 0\}$. The arrival process can equivalently be seen as $|\mathcal{S}|$ independent Poisson processes with rates $\{\lambda_S = N\lambda p_S : S \in \mathcal{S}\}$ and corresponding type labels.
A bipartite graph can be constructed to represent the compatibility constraints of the various job types, with a node for each job type and each server. A job type $S$ is connected to server $n$ whenever $n \in S$, for all $S \in \mathcal{S}$ and $n = 1, \dots, N$.
We denote the total number of type-S jobs, and the total number of waiting type-S jobs by Q S and QS , respectively, for all S ∈ S.
The arriving jobs are assigned to the various servers according to a redundancy policy. In particular, copies of the job are made for each of the compatible servers and forwarded to the respective waiting lines. Each copy is assumed to have an exponentially distributed service requirement with unit mean. The remaining (or redundant) copies are removed once one copy either finishes its service or initiates its service. The former setting is referred to as cancel-on-completion (c.o.c.) and the latter as cancel-on-start (c.o.s.). The aim of the redundancy policy is to exploit the variability in the queue lengths at different servers and the service requirements of different copies of the same job, which are assumed to be independent in case of the c.o.c. mechanism [1,7,18,41].

Stability and pre-limit results
In this section we elaborate on some of the known results for the system described in the previous section which will form the foundations for the analysis in Section 4.
First, the conditions below are necessary and sufficient in order for the system to be stable [18,41].
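Condition 1 (Stability). The system described in Section 2 is stable if for all non-empty subsets of job types $\mathcal{T} \subseteq \mathcal{S}$, $N\lambda\, p(\mathcal{T}) < \mu(\mathcal{T})$, where $p(\mathcal{T}) := \sum_{S \in \mathcal{T}} p_S$ denotes the fraction of jobs with labels in $\mathcal{T}$ and $\mu(\mathcal{T}) := \sum_{n \in \bigcup_{S \in \mathcal{T}} S} \mu_n$ denotes the aggregate service rate of all servers that are compatible with at least one job type in $\mathcal{T}$.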
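Since Condition 1 involves one inequality per non-empty subset of job types, it can be checked by brute-force enumeration for small systems. The sketch below is an illustration under our own conventions: job types are represented as `frozenset`s of server indices, and the names `nonempty_subsets`, `mu_of` and `is_stable` are hypothetical helpers, not notation from the paper.

```python
from itertools import combinations


def nonempty_subsets(types):
    """Yield every non-empty subset of the job types as a tuple."""
    types = list(types)
    for size in range(1, len(types) + 1):
        yield from combinations(types, size)


def mu_of(subset, mu):
    """Aggregate speed of all servers compatible with at least one type in subset."""
    servers = set().union(*subset)
    return sum(mu[n] for n in servers)


def is_stable(p, mu, lam):
    """Check N*lam*p(T) < mu(T) for every non-empty subset T of job types.

    p   : dict mapping a job type (frozenset of servers) to its fraction p_S
    mu  : dict mapping a server index to its speed mu_n
    lam : arrival rate per server
    """
    n_servers = len(mu)
    return all(
        n_servers * lam * sum(p[S] for S in subset) < mu_of(subset, mu)
        for subset in nonempty_subsets(p)
    )


if __name__ == "__main__":
    # Two servers of unit speed; one type served by both, one by server 2 only.
    p = {frozenset({1, 2}): 0.5, frozenset({2}): 0.5}
    mu = {1: 1.0, 2: 1.0}
    print(is_stable(p, mu, lam=0.9), is_stable(p, mu, lam=1.0))
```

The enumeration is exponential in the number of job types, so this is only meant for small examples.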
Under these stability conditions, product-form expressions for the steady-state distribution of the system state were derived by Gardner et al. [18], Ayesta et al. [7] and Visschers et al. [41]. The downside is that these expressions involve a very detailed state descriptor and offer little insight into the system performance. In particular, the product-form expressions have a factor for each (waiting) job in the system and hence give some information about the steady-state distribution of the (ordered and centralized) waiting line. However, it is not immediately clear how to gain more global understanding of the system dynamics, for instance in terms of the compatibility constraints or the total numbers of jobs of the various types.
The joint probability generating function (PGF) of the numbers of jobs of the various types was derived in [11] by suitably aggregating the states. For completeness, the PGFs for the c.o.c. and the c.o.s. mechanisms are provided in Propositions 3.1 and A.1 (in Appendix A.1), respectively.
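where $\boldsymbol{z}$ and $\boldsymbol{1}$ are $|\mathcal{S}|$-dimensional vectors with entries $|z_S| \leq 1$ and $(\boldsymbol{1})_S = 1$. The $m$-dimensional vector $\boldsymbol{S}$ consists of $m$ different job types, and the set consisting of all these vectors is denoted by $\mathcal{S}^m$ [11]. To ease the notation throughout this paper we introduce the following shorthand when focusing on the first $j$ entries of the vector $\boldsymbol{S}$: $p(\boldsymbol{S}, j) := \sum_{i=1}^{j} p_{S_i}$ and $\mu(\boldsymbol{S}, j) := \mu(S_1, \dots, S_j) = \sum_{n \in S_1 \cup \dots \cup S_j} \mu_n$.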

Heavy-traffic regime and non-complete resource pooling
We are mainly interested in the system performance when the average arrival rate per server, λ, approaches a critical value.
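Definition 1 (Critical arrival rate). The critical arrival rate of the system, $\lambda^*$, for given values of $\mu_n$, $n = 1, \dots, N$, and $p_S$, $S \in \mathcal{S}$, is defined as $\lambda^* := \frac{1}{N} \min_{\emptyset \neq \mathcal{T} \subseteq \mathcal{S}} \{\mu(\mathcal{T}) / p(\mathcal{T})\}$, with $\mu(\mathcal{T})$ and $p(\mathcal{T})$ as defined in Condition 1.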
Considering model parameters that satisfy the stability conditions (Condition 1), when the arrival rate λ is increased, (N λp S ) S∈S will reach the boundary of the stability region when λ becomes equal to the associated critical arrival rate λ * .To specify which part of the boundary of the stability region is reached, we define the critical subset(s) of job types.
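Definition 2 (Critical subset of types). The subset $\mathcal{T} \subseteq \mathcal{S}$ of job types is called critical if $N\lambda^* p(\mathcal{T}) = \mu(\mathcal{T})$, with $\mu(\mathcal{T})$ and $p(\mathcal{T})$ as defined in Condition 1. The collection of all critical subsets of job types is denoted by $\mathrm{CR}(\mathcal{S})$.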
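Definitions 1 and 2 can be evaluated mechanically for small systems. The following sketch is an illustration only: it enumerates all non-empty subsets of job types, computes the ratio of aggregate service rate to aggregate job fraction, and returns the minimizers. Exact rational arithmetic is used so that the equality test identifying critical subsets is reliable; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(types):
    """Yield every non-empty subset of the job types as a frozenset."""
    types = list(types)
    for size in range(1, len(types) + 1):
        for combo in combinations(types, size):
            yield frozenset(combo)


def critical_rate_and_subsets(p, mu):
    """Return (lambda_star, list of critical subsets of job types).

    p  : dict mapping a job type (frozenset of servers) to its fraction p_S
    mu : dict mapping a server index to its speed mu_n
    Values should be ints or Fractions to keep the arithmetic exact.
    """
    n_servers = len(mu)
    ratio = {}
    for subset in nonempty_subsets(p):
        servers = set().union(*subset)  # servers compatible with some type in subset
        mu_T = sum(Fraction(mu[n]) for n in servers)
        p_T = sum(Fraction(p[S]) for S in subset)
        ratio[subset] = mu_T / p_T
    smallest = min(ratio.values())
    critical = [subset for subset, value in ratio.items() if value == smallest]
    return smallest / n_servers, critical


if __name__ == "__main__":
    A, B = frozenset({1, 2}), frozenset({2})
    lam_star, critical = critical_rate_and_subsets(
        {A: Fraction(1, 2), B: Fraction(1, 2)}, {1: 1, 2: 1}
    )
    print(lam_star, critical)
```

In the usage example (two unit-speed servers and equal job fractions), both the subset containing only the type served by server 2 and the full set of job types attain the minimum.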
A common assumption in the heavy-traffic literature is formalized below.
Condition 2 (CRP). The system described in Section 2 satisfies the complete resource pooling (CRP) condition if there is a unique critical subset of job types, i.e., $\mathrm{CR}(\mathcal{S}) = \{\mathcal{T}^*\}$ for some $\mathcal{T}^* \subseteq \mathcal{S}$. Furthermore, we distinguish between weak and strong CRP depending on whether $\mathcal{T}^* \subsetneq \mathcal{S}$ or $\mathcal{T}^* = \mathcal{S}$, respectively.
Notice that the condition for strong CRP is equivalent to $p(\mathcal{T}) < \mu(\mathcal{T})/(N\mu)$ for all strict subsets of job types $\mathcal{T} \subsetneq \mathcal{S}$. Moreover, in that case the critical arrival rate is as large as possible, i.e., $\lambda^* = \mu$. We refer to [11, Appendix EC.1] for an in-depth comparison between the different notions of the CRP condition in the literature specialized to the setting of a parallel-server system.
Definition 3 (non-CRP). All system settings as described in Section 2 that do not satisfy the CRP condition (Condition 2) are referred to as non-CRP scenarios.
For these non-CRP scenarios we introduce the notions of depth of the critical subsets and k-critical vectors to study the critical subsets of job types in more detail.
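Definition 4 (Depth of the critical subsets). The depth $K$ of the critical subsets of job types is defined as $K := \max\{k : \mathcal{T}_1 \subsetneq \mathcal{T}_2 \subsetneq \dots \subsetneq \mathcal{T}_k,\ \mathcal{T}_1, \dots, \mathcal{T}_k \in \mathrm{CR}(\mathcal{S})\}$. Thus $K$ represents the maximum number of critical subsets that can be nested into each other.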
Note that the stability region can be thought of as an intersection of half spaces using the inequalities in Condition 1, where $K$ coincides with the number of half spaces that the critical arrival rate vector $(N\lambda^* p_S)_{S \in \mathcal{S}}$ reaches simultaneously. Moreover, depending on the model parameters, $K$ can take any integer value between 1 and $|\mathcal{S}|$.
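Given a collection of critical subsets, the depth is the length of a longest strictly nested chain, which can be computed with a simple dynamic program over the subsets sorted by size. The sketch below is purely combinatorial and takes the collection of critical subsets as input; the function name `depth` and the string labels for job types are our own illustrative choices.

```python
def depth(critical_subsets):
    """Length of the longest chain T_1 < T_2 < ... < T_k (strict inclusion).

    critical_subsets : iterable of frozensets, each a subset of job types.
    """
    subsets = sorted(set(critical_subsets), key=len)
    longest = {}  # longest[T] = length of the longest chain ending in T
    for T in subsets:
        # Any strict subset U of T has fewer elements, so it was processed earlier.
        longest[T] = 1 + max((longest[U] for U in subsets if U < T), default=0)
    return max(longest.values(), default=0)


if __name__ == "__main__":
    chain = [frozenset({"b"}), frozenset({"a", "b"})]
    print(depth(chain))
```

For frozensets, the operator `<` tests strict inclusion, which is exactly the nesting relation used in Definition 4.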
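The following connection between Condition 2 and Definition 4 can be made: the following two statements are equivalent: (i) the CRP condition (Condition 2) is satisfied; (ii) $K = 1$.

Proof. To prove the equivalence, we exploit the definition and properties of the collection of all critical subsets $\mathrm{CR}(\mathcal{S})$. The first statement obviously implies the second statement since $|\mathrm{CR}(\mathcal{S})| = 1$. For the reverse statement we show that $K$ will always be strictly larger than 1 if there is more than one critical subset of job types, i.e., $|\mathrm{CR}(\mathcal{S})| > 1$. Let $\mathcal{T}_1, \mathcal{T}_2 \in \mathrm{CR}(\mathcal{S})$ be two distinct critical subsets of job types. We distinguish two scenarios: $\mathcal{T}_1 \cap \mathcal{T}_2 \in \mathrm{CR}(\mathcal{S})$ and $\mathcal{T}_1 \cap \mathcal{T}_2 \notin \mathrm{CR}(\mathcal{S})$. In the former setting it is clear that (after relabeling if necessary) $\mathcal{T}_1 \cap \mathcal{T}_2 \subsetneq \mathcal{T}_1$ and hence $K \geq 2$. We show that the latter setting contradicts the definitions of the critical arrival rate $\lambda^*$ (Definition 1) and the collection of all critical subsets of job types (Definition 2), and the stability condition (Condition 1). If $\mathcal{T}_1 \cap \mathcal{T}_2$ is non-empty and not critical, then $N\lambda^* p(\mathcal{T}_1 \cap \mathcal{T}_2) < \mu(\mathcal{T}_1 \cap \mathcal{T}_2)$, and hence
\[
N\lambda^* p(\mathcal{T}_1 \cup \mathcal{T}_2) = N\lambda^* \big( p(\mathcal{T}_1) + p(\mathcal{T}_2) - p(\mathcal{T}_1 \cap \mathcal{T}_2) \big) > \mu(\mathcal{T}_1) + \mu(\mathcal{T}_2) - \mu(\mathcal{T}_1 \cap \mathcal{T}_2) \geq \mu(\mathcal{T}_1 \cup \mathcal{T}_2).
\]
This violates the stability constraint induced by the subset of job types $\mathcal{T}_1 \cup \mathcal{T}_2$. (If the intersection is empty, the same computation holds with equality in the first step, so that $\mathcal{T}_1 \cup \mathcal{T}_2$ is itself critical and again $K \geq 2$.) The latter inequality follows from the fact that $\mu(\mathcal{T}_1 \cup \mathcal{T}_2) + \mu(\mathcal{T}_1 \cap \mathcal{T}_2) \leq \mu(\mathcal{T}_1) + \mu(\mathcal{T}_2)$. Indeed, if server $n$ is compatible with a job type in $\mathcal{T}_1 \cap \mathcal{T}_2$, then it is compatible with a job type in $\mathcal{T}_1$ and a job type in $\mathcal{T}_2$, such that the term $\mu_n$ will occur twice on both sides of the inequality. If server $n$ is compatible with a job type in $\mathcal{T}_1 \cup \mathcal{T}_2$ but not with a job type in $\mathcal{T}_1 \cap \mathcal{T}_2$, then there exists at least one job type in $\mathcal{T}_1$ or $\mathcal{T}_2$ that server $n$ is compatible with. So, since server $n$ contributes once to the left-hand side, and at least once to the right-hand side, the above inequality holds. This concludes the proof.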
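The inequality $\mu(\mathcal{T}_1 \cup \mathcal{T}_2) + \mu(\mathcal{T}_1 \cap \mathcal{T}_2) \leq \mu(\mathcal{T}_1) + \mu(\mathcal{T}_2)$ used in the proof can be sanity-checked by exhaustive enumeration on a small example. The sketch below is a numerical illustration, not a substitute for the argument above; the three-server example and the helper names are assumptions of ours, and integer speeds are used to avoid rounding effects in the comparison.

```python
from itertools import combinations


def mu_of(subset, mu):
    """Aggregate speed of servers compatible with some type in subset (0 if empty)."""
    servers = set().union(*subset)
    return sum(mu[n] for n in servers)


def all_subsets(types):
    """All subsets of the job types, including the empty one, as frozensets."""
    types = list(types)
    return [
        frozenset(combo)
        for size in range(len(types) + 1)
        for combo in combinations(types, size)
    ]


def submodular_everywhere(types, mu):
    """Check mu(T1 | T2) + mu(T1 & T2) <= mu(T1) + mu(T2) for all pairs of subsets."""
    subsets = all_subsets(types)
    return all(
        mu_of(T1 | T2, mu) + mu_of(T1 & T2, mu) <= mu_of(T1, mu) + mu_of(T2, mu)
        for T1 in subsets
        for T2 in subsets
    )


if __name__ == "__main__":
    types = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3})]
    print(submodular_everywhere(types, {1: 2, 2: 3, 3: 5}))
```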
As can be seen in Proposition 3.1, the expression for the PGF of the numbers of jobs of the various types depends on an enumeration of all vectors whose entries are distinct job types, i.e., all $m$-dimensional vectors $\boldsymbol{S} = [S_1, \dots, S_m] \in \mathcal{S}^m$ with $m = 1, \dots, |\mathcal{S}|$. We now introduce a different perspective on the ordered vectors of job types that focuses on their relation with the critical subsets rather than their lengths.
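Definition 5 (k-Critical vectors). Let $\boldsymbol{T} = [T_1, T_2, \dots, T_m] \in \mathcal{S}^m$ be an ordered vector with $m$ distinct entries belonging to the set of job types $\mathcal{S}$. Then $\boldsymbol{T}$ is called a $k$-critical vector if aggregating the first $i$ entries of $\boldsymbol{T}$ yields a critical subset of job types for exactly $k$ of the indices $i = 1, \dots, m$.

In other words, $\boldsymbol{T}$ induces $k$ critical subsets when combining the respective job types one by one. The collection of all $k$-critical vectors is denoted by $\mathcal{N}_k$ and $\mathcal{N} := \bigcup_{k=0}^{K} \mathcal{N}_k$. Moreover, for a fixed $k$-critical vector $\boldsymbol{T} \in \mathcal{N}_k$ we define $\mathrm{CR}(\boldsymbol{T}) := \{i : T_1 \cup \dots \cup T_i \in \mathrm{CR}(\mathcal{S})\}$ as the set that contains all indices $i \in \{1, \dots, |\boldsymbol{T}|\}$ such that aggregation of the first $i$ entries of $\boldsymbol{T}$ yields a critical subset of job types. In particular, if such indices do not exist, i.e., $\mathrm{CR}(\boldsymbol{T}) = \emptyset$, then $\boldsymbol{T}$ is referred to as a 0-critical vector.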
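The classification of ordered vectors by the number of critical prefixes can be sketched as follows. This is an illustration under our own conventions: the collection of critical subsets is supplied as input, job types are `frozenset`s of server indices, and `classify_vectors` is a hypothetical helper name.

```python
from itertools import permutations


def classify_vectors(types, critical):
    """Group all ordered vectors of distinct job types by their number of critical prefixes.

    types    : list of job types
    critical : collection of critical subsets (each an iterable of job types)
    Returns a dict k -> list of (vector, indices), where indices lists every i
    such that the first i entries of the vector form a critical subset.
    """
    critical = {frozenset(subset) for subset in critical}
    classes = {}
    for m in range(1, len(types) + 1):
        for vector in permutations(types, m):
            indices = [i for i in range(1, m + 1) if frozenset(vector[:i]) in critical]
            classes.setdefault(len(indices), []).append((vector, indices))
    return classes


if __name__ == "__main__":
    A, B = frozenset({1, 2}), frozenset({2})
    # Assumed critical collection: the subset {B} and the full set {A, B}.
    for k, members in sorted(classify_vectors([A, B], [{B}, {A, B}]).items()):
        print(k, members)
```

In the usage example, only the vector that lists the type served by server 2 first has two critical prefixes.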
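Note that $\mathcal{N}_k$ is empty for all $k > K$ and that the sets $\mathcal{N}_0, \dots, \mathcal{N}_K$ together contain every ordered vector of distinct job types, i.e., $\mathcal{N} = \bigcup_{m=1}^{|\mathcal{S}|} \mathcal{S}^m$.

Example 1 (N-model). Figure 1 visualizes the N-model consisting of two job types and two servers and its stability region. The first job type is compatible with both servers, while the second job type is only compatible with the second server. Henceforth, the two job types are labeled as type-{1, 2} and type-{2} jobs, respectively.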
We observe that there are three different heavy-traffic scenarios that can occur corresponding to the various edges and corner points of the stability region, labeled as I, II and III in Figure 1b. Table 1 summarizes these different heavy-traffic scenarios together with an illustration of the above definitions.
Table 1: An overview of the different heavy-traffic scenarios of the N-model for which S = {{1, 2}, {2}}, using definitions of Subsection 3.2.

CRP components
In this paper we pursue a heavy-traffic analysis for the entire boundary of the stability region and in particular extend the results from [11], allowing for boundary points to lie on an intersection of multiple faces of the stability region. These intersections appear when multiple subsets of job types simultaneously approach the stability conditions (Condition 1), i.e., there are several active capacity bottlenecks in heavy traffic. These kinds of scenarios yield a more intricate analysis and complex behavior, and are hence typically excluded from consideration in the literature.
To describe our results in Section 4 and to connect them to the work of Afèche et al. [3], which focuses on the c.o.s. mechanism, we first review some of the concepts and notation from [3]. Then we will explain how our set-up with critical subsets fits their vocabulary.
Given a bipartite graph structure representing the compatibility constraints between job types and servers, the collection of edges that yield a stable system is referred to as a matching M. From this matching a residual matching M is obtained by solving a limiting (λ = λ*) maximum-flow problem related to the system; the residual matching consists of all edges with a non-zero flow in the optimal solution. Informally speaking, one can think of the residual matching M as a subset of the edges of the original matching M which will be dominant in the heavy-traffic regime. This residual matching M will induce a partitioning of the job types and servers into so-called CRP components.
Definition 6 (CRP components). Given the residual matching M, the system can be partitioned into $K$ distinct connected components or CRP components. They are labeled $C_1, \dots, C_K$, and each component consists of a subset of job types $\mathcal{C}_k$ and their asymptotically compatible servers $\mathcal{Z}_k$, i.e., $C_k := (\mathcal{C}_k, \mathcal{Z}_k)$.
Remark 1. Note that $\mathcal{Z}_k$ is a subset of the servers that the job types in $\mathcal{C}_k$ are compatible with, i.e., $\mathcal{Z}_k \subseteq \bigcup_{T \in \mathcal{C}_k} T$, as the residual matching M is a subset of the original matching M.

Using these CRP components, a directed acyclic graph (DAG) can be obtained by creating an edge $(C_i, C_j)$ whenever there are job types in $C_i$ that are compatible with any of the servers in $C_j$ [3, Lemma 2]. Building on this DAG, the different CRP components can be ordered.
Definition 7 (Topological order). Let $\sigma = (\sigma(1), \dots, \sigma(K))$ be a permutation of $\{1, \dots, K\}$. Then $C_{\sigma(1)}, \dots, C_{\sigma(K)}$ is called a topological order if for any directed arc $(C_i, C_j)$ associated with the above-mentioned DAG, it holds that $\sigma^{-1}(j) < \sigma^{-1}(i)$. In words, if component $C_i$ can forward (some of) its jobs to component $C_j$, then component $C_j$ must occur earlier in the topological ordering. The set of all permutations yielding topological orders is denoted by $\Sigma_K$.
Example 2. Consider the N-model in Example 1, and assume that $(p_{\{1,2\}}, p_{\{2\}}) = (\mu_1, \mu_2)/(\mu_1 + \mu_2)$. Hence, scenario III will occur in the heavy-traffic regime (Table 1). The residual matching M is given by two edges, connecting job type {1, 2} to server 1 and job type {2} to server 2. This disconnects the compatibility graph and results in two CRP components, $C_1 = (\{\{1, 2\}\}, \{1\})$ and $C_2 = (\{\{2\}\}, \{2\})$. Since the job type in component $C_1$ is compatible with a server in component $C_2$, the associated DAG is simply given by the edge $(C_1, C_2)$. Hence, the only permutation yielding a valid topological ordering of the CRP components is $(2, 1)$, i.e., $\Sigma_2 = \{(2, 1)\}$.
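The set of permutations in Definition 7 can be enumerated by brute force for a small number of components. The sketch below is an illustration only: components are identified by their indices $1, \dots, K$, an arc $(i, j)$ stands for the edge $(C_i, C_j)$ of the DAG, and `topological_orders` is a hypothetical helper name.

```python
from itertools import permutations


def topological_orders(num_components, arcs):
    """Return all permutations sigma of 1..K that satisfy Definition 7.

    arcs : iterable of pairs (i, j) meaning component i can forward jobs to
           component j, so that j must appear before i in the order.
    """
    orders = []
    for sigma in permutations(range(1, num_components + 1)):
        # position[c] plays the role of sigma^{-1}(c), counted from zero.
        position = {component: index for index, component in enumerate(sigma)}
        if all(position[j] < position[i] for (i, j) in arcs):
            orders.append(sigma)
    return orders


if __name__ == "__main__":
    # Two components with the single arc (C_1, C_2), as in the N-model example.
    print(topological_orders(2, [(1, 2)]))
```

With two components and the single arc $(C_1, C_2)$, the enumeration returns only the permutation $(2, 1)$. The enumeration visits all $K!$ permutations, so it is only meant for small $K$.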
Using the above definitions, we can trace the concepts of CRP components back to our critical subsets of job types (Definition 2).

Construction 1.

Define for all
Remark 2. The obtained sets (T σ,k ) σ,k do not all have to be different.For instance, if λ * = µ, then the full set of job types S is a critical subset, or each job type is connected to at least one (compatible) server in the residual matching M .Hence, T σ,K ≡ S for all σ ∈ Σ K .Theorem 3.1.Construction 1 yields all critical subsets of job types, i.e., CR(S) The proof of Theorem 3.1 builds on arguments in the proofs of [3,Lemmas 3 and 4] which focus on the properties of the CRP components and is deferred to Appendix A.2.
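Remark 3. As the subsets of job types of the different CRP components, i.e., $(\mathcal{C}_k)_k$, are non-empty, we have that $\mathcal{T}_{\sigma,1} \subsetneq \mathcal{T}_{\sigma,2} \subsetneq \dots \subsetneq \mathcal{T}_{\sigma,K}$ for every $\sigma \in \Sigma_K$. Hence, the notions of $K$ in Definition 4 (depth of the critical subsets) and Definition 6 (the number of CRP components) coincide.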
In the remainder of this manuscript we will adhere to the following notation.
Notation 1. The full set of job types is always denoted by $\mathcal{S}$, and any subset of $\mathcal{S}$ will also be written in calligraphic font, e.g., $\mathcal{T} \subseteq \mathcal{S}$, while individual job types are written in regular Roman font, e.g., $T \in \mathcal{S}$. Vectors are written in bold and their elements are denoted by the same (non-bold) letter with a subscript, e.g., $\boldsymbol{T} = [T_1, T_2, \dots, T_j]$. We use $\mathbb{N}$ to denote the set of natural numbers, including zero.

Main results
In this section we state the main heavy-traffic result in Theorem 4.1, which applies to both the c.o.c. and the c.o.s. mechanism. As the notation in Section 3 already alludes to, we assume that the fractions of jobs of each type remain fixed as the system approaches its heavy-traffic limit, i.e., $(\lambda_S)_{S \in \mathcal{S}} = N\lambda\,(p_S)_{S \in \mathcal{S}}$ and $\lambda \uparrow \lambda^*$. A generalization of the main result where this assumption is relaxed is formulated in Theorem C.1 in Appendix C. In the proofs below, we will focus on the c.o.c. mechanism, and details for the c.o.s. mechanism are deferred to Appendix D.
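For notational convenience, we define
$$\beta(\mathcal{N}_k) := \sum_{\boldsymbol{T} \in \mathcal{N}_k} \beta(\boldsymbol{T})$$
for $k = 0, \dots, K$, and
$$P^*(\boldsymbol{T}) := \frac{\beta(\boldsymbol{T})}{\beta(\mathcal{N}_K)} \qquad (7)$$
for any $\boldsymbol{T} \in \mathcal{N}_K$. The value $P^*(\boldsymbol{T})$ represents the limiting probability that the state of the system corresponds to the ordered vector of job types $\boldsymbol{T}$, i.e., the oldest job in the system is of type $T_1$, the oldest job that is not of type $T_1$ is of type $T_2$, etc. The expression for the PGF in Proposition 3.1 can then be leveraged to prove the following convergence result.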
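Proposition 4.1 (Convergence of PGFs). With the definitions as in Section 3, the joint PGF of the vector of (scaled) queue lengths satisfies
$$\mathbb{E}\Big[\prod_{S \in \mathcal{S}} z_S^{Q_S}\Big] \to \sum_{\boldsymbol{T} \in \mathcal{N}_K} P^*(\boldsymbol{T}) \prod_{i \in CR(\boldsymbol{T})} \Big(1 + \sum_{j=1}^{i} \frac{t_{T_j}\, p_{T_j}}{p(\boldsymbol{T}, i)}\Big)^{-1}, \qquad (8)$$
with $z_S := \exp(-(1 - \lambda/\lambda^*)\, t_S)$, as $\lambda \uparrow \lambda^*$ with $t_S \geq 0$ for all $S \in \mathcal{S}$.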
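Using Lévy's Continuity Theorem [25] and the above expression, it can be shown that the (scaled) numbers of jobs of the various types converge in distribution to a random vector $(X_S)_{S \in \mathcal{S}}$ associated with a mixture distribution. There is a mixture weight for each $\boldsymbol{T} \in \mathcal{N}_K$ and a corresponding mixture component which consists of a weighted sum of $K$ independent exponential random variables. Define the indices $i_1, \dots, i_K$ such that $\{i_1, \dots, i_K\} = CR(\boldsymbol{T})$ with $i_1 < i_2 < \dots < i_K$, and let $j_S(\boldsymbol{T})$ denote the position of $S$ in $\boldsymbol{T}$, i.e., $T_{j_S(\boldsymbol{T})} = S$, for any $S \in \mathcal{S}$ and $\boldsymbol{T} \in \mathcal{N}_K$. Let $(I_{\boldsymbol{T}})_{\boldsymbol{T} \in \mathcal{N}_K} = e(\tilde{\boldsymbol{T}})$ with probability $P^*(\tilde{\boldsymbol{T}})$ for any $\tilde{\boldsymbol{T}} \in \mathcal{N}_K$, where $e(\tilde{\boldsymbol{T}})$ is the $|\mathcal{N}_K|$-dimensional unit vector with a one in the position corresponding to the ordered vector $\tilde{\boldsymbol{T}}$. Then, the (random) coefficients are given by
$$P_{k,S} := \sum_{\boldsymbol{T} \in \mathcal{N}_K} I_{\boldsymbol{T}}\, \frac{p_S}{p(\boldsymbol{T}, i_k)}\, \mathbb{1}\{j_S(\boldsymbol{T}) \leq i_k\} \qquad (9)$$
for any $S \in \mathcal{S}$ and $k = 1, \dots, K$, where the indices $i_k$ depend on $\boldsymbol{T}$, such that
$$X_S := \sum_{k=1}^{K} P_{k,S}\, U_k, \qquad (10)$$
with $U_1, \dots, U_K$ independent and exponentially distributed random variables with unit mean.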
Theorem 4.1 (Convergence to a mixture distribution). The vector of (scaled) queue lengths, $(1 - \lambda/\lambda^*)(Q_S)_{S \in \mathcal{S}}$, converges in distribution to a random vector associated with the mixture distribution as specified above, i.e.,
$$(1 - \lambda/\lambda^*)\,(Q_S)_{S \in \mathcal{S}} \xrightarrow{d} (X_S)_{S \in \mathcal{S}} \quad \text{as } \lambda \uparrow \lambda^*.$$

Hence, the limiting (scaled) queue length vector can be written as a linear combination of the random vectors $\boldsymbol{P}_1, \dots, \boldsymbol{P}_K$ as in (9) with scalar coefficients given by $K$ independent and exponentially distributed random variables with unit mean. This means that the limiting vector of (scaled) queue lengths lives in a $K$-dimensional cone spanned by these random vectors $\boldsymbol{P}_1, \dots, \boldsymbol{P}_K$.

[Figure 2(a): Sketch of the system.]
Example 4. Consider the four-server system as depicted in Figure 2a with arrival rates and identical processing speeds per server ($\mu_n \equiv \mu$). Then the critical arrival rate is $\lambda^* = \mu$ and the set of critical subsets is given by $CR(\mathcal{S}) = \{\{\{1\}\}, \{\{3\}, \{3, 4\}\}, \{\{1\}, \{3\}, \{3, 4\}\}, \mathcal{S}\}$. There are four critical vectors in $\mathcal{N}_K$, with $K = 3$, which are presented in the first column of Table 2. The associated mixture weights are computed using the expression in (5), and the various mixture components of the limiting random vector $(X_S)_{S \in \mathcal{S}}$ are obtained from (10).

Moreover, it follows that the (scaled) total number of jobs converges to a random variable with an Erlang distribution with parameters 1 and $K$.

The complexity of the expressions for the mixture weights and mixture components makes it difficult in general to get a grip on the system's performance in heavy traffic. Careful consideration of Example 4 already suggests that the stochastic representation for $(X_S)_{S \in \mathcal{S}}$ in (10) can be simplified. Indeed, two pairs of ordered vectors yield the same mixture component, so that the original mixture distribution in Table 2 can be rewritten as a distribution with only two mixture components and adapted mixture weights.

Definition 8 (Subgraphs and roots of the DAG). For any $k = 1, \dots, K$, consider the subgraph of the associated DAG rooted at the critical component $C_k$ and let $V_k$ denote the collection of all job types that occur in this rooted subgraph. Then $p(V_k) := \sum_{S \in V_k} p_S$ represents the sum of the arrival probabilities of all job types that occur in $V_k$. Let $R_K$ denote the set of the root nodes of the DAG.
Note that R K is always non-empty as the associated DAG has no (directed) cycles by definition [3, Lemma 2].
The stochastic representation for $(X_S)_{S \in \mathcal{S}}$ in (10) can be simplified under the following assumption on the structure of the associated DAG.
Assumption 1 (Root partitioning of DAG). For any $r \in R_K$, define $V_r$ as the set of nodes of the subgraph of the associated DAG with root node $r$. Assume that the root nodes yield a disjoint partitioning of the nodes in the DAG, i.e., for any $r, r' \in R_K$ with $r \neq r'$, it holds that $V_r \cap V_{r'} = \emptyset$.

It is worth emphasizing that this assumption is satisfied in a wide range of systems and non-CRP scenarios, including, but certainly not restricted to, so-called nested systems as also studied in [4, 16, 17]. Indeed, the compatibility constraints of the system in Example 4 do not yield a nested system (Figure 2a), yet the associated DAG satisfies Assumption 1 (Figure 2b). A further discussion is provided in Appendix B.2.
The simplified result then states that the vector of (scaled) queue lengths converges in distribution to a random vector whose entries are linear combinations of the entries of $K$ vectors $\tilde{\boldsymbol{p}}_1, \dots, \tilde{\boldsymbol{p}}_K$, whose (deterministic) entries are determined by (the subgraphs of) the DAG associated with the system. In particular, define
$$\tilde{p}_{k,S} := \frac{p_S}{p(V_k)}\, \mathbb{1}\{S \in V_k\}, \quad S \in \mathcal{S}, \qquad (11)$$
for all $k = 1, \dots, K$. The scalar coefficients of the linear combination are given by $K$ independent and exponentially distributed random variables with unit mean. Thus, the joint queue length vector exhibits a state space collapse onto a $K$-dimensional cone spanned by the vectors $\tilde{\boldsymbol{p}}_1, \dots, \tilde{\boldsymbol{p}}_K$, which is one-dimensional only if $K = 1$, i.e., if the CRP condition is satisfied.
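Theorem 4.2 (Convergence to a non-mixture distribution). If Assumption 1 holds, then the vector of (scaled) queue lengths satisfies
$$(1 - \lambda/\lambda^*)\,(Q_S)_{S \in \mathcal{S}} \xrightarrow{d} (Y_S)_{S \in \mathcal{S}} \quad \text{as } \lambda \uparrow \lambda^*, \qquad Y_S := \sum_{k=1}^{K} \tilde{p}_{k,S}\, U_k, \qquad (12)$$
with $\tilde{p}_{k,S}$ as defined in (11) and $U_1, \dots, U_K$ independent and exponentially distributed random variables with unit mean.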
As already established in Corollary 4.1, the (scaled) total number of jobs converges to a random variable with an Erlang distribution with parameters 1 and $K$, since $\sum_{S \in \mathcal{S}} \tilde{p}_{k,S} = 1$ for all $k = 1, \dots, K$, and hence $\sum_{S \in \mathcal{S}} Y_S = \sum_{k=1}^{K} U_k$. We now provide an example to illustrate the simplified result. A further discussion of Theorem 4.2 is provided in the following subsection.
and exponentially distributed random variables with unit mean. This example highlights that the simplified result in Theorem 4.2 is not obtained by showing that $\boldsymbol{P}_k = \tilde{\boldsymbol{p}}_k$ for all $k$ under Assumption 1, but rather relies on subtle relationships between the model parameters and the properties of independent and exponentially distributed random variables. Indeed, observe that $\tilde{\boldsymbol{p}}_1 = [1, 0, 0, 0]$, while $\boldsymbol{P}_1 = [1, 0, 0, 0]$ with probability $\tfrac{2}{3}$ or $[0, 0, \tfrac{1}{3}, \tfrac{2}{3}]$ with probability $\tfrac{1}{3}$. In fact, the subscript $k$ in (9) specifically represents the depths of critical subsets induced by the ordered vectors $\boldsymbol{T}$, while the subscript $k$ in (11) corresponds to the different critical components, which can be labeled arbitrarily.
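The representation in Theorem 4.2 lends itself to a small numerical sketch. The code below assumes that the deterministic spanning vectors have entries $p_S / p(V_k)$ for $S \in V_k$ and zero otherwise, which is consistent with the stated property that each vector sums to one; the job-type labels, the probabilities and the rooted subgraphs are hypothetical values chosen purely for illustration, and the function names are not taken from the paper.

```python
import random

def cone_vectors(p, rooted_subgraphs):
    """Spanning vectors of the cone: entry S of vector k equals
    p_S / p(V_k) if S belongs to V_k and 0 otherwise (assumed form)."""
    vectors = {}
    for k, job_types in rooted_subgraphs.items():
        mass = sum(p[s] for s in job_types)  # p(V_k)
        vectors[k] = {s: (p[s] / mass if s in job_types else 0.0) for s in p}
    return vectors

def sample_limit(p, rooted_subgraphs, rng):
    """Draw one sample of the limiting vector: a linear combination of the
    spanning vectors with independent unit-mean exponential coefficients."""
    vectors = cone_vectors(p, rooted_subgraphs)
    u = {k: rng.expovariate(1.0) for k in vectors}
    sample = {s: sum(vectors[k][s] * u[k] for k in vectors) for s in p}
    return sample, u

# Hypothetical instance: three job types, one root (3) above two leaves (1, 2).
p = {"a": 0.2, "b": 0.3, "c": 0.5}
rooted_subgraphs = {1: {"a"}, 2: {"b"}, 3: {"a", "b", "c"}}

sample, u = sample_limit(p, rooted_subgraphs, random.Random(1))
print(sample)
print(sum(sample.values()), sum(u.values()))  # the two totals agree
```

Because every spanning vector sums to one, the total of each sample equals the sum of the $K$ exponential coefficients, in line with the Erlang limit for the (scaled) total number of jobs.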
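The result in Theorem 4.2 is established by proving that the Laplace transforms of $(X_S)_{S \in \mathcal{S}}$ and $(Y_S)_{S \in \mathcal{S}}$ coincide after simplification, as formalized in the next proposition.

Proposition 4.2. For $t_S \geq 0$ for all $S \in \mathcal{S}$, it holds under Assumption 1 that
$$\mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S X_S\Big)\Big] = \mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S Y_S\Big)\Big],$$
with the former Laplace transform as in (8) and
$$\mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S Y_S\Big)\Big] = \prod_{k=1}^{K} \Big(1 + \sum_{S \in V_k} \frac{t_S\, p_S}{p(V_k)}\Big)^{-1}. \qquad (13)$$

The proof of Proposition 4.2 is deferred to Subsection 4.3.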
To summarize, the proof of Theorem 4.2 consists of two major parts, as depicted at the top of the diagram in Figure 3. We first investigate the heavy-traffic limit of the PGF in Proposition 3.1 (Proposition 4.1). From this we can deduce that the (scaled) numbers of jobs of the various types converge to a random vector associated with a mixture distribution (Theorem 4.1). Second, we show that the Laplace transform of this mixture distribution under Assumption 1 coincides with the Laplace transform of the random vector $(Y_S)_{S \in \mathcal{S}}$ as in (12) (Proposition 4.2). The result in Theorem 4.2 then immediately follows, i.e., the (scaled) numbers of jobs of the various types converge to $(Y_S)_{S \in \mathcal{S}}$.
As visualized in the lower part of the diagram in Figure 3, we also provide an alternative proof of the heavy-traffic result in Theorem 4.1 in Appendix E. This alternative reasoning builds on the stochastic interpretation of the PGF in Proposition 3.1.In particular, we first derive a pre-limit characterization of the numbers of jobs of the various types in terms of weighted sums of geometrically distributed random variables whose parameters depend on the model parameters and the order in which the different job types occur.Subsequently we investigate the heavy-traffic behavior of each of these geometric random variables.

Discussion
In this subsection we provide further insight into the result stated in Theorem 4.2.
Depending on the compatibility constraints and the model parameters, K can take any value between 1 and |S|.We now present two examples of these extremes.
For instance, when the strong CRP condition (Condition 2) is satisfied, the associated DAG consists of a single node (i.e., $K = 1$) which includes all job types. In this special case, the result in Theorem 4.2 simplifies to
$$(1 - \lambda/\mu)\,(Q_S)_{S \in \mathcal{S}} \xrightarrow{d} (p_S)_{S \in \mathcal{S}}\, U$$
as $\lambda \uparrow \mu$, with $U$ a unit-mean exponential random variable. Hence, the system exhibits full state space collapse. Moreover, the limiting joint distribution coincides with that of a multi-class M/M/1 queue with arrival rate $N\lambda$, service rate $N\mu$ and class probabilities $(p_S)_{S \in \mathcal{S}}$. This agrees with the results in [11, Theorem 1], which hold under both the weak and strong CRP condition.
In contrast, consider a setting with $\mathcal{S} = \{\{n\} : n = 1, \dots, N\}$, identical arrival rates $\lambda_{\{n\}} \equiv \lambda$ and identical processing speeds $\mu_n \equiv \mu$, i.e., each job type is compatible with only one server and each server has just one compatible job type. In this case, all subsets of job types are critical and the associated DAG consists of $N$ isolated nodes, i.e., $K = N = |\mathcal{S}|$. The result in Theorem 4.2 then reads
$$(1 - \lambda/\mu)\,(Q_{\{n\}})_{n = 1, \dots, N} \xrightarrow{d} (U_1, \dots, U_N)$$
as $\lambda \uparrow \mu$, with $U_1, \dots, U_N$ independent and exponentially distributed random variables with unit mean. This is consistent with the well-known result that the queue length of an M($\lambda$)/M($\mu$)/1 queue, when scaled with $1 - \lambda/\mu$, converges to an exponentially distributed random variable with unit mean as $\lambda \uparrow \mu$. The above example can of course be seen as a collection of $N$ independent M($\lambda$)/M($\mu$)/1 queues, hence the joint distribution of the (scaled) numbers of jobs of the various types converges to the distribution of $N$ independent exponential random variables.
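The single-queue limit invoked above can be checked numerically. The sketch below uses the standard geometric stationary distribution of the number of jobs in an M/M/1 queue with load $\rho = \lambda/\mu$, whose PGF is $(1 - \rho)/(1 - \rho z)$, and evaluates it at $z = \exp(-(1 - \rho)t)$; the function names and the choice $t = 2$ are illustrative assumptions.

```python
import math

def scaled_mm1_transform(rho, t):
    """E[exp(-t (1 - rho) Q)] for the stationary number of jobs Q in an
    M/M/1 queue with load rho, using P(Q = n) = (1 - rho) rho^n."""
    z = math.exp(-(1.0 - rho) * t)
    return (1.0 - rho) / (1.0 - rho * z)

def exponential_transform(t):
    """Laplace transform of a unit-mean exponential random variable."""
    return 1.0 / (1.0 + t)

for rho in (0.9, 0.99, 0.999):
    gap = abs(scaled_mm1_transform(rho, 2.0) - exponential_transform(2.0))
    print(f"rho = {rho}: gap = {gap:.6f}")
```

The gap shrinks roughly linearly in $1 - \rho$, which illustrates the convergence of the scaled queue length to a unit-mean exponential random variable.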
As already alluded to above, the joint queue length vector exhibits a state space collapse onto a, possibly random, K-dimensional cone.Even if only a partial state space collapse occurs, i.e., K > 1, or even when K = |S|, the limiting system will be more convenient to analyze.So, even when K is large, and hence the dimension reduction is limited, it is more manageable to study a collection of independent exponential random variables than the pre-limit queue length distribution (possibly via the product-form expressions).
As observed in the introduction, explicit results for steady-state queue length distributions in non-CRP scenarios as presented in Theorem 4.2 have remained extremely scarce so far.To the best of our knowledge, no distributional results have been established in general settings, and we are only aware of some promising and revealing advances for special model instances.In particular, Maguluri & Jhunjhunwala [26,27] extend the transform method to non-CRP scenarios to derive an implicit characterization of the limiting queue length distribution in terms of a certain functional equation for the class of input-queued switches operating in a time-slotted fashion under a MaxWeight scheduling algorithm.In the special case of the N-model the functional equation can be solved under additional symmetry assumptions (equal service rates, and hence equal arrival rates in a non-CRP scenario) to obtain a result of the form as λ ↑ µ, with U 1 and U 2 independent unit-mean exponential random variables.
While the proof techniques and specific model attributes are quite different from our framework, it is striking to observe the close resemblance in the limiting queue length distribution in this particular case.Since the above results for the MaxWeight algorithm do not extend to heterogeneous settings or more general compatibility graphs, it is not immediately clear though to what extent the similarity is due to the symmetry assumptions or might possibly apply more broadly.
Finally, notice that we did not assume the compatibility graph to be connected to obtain the results above.In case of a disconnected graph it might however be notationally easier to apply Theorem 4.2 (or Theorem 4.1) to each connected component separately.In the remainder of this section it is assumed that all job types belong to (at least) one of the critical subsets and hence the whole system experiences criticality, so that λ * = µ.This assumption mainly serves to ease the notation and is non-essential; pointers to proofs and notation that need to be adapted otherwise can be found in Appendix B.1.
The proof of Proposition 4.1, and the proofs of some of the remaining results, will rely on the following intermediate result.
Notice that h(T , z) is precisely the term in f (z) in (3) for a given ordered vector of job types T and that P * (T ) = β(T )/β(N K ) for any T ∈ N K with P * (T ) as in (7).
Proof of Lemma 4.1.First, we note that we can write h(T , z) as the product of . Now, from Definition 5 we know that none of the factors of h 1 will diverge when λ ↑ λ * .In fact, lim λ↑λ * h 1 (T , z) = β(T ), with β(T ) as in (5).Next, from Definition 5 we know that each of the k factors of h 2 will diverge when λ ↑ λ * .However, after applying l'Hôpital's rule, we observe that Combining the above two observations concludes the proof.
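Proof of Proposition 4.1. We need to show that the PGF in (2) converges to (8) when $\lambda \uparrow \lambda^*$ and when the number of jobs of the various types is scaled by $(1 - \lambda/\lambda^*)$. First, we rewrite the function $f$ in (3) by rearranging the terms according to the number of critical sets of job types they correspond to. In particular,
$$f(z) = \sum_{k=0}^{K} f_k(z), \qquad f_k(z) := \sum_{\boldsymbol{T} \in \mathcal{N}_k} h(\boldsymbol{T}, z),$$
and $h(\boldsymbol{T}, z)$ as in (14). Note that we may write
$$\frac{f(z)}{f(1)} = \frac{f_K(z)}{f_K(1)} \cdot \frac{1 + \sum_{k=0}^{K-1} f_k(z)/f_K(z)}{1 + \sum_{k=0}^{K-1} f_k(1)/f_K(1)}.$$

We will first show that $f_k(z)/f_K(z)$ converges to 0 as $\lambda \uparrow \lambda^*$ for all $k = 0, \dots, K - 1$ and $t_S \geq 0$ for all $S \in \mathcal{S}$. Then we prove that $f_K(z)/f_K(1)$ indeed tends to (8). Recalling that $z_S := \exp(-(1 - \lambda/\lambda^*)\, t_S)$ for all $S \in \mathcal{S}$ and $t_S \geq 0$, this concludes the proof.

Part 1: Show that for all $k = 0, \dots, K - 1$ and $t_S \geq 0$, $f_k(z)/f_K(z) \to 0$ as $\lambda \uparrow \lambda^*$. Due to Lemma 4.1, we know that both $(1 - \lambda/\lambda^*)^k f_k(z)$ and $(1 - \lambda/\lambda^*)^K f_K(z)$ converge to finite limits, with the latter limit strictly positive. Hence,
$$\frac{f_k(z)}{f_K(z)} = \Big(1 - \frac{\lambda}{\lambda^*}\Big)^{K-k}\, \frac{(1 - \lambda/\lambda^*)^k f_k(z)}{(1 - \lambda/\lambda^*)^K f_K(z)} \to 0.$$

Part 2: Show that for all $z_S := \exp(-(1 - \lambda/\lambda^*)\, t_S)$ with $t_S \geq 0$, $f_K(z)/f_K(1)$ tends to (8). We will again focus on the function $(1 - \lambda/\lambda^*)^K f_K(z)$, which by Lemma 4.1 converges to
$$\sum_{\boldsymbol{T} \in \mathcal{N}_K} \beta(\boldsymbol{T}) \prod_{i \in CR(\boldsymbol{T})} \Big(1 + \sum_{j=1}^{i} \frac{t_{T_j}\, p_{T_j}}{p(\boldsymbol{T}, i)}\Big)^{-1}.$$
Note that the latter product evaluates as 1 if $z = 1$, or alternatively if $t = 0$, such that $(1 - \lambda/\lambda^*)^K f_K(1) \to \beta(\mathcal{N}_K)$. Taking the ratio of the two limits and using $P^*(\boldsymbol{T}) = \beta(\boldsymbol{T})/\beta(\mathcal{N}_K)$ yields (8). This concludes the proof of Proposition 4.1.

Proof of Theorem 4.1. Considering (8), we notice that there is a weight $P^*(\boldsymbol{T})$ associated with each vector $\boldsymbol{T} \in \mathcal{N}_K$. Now, focusing on a fixed $\boldsymbol{T} \in \mathcal{N}_K$, we observe that
$$\prod_{k=1}^{K} \frac{1}{1 + x_k(\boldsymbol{T})} = \mathbb{E}\Big[\exp\Big(-\sum_{k=1}^{K} x_k(\boldsymbol{T})\, U_k\Big)\Big],$$
with $x_k(\boldsymbol{T}) := \sum_{j=1}^{i_k} t_{T_j}\, p_{T_j} / p(\boldsymbol{T}, i_k)$. Moreover, we note that
$$\sum_{k=1}^{K} x_k(\boldsymbol{T})\, U_k = \sum_{k=1}^{K} U_k \sum_{j=1}^{i_k} \frac{t_{T_j}\, p_{T_j}}{p(\boldsymbol{T}, i_k)} = \sum_{S \in \mathcal{S}} t_S \sum_{k \,:\, i_k \geq j_S(\boldsymbol{T})} \frac{p_S}{p(\boldsymbol{T}, i_k)}\, U_k,$$
where we altered the summation order in the last equality from $T \in \boldsymbol{T}$ to $S \in \mathcal{S}$ by keeping track of the position of job type $S$ in the $K$-critical vector $\boldsymbol{T}$ via $j_S(\boldsymbol{T})$. Note that the above expression is still valid if $S \notin \boldsymbol{T}$; in that case we can assume that $j_S(\boldsymbol{T}) = \infty$ and hence the corresponding inner sum evaluates to 0. The convergence result then follows from Lévy's Continuity Theorem [25].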
Remark 4. In case $K = 1$, the statement of Theorem 4.1 coincides with the results in [11, Theorem 2]. The present paper not only allows for a more general setting (i.e., $K \geq 1$), but it also significantly reduces the complexity of the proof by, first, enumerating the vectors of job types according to the number of critical subsets ($\boldsymbol{T} \in \mathcal{N}_k$) instead of the length ($\boldsymbol{T} \in \mathcal{S}^k$), and second, by multiplying the numerator and denominator with $(1 - \lambda/\lambda^*)^K$ (as in (16)) instead of extracting the possibly diverging components of the limiting PGF.
Proof of Corollary 4.1.Substitute t S = t for all S ∈ S in the expression obtained in Proposition 4.1, and observe that |CR(T )| = K for all T ∈ N K .The simplification to (1 + t) −K follows after straightforward manipulations.
We observe that this limiting PGF of the (scaled) total number of jobs coincides with the Laplace transform of a sum of $K$ independent exponential random variables with unit mean, or in other words the Laplace transform of a random variable with an Erlang distribution. Hence, Lévy's Continuity Theorem [25] implies that the non-negative random variable $(1 - \lambda/\lambda^*)\,Q$ converges in distribution to a random variable with an Erlang distribution with parameters 1 and $K$.

Proof of Proposition 4.2
In order to prove Proposition 4.2 we will use the fact that some of the components of the mixture distribution of (X S ) S∈S are the same and hence the corresponding mixture weights can be aggregated.This, together with further simplification, will result in an equivalent representation of the Laplace transform as in (13).
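Indeed, considering Example 4 it can be seen that the $K$-ordered vectors $\boldsymbol{T} = (\{1\}, \{3\}, \{3, 4\}, \{1, 2, 3\})$ and $\boldsymbol{T}' = (\{1\}, \{3, 4\}, \{3\}, \{1, 2, 3\})$ both induce the same mixture component of the heavy-traffic limit. This is due to the fact that $CR(\boldsymbol{T}) = CR(\boldsymbol{T}') = \{1, 3, 4\}$ and the only difference between $\boldsymbol{T}$ and $\boldsymbol{T}'$ is a permutation of the job types that belong to the same CRP component, i.e., $\mathcal{C}_2 = \{\{3\}, \{3, 4\}\}$. A similar observation can be made for the remaining two $K$-ordered vectors in $\mathcal{N}_K$. We first elaborate on the various mixture components and how the corresponding mixture weights can be aggregated.

Definition 9 ($\sigma$-ordered vectors). Let $\sigma \in \Sigma_K$ be a topological ordering, then we define a set of ordered vectors as follows,
$$\mathcal{T}_\sigma := \big\{\boldsymbol{T} = \big(\mathrm{perm}(\mathcal{C}_{\sigma(1)}), \dots, \mathrm{perm}(\mathcal{C}_{\sigma(K)})\big)\big\},$$
where $\mathrm{perm}(\mathcal{C}_k)$ denotes any permutation of the job types in the critical component $C_k$. We refer to the vectors in $\mathcal{T}_\sigma$ as $\sigma$-ordered vectors.

Example 6. In Example 5 there are two possible topological orderings of the CRP components, i.e., $\Sigma_K = \{(1, 2, 3), (2, 1, 3)\}$. With the CRP components defined as $C_1 = (\{\{1\}\}, \{1\})$, $C_2 = (\{\{3\}, \{3, 4\}\}, \{3, 4\})$ and $C_3 = (\{\{1, 2, 3\}\}, \{2\})$, this results in the following two sets of $\sigma$-ordered vectors,
$$\mathcal{T}_{(1,2,3)} = \big\{(\{1\}, \{3\}, \{3, 4\}, \{1, 2, 3\}),\ (\{1\}, \{3, 4\}, \{3\}, \{1, 2, 3\})\big\},$$
$$\mathcal{T}_{(2,1,3)} = \big\{(\{3\}, \{3, 4\}, \{1\}, \{1, 2, 3\}),\ (\{3, 4\}, \{3\}, \{1\}, \{1, 2, 3\})\big\}.$$

Next, we argue that the above construction for all $\sigma \in \Sigma_K$ will yield a partitioning of the set of $K$-critical vectors, $\mathcal{N}_K$.

Lemma 4.2. It holds that $\mathcal{N}_K = \bigcup_{\sigma \in \Sigma_K} \mathcal{T}_\sigma$, with $\mathcal{N}_K$ as in Definition 5 and $\mathcal{T}_\sigma$ as in Definition 9.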
Proof.Consider T ∈ T σ for some σ ∈ Σ K .We have to show that T induces K critical subsets when combining the respective job types one by one.Due to Construction 1, we know that are critical subsets of job types for all k.Since, by definition of K, there cannot be more than K induced critical subsets in any ordered vector in N , we conclude that T ∈ N K .Now assume that T ∈ N K with CR(T ) = {i 1 , . . ., i K }.Define T k := {T 1 , . . ., T i k } for k = 1, . . ., K and T 0 := ∅ to ease the notation.Since the critical components form a partitioning of the job types and all the critical subsets of job types T 1 , . . ., T K can be written as a union of the critical components, we have that (after possibly relabeling the critical components) We claim that (C 1 , . . ., C K ) is indeed a topological ordering and hence there exists a σ ∈ Σ K such that T ∈ T σ .If the critical component C i can forward some of its jobs to servers in critical component C j , then C j must occur before C i in the topological ordering.Assume by contradiction that C i is positioned before C j in the above-mentioned ordering.This implies that there is a job type S ∈ C i compatible with server n ∈ Z j .By construction, we have that Z k ⊆ ∪ T ∈C k T , and from [3, Lemma 3(i)] we know that N µp as server n is not contained in Z i (or Z 1 , . . ., Z i−1 ).This contradicts the criticality of the set T i .Thus, the above ordering is indeed a topological ordering.This concludes the proof.
Consider the result and notation of Theorem 4.1 and focus on any job type S ∈ S. Then there exists some k ′ = 0, 1, . . ., K such that i k ′ −1 < j S (T ) ≤ i k ′ for all T ∈ T σ .Moreover, note that p(T , i k ) = k j=1 p(C σ(j) ) for all T ∈ T σ .Thus, , which no longer depends on the actual ordering in T but only on σ ∈ Σ K .This concludes the proof.
As a consequence of the above two lemmas, if the topological ordering is unique, i.e., |Σ K | = 1 and hence the associated DAG is a line graph, then the representation of (X S ) S∈S in (10) can easily be rewritten to obtain the non-mixture distribution of (Y S ) S∈S in (12).In all other settings, this simplification is not so straightforward.
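Next we investigate the mixture weights. We will use the following identity in order to aggregate the mixture weights $(P^*(\boldsymbol{T}))_{\boldsymbol{T} \in \mathcal{N}_K}$ in (7) over all ordered vectors induced by the same topological ordering of the CRP components.

Lemma 4.4. Let $c_1, \dots, c_K$ be $K$ positive constants and $\Sigma_K$ be the set of all topological orderings of a DAG that satisfies Assumption 1, then
$$\sum_{\sigma \in \Sigma_K} \prod_{k=1}^{K} \Big(\sum_{j=1}^{k} c_{\sigma(j)}\Big)^{-1} = \prod_{k=1}^{K} \Big(\sum_{j \in V_k} c_j\Big)^{-1}.$$

Proof. The result is shown by induction on $K$, and is trivially true for $K = 1$. The induction step follows after fixing the last value of the permutation, i.e., $\sigma(K)$, and applying the result for lower values than $K$. Note that only a root node of the DAG can be positioned at the end of the permutation. Let $R_K$ denote the set of root nodes of the DAG and $\Sigma_K(k)$ denote the subset of $\Sigma_K$ where the last entry is given by $k$, with $k \in R_K$. Then,
$$\sum_{\sigma \in \Sigma_K} \prod_{k=1}^{K} \Big(\sum_{j=1}^{k} c_{\sigma(j)}\Big)^{-1} = \Big(\sum_{j=1}^{K} c_j\Big)^{-1} \sum_{l \in R_K}\ \sum_{\sigma \in \Sigma_K(l)} \prod_{k=1}^{K-1} \Big(\sum_{j=1}^{k} c_{\sigma(j)}\Big)^{-1}. \qquad (18)$$
Let $V_k^{(l)}$ denote the subgraph rooted at node $k$ in the DAG with root node $l$ removed from the DAG, $k \neq l$. We can then apply the induction hypothesis to conclude that the final expression in (18) is equal to
$$\Big(\sum_{j=1}^{K} c_j\Big)^{-1} \sum_{l \in R_K}\ \prod_{k \neq l} \Big(\sum_{j \in V_k^{(l)}} c_j\Big)^{-1}.$$
Notice that the collection of subgraphs in the DAG with root node $l$ removed, $\{V_k^{(l)} : k = 1, \dots, K,\ k \neq l\}$, is the same as the corresponding collection $\{V_k : k = 1, \dots, K,\ k \neq l\}$ in the original DAG, as $l$ is a root node of the original DAG. So we obtain
$$\Big(\sum_{j=1}^{K} c_j\Big)^{-1} \sum_{l \in R_K} \Big(\sum_{j \in V_l} c_j\Big) \prod_{k=1}^{K} \Big(\sum_{j \in V_k} c_j\Big)^{-1} = \prod_{k=1}^{K} \Big(\sum_{j \in V_k} c_j\Big)^{-1},$$
where we used that each node of the DAG is part of precisely one subgraph governed by the root nodes, so that $\sum_{l \in R_K} \sum_{j \in V_l} c_j = \sum_{j=1}^{K} c_j$. This concludes the proof.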
Remark 5. The identity in Lemma 4.4 is a generalization of the identity
$$\sum_{\sigma \in \mathrm{perm}(K)} \prod_{k=1}^{K} \Big(\sum_{j=1}^{k} c_{\sigma(j)}\Big)^{-1} = \prod_{k=1}^{K} \frac{1}{c_k},$$
where $\mathrm{perm}(K)$ denotes the set of all permutations of the integers $1, \dots, K$. Alternatively, one can think of a DAG which solely consists of isolated vertices, where $\Sigma_K$ includes all permutations. Such a DAG can occur when analyzing a set of $K$ isolated single-server queues.
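An identity of this type can be verified exactly on small instances with rational arithmetic. The sketch below assumes that the identity equates the sum over topological orderings of $\prod_k (c_{\sigma(1)} + \dots + c_{\sigma(k)})^{-1}$ with $\prod_k (\sum_{j \in V_k} c_j)^{-1}$, where $V_k$ collects node $k$ and all nodes reachable from it. The forest-shaped DAG, the constants and the function names are hypothetical choices made for illustration; arcs follow the convention of Definition 7.

```python
from fractions import Fraction
from itertools import permutations

def rooted_subgraph(node, arcs):
    """Nodes of the subgraph rooted at `node`: the node itself and
    every node reachable from it along the arcs."""
    seen, stack = {node}, [node]
    while stack:
        current = stack.pop()
        for (i, j) in arcs:
            if i == current and j not in seen:
                seen.add(j)
                stack.append(j)
    return seen

def sum_over_orderings(c, arcs):
    """Sum over topological orderings sigma of
    prod_k 1 / (c_sigma(1) + ... + c_sigma(k))."""
    total = Fraction(0)
    for sigma in permutations(c):
        position = {node: idx for idx, node in enumerate(sigma)}
        # An arc (i, j) requires node j to appear before node i.
        if not all(position[j] < position[i] for (i, j) in arcs):
            continue
        term, partial = Fraction(1), Fraction(0)
        for node in sigma:
            partial += c[node]
            term /= partial
        total += term
    return total

def product_over_subgraphs(c, arcs):
    """prod_k 1 / (sum of c_j over the subgraph rooted at node k)."""
    result = Fraction(1)
    for node in c:
        result /= sum(c[j] for j in rooted_subgraph(node, arcs))
    return result

c = {1: Fraction(2), 2: Fraction(3), 3: Fraction(5), 4: Fraction(7)}
forest = [(3, 1), (3, 2)]  # node 3 is a root above leaves 1 and 2; node 4 is isolated

print(sum_over_orderings(c, forest) == product_over_subgraphs(c, forest))
print(sum_over_orderings(c, []) == product_over_subgraphs(c, []))
```

With no arcs at all, every permutation is a topological ordering and the right-hand side reduces to $\prod_k 1/c_k$, which is the special case of isolated vertices.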
Using the above lemma, we can aggregate the mixture weights.
Proposition 4.3.Let σ ∈ Σ K with Σ K the set of all topological orderings of a DAG that satisfies Assumption 1 and define Then, Proof of Proposition 4.3.We first focus on a particular T ∈ T σ for some σ ∈ Σ K , and emphasize how P * (T ) depends on CR(T ) = {i 1 , . . ., i K } and σ rather than the actual ordering of the job types within each critical component.Say that T = C σ(1) , . . ., C σ(K) with C σ(k) some ordering of the set C σ(k) for all k.So, , where p σ,k,i denotes the arrival fraction of the ith job type in C σ(k) , and α an appropriate normalization constant, i.e., α = β(N K ) −1 .Notice that for each T ∈ N K the same job types will occur and that all these job types contribute with factor N λ * p S for those job types S, irrespective of σ such that T ∈ T σ .Hence, we can update the normalization constant to α ′ and write Using the fact that {C σ(1) , . . ., C σ(k) } are critical subsets for all k and hence that for an updated normalization constant α ′′ .Notice that the first part only depends on the chosen permutation σ of the sets of job types of the critical components, C σ(1) , . . ., C σ(K) and not on the order in which these types occur within T , i.e., C σ(1) , . . ., C σ(K) .We now rewrite the denominator of the second part of the expression.The main idea to simplify this expression in the denominator is by neglecting those components that experience criticality.Before doing so we make the following observations: 1.Each CRP component C = (C, Z) which corresponds to a leaf node in the DAG gives rise to a critical subset of job types.Indeed, this C can be positioned first in a topological ordering such that by Construction 1 C is recovered as a critical set.
2. Let C k = (C k , Z k ) be a node in the DAG and V k be the subgraph of the DAG rooted at C k .The collection of job types in V k gives rise to a critical subset of job types, since we can generate a topological ordering with precisely those nodes associated with V k first.Moreover, C k will be positioned last (with respect to the other CRP components induced by V k ) in this topological ordering since it is the root of the subgraph.Again, V k then induces a critical subset of job types by Construction 1.Let V σ(k) denote the nodes of the subgraph rooted at C σ(k) .Given the above observations we can partition the job types as

Let (C
It immediately follows that Moreover, due to the structure of the topological ordering (and the observations above), we have that which only depends on the job types in the rooted subgraph starting at node C σ(k) , i.e., V σ(k) , and the ordering of the job types in just this node, i.e., C σ(k) .To slightly ease the notation, let us refer to the above notation as Combining the above yields For the last step we used that the second part of the expression no longer depends on σ and is in fact the same for all σ ∈ Σ K .The normalization constant of the above expression for all σ ∈ Σ K can be rewritten into the desired format as depicted in (19) using the identity in Lemma 4.4.
We have now established all auxiliary results to prove Proposition 4.2.
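Proof of Proposition 4.2. Using the above results, we can show that the Laplace transforms of $(Y_S)_{S \in \mathcal{S}}$ and $(X_S)_{S \in \mathcal{S}}$ coincide. We consider $(X_S)_{S \in \mathcal{S}}$ and condition on $\sigma \in \Sigma_K$ to aggregate the various mixture components (Lemmas 4.2 and 4.3),
$$\mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S X_S\Big)\Big] = \sum_{\sigma \in \Sigma_K} P^*(\mathcal{T}_\sigma)\ \mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S X_S\Big) \,\Big|\, \boldsymbol{T} \in \mathcal{T}_\sigma\Big],$$
for all $t_S \geq 0$. Next, we rely on (10) to obtain
$$\mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S X_S\Big) \,\Big|\, \boldsymbol{T} \in \mathcal{T}_\sigma\Big] = \mathbb{E}\Big[\exp\Big(-\sum_{k=1}^{K} U_k\, \frac{\sum_{j=1}^{k} \sum_{S \in \mathcal{C}_{\sigma(j)}} t_S\, p_S}{\sum_{j=1}^{k} p(\mathcal{C}_{\sigma(j)})}\Big)\Big].$$

Since $U_1, \dots, U_K$ are independent and exponentially distributed with unit mean, we can write
$$\mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S X_S\Big) \,\Big|\, \boldsymbol{T} \in \mathcal{T}_\sigma\Big] = \prod_{k=1}^{K} \Big(1 + \frac{\sum_{j=1}^{k} \sum_{S \in \mathcal{C}_{\sigma(j)}} t_S\, p_S}{\sum_{j=1}^{k} p(\mathcal{C}_{\sigma(j)})}\Big)^{-1}.$$
Using the expression in Proposition 4.3 for $P^*(\mathcal{T}_\sigma)$, we obtain
$$\mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S X_S\Big)\Big] = \frac{1}{\beta(\Sigma_K)} \sum_{\sigma \in \Sigma_K} \prod_{k=1}^{K} \Big(\sum_{j=1}^{k} \Big[p(\mathcal{C}_{\sigma(j)}) + \sum_{S \in \mathcal{C}_{\sigma(j)}} t_S\, p_S\Big]\Big)^{-1}.$$
Using the identity in Lemma 4.4 with $c_k = p(\mathcal{C}_k) + \sum_{S \in \mathcal{C}_k} t_S\, p_S$ for all $k$ and substituting the expression for $\beta(\Sigma_K)$ in (19), results in
$$\mathbb{E}\Big[\exp\Big(-\sum_{S \in \mathcal{S}} t_S X_S\Big)\Big] = \prod_{k=1}^{K} \frac{p(V_k)}{p(V_k) + \sum_{S \in V_k} t_S\, p_S} = \prod_{k=1}^{K} \Big(1 + \sum_{S \in V_k} \frac{t_S\, p_S}{p(V_k)}\Big)^{-1}.$$
This concludes the proof.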

Outlook
Our methods and results suggest two natural topics for further research.
First of all, it would be interesting to apply the above framework to slightly different models that also have product-form stationary distributions.One can think of, for instance, order-independent queues [32] or redundancy policies operating in overloaded systems with abandonments [17].Although the product-form expressions seem fairly similar at first glance, they have a different way to describe the total rate at which jobs leave the system.This complicates the derivation of the PGF as aggregation of the states according to the k-critical vectors (Definition 5) will no longer result in a closed-form expression.A similar analysis has also proven to be successful for related models without the queueing feature, e.g., matching models, which still maintain a product-form stationary distribution under some form of a CRP condition [12].It would be interesting to explore whether these conditions could be relaxed.
From a broader perspective, it is worth observing that the stability conditions in Condition 1 are not only necessary but also sufficient for a far wider range of routing and scheduling policies [15]. Consequently, all these 'maximally stable' or 'throughput-optimal' policies have the same (number of) critically loaded subsystems as redundancy policies, and might therefore potentially also exhibit quite similar heavy-traffic behavior. This seems especially plausible for the celebrated Join-the-Shortest-Queue (JSQ) policy since it is similar in spirit to the Join-the-Smallest-Workload (JSW) policy, which in turn is equivalent to the redundancy c.o.s. policy as noted earlier. Indeed, similarities between JSQ and JSW in terms of process-level limits and (full) state space collapse results have been observed in CRP scenarios [5, 6]. It would be interesting to explore whether this extends to non-CRP scenarios, and whether some asymptotic equivalence property for a wider class of policies might hold. In this regard it is worth recalling the close resemblance with the limiting queue length distribution for the MaxWeight scheduling algorithm in the special case of the N-model with equal service rates. While it is difficult to extrapolate from such a highly special case, the striking commonality suggests that the independent exponentially distributed random variables associated with the critically loaded subsystems may arise more universally. We conjecture, however, that the specific form of the cone and the relative proportions of the various job types are in general policy-dependent.
Note that µ(T σ,k ) refers to the aggregate service rate of all servers compatible with job types T σ,k .
By definition of the topological ordering, none of the compatible servers of the job types T σ,k can be outside of Z σ,k .Otherwise there exists an arc from one of the components in {C σ(1) , . . ., C σ(k) } to one of the components in {C σ(k+1) , . . ., C σ(K) }, which implies that the latter component must be positioned earlier in the topological ordering.This yields a contradiction.Hence the compatible servers of T σ,k form a subset of Z σ,k , so µ(T σ,k ) ≤ µ(Z σ,k ).
Since the CRP components form a partition of the job types and servers, we can write From [3, Lemma 3(i)] we know that p C σ(k) = µ Z σ(k) /(N µ).Hence, Combining the above observations results in where the first inequality is due to the stability conditions in Condition 1.This yields ( 21), and we can conclude that T σ,k is indeed a critical subset of job types.
Proof of Lemma A.2. Assume that $\mathcal{T}$ is a critical subset of job types according to Definition 2. We will argue that $\mathcal{T}$ can always be recovered using Construction 1. We first introduce some notation. Let $X := \bigcup_{S \in \mathcal{T}} S$ denote the set of all compatible servers of job types in $\mathcal{T}$. Note that earlier we used $\mathcal{T}$ to denote both the set of job types and the compatible servers (as each job type is defined by its compatible servers), and we would like to emphasize here the difference between the two. Recall that the compatibility relations between the job types and servers can be represented by a matching $M$ with $m_{S,n} = 1$ whenever job type $S$ is compatible with server $n$ (or $n \in S$), and 0 otherwise. For any subset of servers $Y \subseteq \{1, \dots, N\}$, the job types that can uniquely be served by them are denoted by $U(Y)$. Formally,
$$U(Y) := \Big\{S \in \mathcal{S} : \sum_{n \notin Y} m_{S,n} = 0\Big\}.$$
In the remainder of the proof we will use the following two observations.

Observation 1: $\mathcal{T} \subseteq U(X)$. Let $S \in \mathcal{T}$ and assume by contradiction that $S \notin U(X)$. So $\sum_{n \notin X} m_{S,n} > 0$, and hence there is a server $\tilde{n} \in \{1, \dots, N\} \setminus X$ such that $m_{S,\tilde{n}} = 1$. However, as $m_{S,\tilde{n}} = 1$, it follows by construction of $X$ that $\tilde{n} \in X$. This yields a contradiction.

Observation 2: $U_M(Y) \subseteq U_{\tilde{M}}(Y)$ for any $Y \subseteq \{1, \dots, N\}$, where the subscript indicates the matching that is used, and $M$ and $\tilde{M}$ represent the original and residual matching, respectively. This follows directly from the fact that $m_{S,n} \geq \tilde{m}_{S,n}$ for all job types $S$ and servers $n$.
We focus on the set of servers $X$, and distinguish three cases that each will be investigated separately:
1. For some $\sigma \in \Sigma_K$ and $k \in \{1, \dots, K\}$, it holds that $X = Z_{\sigma,k}$;
2. The set $X$ can be written as a union of (a subset of) the server sets of the CRP components, i.e., $\{Z_k : k = 1, \dots, K\}$. However, there exists no permutation $\sigma \in \Sigma_K$ such that $X = Z_{\sigma,k}$ for some $k$;
3. The set $X$ cannot be written as a union of server sets of CRP components.
We now show that in the first case the constructed critical set of job types indeed equals T, and that the remaining two cases lead to a contradiction and hence cannot occur.

Case 1: We aim to show that T = T_{σ,k}. From Lemma A.1 we know that T_{σ,k} is a critical subset of job types and that p(T_{σ,k}) = µ(Z_{σ,k})/(Nµ) = µ(X)/(Nµ). By assumption, T is a critical subset of job types and hence p(T) = µ(X)/(Nµ). It then follows that p(T) = p(T_{σ,k}).
After combining the above two results, we can indeed conclude that T = T σ,k .Case 2: Assume, after possibly relabeling the CRP components, that X = ∪ k i=1 Z i and that there exists no permutation σ or topological ordering such that X = Z σ,k .This implies the existence of a CRP component C = ( C, Z) such that there is an arc (C i , C) in the DAG for some i ∈ {1, . . ., k}, and hence C would have to be positioned before C i in any valid topological order.Hence there must exist a job type S ∈ C i and a server n ∈ Z such that m S,n > 0. So, S / ∈ U M (X), then From Observation 1, we have that T ⊆ U M (X) and so As T is a critical subset of job types, µ(X) − N λp(T ) → 0 as λ ↑ µ.From [3, Lemma 3(i)] we know that also µ(Z i ) − N λp(C i ) → 0 as λ ↑ µ.Hence it follows that p S = 0, which yields a contradiction.We conclude that Case 2 can not occur.Case 3: Assume that X can not be written as a union of server sets of CRP components.From Observation 2 we deduce that Since the CRP components are disconnected under M , we also have that Since X is not equal to the union of server sets of CRP components, there is a k such Combining the above results with Observations 1 and 2 yields This contradicts the fact that T is a critical subset of job types, and we conclude that Case 3 cannot occur.This concludes the proof of Lemma A.2.

B Main results -additional comments
B.1 Comment on the used notation in Subsection 4.3
In the notation in the proofs of Subsection 4.3 it is assumed that all job types belong to (at least) one of the critical subsets and hence the whole system experiences criticality, so that λ* = µ. However, if there had been job types that do not experience criticality, these job types could occur in the K-critical vectors. In particular, these vectors would be of the form T = C_{σ(1)}, . . ., C_{σ(K)}, N_C, for some σ ∈ Σ_K, where C_{σ(k)} denotes a permutation of the job types in the CRP component C_{σ(k)} for all k, and N_C a permutation of (a subset of) the job types that do not experience criticality. Notice that these non-critical job types must always occur at the end of the critical vector, otherwise the K-criticality of the vector would be violated. Moreover, for any topological ordering σ ∈ Σ_K, the same collection of permutations of non-critical subsets would occur. This would yield an adaptation of the definition of the σ-critical vectors (Definition 9), and a few additional steps in the proof of Proposition 4.3. In particular, in the last step (see (20)) one needs to use the above observation that the non-critical job types contribute equally to each P*(T_σ), such that the normalization constant α′′′ can be adapted and the same expression as in Proposition 4.3 would be obtained. This small technicality would have made the notation in the proofs of the above intermediate results even more cumbersome. Moreover, we would like to emphasize that the main results still hold if not all job types experience criticality, even using the notation as in Subsection 4.3. The job types that do not experience criticality will not occur in the DAG either, so that eventually their (scaled) number of jobs will vanish in a heavy-traffic regime.

B.2 Comments on Assumption 1
In this subsection we provide some examples of systems that satisfy the root partitioning assumption as formulated in Assumption 1.We emphasize that the list of examples is not exhaustive, and only serves to illustrate the wide range of models for which Assumption 1 holds.
All systems for which the model parameters satisfy the CRP condition yield a DAG that consists of a single node (Condition 2) for which Assumption 1 trivially holds.
Assumption 1 also holds for systems with a unique topological ordering. By contradiction, it can be argued that if a node belongs to the subgraphs induced by two root nodes r_1 and r_2, then there exists a topological ordering where r_1 is positioned before r_2 and vice versa. Alternatively, we can observe that all critical vectors yield the same expression for the corresponding mixture components when there is a single topological ordering. Hence, (X_S)_{S∈S} and (Y_S)_{S∈S} have the same distribution, and the statement of Theorem 4.2 follows directly from Theorem 4.1. (A more formal argument can be found in the proof of Lemma 4.3.)

Nested systems also satisfy Assumption 1. These are systems where the compatibility graph satisfies the following property: let S_1 and S_2 be two distinct job types; if they have a non-empty intersection of compatible servers, i.e., S_1 ∩ S_2 ≠ ∅, then one of them must be contained in the other, i.e., S_1 ⊊ S_2 or S_2 ⊊ S_1 [16,17]. By contradiction, it can again be argued that if a node belongs to the subgraphs induced by two root nodes r_1 and r_2, then there exists a node v where the subgraphs meet for the first time and nodes v_1 and v_2 on the path from r_1 and r_2 respectively to node v such that there is one job type in node v_1 and one job type in v_2 for which the nested property is violated.
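The nested property is purely combinatorial, so it can be checked directly from the list of job types. The following is a minimal sketch of such a check; the function name `is_nested` and the two example systems are illustrative choices and not part of the paper. The second example uses the three-server job types {{1, 2}, {2, 3}, {2}} considered in this subsection.

```python
from itertools import combinations


def is_nested(job_types):
    """Return True if every pair of distinct job types is either disjoint
    or strictly contained one in the other (the nested property)."""
    types = [frozenset(s) for s in job_types]
    for a, b in combinations(types, 2):
        if a == b:
            continue  # the property only concerns distinct job types
        overlapping = bool(a & b)
        strictly_nested = a < b or b < a  # proper subset in either direction
        if overlapping and not strictly_nested:
            return False
    return True


# Hypothetical nested system: {1} is contained in {1, 2}, and {3} is disjoint.
nested_example = [{1}, {1, 2}, {3}]
# Three-server system: {1, 2} and {2, 3} overlap in server 2 without nesting.
three_server_example = [{1, 2}, {2, 3}, {2}]

print(is_nested(nested_example))
print(is_nested(three_server_example))
```

The check only covers the nested property itself. Whether Assumption 1 holds for a non-nested system still depends on the model parameters, as discussed in the text.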
As the system in Example 5 (Figure 2) illustrates, Assumption 1 can be satisfied for systems that are neither nested, nor have a unique topological ordering.In particular, for arbitrarily structured compatibility graphs, whether or not Assumption 1 is satisfied depends on the delicate interplay between the compatibility constraints and the model parameters.To illustrate this, consider a system with three servers, each operating at speed µ and define the job types as S = {{1, 2}, {2, 3}, {2}}.
Hence, even if Assumption 1 is not satisfied for a particular set of model parameters for a given system, it might be for other values of model parameters.

C Main results -generalization
To derive the main result in Theorem 4.2, we assumed that (λ_S)_{S∈S} = Nλ(p_S)_{S∈S}, and in particular that the fractions of jobs of each type remain fixed as the system approaches its heavy-traffic limit, i.e., when λ ↑ λ*. Similar to the heavy-traffic analysis conducted by Hillas et al. [21], we can relax this assumption by selecting any positive constants (γ_S)_{S∈S} and set for all S ∈ S and 0 ≤ ϵ ≤ ϵ+ for some ϵ+ small enough such that all the arrival rates are positive. The heavy-traffic behavior of the system can then be studied by taking ϵ ↓ 0. To ease the notation, we will omit the ϵ dependence of λ_S.
Notice that this definition coincides with the setting in the previous subsections when we set γ_S = Nλ*p_S, where ϵ = 1 − λ/λ*. In terms of the stability region, this allows us to approach any point on the boundary from any interior point, while the previous set-up only allowed the boundary to be approached in a direction starting from the origin.
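To make the coincidence with the earlier setting concrete, the sketch below assumes a linear parametrization of the form λ_S(ϵ) = Nλ*p_S − ϵγ_S. This form is an assumption chosen to be consistent with the properties stated above (positive rates for small ϵ, and agreement with the fixed-fraction setting when γ_S = Nλ*p_S and ϵ = 1 − λ/λ*); the exact expression in (22) should be taken from the paper. All numerical values and names are hypothetical.

```python
def arrival_rates(critical_rates, gamma, eps):
    """Assumed parametrization: lambda_S(eps) = N*lambda^* p_S - eps * gamma_S."""
    return {s: critical_rates[s] - eps * gamma[s] for s in critical_rates}


def epsilon_upper_bound(critical_rates, gamma):
    """All arrival rates stay positive for eps strictly below this bound."""
    return min(critical_rates[s] / gamma[s] for s in critical_rates)


# Hypothetical two-server example with critical arrival rate lambda^* = 1.
N, lam_star = 2, 1.0
p = {"{1,2}": 0.4, "{2}": 0.6}
critical = {s: N * lam_star * p[s] for s in p}

# Choosing gamma_S = N * lambda^* * p_S recovers the fixed-fraction setting.
gamma = dict(critical)
lam = 0.9
eps = 1 - lam / lam_star

general = arrival_rates(critical, gamma, eps)
fixed_fraction = {s: N * lam * p[s] for s in p}
print(general, fixed_fraction)
```

Other choices of (γ_S) change the direction from which the boundary point is approached while leaving the boundary point itself unchanged.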
Using this general direction of convergence of the arrival rate vector, we can prove a generalization of Theorem 4.2. Similarly to (4), we first define $\gamma(T, j) := \sum_{i=1}^{j} \gamma_{T_i}$ for any ordered vector of job types T and j = 1, . . ., |T|. Moreover, let γ(V_k) represent the sum of those elements of (γ_S)_{S∈S} such that the corresponding job types occur in the nodes of the subgraph rooted at critical component C_k, denoted by V_k. Then, we define a slightly altered version of the (deterministic) coefficients in (11), i.e., γk,S := for all k = 1, . . ., K and S ∈ S.
Theorem C.1 (Main result -generalization).If Assumption 1 holds, then the vector of (scaled ) queue lengths, ϵ(Q S ) S∈S , converges in distribution to (Y S ) S∈S , i.e., γk,S as defined in (23) and U 1 , . . ., U K independent and exponentially distributed random variables with unit mean.The result holds for both the c.o.c. and c.o.s.mechanism.
If the CRP condition (Condition 2) is satisfied, then the associated DAG consists of a single node with all critical job types.Denote this subset of critical job types by T * , then the scaled numbers of jobs of the various types converge to as ϵ ↓ 0 with U a unit mean exponential random variable.From this we can deduce that, as long as the total speed γ(T * ) at which a point of the boundary of the stability region is approached is unaltered, the direction from which this point is approached plays no role when the CRP condition is satisfied.This observation is no longer true if the CRP condition is not satisfied, as illustrated in the following example.
Example 9. Consider the N-model with the heavy-traffic trajectories as in Example 8. Applying the above theorem yields We observe that the number of type-{2} jobs is larger when δ > 0 while the number of type-{1, 2} jobs remains the same compared to a setting with δ = 0.
We now consider a slightly larger example to illustrate the impact of (γ S ) S∈S .
(a) Sketch of the system.

For larger examples, with more complicated DAG structures, it becomes less obvious how the choice of the parameters (γ_S)_{S∈S} influences the limiting representation of the vector of queue lengths compared to the settings discussed in Section 4.
The proof of Theorem C.1 for the c.o.c.mechanism follows a similar two-step framework as the proof of Theorem 4.2.Therefore we will only give a broad outline of the proof and mention some of the differences.The proof outline for the c.o.s.mechanism is provided in Appendix D.
First, a convergence result similar to Proposition 4.1 must be established in terms of the PGF of the numbers of jobs of the various types.We define and ω(N K ) := for any T ∈ N K .
Theorem C.2 (Convergence of PGFs -generalization).With the definitions as in Section 3 and the (prelimit) arrival rates as in (22), the joint PGF of the (scaled ) numbers of jobs of the various types, i.e., with z S = exp (−ϵt S ) converges to as ϵ ↓ 0 with t S ≥ 0 for all S ∈ S.
The proof is similar to the proof of Proposition 4.1, where we now use ϵ instead of 1 − λ/λ* and the fact that γ(T, i) no longer has to be equal to µ(T, i) = Nλ*p(T, i) for i ∈ CR(T).
Second, from the above theorem we can deduce that (X S ) S∈S follows a mixture distribution where the mixture weights are determined by ω(T )/ω(N K ) for T ∈ N K and the mixture components consist of linear combinations of independent exponentially distributed random variables.
Define the indices i_1, . . ., i_K such that {i_1, . . ., i_K} = CR(T) with i_1 < i_2 < ··· < i_K and j_S(T) the position of S in T such that T_{j_S(T)} = S for any S ∈ S and T ∈ N_K. Let (I_T)_{T∈N_K} = e(T) with probability ω(T)/ω(N_K) for any T ∈ N_K and define e(T) as a |N_K|-dimensional unit vector with a one entry at the location corresponding to the ordered vector T, T ∈ N_K. Then, the (random) coefficients are given by Γ_{k,S} := for any S ∈ S and k = 1, . . ., K, such that with U_1, . . ., U_K independent and exponentially distributed random variables with unit mean.
Proposition C.1 (Convergence to a mixture distribution - generalization). The (scaled) vector of queue lengths, ϵ(Q_S)_{S∈S}, converges in distribution to a random vector associated with the mixture distribution as specified above, i.e., ϵ

We emphasize that, even though the (pre-limit) arrival rates change, the critical arrival rate to the system, λ*, will remain the same as in the previous sections. Hence, the sets of critical job types, CR(S), the CRP components and the associated DAG will also be the same, even when (γ_S)_{S∈S} differs from (Nλ*p_S)_{S∈S}.
After establishing the mixture distribution, we show that it (and its Laplace transform) can be rewritten or simplified, as formalized in the following proposition.
Proposition C.2.For t S ≥ 0 for all S ∈ S and under Assumption 1, it holds that with the former Laplace transform as in (25) and such that Lemmas 4.2 and 4.3 are still applicable without any alterations and will be the building blocks of the proof.However, the expression for the aggregation of the mixture weights according to the σ-ordered vectors, T σ , does depend on (γ S ) S∈S .Indeed, for any σ ∈ Σ K , which serves as an alternative result for Proposition 4.3.This, together with the proof outline of Proposition 4.2, is sufficient to establish (26).

D Main results -redundancy c.o.s.
Since the result in Theorem 4.2 is a special case of the result in Theorem C.1, we will only focus on the latter to show its validity in case of the c.o.s.mechanism.The proof follows the same outline as the proof for the c.o.c.mechanism.The main difference is the limiting form of the PGF of the numbers of waiting jobs of the various types, which will be provided below for completeness.
Let u be an ordered vector of a subset of the servers which represents the idle servers and the order in which they became idle, then The set of all ordered idle servers with no compatibilities among the job types in T is denoted by E(T ).
Theorem D.1 (Convergence of PGFs - c.o.s., generalization). With the definitions as in Section 3 and the (pre-limit) arrival rates as in (22), the joint PGF of the (scaled) numbers of waiting jobs of the various types, i.e., $E\big[\prod_{S \in \mathcal{S}} z_S^{Q_S}\big]$ with z_S = exp(−ϵt_S), converges to as λ ↑ µ with t_S ≥ 0 for all S ∈ S, where E(T) represents the collection of vectors of servers which are not compatible with any of the job types in T. We define ω(T) and α(u) as in (24) and (27), respectively, and

The proof uses similar arguments as the proof of Proposition 4.1.

E.1 Pre-limit characterization of the system
Before turning our attention to the heavy-traffic limit, we will investigate in this section the pre-limit stationary behavior of the system.Specifically, we will derive a characterization of the total number of jobs of the various types in terms of weighted sums of geometrically distributed random variables whose parameters depend on the model parameters and the order in which the different job types occur.It is worth emphasizing that simple, closed-form expressions for the stationary distribution at the level of the number of jobs (of the various types) still seem out of reach, despite this characterization.
We introduce the following two concepts to keep track of the number of jobs (of the various types) in the system given an ordered vector T .Definition 10 (Jobs and segments).Given an ordered vector of job types T , define Q j | T as the number of jobs between the first occurrences of a type-T j and a type-T j+1 job, for j = 1, . .
where the type labels are ordered such that the first label corresponds to the oldest job in the system, etc.There are four segments of which the second is of size zero and the other three are denoted by R 1 , R 3 and R 4 .So the numbers of jobs in the various segments are given by Q (1, 0, 5, 2).Considering each segment separately, the numbers of jobs of the various types are given by Theorem E.1.For a given T ∈ N , the following (pre-limit) characterizations hold.
d. The numbers of jobs in different segments are independent.
e. The numbers of jobs of the various types in a particular segment given T and the total number of jobs in that segment, i.e., (Q^j_{T_i} | Q^j, T)_{i=1,...,j}, follow a multinomial distribution with parameters Q^j and (p_{T_i}/p(T, j))_{i=1,...,j} for all j = 1, . . ., |T|.
Note that p T ,j , p T ,j,i ∈ (0, 1) for all i = 1, . . ., j, j = 1 . . ., |T | and T ∈ N due to the stability conditions (Condition 1).The proof of Theorem E.1 relies on the detailed stationary distribution provided in [18] and a sequential aggregation of the states.
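The multinomial statement in part e can be illustrated with a small simulation. The sketch below is not taken from the paper: the function names, the seed and the probabilities p_{T_i} are hypothetical, and p(T, j) is taken to be the partial sum of p_{T_1}, . . ., p_{T_j}. Given the size of the jth segment, each job in that segment is independently assigned type T_i with probability p_{T_i}/p(T, j).

```python
import random


def type_fractions(p_ordered, j):
    """Fractions p_{T_i} / p(T, j) for i = 1..j, with p(T, j) the partial sum."""
    partial_sum = sum(p_ordered[:j])
    return [p_i / partial_sum for p_i in p_ordered[:j]]


def split_segment(p_ordered, j, segment_size, rng):
    """Sample the per-type job counts in segment j given its total size."""
    fractions = type_fractions(p_ordered, j)
    counts = [0] * j
    # Each of the segment_size jobs independently picks one of the first j types.
    for index in rng.choices(range(j), weights=fractions, k=segment_size):
        counts[index] += 1
    return counts


# Hypothetical probabilities for an ordered vector T = [T_1, T_2, T_3, T_4].
p_ordered = [0.2, 0.1, 0.3, 0.4]
rng = random.Random(2024)
counts_segment_3 = split_segment(p_ordered, j=3, segment_size=5, rng=rng)
print(type_fractions(p_ordered, 3), counts_segment_3)
```

Only the first j types can appear in segment j, which is why the count vector has length j and the entries always add up to the segment size.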
Proof of Theorem E.1.To prove the statements in Theorem E.1 we start from the detailed stationary distribution provided in [18] and sequentially aggregate the states.The product-form expression is given by with c = (c 1 , . . ., c n ) representing the state of the system.Note that for a given ordered vector of job types T , a state c satisfying this order can be rewritten as where R j = (R j i , . . ., R j n j ) with R j n ∈ {T 1 , . . ., T j } for all n ≤ n j , n j ∈ N and j ≤ |T |.Hence, we can also write where we subdivide the factors of each job type in (29) into different categories, where either the corresponding job type label occurs for the first time or the corresponding job is part of the jth segment for some j.Consider a (non-empty) subset J of the indices {1, . . ., |T |}.Let n j ∈ N j for all j ∈ J .To obtain an expression for the joint stationary probability that , we have to aggregate over all states with R j such that i for all i = 1, . . ., j and j ∈ J and R j can be anything for all j / ∈ J , which is a shorthand notation for the indices j ∈ {1, . . ., |T |} \ J .Moreover, we define N λp(T ,j) µ(T ,j) Let n j ∈ N for j ∈ J .In order to obtain the (joint) stationary probability that Q j | T j∈J = n j j∈J , we aggregate the above states further.In particular, we sum over all vectors n j such that |n Next, we add all the above states for all n j ∈ N for all j ∈ J to obtain the stationary probability that the system state complies with the ordered vector T , i.e., This proves part c of Theorem E.1.Using this part and the computations in (31) we obtain N λp(T ,j) µ(T ,j) n j = j∈J 1 − p T ,j p T ,j n j .
We observe that the latter expression corresponds to the joint probability distribution of |J | independent geometric random variables with parameters (p T ,j ) j∈J .This concludes the proofs of parts b and d.N λp Ti µ(T , j) It is worth emphasizing that the numbers of jobs of the various types within one segment are not independent as can be deduced from the above expression.In order to obtain P(Q j Ti = n j i | T ) for some 1 ≤ i ≤ j ≤ |T |, we sum over all vectors (n j ĩ ) ĩ=1,...,j with n j i fixed.Hence, which concludes the proof of part a. From ( 32) and ( 33 Moreover, we have, for instance, that Q 2 | T is geometrically distributed with parameter p T ,2 = N λ(p T1 + p T2 )/(2µ) = 5λ 6µ , and Q 3 T1 | T is geometrically distributed with parameter Observe that Theorem E.1 allows us to count the number of jobs of a fixed type S ∈ S given the ordered vector of job types T .Define j S (T ) as the position of S in T if S ∈ T and as ∞ otherwise, then We now combine all of the above results to describe the distribution of the numbers of jobs of the various types (Q S ) S∈S , irrespective of the ordered vector of job types.
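As a numerical illustration of the geometric segment sizes, the sketch below evaluates the parameter p_{T,j} = Nλp(T, j)/µ(T, j) and the probability mass function (1 − p_{T,j}) p_{T,j}^n on n = 0, 1, 2, . . ., as read from the product form above. The helper names are hypothetical. The input values mirror the example in which the parameter equals 5λ/(6µ), using the assumed values µ = 1 and λ = 0.9.

```python
def geometric_parameter(aggregate_arrival_rate, aggregate_service_rate):
    """p_{T,j} = N*lambda*p(T,j) / mu(T,j); must lie in (0, 1) for stability."""
    parameter = aggregate_arrival_rate / aggregate_service_rate
    if not 0.0 < parameter < 1.0:
        raise ValueError("stability requires a parameter strictly between 0 and 1")
    return parameter


def geometric_pmf(parameter, n):
    """P(Q = n) = (1 - p) * p**n for n = 0, 1, 2, ..."""
    return (1.0 - parameter) * parameter ** n


def geometric_mean(parameter):
    """Mean of the geometric distribution above: p / (1 - p)."""
    return parameter / (1.0 - parameter)


mu, lam = 1.0, 0.9                    # assumed server speed and arrival rate
aggregate_arrival = (5.0 / 3.0) * lam  # N*lambda*(p_{T_1} + p_{T_2})
aggregate_service = 2.0 * mu           # two compatible servers
p_T2 = geometric_parameter(aggregate_arrival, aggregate_service)
print(p_T2, geometric_mean(p_T2))
```

As the parameter of a segment approaches one, the mean p/(1 − p) grows without bound, which is the mechanism behind the heavy-traffic scaling studied in the next subsection.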
First we introduce the random vector α = (α T ) T ∈N to distinguish between the different ordered vectors T that may occur.In particular, α ≡ e T , a |N |-dimensional unit vector with a one entry at the position corresponding to T , with probability P(T ).
Then we construct a matrix M (T ) ∈ N |T |×|T | with random entries corresponding to the numbers of jobs of the various types in each of the |T | segments, Hence, Relying on the observation in (34), we can write Define the vector 1(T ) ∈ {0, 1} |S| such that 1(T ) S = 1{S ∈ T } and an all-ones vector of length |T |, E(T ).
Theorem E.2.With the notation as above, the numbers of jobs of the various types can be written in terms of geometrically distributed random variables defined in Theorem E.1, The above theorem gives a probabilistic representation of the numbers of jobs of the various types which agrees with the PGF in Proposition 3.1.We emphasize that P (T ) and E(T ) consist solely of 0 and 1 entries, while M (T ) contains geometrically distributed random variables where the various columns are independent (Theorem E.1, parts a and d) and the sum of these columns yields again geometrically distributed random variables (Theorem E.1, part b).Hence (Q S ) S∈S follows a mixture distribution, where each mixture component corresponds to a sum of geometrically distributed random variables.Remark 6.The above probabilistic interpretation focuses on the redundancy c.o.c.policy.However, many of the above observations and results are also true for the redundancy c.o.s.policy with some minor adjustments.In particular, the results in parts a, b and d of Theorem E.1 still hold for the waiting numbers of jobs in the various segments and part c is alternatively formulated as with normalization constant (C ′ ) −1 = T ∈N h(T , 1)k(T ) and with E(T ) the set of all ordered vectors of idle servers which are not compatible with any of the job types in T .Consequently the statement in Theorem E.2 is also true for the numbers of waiting jobs of the various types, i.e., ( QS ) S∈S , when the weights of the unit vectors α are updated according to (36).

E.2 Heavy-traffic characterization and Theorem 4.1
We now use the stochastic characterization of the system as derived in the previous subsection to investigate the heavy-traffic behavior.This analysis will result in an alternative proof of Theorem 4.1 along with a probabilistic interpretation.
We first identify the dominant state configurations in the heavy-traffic regime.As it turns out, in general only the states with configurations with a maximum number of nested critical subsets will occur with non-zero probability in the heavy-traffic regime.
As already suggested by the notation, the above expression for the heavy-traffic limit of the probabilities P(T ) coincides with (7) for T ∈ N K .The proof of Lemma E.1 follows similar arguments as the proof of Proposition 4.1.
Example 14.Consider the system in Example 5. Using Lemma E.1, it can be computed that Second, we focus on the number of jobs in each segment, i.e., (Q j | T ) |T | j=1 for a given ordered vector T .As formalized below, the scaled number of jobs between the first occurrences of a type T j and T j+1 job, Q j | T , will either converge to an exponentially distributed random variable or vanish in the heavy-traffic regime.
Since some of the segments, i.e., (Q j | T ) j / ∈CR(T ) , will vanish after scaling, the numbers of jobs of the types T 1 , . . ., T j in that segment will also vanish.For the numbers of jobs of the various types in the remaining segments, we can show the following convergence results.
Proof. Using Theorem E.1, we can derive the PGF of 1 Relying on (37), we observe that lim Noticing that $\big(1 + \sum_{i=1}^{j} \frac{p_{T_i}}{p(T,j)} t_{T_i}\big)^{-1}$ is the Laplace transform corresponding to $\big(U \frac{p_{T_i}}{p(T,j)}\big)_{i=1}^{j}$, the results follow by applying Lévy's Continuity Theorem [25].
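The Laplace transform identity used in this proof can be checked numerically. For a unit-mean exponential random variable U and weights w_i = p_{T_i}/p(T, j), the joint transform of (w_i U)_i at (t_i)_i equals 1/(1 + Σ_i w_i t_i). The sketch below compares this closed form with a midpoint-rule integration against the exponential density; the weights and arguments are hypothetical values.

```python
import math


def laplace_closed_form(weights, t):
    """E[exp(-sum_i t_i * w_i * U)] for a unit-mean exponential U."""
    s = sum(w * t_i for w, t_i in zip(weights, t))
    return 1.0 / (1.0 + s)


def laplace_numeric(weights, t, upper=60.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-(1 + s) * u) over [0, upper]."""
    s = sum(w * t_i for w, t_i in zip(weights, t))
    h = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += math.exp(-(1.0 + s) * u)
    return total * h


weights = [0.2 / 0.6, 0.1 / 0.6, 0.3 / 0.6]  # hypothetical p_{T_i} / p(T, 3)
t = [0.5, 1.0, 2.0]
print(laplace_closed_form(weights, t), laplace_numeric(weights, t))
```

Since a single exponential variable drives all job types in the segment, the limiting per-type counts within one segment are fully dependent, in contrast with the independence across segments.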
From the above lemmas and the observation in (34), it follows that as λ ↑ λ*, for any T ∈ N_k for some k = 0, 1, . . ., K. Note that some of the terms in the above summations have no contribution, i.e., those with j ∉ CR(T), as they may correspond to vanishing segments. In fact, at most k of the terms are nonzero. Therefore we can rewrite the summation while focusing on those non-zero elements. Let CR(T) = {i_1, . . ., i_k} as in Definition 5 with i

Example 15. Consider the system as outlined in Example 5 and T = [{1}, {3}, {3, 4}, {1, 2, 3}], then CR(T) = {i_1, i_2, i_3} = {1, 3, 4}. According to Lemma E.2, we observe as λ ↑ λ* with U_1, U_2 and U_3 three independent unit-mean exponential random variables. The second segment is vanishing for the given ordered vector T, since the limiting arrival rate of type-{1} and type-{3} jobs combined is strictly less than the aggregate service rate of their compatible servers, i.e., Nλ*(p_{1} + p_{3}) = (5/3)µ < 2µ. Moreover, the scaled total number of jobs of the various types behaves as as λ ↑ λ* by relying on the observation in (38).
We will now combine the results in this subsection with those in the previous subsection to describe the heavy-traffic limiting distribution of the scaled numbers of jobs of the various types, irrespective of the ordered vector of job types.
First we note that only a few of the possible values of the |N|-dimensional random vector α = (α_T)_{T∈N} can occur in the heavy-traffic regime with non-zero probability. Specifically, Lemma E.1 implies that as λ ↑ λ*, where α* = e_T with probability P*(T) if T ∈ N_K and α* = 0 otherwise. Second, we observe that, irrespective of the choice of T ∈ N, 1(T) ∈ {0, 1}^{|S|} becomes an all-zeros vector in the limit when scaled by (1 − λ/λ*). In other words, the contribution of the first occurrence of a job type to the (scaled) total number of jobs of that type will be negligible in the heavy-traffic regime.
Hence, we are now interested in Thanks to Lemmas E.2 and E.3 we know how to characterize the limiting random variables which are the entries of the matrix M * (T ).With T ∈ N k , only k out of the |T | columns will contain non-zero elements (Lemma E.2), with which we can associate k independent unit-mean exponential random variables U 1 , . . ., U k .
Combining the above results, the (scaled) limiting distribution of (35) is given by as λ ↑ λ*. Note that the expression on the right-hand side is indeed a vector with components indexed by S ∈ S. Furthermore, rewriting M*(T)E(T) can further simplify the representation due to the |T| − K all-zeros columns. Define the |S| × K dimensional matrix W(T), which contains the weights or fractions each of the job types receives of the independent exponential random variables U_1, . . ., U_K, i.e., $W(T)_{i,l} = p_{T_i}/p(T, i_l)$ if $i \leq i_l$, and $W(T)_{i,l} = 0$ otherwise, with CR(T) = {i_1, . . ., i_K}, for all i = 1, . . ., |S| and l = 1, . . ., K. Note that we do not alter the permutation matrix P(T); hence the indices in the above-defined matrix follow the job types, and hence the order, of the vector T. Moreover, with as λ ↑ λ*, which is consistent with the result in Theorem 4.1. We emphasize that the matrices P(T) and W(T) consist solely of real-valued and non-negative entries given the vector T, and only U(K) contains random variables. Hence, (Q_S)_{S∈S}, properly scaled, will follow a mixture distribution in the limiting regime, where each mixture component consists of a weighted sum of independent exponential random variables. Note that the expression on the right-hand side of the above equation is again a vector with components indexed by S ∈ S. where we used that the value of k(T) is the same for all vectors T ∈ N_K as they all include the same set of jobs.
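The weight matrix W(T) can be assembled mechanically from the ordered probabilities and the critical positions. The following is a minimal sketch under the definition W(T)_{i,l} = p_{T_i}/p(T, i_l) for i ≤ i_l and 0 otherwise, with p(T, i_l) taken as the partial sum of the first i_l probabilities. The function name and the probabilities are hypothetical; the critical positions {1, 3, 4} are those of Example 15.

```python
def weight_matrix(p_ordered, critical_positions, num_types):
    """Build W(T) as a list of rows; positions are 1-based as in the text."""
    W = [[0.0] * len(critical_positions) for _ in range(num_types)]
    for col, i_l in enumerate(critical_positions):
        partial_sum = sum(p_ordered[:i_l])  # p(T, i_l)
        for i in range(1, i_l + 1):
            W[i - 1][col] = p_ordered[i - 1] / partial_sum
    return W


# Hypothetical probabilities for T = [T_1, T_2, T_3, T_4] with CR(T) = {1, 3, 4}.
p_ordered = [0.2, 0.1, 0.3, 0.4]
W = weight_matrix(p_ordered, critical_positions=[1, 3, 4], num_types=4)
column_sums = [sum(row[col] for row in W) for col in range(3)]
for row in W:
    print(row)
```

Each column sums to one, which reflects that the exponential variable attached to a critical segment is fully distributed over the job types that can appear in that segment.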

F Moments
In Section 4 we focused on the Laplace transform and joint distribution of the (scaled) numbers of jobs of the various types.In this section we will turn our attention to the nth moments of the total number of jobs and those of the individual job types.The nth moments of the total number of (waiting) jobs for a fixed arrival rate were derived in [11,Lemma EC.4] and are stated below for completeness.
Proposition F.1.Under the stability conditions stated in Condition 1, the nth moment of the total number of jobs in the system, Q, under the c.o.c.mechanism is given by where for any S ∈ S m and j = 1, . . ., m we define Alternatively, one could rely on the probabilistic arguments outlined in Subsection E.1 to derive an expression for the nth moments in terms of the individual moments of geometrically distributed random variables.An outline of the proof is deferred to Appendix F.2.

F.1 Convergence of the moments
Relying on Proposition F.1, we can prove the following result.

Note that Theorem F.1 aligns with the result in Corollary 4.1 since the nth moment of an Erlang distribution with parameters 1 and K is precisely given by (42).
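For reference, the moments of an Erlang distribution with rate 1 and shape K have the standard closed form (n + K − 1)!/(K − 1)!. The sketch below evaluates this formula and the limiting scaled expected response time K/(Nλ*) stated in Corollary F.1. It relies on the textbook Erlang moment formula rather than on the paper's expression (42), and the function names and numerical values are hypothetical.

```python
import math


def erlang_moment(n, shape):
    """nth moment of an Erlang distribution with rate 1 and the given shape."""
    return math.factorial(n + shape - 1) // math.factorial(shape - 1)


def limiting_scaled_response_time(shape, total_critical_arrival_rate):
    """K / (N * lambda^*): the first Erlang moment divided by the arrival rate."""
    return erlang_moment(1, shape) / total_critical_arrival_rate


K = 3
print([erlang_moment(n, K) for n in range(1, 4)])
print(limiting_scaled_response_time(K, total_critical_arrival_rate=6.0))
```

The first moment equals K, so every additional critically loaded subsystem adds one unit to the limiting scaled mean number of jobs.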
Proof.C.o.c.mechanism: From Proposition F.1, we can write and h(T , 1) as in (14).Note that f and f no longer depend on (z S ) S∈S , but do depend on the arrival rate λ.
To ease the notation, this dependence is omitted.We also note that the summation index m in (40) ranges from 1 to |S|, hence in the notation above we must exclude the empty vector from N 0 .This is a minor technicality that is negligible in the heavy-traffic limit.Let us first focus on the limiting behavior of the numerator of (43), which can be rewritten as It can easily be observed that Obviously, the first product has a non-zero limit only if n i = 0 for all i / ∈ CR(T ).For each i ∈ CR(T ), the corresponding term in the above product can be rewritten as as in (15).By definition of m ∈ R(n i ), we have that n i ≥ |m|.However, only terms with precisely n i = |m| will be non-vanishing in (47).This implies that m Now, let us focus on the denominator of (43).From ( 16) and ( 17 In order to prove the convergence result in Theorem F.1 in terms of the total number of jobs in the system, we use that Q ≤ Q ≤ Q + N .Hence, where the last term can be rewritten as as λ ↑ λ * .We used the inductive argument that the kth moment of 1 − λ λ * Q converges to k+K−1 K−1 , for k < n.This concludes the proof.Applying Little's law [34] to the first moments in Theorem F.1 results in the following corollary for the response time of an arbitrary job.In [11,Lemma EC.5], also an expression for the nth moment of the number of jobs of a particular job type S ∈ S was derived.Using a similar methodology as above we can prove the following theorem.Hence F (s) = F (s) + 1, so that (50) follows.This concludes the proof.

Figure 1: Visualization of the N-model.

Proposition 3.1 (Probability generating function - c.o.c.). The joint PGF of the numbers of jobs of the various types for the redundancy c.o.c. policy is given by

Lemma 3.1. The following two statements are equivalent for given values of µ_n, n = 1, . . ., N, and p_S, S ∈ S: (i) The (weak) CRP condition is satisfied; (ii) The depth of the critical subsets K = 1.

Figure 2: An illustrative example with model parameters as defined in Example 5.

Corollary 4.1 (Convergence of the total number of jobs). With the definitions as in Section 3, the PGF of the (scaled) total number of jobs, i.e., E[z^Q] with in distribution to a random variable with an Erlang distribution with parameters 1 and K.

The proofs of Proposition 4.1, Theorem 4.1 and Corollary 4.1 are deferred to Subsection 4.2.

Figure 3: Schematic representation of the derived heavy-traffic (HT) results starting from the PGF.

Figure 5: An illustrative example with model parameters and CRP components as defined in Example 10.
. , |T | − 1. Analogously, Q |T | | T denotes the number of jobs that arrived after the first T |T | -type of job.We refer to Q j | T as the numbers of jobs in the jth segment given T .Moreover, define Q j Ti | T as the number of type-T i jobs in the jth segment, for i = 1, . . ., j and j = 1, . . ., |T |.Example 11.Consider the system in Example 5 and let us illustrate the above definition by fixing the ordered vector T = [T 1 , T 2 , T 3 , T 4 ] = [{1}, {3}, {3, 4}, {1, 2, 3}].The state of the system could be given by j | = n j to obtain P Q j j∈J = n j j∈J ,

j = 1 ,Example 13 .
for a given T ∈ N .Notice that the different columns correspond to the various (independent) segments (Theorem E.1, parts b and d), and the rows correspond to the numbers of jobs of the various types in those segments (Theorem E.1, part a).The triangular part below the diagonal will always contain zeros.Since |T | is possibly smaller than |S|, M (T ) will be embedded into a larger matrix with the desired dimensions.Let M (T ) ∈ N |S|×|T | be such thatM (T ) = M (T ) O ,whereOis an all-zeros matrix with dimensions (|S| − |T |) × |T |.Now, the rows in M (T ) are arranged according to the order of the job types in T , and ideally they would be arranged according to the set of all job types (S) S∈S .This can be accomplished by multiplying M (T ) to the left with a carefully designed permutation matrix P (T ) of dimensions |S| × |S|, i.e., . . ., |T | and T j = S 1 if j = |T | + 1, . . ., |S| and Tj−|T |−1 = S 0 otherwise, permuting the jth row to the Sth row for j = 1, . . ., |S| and S ∈ S. The first case ensures that T j = S is associated with the correct job type.The second case takes care of the job types that are not present in T as T is defined as an arbitrary, but fixed, order of the job types in S \ T .With the notation as above, we can define M (T ) and P (T ) with T as in Example 11, i.e., M

Theorem F.1. With the definitions as in Section 3, it holds for both the c.o.c. and c.o.s. mechanisms lim λ↑λ

) we know that lim λ↑λ * f ( 1 1
by (49) concludes the proof for the c.o.c.mechanism.C.o.s.mechanism: The proof concepts are similar to those in the proof for the c.o.c.mechanism.Together with the arguments in the proof of Proposition 4.1 we can observe that lim λ↑λ *

Corollary F.1 (Expected response time). With the definitions as in Section 3 and R denoting the response time of an arbitrary job, it holds for both the c.o.c. and c.o.s. mechanisms that $\lim_{\lambda \uparrow \lambda^*} E\big[\big(1 - \frac{\lambda}{\lambda^*}\big) R\big] = \frac{K}{N \lambda^*}$.

The limiting mixture distribution (X S ) S∈S in Example 4 obtained after applying Theorem 4.1, with U 1 , U 2 and U 3 independent exponentially distributed random variables with unit mean.