Sufficient Conditions for Central Limit Theorems and Confidence Intervals for Randomized Quasi-Monte Carlo Methods

Randomized quasi-Monte Carlo methods have been introduced with the main purpose of yielding a computable measure of error for quasi-Monte Carlo approximations through the implicit application of a central limit theorem over independent randomizations. But to increase precision for a given computational budget, the number of independent randomizations is usually set to a small value so that a large number of points are used from each randomized low-discrepancy sequence to benefit from the fast convergence rate of quasi-Monte Carlo. While a central limit theorem has been previously established for a specific but computationally expensive type of randomization, it is also known in general that fixing the number of randomizations and increasing the length of the sequence used for quasi-Monte Carlo can lead to a non-Gaussian limiting distribution. This paper presents sufficient conditions on the relative growth rates of the number of randomizations and the quasi-Monte Carlo sequence length to ensure a central limit theorem and also an asymptotically valid confidence interval. We obtain several results based on the Lindeberg condition for triangular arrays and expressed in terms of the regularity of the integrand and the convergence speed of the quasi-Monte Carlo method. We also analyze the resulting estimator’s convergence rate.


INTRODUCTION
Research and analysis of complicated problems across diverse fields of science, engineering, business, etc., often entail computing integrals, as in molecular dynamics, queueing systems, or pricing of financial instruments. The integral frequently corresponds to the mean of a stochastic model, and model complexity usually precludes analytically evaluating the integral, so numerical methods are employed. Monte Carlo (MC) methods are computational algorithms based on random sampling that can be applied to estimate a mean (integral); see, e.g., [1,17] among the vast literature on the topic. Along with its many other desirable features, MC can provide a measure of an estimate's precision through a confidence interval (CI) obtained from an associated central limit theorem (CLT) assumed to hold as the sample size tends to infinity.
Quasi-Monte Carlo (QMC) methods [27,31] replace the random sampling of (standard) MC by a sequence of points deterministically chosen to be "evenly dispersed" over the space. The points form a so-called low-discrepancy sequence, and thanks to the more rapid coverage of the domain, the resulting approximation can converge faster than MC to the integral's exact value, at least for regular integrands. This improved convergence speed can be shown through several existing error bounds [15], the most famous being the Koksma-Hlawka inequality [31, Section 2.2]. A drawback of QMC stems from the fact that such error bounds, while theoretically valuable in proving asymptotic properties, are in most cases practically useless, being difficult to compute and grossly overestimating the error for a fixed sample size and specific integrand; see, e.g., [41, Section 2.2] for a discussion of the issues.
Randomized quasi-Monte Carlo (RQMC) methods have been introduced with the main purpose of solving this problem of assessing error in QMC methods. The principle is to randomly "shake" the low-discrepancy sequence, but without losing its good repartition over the sampling space. From independently randomized QMC estimators, we then can obtain an approximate CI via a CLT, exploiting the assumed convergence to a normal distribution as the number of randomizations tends to infinity. For tutorials on RQMC, see among others, [19,22,27,41]. Several types of randomization exist, the main ones being the random shift [7], which translates all points of the low-discrepancy sequence by the same uniform random vector; the digital shift [19], which applies random (digital) shifts to digits of the points of the low-discrepancy sequence, the same digital shift being utilized on digits of the same order for all the points; and the digital scrambling, which randomly permutes the digits [35,36].
In implementing RQMC, we apply r independent randomizations of n points from the low-discrepancy sequence of QMC. Given a computation budget that allows for about b evaluations of the integrand, the user then needs to determine how to allocate b to (n, r) so that nr ≈ b. Choosing n large and r correspondingly small benefits from the faster convergence speed (i.e., the improved precision) of QMC compared to MC. One might try applying a common rule of thumb that suggests a CLT roughly holds for fixed r ≈ 30, and taking n to be as large as possible to get a smaller error. But this heuristic generally lacks a theoretical basis. For example, [23] establish that for a single (i.e., r = 1) random shift of a lattice, the limiting error distribution as n → ∞ typically follows a non-normal distribution; thus, for RQMC with any fixed r ≥ 1 in this setting, the error's limiting distribution will not be Gaussian, so the common suggestion of specifying a small r rests on unsteady ground. A CLT, though, has been verified by [28] (as n → ∞ for fixed r ≥ 1) in the case of digital scrambling of a particular type called nested uniform scrambling; [2,13] similarly prove CLTs for related scrambles. But this scrambling is computationally expensive, which has limited its adoption in practice in the past, although this may be less of an issue with today's more powerful computers. The last paragraph of Section 17.6 of [37] states, "It seems likely that the other scrambles considered [in his book] do not satisfy a central limit theorem" for a single or fixed number of randomizations.
The question we thus aim at answering here is, for RQMC, how should (n, r) comparatively increase to ensure a CLT and yield an asymptotically valid CI (AVCI)? The ordinary (Lindeberg-Lévy) CLT [3, Theorem 27.1] typically does not apply in our setting because the distribution of the estimator from a single randomization changes as n varies. Instead, we formulate the problem using a so-called triangular array, for which a CLT can be obtained under the Lindeberg condition [3, Theorem 27.2]. More precisely, we provide sufficient conditions on the relative increase of n and r under various alternative assumptions on properties of the integrand. We initially establish CLTs in terms of the true variance from a single randomization, but this quantity is usually unknown. Thus, we also derive AVCI conditions when the variance is estimated. Comparing our conditions on the integrand shows that weaker assumptions correspond to stronger requirements on the number of randomizations (i.e., higher relative proportion for r). Similarly, the faster the convergence of the underlying QMC method, the more weight should be placed on randomizations. But in all cases when r → ∞, we further verify that under our assumptions, RQMC always outperforms MC in terms of convergence speed of the estimators' root-mean-square errors (RMSEs).
The rest of the paper unfolds as follows. Section 2 presents the notation and reviews MC, QMC and RQMC methods. It also introduces the different assumptions on the relative increase of n and r and on the properties of the integrand, along with related useful lemmas. Section 3 applies the conditions of Lindeberg and Lyapounov [3, Theorem 27.3] for triangular-array CLTs in our RQMC framework. We further specialize the CLTs to obtain corollaries exploiting specific properties of the integrand, and show that stronger conditions on the integrand allow more weight to be put on the QMC component. Section 4 presents similar results when the variance is estimated, the "real-life" situation, to obtain conditions for AVCI. Section 5 expresses the results under the special context when n_b ≡ b^γ and r_b ≡ b^(1-γ) for γ ∈ (0, 1), helping to gain deeper insights on the conditions and their implications. Section 6 provides concluding remarks. The appendices contain all proofs (Appendix A), as well as additional analysis when (n_b, r_b) = (b^γ, b^(1-γ)) (Appendices B and C) and numerical results (Appendix D). The present work extends our conference paper [30], which presents many of the theorems, corollaries and propositions (but all without proofs), as well as Figures 1, 3, and 4, which are in Appendix C of the current paper. Moreover, [30] focuses mainly on the case of (n_b, r_b) = (b^γ, b^(1-γ)), whereas the current paper also develops a more general setting.

NOTATION/FRAMEWORK
Our goal is to estimate μ = E[h(U)], where h : [0, 1]^d → ℝ is a given function (integrand) for some fixed dimension d ≥ 1, U ∼ U[0, 1]^d is a random vector with U[0, 1]^d the uniform distribution on the d-dimensional unit hypercube [0, 1]^d, ∼ means "is distributed as", and E denotes the expectation operator. Integrating over [0, 1]^d is the standard QMC setting, and means of many stochastic models may be expressed in this way, e.g., through a change of variables. We can think of integrand h as being a complicated simulation program that converts independent univariate uniform random numbers into observations from specified input distributions (possibly with dependencies and different marginals), which are used to produce an output of the stochastic model.
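As a toy illustration of this view (our own hypothetical example, not from the paper), the following integrand h converts two independent uniforms into two Exp(1) variates by inverse transform and returns their sum, so μ = E[h(U)] = 2:

```python
import math

def h(u):
    """Hypothetical integrand h : [0,1]^2 -> R.

    Converts two independent uniforms into two Exp(1) variates by the
    inverse-CDF transform and returns their sum, so mu = E[h(U)] = 2.
    """
    x1 = -math.log(1.0 - u[0])  # Exp(1) via inverse CDF
    x2 = -math.log(1.0 - u[1])
    return x1 + x2
```

Any MC, QMC, or RQMC scheme below then estimates μ by averaging h over (randomized) points in [0, 1]^2.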
The following subsections will describe methods for estimating μ via MC, QMC, and RQMC.

Randomized Quasi-Monte Carlo
RQMC randomizes the low-discrepancy sequence, without losing its good repartition property, and computes an estimator from the randomized point set. More precisely, let (U'_j)_{j≥1} be a randomized low-discrepancy sequence constructed from Ξ, such that each U'_j is uniformly distributed over [0, 1]^d but the (U'_j)_{j≥1} are correlated and preserve the low-discrepancy property of the original sequence. RQMC repeats this r ≥ 1 i.i.d. times, computing an estimator from each randomization. Specifically, let U'_{i,j} ∈ [0, 1]^d be the j-th point of the i-th i.i.d. randomization (j ≥ 1 and 1 ≤ i ≤ r). The RQMC estimator is then the average over the r randomizations of the per-randomization sample means of h. (The allocation may depend on the total number b of evaluations of h, but we use the simpler notation (n, r) here to introduce the ideas before adopting the more complicated notation (n_b, r_b) later.) We next describe some existing randomizations and associated existing theoretical results.

Random Shift.
A random shift of a low-discrepancy sequence generates a single uniformly distributed point S ∼ U[0, 1]^d and adds it to each point of Ξ, modulo 1, coordinate-wise [7]. Formally, using the first n points of the low-discrepancy sequence Ξ = (ξ_j)_{j≥1} and r ≥ 1 independent randomizations leads to the shifted points U'_{i,j} = (ξ_j + S_i) mod 1, j = 1, …, n, with i.i.d. shifts S_1, …, S_r, and to an error bound depending on n, as well as r, d, and h. Thus, the convergence speed has an asymptotic upper bound that is even better than what we get from the Koksma-Hlawka bound (2) and (3). For a randomly-shifted low-discrepancy sequence, the RQMC estimator may not obey a Gaussian CLT for a fixed r as n → ∞, as shown for lattices in [23]. In particular, for r = 1, the limiting error distribution in dimension d = 1 is uniform over a bounded interval if the integrand is non-periodic, and has a square-root form over a bounded interval if the integrand is periodic. In higher dimensions (still for r = 1), [23] argues that characterizing the error distribution is not practical. Thus, for any fixed r ≥ 1, the limit distribution as n → ∞ is generally non-normal, motivating our goal of defining rules on (n, r) that ensure a Gaussian CLT.
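A minimal sketch of the random shift (illustrative code with our own names; the base points stand in for the first n points of Ξ):

```python
import random

def random_shift(points, dim, rng=random):
    """One random shift: draw a single uniform vector S and add it to
    every point of the sequence, coordinate-wise modulo 1 [7]."""
    shift = [rng.random() for _ in range(dim)]
    return [tuple((x + s) % 1.0 for x, s in zip(p, shift)) for p in points]

# Tiny illustration on the first n = 4 points of a 1-D lattice-like set.
base_points = [(0.0,), (0.25,), (0.5,), (0.75,)]
shifted = random_shift(base_points, dim=1)
```

Each shifted point is uniformly distributed on [0, 1), while the equidistant structure of the point set is preserved modulo 1.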

Scrambled Digital Nets.
Owen [34,35] introduces scrambled digital nets as another form of RQMC. The method scrambles the digits of special low-discrepancy sequences, namely, digital nets [31]. The approach applies random permutations to the digits in a way that preserves the low-discrepancy property. We do not provide the full description of the construction of the low-discrepancy sequence (see [31, Chapter 4] for details), but rather focus on the randomization. For a digital net Ξ = (ξ_j)_{j≥1} in base b_0 and dimension d, we express the k-th coordinate (1 ≤ k ≤ d) of the j-th point ξ_j = (ξ_j^(1), …, ξ_j^(d)) through its base-b_0 digit expansion. Letting U'_j = (U'_j^(1), …, U'_j^(d)) denote the j-th randomized point in a generic randomization (therefore omitting the index i for the i-th randomization), its k-th coordinate is defined by randomly permuting the digits of ξ_j^(k): the permutation applied to the ℓ-th digit may depend on the preceding digits, and all permutations are independent and uniformly distributed on the set of b_0! permutations of {0, …, b_0 − 1}. In other words, the digits are randomly permuted, with independent permutations. This randomization is called nested uniform scrambling. For a digital net, the base-b_0 expansion of each coordinate is finite, meaning that the required number of permutations is finite, even if large. Other more computationally efficient (but theoretically less powerful) forms of scrambling appear in [16,19], and [29], including the linear matrix scramble.
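The following sketch (our own simplified code, assuming base b_0 = 2 and truncating each coordinate to a fixed number of digits) illustrates nested uniform scrambling: in base 2, a uniform random permutation of {0, 1} is simply an optional bit flip, and the flip applied to digit ℓ is drawn independently for each digit prefix, shared by all points so the scramble is consistent across the point set.

```python
import random

def nested_scramble_base2(points, n_digits=16, rng=random):
    """Nested uniform scrambling in base b0 = 2 (a sketch)."""
    flip = {}  # (coordinate index, tuple of preceding digits) -> random bit

    def scramble_coord(x, k):
        digits, y = [], 0.0
        for l in range(n_digits):
            x *= 2
            d = int(x)          # l-th binary digit of the coordinate
            x -= d
            key = (k, tuple(digits))
            if key not in flip:
                flip[key] = rng.randrange(2)  # uniform permutation of {0,1}
            y += (d ^ flip[key]) / 2 ** (l + 1)
            digits.append(d)
        return y

    return [tuple(scramble_coord(x, k) for k, x in enumerate(p))
            for p in points]
```

Because two points differing in their first digit are flipped by the same permutation, they still land in different halves of [0, 1), reflecting how the scramble preserves equidistribution.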
Scrambled digital nets keep the discrepancy property of the original digital net. Specifically, let Ξ^Π denote the scrambling of a digital net Ξ in base b_0. If there is a constant c > 0 such that D*_n(Ξ) ≤ c n^(-1) (ln n)^d for all n, then the scrambled net satisfies a bound of the same order [19,34], so the estimator in (4) from a single randomization converges at the same speed as (7) when the integrand h ∈ BVHK. For special classes of h [36], the RMSE of the quadrature rule based on nested uniform scrambling can be as small as O(n^(-3/2) (ln n)^((d-1)/2)); Appendix B.5 describes similar results.
For RQMC via digital nets and Owen's full nested scrambling, [28] establishes a Gaussian CLT for r = 1 as n → ∞ (so the number of involved independent permutations also tends to infinity); [2] and [13] establish extensions. But these CLTs are restricted to this specific scrambling, whose large computational cost has limited its use in practice in the past (although this may be less of an issue with today's more powerful computers); more general CLTs are needed for other forms of RQMC. Another drawback of the CLT of [28] is that it applies nested scrambling to only a particular class of digital sequences with so-called t-value of 0, where lower values of t ensure better equidistribution; see [37, Section 15.7] for details. But restricting to t = 0 rules out the popular Sobol' sequence except in very small dimensions d. While t = 0 allows for Faure sequences, empirical studies seem to indicate that these produce less accurate approximations to μ than Sobol' sequences.
As with the scrambling procedure of Section 2.3.2, the digital shift applied to a digital net retains the low-discrepancy property of the original sequence [19]. Specifically, for a digital net Ξ, let Ξ^Dig be its digital shift based on S ∼ U[0, 1]^d. Then there exists a constant c' > 0 such that when D*_n(Ξ) ≤ c n^(-1) (ln n)^d for all n, we also have D*_n(Ξ^Dig) ≤ c' n^(-1) (ln n)^d for all n.

Assumptions and Preliminary Results
Recall that for a given computation budget of about b integrand evaluations, we define the RQMC estimator in (4) with (n, r), where r is the number of randomizations and n is the number of points used from each randomized sequence, so the total number of evaluations of the integrand h is nr. For a fair comparison with MC, which uses b evaluations of h when the sample size is b, we will assume that RQMC has nr ≈ b, which will be more precisely expressed below. To study the asymptotic behavior as b → ∞, we take n ≡ n_b ≥ 1 and r ≡ r_b ≥ 1 as functions of b satisfying the following: Assumption 1.A. n_b r_b ≤ b for each b ≥ 1, with n_b r_b / b → 1 and r_b → ∞ as b → ∞.
Under Assumption 1.A, the RQMC estimator of μ in (4) becomes μ̂^RQ_{n_b,r_b}, the average of the estimators μ̂_{n_b,i} from randomizations i = 1, 2, …, r_b, each based on n_b points, where n_b r_b ≤ b. Section 3's goal is to determine conditions on the behavior of allocation (n_b, r_b) as b grows that ensure μ̂^RQ_{n_b,r_b} obeys a CLT (with Gaussian limit) as b → ∞, and Section 4 derives conditions for an asymptotically valid CI for μ. Other papers (e.g., [8,11]) adopt a framework similar to Assumption 1.A to study MC methods for analyzing steady-state behavior of a stochastic model (for example, batching with r_b batches, each of size n_b).
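In code, this estimator is just the average of the per-randomization sample means (a sketch with our own function names; `randomize` stands in for any of the randomizations of Section 2.3):

```python
def rqmc_estimate(h, points, randomize, r, rng):
    """Average of r per-randomization sample means of h.

    points:    the first n points of a low-discrepancy sequence.
    randomize: function (points, rng) -> independently randomized copy.
    Returns the RQMC estimate and the list of per-randomization means.
    """
    per_rand = []
    for _ in range(r):
        randomized = randomize(points, rng)
        per_rand.append(sum(h(u) for u in randomized) / len(randomized))
    return sum(per_rand) / r, per_rand
```

The list of r per-randomization means is returned as well, since Section 4 builds the variance estimate and confidence interval from exactly these i.i.d. quantities.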
Assumption 1.A requires r_b → ∞ as b → ∞ because otherwise, a Gaussian CLT may not hold. As noted earlier, [23] show that when applying RQMC using a lattice rule and the random shift, the resulting estimator can obey a limit theorem with non-Gaussian limit as n → ∞ for fixed r ≥ 1 (see the discussion at the end of Section 2.3.1), and the only Gaussian CLTs that the authors are aware of are for the (computationally expensive) nested digital scrambling [2,13,28] (see the end of Section 2.3.2). But while Assumption 1.A specifies that r_b → ∞ as b → ∞, it does not require that n_b → ∞ as b → ∞, and we sometimes take n_b = n_0 for some fixed n_0 ≥ 1, where n_0 = 1 corresponds to MC. Section 5 will also consider the following special case of Assumption 1.A: Assumption 1.B. n_b = b^γ and r_b = b^(1-γ) with γ ∈ (0, 1).
We now give some remarks on Assumption 1.B.
• Under Assumption 1.B, both n_b, r_b → ∞ as b → ∞ because γ ∈ (0, 1). In particular, r_b = b^(1-γ) → ∞ precludes the setting of [23] in which a Gaussian CLT does not hold.
• As n_b and r_b need to be integers, Assumption 1.B should define, e.g., n_b = ⌊b^γ⌋ and r_b = ⌊b^(1-γ)⌋, where ⌊·⌋ denotes the floor function. Moreover, while our asymptotic analyses also will hold for (n_b, r_b) = (a_1 + c_1 b^γ, a_2 + c_2 b^(1-γ)) for nonnegative constants a_1 and a_2 and positive constants c_1 and c_2, we simplify the discussion by assuming that (n_b, r_b) = (b^γ, b^(1-γ)) is integer-valued under Assumption 1.B.
• Assumption 1.B precludes allocations such as (n_b, r_b) = (b/ln b, ln b) allowed by 1.A. But allocations of this form can permit the number of points from the QMC sequence to grow more rapidly than b^γ for any γ ∈ (0, 1) as b → ∞, so Assumption 1.A enables taking greater advantage of QMC's fast convergence rate than is possible under Assumption 1.B.
• While Assumption 1.B specifies that γ < 1, we could also consider the case of γ = 1, which corresponds to a single randomization or a fixed number r_0 of randomizations. Then for fully nested scrambling [2,13,28], each randomization satisfies a CLT as b → ∞, so an asymptotically valid Student-t confidence interval could be constructed when r_0 ≥ 2; see Appendix B.5 for related discussions.
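With integer rounding via the floor function, the budget split can be computed as (a trivial sketch):

```python
import math

def allocation(b, gamma):
    """Budget split under Assumption 1.B: n_b = floor(b**gamma) points
    per randomization and r_b = floor(b**(1 - gamma)) randomizations,
    so that n_b * r_b <= b always holds."""
    n_b = math.floor(b ** gamma)
    r_b = math.floor(b ** (1 - gamma))
    return n_b, r_b
```

For example, a budget b = 10^6 with γ = 1/2 gives n_b = r_b = 1000, while γ closer to 1 shifts the budget toward longer low-discrepancy sequences and fewer randomizations.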
For those γ ∈ (0, 1) ensuring that μ̂^RQ_{n_b,r_b} satisfies a Gaussian CLT or that AVCI holds, Section 5 determines the value of γ that maximizes the convergence rate for RMSE[μ̂^RQ_{n_b,r_b}] as b → ∞. Let Ξ' be a generic randomized low-discrepancy sequence. It is the randomly shifted sequence for a random shift (Section 2.3.1), Ξ' = Ξ^Π for digital scrambling (Section 2.3.2), and Ξ' = Ξ^Dig for a digital shift (Section 2.3.2). As we will explain below, each of these randomization methods satisfies the following condition, which will be helpful for analyzing RQMC estimators with integrands h ∈ BVHK. Assumption 2. For the RQMC method used, there exists a constant 0 < c'_0 < ∞ such that each randomized sequence Ξ' satisfies D*_n(Ξ') ≤ c'_0 n^(-1) (ln n)^d for all n > 1, where c'_0 depends on the RQMC method but not on the randomization's realization (e.g., of the random uniforms or permutations).
Our results will further depend on properties of the integrand h. We will often consider four alternative conditions on h, presented in order of decreasing strength (see Proposition 2.1 below). For fixed p ≥ 1, we write h ∈ L^p when E[|h(U)|^p] < ∞. We next give two bounds on absolute central moments of the estimator μ̂_{n,1} in (10) based on n points from a single randomization, which will be useful when establishing a CLT or AVCI. We omit the proof of the following, which follows by an argument analogous to [39, Theorem 2].
Lemma 2.2. Under Assumptions 1.A, 2, and 3.A, for any p > 0 and for all b such that n_b > 1, the moment bound (12) holds. By (12), when h ∈ BVHK and n_b → ∞ as b → ∞, the order-p absolute central moment of μ̂_{n_b,1} in (10) shrinks as O([(ln n_b)^d / n_b]^p) as b → ∞, so σ_{n_b} = O(n_b^(-1+ε)) for each ε > 0. But assuming HK(h) < ∞ is restrictive; e.g., the proof (in Appendix A) of Proposition 2.1 notes that HK(h) = ∞ for d ≥ 2 if h is an indicator function (so μ is a probability) with discontinuities not aligned with the coordinate axes. If HK(h) < ∞ is not true or cannot be verified, we can still bound σ_{n_b} as follows under a moment condition on h(U). (The proof appears in Appendix A.) For a given total number b of integrand evaluations, a common suggestion when using RQMC methods is to let n_b be as large as possible to benefit from the superior convergence speed of QMC (compared to MC), but we still want r_b to be big enough so that a Gaussian CLT roughly holds (see the discussion at the end of Section 2.3.1) and an asymptotically valid CI for μ can be constructed. As in (12), when h ∈ BVHK and n_b → ∞ as b → ∞, we obtain the RMSE rate as b → ∞ by replacing n_b and r_b by b^γ and b^(1-γ), respectively, as in Assumption 1.B. Hence, larger γ leads to faster convergence. Taking γ = 1 is optimal in this respect but does not satisfy Assumption 1.B, and then a Gaussian CLT may not be guaranteed [23], as noted earlier (Section 2.3.1).
The following sections establish various conditions that ensure RQ , obeys a CLT and when we can obtain an AVCI for .Sections 3 and 4 derive these conditions when ( , ) satisfy Assumption 1.A, whereas Section 5 instead adopts Assumption 1.B, which permits simpler and more intuitive analysis.
This setup allows for the distribution to change with b, as is the case in (10).
We now adapt the Lindeberg condition (16) to study the RQMC estimator μ̂^RQ_{n_b,r_b} in (10). Denote by F_{n_b} the distribution of μ̂_{n_b,i} − μ, which does not depend on i by (14). Note that σ²_{n_b} = ∫ y² dF_{n_b}(y). We will impose another assumption to avoid uninteresting cases. The following precludes the exact result μ from being eventually always returned by the RQMC estimator.
We omit the proof of the following, which specializes for RQMC the condition (16) as (19) below.

Corollaries of Theorems 3.1 and 3.2
We now develop various sufficient conditions that secure CLT (20) through Theorem 3.1 or 3.2. Section 3.2 will compare the conditions under a general allocation (n_b, r_b) of Assumption 1.A, with Table 1 in Section 4.2 outlining the results (also including those for AVCI in Section 4.1). Later in Section 5.8, Table 2 summarizes and compares all of our corollaries under the simple allocations (n_b, r_b) = (b^γ, b^(1-γ)) of Assumption 1.B; Figures 1, 2, and 3 in Appendix C provide graphical comparisons. Our results provide conditions on (n_b, r_b) in terms of σ_n, whose exact asymptotic behavior has not been established in the literature for most RQMC methods. One exception that we are aware of is for nested uniform scrambling (Section 2.3.2) under certain restrictions, which we discuss in Appendix B.5. For some other RQMC methods, Appendix D provides numerical results studying the behavior of σ_n.
In the following corollary, we define μ̂_n = (1/n) Σ_{j=1}^{n} h(U'_j) as an estimator based on n points from a single randomization Ξ' = (U'_j)_{j≥1}, and σ²_n = Var[μ̂_n]. This is to emphasize the dependence of these quantities on the point-set size n but not on the budget b nor the allocation (n_b, r_b).
Corollary 1. Suppose that Assumptions 1.A and 4 hold for allocation (n_b, r_b) with n_b → ∞ as b → ∞, and that there are constants δ > 0 and c_1 < ∞ such that the moment condition (22) holds. Then the Lyapounov condition (21) and CLT (20) hold. A sufficient condition for (22) is that there exists a constant c_2 ∈ (0, ∞) such that the almost-sure bound (23) holds. Condition (22) (resp., (23)) bounds the moment (resp., almost sure) behavior of the absolute error |μ̂_n − μ| from a single randomization of n points relative to its standard deviation for all large n. While Corollary 1 requires n_b → ∞, (22) and (23) do not depend otherwise on the allocation (n_b, r_b), so Corollary 1 can fulfill Assumption 1.A with n_b growing slowly to ∞ as b → ∞. Under Assumption 1.B, we may then take (n_b, r_b) = (b^γ, b^(1-γ)) for γ ∈ (0, 1) arbitrarily close to 1, allowing large n_b to exploit QMC's fast convergence rate; Section 5.1 will explore this idea. But Assumption 1.A further permits, e.g., (n_b, r_b) = (⌊b/ln b⌋, ⌊ln b⌋), so n_b can increase even more quickly in b. However, [37, Chapter 17 end notes] states that under a linear matrix scramble [29], "There is reason to believe that the skewness and kurtosis . . . could diverge" as n → ∞ for some integrands h, in which case (22) may not hold for δ = 1 and 2. Appendix D.4 investigates numerically the validity of Condition (22) and illustrates that it appears to be satisfied in certain (but not all) settings.
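A condition like (22) can also be probed empirically. The sketch below is entirely our own construction: it uses a randomly shifted 1-D grid {j/n} as a stand-in RQMC method, approximates μ and σ_n by sample quantities over many independent shifts, and estimates the standardized absolute moment ratio E|μ̂_n − μ|^(2+δ) / σ_n^(2+δ) that such a condition requires to stay bounded in n.

```python
import random

def moment_ratio(h, n, delta=1.0, reps=2000, rng=random):
    """Crude empirical check of a standardized-moment condition for a
    randomly shifted 1-D grid {j/n} (h : [0,1] -> R).  Both mu and
    sigma_n are approximated by sample quantities over `reps` shifts."""
    ests = []
    for _ in range(reps):
        s = rng.random()  # one random shift
        ests.append(sum(h((j / n + s) % 1.0) for j in range(n)) / n)
    mu = sum(ests) / reps
    var = sum((e - mu) ** 2 for e in ests) / reps
    num = sum(abs(e - mu) ** (2 + delta) for e in ests) / reps
    return num / var ** (1 + delta / 2)
```

For h(u) = u the per-shift error is close to uniformly distributed, for which the δ = 1 ratio is about 1.3; a ratio that grows with n would instead be evidence against the condition for that method and integrand.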
Establishing (22) or (23) may be difficult, so we next provide other conditions that can be more readily verifiable to ensure CLT (20). We will prove corollaries corresponding to each of our restrictions on the integrand h in Assumptions 3.A–3.D, which are in decreasing order of strength (Proposition 2.1).
The following corollary of Theorem 3.1 considers the case when the integrand ℎ is a bounded function.
As with Corollary 2, the next corollary follows from Theorem 3.2, but it does not require h ∈ BVHK, precluding the application of Lemma 2.2. It instead employs Lemma 2.3, so it assumes a moment condition on h(U) (Assumption 3.C).
While Assumption 1.A specifies that r_b → ∞ as b → ∞, it does not require that n_b → ∞ as b → ∞. The previous corollaries allow n_b → ∞ as b → ∞, although this was not required except for Corollary 1. The following result specializes Theorem 3.1 to consider the case when n_b is fixed.

Comparison of Conditions for CLT
We now want to compare Corollaries 2–5 from Section 3.1, each of which ensures the CLT (20). Proposition 2.1 establishes that these corollaries impose successively weaker restrictions on the integrand h. We next show that in many settings, the corollaries require correspondingly stronger conditions on (n_b, r_b), demonstrating trade-offs in our assumptions. (Section 5 provides further comparisons under Assumption 1.B.) If (28) holds, then (25) implies condition (24) of Corollary 2. Condition (28) holds, e.g., under Assumption 1.B.

ASYMPTOTICALLY VALID CONFIDENCE INTERVAL
We now enhance the CLTs in Section 3 to construct an asymptotically valid CI for μ under the framework of Assumption 1.A. Suppose that r_b ≥ 2, which Assumption 1.A ensures holds for all b sufficiently large. For each such b, the μ̂_{n_b,i}, i = 1, 2, …, r_b, are i.i.d. by (14), and their sample variance is an unbiased estimator of σ²_{n_b} in (13). For a given desired confidence level 100(1 − α)%, with 0 < α < 1, we then consider a CI for μ as in (29). Our goal now is to provide conditions ensuring that the interval in (29) is an AVCI, i.e., (31) below holds.
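In code, the interval takes the familiar Student form built from the r i.i.d. per-randomization means (a sketch; for simplicity we use the standard normal quantile, which is only appropriate when r is large — the Student-t quantile t_{r−1,1−α/2}, e.g., from scipy.stats.t.ppf, should replace it for small r):

```python
import statistics

def normal_ci(estimates, alpha=0.05):
    """Approximate CI from r i.i.d. per-randomization estimates:
    mu_hat +/- z_{1-alpha/2} * S / sqrt(r), with S the square root of
    the unbiased sample variance (normal quantile used for simplicity)."""
    r = len(estimates)
    mu_hat = statistics.fmean(estimates)
    s = statistics.stdev(estimates)
    half = statistics.NormalDist().inv_cdf(1 - alpha / 2) * s / r ** 0.5
    return mu_hat - half, mu_hat + half
```

The asymptotic validity question studied here is precisely whether the coverage of such an interval converges to 1 − α as the budget grows under a given allocation rule.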
We now want a sufficient condition ensuring that (30) holds to secure AVCI (31).
To summarize, we see that Theorem 4.1 imposes an extra condition (30) beyond those ensuring a CLT through Theorem 3.1 or 3.2 to further guarantee an AVCI. Theorem 4.2 then shows that assuming the Lyapounov condition (21) of Theorem 3.2 holds for δ = 2 (i.e., (32)) is sufficient to secure (30).

Specific Sufficient Conditions for AVCI
We next establish AVCI (31) (often through Theorem 4.2) under various conditions. By Corollaries 1 and 6, the condition (23) ensures both the CLT (20) and AVCI (31). But as we will see in Section 5, securing AVCI often uses stronger conditions than ensuring a CLT.
The next result considers the case when n_b ≡ n_0 is fixed, as in condition (27) of Corollary 5.
Corollaries 5 and 9 assume the same conditions, where the former establishes the CLT (20), and the latter proves that (29) is AVCI (31). Thus, when n_b = n_0 is fixed, AVCI does not require any additional conditions beyond those for a CLT. Corollaries 7 and 8 also allow for n_b = n_0, but further permit n_b → ∞. But if n_b = n_0 is fixed, Corollary 9 ensures AVCI (31) more broadly than Corollaries 7 and 8, as the latter impose additional restrictions on integrand h.

Table 1. Summary of the corollaries: the result each ensures (CLT (20) or AVCI (31)), the assumption on the integrand h, other key conditions, and implications.

Table 1 provides further insights. Corollary 1 (resp., 6) ensures the CLT (resp., AVCI) when condition (22) holds for δ > 0 (resp., δ = 2), so the corollaries impose a more stringent condition to ensure AVCI beyond a CLT. But both Corollaries 1 and 6 also guarantee CLT and AVCI under the more restrictive requirement (23). Corollaries 5 and 9 assume the same conditions, so when n_b = n_0 is fixed, AVCI (31) does not require any additional conditions beyond those for the CLT (20). Theorem 4.2 and Corollaries 7 and 8 provide sufficient conditions for AVCI, which also imply CLT (20). Ideally, we would have that AVCI is true whenever the CLT holds without any additional restrictions (although this may not be possible; e.g., [9, Example 3.4.13] provides an example of i.i.d. summands having infinite variance, for which a CLT holds but with a nonstandard scaling). To study this, we could compare (the conditions of) Theorem 3.2 to Theorem 4.2, compare Corollary 2 to Corollary 7, and compare Corollary 4 to Corollary 8. However, as such a comparison is long, we instead will carry out the analysis only under Assumption 1.B in Section 5.8 (see Table 2), which will lead to simpler and more intuitive results. Compared to the broad generality of Assumption 1.A used by Table 1, the additional structure of Assumption 1.B enables sharper conclusions in Table 2.

ANALYSIS WHEN (n_b, r_b) = (b^γ, b^(1-γ))
Recall that Assumption 1.B specializes Assumption 1.A so that (n_b, r_b) = (b^γ, b^(1-γ)) for some γ ∈ (0, 1). We now want to utilize the results of Sections 3 and 4 to determine which values of γ ensure CLT (20) or AVCI (31), and which of those lead to the best rates of convergence for RMSE[μ̂^RQ_{n_b,r_b}] as b → ∞. Table 2 at the end of this section will summarize the results under Assumption 1.B.
With RQMC, we typically have that σ_n, defined before Corollary 1, is O(n^(-κ)) as n → ∞ with κ > 1/2 (e.g., see (12) when h ∈ BVHK). In this case, the RQMC standard deviation (and RMSE) for a single randomization of a low-discrepancy sequence of length n has a better rate of convergence than MC.
Throughout the remainder of this section, we will make the following assumption.
Assumption 5. The decreasing-rate limit κ* = lim_{n→∞} [−ln σ_n / ln n] in (35) exists.
Note that while the limit in Assumption 5 in (35) may not exist for certain RQMC methods, it may instead hold for particular subsequences n* of n. This is often assumed in the QMC and RQMC literature when investigating the methods' efficiency through numerical experiments. Using the particular structure of so-called (0, m*, d)-nets in base b_0 ≥ 2, for example, such limits are established [28, Theorem 1] for nested scrambling and sufficiently smooth h with n* = b_0^{m*} as m* → ∞, which we will further analyze in Appendix B.5, comparing the convergence rates to some of our corollaries from Sections 3 and 4. Specifically, in the particular case of nested scrambling applied to a smooth function with Lipschitz continuous mixed partial derivatives, [28, Theorem 1] obtains an exact convergence rate for the variance, which Appendix B.5 uses to get κ* = 3/2 along the subsequence n* = b_0^{m*} as m* → ∞. Even if we may need to consider the limit in (35) along such a subsequence n*, we will write it in terms of n to simplify the notation.
Consequently, for any choice of γ, the RQMC estimator's RMSE shrinks faster than the MC estimator's RMSE, which from (1) decreases at rate b^(-κ_MC) as b → ∞, where κ_MC = 1/2. Moreover, for any fixed κ* satisfying (38), the rate exponent in (40) is strictly increasing in γ, so choosing larger γ leads to the RQMC RMSE shrinking faster. We next want to see how large γ ∈ (0, 1) can be chosen and still ensure CLT (20) or AVCI (31).
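To make the trade-off concrete, note that RMSE[μ̂^RQ_{n_b,r_b}] = σ_{n_b}/√r_b; under the heuristic σ_n ≈ n^(−κ*) (ignoring logarithmic factors; our own simplification), the allocation (b^γ, b^(1−γ)) gives RMSE ≈ b^(−(γκ* + (1−γ)/2)), which interpolates between the MC exponent 1/2 at γ → 0 and κ* at γ → 1:

```python
def rmse_exponent(kappa_star, gamma):
    """Heuristic RMSE exponent: assuming sigma_n ~ n^(-kappa*), the
    allocation (n_b, r_b) = (b^gamma, b^(1-gamma)) gives
    RMSE ~ b^(-gamma*kappa*) * b^(-(1-gamma)/2)."""
    return gamma * kappa_star + (1 - gamma) / 2
```

For κ* > 1/2 the exponent is strictly increasing in γ, matching the observation above that larger γ yields a faster-shrinking RMSE.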
As will be shown below in Sections 5.1–5.6, the corollaries guaranteeing CLT (20) or AVCI (31) in Sections 3 and 4 will typically lead to imposing restrictions on γ of the form γ ≤ γ_0(κ*) for some 0 < γ_0(κ*) ≤ 1 depending on the particular corollary considered. (The only exceptions to constraints as in (42) are Corollaries 5 and 9, which essentially have γ = 0 because they assume that n_b = n_0 is fixed in (27); see Section 5.7 for more details.) We will see that except for Corollaries 1 and 6, each upper bound γ_0(κ*) in (42) is strictly decreasing in κ*. Thus, as κ* gets larger (i.e., better RQMC convergence rate for a single randomization), the choices for γ ensuring CLT (20) or AVCI (31) shrink in most cases, so the length n_b = b^γ of the low-discrepancy sequence needs to grow more slowly and the number r_b = b^(1-γ) of independent randomizations must increase more quickly in b. In this sense, when RQMC does better on a single randomization, our sufficient conditions handicap the method by choosing a budget allocation to employ more randomizations (i.e., smaller γ) to secure a Gaussian limit. Appendix D.2 presents numerical results studying this, showing that for a fixed budget b and varying γ, CI coverage degrades when γ is too large. The "largest" possible γ satisfying (42) is γ = γ_0(κ*) − ε for ε > 0 infinitesimally small. For this choice of γ, what is the RMSE convergence rate of the RQMC estimator? For each (arbitrarily small) ε > 0, we take any η ∈ (0, η_0(ε)) with η_0(ε) = ε/(κ* − 1/2), so that the exponent of b in (40) is positive under our assumption (38). We then get by (40) that, as b → ∞, the RMSE bound holds for any ε > 0 and any η ∈ (0, η_0(ε)).

The RMSE of μ̂^RQ_{n_b,r_b} then decreases at about the rate attained by a single randomization of a low-discrepancy sequence of full length n = b (or by an RQMC estimator with a fixed number r_0 ≥ 1 of randomizations, each of length n = ⌊b/r_0⌋).

Summary of Results Under Assumption 1.B
Appendix B (resp., C) provides analytical (resp., graphical) comparisons of some of the β_k(ν*) and γ_k(ν*) values for the various Corollaries k.
Table 2 summarizes the most important assumptions and conclusions of Corollaries 1–9 under Assumption 1.B, which enables sharper results compared to those in Table 1 under Assumption 1.A. The table shows that strengthening the assumptions ensures that RQMC's RMSE shrinks at a faster rate.

CONCLUSIONS
RQMC methods provide powerful simulation tools that accelerate the convergence rate with respect to MC. A standard way to estimate the RQMC error in practice exploits an assumed CLT over r ≥ 2 i.i.d. randomizations, but typically restricts the number r of randomizations so that more weight can be put on the low-discrepancy sequence length m to gain from the superior convergence rate of QMC. A common rule of thumb suggests a CLT roughly holds for fixed r ≈ 30; however, this heuristic has lacked rigorous theoretical support for most RQMC methods. The only existing CLT results that the authors are aware of as m increases with fixed r < ∞ [2, 13, 28] are for scrambled digital nets with nested uniform scrambling, but this RQMC technique is computationally expensive, which has limited its adoption in the past, although this may be less of an issue with today's more powerful computers. In contrast, [23] proves that increasing m for any fixed number of randomizations can lead to a non-normal limiting distribution. To our knowledge, no theoretical result covering a broad class of RQMC methods has ever been published guaranteeing a CLT for r → ∞. Our paper provides sufficient conditions on (r, m) and their relative growth under the framework of Lindeberg's condition for triangular arrays. The conditions depend on the properties of the integrand and on the convergence speed of the RQMC standard deviation from a single randomization (at least along a specified subsequence). We have also given conditions for AVCI, where the standard deviation is estimated. We have presented several properties of the conditions and of the convergence speed of the resulting estimators.
Many of our corollaries in Sections 3.1 and 4.1 specify that an allocation (r, m) = (n^(1−β), n^β) has m that does not grow too quickly relative to r as the computing budget n → ∞. For these results, if the variance of an RQMC estimator from a single randomization decreases too quickly, we may not be able to ensure that an RQMC estimator from r randomizations obeys a CLT with a Gaussian limit or yields an AVCI; see Appendix D.2 for numerical results studying this. An exception to this is Corollary 1 (resp., 6), which secures a CLT (resp., AVCI) for r → ∞ at any arbitrarily slow rate as n → ∞ when condition (22) holds for some δ > 0 (resp., δ = 2). We are currently investigating alternatives to condition (22) that similarly allow r → ∞ at any rate (possibly sub-polynomial, e.g., r = ⌊ln n⌋). As directions for other future research, we also plan to provide a guide for practitioners on how to choose, under Assumption 1.B, a value of β as large as possible while still satisfying a CLT. As this may entail estimating ν* in (35), the exponent defining the rate at which the standard deviation decreases for a single randomization, we would need to account for the statistical error in our estimate of ν*. Given that our conditions for a CLT or AVCI are sufficient, we also aim to see whether they can be weakened; more numerical investigations could also be worthwhile towards this goal. Moreover, rather than building a CI for μ based on a CLT, we are additionally investigating instead employing resampling methods, such as the bootstrap [37, Chapter 17 end notes].

ACKNOWLEDGMENT
This material is based upon work supported by the National Science Foundation under Grant No. CMMI-1537322. Additional support came from an NJIT seed grant. The authors would like to thank Art Owen for pointing out references, used in the proof of Proposition 2.1, showing that if the integrand h has bounded Hardy-Krause variation, then h is bounded.

B ANALYTICAL COMPARISONS OF THE β_k(ν*) AND THE γ_k(ν*) FROM SECTION 5
For the various Corollaries from Sections 3.1 and 4.1, we now compare their corresponding values of the upper bounds β_k(ν*) in (42) for β ∈ (0, 1) in Assumption 1.B and the optimal approximate rates γ_k(ν*) in (43) from Sections 5.1–5.6. While many of the corollaries involve conditions that depend on the interaction of the integrand and the RQMC method through only the variance of the estimator from a single randomization, the cases k = 1 and 6 instead impose other requirements, (22) or (23), that involve higher-order absolute central moments or almost-sure behavior, which may be difficult to verify in practice. The different forms of the conditions for k = 1 and 6 complicate their comparisons with the other corollaries, so we mostly omit k = 1 and 6 from the following comparisons. Note nevertheless that β_1(ν*) = β_6(ν*) = 1 ≥ β_k(ν*) and γ_1(ν*) = γ_6(ν*) = ν* ≥ γ_k(ν*) for all k ∉ {1, 6} when ν* ≥ 1.
As shown before in Propositions 2.1 and 3.3 under Assumption 1.A, we will see trade-offs in the conditions that ensure CLT (20) under Assumption 1.B: stronger conditions on the integrand h (through Assumptions 3.A–3.C) lead to looser constraints on β from larger β_k(ν*) in (42). A similar situation will also hold for guaranteeing AVCI (31), as we saw before in Proposition 4.3 under Assumption 1.A. Also, when imposing comparable conditions on h, the value of β_k(ν*) for ensuring AVCI is never larger (and is often strictly smaller) than the corresponding value for the CLT, so making sure AVCI holds typically requires restricting β more than for a CLT.
Comparing across the different panels in Figure 1, we see that each upper bound β_k(ν*) in (42) on β decreases as ν* increases. To investigate this further, Figures 3 and 4 plot the β_k(ν*) and γ_k(ν*), respectively, as functions of ν*. We can then see the differences between the various corollaries and assumptions as the QMC method improves (i.e., as ν* increases).
Figure 3 more clearly illustrates that as ν* grows, each upper bound β_k(ν*) in (42) on β decreases, so guaranteeing a CLT (20) or AVCI (31) requires putting more effort into the MC part (i.e., for fixed n > 0, r = n^(1−β) grows as β decreases) and correspondingly less into the QMC part (i.e., m = n^β shrinks as β gets smaller). The trade-off could potentially harm RQMC's optimal RMSE convergence rate because of the diminished benefits (decreasing β_k(ν*)) of QMC's improved convergence rates from larger ν*. However, Figure 4 shows that this is not the case for all CLT and most AVCI conditions: the optimal RMSE convergence speed determined by γ_k(ν*), as in (43), is generally increasing in ν*. The one exception is γ_7(ν*), which determines the optimal RMSE convergence rate to ensure AVCI when h ∈ BVHK. In this case, Figure 3 shows that as ν* increases, β_7(ν*) drops off quickly, so the number of randomizations must grow rapidly as ν* increases to ensure AVCI when h ∈ BVHK. But even so, we still have γ_7(ν*) > γ_8(ν*) for all ν*, where γ_8(ν*) in (59) is the optimal RMSE rate exponent for AVCI obtained under the moment condition (h ∈ L^4) of Corollary 8.

D NUMERICAL RESULTS
The goal of this section is to study numerically the asymptotic results in our paper. We aim to see, for the various values of β ∈ (0, 1) from Assumption 1.B, whether a CLT or AVCI seems to actually hold as n → ∞, where n is the total number of integrand evaluations.

D.1 Estimation of ν*
For checking whether the values β_k(ν*) from Section 5 provide appropriate thresholds on values of β under Assumption 1.B to secure a CLT or AVCI, we need to estimate ν* in (35). A standard procedure for estimating the convergence rate of QMC and RQMC methods applies log-log regression. Assuming σ_m ≈ c m^(−ν*) is equivalent to ln(σ_m) ≈ ln(c) − ν* ln(m). To estimate the unknowns c and ν*, we generated data for values m^(i), 1 ≤ i ≤ q, of m. For each i, we estimate σ_{m^(i)} through the sample standard deviation of r_0 independent estimates of μ from single randomizations of m^(i) points. Then, using the simplifying notation ℓ_i = ln(σ̂_{m^(i)}) and x_i = ln(m^(i)) for each 1 ≤ i ≤ q, standard linear regression yields an estimator of ν* as the negative of the fitted slope, ν̂* = −Σ_i (x_i − x̄)(ℓ_i − ℓ̄) / Σ_i (x_i − x̄)², where x̄ = (1/q) Σ_i x_i and ℓ̄ = (1/q) Σ_i ℓ_i. For our estimations, we used r_0 = 100, q = 17, and m^(i) = 2^(i+5) for i ∈ {1, 2, ..., 17}. Table 3 gives the estimated ν* for the three considered RQMC methods (Sobol' sequence with DS, Sobol' sequence with LMS, and Lattice with RS) and the four integrands (h_v, h_b, h_ub,0.35, and h_sum) in dimensions d = 3 and d = 4. For the function h_v ∈ BVHK, the estimated values of ν* are larger than or equal to 1, agreeing with (37). The estimated ν* is very close to 1 for the additive h_sum ∈ BVHK for Sobol' with DS and Lattice with RS; thus, the bound in (37) appears to be tight. For h_sum using Sobol' with LMS, the regression provides a bigger estimated ν* but also a much larger constant c: c = 16585 for LMS in dimension 3 as opposed to c = 1.33 for Sobol' with DS and Lattice with RS. Figure 7 (also see [24, Section 4.2]) shows extremely spiky histograms for h_sum with Sobol' and LMS, indicating very low variance, which has been similarly observed in [20] in other contexts; but in general, the numerical values should be considered with caution. For the functions h_b and h_ub,0.35, which have infinite Hardy-Krause variation, the estimated values of ν* are smaller than 1 but larger than 1/2 (see (38)), the latter being the rate exponent when using MC methods.
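The log-log regression above can be sketched as follows. This is a minimal illustration; the simulation step producing the sample standard deviations is assumed to have been done separately, and the function name is ours:

```python
import math

def estimate_rate(ms, sds):
    """Fit ln(sd_i) ~ ln(c) - nu* ln(m_i) by ordinary least squares.
    ms: sequence lengths m^(i); sds: corresponding sample standard deviations.
    Returns (c_hat, nu_star_hat), where nu_star_hat is minus the fitted slope."""
    xs = [math.log(m) for m in ms]
    ys = [math.log(s) for s in sds]
    q = len(xs)
    xbar, ybar = sum(xs) / q, sum(ys) / q
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    intercept = ybar - slope * xbar  # estimates ln(c)
    return math.exp(intercept), -slope
```

For noiseless data sd = c m^(−ν*), the fit recovers (c, ν*) exactly; with simulated standard deviations it returns the regression estimates used for Table 3.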

D.2 Analysis of the impact of the allocation on coverage
In this subsection, we set the dimension to d = 3. Recall that n is the total number of integrand evaluations, which is distributed among r = n^(1−β) independent randomizations of m = n^β points from a low-discrepancy sequence. We proceed as follows: we fix n = 2^14 = 16384 and choose (m, r) = (2^s, 2^(14−s)) for s ∈ {2, ..., 12} to study a wide range of values of β = s/14. Figure 5 displays, as β increases, the estimated coverage of the produced nominal 95% confidence intervals from (29) for the integrands h_v, h_b, h_ub,0.35, and h_sum and the three types of randomization considered in Table 3. For the sake of readability, we do not draw the 95% confidence intervals on the coverages generated from the 200 independent experiments, but recall that for a true coverage value of 0.95 (resp., 0.85), the resulting error of the coverage estimate is ±0.03 (resp., ±0.05) (based on a 95% confidence interval for the coverage).
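The coverage-estimation procedure can be sketched as follows. This is a simplified stand-in, not the paper's exact setup: `one_randomization_mean` is a hypothetical placeholder for the mean of the m RQMC points under one randomization (here replaced by a plain Gaussian draw for illustration), and the interval uses the usual normal-quantile form mean ± 1.96 s/√r rather than necessarily the exact form of (29):

```python
import math
import random
import statistics

def ci_covers(mu, randomization_means):
    """Does the nominal 95% CI built from r i.i.d. randomization means cover mu?"""
    r = len(randomization_means)
    center = statistics.fmean(randomization_means)
    half_width = 1.96 * statistics.stdev(randomization_means) / math.sqrt(r)
    return abs(center - mu) <= half_width

def estimate_coverage(mu, one_randomization_mean, r, num_experiments=200, seed=1):
    """Fraction of independent experiments whose CI covers the true mean mu."""
    rng = random.Random(seed)
    hits = sum(
        ci_covers(mu, [one_randomization_mean(rng) for _ in range(r)])
        for _ in range(num_experiments)
    )
    return hits / num_experiments

# Gaussian stand-in: coverage should land near the nominal 0.95.
cov = estimate_coverage(0.0, lambda rng: rng.gauss(0.0, 1.0), r=32)
```

Replacing the stand-in with an actual RQMC mean over m randomized points reproduces the experiment behind Figure 5; the ±0.03 error bar quoted above is the binomial half-width 1.96·√(0.95·0.05/200).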
• For all integrands and randomizations, coverage is close to the expected 0.95 when β is small. It then drops significantly when β is larger than 0.8 for all functions. Note though that, as expected, the stricter the assumption on the integrand, the larger the threshold value β_k(ν*) above which coverage is statistically significantly below 0.95.
• For the function h_v in (67), up to the value β = 0.786, we do not detect any suspect coverage values, but when β = 0.857, the estimated coverage levels drop to 0.815, 0.87, and 0.90 for the various randomizations, which leads to rejecting the null hypothesis of 95% coverage (at the 95% level of confidence). Theoretically, as h_v satisfies Assumption 3.A, Corollaries 2 and 7 apply; thus, taking ν* = 1.35 as an approximation from Table 3, we get β_2(ν*) = 1 (resp., 0.221) from (50) and β_7(ν*) = 1 (resp., 0.124) from (57). Thus, for Sobol' with LMS, our sufficient condition seems to be stronger than necessary. For Sobol' with DS and Lattice with RS, for which r can increase polynomially as slowly as we wish to infinity with n to get a CLT, it may indicate that the number of randomizations has to be large enough to obtain a reasonable variance estimate, which may not occur when r is too small (we indeed have r = 4 for β = 0.857 here).
Remember that for r fixed, the centered and scaled RQMC estimator may not converge weakly to a normal random variable as m → ∞ in general; e.g., [23] proves a non-Gaussian limit for the random shift with lattices for a single randomization r = 1, so the same is true for any fixed r ≥ 1. To further study the unsatisfactory coverage when r = n^(1−β) is small in Figure 5, we also constructed histograms of 2000 values of the standardized statistic (defined before Corollary 1 in Section 3.1) for different values of m = 2^s for the functions h_ub,0.35 and h_sum, which are the ones in Figure 5 for which the coverage suffers the most for small r, and our three types of randomization considered in Table 3. A heavily non-normal distribution for a single randomization can lead to a CLT providing a poor normal approximation and sizable CI coverage error when the sample size r is not large; this coincides with the substandard behavior for small r in Figure 5. For example, Figure 6 presents histograms for the integrand h_ub,0.35 when using Sobol' with LMS; the plots for the other randomization methods with h_ub,0.35 are similar, so they are not shown. The graphs illustrate that the statistic has a non-normal and highly skewed distribution for the function h_ub,0.35, whatever the randomization type, even for large m. This aligns with the degrading coverage as r shrinks in the middle left panel of Figure 5.
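For readers wishing to reproduce such histograms, the quantity being plotted can be sketched as below. We hedge that we take it to be the usual studentized mean √r (Ȳ − μ)/s over the r randomization means, the standard form for this kind of CLT diagnostic; the paper's exact definition precedes Corollary 1:

```python
import math
import statistics

def studentized_stat(mu, randomization_means):
    """sqrt(r) * (sample mean - mu) / sample std dev over r randomization means;
    approximately N(0, 1) when a CLT holds and r is large enough."""
    r = len(randomization_means)
    ybar = statistics.fmean(randomization_means)
    s = statistics.stdev(randomization_means)
    return math.sqrt(r) * (ybar - mu) / s
```

Collecting 2000 such values over independent experiments and histogramming them reveals the skewness or heavy tails that explain the coverage losses discussed above.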
Figures 7 and 8 display histograms for h_sum when applying Sobol' with LMS and DS, respectively; the plots for h_sum when applying Lattice with RS are similar to those in Figure 8, so they are omitted. Figure 7 shows increasing spikiness (growing kurtosis) as m = 2^s increases, with extreme skewness for the largest m. (Additional omitted plots with other values of s reveal that the skewness becomes apparent starting around s = 11 and gets more pronounced as s grows.) Correspondingly, the middle right panel of Figure 5 indicates that the coverage for Sobol'-LMS suffers greatly when r is small (so m is large). In contrast, the histograms for Sobol' with DS (Figure 8) and Lattice with RS are symmetric and have somewhat of a Gaussian appearance (although for large m, the tails seem to end more abruptly than for a normal, perhaps more akin to a triangular distribution). The minimal skewness and kurtosis in Figure 8 are consistent with the middle right panel of Figure 5, where coverage with small r does not suffer as much for these RQMC methods as for Sobol' with LMS.

D.3 Analysis in terms of the dimension
We aim here at investigating the coverage in terms of the dimension d for two integrands, h_ub,0.35 and h_sum, respectively a product and a sum of functions of individual coordinates. RQMC efficiency is expected to deteriorate as the dimension increases for h_ub,0.35 and to be independent of the dimension for h_sum, because the effective dimension [19, 37] increases for the former but not for the latter. The effective dimension measures the contribution to the variance from low-dimensional subsets of coordinates, but it is not clear whether this metric can be used to say anything about the quality of a normal approximation or about a CI having close to nominal coverage.
Figure 9 displays the evolution of the coverage in terms of the dimension when we fix r = 2^5 and m = 2^9, so that β ≈ 0.643, in order to keep n = 2^14 = 16384 as in Section D.2. We again consider 200 independent experiments for the estimation of the coverage. Coverage seems independent of the dimension for h_sum but decreases with it for h_ub,0.35. This suggests that a low effective dimension might have some connection with asymptotic coverage properties, but further theoretical study is needed.

D.4 On the validity of Condition (22)
If the conditions of Corollary 1 are satisfied, then a CLT holds for r growing arbitrarily slowly to infinity. Unfortunately, it appears difficult to rigorously prove that Conditions (22) or (23) hold for specific functions, including those described in Section D. Instead, our goal here is to investigate numerically whether (22) appears to hold. The ratios in (22) are estimated by considering r_0 = 2^12 independent replications, used for both the numerator and the denominator, with the known expected value μ. Figure 10 displays the results, which provide suggestive, but by no means definitive, evidence. It is not so easy to discern trends in the plots for h_ub,0.35 in the left panel of Figure 10 because they exhibit somewhat chaotic behavior, which can perhaps be explained as follows. The variance of an estimate of a kth-order moment depends on the 2kth moment, which can make estimating higher-order moments difficult.
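Such a numerical check can be sketched as follows. We stress that this is illustrative only: we assume, as a hypothetical stand-in for the ratio in (22), a Lyapunov-type ratio of the estimated (2+δ)-th absolute central moment to the (1+δ/2) power of the estimated variance, both plug-in estimates using the known mean μ over r_0 replications; the actual form of (22) is given in the main text:

```python
def moment_ratio(mu, samples, delta):
    """Plug-in estimate of E|Y - mu|^(2+delta) / (E|Y - mu|^2)^(1 + delta/2),
    using the known mean mu; a Lyapunov-type diagnostic quantity."""
    r0 = len(samples)
    num = sum(abs(y - mu) ** (2 + delta) for y in samples) / r0
    var = sum((y - mu) ** 2 for y in samples) / r0
    return num / var ** (1 + delta / 2)
```

Note that the ratio is scale-invariant, and that its numerator is itself a (2+δ)-th-moment estimate whose variability involves the (4+2δ)-th moment, echoing the estimation difficulty mentioned above.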

Assumption 3.D. h ∈ L^2; i.e., h(U) has finite variance. Assumption 3.A limits the roughness of h. Assumptions 3.B, 3.C, and 3.D restrict how slowly the tails of the distribution of h(U) can decrease, with 3.B being an extreme case of no tails. None of 3.A, 3.B, 3.C, and 3.D depends on the allocation (r, m) nor on the randomization method. Proposition 2.1. Assumption 3.A is strictly stronger than Assumption 3.B, itself strictly stronger than Assumption 3.C, which in turn is strictly stronger than Assumption 3.D.

Theorem 4.1. Suppose that Assumptions 1.A and 4 hold. If the CLT (20) holds and if

Figure 1 corroborates the results obtained in Section B and summarized in Table 2. We can readily check the following:

Fig. 2. Plots of the negative exponent γ_k(ν*) of the optimal rate at which the estimator RMSE decreases, as functions of β, for different values of ν*. The upper left panel does not include γ_2(ν*) and γ_7(ν*) because these require h ∈ BVHK, which then implies ν* ≥ 1 by (37). The plots show that stronger restrictions on the integrand h lead to larger γ_k(ν*).

Fig. 9. Evolution of the coverage in terms of the dimension for the three randomization types and the two integrands h_ub,0.35 (left panel) and h_sum (right panel). The estimation is based on r = 2^5 and m = 2^9.
As with any empirical study of asymptotic behavior, our analysis necessarily encounters the limitation of working with finite sample sizes.

Table 3. Estimated values of ν* in (35) from log-log regressions for different integrands, dimensions, and RQMC methods.