Importance Sampling BRDF Derivatives

We propose a set of techniques to efficiently importance sample the derivatives of several BRDF models. In differentiable rendering, BRDFs are replaced by their differential BRDF counterparts which are real-valued and can have negative values. This leads to a new source of variance arising from their change in sign. Real-valued functions cannot be perfectly importance sampled by a positive-valued PDF and the direct application of BRDF sampling leads to high variance. Previous attempts at antithetic sampling only addressed the derivative with the roughness parameter of isotropic microfacet BRDFs. Our work generalizes BRDF derivative sampling to anisotropic microfacet models, mixture BRDFs, Oren-Nayar, Hanrahan-Krueger, among other analytic BRDFs. Our method first decomposes the real-valued differential BRDF into a sum of single-signed functions, eliminating variance from a change in sign. Next, we importance sample each of the resulting single-signed functions separately. The first decomposition, positivization, partitions the real-valued function based on its sign, and is effective at variance reduction when applicable. However, it requires analytic knowledge of the roots of the differential BRDF, and for it to be analytically integrable too. Our key insight is that the single-signed functions can have overlapping support, which significantly broadens the ways we can decompose a real-valued function. Our product and mixture decompositions exploit this property, and they allow us to support several BRDF derivatives that positivization could not handle. For a wide variety of BRDF derivatives, our method significantly reduces the variance (up to 58x in some cases) at equal computation cost and enables better recovery of spatially varying textures through gradient-descent-based inverse rendering.

Forward rendering of different materials whose derivatives can be importance sampled efficiently by our techniques.The insets above show difference images which indicate the regions where our estimators have lower variance (blue) and the regions where standard BRDF importance sampling has lower variance (red).Fig. 1.We propose a new set of importance sampling techniques for sampling derivatives of BRDFs in rendering, and they achieve significant variance reduction in the estimated derivatives.Our techniques work better because they correctly deal with real-valued BRDF derivatives, for which BRDF importance sampling from forward rendering is not well suited.Our techniques are general and apply to a wide variety of BRDF derivatives, which was not possible by previous work in differentiable rendering [Zeltner et al. 2021;Zhang et al. 2021a], enabling low variance estimates of the derivative at limited sample counts.
We propose a set of techniques to efficiently importance sample the derivatives of several BRDF models.BRDF importance sampling is a crucial variance reduction technique in forward rendering.In differentiable rendering, BRDFs are replaced by their differential BRDF counterparts which are realvalued and can have negative values.This leads to a new source of variance arising from their change in sign.Real-valued functions cannot be perfectly importance sampled by a positive-valued PDF and the direct application of BRDF sampling leads to high variance.Previous attempts at antithetic sampling only addressed the derivative with the roughness parameter of isotropic microfacet BRDFs.Our work generalizes BRDF derivative sampling

INTRODUCTION
BRDF importance sampling is an essential variance reduction technique for Monte Carlo forward rendering.However, there is no simple counterpart for differentiable rendering.Taking the derivative of a BRDF with respect to one of its parameters transforms it into a real-valued differential BRDF.The differential BRDF can have a very different shape from the BRDF, and can also be negative valued.As a result, naïvely applying BRDF sampling from forward rendering can lead to an estimator with a very high variance.Previous attempts at tackling this problem [Zeltner et al. 2021] are limited to the roughness derivatives of isotropic GGX (Trowbridge-Reitz) and Beckmann BRDFs, and cannot handle even their anisotropic counterparts.Another method [Zhang et al. 2021a] was developed primarily for odd functions with symmetric positive and negative lobes, and can produce substantially higher variance when the derivative is close to an even function.We propose effective importance sampling of derivatives of not only anisotropic GGX and Beckmann BRDFs, but also a wide variety of other analytic BRDF models like Ashikhmin-Shirley, Oren-Nayar, Hanrahan-Krueger, Mixture BRDFs, and ABC models.Fig. 1 demonstrates the benefits of our method on several BRDFs compared to BRDF sampling.Importance sampling a real-valued function leads to unique challenges.Its variance has two sources, a) its sign, and b) its shape.Our idea is to decompose the function into a sum of single-signed functions (all positive or negative), which we call single-signed decompositions.Single-signed functions, by definition, have no sign variance.Importance sampling these functions eliminates their shape variance.
A classical strategy, positivization [Owen and Zhou 2000], is a special case of our single-signed decomposition.It has positive and negative parts with non-overlapping support, which in turn requires a) analytic knowledge of the roots and b) analytic integrability of the BRDF derivative up to the roots, which is possible only for certain BRDF derivatives.To sidestep these issues due to a partition of the domain, we introduce the product and mixture decompositions for which we allow the positive and negative parts to overlap.In fact, we ensure that both the positive and the negative parts have support over the entire hemisphere.This enables analytic integrability and significantly expands upon the set of BRDF derivatives we can handle.Our main contributions are three single-signed decompositions and the corresponding importance sampling PDFs of a large set of BRDF derivatives, see Table 1.
Positivization: First, we introduce a simple decomposition called positivization (Sec.4.2), which partitions a real-valued function about its roots into a positive and a negative function.We show that Zeltner et al. 's [2021] antithetic sampling is a special case of positivization, and positivization provides an explanation of both the correctness and efficiency of their approach.When applicable, positivization leads to significant variance reduction, for example, the isotropic GGX, Beckmann and Hanrahan-Krueger BRDF derivatives.However, others like anisotropic GGX, Beckmann,  are not analytically integrable up to their roots, and the derivatives with mixture weights (Sec.6) do not have analytic roots.Positivization cannot handle these derivatives.Zeltner et al. 's antithetic sampling inherits these limitations too.
Product Decomposition: Second, we propose a novel product decomposition (Sec.5).Our key observation is that after differentiation, many BRDF derivatives can be decomposed into single-signed functions by separating the terms that result from the derivative product rule.Product decomposition does not require knowledge of the roots for the decomposition and only requires the resulting single-signed functions to be analytically integrable.Product decomposition can importance sample the derivatives of anisotropic GGX, Beckmann, Ashikhmin-Shirley, and more.
Mixture Decomposition: Finally, we introduce mixture decomposition (Sec.6).Derivatives of BRDFs with linear combination coefficients, e.g., mixture weights of a layered BRDF, result in realvalued functions whose roots cannot be found analytically in most cases.Our mixture decomposition exploits the fact that this derivative is the difference between two positive-valued terms.Separating them results in a single-signed decomposition, and the two terms can then be importance sampled separately.Mixture decomposition handles the derivatives of Oren-Nayar and mixture weights of Uber BRDFs such as the Disney BRDF or Autodesk Standard Surface.
It is likely that several other BRDF derivatives not surveyed in this paper can also be dealt with by one of our three decompositions, and we provide a recipe for handling them in Sec. 7. We provide a library of importance sampling PDFs for the derivatives of all the BRDF models discussed are above in Table 1.

RELATED WORK
Our work connects two areas in rendering research, differentiable rendering and BRDF sampling.

Differentiable Rendering
History of derivatives in rendering.Computing derivatives or gradients of light transport has a long history.Earlier work focused on accelerating light transport using derivatives [Arvo 1994;Ramamoorthi et al. 2007;Ward and Heckbert 1992].Approximate differentiable renderers [de La Gorce et al. 2011;Kato et al. 2018;Laine et al. 2020;Liu et al. 2019;Loper and Black 2014] have been used for many computer vision tasks, and light transport derivatives have been used for recovering scattering coefficients [Gkioulekas et al. 2013;Khungurn et al. 2015].
Background on differentiable rendering.Much of the current interest in Monte Carlo differentiable rendering was started by Li et al. [2018], who introduced an edge sampling approach to correctly handle discontinuities in both primary and secondary visibility.As shown by them and subsequent work [Zhang et al. 2019], the derivative of the rendering equation is made up of an interior integral which handles continuous function variation, and a boundary integral which encapsulates discontinuities.
Follow up work [Bangaru et al. 2020;Loubet et al. 2019;Yan et al. 2022;Yu et al. 2022;Zhang et al. 2020] focused on accurately computing the boundary integral.Some other recent work focused on reducing memory requirements [Nimier-David et al. 2020;Vicini et al. 2021], and building automatic differentiation systems and compilers [Jakob et al. 2022;Nimier-David et al. 2019].Efforts have been made to handle different light transport phenomena [Wu et al. 2021;Yi et al. 2021;Zhang et al. 2019Zhang et al. , 2021b]].Much of the recent inverse rendering work has started to incorporate differentiable rendering components [Azinović et al. 2019;Che et al. 2020;Deschaintre et al. 2018;Luan et al. 2021;Nimier-David et al. 2021].Zeltner et al. [2021] show that directly importance sampling a BRDF's derivative leads to a detached derivative with only an interior term, and no boundary term.They also show that reparameterization before differentiation leads to a different attached derivative with not only an interior term but an additional boundary term too.The boundary term requires careful handling for unbiased estimates and extra auxiliary rays at each shading point to estimate it too (4 to 64 extra rays as per Bangaru et al. [2020]).As a result, attached estimators are not suitable for the low sample budget within which we aim to operate.Our estimators fall under the detached derivative regime, which does not require these extra auxiliary rays, which makes them suitable for low sample budget derivative estimation.

BRDFs and Importance Sampling
Our work supports importance sampling the derivatives of a wide variety of analytic BRDF models.1. List of Supported Material Derivatives.The first column lists the name of the BRDF, and the second column lists the corresponding parameter whose derivative we can importance sample.The third column lists the type of single-signed decomposition applied (Positivization, Product Decomposition, Mixture Decomposition).The fourth column lists the section number in the Appendix (with links) with the relevant sampling PDFs.Please refer to the original papers for definitions of the parameters.
Data-Driven BRDFs.Apart from analytic BRDFs, data-driven measured BRDFs [Dupuy and Jakob 2018;Matusik et al. 2003] and Neural BRDFs [Fan et al. 2021;Kuznetsov et al. 2021Kuznetsov et al. , 2022;;Sztrajman et al. 2021] are another common class of BRDF models that can model a wide variety of materials.However, both these Neural BRDFs and non-analytic measured BRDFs have a very large number of parameters, and it is unclear which parameters one might want to differentiate and importance sample with respect to.Hence, we do not consider either of these classes of BRDFs in our work and focus instead on common analytic BRDF models.

BACKGROUND
For the sake of simplicity, we begin our discussion by focusing on the direct lighting setting, and extend it to indirect lighting in Sec. 9.The reflected radiance   , at a shading point , in the direction   , is given by the reflection equation [Cohen and Wallace 1993], Here,  is the cosine-weighted BRDF at , and  is a scalar BRDF parameter that controls  .In practice,  is the vector of all BRDF parameters in a given scene.However, for ease of exposition, we assume  is scalar-valued, with the results for the other parameters following similarly.For example,  could be the roughness of an isotropic GGX BRDF.Since we are dealing with only direct lighting, the incident radiance   does not depend upon .Differentiating the expression for the reflected radiance with , we get In forward rendering, BRDF sampling aims to minimize the variance of the BRDF  in the reflection equation, Eqn.(1).Similarly, our goal is to minimize the variance of the differential BRDF    in differentiable rendering, which can be expressed as We drop the spatial coordinate , without loss of generality, for simplicity.We deal with the incident radiance   using light source sampling.The estimators for    and   can be combined using Multiple Importance Sampling [Veach and Guibas 1995].We finally want to compute     so the final estimator must always include multiplication by   .(3), they did not further investigate the effectiveness of antithetic sampling.We show that Zeltner et al. 's method is a special case of another technique called positivization [Owen and Zhou 2000].We show in Sec.4.2 and Appendix B that positivization provides a theoretical grounding of antithetic sampling: the effectiveness mainly comes from the stratification (separating the real-valued function into a positive and a negative function).The major drawback of antithetic sampling is its inapplicability to several BRDF derivatives, due to the lack of closed forms of root finding and integration, which we discuss in Sec.4.2.2.3.1.2Antithetic Sampling of Odd Derivatives.Zhang et al. [2021a] introduce another antithetic-sampling-based method to deal with the derivative of the GGX Normal Distribution Function,  ( ℎ ) with the half vector  ℎ .They exploit the fact that the derivative   ℎ  ( ℎ ) is odd about the local shading normal, i.e

Previous
Their estimator for Eqn.
(3) requires two antithetic samples  ,1 and  ,2 , and is given by Here, and going forward, we drop   ,  from the function arguments of  (  , ) and  (  ,   , ) for brevity.This method works well for the odd derivative with  ℎ .However, for non-odd derivatives, there are no variance reduction guarantees.Furthermore, several BRDF derivatives are even, e.g., roughness of GGX, Beckmann, and Zhang et al. 's method increases variance in these cases.Additionally, Eqn. ( 5) is not in the standard importance sampling form of    / due to the presence of a sum in the numerator and denominator.Hence, it is unclear how to use it in conjunction with multiple importance sampling.

SINGLE-SIGNED DECOMPOSITIONS
In this section, we describe the concept of sign variance in realvalued integrals, and then show how our first decomposition, positivization, can handle this source of variance for some BRDF derivatives.Positivization requires a) analytic knowledge of roots and b) analytic integrability of the BRDF derivative, which limits its applicability.In Sec. 5, we present a novel product decomposition that exploits the single-signed nature of the terms resulting from the product rule for derivatives, for the correct handling of sign variance.It significantly expands the set of BRDF derivatives we can handle.In Sec. 6, we present a novel mixture decomposition that exploits the fact that derivatives with mixture weights are a difference of two positive functions, to decompose them into single-signed functions, allowing us to importance sample even more BRDF derivatives.Finally, we describe a general recipe to handle other BRDF derivatives not surveyed in this paper in Sec. 7.

Sign Variance
We introduce sign variance through the following representative 1D example, showing the real-valued derivative    of a normal distribution  (; , ) with its mean , as shown in Fig. 2 (a,b): For  < , the integrand    is negative, and for  > , it is positive.The best importance sampling strategy using a single PDF  ∝ |   | no longer has zero variance [Owen and Zhou 2000].This is due to the sign variance, i.e., the positive-valued PDF  cannot match the sign of the real-valued integrand    over the entire domain, leading to non-constant sample weights, see Fig.

Positivization
It is possible to construct an estimator for any real-valued integrand    , e.g., Eqn. ( 6), which has zero variance.By partitioning    into its positive    + and negative    − parts, we are left with two functions that are single-signed by definition.
They can be perfectly importance sampled if we can construct the following two PDFs,  − () ∝    − () and  + () ∝    + (), see Fig. 2(c,d).The resulting estimator is where  + ∼  + and  − ∼  − .This technique is called positivization [Owen and Zhou 2000], and we apply it to importance sampling BRDF derivatives.The zerovariance claim is only with regard to the variance arising from the differential BRDF    .The derivative of the reflection equation, see Eqn. (2), is a product of the differential BRDF and the lighting, and as a result, it will still have variance from the lighting.
In Appendix B, we show that Zeltner et al. 's antithetic sampling is a special case of positivization.Positivization gives a theoretical grounding of Zeltner et al.'s approach.First, through an empirical study, we have found that the majority of the variance reduction of antithetic sampling comes from the implicit splitting of    into positive and negative lobes (   + and    − ), instead of the negative correlation between samples, see Fig. 3. = FindRoots( ) = FindRoots( )  challenges a) the roots, which define the region where    > 0, don't have a closed-form expression for some BRDF derivatives, and b)    isn't analytically integrable over the region where it is positive, for others.A similar argument follows for  − too.Some BRDF derivatives, like isotropic microfacet GGX and Beckmann can be handled by positivization.These microfacet models are given by the following equation, where  is the Fresnel term,  is the shadowing and masking term, and  is the normal distribution function for the specific BRDF.The unit vector  ℎ is halfway between   and   , and its spherical coordinates are  ℎ ,  ℎ .The derivative of the isotropic GGX BRDF with its roughness  has two components,    ( ℎ ) and    (  ,   ).However, as noted by previous work [Zeltner et al. 2021;Zhang et al. 2021a], the    term only has a minor effect on the overall derivative.Hence, we focus on the    term, which is given by Its roots have an analytic form and are tan  ℎ =  for all  ℎ .Additionally, the derivative    is analytically integrable over both the positive and negative regions, and so both conditions to apply positivization are met.Hence, the importance sampling PDFs (and CDFs) can be obtained for this derivative.
4.2.2Inapplicability of Positivization to Anisotropic GGX.For many BRDFs' derivatives, however, one of the two conditions fails, which precludes the use of positivization (and antithetic sampling) for them.For example, consider the derivative of  ( ℎ ) of an anisotropic GGX , is the product of a shape function  (, ) and normalization term  (). is the spatial coordinate, and  is a parameter that controls its width and height.Differentiating with respect to  gives rise to the real-valued derivative    (, ) (b).The derivative's root is given by 3 −2 /3 = (3 −  )/( − ), which has no analytic solution and renders positivization inapplicable to this derivative.However, the derivative can be written as    (, ) =    () (, ) +  ()    (, ) due to the product rule for derivatives.The first term (c, yellow) is purely negative and the second term (d, yellow) is purely positive.These single-signed functions have no sign variance and can be perfectly importance sampled with the PDFs  1 ,  2 (c,d, blue) leading to constant sample weights (c,d, red).
BRDF with its roughness   , Its roots are the set of ( ℎ ,  ℎ ) such that the expression 3 tan 2  ℎ cos 2  ℎ − tan 2  ℎ sin 2  ℎ  2  / 2  −  2  = 0, and are shown in Fig. 5 (a), purple curve.However, the derivative     is not analytically integrable over the positive and negative strata, see Fig. 5 (a), red and blue regions, which prevents us from applying positivization to this derivative.Derivatives of other materials like the diffuse BSSRDF from Burley [2015] (Fig. 6) and even the isotropic microfacet ABC BRDF do not have closed-form expressions for the roots, which prevents them from being positivized.
Apart from the derivatives of the isotropic GGX, Beckmann BRDFs with their roughness, positivization can also be used for the derivative of Hanrahan-Krueger BRDF with the scattering parameter  of its Henyey-Greenstein phase function.However, like the anisotropic GGX, there are several other BRDFs, like anisotropic Beckmann, Ashikhmin-Shirley, isotropic ABC, Oren-Nayar, Burley's diffuse BSSRDF, mixture models, etc., that cannot be handled by positivization because of one of the two conditions failing.   (, ) =   () −   () for mixture decomposition.For both decompositions, the two terms  1 () and  2 () are single-signed.Neither of the decompositions requires complicated root finding and integration up to the roots, unlike positivization, see Fig. 4. Instead, the integration domain is simply the entire hemisphere H.This makes PDF construction for these decompositions possible for several BRDF derivatives where positivization was not applicable.
It is common in BRDF importance sampling to numerically invert a CDF using binary search or Newton iterations, and we will do this with some of our product and mixture decomposition CDFs.However, for positivization, taking a purely numerical approach is not practical.Numerically approximating non-analytic roots and nonanalytically integrable PDFs requires storing a high dimensional representation (for e.g.6D   ,   ,   ,   for anisotropic GGX).Storing such a high dimensional histogram (piecewise approximation) can be infeasible.
Positivization is a specific single-signed decomposition that decomposes the real-valued function into a positive and a negative function with non-overlapping supports.As a result, it requires rootfinding and analytic integration over complicated domains defined by these roots.In the following two sections, we discuss two novel decompositions for which the positive and negative functions are defined over simple domains of integration like a plane or hemisphere, with overlapping support.As a result, they do not require root finding, or integration over complicated domains, and as a consequence can handle a broader class of derivatives.

PRODUCT DECOMPOSITION
Our first new decomposition is product decomposition.It can handle the derivatives of anisotropic microfacet BRDFs, diffuse BSSRDFs, and the isotropic ABC BRDF that positivization could not handle.
The key idea we exploit is that after differentiating any of these materials, they split up into two terms following the product rule.Both of these are single-signed, have no sign variance, and are analytically integrable over their simple domains of integration (hemisphere or plane).
Several BRDFs (or normal distribution functions) are of the form where ( ℎ , ) is a non-negative shape function, which determines the overall shape of the BRDF over all  ℎ , at the parameter value . () is a directionally constant (independent of  ℎ ) normalization term that ensures  integrates to 1. Differentiating  with , we get, Because  and    are directionally constant, the variance in the two terms above comes from  and    respectively.The first term above is single-signed because  ≥ 0. The second term with    can potentially be real-valued.However, we have found it to be singlesigned for several common BRDFs.For example, for the anisotropic GGX normal distribution function  ( ℎ ), we have, where    is single-signed, see Fig. 5. Additionally,    is also analytically integrable over its hemispherical domain.
Let us provide some geometric intuition for why the shape derivative    is often single-signed.For our BRDFs, the parameter  often controls the variance of the distribution, e.g.,   ,   for GGX, Beckmann,   ,   for Ashikhmin-Shirley.For all of these, the variance  stretches  horizontally, and increases (or decreases) its value at all locations, making its derivative single-signed.On the other hand,  stretches  () vertically to negate the increase (or decrease) in area due to , and ensure it integrates to 1.
We construct importance sampling PDFs for the two single-signed terms separately, with PDFs  1 ∝  and  2 ∝   , Fig. 7 (a) describes the pipeline to generate importance sampling PDFs for product decomposition.Product decomposition can handle the derivatives of anisotropic GGX, Beckmann, Ashikhmin-Shirley which are not analytically integrable over the positivized domains, and Burley's diffuse BSSRDF and the isotropic ABC BRDF, which have no closed-form solution for the roots.However, they all have single-signed    which is analytically integrable.Note that the product rule in and of itself does not guarantee a single-signed decomposition.For example, the product of the microfacet distribution () and geometric terms () does not lead to a single-signed decomposition for the derivative with   (or   ).This is because both     and the     terms are real-valued.The decomposition  =  is one of the many product decompositions, but the only one we found to preserve the single-signed property.

MIXTURE DECOMPOSITION
Our second new decomposition further expands the set of BRDF derivatives we can handle.Consider, for example, a BRDF made up of a diffuse   and specular   lobe with scalar mixture weights   and 1 −   respectively.
The derivative with the mixture weight   is positive when the diffuse lobe contribution is higher than the specular lobe and negative otherwise.In general, this derivative is very hard to positivize, because   and   can be arbitrary BRDFs, and so the roots of   −   are unlikely to have a simple analytic form.However, we can once again decompose this derivative into singlesigned functions with overlapping support; we refer to this as the mixture decomposition.Since   and   are non-negative valued BRDFs, they are single-signed, and can be importance sampled separately with appropriate PDFs   and   .
Mixture weights show up in all Uber BRDFs, like the Autodesk Standard Surface, Disney BRDF, etc., and our mixture decomposition can be applied to all of them.Mixture decomposition is also applicable to the derivative of BRDFs that aren't explicitly mixture models, but internally are made up of different lobes, with parametric weights.For example, the Oren-Nayar BRDF, which is a linear combination of two terms.Here, the positive weights (), () depend upon the roughness  of the BRDF.
where  = max (  ,   ),  = min (  ,   ).Once again, since both terms of the BRDF above are positive, the real-valued derivative with  is simply the sum of a positive and a negative term, with the sign of the term decided by the sign of    and   .Importance sampling the first term is simply cosine-hemispherical sampling, and we provide an importance sampling PDF for the second term in Appendix A.3.2.Besides Oren-Nayar, the microcylinder BRDF [Sadeghi et al. 2013] is also a mixture model with weights   , 1 −   , where   is the isotropic scattering coefficient, and can be handled by mixture decomposition as well.

RECIPE FOR IMPORTANCE SAMPLING BRDF DERIVATIVES
We now present a recipe to importance sample BRDF derivatives based on the key ideas introduced in the previous sections.
Step Step 2, Try Product or Mixture Decomposition.If positivization is inapplicable for either reason (no analytic roots or lack of analytic integrability), try to apply either product or mixture decomposition.
Step 2.1, Product Decomposition.If the original BRDF is of the form  ()(  , ), where  appears in a directionally invariant (independent of   ) normalization term  () and an unnormalized shape function (  , ), product decomposition may be applicable.
First check if    is single-signed, i.e., it has a constant sign for all   , and is analytically integrable.If these conditions hold, product decomposition is applicable.Construct a PDF  2 (  ) ∝    and compute the normalization terms for it and its conditional and marginal counterparts.The other PDF  1 (  ) ∝  is simply the BRDF sampling PDF.See Fig. 7 for the PDF generation, and Eqn. ( 16), for the estimator.
Step 2.2, Mixture Decomposition.If instead the parameter  appears in the form of linear combination weights either explicitly as a mixture model between two BRDFs, or implicitly as a mixture between two lobes that form a single BRDF, mixture decomposition is likely applicable here.In this case, simply use the PDFs and sampling strategies most suitable for the two mixture lobes if they are available (e.g., visible normal distribution function sampling for a GGX lobe), or construct PDFs  1 (  ) ∝  1 (  ),  2 (  ) ∝  2 (  ) for the two lobes, where  1 ,  2 are the two lobes.See Fig. 7 for the PDF generation, and Eqn. ( 18) for the estimator.
Fig. 8 depicts the estimators for all three of our decompositions for direct illumination.They all require two shadow rays at the shading point, corresponding to the positive and negative lobes of the corresponding decomposition.
Although we have not found examples that require it, our three decompositions can also be interleaved with one another for complicated BRDF derivatives.For example, it is possible that for some BRDF derivatives, the derivative of the shape function from the product rule    could be real-valued.It could then further be positivized to eliminate sign variance.
Forward Rendering Sampling Technique Reuse.Both product and mixture decomposition reuse BRDF sampling developed for forward rendering as one (or both) of the techniques for differential BRDF sampling.For product decomposition, this corresponds to  1 ∝ .For mixture decomposition, perfect importance sampling can be achieved by only employing two standard BRDF sampling techniques from forward rendering in some cases.BRDF sampling when used directly to estimate for    suffers from sign and shape variance, however, when paired with the right decomposition, it can correctly handle the shape variance of one of the terms.
Multiple Importance Sampling.For the product and mixture decompositions, the positive and negative decomposition PDFs can have overlapping support (for positivization they are necessarily nonoverlapping).As a result, the samples generated for one decomposition can be shared with the other using Multiple Importance Sampling.Also, all three of our decompositions reduce the variance from the differential BRDF    , and can be used in conjunction with light source sampling via Multiple Importance Sampling to reduce the lighting,   's variance.

RESULTS
We organize our results into two subsections.First, we demonstrate that our decompositions do reduce variance in practice for a number of BRDF derivatives under a wide variety of lighting conditions in Sec.8.1.Next, we demonstrate that lower variance in gradients indeed does enable better spatially-varying texture recovery in an inverse rendering setting, in Sec.8.2.
Implementation Details.We implemented all the different decompositions and BRDFs on our own CPU-based differentiable renderer, using the Embree [Wald et al. 2014] library for ray tracing.At each shading point, all three of our decompositions require two shadow rays, see Fig. 8.To have a fair comparison with BRDF sampling, we shoot out two shadow rays at each shading point for it too, which ensures an equal-ray comparison with our method.All our error comparison images are computed by averaging the squared error of the gradient images, which were each generated at 9 samples per pixel over 50 runs.We use numerical CDF inversion for the sampling of    in product decomposition and the   () term in Oren-Nayar.(with  = −0.9)lion is differentiated with its Henyey-Greenstein parameter for anisotropy .The Henyey-Greenstein phase function at  = −0.9 is highly back-scattering and is very badly importance sampled by regular BRDF sampling, which cannot correctly account for the highly peaked and signed nature of the derivative.Since positivization is correctly able to handle both sign and shape related variance, we see significant variance reduction of 1.96× and 58.57× for the teapot and lion respectively.8.1.2Product Decomposition.Next, we compare product decomposition with BRDF sampling for the derivative of an anisotropic Beckmann BRDF with its roughness   , lit under constant environment illumination in Fig. 9. Positivization (and by extension Zeltner et al.) cannot handle this derivative, see Sec. 4.2.2, and Zhang et al. 's method fails for even derivatives like this one, see Fig. 3. Constant illumination eliminates variance from lighting and only keeps variance from the BRDF derivative and visibility.Since product decomposition can correctly handle both the sign and shape variance of the BRDF derivative, it has an overall 14.5× reduction in variance, whereas BRDF sampling fails because it cannot handle either source of variance.In most regions (Fig. 9 see right inset), the derivative of the normal distribution function    is the major source of BRDF derivative variance; we eliminate it and see a big improvement of 601×.However, in the grazing angle regions (Fig. 9 see left inset), the derivative of the shadowing function    dominates.Here, our improvement is still significant (27.1×), but relatively less pronounced, since our sampling strategy minimizes   's variance.Now, we change the lighting to a realistic setup with two area lights and three anisotropic BRDFs, in Fig. 10.The three anisotropic BRDFs are GGX for the tray, Beckmann for the pot, and Ashikhmin-Shirley for the cup, and we compute the derivatives with   for GGX and Beckmann, and   for Ashikhmin-Shirley.The Ashikhmin-Shirley gradient is scaled up by 10 3 since it has a lower magnitude.We show the estimated derivative variance improvement for three anisotropic microfacet BRDFs, higher is better, for our product decomposition with MIS compared with BRDF sampling.The BRDFs are GGX for the tray, Beckmann for the pot, and Ashikhmin-Shirley for the cup, and are lit by two area lights.The visibility (from the object inter-occlusions) and the lighting introduces additional variance from these terms.Nontheless, our product decomposition which shares samples across the positive and negative lobes using MIS, is once again better able to handle the sign and shape variance than BRDF sampling, leading to an improvement of 1.58×.

Variance Reduction
Apart from BRDF derivative variance, this scene has two other major sources of variance, lighting, and visibility.When the variance is significant from other sources too, we have found that sharing samples between the positive and negative decomposition is beneficial, see Sec. 7, Multiple Importance Sampling (MIS).Our product decomposition with MIS better handles the shape and sign variance than BRDF sampling, and can outperform it for all three BRDF derivatives (see three insets in Fig. 10), and has an overall variance reduction of 1.58×.We show two more examples of product decomposition in Fig. 1, for anisotropic GGX and Beckmann BRDF derivatives, which achieve variance reduction of 1.56× and 3.61× respectively.The insets in the top row of Fig. 1 show the regions where our decomposition has lower variance than BRDF sampling in blue.Product decomposition outperforms BRDF sampling in almost all regions.
8.1.3Mixture Decomposition.Finally, we compare BRDF sampling with Mixture Decomposition to estimate the derivative of a mixture model with its mixture weight for the fish-shaped pot in Fig. 1.The mixture model is a linear combination of a lambertian diffuse lobe, and a GGX specular lobe and the lighting is two area lights.Mixture decomposition can reduce the variance by 4.72x, because it correctly handles shape and sign variance, unlike BRDF sampling.
Fig. 1 also shows an example of an Oren-Nayar pot, and its derivative with the roughness .BRDF sampling here is simply cosine hemispherical sampling, and works quite well in the central regions of the pot, because the cosine lobe is dominant in the non-grazing angle regions, see Eqn. (19).However, in the grazing angle regions towards the edges of the pot where the correction term is more dominant, BRDF sampling breaks down and has high variance.On the other hand, our mixture decomposition with MIS correctly accounts for the derivative of both terms with regard to their sign and shape variance, and can achieve low variance in all regions of the pot, and leads to a 3.91× reduction in variance.

Inverse Rendering
We demonstrate the benefits of correctly handling sign variance in gradients, for gradient-descent-based inverse rendering.We apply inverse rendering to the task of spatially varying texture recovery, and evaluate the effectiveness of all three of our decompositions on it.Our results for positivization are presented in Fig. 11, product decomposition in Fig. 12, and mixture decomposition in Fig. 13.All our inverse rendering results use 4 samples per pixel for both forward and gradient rendering at each optimization iteration.We use the ADAM optimizer [Kingma and Ba 2015] and the respective loss graphs show the mean absolute texture recovery error (1) after some initial iterations.
For positivization (Fig. 11), we recover the (spatially varying) scattering parameter  of a Hanrahan-Krueger BRDF with the semiinfinite depth assumption, lit by a single area light.The ground truth texture consists of a slightly back-scattering background region with  = −0.3, and a highly back-scattering logo region with  = −0.9,see Fig. 11 (c).The initialization is a back-scattering random initialization, i.e.,  < 0. Positivization benefits from lowered gradient variance by correctly treating sign variance, and consistently has lower texture recovery error than BRDF sampling, see Fig. 11 (b).This is especially effective in the highly back-scattering logo region, where BRDF sampling suffers from high variance, whereas positivization which correctly accounts for sign variance can recover texture in this region better, see Fig. 11 (d).Positivization casts two shadow rays at each shading point.To ensure an equal ray-triangle intersection budget, we cast two shadow rays for BRDF sampling at each shading point.
For product decomposition (Fig. 12), we optimize the spatially varying anisotropic roughness textures (  and   ) of a Beckmann BRDF under a photometric stereo setup under two illumination conditions.The two lighting conditions are rotated versions of the same environment map.Starting from a random initialization for both textures, product decomposition's correct handling of the sign  Our product decomposition computes the gradients for both roughness values using three samples at each shading point combined using multiple importance sampling (one each from  1 ,  2, ,  2, ).To ensure an equal-ray budget, we use three samples for BRDF sampling at each shading point too.
For mixture decomposition in Fig. 13, we recover the spatially varying roughness of an Oren-Nayar BRDF under environment map illumination.Once again, mixture decomposition benefits from lowered variance in gradients, and can recover a texture with lower error than BRDF sampling at an equal ray-triangle intersection budget, see Fig. 13 (b).

GLOBAL ILLUMINATION
We now describe how to importance sample BRDF derivatives under multiple bounce global illumination.The recursive rendering equation [Kajiya 1986] (ignoring emission) is given by a generalization of Eqn. ( 1  The overall BRDF is given by  =    + (1 − )   with  = 0.1 for the left box and  = 0.9 for the right box.We estimate the derivative with the weight .  is a lambertian diffuse lobe, and   is an isotropic GGX lobe with  = 0.05.Mixture decomposition leads to lower variance in most regions since it correctly deals with sign variance.However, in some regions, e.g the green inset, the lighting, and visibility variance is more significant, and in that particular region, BRDF sampling happens to be better aligned to reduce this variance.
which recursively describes how differential radiance is reflected.The two integrals (Eqn.( 22) and ( 23)) can be importance sampled separately.We have seen how to importance sample Eqn. ( 22) by applying different BRDF derivative decompositions in Sections 4.2, 5, and 6.Irrespective of the decomposition required, this requires two evaluations of   corresponding to the positive and negative lobes and is done by regular path tracing (similar to the standard splitting approach [Arvo and Kirk 1990]).To importance sample Eqn. ( 23), we follow standard BRDF sampling and continue the same recursive importance sampling of     at the next shading point.This means that we need three samples at each shading point, one each for BRDF, positive lobe and negative lobe importance sampling.Fortunately, for product and mixture decomposition, we can reduce this to two samples at each shading point.For product decomposition, as we saw in Sec. 5, one of either the positive or negative lobe decomposition PDFs is the same as BRDF sampling, and can share a sample with it.For mixture decomposition, BRDF sampling can be simulated by randomly choosing a sample from either the positive or negative lobes with the probability equal to the mixture weight of the BRDF sampling strategy.
Branching Complexity and Comparison with BRDF sampling.Even though we use two samples to estimate Eqn. ( 22), the total number of rays required to estimate     for a maximum depth  is quadratic i.e.,  ( 2 ), instead of exponential, see Fig. 14, whereas it is  () for BRDF sampling.This is because we only apply splitting when estimating Eqn. ( 22), which recurses on   , and we do not split when estimating Eqn. ( 23).The recursive call of   in Eqn. ( 22) does not require splitting, which prevents exponential branching.We have found that for one bounce of global illumination, with an equalray budget, our mixture decomposition can significantly reduce variance, see Fig. 15.

LIMITATIONS AND FUTURE WORK
Determining the number of samples for each decomposed component.For all three decompositions, our current implementation applies a two-sample estimator which uses one sample per component.It is possible that a different estimator can be more efficient in some cases.For example, when the two components have different areas (i.e., ∫    1 ≠ ∫    2 for components  1 and  2 ), it might be useful to adjust the number of samples according to the area of the component (we show in Appendix C that microfacet normal distribution functions always have components with equal area).Research in allocating budgets for multiple importance sampling can likely help in our case as well [Grittmann et al. 2022;He and Owen 2014;Sbert et al. 2018].Our estimator that always samples all components belongs to the deterministic mixture scheme [Owen 2013].An alternative is a random mixture, which randomly chooses one component.We opt for deterministic mixtures since they consistently outperform random mixtures in our direct lighting experiments (due to the stratification effect, similar to standard MIS v.s.one-sample MIS).For global illumination, random mixtures are the same as applying Russian Roulette to keep only one of the two branches, and can be more computationally convenient in some cases since they omit the need for quadratic branching.
Branching and Global Illumination.Our adoption of deterministic mixtures requires path splitting for global illumination.While the branching complexity is quadratic instead of exponential (same as a bidirectional path tracer), it can add undesired overheads.There are several ways to reduce the branching, 1) deterministically using only BRDF sampling or using random mixtures instead of deterministic mixtures after a certain recursion depth, 3) using path reconnection similar to Zhang et al. 's approach [2020], to reconnect the branches back to a single primary path.
Better Optimization Schemes.Ultimately, for inverse rendering, the optimization is both ill-posed and non-convex.Recently, we have seen some work [Xing et al. 2022] which takes a step in this direction.We believe the study of efficient estimators of the derivatives is largely orthogonal and equally crucial.

CONCLUSION
Our importance sampling techniques provide a fundamental component for future differentiable rendering work, enabling correct handling of sign and shape variance of differential BRDFs.BRDF sampling is widely used in forward rendering to deal with a variety of light transport phenomena; this includes unidirectional, bidirectional and gradient domain path tracing, Metropolis light transport, path guiding, photon mapping, etc.Similarly, as the need to deal with the differentials of more complicated light transport phenomena arises, we will need to develop differential counterparts of these algorithms and we believe that our method will be well suited to serve as a fundamental building block for them.Our product and mixture decompositions can also potentially have use outside of graphics for importance sampling real-valued functions.
weight   .This BRDF does not include cosine foreshortening.(36) where F is the Fresnel term, A is the albedo, and  is a Gaussian with width   .The first term is importance sampled using cosine hemispherical sampling, which is also the importance sampling technique used for this BRDF in forward rendering.The second term is importance sampled using inverse transform sampling for the Gaussian. .

𝑓 (𝝎
The only difference between the estimator above and the positivization estimator is that the samples  − () and  + () are correlated because they use the same uniform random number , whereas they are uncorrelated for positivization because positivization does not impose any such restriction.Thus, antithetic sampling is a special case of positivization with correlated random numbers.
Positivization (with uncorrelated random numbers) achieves its variance reduction due to the stratification of the real-valued function into positive and negative functions, and we have experimentally verified that antithetic sampling (with correlated random numbers) consistently has similar variance reduction as positivization.As a result, antithetic sampling's variance reduction can be explained by the implicit stratification of    into positive and negative lobes.See Fig. 3 for an example of the variance reduction.

C MICROFACET BRDF DERIVATIVES INTEGRATE TO ZERO
Previous work [Zeltner et al. 2021] has noted that the derivative of the normal distribution function of the isotropic GGX (and Beckmann) BRDF with its roughness parameter has positive and negative lobes with equal area.Here, we prove that this observation extends to all the derivatives of all microfacet normal distribution functions.
The projected area of a microfacet BRDF's normal distribution function  always integrates to 1 i.e a constant, ∫  ( ℎ , ) cos  ℎ d ℎ = 1 (39) As a result, its derivative with any parameter  integrates to 0, ∫    ( ℎ , ) cos  ℎ d ℎ = 0 (40) which means that the positive and negative lobes of    cos  ℎ have equal area.Since we generally construct microfacet derivative sampling PDFs proportional to the derivative of the projected normal distribution function, the sampling PDFs (irrespective of the decomposition) for the positive and negative lobes of the derivative must necessarily have equal area.

Fig. 2 .
Fig. 2. Sign Variance and Positivization.Differentiating the positive original function (a, yellow) results in a real-valued derivative (b, yellow).(b)Although the derivative (b, yellow)    and the PDF (b, blue)  match in shape, i.e.,  ∝ |   |, the sample weights   / (b, red) are non-constant due to a mismatch between their signs, causing sign variance.Positivization[Owen and Zhou 2000] splits the derivative into its positive and negative (c, d, yellow) parts.Since both parts are either purely non-negative or nonpositive, they can be perfectly importance sampled by constructing PDFs  + and  − (c, d, blue).The resulting sampling weights (c, d, red), are constant and the corresponding estimator has zero variance.
Fig. 3. Comparison between BRDF Sampling, Zeltner et al. [2021], Zhang et al. [2021a], and our Positivization, for the derivative of an isotropic GGX BRDF with its roughness .The object in the scene is a fire hydrant lit by two area lights.BRDF sampling is unable to correctly handle the sign or shape variance of the differential GGX BRDF and has high variance.Zhang et al. 's method is unsuitable for even derivatives like roughness and produces a high variance estimator as well.Zeltner et al.'s method is a special case of positivization and both of them have very similar variance reduction properties.We have found that positivization and Zeltner et al.'s method have very similar performance across several scenes, with positivization being better in some and Zeltner et al. in others.
Fig. 5. Complicated Roots and Product Decomposition of the Anisotropic GGX BRDF derivative.(a) The derivative    has complicated roots shown in (a), purple curve.Positivization requires analytical integration over the domain defined by these roots, which is not possible.Our product decomposition separates the two terms resulting from the product rule for differentiation,    =     +    , which are both single-signed as shown in (b,c).Product decomposition does not require integration up to the complicated roots, which enables easy PDF construction for both the positive and negative decompositions.
Fig.6.Product Decomposition and Inapplicability of Positivization.The BSSRDF profile  (, ) =  () (, ) from Burley [2015] (a), is the product of a shape function  (, ) and normalization term  (). is the spatial coordinate, and  is a parameter that controls its width and height.Differentiating with respect to  gives rise to the real-valued derivative

Fig. 8 .
Fig. 8. Positivization, Product and Mixture Decomposition for Direct Illumination.All three techniques send out two shadow rays corresponding to two different sampling techniques at each shading point, shown by red and green arrows.For positivization, the PDFs for these sampling techniques have non-overlapping support, shown by the red and green lobes.For mixture and product decomposition, however, the corresponding PDFs may have overlapping support.

Fig. 9 .
Fig.9.Product decomposition v.s.BRDF sampling under constant illumination.We show the estimated variance of derivatives.Numbers indicate the relative improvement, higher is better.The scene contains an anisotropic Beckmann BRDF under constant environment illumination.Under constant illumination, the BRDF derivative is the main source of variance.Our product decomposition correctly handles both the sign and shape variance, because of which we see an overall 14.5× reduction in variance compared to BRDF sampling.

Fig. 10 .
Fig.10.Product decomposition v.s.BRDF sampling under sharper illumination.We show the estimated derivative variance improvement for three anisotropic microfacet BRDFs, higher is better, for our product decomposition with MIS compared with BRDF sampling.The BRDFs are GGX for the tray, Beckmann for the pot, and Ashikhmin-Shirley for the cup, and are lit by two area lights.The visibility (from the object inter-occlusions) and the lighting introduces additional variance from these terms.Nontheless, our product decomposition which shares samples across the positive and negative lobes using MIS, is once again better able to handle the sign and shape variance than BRDF sampling, leading to an improvement of 1.58×.

Fig. 11 .
Fig. 11.Inverse Rendering of the scattering parameter  of a Hanrahan-Krueger BRDF.(a) forward rendering of the target.Our positivization has a lower texture recovery error than BRDF sampling (b).(c) shows the ground truth texture, with an inset of the recovered texture using positivization.(d) shows the error images for positivization and BRDF sampling.BRDF sampling is unable to recover the texture in the highly backscattering logo region.However, positivization can handle this region well.

Fig. 12 .
Fig. 12. Inverse Rendering of the roughness of an Anisotropic Beckmann Plate under a photometric stereo setup.The forward rendering of the target under the two lighting setups is shown in (a) and its inset.The texture recovery loss (b) demonstrates that our product decomposition achieves lower texture recovery error than BRDF Sampling for both the   and   textures.(c) and (d) show the ground truth textures for   and   , and the insets show our recovered textures.(a)Forward Rendering of Target

Fig. 14 .
Fig. 14.Branching for Global Illumination with Max Depth=3 for Product & Mixture Decomposition.Starting from a single vertex at  = 0, estimating     requires recursive estimation of both   and     via Eqns.(22) and (23).For product & mixture decomposition, since one new branch is created at every vertex, the total number of rays used here is quadratic i.e  ( 2 ), in the maximum depth .The sampling technique  1 is the same for BRDF sampling and the positive lobe of the BRDF derivative, and  2 is the sampling technique for the negative lobe.For positivization, two new branches are created at each vertex corresponding to  + ,  − , however, once created, they only require evaluation of   , and do not further branch out, so the complexity is still  ( 2 ).

Fig
Fig. 15.Comparison between BRDF Sampling and Mixture Decomposition under 1 bounce Global Illumination and equal ray-triangle intersection budget.The overall BRDF is given by  =    + (1 − )   with  = 0.1 for the left box and  = 0.9 for the right box.We estimate the derivative with the weight .  is a lambertian diffuse lobe, and   is an isotropic GGX lobe with  = 0.05.Mixture decomposition leads to lower variance in most regions since it correctly deals with sign variance.However, in some regions, e.g the green inset, the lighting, and visibility variance is more significant, and in that particular region, BRDF sampling happens to be better aligned to reduce this variance.

B
ZELTNER ET AL. 'S ANTITHETIC SAMPLING IS A SPECIAL CASE OF POSITIVIZATIONZeltner et al.'s [2021]  antithetic sampling involves generating paired and correlated samples for the positive and negative lobes of the BRDF derivative    in two separate passes, one pass for each lobe, and then averages out the final result.The correlation is induced by using the same random number generator state across the two passes.The only difference between the two passes are that the first one uses a flag to trigger sampling from the positive lobe  + of the PDF  =  + + (1 − ) − , and the second one triggers sampling from the negative lobe  − .Here,  is the relative area of the positive lobe of    , given by| ∫    + |/(| ∫    + | + | ∫    − |)and is equal to 0.5 for the BRDF derivatives they consider, see Appendix Sec.C.Their estimator for the integrand    is given by, are drawn from  + ∼  + and  − ∼  − , and the factor of 1/2 comes from the fact that they average the result of the two passes.We can further simplify Eqn.(37), to bring it in a form similar to the positivization estimator in Eqn.(8), by noticing that    ( ) =    + ( ) when  ∼  + and similarly for  − too, which gives us  =    + ( + ())  + ( + ()) +    − ( − ())  − ( − ()) Table 1 has a comprehensive list of ] Zeltner et al. [2021]BRDF    leads to high variance since    and  can be very different functions.They instead construct a PDF  ∝ |   |, called the differential detached PDF, which matches    in shape.This eliminates variance from the shape of    , i.e., the sample weights    / are constant in magnitude.There is, however, additional sign variance resulting from the mismatch in the sign between the positive-valued  and the real-valued integrand    resulting in sample weights    / that change sign.To deal with sign variance,Zeltner et al. [2021]applied antithetic sampling.While this does indeed reduce the variance of Eqn.
Fig. 4. Importance Sampling PDF construction for Positivization.Positivization splits  about its roots into single-signed  + ,  − .For PDF construction, we first solve for  's roots, which split the domain H based on  's sign into H + and H − .Next, we integrate inside the domain to compute the normalization constants.There are two potential roadblocks for positivization i) no analytic form of the roots of  , so H + and H − cannot be computed, see Fig.6, and ii) the positivized functions are not analytically integrable over H + /H − , see Fig.5.