NeuSample: Importance Sampling for Neural Materials

ABSTRACT
Neural material representations have recently been proposed to augment the material appearance toolbox used in realistic rendering. These models are successful at tasks ranging from measured BTF compression, through efficient rendering of synthetic displaced materials with occlusions, to BSDF layering. However, importance sampling has been an afterthought in most neural material approaches, handled by inefficient cosine-hemisphere sampling or by mixing it with an additional simple analytic lobe. In this paper we fill that gap by evaluating and comparing various pdf-learning approaches for sampling spatially varying neural materials, and by proposing new variations of these approaches. We investigate three sampling approaches: analytic-lobe mixtures, normalizing flows, and histogram prediction. Within each type, we introduce improvements beyond previous work, and we extensively evaluate and compare these approaches in terms of sampling rate, wall-clock time, and final visual quality. Our versions of normalizing flows and histogram mixtures perform well and can be used in practical rendering systems, potentially facilitating the broader adoption of neural material models in production.

INTRODUCTION
Rendering photo-realistic images of 3D scenes requires physically plausible material models. Analytic bidirectional scattering distribution functions (BSDFs), typically based on microfacet theory, have been the dominant choice in the past decade. Combined with spatially varying parameters (diffuse and specular reflectance, roughness, normal and displacement maps), these models can approximate a wide variety of appearances. However, not all real-world materials have reflectance functions that fit the analytic paradigm well. Furthermore, displacement mapping is expensive, and the cheaper normal mapping has only a limited ability to realistically approximate surface mesostructure, ignoring occlusions, inter-reflections, and parallax effects.
Recently, neural representations have been proposed to augment the material appearance toolbox [Rainer et al. 2019, 2020; Kuznetsov et al. 2021; Baatz et al. 2021; Sztrajman et al. 2021; Fan et al. 2022]. Unlike analytic models, neural models can theoretically represent any material, e.g., measured reflectance in the form of bidirectional texture function (BTF) data, or complex mesostructure defined using synthetic or real displacement data. The utility of these methods ranges from measured BTF compression, through efficient rendering of synthetic displaced materials with occlusions, to BSDF layering. The methods vary in their neural-network architectures, but they typically combine learned spatial-feature textures and fully connected blocks to approximate 6D reflectance functions over surface position and incoming and outgoing directions.
However, importance sampling has been an afterthought in most neural-material works. In NeuMIP [Kuznetsov et al. 2021] and its extension to curved-surface silhouettes [Kuznetsov et al. 2022], importance sampling is handled via simple Lambertian (i.e., cosine-hemisphere) sampling, which is ineffective for specular materials. Works on neural BRDF representation [Sztrajman et al. 2021] and BSDF layering [Fan et al. 2022] perform importance sampling via mixtures of Lambertian and Gaussian (or Blinn-Phong) lobes with network-predicted parameters; studying the efficiency of this approach has not been a core goal of these prior works. However, practical Monte Carlo rendering systems require all materials to provide three core interfaces at each surface shading point: evaluating the BSDF, sampling it, and evaluating the sampling probability density function (pdf); the pdf is expected to be approximately proportional to the BSDF. The latter two operations are needed for unbiased rendering as well as for multiple importance sampling (MIS) [Veach 1997], which is key to achieving robustness in complex production scenarios.
The objective of this paper is to fill this gap by evaluating and comparing various pdf-learning approaches for sampling neural materials, as well as proposing new variations of these approaches. Specifically, we present an improved version of the analytical mixture method [Sztrajman et al. 2021; Fan et al. 2022], as well as improved methods based on normalizing flows [Kobyzev et al. 2019] and a novel histogram-mixture prediction method inspired by Zhu et al. [2021] but significantly extended with new techniques. We find that both normalizing flows and histogram mixtures perform well across a selection of neural materials, while the improved analytic mixture is competitive for some but not all materials. In terms of wall-clock time, the histogram mixture method tends to perform best, while normalizing flows are also competitive, especially in terms of sampling rate (i.e., number of samples per pixel); performance depends on scene complexity and rendering cost. Our evaluation focuses on a NeuMIP-style architecture (Fig. 2), simplified to 2D feature textures instead of multi-resolution pyramids, but applies to any similar architecture parameterized by feature textures.
Our key contribution is a thorough evaluation of importance sampling methods for neural materials:
• We study three sampling approaches: analytic mixtures, normalizing flows, and histogram prediction (Section 3).
• Within each approach, we improve upon previous work and extensively evaluate and compare the approaches in terms of sampling rate, wall-clock time, and final visual quality (Fig. 1 and Section 4).
Based on our evaluation results, we recommend our histogram-mixture prediction as the overall best-performing method. However, our normalizing-flow variant can do better at equal sampling rate and could become the overall best performer in complex scenes. We believe our methods facilitate the broader practical adoption of neural material models in production rendering. They could also have impact beyond material importance sampling, for example in path guiding and complex-luminaire sampling.

RELATED WORK
Neural materials. Learning to sample neural materials is an instance of the more general problem of learning a conditional probability distribution $p(\omega \mid \mathbf{x}, \omega_i)$, where $\mathbf{x}$ is a point in a 3D scene, $\omega_i$ is the direction of the incoming path, and $\omega$ is an outgoing direction. This problem occurs repeatedly in 3D rendering: for example, path guiding is the problem of sampling, for a given path vertex $\mathbf{x}$, a direction $\omega$ that is likely to contribute significant incoming radiance $L(\mathbf{x}, \omega)$. A similar situation occurs in complex-luminaire rendering, where the goal is to sample a direction $\omega$ toward a luminaire point with strong emission. Therefore, methods from all of these areas are relevant to neural material sampling, and we evaluate some of these ideas in our context (specifically, normalizing flows and histogram prediction methods), in addition to the simple analytical-mixture methods previously used for neural materials.
Neural-material representations have recently received much attention, as they show promise to overcome limitations of traditional analytic BSDFs with parameter textures and displacement maps, or of measured and tabulated BTF data. Rainer et al. [2019, 2020] focused on BTF compression. Kuznetsov et al. [2021, 2022] showed how to represent synthetic or real materials with significant parallax and occlusion effects in a neural form that is much more efficient to evaluate than true microgeometry. Sztrajman et al. [2021] and Fan et al. [2022] approximated neural BSDFs with architectures allowing for high specularity and (in the latter work) layering operators. These approaches can theoretically represent any material data, be it complex synthetic microgeometry with displacements, shadows, and inter-reflections, or real measured reflectance functions. Neural materials are gaining in importance, but still lack a thorough exploration of importance sampling, a component that is crucial for broader practical adoption.
Analytic approximations. Cosine-hemisphere sampling of a neural material [Kuznetsov et al. 2021, 2022] is inefficient unless the material is very rough. To handle a wider variety of reflectance functions, some recent works fit analytic lobes, which have tractable pdf evaluation and sampling, to the local reflectance at a shading point. Sztrajman et al. [2021] fit an entire Blinn-Phong model per spatial location, using a neural network that predicts two parameters (glossiness and diffuse/specular ratio) from a local feature vector $\mathbf{z}$. Fan et al. [2022] take a similar approach to predict a mixture of a Lambertian and a Gaussian lobe (including mean and variance) given $\mathbf{z}$ and $\omega_i$. This approach is slightly more general, since the Gaussian lobe's mean does not need to lie on the reflected direction, allowing for additional effects like a spatially varying specular normal. While these analytic predictions can be sampled efficiently, they diminish the key advantage of neural materials: the ability to precisely represent the materials that do not fit analytic BRDF models well. We propose and study an improved version of Fan et al.'s method with more Gaussian lobes, which uses a similarly compact multi-layer perceptron (MLP) as Sztrajman et al. [2021] but extends to the complex spatially varying scenario. For training, Sztrajman et al. resort to supervised learning using the fitted Blinn-Phong parameters of Ngan et al. [2005], while we directly maximize the likelihood of samples, avoiding the limitations of manually defined metrics.
Normalizing flows. A family of methods for fitting general probability distributions, called normalizing flows, has been proposed by the machine-learning community [Dinh et al. 2014; Kobyzev et al. 2019]. The key idea is to learn a bijective, invertible mapping $x = g(u; \theta)$ between a simple base distribution $p_b(u)$ (typically Gaussian or uniform) and the learned distribution $p(x)$. Moreover, the mapping $g$ should have an easily computable Jacobian determinant. This allows for efficient sampling of the resulting pdf (by sampling the base distribution and passing the sample $u$ through $g$) as well as efficient pdf evaluation (by mapping a point $x$ through the efficient inverse $u = g^{-1}(x)$ and computing the base pdf and the Jacobian determinant). The function $g$ is sometimes called a pushforward, while $g^{-1}$ is called a normalizing flow, since it maps (flows) the complex target distribution onto a much simpler (often Gaussian), normalized distribution. Many architectures have been proposed to implement $g$ and $g^{-1}$. Please refer to the survey of Kobyzev et al. [2019] for an overview of the design decisions.
A general application of normalizing flows to sampling problems in graphics, such as path guiding, was presented by Müller et al. [2019]. They use a large U-shaped neural network with fully connected layers to predict parameters for piecewise-polynomial coupling transforms that warp an initial uniform distribution to the target. Müller et al. showed that the learned pdfs outperform state-of-the-art sampling methods at equal sample counts, but not in terms of wall-clock time. They acknowledged that the practicality of their method is not immediate, as its cost is higher than the gains from the better importance sampling achieved. They did not study importance sampling for materials.
For materials, Xie et al. [2019] used normalizing flows to approximate microfacet BRDFs with multiple scattering, fitting data simulated using geometric optics on actual heightfield microgeometry. Their method employs efficient affine transformations, including conditional scaling and translation. While this exact architecture could be used for neural-material sampling, we found the fitted pdfs were not of sufficiently high quality, and we propose a modified architecture. The comparison can be seen in Table 2.
We build upon a more advanced invertible transformation introduced in the normalizing-flow literature (neural spline flows [Durkan et al. 2019]), which we found to maximize expressiveness, and we apply additional techniques to keep the architecture efficient enough to achieve a gain in wall-clock time.
Histogram prediction. Instead of learning invertible transformations or lobe mixtures, we can predict a histogram (a discretized pdf approximation) given the conditional information. This approach is only feasible in low dimensions but is effective for 2D material importance sampling. To our knowledge, the only related work targets importance sampling of complex luminaires [Zhu et al. 2021]; it trains an MLP to predict a small image of the luminaire from a given view, which is then used for evaluation and sampling. Our histogram-mixture method significantly extends this idea.

METHODS
In this section, we introduce key concepts, followed by the three types of sampling methods evaluated in this paper.
Preliminaries. We would like to find an approximation $p(\omega \mid \mathbf{z}, \omega_i)$ to a target pdf $p^*(\omega \mid \mathbf{z}, \omega_i)$ over outgoing directions $\omega$, conditioned on an incoming direction $\omega_i$ and a local feature vector $\mathbf{z}$ that encodes the material properties at a given location. The target pdf $p^*$ is the luminance of the BRDF lobe at that location and direction, normalized to integrate to 1 over the hemisphere: this is the perfect importance-sampling distribution. We study different methods to approximate $p^*$; all are able to evaluate the approximation $p$ for a given $\omega_i$ as well as sample $\omega$ from it efficiently. Note that the terms incoming and outgoing refer to the direction of simulation. We consider BRDFs, which are non-zero in the upper hemisphere; we leave the technical task of extending to full BSDFs for future work.
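For concreteness, and using the cosine-weighted convention of the next paragraph, the target can be written as follows (our reconstruction from the definitions above; $f_r$ denotes the BRDF and $Y(\cdot)$ its luminance):
$$p^*(\omega \mid \mathbf{z}, \omega_i) \;=\; \frac{Y\big(f_r(\omega_i, \omega)\big)\,\cos\theta}{\int_{\mathcal{H}} Y\big(f_r(\omega_i, \omega')\big)\,\cos\theta'\,\mathrm{d}\omega'},$$
where $\theta$ is the angle between $\omega$ and the surface normal.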
Projected hemisphere. Our goal is to sample proportionally to the product of the BRDF and the cosine foreshortening term on the unit hemisphere $\mathcal{H}$. This is equivalent to sampling the projection of the BRDF (without cosine) onto the unit disk $\mathcal{H}^\perp$. Such a pdf can be transformed into a hemispherical one via multiplication with the cosine term (division in the opposite direction). We choose to fit distributions of unit-disk projections $\omega^\perp \in \mathcal{H}^\perp$. For most methods, the planar unit disk is a more convenient domain for defining the pdfs.
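A minimal sketch of this correspondence (our own illustrative code, not the paper's implementation): lifting a disk sample to the hemisphere, and converting a disk pdf to a hemispherical one via the cosine Jacobian.

```python
import numpy as np

def disk_to_hemisphere(w_perp):
    # Lift a unit-disk sample (x, y) to a unit-hemisphere direction.
    x, y = w_perp
    z = np.sqrt(max(0.0, 1.0 - x * x - y * y))
    return np.array([x, y, z])

def hemisphere_pdf(disk_pdf, cos_theta):
    # p_H(omega) = p_{H_perp}(omega_perp) * cos(theta):
    # the Jacobian of the disk projection is the cosine foreshortening term.
    return disk_pdf * cos_theta
```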
Application to NeuMIP. Our approach could work for most neural-material representations, but we specifically learn sampling for a variant of NeuMIP [Kuznetsov et al. 2021], where the BRDF depends only on a learned 8-dimensional feature vector $\mathbf{z}$ retrieved from a given UV coordinate. In NeuMIP, the UV coordinate is corrected by a separate offset module to handle parallax effects; the offsetting happens before BRDF importance sampling and is orthogonal to our method. We do not consider the multi-resolution version of NeuMIP (a trilinearly interpolated pyramid of feature textures), nor its silhouette extension [Kuznetsov et al. 2022]. These components are also orthogonal to BRDF importance sampling, so they are likely to work with our approach as well.
Three sampling methods. We consider three neural sampling approaches. The first predicts a mixture of a few simple analytical lobes (Lambertian and Gaussian), the second is based on normalizing flows, and the third predicts discretized histograms. Within each category we present a method that improves over previous work for our application. In the following subsections we describe these methods, and in Section 4 we analyze their performance in terms of variance reduction and computational cost.
Fitting. We train neural networks that fit the pdf $p$ to the ground truth $p^*$ by minimizing the KL divergence $D_{\mathrm{KL}}(p^* \,\|\, p)$ between the two. This is equivalent to maximizing the log-likelihood (with respect to $p$) of directions sampled from the ground-truth distribution $p^*$. In practice, in every training batch we sample from the ground-truth distribution $p^*(\omega^\perp \mid \mathbf{z}, \omega_i)$ (discretized to a high-resolution grid of directions $\omega^\perp$ for a randomly chosen value of the condition $(\mathbf{z}, \omega_i)$), evaluate the log pdf of the trained model $p$, and back-propagate to update the neural-network parameters.
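A minimal PyTorch sketch of one such training step, assuming a hypothetical `model.log_prob` interface and a precomputed ground-truth grid (the names and shapes are our assumptions, not the paper's API):

```python
import torch

def nll_train_step(model, optimizer, z, wi, gt_grid, grid_dirs, n_samples=4096):
    # gt_grid: (R*R,) discretized p* for the condition (z, wi);
    # grid_dirs: (R*R, 2) projected directions at the grid-cell centers.
    idx = torch.multinomial(gt_grid, n_samples, replacement=True)
    w_perp = grid_dirs[idx]                       # samples ~ p*
    # Minimizing the negative log-likelihood of these samples under the
    # model is equivalent to minimizing KL(p* || p) up to a constant.
    loss = -model.log_prob(w_perp, z, wi).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```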

Mixture of analytical lobes
Improved baseline analytical mixture. A baseline method [Fan et al. 2022] approximates the desired pdf $p^*(\omega \mid \mathbf{z}, \omega_i)$ by predicting a combination of a Lambertian lobe and an isotropic 2D Gaussian lobe, given $\mathbf{z}$ and $\omega_i$. The predicted parameters are the (scalar) standard deviation $\sigma$ of the Gaussian lobe and the relative weight $w$ between the two lobes. However, this approach is too limited to represent even materials with local shading normals, which are very common in neural BTFs including NeuMIP. To that end, we improve the baseline by also predicting the mean $\boldsymbol{\mu}$ of the Gaussian lobe:
$$p(\omega^\perp \mid \mathbf{z}, \omega_i) = w\,\frac{1}{\pi} + (1 - w)\, G(\omega^\perp; \boldsymbol{\mu}, \sigma), \tag{1}$$
where $G(\omega^\perp; \boldsymbol{\mu}, \sigma)$ is a normalized 2D Gaussian with mean $\boldsymbol{\mu}$ and standard deviation $\sigma$, evaluated at the projected direction $\omega^\perp$. The inputs to the MLP are the feature vector $\mathbf{z}$ and the direction $\omega_i$ in the local shading frame, while the outputs are the mixture parameters $w$, $\boldsymbol{\mu}$, and $\sigma$. Note that the Lambertian pdf on the projected hemisphere is a constant $1/\pi$, i.e., uniform sampling on the projected hemisphere is equivalent to cosine sampling on the hemisphere.
Our analytical mixture. To better capture multi-modal highlights or highlights with non-Gaussian or asymmetric falloff, we further choose a mixture of one Lambertian lobe and two axis-aligned anisotropic Gaussian lobes with diagonal covariance matrices:
$$p(\omega^\perp \mid \mathbf{z}, \omega_i) = w_1\, G(\omega^\perp; \boldsymbol{\mu}_1, \boldsymbol{\sigma}_1) + w_2\, G(\omega^\perp; \boldsymbol{\mu}_2, \boldsymbol{\sigma}_2) + w_3\,\frac{1}{\pi}, \tag{2}$$
where $G(\omega^\perp; \boldsymbol{\mu}, \boldsymbol{\sigma})$ is a 2D Gaussian with mean $\boldsymbol{\mu}$ and standard deviations $\boldsymbol{\sigma} = (\sigma_x, \sigma_y)$ in the $x$ and $y$ axes. The three weights are positive, sum to 1, and are predicted by a simple MLP with one hidden layer, along with the corresponding Gaussian means and standard deviations. As before, the inputs to the MLP are the feature vector $\mathbf{z}$ and the direction $\omega_i$ in the local shading frame, and the outputs are the parameters of the mixture: $\boldsymbol{\mu}_1, \boldsymbol{\sigma}_1, \boldsymbol{\mu}_2, \boldsymbol{\sigma}_2, w_1, w_2, w_3$ (see Fig. 3). More components or full anisotropy could capture more complex distributions, but they increase the fitting difficulty and the computational workload for common cases with only one highlight; two diagonal Gaussian lobes are a good compromise.
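A sketch of sampling and pdf evaluation for this mixture (a minimal NumPy illustration under our own naming; the paper's implementation runs on wavefronts of rays in PyTorch):

```python
import numpy as np

def sample_mixture(w, mu, sigma, rng):
    # w: (3,) lobe weights; mu, sigma: (2, 2) Gaussian means / std devs.
    k = rng.choice(3, p=w)
    if k < 2:                                    # anisotropic Gaussian lobe
        return mu[k] + sigma[k] * rng.standard_normal(2)
    # Lambertian lobe: uniform on the unit disk = cosine on the hemisphere.
    r, phi = np.sqrt(rng.random()), 2.0 * np.pi * rng.random()
    return np.array([r * np.cos(phi), r * np.sin(phi)])

def mixture_pdf(w, mu, sigma, w_perp):
    # Eq. (2): two diagonal 2D Gaussians plus the constant Lambertian term.
    g = [np.exp(-0.5 * np.sum(((w_perp - mu[k]) / sigma[k]) ** 2))
         / (2.0 * np.pi * sigma[k, 0] * sigma[k, 1]) for k in range(2)]
    return w[0] * g[0] + w[1] * g[1] + w[2] / np.pi
```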
Invalid samples. The support of the above two pdf mixtures is wider than the unit disk, i.e., it extends outside the BRDF support, meaning that in practice some samples will be invalid (i.e., off the hemisphere). Such samples simply yield zero-valued Monte Carlo estimates, as also happens when importance sampling analytic BRDFs like microfacet models [Walter et al. 2007], and pose no practical issues.

Normalizing flows
Neural materials have arbitrary reflection profiles, embedded local normals, texture, and potentially complex layered (anisotropic) behaviors. A mixture of simple lobes with a few trainable parameters is not always expressive enough to provide accurate importance sampling, and determining the number of mixture components required for all materials is itself difficult. Normalizing flows provide a compelling alternative for our application, as they support both sampling and density evaluation, and have been shown to fit complex, multi-modal distributions. Normalizing flows learn a bijective mapping between a simple base distribution and (an approximation of) a complex target distribution, and can be used to generate samples from the latter. A carefully designed architecture with an autoregressive property ensures the mapping's Jacobian is triangular, with an easy-to-compute determinant, which is needed for pdf evaluation. The architecture also ensures tractable invertibility of the mapping [Dinh et al. 2016; Papamakarios et al. 2017], useful for multiple importance sampling.
As the base distribution $p_b$ we use a 2D Gaussian. We learn a bijective function $g$ to map samples $u$ from $p_b$ to samples $\omega^\perp = g(u \mid \mathbf{z}, \omega_i)$ that approximately follow the target $p^*(\cdot \mid \mathbf{z}, \omega_i)$. If the inverse $f = g^{-1}$ and its Jacobian determinant are efficiently computable, we can evaluate the learned distribution $p$ at a point (direction) $\omega^\perp$ as
$$p(\omega^\perp \mid \mathbf{z}, \omega_i) = p_b(u) \left| \det J_f(\omega^\perp) \right|, \quad \text{where } u = f(\omega^\perp \mid \mathbf{z}, \omega_i). \tag{3}$$
This framework satisfies our goal of fitting pdfs that can be evaluated and sampled from efficiently, provided we can find a neural network to represent $g$ (and $f$) with the desired properties. A detailed exploration of the neural architectures used for normalizing flows is out of our scope, and we refer the reader to the surveys of Kobyzev et al. [2019] and Papamakarios et al. [2021].
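In code, Eq. (3) amounts to one inverse pass plus a log-determinant accumulation; a minimal sketch (the `flow_inverse` and `base_log_prob` callables are our own stand-ins, not the paper's API):

```python
import torch

def flow_log_pdf(flow_inverse, base_log_prob, w_perp, z, wi):
    # Map the query direction back through f = g^{-1}; the inverse pass
    # is assumed to also return the accumulated log|det J_f|.
    u, log_det_jac = flow_inverse(w_perp, z, wi)
    return base_log_prob(u, z, wi) + log_det_jac
```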
We ran ablations on representative state-of-the-art transformations, from simple affine transformations [Dinh et al. 2016] to free-form monotonic neural networks [Wehenkel and Louppe 2019; Huang et al. 2018]. The former have limited flexibility due to the simplicity of the transformation, while the latter lack an analytical inverse. We concluded that monotonic piecewise polynomials [Müller et al. 2019; Durkan et al. 2019; Dolatabadi et al. 2020] achieve the best balance, providing similar expressive power while remaining efficient to invert. We build upon monotonic piecewise rational quadratic splines [Durkan et al. 2019], which provide more flexibility than RealNVP [Dinh et al. 2016] and piecewise polynomials [Müller et al. 2019]. Since we are working with distributions of only two dimensions, there is no significant performance difference between coupling layers and autoregressive architectures, and we choose coupling transforms for their simplicity. Namely, each invertible transformation is applied to only one dimension of $\omega^\perp$, and we couple the two transformations together to fuse the dimensions.
Below we detail the monotonic piecewise rational quadratic (RQ) spline and our corresponding application. We split the square region $[-1, 1]^2$ of the initial $u$ space and the target $\omega^\perp$ space into several intervals. Within each interval, we learn a monotonically increasing rational quadratic function (we omit the interval index for brevity):
$$t(x) = y_0 + \frac{(y_1 - y_0)\left[ s\,\xi^2 + d_0\,\xi(1 - \xi) \right]}{s + (d_0 + d_1 - 2s)\,\xi(1 - \xi)}, \quad \xi = \frac{x - x_0}{x_1 - x_0}, \quad s = \frac{y_1 - y_0}{x_1 - x_0}, \tag{4}$$
where $x_{0(1)}$, $y_{0(1)}$, and $d_{0(1)}$ are the locations and derivatives at the left (right) boundary of each interval. These are predicted by a 3-layer MLP taking $(\mathbf{z}, \omega_i)$ and the other dimension of $\omega^\perp$ as input. We use 20 intervals to capture the pdf variation. Solving and selecting the correct root of a quadratic equation gives the inverse of the transformation; we refer the reader to Durkan et al. [2019] for the Jacobian computation and further details. Since the target pdf variable $\omega^\perp$ has two dimensions, the 3-layer MLP inference and the inverse of the above transformation need to run twice during sampling. Note that the predicted splines can be shared between the pdf query and the sampling routine for multiple importance sampling at each query point $(\mathbf{z}, \omega_i)$.

Figure 4 illustrates our architecture. The rich information encoded in the conditional neural feature vector $\mathbf{z}$ lets us greatly simplify the importance-sampling network of invertible transformations needed to capture the complex spatially varying pdfs. Unlike previous work [Müller et al. 2019; Xie et al. 2019], we further use a conditional Gaussian base distribution that depends on $(\mathbf{z}, \omega_i)$, instead of a uniform distribution or a Gaussian with fixed zero mean and unit variance. Conditional normalizing flows have been studied for super-resolution and image segmentation tasks [Winkler et al. 2019]. As we show in Section 4, this conditional base distribution helps us further reduce the total architecture size (one 3-layer MLP used in the flow, and one 2-layer MLP to predict the base-Gaussian parameters). Moreover, the conditional probability is guaranteed to be normalized by construction; the log-log convergence plots in Fig. 8 have the expected slope, which confirms unbiasedness.
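A minimal sketch of the forward RQ-spline evaluation of Eq. (4) (our own illustrative code; in the actual model, the knot locations and derivatives come from the conditioning MLP):

```python
import numpy as np

def rq_spline_forward(x, xs, ys, ds):
    # xs, ys: monotone knot locations in x and y; ds: derivatives at the
    # knots, as in Durkan et al. [2019]. All arrays have length K+1 for
    # K intervals; x is assumed to lie inside [xs[0], xs[-1]].
    b = np.clip(np.searchsorted(xs, x) - 1, 0, len(xs) - 2)
    x0, x1, y0, y1 = xs[b], xs[b + 1], ys[b], ys[b + 1]
    d0, d1 = ds[b], ds[b + 1]
    s = (y1 - y0) / (x1 - x0)                    # interval slope
    xi = (x - x0) / (x1 - x0)                    # normalized position
    num = (y1 - y0) * (s * xi**2 + d0 * xi * (1.0 - xi))
    den = s + (d0 + d1 - 2.0 * s) * xi * (1.0 - xi)
    return y0 + num / den
```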

Histogram prediction
Another approach for importance sampling is to directly predict an easy-to-sample distribution. We opt for a piecewise-constant 2D distribution, i.e., a histogram, represented as a regular grid of bin values. Following previous work that uses a low-resolution predicted rendering of a complex luminaire as a sampling distribution [Zhu et al. 2021], we predict a histogram approximating $p^*(\omega^\perp \mid \mathbf{z}, \omega_i)$ through an MLP taking the condition $(\mathbf{z}, \omega_i)$ as input.
We found that a direct application of this idea does not perform well in our setting (called 'histogram (baseline)' in Table 2). The likely reason is that while a luminaire can always be centered in the predicted histogram, our pdf lobes can have very different centers and shapes as the condition $(\mathbf{z}, \omega_i)$ varies. To this end, we present a variant that fits our distributions better and is also computationally efficient. Instead of directly predicting a histogram for each query, we assume these histograms can be decomposed into histogram mixtures, i.e., weighted combinations of shared basis histograms:
$$p(\omega^\perp \mid \mathbf{z}, \omega_i) = \sum_{k=1}^{K} w_k\, H_k\!\left(R_{\phi_k}\,\omega^\perp;\, c_k\right), \tag{5}$$
where $H_k$ are the $K$ basis histograms, globally shared across the neural material, and $w_k$ are the corresponding mixture weights; $R_{\phi_k}$ represents azimuthal rotation by an angle $\phi_k \in [0, 2\pi]$ tied to $\omega_i$, and $c_k \in [0, 1]$ is a scalar latent code. The rationale behind the 2D $(\phi_k, c_k)$ parameterization is to efficiently encode continuous changes in the BRDF lobe as the incoming direction $\omega_i$ varies; as can be seen in Fig. 6, both the lobe's position and shape change. The code $c_k$ captures the shape change and part of the rotation. This design reduces the number of mixture components needed to fit the pdf, compared to simpler designs.
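A sketch of evaluating Eq. (5) after baking (our own illustration; the tensor layout and names are assumptions, and bins are assumed to store normalized density values):

```python
import numpy as np

def histogram_mixture_pdf(baked, weights, phis, codes, w_perp):
    # baked: (K, L, R, R) tensor of basis histograms (K bases, L latent-code
    # levels, R x R bins over the projected-hemisphere square [-1, 1]^2).
    K, L, R, _ = baked.shape
    total = 0.0
    for k in range(K):
        c, s = np.cos(phis[k]), np.sin(phis[k])
        xr = c * w_perp[0] - s * w_perp[1]       # azimuthal rotation R_phi
        yr = s * w_perp[0] + c * w_perp[1]
        l = min(int(codes[k] * L), L - 1)        # discretized latent code
        i = min(int((xr + 1.0) / 2.0 * R), R - 1)
        j = min(int((yr + 1.0) / 2.0 * R), R - 1)
        total += weights[k] * baked[k, l, i, j]
    return total
```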
Training. We use $K = 10$ basis histograms for each neural material and train the network using an $L_2$ loss between the predicted histogram and the ground-truth pdf for randomly sampled tuples $(\mathbf{z}, \omega_i, \omega^\perp)$:
$$\mathcal{L}_2 = \left\| p(\omega^\perp \mid \mathbf{z}, \omega_i) - p^*(\omega^\perp \mid \mathbf{z}, \omega_i) \right\|_2^2.$$
We have found this loss to perform better than the KL divergence. After training, we tabulate histograms at resolution $R \times R = 64 \times 64$ and discretize the latent code $c$ into $L = 100$ equi-spaced values. Since $c_k$ is learned implicitly, it is not guaranteed to be distributed uniformly in $[0, 1]$. To encourage such a distribution, and to reduce post-training discretization error, we add a quantization term to the above loss: $\mathcal{L}_q = \| c_k - \hat{c}_k \|_2^2$, where $\hat{c}_k$ is the quantization of $c_k$.
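After baking, sampling a single 2D histogram is a standard CDF inversion; a minimal sketch (our own illustration, assuming bin probabilities normalized to sum to 1):

```python
import numpy as np

def sample_histogram(hist, rng):
    # hist: (R, R) bin probabilities over the square [-1, 1]^2, summing to 1.
    R = hist.shape[0]
    idx = np.searchsorted(np.cumsum(hist.ravel()), rng.random())
    row, col = divmod(idx, R)
    # Jitter uniformly inside the chosen bin (piecewise-constant pdf).
    u = np.array([row, col]) + rng.random(2)
    w_perp = 2.0 * u / R - 1.0
    pdf = hist[row, col] * (R * R / 4.0)         # bin mass / bin area
    return w_perp, pdf
```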

RESULTS
We now present an evaluation of our three proposed methods. The supplemental document discusses ablations of our design choices.
We implemented all our sampling techniques in PyTorch [Paszke et al. 2019] and integrated them into Mitsuba 3 [Jakob et al. 2022]. Every sampling and pdf-evaluation call for a specific material is run on a wavefront of rays. All results are produced using a single NVIDIA RTX 3090 GPU. Our implementations are renderer-agnostic and could be easily integrated into other systems. We will make our code and data publicly available.
Stratification and MIS. For simplicity, and to ensure correctness, all our results use independent pseudo-random sampling, which yields slopes of −1 on log-log plots due to the linear variance reduction with increasing sample count. One could also use (low-discrepancy) stratification; we have verified that this helps convergence slightly and does not change the relative ordering of the methods.
Most of our results use only BRDF sampling, as it is our main focus. Emitter sampling in combination with multiple importance sampling (MIS) works as expected with our methods, since we support all required sub-routines for pdf evaluation and sampling. We use MIS in Fig. 11 as well as in the Cowhide and Metal scenes in Fig. 9, which are all rendered with global illumination.
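MIS needs each technique's pdf for the other technique's samples, which our pdf-evaluation routine supplies. As an illustration (the paper does not specify the heuristic), the common balance heuristic [Veach 1997] computes the weight for a BSDF-sampled direction as:

```python
def balance_heuristic(pdf_bsdf, pdf_emitter):
    # MIS weight for a sample drawn via BSDF sampling; the learned
    # pdf-evaluation routine provides pdf_bsdf for emitter samples too.
    return pdf_bsdf / (pdf_bsdf + pdf_emitter)
```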
Table 1: Timing breakdown for GPU rendering of NeuMIP materials in the scene from Fig. 1. The neural-material evaluation itself takes 0.025 s/spp and is included in the total rendering time. This simple scene is an almost worst-case scenario for the overhead of (our) advanced importance sampling; in practical scenes this overhead diminishes quickly.
Table 1 provides a breakdown of the sampling and total rendering time for direct lighting in the scene of Fig. 1. The overhead of our importance-sampling techniques ranges from 2% to 12% of the total rendering time. Our improved analytical and histogram-mixture methods each use one shallow MLP (two and four layers, respectively), resulting in a small 2-4% overhead. Our normalizing-flow sampler is slower, as it performs three shallow MLP evaluations (one two-layer and two three-layer) and some sequential batch operations, e.g., calculating the CDFs along the polynomial intervals needed for root-finding in the inverse. At equal render time, that sampler can therefore generate only 0.78× as many samples (see Fig. 1). In more complex practical scenes, the overhead of our neural sampling becomes even more negligible.
Pdf prediction. In Fig. 7 we plot pdfs predicted by all sampling methods on four materials at randomly chosen conditions $(\mathbf{z}, \omega_i)$. The improved, normal-mapping-aware baseline is able to find the isotropic highlight when the pdf varies little over the surface (Goat Leather, Clover Quilt) but fails to capture anisotropy (Stylized Melted Metal). Our analytical method improves on this shortcoming but still suffers from the limited expressiveness of the simple analytical lobes over the entire 4D conditioning space (UV and incoming direction). Our normalizing-flow and histogram alternatives greatly improve the fitting accuracy, leading to more efficient sampling. This is quantified in Table 2, where we also include two additional baselines: practical normalizing-flow sampling [Xie et al. 2019] and naive histogram sampling [Zhu et al. 2021].

Rendering results.
Our proposed methods consistently outperform the cosine and baseline methods used in previous works. Note that the "baseline" method we show is already significantly improved by us; the original baseline [Sztrajman et al. 2021; Fan et al. 2022] would fail to beat even cosine sampling, since it cannot learn normal mapping. We also observed that suboptimal solutions can perform well at some conditions $(\mathbf{z}, \omega_i)$ but create problems at others, resulting in overall worse performance (see Giallo Marble, where the analytical method is worse than the baseline). The two alternatives we recommend, in contrast, are robust and can handle these complex variations. Normalizing flows perform best for some materials and are especially good at capturing anisotropy (Fig. 10 and the bottom three materials in Fig. 9). Histogram mixtures perform best in other scenarios and are especially good at multi-coating materials with complex variation (Victorian Fabric), while bringing only a tiny overhead. These two methods outperform the analytical one in most complex cases with difficult spatial BRDF variations, while remaining robust on simpler materials (Cowhide Leather, Sheep Leather). Overall, our samplers significantly reduce noise at low sampling rates, cutting the rendering cost when many neural materials are used in a practical application. Table 2 shows that the benefits range from 1.28× (Giallo Marble) to 6.78× (Moroccan Tiles), with a mean of 2.58×, i.e., a 158% speed-up.
The "tabulated" column in Fig. 10 employs sampling from an impractical, near-optimal pdf that is a brute-force 2048×2048 discretization of the ground-truth BRDF distribution at each shading point.Those results still show some small error due to using the BRDF luminance to importance sample all three color channels.Our practical methods achieve similar performance in comparison.
We further plot error convergence in Fig. 8 to quantify the benefit of using our neural samplers. The normalizing flow mostly performs on par with the histogram mixture, and better for anisotropic materials (equal-spp plots in the first row), but the latter does best in the most common cases thanks to its much smaller overhead (equal-time plots in the second row). When the geometry is not too complex, we recommend our histogram mixture, which benefits from advanced importance sampling at nearly no extra computational cost. The normalizing flow is a good choice for anisotropic materials.
Fig. 11 shows a scene with multiple neural materials, rendered with our normalizing-flow sampler at 128 spp with 12 light bounces. Our sampler handles global illumination well and can be used for practical rendering needs. The four stools are textured with the Elephant Leather material; the table is made of Stylized Melted Metal and Giallo Antico Marble, creating a realistic embossing effect; and the bottle is textured with the Stylized Light Bulb Screw material.
Failure cases and limitations. As seen in some zoom-ins in Fig. 9, our proposed methods are not guaranteed to outperform the improved baseline at every condition $(\mathbf{z}, \omega_i)$. Our analytical method has almost no advantage on simple isotropic materials with a single highlight, though it helps in more challenging cases. Normalizing flows have a longer inference time (Table 1) and thus show no gains on simpler scenes and materials. We also observe that, in rare extreme cases, histogram mixtures can fail to predict the pdf lobe when its shape changes rapidly between neighboring conditions $(\mathbf{z}, \omega_i)$.
Summary and recommendations. Based on our experiments, our analytical mixture method should be used only when it is known beforehand that the material has simple lobes with only a few highlights. The other alternatives provide a more general solution without extra rendering overhead. In terms of wall-clock time, the histogram method achieves the best variance reduction in our experiments, which makes it our default recommendation; it is also easy to use at inference time thanks to the precomputed discretized tensor replacing a large MLP. However, wall-clock times depend heavily on the scene and the renderer. The normalizing flow may sometimes be the winning option, especially with anisotropy or in complex settings with expensive ray tracing or material evaluation. Future code optimization of the MLP and RQ-spline evaluation may further speed up these methods, and hardware support for these operations on the GPU may become available, further increasing the gains from accurate sampling. Our improved analytic mixture remains an option when easy implementation and very fast sampling are priorities.

CONCLUSION AND FUTURE WORK
In this paper, we evaluated and compared several importance sampling approaches for neural material representations. We studied three types of methods: analytic mixtures, normalizing flows, and histogram prediction. While simple analytic mixtures of Lambertian and Gaussian lobes have been used before, they lack the ability to handle dramatic normal variations, anisotropy, and layered materials, and our version outperforms them. Moreover, we introduced variations of normalizing flows and histogram mixtures based on novel design ideas that perform well across the board, and we can recommend either approach for practical neural-material implementations. These approaches provide a complete toolbox to enable the use of spatially varying neural materials in production rendering.
Our work opens several future directions. Our sampling models, like many neural material models themselves, are not universal and need to be trained and precomputed for every material separately; finding a universal evaluator and sampler architecture, where only the feature textures change, would be valuable. Another direction would be to jointly train the sampling and evaluation models; currently the sampling model fits the evaluation model including its approximation error. Finally, our solution is not limited to neural materials and could apply to other importance-sampling tasks where directions are sampled conditionally on scene position, such as path guiding, complex-luminaire sampling, and portal sampling.

Figure 1: Two neural materials rendered using five BRDF importance-sampling methods. To achieve equal render times, we adjust the number of samples per pixel (spp); from left to right: 33 spp, 32 spp, 32 spp, 25 spp, 32 spp.

Figure 2: Neural materials can (a) encode various appearance effects: specular/glossy reflectance, displacement, anisotropy, and multi-layered BRDFs. (b) The NeuMIP architecture uses two feature textures and two MLPs; only the last step depends on the light direction, so it is the only part that requires importance sampling. (c) The model is trained on synthetically rendered slices of the 6D reflectance function with varying camera and lighting directions. (d) Our methods learn to importance sample neural materials, conditioned on the pretrained NeuMIP features.

Figure 3: Our analytical-lobe mixture method takes as input the neural-material features at the UV surface location and the viewing direction, and infers the parameters of three lobes, one Lambertian and two anisotropic Gaussians, to capture diffuse reflection and (potentially multi-modal) highlights. High performance is ensured by using an MLP with one hidden layer.

Figure 4: Our lightweight normalizing-flow model transforms input samples $(u_0, u_1)$ into a (projected) direction $\omega^\perp = (u'_0, u'_1)$. The input samples are generated from a base Gaussian distribution with MLP-inferred mean and standard deviation. Each is then warped by an analytically invertible piecewise rational quadratic (RQ) spline. The spline parameters (bin widths, bin heights, derivatives) are inferred by another small MLP.

Figure 5 illustrates our histogram-mixture architecture: the weights $w_k$, angles $\phi_k$, and codes $c_k$ are predicted by a small MLP that takes the condition $(\mathbf{z}, \omega_i)$ as input, and the basis histograms $H_k$ are implicitly encoded in a $(k, c_k)$-dependent MLP. After training, the basis histograms are baked into a $K \times L \times R \times R$ tensor for fast query ($K$ basis histograms with $R \times R$ resolution, $L$ latent-code discretization levels). This scheme allows us to use a larger MLP during training to better fit the pdf, without hurting inference speed, which relies on fast tensor look-ups.

Figure 5: Our histogram-mixture model sends the neural feature vector $\mathbf{z}$ and incoming direction $\omega_i$ to a small MLP to infer a weight, azimuth angle, and latent code for each basis histogram. These are then combined to produce the output histogram. The histogram is encoded by another MLP during training, and is then baked into a 4D tensor for efficient sampling during inference. A more basic approach would use the initial (green) MLP to directly predict a 2D sampling histogram from $\mathbf{z}$ and $\omega_i$; while fast and simple, that approach cannot accurately model the wide variety of lobe shapes across the large space of conditions $(\mathbf{z}, \omega_i)$ for every material.

Figure 6: Our histogram approach can efficiently model continuous changes in the pdf lobe shape with the incoming angle (columns), using the learned rotation and scalar latent code. The change in lobe shape cannot be explained by rotation alone, and the latent code $c$ helps significantly. Different basis histograms (left and right) are necessary to represent different modes of the pdf lobes and/or different spatial locations on the material.
Figures 1, 9 and 10 show equal-time comparisons of all methods on eleven neural materials. The Turtle Shell and Wagon Fine Wood Panelling materials are uniformly lit; for the others we use a more complex environment map to show specular effects. Cowhide Leather and Stylized Melted Metal also show global illumination. The top three materials in Fig. 9 have nearly isotropic highlights everywhere; Victorian Fabric has multiple lobes, while the bottom three and Fig. 10 have anisotropy. The improvement factors for our methods are shown in Table 2.

Figure 7: Comparison of pdf predictions of various importance-sampling methods on four neural materials. Each column shows the predicted pdf of the outgoing direction at a random pair of UV coordinates and incoming direction. Below each result we report the mean squared error (MSE) and KL divergence with respect to the plotted reference pdf.

Figure 9: Equal-time rendering comparison of various neural materials under constant (top row) and environment-map (remaining rows) lighting. Cowhide Leather and Stylized Melted Metal are rendered with global illumination. The first three materials each have one isotropic highlight; Victorian Fabric has multiple lobes; the bottom three have anisotropic highlights. As in Fig. 1, the spp counts for cosine-weighted sampling (1.03×) and the normalizing flow (0.78×) are adjusted on simple scenes to achieve equal render time.

Figure 10: Equal-time rendering comparison of two complex anisotropic materials. The "tabulated" column samples from a near-optimal discretized ground-truth pdf; it takes hours to render and serves only for comparison against the reference.

Figure 11: A more complex scene containing multiple neural materials, rendered with global illumination using our normalizing-flow sampler.

Table 2: Rendering speed-up factors, in terms of the relative runtime the improved baseline (Section 3.1) needs to achieve equal quality. We also include normalizing-flow [Xie et al. 2019] and histogram [Zhu et al. 2021] baselines. For each material we highlight the best improvement factor.

Figure 8: Convergence graphs. Top row: log-log plots of pixel MSE w.r.t. samples per pixel. Bottom row: log-log plots of rendering MSE w.r.t. rendering time. Please refer to the supplementary material for graphs on more neural materials.