LightRidge: An End-to-end Agile Design Framework for Diffractive Optical Neural Networks

To lower the barrier to the design, exploration, and deployment of diffractive optical neural networks (DONNs), we propose LightRidge, the first end-to-end optical ML compilation framework, which consists of (1) precise and differentiable optical physics kernels that enable complete exploration of DONN architectures, (2) optical physics computation kernel acceleration that significantly reduces the runtime cost of training, emulation, and deployment of DONNs, and (3) versatile and flexible optical system modeling and a user-friendly domain-specific language (DSL). As a result, the LightRidge framework enables efficient end-to-end design and deployment of DONNs, and significantly reduces the effort required for programming, hardware-software codesign, and chip integration. Our results are obtained experimentally with physical optical systems, where we demonstrate: (1) optical physics kernels that precisely correlate to low-level physics and systems, (2) significant runtime speedups on physics-aware emulation workloads compared to the state-of-the-art commercial system, (3) effective architectural design space exploration verified by a hardware prototype and an on-chip integration case study, and (4) novel DONN design principles, including successful demonstrations of advanced image classification and image segmentation tasks using DONN architecture and topology.


INTRODUCTION
Deep neural networks (DNNs) have experienced substantial growth in recent years, making significant contributions in many application domains such as autonomous systems, natural language processing, and health care [2,3,11,15,16,29,51]. However, large DNN models that deliver high system throughput usually incur a high carbon footprint. For example, recent studies estimated that training a Transformer network produces 626,000 pounds of planet-warming carbon dioxide, equal to the lifetime emissions of five cars [47,50]. On the other hand, embedded accelerators [4,46,49,52,57,58], which are designed to improve resource and power efficiency, suffer from limited functionality and throughput. Thus, while there has recently been great progress in customized accelerators that balance computing performance with efficiency in hardware architectures and systems, the Pareto frontier of conventional accelerators remains the same [12,19,26,29,43,45].
To advance the Pareto frontier of ML systems, i.e., to offer high computing performance as well as high power efficiency, accelerators that take advantage of optics, namely optical neural networks (ONNs), have recently attracted significant interest in machine learning and hardware acceleration. The main advantages of ONNs over digital accelerators can be summarized as follows: (1) In optical computing systems, since the input features are encoded and carried by light, computation and data movement happen at the speed of light in the medium, with orders of magnitude advantages in computation speed [7,18,22,23,30,33,48,60]. (2) The laser used in optical systems can be easily expanded to multiple channels with passive optical devices such as beam splitters, which means parallel computation can be easily realized with ONN systems and the throughput of the system can be significantly increased [6,34,38,68]. (3) The trained ONN system is deployed with passive optical devices, which means there is no additional energy cost for the all-optical inference process, thus improving energy efficiency significantly [17,22,31,42,48,53,61,64,67]. Diffractive optical neural networks (DONNs) are one of the most promising research areas in ONNs. They mimic the propagation and connectivity properties of conventional neural networks by utilizing the natural physics of light diffraction and phase modulation of coherent light [6,30,31,34,38,44,68]. Even though the inference of the physical DONN is all-optical, the training that leads to its design is done on digital platforms, where a precise, efficient, and hardware-aware emulation engine is required.
The existing optical emulation engines, such as Mathworks BeamLab [56] and LightPipes [55], mainly focus on the emulation of the physical phenomenon but lack the key functionalities and domain-specific runtime optimizations needed to support the development of DONNs. Specifically, it is particularly challenging for existing optical emulation frameworks to deal with DONN training and inference for the following reasons: (1) The core emulation functions are not differentiable, which makes backpropagation-based training hard to implement. (2) The implementation is not optimized for runtime. For example, LightPipes does not support tensor representations and operator fusion, which significantly limits the runtime performance (see Table 1). (3) There is no hardware/device-aware emulation support, so correlating numerical emulations with physical deployments requires significant extra effort.
The critical technical barriers in design, training, exploration, and hardware deployment of DONNs are summarized as follows:
Challenge 1: Sufficient multi-disciplinary domain knowledge in optical physics, fabrication, and machine learning (ML) is required for DONN system design and deployment, which poses a critical technical barrier to exploring and advancing DONN systems in real-world applications. At this point, there is no end-to-end design framework that supports design and exploration for full-stack DONN design, optimization, fabrication, and on-chip integration. Moreover, the broad architectural search space with software, optics, and fabrication hyperparameters can be an obstacle to efficient design space exploration (DSE), which also motivates the development of an end-to-end design framework.
Challenge 2: Significant performance degradation has been observed when deploying the trained DONN model to practical hardware, namely, there is an algorithm-hardware miscorrelation gap between the numerical modeling and the physical system. The miscorrelation gap can come from two aspects: (1) The imprecise numerical modeling of the DONN system, i.e., the lack of a precisely implemented physics emulation intermediate representation (IR). Classic numerical models for fundamental physics kernels in DONNs, such as finite-difference time-domain (FDTD) and scalar diffraction modeling via Fast Fourier Transform (FFT), are both verified to be sufficiently precise in DONN system emulation [37]; (2) The lack of domain-specific hardware-software codesign algorithms to realize quantization-aware hardware deployment and deal with the intrinsic noise (such as fabrication variations, non-uniform optical response, etc.) in optical devices. These challenges have been confirmed by Zhou et al. [68] in Figure 1, who report ≥ 30% accuracy degradation when deploying the trained model to the physical optical system.
Challenge 3: Training and emulation of DONN systems are challenging due to the high computational cost of modeling the optical physics. For example, [34,39,68] reported that training 5-layer DONNs for MNIST-10 with 5 epochs takes 3-4 days (Figure 1). Besides, existing optics simulation frameworks lack both runtime optimization of the physics kernels and domain-specific language (DSL) support. Table 1 summarizes the limitations of existing frameworks for DONN design. More importantly, the choice of numerical physics modeling has a significant impact on runtime efficiency, while it is also required to offer high fidelity to the hardware deployment and fabrication.
Thus, we propose LightRidge, an agile end-to-end framework, aiming to lower the barriers to design, training, design space exploration, and hardware deployment of DONN systems. In particular, LightRidge is implemented with high-performance, precise, and versatile optical physics kernels, which precisely correlate to
• We propose a novel agile physics-aware design framework LightRidge for end-to-end design, exploration, and deployment for DONNs, consisting of versatile and optimized physics modeling kernels and hardware-software codesign algorithms that enable efficient and precise DONN modeling w.r.t. real-world hardware systems (Section 3).
• We propose LightRidge-DSE to accelerate the end-to-end design cycle for DONN design, exploration, and on-chip integration, verified by our physical prototype and on-chip integration case study. Moreover, LightRidge-DSE confirms critical domain-knowledge insights [5] for designing an efficient DONN system in physics meanings (Section 4).
• We experimentally validate the effectiveness and precision of LightRidge in designing practical DONN systems and on-chip integration, via a visible-range DONN prototype and an end-to-end on-chip integration case study (Section 5.1-5.5).
• Furthermore, two novel advanced DONN architecture principles are developed via LightRidge to advance DONNs in complex image classification tasks and the first-ever all-optical image segmentation (Section 5.6).
• Finally, LightRidge will be released as an open-source hardware project.

[Figure: Input Image, Diffract Layer1, Diffract Layer2, Diffract Layer3, Camera]

DIFFRACTIVE OPTICAL NEURAL NETWORKS
Compared to conventional neural networks (NNs) on digital platforms, the information carrier changes from electrons to photons in DONN systems, i.e., instead of manipulating electrons between transistors to realize the computation, DONN systems realize the computation by manipulating the information-carrying light through its physical features. Specifically, the DONN system is composed of multiple diffractive layers stacked in sequence, as shown in Figure 2(a), which embed the phase modulations trained w.r.t. the ML task for manipulating and encoding information on the light signal. The connection between layers is realized by light diffraction as the light signal propagates between layers. Thus, compared with conventional NNs, light diffraction in DONN systems can be considered as "neural operators" for data movement, and phase change patterns can be seen as "weights" for data manipulation. However, the DONN system requires an analog-to-digital converter to read out the prediction results, where a detector is employed at the end of the system to capture the light intensity pattern for analysis and prediction. Thus, DONN systems take advantage of the light signal to encode and propagate information, and of its physical nature to realize the computation. Since the physical phenomenon happens naturally with light propagation, the computation happens at the speed of light with no extra energy cost for all-optical inference. However, the practical computation efficiency of the DONN system is determined by the analog-to-digital conversion. This section presents an overview of DONNs, including emulation, training, and hardware deployment of DONN systems.
First, to get an effective DONN model w.r.t. a specific ML task, the propagation of the light signal is emulated and the model is trained on digital platforms based on these optical emulations. This requires a precise mathematical approximation of the optical phenomena, i.e., light diffraction and phase modulation, which is illustrated in detail in Section 3.1. Each point at a given diffractive layer acts as a secondary source of the input light wave in accordance with the Huygens-Fresnel principle. The phase of the input wave is determined by the product of the input wave and the complex-valued phase modulation at that point. The diffraction space is required to generate the diffraction pattern at the receiving plane. The phase modulation at each point w.r.t. its location at the layer is the learnable parameter, which is iteratively adjusted during the training process with the error back-propagation method [34,68]. The physical kernel implemented in LightRidge for DONN emulation and training is constructed with widely used, precise, and efficient mathematical approximations of the scalar diffraction formulas. Finally, the trained model is physically deployed with optical devices, as shown in Figure 2(b), or with on-chip integration systems, as shown in Figure 11, to realize fully optical inference with low energy cost, high computation speed, and high system throughput.

DONN Emulation and Training
Enabling precise, hardware-software-codesign-aware emulation of the physical phenomena in DONN systems, including input encoding, light diffraction, phase modulation, and detector reading, is critical for the practical realization of DONN systems. There are mainly two mathematical methods for formulating light diffraction: (1) The finite-difference time-domain (FDTD) method [63], which performs a full-vector differentiable numerical simulation of photonic structures by solving Maxwell's equations directly without physical approximations. It is a sophisticated and powerful method for light propagation emulation, but it suffers from heavy computation effort and heavy data dependency, which prevent parallelism in kernel development. Specifically, FDTD requires the entire computational domain to be sufficiently finely gridded, which means the DONN system size will be expanded exponentially in FDTD-based emulation. Since DONN systems target large-scale machine learning tasks, FDTD-based emulation is infeasible in computation runtime and memory for DONN systems due to the system scalability. (2) The Fast Fourier Transform (FFT) method [21], which performs a mathematical approximation based on scalar diffraction theory. It simplifies the computation with scenario-specific approximations while keeping the emulation sufficiently precise. There are three widely used approximations for light diffraction in different application scenarios, i.e., the Rayleigh-Sommerfeld approximation, the Fresnel approximation, and the Fraunhofer approximation. While both FDTD and FFT-based approximations are differentiable, FFT-based scalar diffraction modeling is more capable for large-scale DONN emulation, since it does not require size expansion for a finely gridded computational domain. More importantly, [37] and our physical experiments in Section 5 verify that the FFT-based approximations are sufficiently precise to close the codesign gap for DONN system emulation. Therefore, we implement the FFT-based physics kernel in LightRidge as IR to provide precise and efficient DONN emulation and training (Section 3.1.1). The phase modulation is applied to the input light wave by complex-valued matrix multiplication, as illustrated in Section 3.1.2.
In our framework, the FFT-based mathematical emulation of light diffraction is designed to be fully differentiable from the detector to the laser source w.r.t. the loss function acquired from the diffraction pattern captured at the detector. Specifically, during the training process, the prediction is generated according to the intensity of the diffraction pattern captured on the detector with pre-defined detector regions for different classes, where the light intensity $I$ collected by each detector region mimics the probability of the output prediction after Softmax in conventional DNNs. Thus, the class whose corresponding detector region collects the highest light intensity is selected as the final prediction. With the one-hot represented ground truth class $t$, the loss function $\mathcal{L}$ is acquired with the MSELoss between the predictions $\mathrm{Softmax}(I)$ and the one-hot represented ground truth labels $t$, i.e., $\mathcal{L} = \lVert \mathrm{Softmax}(I) - t \rVert^{2}$. Thus, the whole system is differentiable and compatible with conventional automatic differentiation engines.
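The loss described above can be illustrated with a minimal NumPy sketch. It assumes that the per-region intensities have already been collected into a vector; the function names and the example intensities are hypothetical and are not part of LightRidge.

```python
import numpy as np

def softmax(v):
    """Numerically stable softmax over the last axis."""
    e = np.exp(v - np.max(v, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

def donn_loss(region_intensity, label, num_classes):
    """Squared error between Softmax(I) and the one-hot label t.

    Returns the loss and the predicted class, i.e. the detector
    region that collected the highest intensity.
    """
    t = np.zeros(num_classes)
    t[label] = 1.0
    p = softmax(region_intensity)
    loss = float(np.sum((p - t) ** 2))
    prediction = int(np.argmax(region_intensity))
    return loss, prediction

# Hypothetical intensities collected by four detector regions.
I = np.array([0.2, 1.5, 0.1, 0.3])
loss, pred = donn_loss(I, label=1, num_classes=4)
```

In a real training loop the same expression would be evaluated inside an automatic differentiation engine so that gradients flow back to the phase parameters.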

Hardware Deployment
The devices for the physical hardware that deploys the trained DONN model need to be carefully selected, as optical devices made from different materials can have significantly different optical responses to different laser wavelengths. For example, spatial light modulators (SLMs) can function as diffractive layers in a DONN system with a laser wavelength in the visible range, while for systems with a laser wavelength in the Terahertz (THz) range, SLMs cannot provide efficient phase modulation of the light signal. Instead, 3D printed masks made with UV-curable resin, with a designed thickness at each pixel, are used as the diffractive layers in THz optical systems [34].
In our experimental hardware systems shown in Figure 2(b), the wavelength of the laser source is 532 nm, which is in the operating range of the SLM 3. Specifically, the SLM is an array of twisted nematic liquid crystals, where each pixel (liquid crystal) can be independently twisted to different angles by different applied control voltages, providing different phase modulation for the input light beam. However, such analog optical devices rarely have a uniform optical response to the control and can vary from unit to unit due to fabrication errors. This worsens the correlation between the numerical emulations and the hardware deployment, which highlights the importance of designing precise computation kernels for emulation and hardware-software codesign algorithms for DONN systems.

LIGHTRIDGE FRAMEWORK
Figure 3 shows the end-to-end design flow of DONN systems with the automation provided by LightRidge. With the user-defined design specification and the targeted ML task, (1) the architectural and fabrication parameters, such as diffraction distance, diffraction unit size, chip dimensions, etc., are selected and produced automatically by conducting fast and efficient design space exploration (DSE) with the emulation model in LightRidge, which circumvents the critical domain knowledge requirements for designing a functioning DONN model (Section 4). This exploration is enabled by our accelerated and precise emulation engine, which improves the runtime efficiency significantly. (2) When satisfactory hyperparameters are acquired from the fast DSE, the emulation model is updated with the hardware information for physical deployment, e.g., the optical response curve of the SLMs w.r.t. the control voltages, and the emulation model is further trained with codesign algorithms with hardware-aware optimizations. (3) Optical devices for practical deployment are fabricated/set w.r.t. the parameters in the trained model, i.e., the phase modulations in the diffractive layers. The device fabrication information is dumped and generated automatically by LightRidge. (4) With all components ready for deployment, a targeted all-optical DONN system can be set up for efficient and energy-saving all-optical inference (Section 5). (5) Moreover, the LightRidge automation processes are all efficiently realized by the user-friendly DSL support in LightRidge.
3 https://holoeye.com/lc-2012-spatial-light-modulator/
In this section, we will introduce the LightRidge framework including the physics kernel with mathematical approximation modelling for DONN systems implemented in LightRidge, a novel complex-valued regularization algorithm to improve the training performance, and the front-end DSL designed for the LightRidge compilation implementations.

Physics Kernel Implementation
The DONN system functions as a neural network based on two physical phenomena, i.e., light diffraction and phase modulation.In our framework, we take FFT-based scalar diffraction theory to build our modelling kernels.
First, the continuous-wave (CW) laser source is deployed to encode the input information. The light wave is described in physics with complex-valued numbers with two properties, the amplitude and the phase of the wave, i.e., $U = A e^{j\theta}$, where $j = \sqrt{-1}$, $A$ is the amplitude, and $\theta$ is the phase. The input information is encoded in the intensity of the light wave with the phase initialized as 0, i.e., $\theta = 0$ and $U = A$. Then, as shown in Figure 4, the information-carrying light wave is diffracted over the diffraction distance $z$, emulated with the mathematical diffraction approximations described in Section 3.1.1. At the diffractive layer, each diffraction unit embeds a phase modulator, where the trainable parameter, the phase modulation, is applied to the light signal as described in Section 3.1.2. The forward function for a DONN system constructed from multiple layers calculates diffraction and phase modulation iteratively through all stacked diffractive layers. Finally, the diffraction pattern, i.e., the distribution of light intensity, is captured at the detector and converted to digitally processable information for computer processing.
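As an illustration of the input encoding step, the following NumPy sketch turns a real-valued image into a complex-valued wavefield with zero initial phase. It assumes the pixel values are used directly as the amplitude; the function name `encode_input` is hypothetical and is not the LightRidge API.

```python
import numpy as np

def encode_input(image):
    """Encode a real-valued image as a complex field with zero phase.

    Assumption: pixel values are used as the amplitude of the wave.
    """
    amplitude = np.asarray(image, dtype=np.float64)
    phase = np.zeros_like(amplitude)          # phase initialized to 0
    return amplitude * np.exp(1j * phase)     # complex-valued wavefield

field = encode_input([[0.0, 0.5], [1.0, 0.25]])
intensity = np.abs(field) ** 2                # what a detector would measure
```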

3.1.1 Light diffraction approximation.
There are typically three mathematical approximation methods for scalar theory of diffraction, i.e., Rayleigh-Sommerfeld approximation, Fresnel approximation, and Fraunhofer approximation.They work under specific application scenarios with different assumptions of the system, such as aperture size and propagation distance.
The Rayleigh-Sommerfeld approximation is the most commonly used, as it makes the fewest physical approximations of the system and is reported to give quite accurate results. The Rayleigh-Sommerfeld approximation is implemented with Equation 1 in our framework. As shown in Figure 4, the emission plane is the $(\xi, \eta)$ plane, which is illuminated in the positive $z$ direction. We calculate the wavefield across the $(x, y)$ plane, which is parallel to the $(\xi, \eta)$ plane and at distance $z$ from it. The $z$ axis pierces both planes at their origins. Then, when $r_{01} \gg \lambda$, the Rayleigh-Sommerfeld approximation is described as

$$U(x, y, z) = \frac{z}{j\lambda} \iint U(\xi, \eta, 0)\, \frac{\exp(jk r_{01})}{r_{01}^{2}}\, d\xi\, d\eta \tag{1}$$

where $j = \sqrt{-1}$, $U(x, y, z)$ describes the wavefield on the target $(x, y)$ plane after diffraction distance $z$, $U(\xi, \eta, 0)$ describes the wavefield on the emission $(\xi, \eta)$ plane, $\lambda$ is the wavelength of the input laser, $k$ is the wave number with $k = 2\pi/\lambda$, $\mathbf{r}_{01}$ is the vector pointing from $P_1$ to $P_0$, and the distance $r_{01}$ is given by

$$r_{01} = \sqrt{z^{2} + (x - \xi)^{2} + (y - \eta)^{2}} \tag{2}$$

When the diffraction angle shown in Figure 4 is small enough, the computation complexity can be further reduced by applying conditions to the application scenarios while maintaining the emulation accuracy. In the Fresnel approximation, by simplifying $r_{01}$ with the binomial expansion of the square root in Equation 2 and eliminating all terms but $z$ in the $r_{01}^{2}$ appearing in the denominator of Equation 1, the diffraction is described as

$$U(x, y, z) = \frac{e^{jkz}}{j\lambda z} \iint U(\xi, \eta, 0)\, \exp\left\{ \frac{jk}{2z} \left[ (x - \xi)^{2} + (y - \eta)^{2} \right] \right\} d\xi\, d\eta \tag{3}$$

In the Fresnel approximation, the critical approximation is that of the exponent, where the spherical secondary wavelets are replaced by wavelets with parabolic wavefronts. Thus, the condition on the distance $z$ is $z^{3} \gg \frac{\pi}{4\lambda} \left[ (x - \xi)^{2} + (y - \eta)^{2} \right]^{2}_{\max}$, i.e., the observer (the $(x, y)$ plane) is in the near field of the aperture. Furthermore, when $z \gg \frac{k (\xi^{2} + \eta^{2})_{\max}}{2}$ is satisfied, which means the quadratic phase factor under the integral sign in Equation 3 is approximately unity over the entire aperture, the Fraunhofer approximation further greatly simplifies the calculations. Thus, in the far field of the aperture, the diffraction can be approximated as

$$U(x, y, z) = \frac{e^{jkz}\, e^{\frac{jk}{2z}(x^{2} + y^{2})}}{j\lambda z} \iint U(\xi, \eta, 0)\, \exp\left[ -j \frac{2\pi}{\lambda z} (x\xi + y\eta) \right] d\xi\, d\eta \tag{4}$$

Thus, the diffraction process can be formulated more generally: when an input wave resulting from the $(l-1)$-th layer, $U^{l-1}(\xi, \eta, 0)$, diffracts over diffraction distance $z$ to the $l$-th layer, the resulting wavefield $U_{1}^{l}(x, y, z)$ in the spatial domain is described as

$$U_{1}^{l}(x, y, z) = \iint U^{l-1}(\xi, \eta, 0)\, h(x - \xi, y - \eta, z)\, d\xi\, d\eta \tag{5}$$

where $h$ is the diffraction function of free space. It can be calculated with a spectral algorithm based on the Fast Fourier Transform (FFT) for fast and differentiable computation. By the convolution theorem, with the Fourier transforms

$$\mathcal{U}^{l-1}(f_x, f_y, z) = \mathcal{F}\{U^{l-1}\}, \qquad H(f_x, f_y, z) = \mathcal{F}\{h\} \tag{6}$$

the integral can be calculated with

$$\mathcal{U}^{l}(f_x, f_y, z) = \mathcal{U}^{l-1}(f_x, f_y, z)\, H(f_x, f_y, z) \tag{7}$$

Then, the multiplication result $\mathcal{U}^{l}(f_x, f_y, z)$ is transformed back to the spatial domain as $U_{2}^{l}(x, y, z)$ by the inverse Fast Fourier Transform (iFFT), which is the input wavefunction for applying the phase modulation.
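The FFT-based evaluation of the convolution can be sketched in NumPy as follows. This is a minimal illustration rather than the LightRidge kernel: it assumes the standard textbook Fresnel transfer function $H(f_x, f_y) = e^{jkz} \exp[-j\pi\lambda z (f_x^2 + f_y^2)]$, and the function names, grid size, pixel pitch, and distance are hypothetical values chosen only for the example.

```python
import numpy as np

def fresnel_transfer_function(n, pixel_size, wavelength, distance):
    """Fresnel transfer function H(fx, fy) sampled on an n x n FFT grid."""
    k = 2.0 * np.pi / wavelength
    f = np.fft.fftfreq(n, d=pixel_size)           # spatial frequencies (1/m)
    fx, fy = np.meshgrid(f, f, indexing="xy")
    return np.exp(1j * k * distance) * np.exp(
        -1j * np.pi * wavelength * distance * (fx ** 2 + fy ** 2)
    )

def diffract(field, H):
    """Free-space propagation: FFT, multiply by H, inverse FFT."""
    return np.fft.ifft2(np.fft.fft2(field) * H)

# Hypothetical setup: square aperture on a 64 x 64 grid, 532 nm laser.
n = 64
field = np.zeros((n, n), dtype=np.complex128)
field[24:40, 24:40] = 1.0
H = fresnel_transfer_function(n, pixel_size=36e-6, wavelength=532e-9, distance=0.3)
out = diffract(field, H)
```

Because the transfer function has unit magnitude, the total intensity is preserved by the propagation, which is a convenient sanity check for any implementation of this step.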
3.1.2 Phase modulation. The phase modulation functions like the weight parameters in conventional neural networks and is updated iteratively during the training process. Specifically, the input wave $U_{2}^{l}(x, y)$ (for simplicity, we discard $z$ in the phase computation representation, as $z$ is not involved) can be described by its amplitude and phase. By Euler's formula, it can be described with a complex-valued number in the spatial domain, i.e.,

$$U_{2}^{l}(x, y) = A e^{j\theta} = A\cos\theta + jA\sin\theta \tag{8}$$

where $j = \sqrt{-1}$, $A$ is the amplitude, and $\theta$ is the phase of the input wave; $A\cos\theta$ is the real part and $A\sin\theta$ is the imaginary part. After applying the phase modulation $\phi(x, y)$, the wave function is modulated as

$$U^{l}(x, y) = U_{2}^{l}(x, y)\, e^{j\phi(x, y)} = A\cos(\theta + \phi) + jA\sin(\theta + \phi) \tag{9}$$

which can be realized with complex-valued matrix multiplications. $U^{l}(x, y)$ is the input wavefunction for the forward function (Equation 5) for the $(l+1)$-th diffractive layer.
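The phase modulation and the layer-by-layer forward pass can be sketched as below. This is a hedged illustration, not the LightRidge implementation: the names `modulate` and `forward` are hypothetical, the same transfer function `H` is assumed for every propagation step, and the modulation is applied as an element-wise complex multiplication.

```python
import numpy as np

def modulate(field, phase):
    """Apply a phase modulation: multiply element-wise by exp(j * phase)."""
    return field * np.exp(1j * phase)

def forward(field, H, phases):
    """Alternate diffraction and phase modulation through stacked layers.

    H is a precomputed transfer function (assumed identical for each hop);
    phases is a list with one phase map per diffractive layer.
    """
    for phase in phases:
        field = np.fft.ifft2(np.fft.fft2(field) * H)   # diffract to the layer
        field = modulate(field, phase)                  # trainable phase mask
    field = np.fft.ifft2(np.fft.fft2(field) * H)       # last layer to detector
    return np.abs(field) ** 2                           # detector sees intensity

# A single sample with amplitude 2 and phase 0.3, modulated by 0.4 rad.
sample = np.array([[2.0 * np.exp(1j * 0.3)]])
modulated = modulate(sample, np.array([[0.4]]))
```

The modulation only shifts the phase, so the amplitude of `modulated` stays at 2 while its phase becomes 0.7 rad.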

Codesign Algorithm with Physics-aware Complex-valued Regularization
First, for the model emulation and training process on digital platforms, considering the physics in optics, the DONN system is described and emulated with complex-valued numbers. However, according to Equation 9, the training of the DONN system is dominated by the phase modulation, while the intensity at the end of diffraction decreases exponentially as the number of diffractive layers increases, which means a regularization between amplitude and phase is required to avoid gradient vanishing and explosion in the training process. With this insight, we introduce a novel regularization factor $\gamma$ in the forward function to improve the training efficiency, which can flexibly change the gradient scales between amplitude and phase modulations. Specifically, $\gamma$ is applied to the amplitude vector $A$ in Equation 9.
Furthermore, our framework integrates the physics-aware codesign algorithm presented in [30] for efficient hardware deployment of the trained DONN model. Specifically, the framework takes as input the vector of experimentally measured optical responses of arbitrary optical hardware (e.g., the calibrated optical response of an SLM shown in step 2 of Figure 3), which is discrete and can have different numbers of available valid optical response levels. However, for optical devices, the number of available levels is usually too limited to fit an accurate function curve. With the implemented algorithm, the discrete and level-limited hardware-aware vector is formulated with Gumbel-Softmax [25] for differentiable training, which maps the training parameters directly to the available hardware levels during the training process, i.e., quantization-aware training without quantization approximations. This saves the manual calibration effort and improves the end-to-end DONN deployment efficiency, as shown in Figure 1 and Figure 6.
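To make the Gumbel-Softmax formulation concrete, the sketch below maps per-unit logits onto a small set of discrete phase levels. It is an assumption-laden illustration and not the algorithm of [30] as implemented in LightRidge: the level values, temperature, array shapes, and function name are all hypothetical, and a real implementation would run inside an automatic differentiation engine.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Relaxed one-hot sample over the last axis (Gumbel-Softmax)."""
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))                        # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = y - y.max(axis=-1, keepdims=True)          # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
# Hypothetical measured phase responses of a 4-level device (radians).
levels = np.array([0.0, 0.5 * np.pi, np.pi, 1.5 * np.pi])
# One trainable logit vector per diffraction unit of a 3 x 3 layer.
logits = rng.normal(size=(3, 3, levels.size))

weights = gumbel_softmax(logits, tau=0.5, rng=rng)
soft_phase = weights @ levels                      # differentiable surrogate
hard_phase = levels[np.argmax(weights, axis=-1)]   # level used on the device
```

Lower temperatures push `weights` toward one-hot vectors, so the soft phase used during training approaches a phase that the device can actually realize.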

LightRidge Framework
The LightRidge framework (Table 2) consists of four major components that simplify and accelerate the design, exploration, and deployment of the DONN system: a) versatile programming modules for precise physics modeling, b) domain-specific neural architecture modules of DONNs, c) accelerated physics kernels for training and inference runtime improvements, and d) hardware deployment support.
Low-level physics modeling - Three components are required to design a DONN model: the laser source, the diffractive layers, and the optical/photon detector. To model the whole physical phenomenon of DONNs, we first introduce the mathematical modeling modules for the implementation of DONN systems: (1) Various laser source models with flexible wavelength settings and beam profiles. (2) Precise light diffraction approximations, which fall into three categories: Rayleigh-Sommerfeld, which handles both far and near fields but has the highest computational complexity (Equation 1); Fresnel, which approximates the propagation with parabolic wavefronts, namely near-field propagation (Equation 3); and Fraunhofer, implemented with Equation 4, which approximates the propagation with planar wavefronts in the far field [54]. (3) The optical/photon detector, which digitizes the analog light intensity to make it processable by the computer.
Model-level APIs - The DONN model is constructed with flexible model-level modules in LightRidge, where the architectural parameters can be used to customize the system: (1) The laser source module lr.laser offers precise laser customization, including laser specifications such as wavelength, src_profile, etc. (2) The physics modeling of diffraction with trainable phase modulation is implemented in lr.layers. Two diffraction models, with and without hardware-aware optimization, are provided by lr.layers.diffractlayer and lr.layers.diffractlayer_raw, respectively. Specifically, to deal with Challenge 2 in Section 1, lr.layers.diffractlayer employs the codesign algorithm, where the device-level information is integrated into the training process with the quantization methods in [30] applied to the trainable parameters in the diffractive layers for efficient modeling-to-hardware deployment. Both modules can switch among the three diffraction approximation algorithms according to the user definition. Additionally, user-defined system hyperparameters, such as the size of the diffraction unit (pixel_size), the diffraction distance (distance), and the available levels of the hardware implementing the diffractive layers (level), can also be customized easily with our framework. (3) The detector is employed to capture the light intensity after propagation and modulation through the system, and is the interface component that links training loss construction and the DONN model emulation. In lr.layers.detector, x_loc and y_loc are lists of spatial coordinates of the detector, and the size of the detector regions is customized by det_size. (4) Finally, lr.models is a sequential container that stacks arbitrary numbers of customized diffractive layers, in the order of light propagation in the DONN system, and a detector plane. As a result, we construct a complete DONN system just like constructing a conventional neural network.

[Table 2, recoverable entries. Low-level physics modeling: Optical/photon detector - a photon detector that captures the light intensity and converts the analog intensity information to digital, computer-processable information. Model-level APIs: lr.laser - defines the laser source for the system, including the laser wavelength and its beam profile.]
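The role of the detector regions can be illustrated with a small NumPy sketch. It is not the lr.layers.detector implementation: the helper name `read_detector` is hypothetical, and it assumes that x_loc and y_loc give the top-left pixel coordinates of square regions of side det_size, which may differ from the actual semantics of these parameters in LightRidge.

```python
import numpy as np

def read_detector(intensity, x_loc, y_loc, det_size):
    """Sum the intensity inside each square detector region.

    Assumption: (x_loc[i], y_loc[i]) is the top-left pixel of region i,
    with x indexing columns and y indexing rows.
    """
    return np.array([
        intensity[y:y + det_size, x:x + det_size].sum()
        for x, y in zip(x_loc, y_loc)
    ])

# Hypothetical 20 x 20 intensity pattern with one bright 3 x 3 spot.
intensity = np.zeros((20, 20))
intensity[12:15, 3:6] = 1.0

readings = read_detector(intensity, x_loc=[2, 10, 2], y_loc=[2, 2, 11], det_size=5)
predicted_class = int(np.argmax(readings))
```

The vector of readings is what the training loss consumes, and the index of the brightest region is the predicted class.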
Training support - The DONN model is trained with conventional automatic differentiation engines in the complex domain, which is supported by our differentiable physics kernels and training utility functions. Specifically, the original one-dimensional input is converted to a complex-valued input by initializing the phase information in data_to_cplex. Training parameters such as the optimizer, the complex-valued regularization regu_factor, the loss function loss, etc., are also enabled in the complex domain by lr.train.utils. CPU and GPU acceleration is enabled by to(device). Finally, lr.train.dse enables physics-aware DSE for DONN design and integration (Section 4).
Hardware deployment - The visualization of trained model parameters is provided by lr.layers.view(). To practically deploy the digitally trained model to hardware, quantization to the specific hardware (post-training quantization) is provided by lr.model.to_system. For example, for SLM-based DONN systems, the framework produces the trained control voltage array applied to each SLM for light signal manipulation. For THz systems, which are implemented with 3D printed phase masks, the framework produces the thickness array for mask fabrication by calling lr.model.to_system.
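The two deployment outputs mentioned above can be sketched as follows. This is only an illustration of the idea, not the behavior of lr.model.to_system: both helper names are hypothetical, the nearest-level rule is an assumed form of post-training quantization, and the thickness conversion assumes the common relation $\phi = 2\pi (n - 1) h / \lambda$ for a mask of refractive index $n$ in air.

```python
import numpy as np

def quantize_to_levels(phase, level_phases):
    """Index of the nearest available phase level, using circular distance."""
    diff = np.angle(np.exp(1j * (phase[..., None] - level_phases)))
    return np.argmin(np.abs(diff), axis=-1)

def phase_to_thickness(phase, wavelength, refractive_index):
    """Mask thickness h that yields the given phase delay relative to air.

    Assumes phi = 2 * pi * (n - 1) * h / wavelength.
    """
    wrapped = np.mod(phase, 2.0 * np.pi)
    return wrapped * wavelength / (2.0 * np.pi * (refractive_index - 1.0))

# Hypothetical 4-level device and three trained phase values (radians).
level_phases = np.array([0.0, 0.5 * np.pi, np.pi, 1.5 * np.pi])
trained_phase = np.array([0.1, 3.0, 6.2])
level_index = quantize_to_levels(trained_phase, level_phases)

# Hypothetical THz mask: 0.75 mm wavelength, refractive index 1.7.
thickness = phase_to_thickness(np.pi, wavelength=0.75e-3, refractive_index=1.7)
```

For an SLM, `level_index` would then be looked up in the calibrated table of control voltages; for a printed mask, the thickness array is the fabrication input.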

DESIGN SPACE EXPLORATION
Taking advantage of LightRidge, we introduce the first explicit architectural design space exploration (DSE) engine for DONNs, namely LightRidge-DSE. As discussed earlier, domain knowledge of optics and optical hardware is a critical technical barrier to designing DONNs. There is therefore a great need for automatic DSE in LightRidge, which significantly shortens the design and hardware deployment cycle of DONNs and lowers the optical domain-knowledge requirements. We propose an analytical-model-based DSE approach to accelerate the DSE process, where the analytical model is extracted from an ML regression model. The main goal of the DSE engine is to use knowledge learnt from existing setups to guide the design of DONN systems under new design parameters with fabrication and chip integration requirements (e.g., fabrication technologies, chip dimension, etc.).

Design space of DONNs - We consider the DONN design space from two aspects. (1) The major physical architectural design parameters of DONNs: (i) the diffraction unit size (the dimension of each diffractive unit), and (ii) the diffraction distance, i.e., the physical distance from the source to the first diffractive layer, from layer to layer, and from the last layer to the detector (Figure 2). These two are critical architectural parameters under a fixed laser profile (wavelength). (2) The spatial architectural parameters of DONNs: (iii) system size (or system resolution) paired with (iv) hardware/device precision, i.e., the discrete phase modulation levels provided by the device, which are sensitive parameters w.r.t the performance of ML tasks. We take the physical architectural DSE as an example in this section.

DSE features and data collection - In our case, we show the process of conducting the DSE with the physical architectural design parameters, i.e., the diffraction unit size and the diffraction distance, for DONN systems under different laser wavelengths λ. The analytical model produced by DSE can generalize the learnt optical formula to DONN systems with a new laser wavelength while following the traditional maximum half-cone diffraction angle theory [5], i.e., the analytical model should be applied only to a nearby wavelength within the range in which the theory holds for the training data. In our DSE example (Figure 5), we use the analytical model trained on 432 nm and 632 nm data for predictions at 532 nm. However, an analytical model trained with wavelengths in the visible range will not work for predictions at wavelengths in other ranges, such as infrared (IR) and microwave, because the theory is violated.

Sensitivity analysis - We perform single-parameter control-variable tests for all three parameters in Table 3. Our results show that the diffraction unit size is the most sensitive parameter, while wavelength and distance are almost equally sensitive, w.r.t accuracy on the image classification task with the MNIST dataset. By shifting the best parameters found by DSE (the star point in Figure 5(d)) by ±5% or ±10%, we observe sharp accuracy drops when shifting the unit size (accuracy drops to 30% with a shift of only ±5%), and smaller accuracy drops for the other two parameters (accuracy drops to ∼70% with a ±5% shift).
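The single-parameter shifts used in the sensitivity analysis can be sketched as follows. This is a minimal illustration, not LightRidge code: the helper `one_at_a_time` is hypothetical, and the baseline values (532 nm, 36 um, 0.3 m) are borrowed from the prototype description and assumed here to stand in for the star point.

```python
def one_at_a_time(best, shifts=(-0.10, -0.05, 0.05, 0.10)):
    """Yield (name, shift, params) with exactly one parameter scaled by (1 + shift)."""
    for name in best:
        for shift in shifts:
            trial = dict(best)  # copy so that all other parameters stay fixed
            trial[name] = best[name] * (1.0 + shift)
            yield name, shift, trial


# Assumed baseline setup (illustrative values, in metres).
best = {"wavelength_m": 532e-9, "unit_size_m": 36e-6, "distance_m": 0.30}

for name, shift, params in one_at_a_time(best):
    # Each `params` would be handed to one emulation run to record its accuracy.
    print(f"{name:>13s} {shift:+.0%} -> {params[name]:.4e}")
```

Three parameters with four shifts each give twelve emulation runs, one accuracy value per run.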
With the guidance from the analytical model, LightRidge-DSE finds the best architecture dimension and training parameters with several emulation iterations for selected candidate parameters instead of sweeping through the grid-based search space. For example, in our case shown in Figure 5, aided by the analytical model, only a few emulation iterations (e.g., two emulations) are required for DSE instead of grid-searching over 121 data points, resulting in a 60× speedup. In addition, the DSE engine is able to provide general design parameters for similar types of ML tasks. For example, the DSE model for image classification trained with the MNIST dataset is also confirmed to be applicable to other MNIST-like datasets such as FashionMNIST [59], Kuzushiji-MNIST [8], and Extension-MNIST-Letters [9] [32].
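The idea of emulating only the candidates that the analytical model ranks highest can be sketched as below. Everything here is a toy stand-in: `surrogate` and `emulate` are hypothetical functions, and the 11 × 11 grid is only one assumed way to obtain 121 design points.

```python
import itertools


def guided_search(candidates, surrogate, emulate, budget=2):
    """Emulate only the `budget` candidates ranked best by the surrogate model."""
    ranked = sorted(candidates, key=surrogate, reverse=True)
    evaluated = [(emulate(c), c) for c in ranked[:budget]]
    best_score, best_point = max(evaluated)
    return best_score, best_point, len(evaluated)


# Assumed 11 x 11 grid: unit size in um, distance in cm (illustrative units).
unit_sizes_um = [26 + 2 * i for i in range(11)]
distances_cm = [20 + 2 * i for i in range(11)]
grid = list(itertools.product(unit_sizes_um, distances_cm))


def surrogate(point):
    """Toy analytical model: prefers points near (36 um, 30 cm)."""
    return -((point[0] - 36) ** 2 + (point[1] - 30) ** 2)


def emulate(point):
    """Toy emulator returning a fake accuracy; a real run would train a DONN."""
    return 0.9 - 0.001 * ((point[0] - 36) ** 2 + (point[1] - 30) ** 2)


score, point, runs = guided_search(grid, surrogate, emulate, budget=2)
print(point, runs, f"{len(grid) / runs:.1f}x fewer emulations")
```

With two emulations instead of 121, the reduction in emulation count is about 60×, which matches the order of the speedup reported above.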

EVALUATION
In this section, we first demonstrate that LightRidge and LightRidge-DSE offer precise hardware-software correlations w.r.t real-world DONNs system realization (Figure 6) via a visible range DONN prototype. Second, we demonstrate the effectiveness of the LightRidge framework over SOTA experimental baselines [34,68] in training performance and emulation runtime (Figures 7-8). Finally, we demonstrate that LightRidge and LightRidge-DSE enable comprehensive DONN system on-chip integration (Figure 11) and the design of advanced DONN architectures, including a multi-channel DONNs classifier on the Place365 [65] dataset (Figure 12) and the all-optical image segmentation architecture (Figure 13).
Note that experiments in Section 5.1 are physically deployed on the optical hardware shown in Figure 6a, while other results are from emulations with LightRidge.

To make the input easier for hardware deployment, we train and validate the model with binarized MNIST images as shown in Figure 6a, where the trained phase modulation parameters are loaded on the SLMs. The resulting detector patterns for the inputs are shown in Figure 6b. The SLM used to encode input binary images is illuminated by the laser source, and the input information is encoded on the intensity of the input light signal. The intermediate propagation results in all-optical DONN inference are not available, as the information is carried with the light beam. At the end of the system, a detector performs analog-to-digital conversion to capture the diffraction pattern, i.e., the light intensity distribution, for model analysis and predictions. As shown in Figure 6b, DONNs emulation results in LightRidge precisely match the experimental measurements, which demonstrates (1) precise correlations between the implemented high-level modeling and the low-level physical experimental system, which improves the design efficiency significantly without the manual HW calibration requirements shown in Figure 1; and (2) the effectiveness of LightRidge-DSE in exploring architecture parameters, which has been further utilized for on-chip integration (Section 5.5).

Emulation-level Evaluation
We further verify the design parameters from the DSE model, as discussed in Section 5.1, at the emulation level. The accuracy results for image classification with the MNIST [28] and FashionMNIST (FMNIST) [59] datasets are shown in Figure 7, where the baseline results are obtained with the training methods in [34], [68] without the proposed physics-aware complex-valued regularization. The inputs are encoded with the amplitude of the laser beam. To make the input fit the DONN system, we first extend the images from their original size of 28 × 28 in the MNIST10 and FMNIST datasets to 200 × 200 in SLM resolution, and convert the original one-dimensional image to a complex-valued image in the emulation. With the regularization factor implemented, our training algorithm has a significant advantage in training less complex DONN models. For example, when the DONN model is implemented with only one diffractive layer (D=1), accuracy is improved by 31% (34%) for the MNIST (FMNIST) dataset compared with the baseline. Additionally, our algorithm can achieve similar accuracy (0.98 for MNIST, 0.89 for FMNIST) for DONN systems regardless of their complexity, i.e., the number of diffractive layers implemented in the system, by adjusting the regularization factor for the model training. However, according to the discussion in [34], the performance of DONNs with fewer layers is fundamentally limited by the optical physics, which appears to contradict our accuracy results.
To understand the increase in accuracy, we analyze the robustness of the DONNs trained with complex-valued regularization. Specifically, we explore the confidence of the predictions produced by the system by adding random uniform noise at the detector, with upper bounds of 1%, 3%, and 5% intensity noise. As a result, for both datasets, the prediction confidence increases as the depth of DONNs increases, while the prediction accuracy with no noise applied is roughly the same for all depths. For example, there is no accuracy degradation on five-layer DONNs for MNIST, and less than 1% degradation on FMNIST with up to 5% applied noise. However, for single-layer DONNs, the accuracy drops 63% for MNIST and 54% for FMNIST with 1% noise applied, and drops to 0 when the applied noise increases to 3% and 5%.
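The link between prediction confidence and noise robustness can be illustrated with a small sketch. The noise model is an assumption (additive uniform noise between 0 and a fraction `bound` of the brightest detector region), and the two readouts are made-up examples rather than measured data.

```python
import random


def predict(intensities):
    """Predicted class = index of the brightest detector region."""
    return max(range(len(intensities)), key=intensities.__getitem__)


def add_detector_noise(intensities, bound, rng):
    """Add uniform noise in [0, bound * max intensity] to every region (assumed model)."""
    scale = max(intensities)
    return [v + rng.uniform(0.0, bound * scale) for v in intensities]


def noisy_accuracy(readouts, labels, bound, trials=100, seed=0):
    """Fraction of correct predictions over repeated noisy readouts."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        for readout, label in zip(readouts, labels):
            hits += predict(add_detector_noise(readout, bound, rng)) == label
    return hits / (trials * len(readouts))


# Hypothetical readouts for class 3: a wide margin versus a 1% margin.
confident = [0.1] * 10
confident[3] = 1.0
marginal = [0.5] * 10
marginal[3] = 0.505

for bound in (0.0, 0.01, 0.03, 0.05):
    print(bound,
          noisy_accuracy([confident], [3], bound),
          noisy_accuracy([marginal], [3], bound))
```

Both readouts are classified correctly without noise, but only the wide-margin readout is guaranteed to survive 5% noise, which is the behavior attributed to deeper DONNs above.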

LightRidge Runtime Evaluation
Runtime efficiency of emulating DONNs is crucial in simulation, training, and exploration. Thus, optimizing runtime performance is another key contribution of the LightRidge framework. As shown in Figure 8, we first analyze the DONN workloads, where we identify that the majority (≥90%) of the runtime comes from the numerical modeling of light diffraction. Thus, the major optimization efforts should focus on the diffraction kernels. Second, to effectively utilize modern computing platforms, we aim to maximize the parallelism of the fundamental physics modeling, which is the main reason for implementing scalar diffraction modeling instead of FDTD in the computation kernel, as mentioned earlier in Sections 1 and 2. The diffraction approximation functions with scalar diffraction modeling (Equations 1-4) can be broken down into three major tensor-level operators: complex-domain 2-D FFT (FFT2), inverse 2-D FFT (iFFT2), and complex matrix multiplications (Complex MM). Based on the analysis and kernel breakdowns, we take advantage of modern CPU and GPU platforms by incorporating efficient complex-tensor datatypes and operators. For CPU, the diffraction kernel is optimized via Intel Math Kernel Library (MKL-DNN) complex kernels with AVX-512 support; for GPU, the cuFFT, cuFFTW, and cuTENSOR libraries with efficient complex-domain FFTs and MM are deployed.
To demonstrate the runtime improvements, we compare the runtime of our proposed framework with the up-to-date version of the commercial tool LightPipes (2021), running various emulation loads, i.e., {1,3,5,7,10}-layer DONNs with system resolution sweeping from 100×100 to 500×500. All LightPipes-CPU and LightRidge-CPU results are collected on an Intel Xeon Gold 6230 20x CPU. To make fair GPU comparisons, we re-implement the kernels in LightPipes with cupy [40], and runtime results are collected on an Nvidia 3090 Ti GPU platform. Figure 9 shows that LightRidge consistently outperforms LightPipes on both CPU and GPU backends. Specifically, Figure 9a shows at most 6.4× speedup of LightRidge-CPU over LightPipes-CPU at depth=5, system size=500×500. Figure 9b shows at most 12× speedup of LightRidge-GPU over LightPipes-GPU at depth=1, system size=500×500. To understand the runtime speedups offered by LightRidge, we provide a normalized speedup breakdown analysis w.r.t LightPipes CPU/GPU, shown in Figure 8 with a 5-layer DONNs workload. We observe that the 6.4× CPU runtime speedups are contributed from   10). The runtime is acquired on a single Nvidia 3090 Ti GPU. We can see that LightRidge handles 30-layer DONNs training in ∼280 seconds per epoch, with input image resolution at 500×500. Besides, we observe that the runtime increases almost linearly w.r.t the DONNs depth, while there is a runtime jump when the system size increases beyond 300×300, mainly due to the limited resources on a single GPU. This provides strong motivation for further CUDA optimization and multiple-GPU training support in future work.
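The three-operator structure (FFT2, complex multiply, iFFT2) can be seen in a minimal free-space propagation sketch. It assumes the angular spectrum form of scalar diffraction, which may differ from the approximations in Equations 1-4, and it uses a naive pure-Python DFT in place of an optimized FFT library; the complex multiply here is element-wise. All function names are illustrative and are not LightRidge APIs.

```python
import cmath
import math


def dft2(field, inverse=False):
    """Naive 2-D DFT of a square grid; a slow stand-in for FFT2 / iFFT2."""
    n = len(field)
    sign = 1 if inverse else -1
    w = [[cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n)]
         for k in range(n)]
    # Transform along each row, then along each column.
    rows = [[sum(field[r][m] * w[k][m] for m in range(n)) for k in range(n)]
            for r in range(n)]
    out = [[sum(rows[m][c] * w[k][m] for m in range(n)) for c in range(n)]
           for k in range(n)]
    if inverse:
        out = [[v / (n * n) for v in row] for row in out]
    return out


def fft_frequencies(n, pitch):
    """Spatial frequencies (cycles per metre) in standard FFT ordering."""
    return [(k if k < (n + 1) // 2 else k - n) / (n * pitch) for k in range(n)]


def transfer_function(n, pitch, wavelength, distance):
    """Angular-spectrum transfer function H(fx, fy) for free-space propagation."""
    freqs = fft_frequencies(n, pitch)
    h = []
    for fy in freqs:
        row = []
        for fx in freqs:
            arg = 1.0 / wavelength ** 2 - fx ** 2 - fy ** 2
            # Evanescent components (arg < 0) are simply dropped in this sketch.
            row.append(cmath.exp(2j * math.pi * distance * math.sqrt(arg))
                       if arg >= 0 else 0j)
        h.append(row)
    return h


def propagate(field, pitch, wavelength, distance):
    """FFT2 -> element-wise complex multiply -> iFFT2."""
    n = len(field)
    h = transfer_function(n, pitch, wavelength, distance)
    spectrum = dft2(field)
    filtered = [[spectrum[i][j] * h[i][j] for j in range(n)] for i in range(n)]
    return dft2(filtered, inverse=True)


def energy(field):
    """Total intensity of a complex field."""
    return sum(abs(v) ** 2 for row in field for v in row)


# 8x8 plane wave through a 4x4 square aperture; 36 um pitch, 532 nm, 0.28 m.
n = 8
aperture = [[1 + 0j if 2 <= r < 6 and 2 <= c < 6 else 0j for c in range(n)]
            for r in range(n)]
out = propagate(aperture, pitch=36e-6, wavelength=532e-9, distance=0.28)
print(energy(aperture), energy(out))
```

Because every sampled frequency is propagating at this pixel pitch, the transfer function has unit magnitude and the total intensity is preserved, which makes a convenient sanity check for an optimized kernel.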

Performance Comparison between DONNs and conventional NNs
Compared with conventional NN models on digital platforms, the current optical-device-deployed DONN systems at this early stage suffer from accuracy degradation while featuring significantly improved energy efficiency. As shown in Table 4, we evaluate two conventional NNs: an MLP, which consists of two linear layers with a hidden size of 128 and takes the input image flattened as a one-dimensional tensor, i.e., MLP (40000 → 128 → 10); and a CNN, which consists of two Conv2D layers, both with kernel size (5, 5), stride 2, and padding 2, with 32 filters in the first layer and 64 filters in the second, two MaxPooling2D layers with kernel size (3, 3) and stride 2, followed by two linear layers. Additionally, we deploy the conventional NNs on different digital platforms including Nvidia GPU 2080 Ti, Nvidia GPU 3090 Ti, Intel Xeon 6230 20x CPU, and Google EdgeTPU [62].
As a result, the conventional NNs achieve accuracy of 0.99/0.99 for MNIST and 0.91/0.91 for FashionMNIST with the MLP and the CNN model, respectively, while DONN systems reach accuracy of 0.98/0.89 for MNIST/FashionMNIST, i.e., an accuracy degradation of 1 to 2 percentage points. For practical realization with DONN systems, we take the prototype in Figure 6a as an example, where the power of the CW 532nm laser source is ∼5mW. The diffractive layers are passive optical devices and require no extra energy for computation. The power consumption at the CMOS detector is ∼1 W (max) @ 1000 fps with the system size of 200 × 200. Thus, the power efficiency of the DONN system can be estimated as 995 fps/Watt. The corresponding energy efficiency results for conventional NNs on various digital platforms are shown in Table 4, which shows that the DONN system is roughly two orders of magnitude more efficient than desktop CPU and GPU, and one order of magnitude more efficient than digital edge devices with batch size 1. The energy efficiency provided by DONN systems can be more significant when dealing with more complex ML tasks (e.g., applications in Section 5.6), as the computation part (with passive optical devices) consumes zero power. Note that DONNs energy efficiency can be further optimized with integrated fabrication and a high-end detector.
Therefore, the DONN system shows great potential for completing ML tasks much more energy-efficiently than conventional NNs. However, the degradation in accuracy and the challenges in deploying practical inference systems call for more future work across disciplines, such as complex-domain training algorithms, domain-specific co-design, and optics, which also highlights the potential of our framework.
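The 995 fps/Watt estimate follows directly from the figures quoted above. The short calculation below reproduces it; the variable names are illustrative.

```python
laser_power_w = 5e-3            # CW 532nm laser source, about 5 mW
detector_power_w = 1.0          # CMOS detector, about 1 W (max) at 1000 fps
passive_layers_power_w = 0.0    # diffractive layers are passive devices
frame_rate_fps = 1000.0

total_power_w = laser_power_w + detector_power_w + passive_layers_power_w
efficiency_fps_per_w = frame_rate_fps / total_power_w

print(f"{efficiency_fps_per_w:.0f} fps/Watt")  # prints "995 fps/Watt"
```

The detector dominates the power budget, which is why a more efficient detector is named as the main route to further gains.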

On-chip DONNs Integration via LightRidge
The bulky 3D free-space DONN systems can be integrated as 3D monolithic on-chip DONNs via 3D additive fabrication [13,14,20,36], e.g., galvo-dithered two-photon nanolithography [20], electron beam lithography overlay process [36], etc. Such monolithic on-chip DONNs can be integrated in a hybrid computing system, with DONNs acting as an optical co-processor hosted by a central processor via system interconnects (e.g., PCIe 4.0). The host processor controls the laser encoding for loading images and the collection of results through the co-processor interconnects, as illustrated in Figure 11. Each diffractive layer is a thin film, where the trained phase information is encoded in the thickness of the material used for layer fabrication. Between diffractive layers, optical clear adhesive is employed to provide free-space light diffraction, and its thickness is the diffraction distance. Diffractive layers and optical clear adhesive are stacked sequentially to construct an on-chip DONN system. The final prediction is captured on the detector, with the Analog-Digital-Converter (ADC), I/O interface, and memory buffers integrated on the peripheral circuits. An example of the aforementioned real-world DONN on-chip integration is realized by [36]. However, due to the three challenges we discussed earlier, the design cycle could take months to a year. The LightRidge framework can significantly simplify the end-to-end on-chip design process, as demonstrated by the following case study.
Case study - We target a 5-layer DONN system integration under wavelength 532nm for a CMOS detector chip (CS165MU1 from Thorlabs, Inc.), shown in Figure 11, where the CMOS chip defines the pixel size of 3.45um. The key for on-chip integration is to search for valid fabrication parameters with high prediction performance w.r.t ML tasks. Therefore, following the four steps of the LightRidge design flow (Figure 3), we first deploy LightRidge-DSE to explore the 3D fabrication dimension, including distance, resolution, and diffraction unit size. According to the emulation results in Section 4 and Figure 5(c), when we fix the wavelength as 532nm and the diffraction unit size (pixel size of the CMOS chip) as 3.45um, and consider image classification as the ML task (e.g., MNIST), LightRidge-DSE returns a diffraction distance of 532um and a resolution of 200×200, with an emulation accuracy of 92%, to fit the CMOS chip. Thus, the DONNs fabrication dimension is finalized as 690um × 690um × 2660um, where 2660um is the height and the flat chip dimension is 690um × 690um, which aligns with the chip fabrication procedure in [36]. Next, after training is completed, each layer will be fabricated w.r.t the phase parameters optimized by the codesign stage via nano-printing on the targeted CMOS detector chip. The integrated DONNs can then be used as a co-processor via the ADCs and I/O integrated with the CMOS detector chip, where the pre-fabrication design process takes less than a day via LightRidge.
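The fabrication dimension in the case study can be checked with a few lines of arithmetic. The sketch assumes that the 2660um height corresponds to one 532um diffraction gap per diffractive layer; variable names are illustrative.

```python
pixel_um = 3.45        # CMOS pixel size, used as the diffraction unit size
resolution = 200       # diffractive units per side
distance_um = 532      # diffraction distance returned by LightRidge-DSE
num_layers = 5         # diffractive layers, one gap assumed per layer

side_um = round(pixel_um * resolution, 2)   # lateral chip dimension
height_um = distance_um * num_layers        # stacked height

print(f"{side_um:g}um x {side_um:g}um x {height_um}um")
```

The result, 690um × 690um × 2660um, agrees with the finalized fabrication dimension stated above.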

Advanced DONN Architectures
With the design capabilities of LightRidge and LightRidge-DSE verified by physical optical systems, we further explore the potential of DONN systems with more advanced architectures dealing with more complex computer vision tasks. Specifically, we propose and evaluate (1) a multi-channel DONN architecture implemented with diffractive layers to deal with RGB image classification, and (2) the first all-optical image segmentation demonstration using DONNs with optical skip connection for image segmentation and potentially other image-to-image synthesis tasks.

5.6.1 All-optical RGB image classification. To deal with more complex datasets in image classification, e.g., Place365 [65], a high-resolution RGB image dataset, we propose a multi-channel RGB-DONNs architecture. As shown in Figure 12, three optical channels are employed in the DONN system to deal with the 'R', 'G', and 'B' channels of the original image separately, i.e., the original RGB image is split into three 'R'/'G'/'B' channel-only gray-scale images for the three optical channels. The input laser beam is split by the beam splitter into three beams, which are reflected by mirrors into the three channels to encode the corresponding input information. Note that the image information is encoded with light intensity at the encoding layer of each channel, so each channel takes a gray-scale image as input and propagates it through five diffractive layers. Each channel is constructed with the same system parameters as in Section 5.1 except for changing to 5 diffractive layers. The output laser beams from all channels are projected onto a single detector, where the light intensity is merged for the final prediction. Similar to the detector design for classification shown in Figure 2

The emulation accuracy results for image classification with Place365 are shown in Table 5, including top-1, top-3, and top-5 accuracy. The baseline is the emulation accuracy of the DONN model trained with the algorithm in [68]. The model trained with our framework achieves better accuracy than the baseline in all accuracy metrics (29%/25%/17% improvement for top-1/top-3/top-5 accuracy, respectively), and outperforms the baseline most at top-1 accuracy.

5.6.2 All-optical image segmentation. Image segmentation is an important and challenging task in modern computer vision, which has a great impact on autonomous systems such as autonomous driving, robotics, etc. Unlike image classification tasks, image segmentation is a process of generating representations of an image into specific image-to-image 
objectives. In DONN classification systems, however, we observe that the system (the output detector in particular) is not fully utilized, as only a given number of small detector regions are used for classification. As the DONN system propagates the input image w.r.t trained phase modulations over the full spatial dimension of the system, it is expected to be able to handle image-to-image tasks. Thus, we design and demonstrate the first-ever all-optical image segmentation. Figure 13a shows the proposed 5-layer DONN system, where we introduce two innovations in DONN architecture: 1) an optical skip connection, which is inspired by the residual block design in the conventional ResNet [24] architecture. It aims to smooth the gradient for better training performance and is also involved in inference for more detailed segmentation. Since the light signal is aggressively diffracted during propagation, the optical skip connection helps restore some features from less-diffracted inputs, making the model prediction aware of the original information, which our results confirm to yield better image segmentation performance; and 2) layer normalization [1] before the detector plane, which is only employed in the training process to improve the training performance of the DONN by smoothing the training gradients. The dataset we demonstrate here is selected from the CityScapes dataset [10], where the images are converted to gray-scale images and resized to 350 × 350. We use binary labels in this case study to generate segmentation masks for buildings and others. The baseline is the DONN model constructed without the optical skip connection and trained with the methods proposed in [34,68], without layer normalization. The system parameters and training setups are the same as discussed in Section 5.1 except for the system size changing to 350 × 350 and the model structure changing to that in Figure 13a. The results shown in Figure 13b demonstrate that the advanced model 
trained with LightRidge outperforms the baseline in edge detection and shows significant clarity improvements in small-object segmentation. These advanced DONN architectures and validations demonstrate the generalizability and power of LightRidge in exploring new architectural designs and applications.

CONCLUSION AND FUTURE WORK
This work presents LightRidge, an agile end-to-end design framework that enables seamless design-to-deployment of DONNs. LightRidge accelerates and simplifies design, exploration, and on-chip integration by offering highly versatile and runtime-efficient programming modules and a DSE engine (LightRidge-DSE) to construct and train DONN systems in a wide range of optical settings. The high-performance physics emulation kernels are optimized for runtime efficiency and verified, together with the hardware-software codesign algorithm, on our visible-range prototype. Additionally, two advanced DONN architectural designs constructed with LightRidge show the capabilities and generalizability of LightRidge for various ML tasks and system design explorations. We believe our framework LightRidge will enable collaborative and open-source research in optical accelerators for not only ML tasks, but also other optics-related research areas such as optical structure emulation, chip fabrication (lithography), meta-material exploration, etc.
In the future, we will further optimize the runtime efficiency of LightRidge, including high-performance CUDA kernel optimization and multiple-GPU computation. Additionally, as we have initiated full-chip integration (Figure 11) in an embedded SoC system, we can deploy our advanced DONNs for image segmentation enabled by LightRidge to demonstrate the first all-optical autonomous driving prototype. We also expect more functionality to be integrated in the framework and more hardware prototypes for experimental demonstrations. For example, non-linearity in DONN systems, which can be realized by nonlinear optical materials (crystals, polymers, graphane, etc.), is an important feature for more complex DONNs systems. Finally, we can employ LightRidge for the exploration of optical phenomena such as interpixel crosstalk in the optical field, which happens when there is a sharp phase change between adjacent phase modulators [35,66].

A.6 Evaluation and expected results
Expected results should match those at https://github.com/lightridge/lightridge/tree/main/ASPLOS2024_AE (access the specific AE page) in (1) accuracy metrics and (2) propagation and phase visualization.

A.7 Experiment customization
This framework provides customization for both model construction and training setups. Different model constructions involve exploration efforts to find the paired parameters, as shown in Section 4. The training setups such as learning rate (-lr), training epochs (-epochs), batch size (-batch_size), etc., can be customized by the arguments implemented in the Python file.
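A command-line interface of this kind might look like the following sketch. The flag spellings (`--lr`, `--epochs`, `--batch_size`) and the default values, which mirror the training setup reported for the prototype experiments, are assumptions for illustration and are not taken from the artifact scripts.

```python
import argparse


def build_parser():
    """Hypothetical training-setup arguments; names and defaults are assumed."""
    parser = argparse.ArgumentParser(description="DONN training setup (sketch)")
    parser.add_argument("--lr", type=float, default=0.5, help="learning rate")
    parser.add_argument("--epochs", type=int, default=100, help="training epochs")
    parser.add_argument("--batch_size", type=int, default=500, help="batch size")
    return parser


# Override two settings and keep the default batch size.
args = build_parser().parse_args(["--lr", "0.1", "--epochs", "10"])
print(args.lr, args.epochs, args.batch_size)
```

Passing an explicit argument list, as done here, also makes the setup easy to reuse from a notebook or a test.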

Figure 2: Overview of DONN system and hardware implementation - (a) Illustration of the DONN system, including the input plane, three diffractive layers, and a light intensity readout plane. (b) The reconfigurable optical hardware system to deploy the DONN system.

Figure 4: Diffraction illustration.-(a) (, ) is the plane for diffraction aperture and illuminated by the input light beam in positive  direction, Σ on the plane (, ) denotes the illuminated area.(, ) plane is the target plane. 1 and  0 are illuminated points on the planes. is the angle between the outward normal and the vector pointing from  1 to  0 .(b) Light propagation and phase modulation through the diffractive layer w.r.t the input light wave.

Figure 5: Results of architectural DSE of DONN systems w.r.t diffractive unit size and diffraction distance under different laser wavelengths (λ), with each grid cell colored according to accuracy on MNIST10. (a) and (b) are training data from emulations w.r.t the design space under λ = 432nm and λ = 632nm for the inference model. (c) Predicted performance w.r.t the design space under λ = 532nm with the ML DSE model trained with data points from (a) and (b). (d) Grid-search validation under λ = 532nm that verifies the ML-based DSE quality. The DSE-guided setup at the star point is verified with the experimental prototype in Section 5.
trained with mean squared error (MSE) loss. The regression model is built with n_estimators=3500, learning_rate=0.2, max_depth=3, random_state=25. The approximated prediction result from the analytical model is employed to guide DONN DSE under a new target λ. To evaluate the analytical-model-based DSE strategy, we compare the predicted design space (Figure 5(c)) with the emulation-verified design space (Figure 5(d)) under λ = 532nm. The star point in Figure 5(d) shows that our analytical DSE can find the best design points, which is further verified by the end-to-end LightRidge development process in Section 5.

Figure 6: Evaluation results of a 3-layer DONN system in visible range explored, trained, and deployed by LightRidge - (a) The experimental system trained and deployed by LightRidge. The corresponding detector pattern from experiments and simulation results produced by LightRidge are shown in Figure 6b; (b) Corresponding detector patterns of experimental measurements and simulation results (the simulation results generated with lr.layers.view()) of the 3-layer DONN system.
Model construction via LightRidge-DSE and training - This section demonstrates the hardware-software codesign precision and the effectiveness of LightRidge-DSE, where the parameters of the DONNs model used for physical validation experiments are automatically produced by LightRidge-DSE with the system size of 200 × 200. Specifically, the emulation model for DONN training is constructed with 3 sequentially stacked diffractive layers in lr.model, where each layer is defined with lr.layers.diffractlayer integrating hardware specifications: (1) the diffraction pixel size is 36um × 36um; (2) the laser wavelength is 532nm. Based on the DSE results shown in Figure 5(c), the distance is explored to be ∼0.3m, which is further adjusted to 11 inches (0.28m) on our optical table. There are 10 pre-defined detector regions for labels placed evenly on the detector plane. The model is trained with MSE loss with one-hot represented ground truth labels using Adam [27] as the training optimizer. The learning rate for the training process is set as 0.5, the number of training epochs is set as 100, and the training batch size is set as 500 for all experiments.

Hardware prototype and validation - Laser source CPS532 from Thorlabs, Inc. is implemented as the laser source for the physical DONN system, where SLMs (LC 2012 HOLOEYE) are implemented as diffractive layers. The levels of SLMs for model training are experimentally measured and cover a phase modulation range close to [0, 2π]. The final diffraction pattern is captured on a CMOS camera (CS165MU1 Thorlabs, Inc.).

Figure 7: Confidence evaluation of DONNs trained with complex-domain regularization under various system complexity.Baseline results are conducted on methods in [34, 68] without noise assumptions.

Figure 11: Monolithic on-chip DONNs design and overall hybrid architecture system integration.
(a) DONNs architecture for image segmentation tasks with optical skip connections and layer normalization (only for training process). (b) LightRidge enabled advanced segmentation DONNs compared to baseline models and training methods proposed in [34, 68]; panels: Input Image, Target, Our results, Baseline.

Figure 13: Image segmentation demonstrations using CityScapes [10] datasets with (a) a novel advanced DONNs architecture with optical skip connection and layer normalization for improving training efficiency, and (b) evaluations and comparisons to SOTA baselines [34, 68].

Table 1: Overview comparisons of existing programming frameworks for DONNs compilation. Lines of Code (LoC) efforts are evaluated with a 5-layer DONNs [34].
experimental physical systems, enabling out-of-box software-to-hardware realization in an end-to-end fashion, and showing its capabilities to explore advanced DONN architectures for complex ML applications. The contributions of this paper are summarized as follows:

Table 2: Overview of the LightRidge programming modules and partial front-end APIs. Note that we use lr to represent our integrated Python package lightridge.
Include modules of different types of diffraction modeling, e.g., hardware-specific layer module lr.layers.diffractlayer and general diffractive layer lr.layers.diffractlayer_raw, that can be configured with various approximation methods, distance, diffraction unit size, etc.
lr.layers.detector: Define detector designs for various ML tasks, e.g., in image classification task, the coordinate and the size of the detection region for each candidate class.
lr.models: Sequential container to customize DONN system by stacking diffractive layer and detector modules.
Training
lr.train.utils: Training utility modules including data handling (e.g., utils.data_to_cplex), complex-valued regularization, loss function, optimizer, etc.
lr.train.to(device): Enable CPU and GPU accelerations for accelerating diffraction emulation and DONNs training.
lr.train.dse(specs): Perform pre-fabrication design space exploration with chip integration specifications as inputs.
Hardware deployment
lr.layers.view(): Visualize the original phase value per layer or values w.r.t the hardware specifications.
lr.model.to_system: Generate device-specific phase parameters for deployment w.r.t the hardware specifications (e.g., configurations of SLMs, thickness of 3D printed masks).

Table 3: Sensitivity analysis w.r.t wavelength, diffraction distance, and the diffraction unit size.

Table 4: Energy efficiency (fps/Watt) and accuracy comparisons between DONN systems and conventional NNs.