Poster: Maestro: The Analysis-Simulation Integrated Framework for Mixed Reality



ABSTRACT
The recent development of DNNs and hardware has created new opportunities for mixed-reality applications. These applications demand the ability to analyze the real world and simulate realistic virtual content. However, designing mixed-reality applications faces diverse challenges due to the absence of a unified framework, such as huge programming effort and inconsistencies between the real scene and virtual content induced by end-to-end latency.
This paper proposes Maestro, an analysis-simulation integrated framework for mixed-reality applications. Maestro provides a programming model for effective application representation and control, aiding runtime optimization. The Maestro runtime takes an object-level execution approach to minimize misalignment, integrating the simulation and analysis pipelines so that applications can process individual objects based on their latency sensitivity.

CCS CONCEPTS
• Human-centered computing → Ubiquitous and mobile computing systems and tools; • Computer systems organization → Real-time system architecture.

INTRODUCTION
Ally wants to purchase a new sofa and uses a furnishing application with mixed-reality (MR) glasses to place various virtual sofas in her living room. She then plays a pet breeding game to play with and feed a virtual cat. Recent advances in deep neural networks (DNNs) enable smooth and accurate interactions in such MR apps, including placing a virtual sofa with hand gestures, locating food to give instructions to the virtual cat, and realistically rendering the sofa and cat with light estimation. However, the active use of complex DNNs in an MR app likely causes a subtle but irritating mismatch between the virtual content and the real scene, such as the sofa moving slightly out of sync with the hand, due to high execution latency, making such DNNs challenging to adopt.
The core challenges in building such a DNN-enabled MR app lie in the seamless integration of two separate computation pipelines: (i) an analysis pipeline that analyzes the surroundings with multiple DNNs and (ii) a simulation pipeline that presents the virtual world on the display.

C1 - Integrating the analysis-simulation pipelines. The simple black-box integration of the two pipelines induces continuous inconsistency between the virtual content and the real scene (seen by bare eyes or video see-through; see Fig. 1). We observe that such inconsistency results from periodic frame-level execution and synchronization. First, frame-level analysis and simulation, i.e., completing the entire input-frame analysis and then performing the corresponding simulation, often result in a high end-to-end (e2e) latency. The frame analysis with multiple DNNs incurs high processing latency, while the simulation can start only after the whole frame analysis completes. Second, the simulation pipeline runs on its own duty cycle and is loosely synchronized with the analysis pipeline. Hence, the simulation does not start right after the analysis but rather with a considerable delay (e.g., 33 ms for a 60 FPS display). Such delay results in misalignment between the real scene and the virtual content, especially with fast camera movements or content changes.

C2 - Developing DNN-enabled applications. Developing and integrating the two pipelines involves huge programming effort. The pipelines are often developed with different systems: the former with mobile DNN frameworks [4, 5] such as TensorFlow Lite and the latter with game engines such as Unreal Engine [3], each of which has a steep learning curve. MR apps require sophisticated implementations in each framework for fast multi-DNN execution with complicated 3D virtual-scene simulation. Recently, game engines have begun to support DNNs through XR SDKs [1] to reduce programming effort. However, they are limited to outdated task-specific DNNs (e.g., hand pose) [2].
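To make the frame-level latency problem in C1 concrete, the following is a minimal latency model. All numbers (per-DNN times, simulation cost) are illustrative assumptions, not measurements from the paper; it only shows why completing the whole frame analysis before simulation, plus loose synchronization with the display duty cycle, inflates end-to-end latency compared to per-object processing.

```python
import math

DNN_LATENCIES_MS = [12.0, 18.0, 9.0]  # hypothetical per-object DNN analysis times
SIMULATION_MS = 4.0                   # hypothetical per-object simulation cost
DISPLAY_PERIOD_MS = 1000.0 / 60.0     # 60 FPS display duty cycle

def next_refresh(t_ms):
    """A result becomes visible only at the next display refresh boundary
    (loose synchronization between simulation and display)."""
    return math.ceil(t_ms / DISPLAY_PERIOD_MS) * DISPLAY_PERIOD_MS

def frame_level_e2e_ms():
    """Frame-level integration: simulation starts only after ALL DNNs
    finish, so every object shares the worst-case latency."""
    analysis_done = sum(DNN_LATENCIES_MS)  # sequential whole-frame analysis
    return next_refresh(analysis_done + SIMULATION_MS)

def object_level_e2e_ms():
    """Object-level integration: each object is simulated as soon as its
    own DNN finishes, so early results reach the display sooner."""
    done, t = [], 0.0
    for dnn_ms in DNN_LATENCIES_MS:
        t += dnn_ms                        # this object's analysis completes
        done.append(next_refresh(t + SIMULATION_MS))
    return done
```

Under these assumed numbers, frame-level execution delays every object to the worst-case refresh (50 ms), while object-level execution lets the first object reach the display at the very next refresh (~16.7 ms).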
In this paper, we propose Maestro, an analysis-simulation integrated framework for DNN-enabled MR apps. Maestro provides a programming abstraction that represents the analysis and simulation operations of an MR app as a simple graph while giving cues for runtime optimizations. We provide a highly optimized runtime engine to accelerate the execution of the graph and minimize perceivable inconsistency. We implement our system with a mobile multi-DNN framework [5] and Unreal Engine [3] and show that our runtime significantly reduces the inconsistency (up to 1.6× higher streaming accuracy) compared to the typical black-box integration of the two pipelines. We also demonstrate that our programming interface easily expresses nine distinct MR apps.
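As a sketch of what a graph-style programming abstraction in this spirit might look like, consider the furnishing/pet example from the introduction expressed as a graph of analysis and simulation operators. All class and method names here are assumed for illustration and are not Maestro's actual API.

```python
class Node:
    """One operator in the app graph: either an analysis DNN or a
    simulation step."""
    def __init__(self, name, kind, latency_sensitive=False):
        self.name = name                          # e.g., "hand_pose"
        self.kind = kind                          # "analysis" or "simulation"
        self.latency_sensitive = latency_sensitive  # cue for the runtime
        self.successors = []

class AppGraph:
    """An MR app represented as a simple operator graph."""
    def __init__(self):
        self.nodes = {}

    def add(self, name, kind, latency_sensitive=False):
        self.nodes[name] = Node(name, kind, latency_sensitive)

    def connect(self, src, dst):
        """Data dependency: dst consumes src's output."""
        self.nodes[src].successors.append(self.nodes[dst])

# The furnishing example: a sofa follows the hand, lit realistically.
app = AppGraph()
app.add("hand_pose", "analysis", latency_sensitive=True)  # hand moves fast
app.add("light_estimation", "analysis")                   # changes slowly
app.add("render_sofa", "simulation")
app.connect("hand_pose", "render_sofa")
app.connect("light_estimation", "render_sofa")
```

The `latency_sensitive` flag illustrates the "cues for runtime optimizations" mentioned above: the developer states which operators feed fast-changing content, and the runtime can prioritize them.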

MAESTRO: THE ANALYSIS-SIMULATION INTEGRATED FRAMEWORK FOR MIXED REALITY
We introduce Maestro [6], the analysis-simulation integrated framework for mixed reality. Fig. 3 shows an architectural overview. Maestro provides a programming abstraction that represents the analysis and simulation operations of an MR app as a simple graph while giving cues for runtime optimizations. Maestro offers a highly optimized runtime engine that centers its execution at the object level: it (i) optimizes end-to-end latency with fine-grained processing and (ii) considers each object's latency sensitivity, such as its movement, to minimize perceivable inconsistency. It efficiently handles all underlying technical complications, such as tight synchronization between the two pipelines and operator scheduling across heterogeneous mobile processors.
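One plausible way to act on per-object latency sensitivity is sketched below, using on-screen movement speed as the sensitivity proxy (fast-moving content is where misalignment is most visible). This is an assumed policy for illustration, not Maestro's actual scheduler; the names and speed values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MRObject:
    name: str
    speed_px_per_frame: float  # observed on-screen movement (proxy for sensitivity)

def schedule(objects):
    """Order per-object processing by latency sensitivity: the fastest-moving
    objects are analyzed and simulated first, so their virtual content
    stays aligned with the real scene."""
    return sorted(objects, key=lambda o: o.speed_px_per_frame, reverse=True)

objs = [
    MRObject("sofa", 12.0),       # dragged by hand gesture: fast-moving
    MRObject("cat", 3.0),         # walking slowly
    MRObject("room_light", 0.1),  # nearly static lighting estimate
]
order = [o.name for o in schedule(objs)]  # sofa is processed first
```

Because the runtime processes objects individually rather than per frame, the sofa's pose can be updated at every refresh even while slower, less sensitive objects lag behind by a cycle.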

CONCLUSION
This paper introduces Maestro, a framework for mixed-reality applications that combines the analysis and simulation pipelines. It offers a programming model that allows developers to designate important objects to enhance perceptual quality. The Maestro runtime utilizes this information and adopts object-level execution to maximize the overall consistency between the real scene and the displayed virtual scene.

Figure 1: User perception in mixed reality. Users perceive greater visual inconsistency in HMDs where they directly observe reality with bare eyes along with virtual content.

Figure 3: Architectural overview of Maestro.