From Real to Virtual: Exploring Replica-Enhanced Environment Transitions along the Reality-Virtuality Continuum

Recent Head-Mounted Displays enable users to perceive the real environment using a video-based see-through mode and the fully virtual environment within a single display. Leveraging these advancements, we present a generic concept to seamlessly transition between the real and virtual environment, with the goal of supporting users in moving from any real environment into Virtual Reality and back again. This transition process uses a digital replica of the real environment and incorporates various stages of Milgram’s Reality-Virtuality Continuum, along with visual transitions that facilitate gradual navigation between them. We implemented the overall transition concept and four object-based transition techniques. Both were evaluated in a qualitative user study, focusing on user experience, the use of the replica and visual coherence. The results of the user study show that most participants stated that the replica facilitates the cognitive processing of the transition and supports spatial orientation.


INTRODUCTION
Recent advancements in video-based see-through Head-Mounted Displays (HMDs), such as those from Varjo, Meta, and Apple, allow users to perceive all stages of Milgram's Reality-Virtuality Continuum (RVC) [36] within a single display. Most current Mixed Reality (MR) applications are designed and implemented to work on one stage of the RVC, but depending on the use case it can be desirable to use multiple stages sequentially [6,7,11,27,29]. Moving between or interconnecting different stages of the RVC is known as cross-reality (CR) [50] or cross-virtuality (XV) [44]. In their survey on CR, Auda et al. [4] present a thorough overview of this research area, while Froehler et al. [16] place particular emphasis on the visual analytics aspect. VRception [18] provides a toolkit for prototyping such CR systems, simulating all stages of the RVC within VR. The transition between these stages is facilitated by a basic fade transition. However, as is typical for toolkit publications, no user study was conducted to address this specific aspect. In our approach, we deliberately address this open research gap by demonstrating how to design a comprehensive transition process and evaluating it through a user study.
Existing research has investigated various aspects of transitioning between Virtual Reality (VR) environments [12,21,35,39,51], improving engagement through the use of replicas [53,56,57,59], or transitioning between individual stages of the RVC [7,13,17,22,43,54]. Our novel approach combines these lines of work by utilising a replica and employing transition techniques across multiple stages of the RVC. By bringing these individual, existing aspects together, we have developed an integrated concept to achieve a seamless transition process between the real and the fully virtual environment (see Figure 2). In comparison to Valkov and Flagge's [59] concept, our seamless transition does not start with a virtual replica environment. Our method starts one step earlier, with a video-see-through (VST) [46] view without any additional virtual content. Unlike a replica, the VST mode does not capture a momentary snapshot of the space; rather, it offers a real-time video stream of the environment, ensuring that users see the same content as they would without the HMD. Starting without computer-generated content enables a gradual integration of virtual elements, facilitating the inclusion of AR and AV in our process to gradually increase the amount of virtual replica content. Utilising this approach, we aimed to minimise the perceptual disparity between the actual environment and a statically reconstructed representation. We believe that this approach enhances and supports the user experience when engaging with a VR environment, particularly for users with limited HMD experience.
Introducing a replica to the transition process provides the opportunity to create the illusion of real objects disappearing. Without a replica, the objects in the desired VR experience would have to be placed at the same location as the real objects. The target environment would therefore need substantial customisation for each individual use case and would be restricted to the layout of the real environment. Using a qualitative evaluation approach, we investigated how users experienced the transition using the replica and different visual effects. We also included a baseline without the replica so that users could experience the transition process without this mediating step. To ensure a high degree of fidelity between the replica and the actual environment, we simulated sunlight and the resulting shadows in accordance with the current time and weather of the real world. A detailed investigation into how this may influence the transition process remains a subject for future research. As an initial assessment, we asked participants in the user study about their ability to discern between coherent and non-coherent lighting. In the interview, we requested initial feedback on this topic and suggestions for future application areas.
Main contributions of this publication: We developed a generic concept for seamless transitions across all stages of the RVC. Based on that, we designed and implemented a set of four transition techniques to investigate their applicability for movement between real and virtual environments. To allow for such seamless transitions, we introduce a replica which is separated into sub-objects and object groups. To validate our approach, we conducted a user study that gathered qualitative information about the user experience of the overall transition process, comparing the four techniques to a baseline condition. The study also provides first qualitative insights into the relevance of visually coherent lighting in the transition process.

RELATED WORK
This section gives an overview of existing publications that employ transition techniques to switch between stages [16] of the RVC or within a single stage, as well as publications on the use of replicas and visual coherence. The overview primarily focuses on transition techniques that involve changing the environment and excludes publications that do not utilise visual feedback during the transition process. It specifically emphasises fully implemented environment changes rather than windows to other realities metaphors [17,34], obstacle avoidance [24] or bystander interactions [40].

Visual Transition Techniques
To integrate VR, Augmented Reality (AR) and traditional desktop interfaces within a single display, visual transition techniques can be employed. One of the first techniques was introduced by Billinghurst et al. [7] in their MagicBook, using a flight metaphor. These techniques go beyond simple switching mechanisms, enabling the smooth transition between different realities [22,43,54] or virtual scenes [12,21,35,39,51] while providing visual cues to support the transition process. Transition techniques are not exclusively employed for bi-directional environment or reality changes, but also serve the purpose of specifically exiting the VR experience [20,54].
Pointecker et al. [43] provide a comprehensive list of publications in this research area, as well as examples in films and video games. They classified all techniques from their literature search according to five criteria, to represent the strengths and weaknesses of each transition technique. In their comparison of four techniques, it was found that the Fade technique is well-suited for daily use and achieved the highest scores in terms of continuity and overall ranking. Participants in Knibbe et al.'s [30] study also highlighted the importance of a seamless environment change. They proposed various techniques to achieve a smooth VR exiting experience. Based on participant responses they suggest a subtle transparency change that allows users to orient themselves, adapt to lighting, and acclimate to the social setting before removing the HMD. Transition techniques can be applied not only to the entire environment but also selectively to specific objects. In this context, five interaction techniques were examined for transitioning objects between environments [10]. Unlike environment transitions, object transitions generally occur more frequently, and the transition does not impact the entire field of view. Therefore, different design considerations need to be taken into account.

Replica
A digital replica of the real environment can be employed to simulate reality [18] or used in conjunction with a transition technique to enhance the transition process. The integration of such replicas into the transition process has been explored in the following publications so far.
Slater et al. [53] describe an experiment in which participants found themselves situated in a digital replica of the laboratory after putting on an HMD. In this setup, a door served as a portal for transitioning from this replica into a new virtual environment.
In a similar approach, Steinicke et al. [56,57] used a virtual reconstruction of the real environment as a way to start the VR experience. The transition was facilitated through the use of a portal and a three-second animation sequence. The experiment revealed that using a replica instead of directly commencing with the new virtual environment resulted in an enhanced sense of presence and improved distance estimation.
Valkov and Flagge [59] also employed a replicated version of the initial real environment to facilitate a seamless engagement of the user in the VR experience. They utilised an object-based morphing technique to transition between environments, focusing on altering only the objects outside the user's field of view to maintain continuity. They observed that employing a smooth transition resulted in users exhibiting increased confidence in their spatial awareness of the room boundaries, leading to faster walking speeds and maintaining smaller safety distances from real-world objects.

Visual Coherence
Visually coherent scenes combine the real world and augmentations within a single display in a consistent and coherent manner, going beyond simple AR registration. The integration of real and virtual content, taking into account registration and lighting, was first introduced by Fournier et al. [15] in the early 1990s. Various approaches have been employed since that time to acquire real-world data on lighting [1,2,5,23] and geometry [19,32,47,61]. Having a deeper understanding of lighting and geometry can significantly improve the transition process between stages of the RVC by minimising discrepancies between virtual and real objects, thereby opening the way for novel transition techniques that leverage this knowledge [42]. Collins et al. [9] describe the main challenges related to achieving visual coherence in AR, while Alhakamy and Tuceryan [3] provide a comprehensive summary of various approaches addressing visual coherence.

TRANSITIONING THE RV-CONTINUUM
We present a novel concept for a seamless transition between the real environment and a fully virtual environment, displayed through an HMD. To aid understanding, we provide definitions for the following key terms:
Target Environment: This refers to any desired VR environment that the user aims to experience. The target environment is essentially the reason why a user wears an HMD. The target environment varies with the specific application: it could be a video game, a hub environment, or a space where a dataset is analysed. For our user study, two example target environments were implemented (see section 5).
Real Environment: The real environment is perceived through a video-based see-through HMD, where the front-facing cameras of the HMD provide the real-world video stream. At this stage, there is no additional virtual content present.
Replica Environment: This refers to a digital replica of the current real surrounding in which the user is situated. The process of creating this replica and determining its level of detail is described in Section 5.2. Both the replica environment and the target environment are entirely represented in VR, but they are not identical.

Generic Concept of a Seamless Transition
In VR, users have typically been immersed in the target environment immediately, with the real environment shut out and replaced by the desired fully virtual environment. However, this approach involves a direct and abrupt change without any assistance for the user. Previous research in this domain has demonstrated the benefits of utilising a replica as a starting point to enhance the initiation process by increasing the sense of presence [57], which corresponds to Slater's place illusion [52]. Building upon this concept, we propose a novel approach that involves commencing the experience within the real environment, gradually introducing virtual content in the form of the replica, and subsequently transitioning to the target environment.
To facilitate a seamless and comfortable transition between the real environment and a fully virtual environment, we describe a novel transition procedure that utilises the entire RVC by employing a replica. This concept aims to visually support the total engagement into and exit from a target environment in VR, enabling a supportive and understandable transition. Particularly for users with limited experience with HMDs, this approach could provide support during the transition to enhance user experience.
To achieve this, the user is progressively introduced to the target environment. The virtual component on the RVC is incrementally increased, starting with a steadily growing number of virtual objects (AR) until only a small part of the real environment is visible (Augmented Virtuality (AV)), and finally, the entire scene is covered with virtual objects (VR). Similarly, the type of virtual content is designed to gradually align with the target environment. A digital replica of the real starting environment is utilised, with the real environment progressively overlaid by the replica. Once the replica is complete, it is continuously dismantled while simultaneously constructing the target environment, allowing it to emerge stepwise.
Figure 2: A concept for a seamless transition process incorporating the real environment (dashed blue), a replica (orange) and a target environment (green). The figure illustrates timing parameters of the transition, the distinct stages of the RVC and the varying proportions of each environment relative to the ongoing transition process over time.

The overall concept for a seamless transition process is given in Figure 2. The horizontal axis represents time, while the vertical axis depicts the displayed environment content. The environment content indicates the proportion of each environment in the overall scene. Regardless of the point in time, the sum of displayed environments (real or virtual) must always equal 1, as the scene must be completely filled, but with varying proportions depending on the stage of the process. The environment content can be either the real environment (dashed blue), the replica (orange), or the target environment (green). During the transition process, the user progresses through different stages of the RVC. Figure 1 illustrates what this seamless transition concept could look like in a specific implementation.
After putting on the HMD, the user can remain in the real environment for any desired duration (time in the real environment). Nothing has changed yet; the real surrounding is simply experienced through the HMD. Once the transition to the target environment is initiated, the proportion of the real environment decreases step-by-step as digital replica objects gradually overlay it. The proportion of the replica increases until the environment is complete, and the user finds themselves in VR. At this point, there is a moment when only the replica is visible, and the duration of this phase (waiting time in the replica) can vary depending on the specific implementation. A longer waiting time allows the user more time to process the replica but also extends the overall transition time and may impede the flow. Subsequently, the proportion of the replica decreases while the proportion of the target environment increases. Once the target environment is fully established, the user can stay in it for any desired duration. If the user wishes to transition back to the real environment, the transition process is initiated by the user, and the procedure is carried out in reverse order. The progression of environment proportions does not have to be linear, and the waiting time and overall transition time can be adjusted according to specific requirements.
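The constraint that the three environment proportions always sum to 1 can be illustrated with a small sketch. It assumes linear ramps and illustrative phase durations (`build`, `wait`, `swap` are hypothetical names and values); the concept itself explicitly allows non-linear progressions.

```python
def environment_proportions(t, build=6.0, wait=1.0, swap=6.0):
    """Return (real, replica, target) proportions t seconds after the
    transition towards the target environment was triggered.

    Assumptions (not from the paper): linear ramps and the phase
    durations given as defaults.
    """
    def clamp01(x):
        return max(0.0, min(1.0, x))

    if t <= build:
        # Real environment is progressively overlaid by the replica.
        replica = clamp01(t / build)
        return (1.0 - replica, replica, 0.0)
    if t <= build + wait:
        # Waiting time in the replica: only the replica is visible.
        return (0.0, 1.0, 0.0)
    # Replica is dismantled while the target environment is constructed.
    target = clamp01((t - build - wait) / swap)
    return (0.0, 1.0 - target, target)
```

Evaluating the function with a decreasing `t` gives the reverse direction, from the target environment back to the real environment.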

Replica Environment
The replication of the real environment plays a crucial role in achieving a seamless transition process. This allows the user to be initially transferred to a familiar environment that is already fully virtual. Various transition techniques are employed to construct this environment and subsequently modify it to resemble the target environment. Such significant changes are only possible in the virtual space, as the real environment cannot be easily altered. The level of detail in the replication depends on the Extent of World Knowledge [36] available about the real environment. This knowledge, in turn, is influenced by the available technical capabilities such as photogrammetry and depth sensors, enabling dynamic [55] or static representations (our approach, see section 5.2). When knowledge about geometry and lighting is available, environment-aware rendering [14] can be achieved, which is used to implement traditional visual coherence approaches in AR. Coherent lighting is used to represent the real-world lighting in both the replica and the target environment, supporting a smooth transition through consistent lighting conditions throughout the whole transition process.

Object Transition Order
In order to achieve a seamless construction and dismantling of the replica and target environment, the visual transition does not occur simultaneously for all objects but rather on an object-by-object basis. Since the number of objects in the scene can vary significantly, object groups can be formed. The overall duration of the transition is primarily influenced by the time it takes for the next object group to begin the transition procedure. To ensure that an object is either fully virtual or completely disappeared, a certain transition duration per object/object group is required. This duration indicates how long an object remains in the transition phase, for example, until a real table is completely obscured by its virtual counterpart. For a smooth transition, the next object group should begin the transition while the previous object group has not yet completed the transition entirely.
Due to the object-based nature of these techniques, not all objects initiate the transition simultaneously but can be processed in a predefined order. In addition to the fluent transformation caused by the transition technique itself, this order contributes to the gradual construction or deconstruction of the environments and affects the slopes depicted in Figure 2. The order in which objects or larger object groups initiate the transition in the replica follows the reversed Painter's algorithm [37]. As a result of this approach, virtual content only obscures real objects that would also be obscured in the real environment, and detailed depth information is not required to selectively occlude virtual objects. Large objects such as floors and walls should be transformed last. If walls, ceilings, or the floor were transformed first, the real video stream would quickly be obscured, and the VR component would increase rapidly without any AR or AV content being visible. The principle of a seamless transition of the replica would not work in such a scenario.
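The ordering rule described above can be sketched as a simple sort. This is an illustration only: the `ObjectGroup` type, the use of a single distance value per group, and the `is_room_boundary` flag are assumptions introduced here, not details of the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class ObjectGroup:
    name: str
    distance_to_user: float         # metres; hypothetical per-group measure
    is_room_boundary: bool = False  # floor, walls, ceiling


def replica_build_order(groups):
    """Order object groups front to back (reversed Painter's algorithm),
    with room boundaries such as floors and walls transformed last."""
    return sorted(groups, key=lambda g: (g.is_room_boundary, g.distance_to_user))


groups = [
    ObjectGroup("wall", 3.0, is_room_boundary=True),
    ObjectGroup("table", 1.0),
    ObjectGroup("shelf", 2.5),
    ObjectGroup("floor", 0.0, is_room_boundary=True),
]
order = [g.name for g in replica_build_order(groups)]
```

Because nearer groups are replaced first, a virtual object never covers a real object that stands in front of it, which is why no per-pixel depth information is needed in this scheme.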

CONCEPT AND DESIGN OF TRANSITION TECHNIQUES
Section 3 describes the overall transition process necessary for a seamless transition between the real environment and a target environment. To accomplish this transition process, an object-based transition technique is required to provide a procedural transition for each environment. This makes it possible to influence the gradients of the three involved environments shown in Figure 2. The general timing parameters outlined in Figure 2 remain the same across all four distinct techniques. Prior research on transition techniques demonstrates that the total transition time varies significantly, ranging from a few seconds to up to 30 seconds [20,21,43,59]. During user studies conducted by Feld et al. [12] and Pointecker et al. [43] on various transition techniques, participants reported that a lengthy transition time disrupted their workflow and that they preferred a short transition. On the other hand, a transition that occurs too quickly impedes a smooth engagement into the target environment and may cause discomfort due to the pronounced visual load. Therefore, the total transition duration needs to be a compromise between speed and potential visual strain. The overall transition process is initiated by a user interaction, either towards the target environment or towards the real environment, which triggers the individual object-based transitions. In addition to visual feedback, audio and haptic feedback is incorporated for each transition technique to ensure that the user is informed through multiple sensory channels [43]. In comparison to the visual changes, audio and haptic feedback are utilised as subtle support. The following gives a comprehensive overview of the visual concept and design for four distinct object-based transition techniques.

Fade
We selected the Fade transition technique because it is well-suited for object-based transitions, offering a subtle transition [30] with minimal visual distractions. Moreover, it is a widely used approach for changing environments in video games, films, and literature [43]. To avoid sudden appearances or disappearances of objects in both the replica and the target environment, the Fade transition gradually adjusts the transparency of the virtual objects (see Figure 3). During the transition from the real environment to the replica, the objects from the real environment undergo a fluent superimposition process, wherein they are continuously replaced by their corresponding counterparts in the replica. Once the replica has been fully faded in, the objects within the replica progressively increase their transparency, resulting in a reduction in their visibility. Simultaneously, the objects from the target environment begin to emerge as their transparency is successively decreased until the entire target environment becomes fully visible.
The transition from the VR environment to the real environment follows a reverse process. During the final transition phase, from AR to the real environment, the objects within the replica undergo a steady substitution with their corresponding real-world counterparts, creating the perceptual illusion of virtual objects being progressively replaced by their real counterparts.
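As a minimal sketch of the Fade behaviour, the opacity of a single virtual object can be expressed as a function of the time since its transition started. The smoothstep easing and the default duration are assumptions chosen for illustration; the text only specifies that transparency changes gradually.

```python
def fade_in_opacity(elapsed, duration=2.0):
    """Opacity (0 = invisible, 1 = opaque) of a virtual object fading in.

    Assumption: smoothstep easing, so the change starts and ends gently.
    """
    p = max(0.0, min(1.0, elapsed / duration))
    return p * p * (3.0 - 2.0 * p)


def fade_out_opacity(elapsed, duration=2.0):
    """Opacity of a virtual object fading out (mirror of fade_in_opacity)."""
    return 1.0 - fade_in_opacity(elapsed, duration)
```

A replica object overlaying the video stream would use `fade_in_opacity`; when the replica is dismantled, its objects use `fade_out_opacity` while target objects fade in at the same time.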

Dissolve
For the next transition technique, we decided to use a technique that is more visually prominent than the rather subtle and visually simple Fade, to draw the user's attention during the transition process. The Dissolve transition disintegrates or materialises objects with a burning or flooding effect on their surfaces. The visual effect manifests as a randomised noise pattern (e.g. Perlin noise [41]) spreading across the object. A radiant orange glow is applied to the edges of the spreading pattern, creating the visual impression of the object undergoing dissolution or emergence through a melting process (see Figure 4). Unlike a morph effect, the 3D model and structure of the objects remain unaltered. We chose this visual effect as it is commonly employed in video games (e.g., Assassin's Creed [58]) or demos (e.g., Varjo) to add or remove objects and characters in a manner that is both plausible and aesthetically pleasing. When transitioning between the replica and target environment, objects either vanish or materialise with the dissolve effect, as they lack a corresponding counterpart.
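The noise-threshold idea behind such a dissolve effect can be sketched as follows. This is not the authors' shader: uniform white noise stands in for Perlin noise to keep the example dependency-free, and `edge_width` and the state names are hypothetical.

```python
import random


def dissolve_state(noise_value, progress, edge_width=0.05):
    """Classify one surface point during a dissolve-out.

    noise_value: value of the noise pattern at this point, in [0, 1).
    progress:    0.0 (object intact) to 1.0 (object fully dissolved).
    Returns "gone", "glow" (the bright edge band) or "solid".
    """
    # Scale so that nothing glows at progress 0 and everything is gone at 1.
    threshold = progress * (1.0 + edge_width)
    if noise_value < threshold - edge_width:
        return "gone"
    if noise_value < threshold:
        return "glow"
    return "solid"


def make_noise(width, height, seed=0):
    """White-noise stand-in for a Perlin noise texture (assumption)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(width)] for _ in range(height)]
```

Running the same classification with `progress` going from 1.0 to 0.0 gives the materialising direction. With smooth noise such as Perlin noise, the "glow" band forms connected fronts that spread across the surface instead of isolated points.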

Translate
The choice of the Translate technique is inspired by a real-world metaphor where shutters and blinds can be opened or closed by sliding them. This transition technique involves the movement of objects into or out of the environment to facilitate the transition between the real and target environment (see Figure 5). Object translation is also used in films (e.g., Inception [38], Shazam! Fury of the Gods [48]) to transform the entire environment. However, the plausibility of this technique varies depending on the specific object involved and the user's expectations. While this may be a realistic behaviour for blinds and curtains [60], as it is expected for them to move at some point, even without user interaction, it may not necessarily apply to objects like tables or walls. To ensure smooth motion, the virtual objects gradually decelerate as they approach their target position until they come to a complete stop. When moving away, the speed increases. The virtual objects can move either upwards or downwards. Objects closer to the floor move downwards, while objects closer to the ceiling move upwards. This approach minimises the distance covered by the objects during their movement. If the virtual objects are moved outside the physical floor or ceiling during the transition, they undergo a fading effect, resulting in their disappearance or appearance. This creates the illusion that the virtual objects are located beneath the actual floor or above the ceiling, seemingly vanishing behind them.
Unlike Fade and Dissolve, the visual effect does not start directly at the position of the real object. Instead, the replica objects gradually move towards their corresponding real-world positions until the end of the transition process. Conversely, when transitioning from the replica to the real environment, the replica object is moved away from its real-world position. Because real objects cannot move, there is always a visual mismatch during the transition process. No visual mismatch occurs when transitioning between the replica and target environments because, in this scenario, all objects can move and do not need to replace any real objects.
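The motion rules of Translate (direction chosen by proximity to floor or ceiling, deceleration on arrival, acceleration on departure) can be sketched like this. The quadratic easing curves, the floor at height 0 and all function names are assumptions made for the example.

```python
def _clamp01(x):
    return max(0.0, min(1.0, x))


def translate_direction(object_height, ceiling_height):
    """Return -1 (move down through the floor) or +1 (move up through the
    ceiling), whichever boundary is closer. The floor is assumed at 0."""
    return -1 if object_height < ceiling_height / 2.0 else 1


def incoming_offset(elapsed, duration, travel_distance):
    """Remaining distance to the final position for an arriving object.
    Quadratic ease-out: fast at first, decelerating to a complete stop."""
    p = _clamp01(elapsed / duration)
    eased = 1.0 - (1.0 - p) ** 2
    return travel_distance * (1.0 - eased)


def outgoing_offset(elapsed, duration, travel_distance):
    """Distance from the original position for a departing object.
    Quadratic ease-in: starts slowly, then the speed increases."""
    p = _clamp01(elapsed / duration)
    return travel_distance * p ** 2
```

The signed vertical displacement of an object is then `translate_direction(...)` multiplied by the respective offset; the additional fade while an object passes through the floor or ceiling is omitted here.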

Combine
For our last technique we selected a combined approach of all three previously described transition techniques to highlight the strengths of each individual technique, which can be seen in Figure 1. The Combine transition utilises the Fade transition for large surfaces such as walls, floors, and ceilings, as it is less noticeable and therefore less distracting when transforming such large objects. Using Dissolve or Translate on these large surfaces would introduce more visual disturbances. Therefore, Dissolve is applied to objects closer to the user and objects that require the user's attention, such as tables, chairs, or screens. These objects also define the boundaries of the space and potential obstacles in the real environment. The prominent dissolve effect makes these objects more quickly and easily recognisable. In the target environment, Dissolve is also used for objects that need the user's attention. To achieve an authentic impression, Translate is used in the replica for blinds and curtains, as these are also used in a similar way in the real world. This increases the credibility of the replica. If such objects are present in the target environment, Translate is also applied there.
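The assignment of techniques in Combine amounts to a lookup by object category, sketched below. The category strings are hypothetical labels, and treating Dissolve as the default for every remaining object is an assumption; the text names tables, chairs and screens as examples.

```python
LARGE_SURFACES = {"wall", "floor", "ceiling"}
SLIDING_OBJECTS = {"blind", "curtain"}


def combine_technique(category):
    """Pick the per-object technique used by the Combine transition."""
    if category in LARGE_SURFACES:
        return "fade"       # least distracting on large areas
    if category in SLIDING_OBJECTS:
        return "translate"  # matches how these objects move in reality
    return "dissolve"       # attention-drawing effect for obstacles etc.
```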

IMPLEMENTATION DETAILS
This section describes the specific implementation and parameterisation of the concepts presented in sections 3 and 4.

General Transition Parameters
In order to minimise the duration of the transition process while allowing for sufficient processing of visual effects without overwhelming the user, we evaluated the parameters for transition time in a pre-study, which is described in section 6.1. The results indicated that a total transition time of 13 seconds is a good compromise between speed and visual load. Accordingly, the waiting time in the replica was set to 1 second, allowing the user to perceive the completeness of the replica before transitioning to the target environment. The transition time per object was set to 2 seconds, which proved to be sufficient for all four transition techniques to adequately represent the visual changes. In order to minimise the overall duration of the transition process, objects were grouped into object groups. These object groups behave as a single object during the transition and therefore start the transition process simultaneously. A total of 12 object groups were created for each environment (replica, target environment 1, and target environment 2). To ensure that the next object group begins the transition while the previous ones are still undergoing the transition, the time until the next object group starts was set to 0.4 seconds.
The transition process is initiated by pressing a button on a controller. This controller also provides haptic feedback by rumbling at the transition start, with the vibration intensity increasing until the replica has fully appeared. Subsequently, the vibration intensity gradually decreases and stops completely once the target environment is complete. For audio feedback we chose ambient sounds with a futuristic aesthetic, which are played during the transition process as subtle feedback. To ensure better comparability in the user study outlined in section 6, the total transition time, as well as the haptic and audio feedback, is the same for all four transition techniques and the baseline condition.
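The staggered start of object groups can be sketched with the parameters given above (12 groups, 0.4 seconds between group starts, 2 seconds per group). The function names are hypothetical, and the sketch covers a single build-up phase only; it does not attempt to reproduce how the phases and the waiting time add up to the reported total of 13 seconds.

```python
def group_start_times(n_groups=12, stagger=0.4):
    """Start time in seconds of each object group within one phase."""
    return [i * stagger for i in range(n_groups)]


def phase_duration(n_groups=12, stagger=0.4, per_group=2.0):
    """Time until the last group of a phase has finished its transition."""
    return (n_groups - 1) * stagger + per_group


def groups_in_transition(t, n_groups=12, stagger=0.4, per_group=2.0):
    """Indices of the groups that are mid-transition at time t."""
    return [i for i, start in enumerate(group_start_times(n_groups, stagger))
            if start <= t < start + per_group]
```

Under these assumptions one phase lasts (12 - 1) x 0.4 s + 2 s = 6.4 s, and because the stagger is shorter than the per-group duration, several groups are always in transition at once.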
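The described haptic envelope, rising until the replica is complete and falling until the target environment is complete, could look like the following sketch. The linear ramps, the normalised intensity range and the example timestamps are assumptions; the text does not specify the curve.

```python
def vibration_intensity(t, replica_complete_at, target_complete_at, peak=1.0):
    """Controller vibration intensity in [0, peak] at time t (seconds)
    after the transition was triggered. Linear ramps are an assumption."""
    if t <= 0.0 or t >= target_complete_at:
        return 0.0
    if t <= replica_complete_at:
        # Ramp up while the replica is being constructed.
        return peak * t / replica_complete_at
    # Ramp down while the target environment is being constructed.
    return peak * (target_complete_at - t) / (target_complete_at - replica_complete_at)
```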

Replica and Coherent Lighting
We conducted the user study in our laboratory, which is why we created a geometric-static replica of this room.This replica contains the room boundaries, including doors and windows.Also included in the replica are important main objects such as tables, screens and shelves with a low level of detail.The material properties of the objects consist of textures taken from the real environment.
To achieve the impression of realistic lighting in the replica, the sun is simulated as the primary light source and the colour grading in the replica is based on the video feed from the passthrough cameras. The position of the sun is determined based on the current GPS coordinates and time [33]. Leveraging the available knowledge of room and window geometry, the actual angle of sunlight entering the room can be simulated in the replica. Real-time weather data is also incorporated into the simulation, allowing for the representation of cloudy skies, rain or snow. This also affects the light in such conditions and prevents direct sunlight from entering the room. However, other light sources and indirect reflections are not taken into account. The combination of geometry (replica) and light knowledge enables us to implement coherent lighting throughout the transition process. According to Pointecker et al. [42], our implementation meets the criteria for visual coherence in terms of geometry at level 3 and in terms of lighting at level 2. Figure 6 demonstrates how the combination of real-world light and geometry knowledge accurately replicates the shape and angle of light rays in the replica. We investigated the perception of coherent lighting during the transition process within the user study (see section 6.4).
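To illustrate how a sun angle follows from location and time, the sketch below uses a common textbook approximation for solar declination and elevation. It is not necessarily the method of [33]: it takes local solar time as input (the longitude and time-zone corrections are omitted), ignores atmospheric refraction, and does not compute the azimuth that a full simulation would also need.

```python
import math


def solar_declination_deg(day_of_year):
    """Approximate solar declination in degrees (simple cosine model)."""
    return -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))


def solar_elevation_deg(latitude_deg, day_of_year, solar_hour):
    """Approximate elevation of the sun above the horizon in degrees.

    solar_hour is local solar time in hours (12.0 = solar noon); this
    simplification is an assumption of the sketch.
    """
    decl = math.radians(solar_declination_deg(day_of_year))
    lat = math.radians(latitude_deg)
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))
    sin_elev = (math.sin(lat) * math.sin(decl)
                + math.cos(lat) * math.cos(decl) * math.cos(hour_angle))
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_elev))))
```

A negative elevation means the sun is below the horizon, in which case no direct sunlight would be cast through the windows of the replica.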

Target Environments
The user experience regarding transition techniques between AR and VR relies on the specific use case and can be categorised based on professional and non-professional context [43]. To assess their transition techniques, Feld et al. [12] conducted an evaluation by seamlessly transitioning between an office environment and a farm environment, which represent both contexts. To cover both application areas in our user study, we implemented two different target environments. Target Environment 1 presents a medieval scenery with a fountain and medieval objects such as pots and swords (see Figure 7 left). Target Environment 2 showcases a classic office environment with a large conference table and chairs (see Figure 7 right). Both environments are designed so that the virtual objects also delineate the real tracking area, creating a believable virtual boundary for the tracking space.

Baseline Condition
In the baseline condition the transition process is implemented without the replica environment. This allowed us to evaluate the impact of the replica on the perception of the transition process in the user study. As a transition technique, we chose the Fade technique to enable a transition between the real and the target environment. Fade is a widely used transition technique [12,21,35,39,51] known for its visual simplicity, making it suitable for various application domains [43]. For better comparability, the total transition time is the same as for all other transition techniques (13 seconds).

USER STUDY
For the user study we chose a qualitative approach to gain detailed insights into the reasoning for participants' preferences. Additionally, we used the short version of the User Experience Questionnaire (UEQ-S) [49] to gather comparable scores for user experience. We also employed the Fast Motion Sickness Scale (FMS) [26], which is a single-item verbal rating. This was included to make sure that all participants were feeling well and could abort the study when necessary.

Pre-Study
We conducted a short pre-study with two male participants aged 32 and 37. This allowed us to validate the study design and prototype parameters, such as the measure of motion sickness and the overall length of the study, and to collect initial feedback on the implementation of the visual effects. We mainly used the same procedure as in the main study, which is described in section 6.5. To select a measure for motion or simulator sickness we used the Virtual Reality Sickness Questionnaire (VRSQ) [28] for one participant and a combination of the Simulator Sickness Questionnaire [25] and the FMS for the other participant. We selected the FMS for the main study, as it provides a quick and simple overall measure of motion sickness. Furthermore, one participant mentioned slight nausea symptoms, which are not measured by the VRSQ.
For the study parameters we adapted the length of the transition process, as both participants mentioned several times that it took too long to transition from reality to the target environment. Therefore, we shortened the transition time from 16 seconds to 13 seconds. Given that our entire transition process passes through three distinct environments, these 13 seconds correspond to a transition time of 4.3 seconds per environment. We did not further shorten the total transition time to ensure that the transition between each environment remains perceivable.

Participants
We invited 16 participants to take part in the study (6 female, 10 male). The average age was 30.9 years (SD = 9.09). Eight participants had a university degree while the other eight were undergraduate students. Eight worked in or studied Software Engineering, seven worked in research and one was an elementary teacher. While two participants had no experience in using VR or AR applications, ten reported that they had no experience with developing software for VR and AR. All but one participant had normal or corrected-to-normal eyesight; however, this participant did not report any trouble with the task.

Apparatus
The study prototype was developed using Unity (2021.3.2f1) and ran on the Varjo XR-3 HMD (see footnote 2). In addition, the HTC Vive handheld controllers were used as an input device. The HMD was powered by a GeForce RTX 3090, an Intel Core i9-11900K, and 64 GB of RAM, resulting in an average frame rate of 85 frames per second. The tracking space measured 4x4 meters. We have made our study prototype available on GitHub (see footnote 3) as an open-source project, facilitating future enhancements and developments of our approach.

Study Design
For the study design we chose a qualitative approach with a within-subjects design and five conditions: Fade, Dissolve, Translate and Combine, mentioned in section 4, and a Baseline condition, see section 5. We included two different target environments so participants could see and discuss the transitions and the replica in both a fictitious and a realistic virtual environment in the qualitative interview. We did not include this factor in the questionnaires, as the main focus of our study was the different transition techniques and the nature of the study was qualitative. This is also true for the light coherence factor. We included it to see whether participants would notice it and to gain first insights into the relevance of coherent lighting, not to draw quantitative conclusions. This means that for each condition that included a replica, participants completed four trials, covering coherent and incoherent lighting for each target environment. In the Baseline condition, there was no coherent lighting, as this is only applied to the replica. However, for consistency, participants completed two trials per target environment. We distinguished between the presence or absence of sunlight. If sunlight was not present, it was either due to cloud obstructions or the current position of the sun. Depending on the actual weather conditions, the incoherent condition represented lighting conditions opposite to those of the real environment. For counterbalancing we used two balanced Latin squares: one for the combination of lighting coherence and target environments within each condition, and one for the transition techniques. However, the Combine condition was always applied as the last transition technique to ensure that participants had already seen each of the visual effects it encompassed. While this choice does influence the results of the Combine technique, it eliminates the confounding familiarity factor for the other techniques that would have occurred if participants had seen the Dissolve, Fade and Translate effects before the respective condition. In each trial participants had to complete a simple search task, where an object of the target environment was shown to the participants on a printout in the real environment and participants selected the respective item in the target environment.

2 https://varjo.com/products/xr-3/
3 https://github.com/fp-hive/Replica-Enhanced_Transitions.git
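The counterbalancing scheme can be sketched as follows. The construction below is the standard one for balanced Latin squares of even size. Which conditions entered the square, the order of the `techniques` tuple and the assignment of participants to rows by index are assumptions for illustration; the text only states that Combine always came last.

```python
def balanced_latin_square(n):
    """Return an n x n balanced Latin square (n must be even).

    Each item appears once per row and column, and each ordered pair
    of neighbouring items occurs exactly once across all rows.
    """
    if n % 2 != 0:
        raise ValueError("this construction is balanced only for even n")
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first = [(j + 1) // 2 if j % 2 else (n - j // 2) % n for j in range(n)]
    # Every further row shifts the first row by one (mod n)
    return [[(x + i) % n for x in first] for i in range(n)]

def technique_order(participant_index,
                    techniques=("Fade", "Dissolve", "Translate", "Baseline"),
                    last="Combine"):
    """Order of conditions for one participant, with `last` appended."""
    square = balanced_latin_square(len(techniques))
    row = square[participant_index % len(techniques)]
    return [techniques[i] for i in row] + [last]
```

With 16 participants and four rows, each row of the square would be used four times under this assignment.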

Procedure
First the participants gave informed consent and filled out a demographic questionnaire. Then participants were briefed on the study procedure. They were also informed about the symptoms of motion sickness and the FMS. For the FMS, participants were asked to give a verbal rating of their experienced sickness from 0, meaning no sickness, to 20, representing substantial sickness. They were further instructed to focus on symptoms of nausea, general discomfort, and stomach problems when giving their ratings. They were told that there are two different target environments and a virtual replica of the room in four conditions. They were not informed about the lighting coherence factor, so that we could find out whether participants would notice the difference in lighting in the replica. Then participants started the first condition. At the beginning and the end of each condition, the FMS was applied. Following each condition, participants completed the UEQ-S and a short semi-structured interview. After completing all conditions, participants ranked all conditions from most to least favourite. Then participants answered another semi-structured interview including questions on the usefulness of replicas and coherent lighting in the replica. In the end participants were thanked and offered candy as a small sign of appreciation. On average a session took 1 hour and 6 minutes.

Results
In the data analysis, we used a Friedman test for the FMS and UEQ-S scores. In case of statistical significance at the 0.05 level, we conducted a Dunn-Bonferroni post-hoc test for the pairwise comparison. The audio and video recordings from the semi-structured interviews were transcribed and grouped into thematic clusters.
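For readers unfamiliar with the Friedman test, the following minimal sketch computes its test statistic from repeated-measures data (one row per participant, one column per condition). It uses average ranks for ties but omits the tie correction that statistics packages typically apply, so results can differ slightly when ties occur. The function names are hypothetical and this is not the authors' analysis script.

```python
def average_ranks(values):
    """Rank values from 1..k; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = mean_rank
        i = j + 1
    return ranks

def friedman_statistic(rows):
    """Friedman chi-square statistic (without tie correction)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom, which gives 4 degrees of freedom for five conditions.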
6.6.1 Motion Sickness. We calculated the difference in FMS scores from the beginning to the end of each transition. There was no significant difference in FMS scores between the conditions (χ²(4) = 4.64, p = 0.326). The full FMS scores can be found in the supplemental material.
6.6.2 User Experience. In terms of user experience, all transition techniques received an overall score greater than 0.80, which is considered a positive evaluation result for the UEQ-S, except for the Baseline condition, which was evaluated as neutral with an average score of 0.30. On the pragmatic subscale, Baseline received an average score of -0.02 and Translate received 0.75. All other transition techniques were again evaluated positively. Finally, on the hedonic subscale, Baseline was again evaluated as neutral with an average score of 0.61, while all other techniques received a positive score greater than 0.80, see Figure 8. We also found significant differences for the hedonic subscale (χ²(4) = 16.01, p = 0.003), with the post-hoc test showing that the Baseline condition received significantly lower ratings than Dissolve (z = −1.66, r = −0.74, p = 0.03), Translate (z = −1.68, r = −0.75, p = 0.025) and Combine (z = −1.72, r = −0.76, p = 0.021). We also found a significant difference for the overall scale (χ²(4) = 14.08, p = 0.007). There, the post-hoc tests revealed that the Baseline condition was rated significantly lower than the Combine condition (z = −1.81, r = −0.81, p = 0.01).
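As a quick consistency check, the significance levels reported above follow from the three Friedman statistics (4.64, 16.01 and 14.08) with 4 degrees of freedom. For an even number of degrees of freedom the upper-tail probability of the chi-square distribution has a closed form, so no statistics library is needed. The helper name `chi2_sf_even` is hypothetical.

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable
    with an even number of degrees of freedom `df`:
    exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

for stat in (4.64, 16.01, 14.08):
    print(stat, round(chi2_sf_even(stat, 4), 3))
```

Rounded to three decimals, this yields 0.326, 0.003 and 0.007, matching the values reported for the FMS scores, the hedonic subscale and the overall scale.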

6.6.3 Ranking of Transition Techniques. The ranking from most favourite to least favourite transition reveals a general tendency but not a definitive answer as to which technique was preferred, see Figure 9. Overall, the Combine technique was ranked the highest (Mdn = 2, IQR = 1-3), Dissolve came in second (Mdn = 2, IQR = 1.75-4), the third place is shared by Fade (Mdn = 3, IQR = 2-4) and Translate (Mdn = 3, IQR = 2-4), and Baseline came in last (Mdn = 4.5, IQR = 3-5). Figure 9 also shows that Combine was mostly ranked in places 1-3 and Baseline was mostly ranked in places 3-5. Furthermore, it shows that both Translate and Fade received very mixed rankings, with almost equal numbers of votes in all positions. The rankings for Dissolve show that participants either really liked it and ranked it in first or second place, or did not like it and ranked it in fourth or fifth place, with no rankings in third place.
6.6.4 Qualitative Results. The qualitative data from the interviews was transcribed and analysed using "closed" or "a priori" codes and themes, which is common and accepted practice in Human-Computer Interaction [8]. We report the participant codes of the respective subjects of our study to display the grounding of our findings in our data and the overlap of opinions in the qualitative interviews.
The codes we analysed in our qualitative data were based on our interview questions and are efficiency, usage scenarios, emotional response, and spatial orientation.For each of these results we distinguish between visual effects and replica.Additionally, we summarised the results for the visual coherence.
Fade: This transition is smooth (P9-10) and predictable (P13-15) without unnecessary effects (P14).It facilitates the transition by continuous transformation with enough time to get used to the new environment (P6-7, P9, P10).It also supports spatial orientation in the new environment (P9, P12, P14-15).However, it is also considered boring or visually unattractive (P2-5, P8-9, P13-14) and it may be unclear to users what objects they should focus on during the transition (P4).It is useful for frequent changes between environments (P5, P14), especially for beginners (P2) and conventional business applications (P13).Furthermore, it is useful for transitions to a target environment that is similar to the real environment (P1, P5, P7).
Coherent Lighting: The coherent lighting was only actively noticed and called out by P7.Another participant (P12) mentioned it when we specifically asked them about the change in light.P9 only mentioned they liked the incoherent lighting scenarios, as the replica was in their case sunnier than the real environment and they enjoyed transitioning through a friendlier version of reality into the target environment.P7 preferred the incoherent lighting in the replica as they felt that it emphasises that the replica is different from reality and that it is already a virtual environment.
Among the participants who did not notice the coherent and incoherent lighting, eight felt that direction and brightness should be generally similar, while five mentioned that they believe it makes no difference. Two also stated that they expect the virtual environment to have different lighting (P5 & P10).
General Remarks: Overall, participants felt that transitions were too slow: Fade (P2, P6, P15), Dissolve (P3-6, P10-14), Translate (P3-4, P13-14), Combine (P2, P8-9, P13) and Baseline (P3, P6, P11, P14). However, this is also due to the fact that the focus of this study was the transitions themselves and participants had to transition more frequently than in a real-world use case. This was also explicitly mentioned by P14, who stated that transitions can take more time when they are not used as frequently.

DISCUSSION
The user study shows that participants appreciated the replica as an intermediate step during the transition (P2, P4-5, P7-12, P14, P16). This is also reflected in the rankings for user experience. Participants who found the replica inefficient at first even changed their mind during the course of the study, as they expressed during the interview. The use of the replica allows users to get familiar with the environment and makes it easier to follow the transition cognitively. Furthermore, the replica could aid with avoiding real obstacles in the virtual environment, as one user stated: "it conveys the feeling that you know where you can step without stumbling" (P11). This is also consistent with Valkov's observation of people walking more confidently after entering the environment using a smooth transition [59]. Moreover, the replica allows users to experience the transition process first with the objects of the real environment and then continue to the different virtual environment. This leads the users' attention towards the position of the real objects, where the replica might serve a similar purpose as visual landmarks that have been used in other studies to enhance spatial orientation [24,31]. Overall, the replica provides the opportunity to apply visual effects to objects existing in reality.
In terms of visual effects, all effects except for Baseline achieved an acceptable user experience score in the UEQ-S. Dissolve and Translate received very mixed responses with both positive and negative rankings, which is also reflected in Figure 9. Personal preference as to whether the visual effect was pleasant or unpleasant seems to be the defining aspect in this case. However, when looking at the negative comments for both techniques, Dissolve received comments like "visually unpleasant" and "too much going on", while Translate was described as "uncomfortable" or even "threatening". Therefore, Translate produced strong negative emotions when participants were not fond of the effect. Nonetheless, users felt that the gradually incoming objects facilitate spatial orientation. To improve Dissolve, participants suggested muting the colours of the pattern to make it less obtrusive (P3, P7, P14-15). For Translate, participants suggested using it only for objects that would naturally behave that way (P12) or objects that are small enough to be picked up (P10).
Fade and Baseline were mostly characterised as conventional and common. However, Fade outperformed Baseline in user experience scores as well as in the ranking. Nevertheless, in both cases participants perceived the transitions as confusing, as there are lots of elements visible simultaneously during the transition.
The Combine technique was the most popular in the ranking and in the UEQ-S overall score, and it achieved the second-highest score on the hedonic scale. It allows for mitigation of the negative effects of the other techniques. For instance, participants liked that Translate was only applied to objects that would naturally behave that way, such as the outside blinds. This validates our suggested design approach for translating objects and affirms the effectiveness of this technique for objects of this nature. Furthermore, users positively remarked that the Dissolve pattern was only used for objects and not walls and floors, making the effect less obtrusive and guiding the users' gaze during the transition. Therefore, participants did not have the same issue as they did with the Fade effect, where they did not know what they should look at during the transition.
The coherent lighting was only actively noticed by one participant (P7), who mentioned that the incoherent lighting was more useful as it emphasises that the replica is already a completely virtual environment. Other participants felt that the lighting should be similar in terms of brightness and direction but that there is no need for identical lighting in the replica. Moreover, P4 suggested slowly adjusting the brightness in the replica from the realistic illumination to gradually match the light source of the target environment. This suggests that strict light coherence is not necessary within the transition process. On the contrary, lighting incoherence can be used to emphasise the transition process. As one participant explained about the incoherent lighting, "it makes it easier to recognise that it is a replica and not the real environment" (P7).
Some participants pointed out that the transition duration is too long. As reported in other publications, the duration for transitions can range from a few seconds [21] to 15 seconds [20] or even 60 seconds [59]. The preference for duration depends on the individual user and the specific application context. In contrast to traditional 2D visualisation recommendations [45], our transition process distributes visual load across the entire FoV of the HMD. In order to prevent users from feeling overwhelmed during this process, which could adversely affect the overall transition experience, we opted for a more cautious approach in determining the duration. This approach ensures a generous amount of time for a smooth transition. Especially in cases of infrequent transitions, a longer transition duration can be appropriate, as noted by one participant (P14), see Section 6.6.4 - General Remarks. However, we believe that the transition time did not significantly impact the general insights into the techniques, given that the duration was the same for all techniques.
In conclusion, our main findings are:
• The digital replica is a useful intermediate step in the transition process, with users reporting that it facilitates cognitively following the changes in their environment, see Section 6.6.4 - Replica and Figure 8.
• Translate is visually exciting, but can also be perceived as threatening, see Section 6.6.4 - Translate.
• Dissolve is visually prominent and thus, participants mentioned that it can guide the users' attention. Yet, it was also reported as distracting, see Section 6.6.4 - Dissolve.
• Fade was perceived as simple and effective, but was deemed boring and was mentioned to be suitable for conventional applications, see Section 6.6.4 - Fade.
• Combine could be a suitable compromise between visual load and visual guidance during the transition, see Section 6.6.4 - Combine. While it can still be perceived as distracting, participants tended to rank it in one of the first three tiers, see Figure 9.
• Baseline was mostly perceived as confusing; however, some users might be familiar with it due to its use in video games, as mentioned by one participant, see Section 6.6.4 - Baseline and Figure 8.
• Coherent lighting was only noticed by one participant and thus plays a minor part in our transition process. However, incoherent lighting or slowly transforming the lighting in the process could be implemented to emphasise the transition, see Section 6.6.4 - Coherent Lighting.

CONCLUSION
This work presented a generic concept for transitions to a target environment across the whole RVC as well as a digital replica of the real environment. Furthermore, four specific transitions were implemented and compared to a baseline condition in a qualitative user study. The transitions were Fade, Translate, Dissolve and Combine. The Baseline condition was Fade without the digital replica. Additionally, coherent lighting was implemented in the replica environment.
The study revealed that users felt that the digital replica facilitates cognitively following the transition to the target environment. Furthermore, the study indicates that it is not essential to implement identical lighting in the replica during the transition. It is more important that the general direction of the light matches the real environment. The ideal visual effect for the transition process depends on the individual user preference. Nevertheless, a combination of different effects can provide a useful compromise, where the users' gaze can be steered towards specific objects without overwhelming them with excessive visual effects.
Future work should include different transition techniques, such as morphing and visual landmarks that provide spatial orientation. Furthermore, visual coherence should be explored in more detail, including a transition of the light source from real to target environment, along with exploring an automated approach for creating the replica environment. Future work should also include a quantitative study, especially considering the Combine technique, as it was not counterbalanced in this study for the sake of user understanding. Therefore, future work can investigate different variations of the Combine technique using a comparative quantitative approach including efficiency. Finally, transitions should be included and examined in real-world use case studies.

Figure 3: Fade: (left) Prominent replicated objects are fully visible, while the floor and walls continue to undergo changes in transparency. (right) Intermediate state where the replicated objects are nearly transparent, and the target objects begin to emerge.

Figure 4: Dissolve: (left) Replica desk emerges through a melting effect. (right) Replica curtain disintegrates while objects from the target environment materialise with the same visual effect.

Figure 5: Translate: (left) The replicated display has already replaced the real one, and the replica table is moving from bottom to top, aligning itself with the position of the real table to replace it. (right) The last missing objects of Target Environment 2 are moving from the bottom towards their final position.

Figure 6: Comparison between the lighting conditions in the replica (left) and the real environment (right) at the same time.

Figure 7: (left) Target Environment 1 represents a playful medieval scene. (right) Target Environment 2 shows an office environment.

Figure 8: Results of the UEQ-S questionnaire for the overall scale and the pragmatic and hedonic subscales.

Figure 9: Results from participants ranking the techniques from most favourite to least favourite.