Aircraft Cockpit Interaction in Virtual Reality with Visual, Auditive, and Vibrotactile Feedback

Safety-critical interactive spaces for supervision and time-critical control tasks are usually characterized by many small displays and physical controls, typically found in control rooms or automotive, railway, and aviation cockpits. Using Virtual Reality (VR) simulations instead of a physical system can significantly reduce the training costs of these interactive spaces without risking real-world accidents or occupying expensive physical simulators. However, the user's physical interactions and feedback methods must be technologically mediated. Therefore, we conducted a within-subjects study with 24 participants and compared performance, task load, and simulator sickness during training of authentic aircraft cockpit manipulation tasks. The participants were asked to perform these tasks inside a VR flight simulator (VRFS) for three feedback methods (acoustic, haptic, and acoustic+haptic) and inside a physical flight simulator (PFS) of a commercial airplane cockpit. The study revealed a partial equivalence of VRFS and PFS, control-specific differences between input elements, irrelevance of rudimentary vibrotactile feedback, slower movements in VR, as well as a preference for PFS.


INTRODUCTION
Interactive spaces are work environments that integrate multiple connected computing devices, e.g., different digital inputs, controllers, and information displays inside a physical space [20]. Typically, they are intended to support creative or knowledge work inside meeting rooms, design studios, visualization labs, or libraries equipped with multiple mobile screens, interactive tabletops, or other large interactive surfaces, e.g., [20,51]. In contrast, our more recent research is concerned with safety-critical interactive spaces for supervision and time-critical control tasks, which are typically found in control rooms or automotive, railway, and aviation cockpits. They are usually characterized by a large number of small displays and physical controls such as switches, buttons, or dial knobs (see Fig. 1, 4).
This reliance on physical controls results in much greater effort and costs for their technical implementation or physical prototyping. Using virtual reality (VR) simulations instead of physical systems could substantially reduce these costs, particularly when using inexpensive, consumer-grade, off-the-shelf VR hardware. For example, one potential use of VR is enabling rapid testing of new cockpit designs [25] and control layouts without building costly physical mockups [19,31,32]. Another is using VR as a cost-effective and portable training supplement for safety-critical procedures without risking real-world accidents or occupying high-end physical flight simulators (PFS) [55], which have acquisition costs of USD 1 million or above and operational costs of USD 400-500 per hour [5].
However, previous research has shown that currently available off-the-shelf VR products are generally not well suited to simulate frequent interactions with many different physical cockpit elements [5]. For example, for some aviation tasks, a virtual reality flight simulator (VRFS) based on an exact replica of an aircraft cockpit using simple, off-the-shelf consumer VR can create a training experience that is already as successful as real-world training inside a fully fledged physical cockpit in a high-end, professional PFS. However, using VR comes at the cost of much slower task completion, increased perceived workload, and increased simulator sickness, primarily due to cumbersome interactions with simulated VR cockpit controls [5]. Our work is therefore concerned with the following questions: How can we employ current commercial, off-the-shelf VR technologies for a cost-effective simulation of interactive spaces that contain many physical controls? How can we do this while minimizing the negative effects of the often cumbersome interactions with such simulated switches, buttons, or dials in VR by better interaction design?
In this article, we report our results from a user study in which we used a commercial, off-the-shelf VR head-mounted display (HMD) and data glove for finger and hand tracking with three different feedback methods (i.e., acoustic, haptic, and acoustic+haptic) to improve users' interactions with simulated cockpit controls in VR. The data glove thereby enabled a controller-free detection of natural hand and finger motions outside the users' field of view (FOV) for mimicking real-world interactions with physical controls in VR. In addition, we also included a real-world, physical cockpit of a professional PFS as a baseline and "gold standard" in a fourth condition.
In a within-subjects study with 24 participants, we compared performance, task load, and simulator sickness during training of authentic cockpit manipulation tasks inside a VR replica of a Boeing 737-800NG cockpit for the three feedback methods. Additionally, we compared them to participants' interactions with the actual physical cockpit in a PFS. The participants were asked to train and execute typical aviation tasks by manipulating push buttons, rocker switches, and dial knobs, as these three control types represent about 90 percent of all cockpit elements of the Boeing 737-800NG. To achieve an in-depth analysis and understanding of the movements and interactions inside the VRFS and PFS, we recorded all hand and finger trajectories over time to differentiate the users' movement time (i.e., time to reach the target switch) from the manipulation time (i.e., time needed to set a switch to its target state).
The study revealed several findings about the advantages and disadvantages of using commercial, off-the-shelf VR technology to simulate safety-critical interactive spaces and the different methods for interacting with simulated controls in a VRFS:
(1) Equivalence of VRFS and PFS: There were no significant differences in error rates (and thus training success) between PFS and the three feedback methods in VRFS. There were also no significant differences between VRFS and the physical cockpit in terms of the Raw TLX subscales mental demand, physical demand, temporal demand, performance, and effort.
(2) Control-specific differences: There are no significant differences in manipulation time for push buttons and rocker switches in PFS and all VRFS conditions. It is, however, significantly higher in VR for dial knobs. Problems with dial knobs in VR were also confirmed in semi-structured interviews and contribute to a significantly higher Raw TLX frustration subscale for VRFS than for PFS.
(3) Irrelevance of vibrotactile feedback: Compared to acoustic feedback, the inclusion of the off-the-shelf data glove for rudimentary vibrotactile feedback did not significantly affect the manipulation time.
(4) Slower movements in VR: The average and median movement time in PFS was significantly lower than in all VR conditions. Participants generally moved their hands more slowly in VR, contributing to slower task completion.
(5) Preference for PFS and simulator sickness: Despite comparable objective performances, most participants subjectively preferred PFS over VRFS. Simulator sickness contributed to this. After exposure, the mean score of the Simulator Sickness Questionnaire (SSQ) remained in the "minimal symptoms" category for PFS but moved into the "significant symptoms" category for VRFS. However, the increased SSQ results can be mostly attributed to just three of the 24 participants. They were the only ones reporting strong symptoms. These symptoms were in the oculomotor or disorientation category (but not in nausea), primarily stating blurred vision and eye strain as reasons.
We conclude that commercial, off-the-shelf VR technologies can be used for cost-effective simulations of safety-critical interactive spaces, even when they contain many physical controls. However, although roughly equivalent performance and error rates can be expected, better interaction designs are necessary to improve the manipulation of more complex controls in VR, e.g., simulated dial knobs. Simply adding the rudimentary vibrotactile feedback of a commercial data glove will most likely not result in relevant improvements. Also, since hand movements are generally slower in VR, greater movement time and slower task completion in VR have to be expected, even once the problems of dial knob manipulation are reduced in the future.

RELATED WORK
This work is positioned in the context of three areas of related work: Virtual Reality Flight Simulation, Input Methods in Virtual Reality Flight Simulators, and Haptic Feedback in Virtual Reality.

Virtual Reality Flight Simulation
Flight simulators help to reduce the complexity of flying, as they allow training under safe conditions. The origins of physical flight simulators can be traced back to early non-digital examples using adapted parts from sewing machines [39]. Since then, flight simulators have become an essential part of pilot training, as they enable realistic training of essential airmanship skills. Pilots train standard operating procedures (SOP), critical situations, and especially emergencies with great realism, but without putting the aircraft, the crew, or even passengers at risk.
Recent flight simulators often employ VR technology for more realistic training conditions and, thus, new ways of aviation pilot training [35,41]. Such VRFS enable the training of pilots outside a physical cockpit but in a flexible, cost-effective, and sometimes photo-realistic interactive 3D space. Therefore, VR flight simulation is perhaps the most pervasive and successful part within VR simulation [40]. VRFS are used as professional training devices [44,58], for testing flexible cockpit layouts [4,59], or even for entertainment and gaming [50]. As low-cost alternatives to "full" VR, previous work also focused on basic cockpit training by learning check procedures from viewing 360° videos [36].
Airlines and flying schools are aware of the potential savings [12,52] offered by VRFS, as they seek affordable and realistic substitutes for PFS for parts of the pilot training [26]. Also, the use of VR in pilot training has already been approved for certain parts by the European Union Aviation Safety Agency (EASA). Previous work confirms the possible cost savings from using off-the-shelf hardware and software for cockpit familiarization training but also revealed problems with increased simulator sickness of VRFS compared to PFS [5].
VRFS have also been used as design tools. Previous work has compared the fidelity of a VRFS to a hardware cockpit mockup during flying tasks to evaluate the possible role of a VRFS in the early phases of the cockpit design process [32]. While some previous research has used simplified cockpit mockups [7,24,30,45], our research uses a full-scale replica of an identical aircraft type for comparing PFS and VRFS, in order to provide a high level of internal validity.

Input Methods in Virtual Reality Flight Simulators
Current VR technologies require the user's physical interactions for manipulating simulated buttons, switches, displays, etc. to be technologically mediated. This can happen by using physical mockups [32], holding input devices (e.g., VR controllers [5]), using ultrasound [15], optical tracking [4,57], or touch screens [21]. However, holding an input device limits the free movement of all fingers, and current input devices based on ultrasound, optical tracking, and touch screens provide limited flexibility, as they are constrained to a predefined position or a certain FOV of the provided sensor.
To avoid such trade-offs and also to provide haptic feedback, at least for selected controls, other researchers [27,58] and commercial products integrated physical joysticks and thrust levers into their VRFS. However, adding physical elements to a virtual cockpit makes the simulated cockpit less flexible toward the representation of different cockpit layouts.
Using voice commands and speech recognition could overcome the mentioned trade-offs, but previous work by Rustamov et al. shows an average of 89.6% correct recognition [42] in flight simulation, which is too low for safety-critical interactive spaces. Another approach for the interaction with virtual switches uses gaze-based interactions [49]. However, voice or gaze-based interaction does not make use of the pilot's muscle memory, one of five major attributes required for a safe flight [16].

Haptic Feedback in Virtual Reality
The lack of haptic or tactile feedback can have a negative effect on the user performance when interacting with virtual elements. For example, Aslandere et al. [4] used an optical system for finger tracking to generate a virtual hand within a virtual cockpit without haptic feedback, with which users achieved an average hit rate of only 77%. Novel input devices with haptic feedback can be used for enhanced user input in VR cockpits, such as the Haptic Revolver [54] or robotic arm-based systems like Snake Charmer [3], or even force feedback [1], but with limited movement space. However, the mentioned technologies are prototypes and are not easily available on the market yet (e.g., Dexmo Glove [17]).
Haptic feedback can also be generated using ultrasound [9,13], actuators [14], or glove-based approaches with vibrators [48,56]. Depending on their application area, vibrotactile devices are used on the wrist [53], arm [47], or on one [43] or even multiple fingertips [37,43,46]. None of the mentioned technologies were used in pilot training within a virtual cockpit representation of a commercial aircraft.

METHOD AND STUDY DESIGN
For our work, we compared a VRFS using different feedback modalities to a full-scale physical cockpit replica of a Boeing 737-800NG with all original instruments, which flight simulator enthusiasts built as part of a commercial attraction. As this PFS focuses on entertainment, it was not certified by the EASA. However, with the addition of fully functional circuit breakers, it would fulfill the requirements for a Flight and Navigation Procedures Trainer (FNPT) Level II simulator. Therefore, it supports the development of fundamental skills of pilot training and can be considered fully capable of basic cockpit manipulation tasks. The PFS was compared to a basic VRFS based on an HTC Vive Pro HMD. The VRFS used an identical virtual cockpit model of the Boeing 737-800NG. It was integrated, animated, and developed using Unity as the main software component. All hand and finger movements in the PFS and VRFS were tracked using a Manus data glove to ensure a valid comparison.
The goal of our study was a quantitative comparison of both simulator technologies in terms of basic cockpit interaction, self-reported task load, and self-reported simulator sickness during the manipulation of certain basic cockpit switches (dial knobs, rocker switches, and push buttons) that are used during the flight deck preparation and supplementary procedures according to the operations manual of the Boeing 737-800NG. At the end of the task, we performed a semi-structured interview that helped us to explain our quantitative results. As the chosen buttons, switches, and knobs (e.g., landing light, weather radar) are not directly related to the aircraft's controls, we kept the simulated aircraft motionless in order to prevent any distraction caused by the movement of the outside scenery.
All test conditions were presented consecutively, in an order randomized with a Latin square. They took place inside the full-scale replica cockpit of the Boeing 737-800NG, on the left cockpit seat (captain's seat), and were performed with the right hand only, at two predefined seat positions. These two seat positions allowed us to perform all tasks for one participant within a single session:
• For PFS, the seat was moved into a predefined forward position, from which all relevant physical switches were reachable. The participants wore tracked data gloves for tracking the right hand and fingers during the interaction with real-world physical switches.
• For VR, the seat was moved to a predefined backward position to increase the available physical space for freely moving the head-worn HMD and tracked hands and to avoid unintentionally touching physical switches or consoles.
For each task, the home position was a tangible marker (see Fig. 2A and B) on the right armrest that ensured an identical starting point of all trajectories in any condition, even when wearing the vision-blocking HMD. This home position allowed for a valid comparison between PFS and VRFS, as it is fixed to the right armrest of the left pilot seat while the seat was moved into the forward and backward position. In order to achieve correct scales and distance measurements within the virtual cockpit, two VR base stations and a VR tracker were set on predefined positions within the cockpit (see Fig. 2D).
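The counterbalancing described above can be illustrated with a minimal sketch. The text does not state which Latin square was used, so the cyclic square below, the round-robin assignment, and the function names `cyclic_latin_square` and `assign_orders` are assumptions made purely for illustration.

```python
def cyclic_latin_square(conditions):
    """Build a cyclic Latin square: row r is the condition list rotated by r.

    Every condition appears exactly once per row and once per position.
    (Assumption: the study's actual square is not specified in the text.)
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(participant_ids, conditions):
    """Assign presentation orders to participants round-robin over the rows."""
    square = cyclic_latin_square(conditions)
    return {pid: square[i % len(square)] for i, pid in enumerate(participant_ids)}


# Hypothetical example: 24 participants and the three VR feedback methods.
orders = assign_orders(range(1, 25), ["acoustic", "haptic", "acoustic+haptic"])
for pid in (1, 2, 3):
    print(pid, orders[pid])
```

With 24 participants and three conditions, each condition lands in each serial position equally often under this scheme.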

Participants
We invited 24 study participants. To avoid biases for or against new simulator technologies due to previous training or experiences, none of them had an active pilot license, more than 10 hours of experience in a flight simulator, or more than 10 hours of VR experience.
The participants (19-45 years, M = 33.71, SD = 8.29, 12 female, 12 male) were split into two groups. One half of the participants started with the PFS, and the other half started with the VRFS. We ensured that the data glove fitted tightly on the right hand so that the fingertips were not covered by fabric. This enabled unimpaired physical interactions with the physical switches in PFS, as the PFS was used as the "gold standard" (see Fig. 2C). Three of the participants were left-handed (12.5%), 21 were right-handed (87.5%). This is close to recent estimates of a 10.6% ratio of left-handed persons [34]. The participants' vision was either normal or corrected to normal, using glasses or contact lenses that fitted underneath the HMD. The interpupillary distance (IPD) was measured using the method provided by the HMD's manufacturer.

Hardware Configuration of PFS.
The PFS software was running on a PC with Windows 10, an Intel i7 CPU with 3.6 GHz, an Nvidia GeForce GTX 1080Ti GPU, and 32 GB RAM. The rendering of the external environment outside the cockpit (e.g., the runway) could be seen through the cockpit's windows and was projected with three HD projectors on a 180° cylindrical screen in front of the cockpit. However, our study contained no tasks that put the aircraft in motion and required viewing the external environment. Our PFS did not include a full-motion platform.

Hardware Configuration of VRFS.
The VRFS used a laptop with Windows 10, an Intel i5 CPU with 4.1 GHz, an Nvidia GeForce RTX 2070 Super GPU, and 32 GB RAM. As HMD for the VRFS, we used an HTC Vive Pro with positional tracking (six degrees of freedom), providing a resolution of 1440 x 1600 pixels per eye, and integrated headphones. One advantage of the used HMD is the high availability on the market, which (as mentioned in section 1) makes it interesting for flying schools to use them in classroom situations or even at home.
For the hand and finger tracking in VR, we chose to use Manus Prime II Haptic gloves. We decided on the Manus gloves as they provide basic vibrotactile feedback on each finger without covering the fingertips with fabric. Also, compared to other finger tracking systems based on front-facing cameras or sensors, the trajectories of hand and fingers can be tracked outside the FOV of the optical sensor of the HMD, which is a significant advantage given the importance of also enabling eyes-free, haptic-only interaction with switches in cockpits.
To mimic a more natural interaction with PFS switches, we included different feedback modalities in the VRFS. In the first condition, participants were provided with visual and vibrotactile feedback during the interaction with the cockpit switches to provide stimuli on each finger while manipulating different cockpit elements. In a second condition, participants were provided visual and acoustic feedback whenever a virtual switch was manipulated. In a third condition, visual, vibrotactile, and acoustic feedback were provided simultaneously. Accordingly, visual feedback (i.e., visual highlighting of the touched control) was present in every condition (Fig. 2D and E).

Simulation
Software Stack for PFS and VRFS.
Our implementation of the VRFS aimed to create a high level of similarity between the VRFS and PFS to ensure high internal validity of the study. We provided the identical aircraft type, cockpit layout, and switch positions in VRFS and PFS. Unity was the primary software component as it was used for the visual representation within the HMD, hand and finger tracking, playing audio files, recording the switch positions, measurement of timing and distance, and logging. The user interface in the VRFS was based on the Unity plugin and code examples provided by Manus. The interface between Unity and the PFS was built on network sockets connected to the ProSim software of the PFS. The PFS itself was running on the commercial simulator software Prepar3D by Lockheed Martin (Fig. 3). The VRFS was running at a minimum of 90 frames per second (FPS) during the test conditions, which is a recommended minimum for VR applications [2].

Independent Variables
To compare the differences between PFS and VRFS, we used the type of feedback methods for interacting with the cockpit's elements.

Feedback Methods.
The type of feedback during the interaction with cockpit elements was either PFS or a consumer-grade, stereoscopic, cost-efficient VRFS with three different feedback modalities, resulting in four different conditions that we presented counterbalanced: PFS, VRFS with acoustic feedback, VRFS with haptic (vibrotactile) feedback, and VRFS with acoustic+haptic feedback.

Tasks
The participants performed basic cockpit manipulation tasks based on check procedures from the operations manual of the Boeing 737-800NG. These tasks represent short but realistic actions regarding the pilots' checklist that can be performed with a single cockpit element (e.g., turning on the landing light). Participants were seated on the left seat of the cockpit. They executed actions with their right hand, which can be performed using either the index finger (push buttons and rocker switches) or thumb and index finger (dial knobs).
After an initial training phase, which was supported by the experimenter (first author of this paper and a former military jet fighter pilot with 18 years of experience in aviation, who also designed, executed, and evaluated this study), the participants heard randomized, recorded audio files that contained a voice command to manipulate a specific cockpit element. The cockpit elements (see Fig. 4) were divided into three different types (push buttons, rocker switches, and dial knobs) and three different areas (upper, middle, lower). This resulted in nine tasks, represented in a 3x3 matrix within the aircraft cockpit (see Table 1).

Dependent Variables
For our study, we determined the following dependent variables.

Task Completion Time [sec].
The task completion time was measured while performing each cockpit manipulation task (see Fig. 5). It was split into two components. Firstly, the movement time, which started with the initial hand movement and ended with entering a range of less than 3 cm to the relevant cockpit element. We determined a distance of 3 cm to provide space for the interaction and compensate for a slight mismatch in the finger tracking (as shown in Fig. 2F and G). Secondly, the manipulation time, which started at the end of the movement time and ended with reaching the desired final position of the cockpit element.
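The split into movement and manipulation time can be sketched as follows. This is a minimal illustration, not the study's Unity implementation: the sample format, the function name `split_completion_time`, and the assumption that the first sample marks the initial hand movement are all hypothetical.

```python
import math


def split_completion_time(samples, target_pos, final_state_time, radius_m=0.03):
    """Split task completion time into movement and manipulation time.

    samples: time-ordered list of (t_seconds, (x, y, z)) fingertip positions
             in metres; the first sample is assumed to be the movement onset.
    target_pos: (x, y, z) position of the cockpit element in metres.
    final_state_time: time at which the element reached its desired state.
    radius_m: proximity threshold (3 cm, as described in the text).
    """
    start = samples[0][0]
    for t, pos in samples:
        # Movement ends once the fingertip is closer than the threshold.
        if math.dist(pos, target_pos) < radius_m:
            movement_end = t
            break
    else:
        raise ValueError("fingertip never entered the target radius")
    movement_time = movement_end - start
    manipulation_time = final_state_time - movement_end
    return movement_time, manipulation_time


# Hypothetical trajectory approaching a switch located at the origin.
trajectory = [
    (0.0, (0.0, 0.0, 0.50)),
    (0.4, (0.0, 0.0, 0.20)),
    (0.8, (0.0, 0.0, 0.02)),
    (1.0, (0.0, 0.0, 0.00)),
]
movement, manipulation = split_completion_time(trajectory, (0.0, 0.0, 0.0), 1.5)
```

Task completion time is then simply the sum of the two returned components.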

Error Rate [%].
The participants had to perform basic cockpit manipulation tasks. Two types of errors were recorded: first, an incorrect switch error occurred whenever an incorrect cockpit element was manipulated; second, a switch position error was detected whenever a switch was left in an incorrect position at the end of each task.
Perceived Workload.
The participants' task load was measured using the NASA Task Load Index (NASA-TLX) [18] without paired comparisons of the subscales [8,29], also known as Raw TLX. After performing all tasks in the PFS and all tasks in VR, the participants rated their mental demand, physical demand, temporal demand, performance, effort, and frustration on a scale ranging from very low (0) to very high (+10).
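As a minimal sketch of how such ratings can be aggregated: Raw TLX skips the pairwise weighting, so an overall score is commonly taken as the unweighted mean of the six subscales. The function name `raw_tlx` and the example ratings are hypothetical, and the analysis reported later compares the subscales individually rather than this aggregate.

```python
from statistics import mean

SUBSCALES = (
    "mental demand",
    "physical demand",
    "temporal demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings, low=0, high=10):
    """Return the unweighted mean of the six subscale ratings (Raw TLX)."""
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    for name in SUBSCALES:
        if not low <= ratings[name] <= high:
            raise ValueError(f"{name} rating outside {low}..{high}")
    return mean(ratings[name] for name in SUBSCALES)


# Hypothetical ratings of one participant after one session.
example = {
    "mental demand": 2,
    "physical demand": 4,
    "temporal demand": 3,
    "performance": 1,
    "effort": 5,
    "frustration": 3,
}
overall = raw_tlx(example)
```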
Fig. 6. Visualization of the within-subjects study procedure. In VR, the participants had to perform the cockpit manipulation tasks with all three feedback modalities (acoustic, haptic, and acoustic+haptic). The presentation of the cockpit manipulation tasks in VR was counterbalanced.
Simulator Sickness.
The Simulator Sickness Questionnaire [23] was applied before and after each PFS and VRFS session. This standardized, subjective questionnaire measures 16 symptoms on a Likert scale ranging from not at all (0) to severe (3). These symptoms are general discomfort, fatigue, headache, eyestrain, difficulty focusing, increased salivation, sweating, nausea, difficulty concentrating, fullness of head, blurred vision, dizziness (eyes open), dizziness (eyes closed), vertigo, stomach awareness, and burping, which are assigned to the categories nausea, oculomotor, and disorientation. As some symptoms are associated with multiple categories, the categories are not mutually exclusive.
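The scoring of the questionnaire can be sketched as below. The item-to-category mapping and the weights (9.54, 7.58, 13.92, and 3.74) follow the standard SSQ scoring procedure and are not restated in the text above, so they are included here as an assumption; the function name `score_ssq` is hypothetical.

```python
# Standard SSQ mapping of symptoms to categories (assumed, not from the text):
# N = nausea, O = oculomotor, D = disorientation.
SSQ_ITEMS = {
    "general discomfort": ("N", "O"),
    "fatigue": ("O",),
    "headache": ("O",),
    "eyestrain": ("O",),
    "difficulty focusing": ("O", "D"),
    "increased salivation": ("N",),
    "sweating": ("N",),
    "nausea": ("N", "D"),
    "difficulty concentrating": ("N", "O"),
    "fullness of head": ("D",),
    "blurred vision": ("O", "D"),
    "dizziness (eyes open)": ("D",),
    "dizziness (eyes closed)": ("D",),
    "vertigo": ("D",),
    "stomach awareness": ("N",),
    "burping": ("N",),
}
CATEGORY_WEIGHTS = {"N": 9.54, "O": 7.58, "D": 13.92}  # standard weights (assumed)
TOTAL_WEIGHT = 3.74


def score_ssq(ratings):
    """Compute weighted SSQ category scores from 0..3 symptom ratings.

    Symptoms missing from `ratings` are treated as 0 (not at all).
    """
    raw = {"N": 0, "O": 0, "D": 0}
    for item, categories in SSQ_ITEMS.items():
        rating = ratings.get(item, 0)
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"rating for {item!r} must be 0, 1, 2, or 3")
        for category in categories:
            raw[category] += rating
    return {
        "nausea": raw["N"] * CATEGORY_WEIGHTS["N"],
        "oculomotor": raw["O"] * CATEGORY_WEIGHTS["O"],
        "disorientation": raw["D"] * CATEGORY_WEIGHTS["D"],
        # The total is based on the sum of the three raw category sums.
        "total": (raw["N"] + raw["O"] + raw["D"]) * TOTAL_WEIGHT,
    }


# Hypothetical participant reporting moderate blurred vision only.
scores = score_ssq({"blurred vision": 2})
```

Because blurred vision counts toward both the oculomotor and the disorientation category, a single symptom contributes twice to the total, which illustrates why the categories overlap.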

Procedure
Participants gave informed consent and filled out a demographic questionnaire and a pre-exposure SSQ. The study strictly complied with all relevant guidelines and legal regulations concerning COVID-19.
Before each session, the data glove was calibrated with a standalone application provided by the manufacturer. Furthermore, the positions of all cockpit elements were recorded by touching the relevant object in the physical cockpit as well as in VR. During an initial training phase supported by the experimenter, all participants could decide for themselves when they wanted to complete the learning phase and felt able to perform each task under test conditions. After performing all tasks, the participants filled out the NASA-TLX. Each test was concluded by filling out a post-exposure SSQ and, at the end of both sessions, answering the questions of the semi-structured interview. The interview gave participants the opportunity to informally share their experiences and comments. The following initial set of questions was used as a conversation starter:
• Which simulator did you prefer? How many points do you assign to PFS and VRFS when you have 10 points available in total?
• Can you tell us why you have distributed the points in this way?
• Is there anything else you would like to share?
On average, a complete test session took 10-15 minutes in the PFS and 15-20 minutes in the VRFS. A visual presentation of the study procedure is shown in Fig. 6.

RESULTS
This chapter contains detailed information about the results and the implications for future VRFS.

Movement Time
We expected similar movement times in the VRFS compared to the PFS, as the tasks, the distance between the starting point and target, and the scaling of the cockpit were identical. However, the average and median movement times in all VR conditions were longer than in the PFS (Fig. 7, Table 2). Looking at the data related to Attend, Master Caution, Landing Light, Flight Director, Cross Feed, and Course Selector, we found statistically significant differences between PFS and all three VR conditions (acoustic, haptic, and acoustic+haptic). These cockpit elements are situated in the upper or middle area of the aircraft cockpit and are, therefore, easy to reach.
Evaluating the remaining switches, we found statistically significant differences for Cargo Fire Test in the comparison of PFS with two of the VR conditions, and for Overheat Test between PFS and the haptic VR condition, but none for Weather Radar. The last three mentioned switches (highlighted in yellow in Fig. 7) are positioned in the lower area. They are more challenging to reach, as the right armrest blocks them and they are outside the pilots' FOV whenever looking straight ahead in the cockpit. These buttons are either of low priority during the flight (e.g., test buttons) or are hardly used during a regular flight (e.g., fire extinguisher). As expected, we did not observe any significant difference between the VR conditions because the trajectories towards the cockpit elements are not affected by the interaction feedback, which happens later.
A possible explanation for the increased movement time in all VR conditions is depth underestimation [10], resulting in a reduced movement speed in the target's proximity. In order to get a better understanding of the underlying causes, we performed a preliminary evaluation, which is described in detail in Chapter 6, as part of our outlook.
Result 1 - Movement Time: The average and median movement time in PFS is lower than in all VR conditions. This difference is statistically significant for all switches in the upper and middle area that are easy to reach and within the pilots' FOV whenever sitting in the left seat and looking straight ahead. The difference between PFS and VR is less significant for cockpit elements in the lower area, as they are blocked by the right armrest, resulting in a detoured trajectory both in the real world and in the virtual cockpit.

Manipulation Time
As our participants were rather inexperienced VR users, we expected increased manipulation times for the VR conditions. However, we found no statistically significant differences (see Fig. 3 and Table 3) for the push buttons Attend, Master Caution, and Cargo Fire. For each of the rocker switches, we found a single statistically significant difference: for Landing Light between PFS and the haptic VR condition, for Flight Director between PFS and one VR condition, and for Overheat Test between PFS and one VR condition. Notably, there were statistically significant differences between PFS and all VR conditions for dial knobs, where the median and mean manipulation time were consistently higher in VR. Participants also mentioned in the interviews that dial knobs are more difficult to manipulate in VR (P1, P2, P8, P12) and that VR is not as precise as PFS (P4, P7).
Interestingly, we did not observe any difference within the VR conditions (acoustic, haptic, and acoustic+haptic). In our study, the low-cost acoustic feedback was as effective as rudimentary vibrotactile feedback when manipulating virtual cockpit elements. Furthermore, for Landing Light, Flight Director, and Overheat Test, the average and median manipulation time in PFS are higher than in the VR conditions. A possible explanation is that the physical switches are spring-loaded and need considerable force to be moved. This was not simulated in the VRFS.
Result 2 - Manipulation Time: The manipulation time of push buttons and rocker switches is quite similar in PFS and in VR. However, the most significant difference between the real world and the virtual cockpit is with dial knobs. Interestingly, we did not observe any difference within the VR conditions (acoustic, haptic, and acoustic+haptic). In addition, the rudimentary vibrotactile feedback did not have a significant positive influence on the manipulation time.

Error Rate
We did not observe any switch position error, neither in PFS nor in the VR conditions. Furthermore, the number of incorrect switch errors was low among all conditions. The details about mean, median, standard deviation, and the results of the non-parametric Friedman test for the not normally distributed data can be found in Table 4. None of the comparisons were statistically significant. This result shows that the participants were able to select and interact correctly with the cockpit elements in all test conditions (PFS and VRFS).
Result 3 - Error Rate: Participants successfully performed basic cockpit manipulation tasks in VR without any switch position error and without significant differences in incorrect switch error rates. Overall, there were no statistically significant differences in error rates between PFS and the three VR conditions (acoustic, haptic, and acoustic+haptic).
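For readers unfamiliar with the Friedman test, the following sketch computes its test statistic with the usual tie correction, which matters for error counts because many participants have identical values across conditions. The function names and the example data are hypothetical; the p-value would be obtained by comparing the statistic against a chi-square distribution with k - 1 degrees of freedom, which is left out here to keep the sketch dependency-free.

```python
def average_ranks(values):
    """Rank values in ascending order, giving tied values their mean rank.

    Returns the ranks (in input order) and the sizes of the tie groups.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def friedman_statistic(rows):
    """Return (statistic, degrees of freedom) for repeated-measures data.

    rows: one list per participant with one value per condition.
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    tie_term = 0
    for row in rows:
        ranks, tie_sizes = average_ranks(row)
        for col, rank in enumerate(ranks):
            rank_sums[col] += rank
        tie_term += sum(t ** 3 - t for t in tie_sizes)
    q = 12 * sum(r * r for r in rank_sums) / (n * k * (k + 1)) - 3 * n * (k + 1)
    correction = 1 - tie_term / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("all values are tied within every participant")
    return q / correction, k - 1


# Hypothetical error counts: rows are participants, columns are conditions.
statistic, dof = friedman_statistic([[0, 0, 1], [0, 0, 1]])
```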

Perceived Workload
Raw TLX was used for measuring task load during the user study (see Fig. 9). The data was not normally distributed, and we applied a non-parametric test accordingly. Looking at pairwise comparisons (Wilcoxon signed-rank test), we saw that the differences between PFS and VRFS were statistically significant only for the subscale frustration (p < 0.001), but not for the subscales mental demand (p = 0.055), physical demand (p = 0.063), temporal demand (p = 0.718), performance (p = 0.102), and effort (p = 0.096).
We believe that the inaccuracy of the Manus data glove, in combination with the problematic manipulation of dial knobs led to an increased frustration among the participants, as e.g., P3 stated that he was "disappointed whenever I had to manipulate a dial knob in VR".However, the overall feedback towards the VRFS was quite positive, as di erent participants stated that they had fun in VR (P13, P14, P15, P20, P24) and that they see great potential in VR for the future pilot training (P3, P24).Result 4 -Workload: We did not observe statistically signi cant di erences within the Raw TLXvalues, except for the sub-scale frustration, which was mainly caused by problematic interaction with dial knobs.

SSQ and User Ranking
The SSQ questionnaire [22] was filled out before and after the PFS session and the VRFS session, i.e., four times by each participant. According to the SSQ literature, a total SSQ score between 5 and 10 is associated with "minimal symptoms", 10 to 15 with "significant symptoms", 15 to 20 with "symptoms are a concern", and values above 20 with a "problem simulator". Looking at the descriptive statistics, the total SSQ score after exposure describes the PFS as a simulator with "minimal symptoms" ( = 3.74, = 6.39, = 7.02) and the VRFS as a simulator with "significant symptoms", yet very close to the lower boundary ( = 4.71, = 10.21, = 11.58). To understand the influence of PFS and VRFS on the participants, we calculated the difference between the pre- and post-exposure scores. Applying the Wilcoxon signed-rank test for the pairwise comparisons (see Fig. 10), we did not find a significant SSQ difference in any of the four SSQ categories nausea (p = 0.632), oculomotor (p = 0.712), disorientation (p = 0.336), and total score (p = 0.962). This result shows a slightly higher SSQ total score for the VRFS (mainly caused by P3, P6, and P18) but no significantly increased simulator sickness in VR.
Looking at the distribution of the user-reported scores, the majority of the participants preferred PFS over VRFS (see Fig. 11). This difference was statistically significant (p < 0.001). Interestingly, three of the 24 participants favored virtual reality over the physical cockpit. In addition, the semi-structured interview revealed positive feedback on the VRFS. Some participants stated that the virtual cockpit is already close to the real cockpit (P18, P19, P22, P23) and that they can imagine using the VRFS as a training device during pilot training (P3, P12, P21, P23).
Result 5 - Simulator Sickness: The participants reported SSQ-related "minimal symptoms" for PFS and "significant symptoms" for VRFS. The increased SSQ results can be mostly attributed to three of the 24 participants, who reported strong symptoms in the oculomotor or disorientation category but not in nausea, primarily stating blurred vision and eye strain as reasons.
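The interpretation bands quoted in this section can be written down as a small lookup. This is a sketch, not part of the study tooling: the function names are hypothetical, the quoted bands overlap at 10, 15, and 20, so assigning 10 and 15 to the upper band and 20 to the lower one is an assumption, and the text names no band for scores below 5.

```python
def ssq_category(total_score):
    """Map a total SSQ score to the interpretation bands quoted in the text."""
    if total_score < 5:
        # Assumption: the text does not name a band below 5.
        return "below 'minimal symptoms'"
    if total_score < 10:
        return "minimal symptoms"
    if total_score < 15:
        return "significant symptoms"
    if total_score <= 20:
        return "symptoms are a concern"
    return "problem simulator"


def ssq_change(pre_exposure, post_exposure):
    """Pre/post difference used to isolate the effect of one session."""
    return post_exposure - pre_exposure


# Post-exposure scores reported in the text: 6.39 for PFS, 10.21 for VRFS.
print(ssq_category(6.39))
print(ssq_category(10.21))
```

Classifying the post-exposure score alone ignores symptoms a participant already had before the session, which is why the pairwise tests in this section operate on the pre/post difference.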

DISCUSSION
In our study, we compared different interaction types (push buttons, rocker switches, and dial knobs) inside the commercial aircraft cockpit of a Boeing 737-800NG in VR, using different feedback methods such as visual, auditory, and/or haptic feedback. With our work, we intend to find out whether current commercial, off-the-shelf VR technologies can be employed for a cost-effective simulation of interactive spaces containing many physical controls while minimizing the previously reported adverse effects. Our main results indicate advantages but also shortcomings, which are discussed in this chapter.

Equivalence of VRFS and PFS
We did not observe significant differences in error rates (no recorded switch position errors and a low rate of incorrect switch errors) between the three VR feedback methods (acoustic, vibrotactile, and combined) and PFS (Table 4). This is a clear indicator for the training success of the VR cockpit representation, as the buildup of muscle memory (one of five major attributes required for a safe flight [16]) is supported. Furthermore, there were no statistically significant differences between VRFS and the physical cockpit concerning the Raw TLX sub-scales mental demand, physical demand, temporal demand, performance, and effort (Fig. 9), which emphasizes the high level of equivalence between VRFS and PFS.

Control-Specific Differences
Our results do not indicate significant differences in manipulation time for push buttons and rocker switches between PFS and all VRFS conditions (Fig. 8, Table 3). However, the Raw TLX sub-scale for frustration (Fig. 9) shows a statistically significant difference between PFS and VRFS, mainly caused by the problematic interaction with dial knobs. These differences are confirmed by the participants' comments in the semi-structured interview, which named the interaction with dial knobs as the main shortcoming of the presented VRFS. The data glove left the participants' fingertips uncovered, so the interaction with the cockpit elements remained unimpaired even in the physical environment, which represents the "gold standard".

Irrelevance of Vibrotactile Feedback
The rudimentary vibrotactile feedback provided by the data glove did not significantly influence the manipulation time compared to the visual and acoustic feedback (Fig. 8, Table 3). We are convinced that the cheaper acoustic feedback is sufficient for the basic aircraft cockpit manipulation tasks and that the vibrotactile feedback used does not improve the training result. More complex technologies based on, e.g., ultrasound [15] or force feedback [17] might provide more suitable support in the future.

Slower Movements in VR
To our surprise, the participants generally moved their hands more slowly towards the VR cockpit elements: the average and median movement times in PFS were significantly lower than in all VR conditions (Fig. 7, Table 2). A possible explanation for the slower movement might be the participants' limited experience with data gloves. To reveal the underlying reasons for the increased movement time, we performed a deeper analysis of the recorded data, which is introduced in Chapter 6 (Limitations and Outlook).

Preference for PFS and Simulator Sickness
Most participants subjectively preferred PFS over VRFS (Fig. 11) as a result of the semi-structured interview. After exposure, the mean score of the SSQ remained in the "minimal symptoms" category for PFS but moved into the "significant symptoms" category for VRFS. Still, the increased SSQ results are mainly attributed to three of the 24 participants, who were the only ones reporting strong symptoms. These were in the oculomotor or disorientation category (but not in nausea), with blurred vision and eye strain primarily stated as reasons.

Implications for VRFS Research and Practice
We conclude that commercial, off-the-shelf VR technologies can be used for cost-effective simulations of safety-critical interactive spaces, even when they contain many physical controls. Furthermore, the absence of physical switches did not impair participants' task completion and correctness. However, although roughly equivalent performance and error rates can be expected, better interaction designs are necessary to improve the manipulation of more complex controls in VR, e.g., simulated dial knobs. Simply adding the rudimentary vibrotactile feedback of a commercial data glove will most likely not result in relevant improvements. Also, since hand movements are generally slower in VR, increased movement time and slower task completion in VR have to be expected, even if the problems of dial knob manipulation are reduced in the future.
We are convinced that our results can be extended to other safety-critical interactive and virtual spaces, such as railway [33] and car simulators [11,28], as the investigated interaction types and feedback methods are not limited to a commercial airplane cockpit. We acknowledge that future complementary VR-based training has to focus on solving the problematic interaction with dial knobs because the resulting frustration negatively influences the learning outcome. Our study emphasizes the huge potential of VR as a complementary practice device within conventional training, even beyond aviation-related basic aircraft cockpit manipulation tasks. One indisputable advantage of using VR within a training environment is the support of muscle memory that can be transferred directly to the real-world environment [16].

LIMITATIONS AND OUTLOOK
In our study, we had to deal with several limitations. First, as shown in Fig. 2F and G, the data gloves used provide a high but not fully precise representation of the physical hand in VR. Upcoming data gloves might provide higher precision and, therefore, more realistic finger tracking that can positively affect the interaction, especially with dial knobs.
Second, in many cases the participants performed the tasks only once in every test condition. We did not evaluate the influence of repetitive tasks and the resulting learning effect on movement time and manipulation time, especially in the virtual environment.
The analysis of the movement time in Chapter 4.1 indicates a significant difference between PFS and all VRFS conditions. To better understand the increased movement time in VR, we plotted the average hand trajectory of all 24 participants (see Fig. 12). The x-axis of this chart describes the remaining time to the particular target on the normalized scale [1;0], where 1 represents the beginning and 0 the end of the observed movement time. The y-axis represents the remaining distance to the target on the normalized scale [1;0], where 1 marks the beginning and 0 the end of the observed movement. In this chart, a fictitious straight line between the points (1,1) and (0,0) indicates a straight trajectory towards the target with constant speed.
Interestingly, the average trajectories in PFS are closer to this fictitious straight line than the trajectories in VRFS. A trajectory below this line describes a hand movement with reduced speed at the end of the movement, which is the case for the cockpit elements in the upper and middle cockpit areas. A trajectory above it indicates reduced speed at the beginning of the movement, which can be observed for the cockpit elements in the lower cockpit area.
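The normalization behind this chart can be sketched as follows. The sample format (a time stamp plus a 3D hand position), the function names, and the choice of the last sample as the target position are assumptions made for illustration; the study's logging format is not described at this level of detail.

```python
import math


def normalize_trajectory(samples):
    """Convert (time_s, (x, y, z)) samples into (remaining_time,
    remaining_distance) pairs, each normalized so that the first sample
    maps to 1 and the last sample (assumed to be the target) maps to 0.
    """
    start_time, start_pos = samples[0]
    end_time, target_pos = samples[-1]
    total_time = end_time - start_time
    total_dist = math.dist(start_pos, target_pos)
    if total_time <= 0 or total_dist == 0:
        raise ValueError("trajectory needs a positive duration and length")
    return [
        ((end_time - t) / total_time, math.dist(pos, target_pos) / total_dist)
        for t, pos in samples
    ]


def mean_offset_from_diagonal(curve):
    """Mean of (remaining_distance - remaining_time).

    Zero means constant speed; negative values lie below the diagonal
    (slower near the target); positive values lie above it (slower at
    the start).
    """
    return sum(dist - time for time, dist in curve) / len(curve)


# Hypothetical movement that covers most of the distance early and then
# slows down near the target (not recorded study data).
decelerating = [(0.0, (0.0, 0.0, 0.0)), (1.0, (1.5, 0.0, 0.0)), (2.0, (2.0, 0.0, 0.0))]
print(mean_offset_from_diagonal(normalize_trajectory(decelerating)))
```

Because the remaining distance is measured as the straight-line distance to the target, a hand that first moves away from the target (for example around an obstacle) produces values above 1.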
In a preliminary analysis, which can be extended in the future, we found that the movements in our PFS condition were performed quite straight towards the target with constant speed for all cockpit elements. The trajectories in the three VR conditions show a reduced speed at the end of the movement time, resulting in statistically significant differences for the tasks Attend, Landing Light, Cross Feed, Master Caution, Flight Director, and Course Selector, which are situated in the upper and middle cockpit areas. Our observed trajectories coincide with previous research [6], which reports sigmoid-shaped trajectories during aimed mid-air movements in VR. A possible explanation for the increased movement time can be found in previous work reporting depth underestimation in VR [38], which results in a decreased speed near the target.

CONCLUSION
We presented the results of a user study that compared a full-scale physical flight simulator of a Boeing 737-800NG with a cost-efficient virtual reality flight simulator for basic cockpit manipulation tasks, using a data glove with vibrotactile and acoustic feedback. Based on our findings, the similar manipulation times for push buttons and rocker switches, low error rates, moderate SSQ values, and similar NASA-TLX values show the potential for VR to be used as a safety-critical interactive space. However, the increased movement times and the significantly higher manipulation times of dial knobs led to significantly increased frustration among the participants, indicating the potential for further development.

Fig. 2. Details of our study: (A) tangible marker at starting position on right armrest (red), (B) participant puts right index finger at starting position, (C) no fingertips were covered by the gloves to enable unimpaired physical interaction with real cockpit switches in PFS, (D) and (E) visual feedback when touching a control in VR. (F) and (G) slight mismatch between physical hand and hand representation in VR when touching the finger tips of the index finger and the thumb.

Fig. 3. The VRFS and PFS were based on Unity as the leading software component. Unity was used for visual representation in VR, recording, visualization of hand and finger tracking, playing audio files, communication to the PFS via sockets, recording of switch positions, measurement of timing and distance, as well as logging.

• real-world, physical flight simulator feedback,
• acoustic feedback during the manipulation in VR,
• vibrotactile feedback for rudimentary haptic support during the manipulation of cockpit elements, and
• combined acoustic and vibrotactile feedback.

Fig. 4. Details about the relevant elements within the aircraft cockpit. Nine different cockpit elements, categorized into three different types (push button, rocker switch, dial knob) and three different areas (upper, middle, lower), were selected. Two VR base stations were used to triangulate and scale the virtual cockpit.

Fig. 5.The movement time describes the required time to move the virtual hand toward the target.The manipulation time contains the time required to move a specific cockpit element to the correct final position.

Fig. 7. Overview of movement time. We found statistically significant differences (Bonferroni corrected) between PFS and VRFS for Attend, Master Caution, Cargo Fire Test, Landing Light, Flight Director, Overheat Test, Cross Feed, and Course Selector. Cockpit elements in the lower area of the cockpit are highlighted in yellow, as the right armrest partly blocks the movement towards these switches.

Fig. 8. Evaluation of manipulation time. We found statistically significant differences for Landing Light between and ℎ , for Flight Director between and , and for Overheat Test between and . Furthermore, we observed statistically significant differences for all dial knobs between PFS and all VR conditions.

Fig. 9. The analysis of the Raw TLX values indicates statistically significant differences only for the sub-scale frustration.

Fig. 10. Box plot of the perceived change in simulator sickness due to the exposure to either PFS or VRFS. The change was not statistically significant for any SSQ dimension.

Fig. 11. Self-reported distribution of a maximum of 10 points that were assigned to either PFS or VRFS in response to the question "Which simulator did you prefer?"

Fig. 12. Preliminary analysis of movement time that shows the average trajectory of the participants during their movement towards the cockpit element. The x-axis of this chart describes the remaining time to the particular target on the normalized scale [1;0]. The y-axis represents the remaining distance to the target on the normalized scale [1;0]. The average trajectory in PFS is a rather straight line between the points (1,1) and (0,0), which indicates a straight trajectory towards the target with constant speed. The average trajectory in VR shows a decreased speed in the proximity of the target. As the cockpit elements in the lower area are partly blocked by the right armrest, the participants were forced to move their hand radially before they were able to proceed to the final point.

Table 2. Details regarding movement time, containing descriptive statistics, information about the Friedman test of the non-normally distributed data, and the results of the Wilcoxon signed-rank tests with Bonferroni-corrected p-values.

Table 3. Details regarding manipulation time, containing descriptive statistics, information about the Friedman test of the non-normally distributed data, and the results of the Wilcoxon signed-rank tests with Bonferroni-corrected p-values.

Table 4. Evaluation of the low number of incorrect switch errors, including mean, median, and standard deviation. We did not perform post-hoc pairwise comparisons, as the Friedman test did not indicate any significance.