Conflict Simulation for Shared Autonomy in Autonomous Driving

We present a tool for modeling conflict situations that enables simulation and testing of situation awareness in shared autonomy, in this case in an autonomous driving scenario. The flexibility of the tool allows the definition of new conflict situations, integration with various control and conflict detection systems, and customization of Takeover Request (TOR) signals and of different means of communication with the human operator. We start with one particular conflict situation, loss of lane markings, for which we demonstrate a simple conflict detection system. We conduct a preliminary user evaluation, which provides useful insights about the usability of the tool. The feedback from the participants indicates that TORs without an explanation feel uncomfortable and that explicit information about the conflict situation is necessary when a switch to manual control is required.


INTRODUCTION
In shared autonomy, automated systems and humans contribute to task execution, aiming for effective and seamless cooperation [15, 38]. Under specific circumstances, the automated system may encounter the limits set by its design, encompassing factors like environment, location, time, and traffic conditions, commonly termed the Operational Design Domain (ODD) [5]. In these cases, a TOR is initiated and the system switches to manual control. It is important that the system is able to detect a conflict situation [17] and to provide context awareness [14, 23, 27, 32, 34] that is sufficient for the human to understand the situation and react in a timely manner. In this report, we focus on Society of Automotive Engineers (SAE) Level 3 (semi-autonomous driving), where the autonomous vehicle operates on its own for most of the journey and the driver can focus on other tasks [5].
In order to study the interaction between the autonomous system and the human operator, we present a tool1 based on the Unity engine, built with the purpose of studying conflict detection and situation awareness, which enables the definition of different conflict types as well as testing the human reaction to the situation. The simulation environment outputs different kinds of sensor information, such as camera images and Light Detection and Ranging (LiDAR) point clouds, and is integrated with the Robot Operating System (ROS). Having a unified environment for conflict detection allows us to study different types of conflicts, different sensors and setups, different presentations of the conflicts, and different ways to detect them, resolve them, pass control, and communicate them to the human driver, in order to build modern Human Machine Interfaces (HMIs). We focus on the task of autonomous driving and, as a first proof of concept, choose one particular type of conflict: missing/vanishing lane markings.
We performed a preliminary user experiment on interaction with the tool. A small number of participants (n = 8) had to observe the car drive for a while and take over control when requested. They were asked questions about the usability of the system and the signals for the TOR. The feedback from the user experiment indicates that initiating a TOR without any explanation of the situation is uncomfortable for the user.
STATE-OF-THE-ART

Conflicts in autonomous driving

Conflict classification. The most general classification of conflicts can be seen as a binary classification within the problem of reaching the limit of the ODD: first, the system reaches its technical limits, so it does not have the capabilities to find a solution for a given problem or has to break (traffic) rules. Second, the system's technical capabilities are limited due to failure, e.g. sensor malfunctions. Many empirical studies have been conducted to analyze single types of conflicts within this binary split. A main concern in the class of system limits is lane markings (blurred/missing lane markings, secondary lane markings because of work zones, road curvatures) [28, 29, 46]. Other studies concentrated on traffic dynamics and possible interactions with other participants, e.g. pedestrians, obstacles on the road, cut-in vehicles [9, 19, 30, 35], or environmental factors [24, 39].
A framework to build a common ground for these more or less singular conflicts in autonomous driving is proposed by Gold et al. [10], where conflicts are classified according to their urgency (time budget to solve the conflict), predictability (dependencies on other factors), criticality (safety risks), and driver response (complexity of the solution). In this framework, research focusing on HMIs combined with maximizing driver performance corresponds to scenarios 2, 9, and 10 in Table 1. In our study, we base our general conflict choice on high values of urgency and especially of driver response, because this indicates a possible application of advanced HMIs to simplify the conflict representation. Therefore, we added scenarios 5 and 7 to this set. The chosen conflict situations are shown in Table 1 (without values of criticality and predictability).

Table 1: Chosen conflicts according to urgency and driver response from Gold et al. [10], of which conflict 10 is the conflict simulated in this paper.

Human-car interaction.
In SAE Level 3, the automated driving system allows drivers to focus on so-called NDRTs for the majority of the journey [3]. However, when the system faces critical situations or potential conflicts, it needs to yield back control to the driver via a so-called TOR [16, 43, 44]. Several factors influence the performance of the takeover, as shown in [1, 11, 13, 21, 26]. Next to the driving skills and the complexity of the traffic situation, the type of NDRT has an impact on the overall performance [20, 33]. In [8, 45], the appeal to different senses (auditory, tactile, or multimodal) in the way the TOR is delivered is analyzed. Especially for tactile/haptic interfaces, some recent studies should be mentioned [7, 25, 40], where additional forces or utilities support the awareness of the driver. The way auditory feedback can be delivered through a TOR differs between singular non-speech sounds, e.g. bells [2, 18], and spoken instructions [22, 31, 37], also with possibilities to guide the driver's attention: in [3], an advanced TOR is proposed, the so-called Attention-guiding Takeover Request (AGTOR), to improve takeover performance with directional sounds.

CONCEPTUAL WORK

Virtualization
Car physics is based on the open-source package Realistic Car Kit4 (v2.3). We adjusted the car parameters to build a general car setup and linked a control interface to the package. The control mode can be switched between manual and automatic control intentionally or based on the conflict detection described in Section 3.2.
To draw the user's attention away from the streets (and to simulate some form of NDRT), we placed a display in the interior of the car looping a counting-sheep stream (see Figure 1). For the initialization of the TOR, we implemented both visual and auditory feedback. During the TOR, a computer-generated voice can describe the switch to manual control (implemented as an AudioSource game object), while the auxiliary equipment can emit certain lights (steady red for automatic, blinking yellow for TOR, steady green for manual), as depicted in Figure 1. These features for TORs are optional and can be used individually or combined.
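The mode switching and TOR signalling described above can be sketched as a small state machine. The class and method names below are hypothetical illustrations, not the identifiers used in our Unity implementation:

```python
from enum import Enum


class Mode(Enum):
    """Control modes, each mapped to the auxiliary light signal it emits."""
    AUTOMATIC = "steady red"
    TOR = "blinking yellow"
    MANUAL = "steady green"


class TakeoverController:
    """Minimal sketch of the mode-switching and TOR-signalling logic."""

    def __init__(self):
        self.mode = Mode.AUTOMATIC

    def light_signal(self):
        # Light shown by the auxiliary equipment for the current mode.
        return self.mode.value

    def request_takeover(self):
        # A detected conflict triggers the TOR: blinking yellow light plus
        # the spoken announcement (played via the AudioSource game object).
        if self.mode is Mode.AUTOMATIC:
            self.mode = Mode.TOR
            return "Switching to manual control"
        return None

    def driver_took_over(self):
        # Driver input during an active TOR completes the handover.
        if self.mode is Mode.TOR:
            self.mode = Mode.MANUAL
```

Both feedback channels read the same mode, so the visual and auditory signals cannot disagree about the current state.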

Sensors & control.
An image captured by a camera attached to the car is published to the network as a CompressedImage.Msg5 (compressed as Portable Network Graphics (PNG)). To avoid delays, the update rate is variable. For further analysis, we also implemented depth sensing in the form of a LiDAR. The LiDAR is attached to the car as well and is simulated by an emitter and a rotator. Rays are sent into the virtual scene from the LiDAR's origin (see Figure 3, where the rays are visualized as red debug lines). When an intersection point with the environment is calculated (all 3D game objects have mesh colliders), the point is added to a point cloud. A Gaussian distribution is optional to create entropy; we plan to add further error models like in [36] to include environmental influences. The point cloud is published to the network as a PointCloud2.Msg6. The control inputs for the car are transferred as an AckermannDrive.Msg7 with a focus on steering and acceleration.
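The simulated LiDAR's behavior, one ray per rotation step with optional Gaussian range noise, can be sketched as follows. Here `ranges_fn` is a hypothetical stand-in for Unity's raycast against the scene's mesh colliders, and the 2D sensor frame is a simplification of the 3D case:

```python
import math
import random


def simulate_lidar_scan(ranges_fn, n_rays=360, noise_sigma=0.0, seed=0):
    """Sketch of the rotating emitter: cast one ray per angle step.

    ranges_fn(angle) returns the hit distance for that ray, or None when
    the ray leaves the scene without an intersection.
    """
    rng = random.Random(seed)
    points = []
    for i in range(n_rays):
        angle = 2.0 * math.pi * i / n_rays
        r = ranges_fn(angle)
        if r is None:
            continue  # no mesh collider was hit; nothing added to the cloud
        if noise_sigma > 0.0:
            # Optional Gaussian perturbation of the measured range,
            # the "entropy" mentioned above.
            r += rng.gauss(0.0, noise_sigma)
        # Convert the (angle, range) hit to a point in the sensor frame.
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points
```

The resulting list of points corresponds to the payload that would be serialized into the PointCloud2.Msg for publication.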

Conflict detection
In this report, we focus on one of the conflict situations described in Section 2.1.1: disappearing lane markings (No. 10 in Table 1). In our scenario, the car is driving and at some moment the lane markings disappear. We expect the system to detect the vanishing lane markings and to send a takeover request to the human operator, as in [4, 42]. Figure 5 shows the control loop of the simulation. Following the setup from Theers et al. [41], we use a simple lane detection model to extract the detected lanes from the camera image in Unity. We use color to mark the probability of each pixel belonging to a predicted lane, as shown in Figure 4. A trajectory that follows the lanes is given as input to a Proportional-Integral-Derivative (PID) controller, which outputs the steering angle, speed, and brake. These values are then passed to the simulator as an AckermannDrive.Msg (see Section 3.1.3) to control the car.
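As an illustration of the last step, a minimal PID controller for the steering output might look as follows. The gains and the error signal (e.g. the lateral offset from the detected lane center) are illustrative assumptions, not our tuned values:

```python
class PID:
    """Minimal discrete PID sketch for one control channel (e.g. steering)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def step(self, error):
        # Accumulate the integral term over the fixed time step.
        self.integral += error * self.dt
        # Finite-difference derivative; zero on the first call.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        # The output would be packed into the AckermannDrive message.
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In the control loop, `error` would be recomputed from the predicted lane trajectory at every camera frame, and separate controllers (or channels) would produce steering, speed, and brake commands.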

USER EVALUATION

Setup
We performed an initial user evaluation of the simulation with a small group of 8 participants (Figure 6). Six participants used a steering wheel and the other two the keyboard for controlling the vehicle. At the beginning of the session, the participants had to drive the car in the simulated environment for a few minutes in order to get familiar with the control of the car in manual mode. After that, the simulation was restarted in automatic mode, where the car followed a path. The participants were instructed to observe the car and take over control when requested. As a distraction from observing the road, they had to count the sheep shown on the interior display while the car was driving and to report the number of counted sheep. After a randomly selected time between 30 and 120 seconds, the TOR was initiated. It included a visual (blinking yellow light, see Section 3.1.2) and an auditory transition signal ("Switching to manual control"). The automated driving mode was restarted three times. After the experiment, the participants were asked to fill in a feedback form with questions about the usability of the system, the clarity of the TOR, and general comments about the simulation. We also asked them about their experience with computer games, Augmented Reality (AR)/Virtual Reality (VR) systems, car simulators, and autonomous (features of) cars.

Analysis of the results
The participants provided useful feedback about their experience with the simulation. Their responses fell into several main directions:

Control of the car. Most of the participants using the steering wheel reported that it was not easy to use: the wheel was too responsive, it was hard to follow the lane, and the gas pedal reversed when released. Half of the participants reported that they did not feel in control of the car when driving it manually. The participants who controlled the car with the keyboard also found the controls too responsive.
Observing the car running in autonomous mode. Most participants found observing the car drive boring, and in the cases where this took longer, they needed a few extra seconds of reaction time to figure out that they had to take back control. An interesting initial observation from the answers is that the participants who reported previous experience with autonomous cars found observing the autonomous mode uncomfortable, and vice versa.
NDRT. Several participants commented that the provided distraction was too easy to game: the sheep appear at regular intervals, so they could count a sheep every second without looking at the display and focus on the road and the autonomous car instead.
TOR. The user evaluation did not include information about uncertainty in the lane detection and focused only on the usability of the simulation and the efficiency of the currently implemented TOR signals.
• Reason for takeover: All participants were confused about the reason for the takeover. They did not understand why the system was requesting the switch to manual control.
• Visual signal: Most participants found the provided visual signal for the TOR useful, but some did not notice it until the audio signal was introduced as well.
• Audio signal: All participants evaluated the audio signal ("Switching to manual control") as clear and useful.
General comments about the environment: Some participants suggested directions for improvement, such as the road being too narrow and cars coming in the opposite direction sometimes passing through our car.
The main finding of this initial evaluation is that it is necessary to provide an explicit explanation of the conflict when a TOR occurs. The audio signal was very clear; however, it came about a second after the visual signal, which not all participants noticed immediately, a delay that could be critical in a real situation. The control of the car in the simulation needs to be significantly improved for it to be usable for testing conflict detection and further ideas for providing explanations that support the user's situation awareness. A more complex distraction also needs to be introduced, and some improvements to the environment are necessary to bring the experience closer to a real-world one.

CONCLUSION & FUTURE WORK
In this report, we present a tool for the simulation of conflict situations in autonomous driving. The tool is based on the Unity engine and allows the creation of custom roads, the definition of conflict situations and TOR signals, and the representation of the conflicts. The integration with ROS allows plugging in various methods for controlling the car and for conflict detection, based on input from the different sensors available from the simulation. As a proof of concept, we focused on one scenario of vanishing lane markings and performed an initial user experiment to collect feedback about the usability of the tool and its currently implemented functionalities related to the TOR.
In the future, we plan to define more conflict situations and experiment with various ways of (re)presenting the conflict situation to the human operator. We will integrate more advanced methods for lane detection and automated driving. The feedback from the test users will allow us to improve the simulation by making the control smoother and improving the environment. We will also include and experiment with various ways of notifying about TORs and explaining the situation; for example, we can project the lane markings on the screen and show the probability of the detected lanes. We believe that this tool will enable future research in the field of conflict detection and situational awareness.

Figure 1 :
Figure 1: Car interior with colored auxiliary equipment as visual TOR feedback and counting sheep stream as Non-Driving Related Task (NDRT).
Unity & ROS. To create a simulator for studies on conflict handling in autonomous driving [12], we use the Unity game engine (2021.3 LTS) in combination with ROS Noetic2. We chose Unity over other simulators, e.g. [6], because of its enhanced capabilities in mixed reality (for later research steps) and its customization features. The connection between Unity and ROS is established via a Transmission Control Protocol (TCP) connection. This way, Unity works purely as a data generator and can produce data and receive control inputs the same way a real autonomous car would. The synergy between Unity and ROS is depicted in Figure 2.
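As an illustration of this data exchange, serialized messages can be framed with length prefixes and sent over a single TCP stream. The sketch below shows the general idea only; it is a hypothetical format, not the actual wire protocol of the Unity-ROS bridge:

```python
import struct


def frame_message(topic: str, payload: bytes) -> bytes:
    """Frame one serialized message (e.g. a compressed PNG image) for the
    TCP stream: little-endian length-prefixed topic name, then payload."""
    topic_bytes = topic.encode("utf-8")
    return (struct.pack("<I", len(topic_bytes)) + topic_bytes +
            struct.pack("<I", len(payload)) + payload)


def parse_message(buf: bytes):
    """Inverse of frame_message: recover (topic, payload, remaining bytes),
    so consecutive messages can be peeled off the stream one by one."""
    tlen = struct.unpack_from("<I", buf, 0)[0]
    topic = buf[4:4 + tlen].decode("utf-8")
    plen = struct.unpack_from("<I", buf, 4 + tlen)[0]
    start = 8 + tlen
    return topic, buf[start:start + plen], buf[start + plen:]
```

With such framing, the same socket can interleave sensor data flowing out of Unity with control messages flowing back in, which is what lets the simulator stand in for a real autonomous car.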

Figure 2 :
Figure 2: Link of the Unity application and ROS with ROS nodes (green), Unity simulation (purple), virtual assets and sensing (orange) and car parameters (blue).

Virtual world & car.
The environmental assets comprise traffic signs, buildings, and smaller objects like bus stations. The streets are spline-based and generated with the open-source package Road-Architect3, so the road network is highly adaptable, features different road types, intersections, and bridges, and could be changed to any type of structure that might be relevant for the conflict creation. In this paper, we use a closed loop with limited routing possibilities (4 intersections) with a focus on vanishing lane markings (implemented through different textures on the roads). A snippet of the 3D environment can be seen in Figure 3.

Figure 3 :
Figure 3: Virtualization of LiDAR in the 3D environment.

Figure 4 :
Figure 4: Predicted lane markings projected on the camera image given as model input.

Figure 5 :
Figure 5: Control loop of our simulation. The control module takes a camera image as input, detects lanes, and outputs control back to the simulation. Information about the predicted lanes and the model confidence is displayed to a human operator, who can take over.

Figure 6 :
Figure 6: Photo of a participant in the study driving in the simulation with a steering wheel.