PedSUMO: Simulacra of Automated Vehicle-Pedestrian Interaction Using SUMO To Study Large-Scale Effects

As automated vehicles become more widespread but lack a driver to communicate in uncertain situations, external communication, for example, via LEDs or displays, is evaluated. However, the concepts are mostly evaluated in simple scenarios, such as one person trying to cross in front of one automated vehicle. The traditional empirical approach fails to study the large-scale effects of these in this not-yet-real scenario. Therefore, we built PedSUMO, an enhancement to SUMO for the simulacra of automated vehicles' effects on public traffic, specifically how pedestrian attributes affect their respect for automated vehicle priority at unprioritized crossings. We explain the algorithms used and the derived parameters relevant to the crossing. We open-source our code under https://github.com/M-Colley/pedsumo and demonstrate an initial data collection and analysis of Ingolstadt, Germany.


BACKGROUND AND SUMMARY
Automated driving is a growing field of research [25], with fully Automated Vehicles (AVs) being part of current discussions and research [44]. AVs could provide numerous advantages, such as improving traffic flow [34]. However, these advantages are currently only theoretical. The consequences of introducing AVs in greater numbers into public traffic can only be estimated, as conducting large-scale studies in public is impossible when the safety of AVs is not clear yet [59,61]. Also, fear of AVs is still significant in the population [27,48]. Additionally, measuring the impact of many AVs on public traffic in many different locations might be unrealistic or expensive. Thus, creating virtual scenarios to simulate how AVs impact public traffic is more feasible.
This project examines the macroscopic effects of AVs in traffic and how the respect of pedestrians towards AVs' priority at crossings leads to different or fluctuating traffic flows. Currently, numerous research studies are concerned with whether AVs will have to be able to communicate with vulnerable road users such as pedestrians or cyclists [31]. When AVs are regularly stopped due to pedestrian behavior, this can ripple through traffic, slowing down the overall flow. The effect is stronger with an increasing number of AVs with an external Human-Machine Interface (eHMI), as an eHMI serves as a communication channel between humans and vehicles, contributing to a higher feeling of safety around AVs [4,54]. The following provides background information about human behavior modeling, factors on crossing decisions, and eHMIs.

Attributes Influencing Street Crossing
Several attributes contribute to pedestrian street-crossing decisions, including other pedestrians' behavior, group size, social status, and experience with AVs [15,54]. Yagil [67] found that pedestrians are more likely to follow traffic laws when observing similar behavior from others. However, Lefkowitz et al. [43] demonstrated that this imitation is influenced by the appearance of the other pedestrian. In contrast, Dolphin et al. [24] argued that social status and gender do not significantly impact imitation, emphasizing the role of group size instead. In line with the importance of group size, Heimstra et al. [30] showed that children often cross streets in groups, which influences their risk-taking behavior [29,58,60,65]. Studying all these factors in an empirical study is nearly impossible; therefore, simulations are necessary.

Pedestrian Behavior Modeling
There exist several pedestrian simulation approaches. These can be divided into macroscopic and microscopic approaches [52]. Microscopic refers to simulations where each actor is simulated individually instead of, for example, aggregate flows. SUMO [22] represents one possibility to simulate mobility on the microscopic level. While "there are good models for optimal walking behavior, high-level psychological and social modeling of pedestrian behavior still remains an open research question that requires many conceptual issues to be clarified" [3, p. 1]. Camara et al. [3] showed that algorithms used age, gender, distraction, social group membership, cultural membership, and road safety adaptation to model pedestrian behavior. While most works use a deterministic approach, Völz et al. [64] showed a model that predicts the crossing decision at a crosswalk using support vector machines. Due to the unavailability of actual AVs on the streets equipped with eHMIs, such data-driven approaches are currently infeasible.
In partially related HCI domains, Savino et al. [57] evaluated bicyclist strategies to reach a given destination. Their work evaluates the efficacy of as-the-crow-flies (ATCF) navigation for cyclists, focusing on how different street network attributes impact the user experience. Using feature importance analysis across 1,633 cities, the paper identifies that an ideal environment for ATCF navigation has long streets, multiple turning options, few dead ends, and a grid-like structure. East Asian and North American cities are most suited for this navigation method, while Western Europe's street networks are least suited. For this, Savino et al. [57] simulated an agent using a modified depth-first search. Ikkala et al. [35] adopt a different method, biomechanically simulating a user's entire body. While this is a more accurate representation of a user in physical terms, it is not yet applicable to large-scale analyses.

PURPOSE
Using the microscopic traffic simulation tool SUMO [22], we vary pedestrian attributes that affect decision-making, making pedestrians more or less likely to respect AV priority at crossings. Microscopic traffic flow models focus on individual road user units, thus representing dynamic variables such as the position and velocity of each vehicle and pedestrian. PedSUMO seeks to measure macroscopic changes in traffic flow using different variables for pedestrian decision-making (e.g., gender of pedestrians, street width, vehicle size) with different percentages of AVs (with eHMI) in traffic.

CHARACTERISTICS
After cloning the repository, install the requirements listed in requirements.txt. If Large Language Models (LLMs) are to be used, the requirements in requirements_llm.txt must be installed. Beyond SUMO, the requirements are minimal but require recent versions for increased performance. If cities other than those provided are to be used, these must be downloaded and saved in the appropriate directory. We strongly encourage community input, either as comments, issues, or additional code in the GitHub repository.

CODE/SOFTWARE

Algorithms
The main idea of PedSUMO is to identify, in each step of the simulation, unprioritized crossings with pedestrians wanting to cross (see Figure 1). Additionally, the algorithm filters those for situations in which these pedestrians would not usually be able to cross due to an oncoming vehicle. If that oncoming vehicle is an AV, a chance for the waiting pedestrian to cross the road anyway and ignore the vehicle's right of way is calculated.
To increase performance during simulation time, a dictionary of all incoming lanes into each unprioritized crossing in the simulation is created when the scenario is selected. To achieve this, the successor of each lane in the network is evaluated. If the successor is an internal foe of an unprioritized crossing, the original lane is added to the set of lanes of the associated crossing.
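The lane lookup described above can be sketched as follows. This is only an illustration under stated assumptions, not PedSUMO's actual code: the network is assumed to be available as two plain dictionaries, and the names `build_incoming_lanes`, `successors`, and `internal_foes` are hypothetical. The real implementation would obtain this information from SUMO.

```python
from collections import defaultdict


def build_incoming_lanes(successors, internal_foes):
    """Map each unprioritized crossing to the set of lanes leading into it.

    successors:    lane id -> list of successor lane ids (assumed input format)
    internal_foes: internal lane id -> set of unprioritized crossing ids
                   for which that internal lane is a foe (assumed input format)
    """
    incoming = defaultdict(set)
    for lane, next_lanes in successors.items():
        for next_lane in next_lanes:
            # If the successor conflicts with a crossing, the original lane
            # is an incoming lane of that crossing.
            for crossing in internal_foes.get(next_lane, ()):
                incoming[crossing].add(lane)
    return dict(incoming)
```

Because the dictionary is built once when the scenario is selected, each simulation step only needs a lookup per crossing instead of a traversal of the network.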
After the incoming lanes dictionary is created, the main simulation loop starts. This simulation loop runs until the pre-configured last simulation step (default = 3600, or 1 h) is reached. At the start of each step, the terminated entities of the previous step are cleaned up, and newly added entities are adjusted. That includes assigning attributes such as age and gender to pedestrians and declaring vehicles as automated or manual. Afterward, every pedestrian's intent is evaluated. If a pedestrian intends to walk onto an unprioritized crossing as their next lane, this pedestrian is added to a list of waiting pedestrians for that crossing.
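Two of the per-step tasks named above, declaring new vehicles as automated or manual and collecting waiting pedestrians per crossing, could look roughly like this. The sketch assumes that vehicles are declared automated at random according to the configured AV prevalence, which the text does not state explicitly; all function and parameter names are hypothetical.

```python
import random
from collections import defaultdict


def assign_vehicle_type(av_density, rng):
    """Declare a newly added vehicle automated with probability av_density
    (assumption: random assignment according to the AV prevalence)."""
    return "automated" if rng.random() < av_density else "manual"


def collect_waiting_pedestrians(next_lane_by_pedestrian, unprioritized_crossings):
    """Group pedestrians by the unprioritized crossing they intend to enter next."""
    waiting = defaultdict(list)
    for pedestrian, next_lane in next_lane_by_pedestrian.items():
        if next_lane in unprioritized_crossings:
            waiting[next_lane].append(pedestrian)
    return dict(waiting)
```

Passing the random number generator explicitly (for example `random.Random(42)`) keeps such a sketch reproducible across runs.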
For each of these crossings, it is then determined whether the current situation is an av_crossing_scenario. That is the case whenever a pedestrian would not usually be able to cross the road due to an oncoming vehicle, but that vehicle is marked as an AV. In passing, the closest vehicle, its time to collision, and its distance to the crossing are calculated for future use.
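A minimal sketch of this check is given below. It makes a simplifying assumption that is not taken from the paper: a pedestrian "would not usually be able to cross" when the closest vehicle's time to collision falls below a threshold (`blocking_ttc_s`, a hypothetical parameter). In SUMO itself, this is decided by the junction's right-of-way logic. The `Vehicle` class and all function names are likewise illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class Vehicle:
    vehicle_id: str
    distance_to_crossing: float  # metres
    speed: float                 # metres per second
    is_automated: bool


def closest_vehicle(vehicles):
    """Return the vehicle nearest to the crossing, or None if there is none."""
    return min(vehicles, key=lambda v: v.distance_to_crossing, default=None)


def time_to_collision(vehicle):
    """Time until the vehicle reaches the crossing at its current speed."""
    if vehicle.speed <= 0:
        return math.inf  # a standing vehicle never reaches the crossing
    return vehicle.distance_to_crossing / vehicle.speed


def is_av_crossing_scenario(vehicles, blocking_ttc_s):
    """True if the closest vehicle blocks the pedestrian and is an AV."""
    vehicle = closest_vehicle(vehicles)
    if vehicle is None:
        return False
    blocked = time_to_collision(vehicle) < blocking_ttc_s
    return blocked and vehicle.is_automated
```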
If the situation is an av_crossing_scenario, the crossing probability is calculated. To avoid redundancy, all defiance factors specific to the crossing, such as the street_width_defiance_factor or the vehicle_size_defiance_factor, are calculated once per crossing. Then, for each pedestrian wanting to cross the evaluated crossing, their individual defiance factors, such as the waiting_time_defiance_factor, are calculated. The supplementary material lists all factors and their calculation.
The total crossing probability is then calculated by multiplying the base_automated_vehicle_defiance by each factor. The decision to cross is simulated by comparing this probability with a random number. If the pedestrian "decides" to cross, they are set to ignore all vehicles until they have completely crossed. Additionally, the danger of the situation is evaluated by calculating the minimal stopping distance of the closest incoming vehicle (in terms of time to collision) and comparing it with the vehicle's distance to the crossing. If the stopping distance is larger than the vehicle's distance to the crossing, the situation is deemed dangerous.
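The decision and danger check can be sketched as follows. Two details are assumptions rather than statements from the paper: the product is clamped to the interval [0, 1], and the minimal stopping distance uses the standard kinematic formula v² / (2a) with an assumed maximum deceleration. The function and parameter names are hypothetical.

```python
import math
import random


def crossing_probability(base_defiance, factors):
    """Multiply the base defiance by every factor.
    Clamping to [0, 1] is an assumption of this sketch."""
    probability = base_defiance * math.prod(factors)
    return min(max(probability, 0.0), 1.0)


def decides_to_cross(probability, rng):
    """Simulate the decision by comparing the probability with a random number."""
    return rng.random() < probability


def is_dangerous(speed_mps, distance_m, max_decel_mps2):
    """Dangerous if the vehicle cannot stop before reaching the crossing.
    Stopping distance v^2 / (2a) is an assumed, standard braking model."""
    stopping_distance = speed_mps ** 2 / (2.0 * max_decel_mps2)
    return stopping_distance > distance_m
```

For example, a base defiance of 0.2 with factors 1.5, 0.5, and 2.0 yields a crossing probability of about 0.3, and a vehicle travelling at 10 m/s that is 5 m from the crossing cannot stop in time with a deceleration of 4.5 m/s².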
Our implementation also allows the use of different LLMs provided by the HuggingFace transformers library [66] to identify potentially realistic behavior (see Park et al. [49]). For example, a prompt built from the scenario values could start with: "You are a pedestrian. You are standing at a street with some automated vehicles trying to decide whether you will cross it. You are distracted by your smartphone. There are no children in your vicinity. The approaching automated vehicle has an interface attached that communicates with you. You are not walking. The street is five meters wide. The vehicle has a front area of three square meter. [...]" After each crossing is evaluated, pedestrians who were altered in previous steps to ignore vehicles and have successfully crossed get their alterations reset, and the next simulation step can begin. The usage of LLMs depends on the size of the Video Random Access Memory (VRAM) available and the chosen model. We suggest using 12 GB of VRAM or more.
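A prompt like the one quoted above could be assembled from scenario values roughly as follows. The `Scenario` fields, the `build_prompt` function, and the wording of the negated sentences are hypothetical; only the affirmative sentences follow the example in the text. Passing the prompt to a model is deliberately left out of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    distracted_by_smartphone: bool
    children_nearby: bool
    vehicle_has_ehmi: bool
    street_width_m: float
    vehicle_front_area_m2: float


def build_prompt(scenario):
    """Turn scenario values into the opening of a natural-language prompt."""
    parts = [
        "You are a pedestrian.",
        "You are standing at a street with some automated vehicles "
        "trying to decide whether you will cross it.",
        # Negated variants below are illustrative wording, not from the paper.
        "You are distracted by your smartphone."
        if scenario.distracted_by_smartphone
        else "You are not distracted by your smartphone.",
        "There are children in your vicinity."
        if scenario.children_nearby
        else "There are no children in your vicinity.",
        "The approaching automated vehicle has an interface attached "
        "that communicates with you."
        if scenario.vehicle_has_ehmi
        else "The approaching automated vehicle has no interface attached.",
        f"The street is {scenario.street_width_m:g} meters wide.",
        f"The vehicle has a front area of "
        f"{scenario.vehicle_front_area_m2:g} square meters.",
    ]
    return " ".join(parts)
```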

Simulated Pedestrian Crossing Factors
The adjustable factors are diverse, and each has a different impact by default. The supplementary material describes each factor with the corresponding source for reference, as well as the relevant formulae determining the distribution of probabilities.

Measurements/Logging
In addition to SUMO's standard output (see [23]), we log extra parameters in a CSV file (see supplementary material).
Each crossing event has all factors listed that are explained in section 4.2, including defiance values and their impact during the crossing event. Additionally, the static percentage of AVs (with eHMI) in all vehicles in traffic and the following data are logged in this file for every crossing event. These can, as such, easily be used as independent variables.
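Logging one row per crossing event could be sketched with Python's standard csv module as shown below. The column names are hypothetical placeholders; the actual columns are those listed in the supplementary material.

```python
import csv
import io

# Hypothetical column names, chosen for illustration only.
FIELDNAMES = [
    "step",
    "crossing_id",
    "pedestrian_id",
    "av_density",
    "ehmi_density",
    "crossing_probability",
    "crossed",
    "dangerous",
]


def write_crossing_events(events, stream):
    """Write a header and one CSV row per crossing event to a text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(events)


# Usage: write a single event to an in-memory buffer.
buffer = io.StringIO()
write_crossing_events(
    [
        {
            "step": 120,
            "crossing_id": ":j1_c0",
            "pedestrian_id": "ped_7",
            "av_density": 0.4,
            "ehmi_density": 0.2,
            "crossing_probability": 0.3,
            "crossed": True,
            "dangerous": False,
        }
    ],
    buffer,
)
```

Storing the AV and eHMI prevalence in every row makes each event self-describing, so the values can be used directly as independent variables in a later analysis.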

USAGE NOTES
While SUMO generally allows the use of an OpenStreetMap (OSM) integration to simulate road networks, these often have to be fine-tuned due to errors. Therefore, we provide already curated scenarios in Ingolstadt, Wildau, Monaco, and Bologna. Additionally available for simulation are Ulm and Manhattan, which were generated and adapted using SUMO's OSMWebWizard.
While the implementation is based on the scientific literature, we highlight that the simulation cannot necessarily be seen as a true representation of the interaction between an AV and pedestrians. However, in line with Park et al. [49], the simulacra of human behavior with PedSUMO can generate insights that plausibly define future behavior. This is currently the most appropriate avenue to study the large-scale effects of eHMI and AVs on traffic flow.
AVs represent a specific manifestation of robots and are, therefore, directly relevant to the HRI community (e.g., see [2,40,41,51]). However, the current implementation can also serve as a basis for including simulated robots in communication with pedestrians. This is currently researched in the CHI and HRI communities [50].

EVALUATION
As we were interested in the large-scale effects of AVs and eHMIs on traffic, we simulated Ulm, Ingolstadt, Monaco, and Bologna (e.g., see Figure 2). Due to time constraints, we chose a step size of 0.2 for the prevalence of AVs, eHMIs, and the base defiance, resulting in 5 * 5 * 5 = 125 logs per city. A descriptive data report per city was generated via DataExplorer [18] and is available in the GitHub repository under data. Due to the data size (between 275 MB and 4.2 GB), we will make the data available upon request. All relevant tables for the analyses are also available in the repository. We provide an initial overview of results for Ingolstadt, Germany, due to its realistically modeled traffic (taken from [63]). Because of the large number of data entries, using R or Python was too time-consuming. Therefore, we provide a Julia script, which can be expanded. This reduced the runtime from hours to a few minutes. Due to our focus on providing the code, the analysis is not exhaustive.
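The 5 * 5 * 5 = 125 parameter combinations per city can be enumerated as in the sketch below. The text gives only the step size of 0.2 and the count of five values per parameter; that the values run from 0.2 to 1.0 is an assumption of this example, and the function name is hypothetical.

```python
from itertools import product


def parameter_grid(step=0.2, start=0.2, count=5):
    """Enumerate all (av_density, ehmi_density, base_defiance) combinations.

    The value range (start=0.2, five values up to 1.0) is an assumption;
    only the step size and the number of values come from the text.
    """
    # Rounding removes floating-point noise such as 0.6000000000000001.
    values = [round(start + i * step, 10) for i in range(count)]
    return list(product(values, repeat=3))
```

Each tuple in the returned list corresponds to one simulation run and therefore to one log.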

Heatmap of Interactions
First, we provide a heatmap of all interactions over all parameter combinations in Figure 3. This heatmap shows that interactions occurred over the entire city. Note that, due to limits in Julia's visualization, the city had to be inverted vertically. We fitted a linear mixed model to predict crossing probability with regard to AV density, eHMI density, and base AV defiance (see Figure 4). For a detailed description, see the repository. We fitted a linear model to find the correlation between AV density and collisions (see Figure 5). The linear model shows a downward trend of collisions with higher AV density.

DISCUSSION AND FUTURE WORK
In this work, we presented an implementation and preliminary data to study the effect of AVs and attached eHMIs in their interaction with pedestrians on a large scale. Our simulacra implementation relies on empirical data. However, scientific data can be scarce regarding certain factors, showing a potential flaw in how scientific results are reported: often, only differences are reported without quantifying them. Therefore, some numbers may be educated guesses rather than extracted from studies and statistics. Nonetheless, we argue it is the most appropriate way to study the large-scale effects. Additionally, we enable the usage of LLMs for deriving crossing decisions. Our first evaluations reported in Section 6 show that we can simulate crossings in various areas of the cities and that, for example, AV density seems negatively correlated with collisions (i.e., more AVs lead to reduced collisions).
Very recently, Tian et al. [62] provided a novel model for the interaction of pedestrians and AVs. However, they do not provide an implementation, severely reducing applicability. In the future, we aim to re-implement this model to compare it against ours. Furthermore, we envision including additional mobility concepts, such as micromobility, in the interaction simulation and implementing interaction between manual drivers and other vulnerable road users. Besides, our approach can be extended to investigate the macroscopic effects of novel in-vehicle user interfaces (see [37,38]) on traffic. Also, the extensive resulting datasets suggest that spatiotemporal automotive user interface analysis [36] could facilitate future simulation analysis.

Figure 2: Overview of (parts of) different cities. Partially taken from previous work.

Figure 3: Heatmap of interactions between pedestrians and AVs in Ingolstadt, Germany over all parameter combinations.

Figure 5: Collisions with regard to AV density.