Zespol: A Lightweight Environment for Training Swarming Agents

Agent-based modeling (ABM) and simulation have emerged as important tools for studying emergent behaviors, especially in the context of swarming algorithms for robotic systems. Despite significant research in this area, there is a lack of standardized simulation environments, which hinders the development and deployment of real-world robotic swarms. To address this issue, we present Zespol, a modular, Python-based simulation environment that enables the development and testing of multi-agent control algorithms. Zespol provides a flexible and extensible sandbox for initial research, with the potential for scaling to real-world applications. We provide a topological overview of the system and detailed descriptions of its plug-and-play elements. We demonstrate the fidelity of Zespol in simulated and real-world robotics by replicating existing work that highlights the simulation-to-real gap using the milling behavior. We plan to leverage Zespol's plug-and-play design for neuromorphic computing in swarming scenarios, using the modules in Zespol to simulate the behavior of neurons and their synaptic connections. This will enable optimizing and studying the emergent behavior of swarm systems in complex environments. Our goal is to gain a better understanding of the interplay between environmental factors and neural-like computations in swarming systems.


Introduction
Despite having limited individual capabilities, species such as bees [16] and ants [14] combine their abilities collectively to achieve impressive feats, such as honeybee house hunting [14] and weaver ant nest construction [14]. Researchers have drawn inspiration from this natural phenomenon and developed weak agents capable of working together to solve unique challenges, including clustering [9,34], classification [34], self-triggered coordination [26], and asynchronous cloud access [27]. However, several issues hinder the advancement of emergent behaviors in low-powered, disposable robotic swarms. The first issue is the lack of standardization in simulation, development, and evaluation, which makes results difficult to verify and extend [8,37]. The second issue involves domain adaptation problems when trying to recreate simulated emergent behaviors physically, which leads to significant performance reductions [15,36,37]. Finally, the increasing complexity of robotic control algorithms [19] presents a significant obstacle to simulating the multiple physical models required for swarming robotics. Although robotics research has often focused on developing individual capabilities [1], simulating swarming robotics requires the simulation of multiple physical models. However, current examples of distributed robotic simulation environments are often designed for larger-scale manufacturing and reinforcement learning problems [20], have complicated C and C++ interfaces [20], or are specialized for particular applications [32].
We developed Zespol, a Python-based simulation environment, to overcome the challenges of engineering swarms of robots with emergent behaviors. Zespol aims to facilitate research into the application of nature-inspired computing algorithms for swarming robotics. It provides a lightweight means of standardization for engineering robot swarms to arrive at collective behaviors before transitioning to real-world experiments or higher-fidelity simulations. Additionally, this framework establishes a direct connection to other simulation environments, minimizing domain adaptation penalties when switching from simulation to physical robotic systems. Zespol has native support for distributed parallelization and provides a modular, extensible, and well-documented Python-based interface compatible with neuromorphic computing platforms.

Background & Motivation
A variety of previous works have aimed at creating simulation environments for robotics and swarming applications. In [6], the Vectorized Multi-Agent Simulator (VMAS) is introduced as a framework designed for efficient multi-agent reinforcement learning applications. Its physics engine and underlying control logic are written in PyTorch. VMAS has shown strong performance when running parallel environments on GPUs. While this focus on CUDA-accelerated hardware is beneficial for GPU-compatible workloads, VMAS struggles when the simulation environment needs to expand to multiple heterogeneous compute nodes. A core assumption of the VMAS platform is holonomicity. We believe emergent behaviors can be decoupled from the sensing and control problem. Zespol does not force this trade-off of simulation fidelity for speed, as this would compromise the connection between simulation and a non-holonomic real robot.
Swarm-Sim, a 2D and 3D simulation core for swarm agents, is a framework for implementing and evaluating swarming agents [11]. The major limitations of this framework are the lack of direct support for cluster-level parallelization and the discrete grid coordinate system, which precludes emergent behaviors that depend on agents having a continuous state, such as milling with ground robots.
Introduced in [33], SwarmLab is a MATLAB-based [24] drone swarm simulator. The simulator is designed around a Drone class that supports quadcopter and fixed-wing aircraft dynamics. This makes the framework unsuitable for use with any non-drone agents without extensive modifications to the code. Interfacing MATLAB with modern learning and neuromorphic processing frameworks presents additional difficulties. There are also notable performance issues associated with this framework that make running large-scale simulations a computationally expensive task.
arXiv:2306.17744v1 [cs.RO] 30 Jun 2023
MASON [22] is an agent-based simulation library designed from the ground up to support custom Java-based [2] simulations. There are many similarities between MASON and Zespol, such as the inherent separation between the environment and visualization systems and the compartmentalized nature of individual simulations. Both MASON and Zespol allow agents to be given arbitrary dynamics. The major limitation of MASON is its reliance on advanced Java, which can raise the barrier to entry for new users. This issue is compounded by the lack of a mature, low-barrier system for distributing these simulations among heterogeneous computing systems. Addressing these issues is one of the major goals of Zespol.
OpenAI Gym, introduced in 2016, was a pioneering platform in the field of single-agent reinforcement learning [7]. Out of the box, it supports a wide variety of environments, including classic control problems, Box2D [10], and Atari [4]. Compared to Zespol, Gym has two major limitations: it is primarily designed for reinforcement learning, and its programmatic architecture is focused purely on single-agent simulations, which severely limits its applicability to multi-agent robotics [23,37].
NetLogo [38] is another multi-agent simulation environment. It is primarily designed to be used in educational settings, as evidenced by its integrated IDE with a drag-and-drop GUI. This makes programming behaviors easy, but the NetLogo language is limited. It is possible to run Python and R code from within NetLogo, as well as invoke a NetLogo simulation from a Java environment, but the interfaces are clunky and limited; thus NetLogo is largely incompatible with modern learning frameworks and with current means of distributing computation and simulation environments among heterogeneous computing systems. NetLogo's simulation speed is, at best, equal to that of MASON [22], and it struggles with environments of more than two dimensions.
The authors of [37] conducted an interactive simulation in the design loop, where simulated experiments were tightly coupled with real-world experiments. This study was broken up into four distinct portions to minimize the simulation-to-reality gap: 1) characterizing the salient capabilities of the real robot, 2) building a minimally viable simulation environment that characterizes the measured capabilities of physical robots, 3) developing and exploring potential emergent behaviors in simulation, and 4) deploying real robots based on simulation-driven hypotheses and evaluating the performance penalties associated with the domain shift. They used a binary controller [5] for the salient capabilities of real robots and created stable milling behaviors in NetLogo that were also reproduced on physical robots. Despite their ability to minimize the simulation-to-reality gap, we are interested in deploying low-power and scalable neuromorphic computing platforms and exploring novel methods of arriving at emergent behaviors. Zespol is designed as a simulation framework compatible with existing neuromorphic frameworks [29][30][31] and hardware [25].
Some of the key differences between prior works and Zespol are summarized in Table 1. Of the simulators compared, Zespol is the only one that is written in user-friendly and well-documented Python code, provides native capability for distributed (dist) simulation environments, and allows for arbitrary agent states and dynamics.
Zespol objects are data structures responsible for containing all elements required for the object to function. For example, a robot object would contain the robot's current location, all of its sensor objects, and its controller objects, and would control the interaction between these elements at every simulation time step.
The "Agent" object base class should be extended to support the specific requirements of a user's application. The base class defines position and orientation vectors along three dimensions, a unique identifier, and the simulation time step fidelity. However, the tick method must be overridden based on user requirements to control the interactions between sensors, controllers, and physical dynamics.
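As a rough illustration of the extension pattern described above, the following minimal sketch defines a stand-in base class and a toy subclass. The attribute names (`position`, `orientation`, `dt`, `uid`), the `tick` signature, and the `DriftingAgent` class are assumptions made for this example and are not taken from the Zespol source.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from itertools import count

_ids = count()  # hypothetical source of unique identifiers


@dataclass
class Agent(ABC):
    """Illustrative stand-in for an agent base class (names are assumptions)."""
    position: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    orientation: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    dt: float = 1.0 / 30.0  # simulation time step in seconds
    uid: int = field(default_factory=lambda: next(_ids))

    @abstractmethod
    def tick(self, world_state):
        """Advance the agent by one time step; subclasses must override this."""


class DriftingAgent(Agent):
    """Toy subclass: moves along x at a constant speed and ignores the world."""
    speed = 0.5  # meters per second (arbitrary example value)

    def tick(self, world_state):
        self.position[0] += self.speed * self.dt


agent = DriftingAgent()
for _ in range(30):  # one simulated second at the default time step
    agent.tick(world_state=None)
```

Because `tick` is abstract, the base class cannot be instantiated directly, which mirrors the requirement that users supply their own update logic.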
The "Swarm" class contains references to all agents within the swarm and controls the interactions between agents at every simulation time step. This is where the distributed nature of Zespol is highlighted: because the memory and process spaces for all agents are separated, the processing of individual agent updates at every time step can be distributed across heterogeneous compute clusters with tools such as Dask [13].
Bringing everything together, we have the "World" class, which contains every object and actionable element within the simulation environment. This object therefore maintains references to all swarms, visualization systems, and environmental objects such as world boundaries and obstacles. The last major responsibility of World objects is to manage the interactions between all swarms and environmental objects and to control the programmatic flow at every simulation time step.
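The separation of agent updates can be sketched as follows. This is a minimal sketch under stated assumptions, not Zespol's implementation: the `Swarm` class, the `step_agent` function, and the dictionary-based agent records are hypothetical, and a standard-library thread pool stands in for a cluster scheduler. With Dask, a small wrapper around `Client.map` and `Client.gather` could presumably be passed as `mapper` instead.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def step_agent(agent, world_state, dt):
    """Pure update: returns a new agent record and touches no shared state."""
    x, y = agent["position"]
    vx, vy = agent["velocity"]
    return {**agent, "position": (x + vx * dt, y + vy * dt)}


class Swarm:
    """Hypothetical container; `mapper` decides where the agent updates run."""

    def __init__(self, agents, mapper=map):
        self.agents = list(agents)
        self.mapper = mapper

    def tick(self, world_state, dt):
        update = partial(step_agent, world_state=world_state, dt=dt)
        self.agents = list(self.mapper(update, self.agents))


agents = [
    {"uid": i, "position": (0.0, float(i)), "velocity": (1.0, 0.0)}
    for i in range(4)
]

serial = Swarm(agents)  # built-in map: everything runs in the calling thread
serial.tick(world_state=None, dt=0.5)

with ThreadPoolExecutor(max_workers=2) as pool:
    pooled = Swarm(agents, mapper=pool.map)  # same code, different executor
    pooled.tick(world_state=None, dt=0.5)
```

Keeping each update a pure function of the agent record and a read-only snapshot is what makes the executor interchangeable in this sketch.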
For every agent, swarm, and world object there is an associated state that contains a holistic view of the object. The central idea is to establish a shareable data structure that contains only fundamental information. This avoids repeatedly passing redundant information between objects. For example, AgentStates contain an Agent's location and orientation but shouldn't contain a copy of the Agent's sensor or controller.
"AgentStates" are defined by a snapshot of the given Agent's current location and heading, the change in these values from the previous step, and the Agent's unique identifier. "SwarmStates" are represented by a collection of states from all member agents along with a variety of metrics such as angular momentum, center of mass, scatter, and radial variance. Lastly, the "WorldState" encompasses the states of all swarms along with all polygons that define the boundaries of the environment.
Besides the three predefined object-state pairs, there are three other notable objects within the system: Sensors, Controllers, and Visualizers. Each "Sensor" is representative of a real-world sensor, such as an RGB camera or LIDAR scanner, that uses information within the WorldState to recreate a synthetic version of what an Agent would perceive from its location in the world. "Controllers" accept input from Sensors and modify the location, orientation, and heading of an agent based on its physical dynamics. These dynamics are arbitrary, so they can be modified to fit a user's specific application. "Visualizers" in Zespol are separable, optional components of the simulator. They take a WorldState at every time step and generate visual output. We include a visualization system based on Matplotlib [17] to provide users with an example to follow when extending these utilities to support their specific application.
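The plug-and-play roles described above can be sketched with three small abstract interfaces. The method names (`sense`, `act`, `draw`), the dictionary-based states, and the three concrete classes are hypothetical choices made for this example; Zespol's actual interfaces may differ.

```python
import math
from abc import ABC, abstractmethod


class Sensor(ABC):
    @abstractmethod
    def sense(self, agent_state, world_state):
        """Return a synthetic reading for one agent from the shared snapshot."""


class Controller(ABC):
    @abstractmethod
    def act(self, agent_state, reading, dt):
        """Return the agent's next state given a sensor reading."""


class Visualizer(ABC):
    @abstractmethod
    def draw(self, world_state):
        """Consume a snapshot and produce output without modifying the world."""


class NearestNeighborSensor(Sensor):
    def sense(self, agent_state, world_state):
        others = [s for s in world_state if s["uid"] != agent_state["uid"]]
        return min(math.dist(agent_state["pos"], s["pos"]) for s in others)


class HoldDistanceController(Controller):
    """Advance along +x unless the nearest neighbor is closer than `min_gap`."""

    def __init__(self, speed, min_gap):
        self.speed, self.min_gap = speed, min_gap

    def act(self, agent_state, reading, dt):
        if reading < self.min_gap:
            return agent_state
        x, y = agent_state["pos"]
        return {**agent_state, "pos": (x + self.speed * dt, y)}


class TextVisualizer(Visualizer):
    """Stand-in for a plotting backend: records one text frame per call."""

    def __init__(self):
        self.frames = []

    def draw(self, world_state):
        self.frames.append(", ".join(f"{s['uid']}@{s['pos']}" for s in world_state))


world = [{"uid": 0, "pos": (0.0, 0.0)}, {"uid": 1, "pos": (3.0, 4.0)}]
sensor = NearestNeighborSensor()
controller = HoldDistanceController(speed=1.0, min_gap=2.0)
viz = TextVisualizer()

reading = sensor.sense(world[0], world)
moved = controller.act(world[0], reading, dt=0.5)
viz.draw([moved, world[1]])
```

Because each role depends only on the snapshot it is handed, any one of the three can be swapped without touching the others.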
The overall algorithmic flow starts by (1) initializing all Swarms and Agents within the World. (2) The WorldState object is constructed by querying all Swarms and Agents for their SwarmStates and AgentStates, respectively. (3) For every Agent within every Swarm, an artificial sensory perception is calculated in the Agent's Sensor based on its location relative to all other elements in the environment. (4) This perception is then passed to the associated Controller, where the AgentState is modified. (5) Once every Agent in every Swarm has calculated its new state, any visualizations and logs can be created. (6) Lastly, the newly accumulated WorldState is used to progress through the next simulation time step. Figure 1 provides a visual representation of the algorithmic flow between the fundamental Zespol elements.
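The six steps above can be condensed into a short loop. This is a minimal sketch, assuming that all agents perceive the same snapshot within a step (a synchronous update); the `run` function and its `sense`, `control`, and `on_step` callables are hypothetical names introduced for illustration.

```python
def run(swarms, sense, control, steps, on_step=None):
    """Run a synchronous simulation loop over lists of agent-state records."""
    # (1)-(2) Initialize and build the first world snapshot.
    world_state = [list(swarm) for swarm in swarms]
    for _ in range(steps):
        next_state = []
        for swarm in world_state:
            new_swarm = []
            for agent in swarm:
                perception = sense(agent, world_state)        # (3) sensor
                new_swarm.append(control(agent, perception))  # (4) controller
            next_state.append(new_swarm)
        if on_step is not None:
            on_step(next_state)                               # (5) visualize, log
        world_state = next_state                              # (6) next time step
    return world_state


def count_agents(agent, world_state):
    """Toy sensor: the perception is the total number of agents in the world."""
    return sum(len(swarm) for swarm in world_state)


def advance(agent, perception):
    """Toy controller: move along x by the perceived agent count."""
    return {**agent, "x": agent["x"] + perception}


log = []
final = run(
    swarms=[[{"x": 0}, {"x": 10}], [{"x": 100}]],
    sense=count_agents,
    control=advance,
    steps=2,
    on_step=log.append,
)
```

Building `next_state` separately from `world_state` keeps every perception in a step consistent with the same snapshot, regardless of the order in which agents are processed.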

Initial Results & Discussion
Our initial use case for Zespol was recreating the circular milling behavior from [9,37], where agents move in a uniform circle. Using knowledge gained from [37] and Zespol's modular framework, we set up a simulation environment consisting of 9 Flockbots [21], each equipped with a front-facing infrared proximity sensor and a differential drive system. An image of a real-world Flockbot can be seen in Figure 2. We selected the parameters for the Flockbot milling behavior based on the results of [37]: the World ticks at 30 ticks per second, the Swarm contains 9 agents, and Sensors have a view distance of 3 meters and the same asymmetric field-of-view found in [37], with a left bound of 11.5 degrees left of center and a right bound of 4 degrees left of center.
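To make the sensing parameters above concrete, the sketch below implements a binary field-of-view test and one differential-drive update. The view distance, field-of-view bounds, and tick rate come from the text; the angle convention (counterclockwise is left), the wheel speeds, the wheel base, and all function names are assumptions for illustration only, since the text does not list the values used.

```python
import math

VIEW_DISTANCE = 3.0            # meters (from the text)
FOV_LEFT = math.radians(11.5)  # left bound, measured left of center
FOV_RIGHT = math.radians(4.0)  # right bound, also left of center
DT = 1.0 / 30.0                # 30 ticks per second


def detects(pose, target):
    """Binary reading: True if `target` lies inside the asymmetric view wedge."""
    x, y, heading = pose
    tx, ty = target
    if math.hypot(tx - x, ty - y) > VIEW_DISTANCE:
        return False
    bearing = math.atan2(ty - y, tx - x) - heading
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))  # wrap to [-pi, pi]
    return FOV_RIGHT <= bearing <= FOV_LEFT


def binary_controller(seen):
    """Pick (left, right) wheel speeds in m/s. Values are hypothetical."""
    return (0.2, 0.1) if seen else (0.1, 0.2)


def drive(pose, v_left, v_right, wheel_base, dt):
    """One explicit Euler step of standard differential-drive kinematics."""
    x, y, heading = pose
    v = (v_left + v_right) / 2.0
    omega = (v_right - v_left) / wheel_base
    return (
        x + v * math.cos(heading) * dt,
        y + v * math.sin(heading) * dt,
        heading + omega * dt,
    )


pose = (0.0, 0.0, 0.0)
neighbor = (2.0 * math.cos(math.radians(8.0)), 2.0 * math.sin(math.radians(8.0)))
seen = detects(pose, neighbor)
pose = drive(pose, *binary_controller(seen), wheel_base=0.1, dt=DT)
```

Note that an agent directly ahead is not detected, because both bounds of the wedge lie to the left of center.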
Zespol successfully models the complex coordination between multiple agents that results in a stable milling behavior. A visualization of the resulting formation is shown in Figure 4. This highlights the ability of Zespol to recreate emergent behaviors from other simulated environments and experimental results that have been validated on real-world robotic systems.
In conclusion, the field of agent-based modeling and simulation for studying emergent behaviors has witnessed substantial growth in parallel with the demand for robotic systems that can perform collective tasks. However, the lack of standardization in simulation environments makes it challenging to compare new research ideas with existing methods. Zespol is introduced to serve as a lightweight, modular, Python-based simulation environment for developing multi-agent control algorithms. It offers ample opportunities for adoption and expansion by the broader research community. Moreover, the fidelity of Zespol is evaluated against previously published results in simulated and real-world robotics, demonstrating its ability to replicate existing swarming algorithms through a comparison between Zespol, NetLogo, and real Flockbots performing the milling behavior. With Zespol, users can develop and standardize swarming algorithms before transitioning to real-world experiments or higher-fidelity simulations. Zespol also provides native support for distributed parallelization across compute clusters and is compatible with neuromorphic computing platforms. As a result, it is a promising solution to the issues slowing the advancement of emergent behaviors in swarms of low-powered, individually incapable robots.
Although Zespol is already demonstrating promising results, there is still room for improvement to make it a solid foundation for research on the application of neuromorphic computing in swarming robotics. Our plans include developing formal interfaces for common neuromorphic computing frameworks such as Lava [18] and Nengo [3]. We will also incorporate formal support for evolutionary algorithms [12] and Bayesian optimization learning schemes [28]. To simplify the distributed nature of Zespol, we will create a user-friendly interface that minimizes the hassle of dealing with Dask [13] and multiprocessing [35]. Additionally, we will incorporate a vectorized simulation module to run simulations on multiple GPUs across heterogeneous systems. Finally, we will leverage spiking controllers to discover novel swarming behaviors.

Zespol's underlying architecture is designed with modularity in mind, where each fundamental building block has a plug-and-play interface. This design philosophy allows users to develop their own blocks such as sensor modules, controllers, and physical dynamics. All simulations are designed to minimize inter-object dependencies to reduce the chance of segmentation faults and minimize communication latency by only passing critical information between blocks. Each interface is thoroughly documented, with the provided examples showing how users can extend the framework to support their needs. More formally, each building block is represented by two data structures that form an object-state relationship. We have provided three fundamental object-state pairs: Agent-AgentState, Swarm-SwarmState, and World-WorldState. A more detailed description of these pairs is given in the following.

Figure 1. A flowchart presenting the critical programmatic flow within and between Zespol's main components.

Figure 3. A flowchart presenting a detailed view of the interprocess communication within our example Zespol application using a Flockbot robot with a binary sensor.

Figure 4. A swarm of Flockbot robots demonstrating a stable milling behavior in Zespol analogous to results found in NetLogo and real-world experiments.

Table 1. Comparison of multi-agent simulators