CityLearn v2: An OpenAI Gym environment for demand response control benchmarking in grid-interactive communities

As more distributed energy resources become part of the demand-side infrastructure, it is important to quantify the energy flexibility they provide, as well as identify the best control strategies to accelerate their real-world adoption. CityLearn provides an environment for benchmarking simple and advanced control algorithms in virtual grid-interactive communities. The updated CityLearn v2 environment introduced here extends the v1 environment to provide load shedding flexibility through heating, ventilation, and air conditioning power control coupled with a data-driven temperature dynamics model. The updated environment also includes functionality to assess the resiliency of control algorithms during power outage events.


INTRODUCTION
The shift from fossil-fuel energy sources to renewable energy sources (RESs) and the electrification of end uses are pathways towards climate change mitigation, but risk creating new challenges for the electricity grid if mismanaged. The intermittency of RESs could undermine grid resilience due to the mismatch between electricity generation and end-use demand [11]. Furthermore, extreme weather events such as heat waves and winter storms can cause both an increase in demand and a decrease in supply due to power outages.
While distributed energy resources (DERs) are able to provide load shifting and shedding flexibility [7], it is challenging to control DERs to serve diverse occupant behaviors and to coordinate multiple resources across many buildings. Advanced control algorithms such as model predictive control (MPC) [4] and reinforcement learning control (RLC) [6] can effectively manage DERs by adapting to unique building characteristics while cooperating towards grid-level energy flexibility and resiliency objectives. As more DERs become part of the demand-side infrastructure, it is important to quantify the energy flexibility that they provide, as well as identify the best control strategies to accelerate demand response (DR) program adoption.
CityLearn is an open-source Gym environment for the easy implementation and benchmarking of simple (e.g., rule-based control (RBC)) and advanced control algorithms for DR in grid-interactive communities [12]. It has been applied in voltage regulation [9], transfer learning [8], and meta-reinforcement learning [13] problems. In contrast to alternatives such as ACTB [5] and BOPTEST [3], it supports district-level and multi-agent control and, unlike DOPTEST [2], it avoids the need for a compute-intensive co-simulation engine by using simplified first-order energy models. Here, we introduce the CityLearn v2 environment, which improves on the v1 environment to provide load shedding flexibility through heating, ventilation, and air conditioning power control coupled with a data-driven temperature dynamics model. Additionally, our updated environment includes functionality to assess the resiliency of control algorithms during power outage events.

CITYLEARN
The CityLearn environment (Fig. 1) includes simplified energy models of buildings that contain heating, ventilation, and air conditioning (HVAC) systems (heat pumps and electric heaters) and energy storage systems (ESSs). Each building's space cooling, space heating, and domestic hot water (DHW) heating end-use loads are independently satisfied through air-to-water heat pumps. Alternatively, electric heaters can be used in place of heat pumps to satisfy space and DHW heating loads. ESSs are charged by the HVAC system that satisfies the end use served by the stored energy. All HVAC systems as well as plug loads consume electricity from any of the available electricity sources, including the grid, photovoltaic (PV) system, and battery. RBC, RLC, or MPC agent(s) manage load shifting in the buildings by determining how much energy to store or release at any given time. The control architecture is either one agent to many buildings (centralized) or one agent to one building (decentralized), with optional information sharing amongst agents. With CityLearn v2, the agents can also control the available power from the HVAC system to shed space thermal loads, preheat, or precool the building. The consequence of the controlled power on the indoor dry-bulb temperature, i.e., the building dynamics, is modeled using a long short-term memory (LSTM) surrogate model based on the work by Pinto et al. [10].
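To make the two control architectures concrete, the following minimal sketch applies a toy rule-based storage controller to a set of buildings in both decentralized and centralized form. It does not use the CityLearn API: the function names, the `hour` observation key, the peak and off-peak windows, and the convention that an action is a signed fraction of storage capacity (positive to charge, negative to discharge) are all assumptions made for illustration.

```python
from typing import Dict, List


def rbc_storage_action(hour: int) -> float:
    """Toy rule (assumed schedule): charge off-peak, discharge at the evening peak.

    Returns a signed fraction of storage capacity: positive means store
    energy, negative means release energy, zero means idle.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in the range 0..23")
    if hour < 6:
        return 0.1   # assumed off-peak window: charge slowly
    if 16 <= hour < 21:
        return -0.2  # assumed evening peak window: discharge
    return 0.0


def decentralized_actions(observations: List[Dict[str, int]]) -> List[List[float]]:
    """One agent per building: each agent sees only its own observation
    and returns its own single-element action list."""
    return [[rbc_storage_action(obs["hour"])] for obs in observations]


def centralized_actions(observations: List[Dict[str, int]]) -> List[List[float]]:
    """One agent for all buildings: a single agent sees every observation
    and returns one action vector with an entry per building."""
    return [[rbc_storage_action(obs["hour"]) for obs in observations]]


if __name__ == "__main__":
    # Two hypothetical buildings observed at 18:00.
    observations = [{"hour": 18}, {"hour": 18}]
    print(decentralized_actions(observations))
    print(centralized_actions(observations))
```

The only difference between the two functions is the shape of the result: the decentralized case yields one action list per agent, whereas the centralized case yields a single list covering all buildings. An RLC or MPC agent would replace the fixed schedule with a learned or optimized policy while keeping the same interface.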
CityLearn v2 also provides functionality to assess the resiliency of control algorithms during power outage events. It includes a stochastic power outage model based on the System Average Interruption Frequency Index (SAIFI) and Customer Average Interruption Duration Index (CAIDI) distribution system reliability