Making Sense of Constellations: Methodologies for Understanding Starlink's Scheduling Algorithms

The Starlink constellation is currently the largest LEO WAN and has seen considerable interest from the research community. In this paper, we use high-frequency and high-fidelity measurements to uncover evidence of hierarchical traffic controllers in Starlink: a global controller that allocates satellites to terminals and an on-satellite controller that schedules transmission of user flows. We then devise a novel approach for identifying how satellites are allocated to user terminals. Using data gathered with this approach, we measure the characteristics of the global controller and identify the factors that influence the allocation of satellites to terminals. Finally, we use this data to build a model that approximates Starlink's global scheduler. Our model predicts the characteristics of the satellite allocated to a terminal at a specific location and time with reasonably high accuracy and at a rate significantly higher than a baseline.


INTRODUCTION
Low-earth orbit (LEO) satellite networks are expected to play an important role in achieving global broadband-like Internet connectivity since they enable low-latency, last-mile connectivity without heavy infrastructure costs (e.g., cell towers and fiber deployments). Unfortunately, prior work has shown lackluster network performance from Starlink-connected end-hosts, both via measurements [20] and simulations [7,17]. Despite these findings, researchers have been unable to propose and validate methods for improving the performance of the Starlink network. This is, in large part, due to the opacity of the network and its scheduling algorithms. Knowledge of the algorithms responsible for determining which satellites route traffic from specific user terminal locations is key to engineering performance improvements for the network. In this paper, we address this gap in knowledge. In particular, we empirically uncover the scheduling algorithms used by the Starlink network by analyzing high-frequency measurements from four Starlink terminals (deployed in both the US and the EU) to servers co-located at their corresponding Points of Presence.
From our longitudinal and high-frequency measurements, we find that Starlink routes traffic from user terminals to their ground stations in a two-step process. First, a global network controller allocates a satellite to each user terminal based on a variety of factors including load, geospatial conditions, satellite charge, etc. Our experiments show that these allocations are made every 15 seconds, globally. Second, a local on-satellite controller schedules flows from the user terminals assigned to it. Taken together, these findings suggest that the hierarchical traffic engineering mechanisms commonly deployed in terrestrial WANs [13] are also deployed by Starlink. We were able to confirm the validity of these major findings using recent FCC filings by SpaceX [5].
This work is the first to uncover hierarchical controllers and the characteristics of traffic engineering in Starlink, for three key reasons. First, our high-frequency (millisecond granularity) measurements allow us to observe signatures of traffic engineering (e.g., abrupt latency changes) that cannot be observed with the coarse-grained measurements of prior work [20]. Second, by co-locating our destination server at the Starlink PoP, we minimize the influence of terrestrial latency on our measurements. Finally, our novel methodology for identifying the satellite currently serving a given terminal allows us to obtain ground truth regarding the traffic engineering decisions made by Starlink.
Contributions. This paper makes four main contributions.
• We use high-frequency measurements to show evidence of hierarchical traffic engineering on Starlink (§3).
• We develop a novel technique for identifying the satellite that serves a user terminal (§4).
• We uncover the characteristics and preferences of Starlink's global scheduler, i.e., the algorithm responsible for allocating satellites to specific user terminals (§5).
• We develop an approximation of the Starlink global scheduler that can predict the satellites allocated to user terminals with reasonably high accuracy (§6).

BACKGROUND
Starlink is a satellite constellation consisting of nearly 4000 satellites in low Earth orbit (LEO). The Starlink ecosystem has four key components: (1) in-orbit satellites, (2) user terminals or dishes, (3) ground stations, and (4) Points of Presence (PoPs). Figure 1 shows how these components interact with each other to provide Internet connectivity to Starlink's end users.
User terminals. Terminals are deployed on user premises to connect to in-orbit satellites. Starlink user terminals are sophisticated phased-array antennas equipped with a motor that can physically reposition the angle of the dish to track fast-moving satellites in the sky. User terminals can connect to any satellite at an angle of elevation higher than 25° (see Figure 1). While tens of satellites satisfy the angle of elevation constraint, a terminal can connect to only one satellite at a time. Terminals forward user traffic to the satellite assigned to them. The internals of the algorithm that maps user terminals to satellites are currently known only to the operators of the Starlink network. In this work, we empirically demonstrate characteristics of this algorithm and build an approximation of it.
Satellites. A Starlink satellite connects to multiple user terminals at a time. It allocates radio frames to the user terminals mapped to it in order to exchange data. In our work, we find evidence that this allocation is determined by a local on-satellite controller. We also find a description of this controller, referred to as the medium access control scheduler [14], in recent FCC filings from SpaceX. This controller considers factors such as user priority, current load, and per-terminal flow characteristics when forwarding traffic from user terminals to ground stations.
Ground stations and PoPs. Ground stations consist of a set of phased-array antennas that receive traffic from satellites and send it through wired links to Starlink's PoPs. Like user terminals, ground stations can communicate with satellites at an angle of elevation higher than 25° above the horizon, relative to the ground station. A PoP is a terrestrial server with wired connectivity to a ground station. PoPs are connected to the Internet backbone. From the PoP, traffic is routed to its destination on the Internet.

EVIDENCE OF TRAFFIC ENGINEERING
Experiment setup: Vantage points. We perform our measurements using four Starlink terminals, one each in Western Europe, Northeast US, Midwest US, and Northwest US.
To improve the precision of our measurements, we configured each Starlink router to operate in bridge mode and connected it to a dedicated Raspberry Pi via Ethernet. These Pis were the source of our measurements. This approach avoids the complexities that arise from using wireless routers in the measurement infrastructure. The destinations of our measurements were servers co-located at the Starlink PoPs assigned to the regions of our user terminals. This choice of destination allows our measurements to be relatively unimpacted by the vagaries of terrestrial networking.
Experiment setup: Measurements. We conduct high-frequency measurements of the round-trip times and packet loss rates between our sources and destination servers. Packets were sent using iRTT [3] at the rate of 1 packet every 20 ms and using iPerf3 at a bandwidth of 50% of the upstream connection. These parameters were chosen because they allowed stable and reliable measurements of the Starlink network. At higher frequencies and bandwidths, the packet loss rates and measured round-trip times were highly variable even within the same measurement period. From these measurements, we recorded high-resolution round-trip times and packet loss rates. To facilitate accurate measurements of the round-trip times, the clocks of our vantage points and servers were routinely synchronized using NTP.
Observation: Starlink relies on a global controller for satellite-to-terminal scheduling. Figure 2 shows the changes in measured latency during a brief measurement window of two minutes for our Midwest US terminal. Major changes in latency characteristics occur every 15 seconds: specifically, at the 12th, 27th, 42nd, and 57th second past every minute. Notably, these changes are observed from all our measured locations for all periods of time. In addition to visual tests, we confirm that the latency characteristics observed during consecutive 15-second windows are statistically different from each other (Mann-Whitney U test; p < .05) for all locations and over the entire period of our measurements. These drastic changes in latency are suggestive of global changes in the satellites allocated to user terminals for several reasons, including: (1) our measurements effectively nullify the impact of terrestrial networks; (2) these effects were observed, simultaneously, from all our vantage points; and (3) these effects were noticed even when our terminals were running well under capacity. Upon further investigation, we discovered an FCC filing from SpaceX which describes a global scheduler for periodically allocating terminals to satellites [5]. We conclude that our measurements have uncovered evidence of this scheduler and the periodicity of reallocations. It is important to note that this finding rules out the hypothesis that performance characteristics are associated with the movement of the satellite assigned to serve a terminal (e.g., Figure 7 in [16]). This is because changes in satellite allocation occur every 15 seconds, which is too short an interval for changes in satellite position or distance to meaningfully affect performance. In the remainder of this paper, we will uncover the characteristics of this scheduler and develop an offline approximation for it.
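The windowing and the statistical comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, timestamps are assumed to be Unix times, and the test uses the normal approximation without tie or continuity corrections, so it is only a rough check for reasonably large samples.

```python
import math
from collections import defaultdict

SLOT_SECONDS = 15
SLOT_OFFSET = 12  # reallocations observed at :12, :27, :42 and :57 past each minute


def slot_index(unix_time):
    """Index of the 15-second allocation slot that contains a timestamp."""
    return math.floor((unix_time - SLOT_OFFSET) / SLOT_SECONDS)


def group_by_slot(samples):
    """Group (unix_time, rtt_ms) samples by allocation slot."""
    slots = defaultdict(list)
    for ts, rtt in samples:
        slots[slot_index(ts)].append(rtt)
    return dict(slots)


def mann_whitney_u(xs, ys):
    """Two-sided Mann-Whitney U test (normal approximation). Returns (U, p)."""
    n1, n2 = len(xs), len(ys)
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j (ties)
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u_x, n1 * n2 - u_x)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))
```

Applying `mann_whitney_u` to the RTT lists of each pair of adjacent slots returned by `group_by_slot` gives one p-value per slot boundary.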
Observation: Starlink uses an on-satellite controller for scheduling terminal flows. The second peculiar characteristic of the latency measurements from our user terminals is that, within each fifteen-second interval, the latency measurements from a user terminal frequently form parallel bands that are a few milliseconds apart. These bands are evidence that radio frames are allocated to user terminals by an on-satellite controller in a somewhat round-robin fashion. Further investigation is required to identify the exact characteristics of this controller, which we believe to be the on-satellite Medium Access Control scheduler described in a SpaceX FCC filing [14].

OBTAINING SATELLITE ALLOCATIONS
Our results show that the Starlink network uses a global scheduler to assign satellites to user terminals every 15 seconds (Section 3). Unfortunately, the Starlink mobile app no longer identifies the satellite that a user terminal is connected to. Not having this knowledge limits our ability to reverse-engineer the mechanics of Starlink's scheduling algorithms. In this section, we develop a novel technique that leverages Starlink's obstruction maps to identify the satellite allocated to a specific user terminal. At a high level, our approach involves correlating the publicly known positions of the Starlink satellites with observations of connected satellites recorded in the obstruction maps of each terminal.
Data: Obstruction maps. Obstruction maps are 123 × 123 pixel, 2-dimensional images which mark the trajectories of satellites that recently served the user terminal. These images are used to create a 3-dimensional map made available to users via the Starlink mobile app. The 3-d map is meant to help users assess the quality of the location of their terminal by highlighting any physical obstructions between their terminal and the satellites meant to serve it. Figure 3a shows an example of this map from the Starlink app. We used starlink-grpc-tools [4] to extract the 2-d obstruction maps every 15 seconds from each terminal. The 3-d maps cannot be obtained programmatically. Figures 3b and 3c show examples of the 2-d maps from two consecutive 15-second time slots.
Data: Satellite positions. Positions of Starlink satellites are publicly available from a variety of sources in the two-line element (TLE) format. We use CelesTrak [2] to get the TLEs for Starlink satellites. Since these files only indicate satellite positions every six hours, we use the SGP4 satellite propagation algorithm [22] to calculate satellite positions, relative to a terminal location, for a specific point in time.
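The final step, expressing a propagated satellite position relative to a terminal, can be sketched as below. This is an illustration under stated assumptions rather than the authors' pipeline: it assumes the SGP4 output has already been converted from its native TEME frame to Earth-fixed (ECEF) coordinates in metres, and the function names are hypothetical.

```python
import math

WGS84_A = 6378137.0           # WGS84 semi-major axis, metres
WGS84_E2 = 6.69437999014e-3   # WGS84 first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, alt_m):
    """Convert geodetic latitude, longitude and altitude to ECEF metres."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + alt_m) * math.sin(lat)
    return x, y, z


def look_angles(term_lat, term_lon, term_alt_m, sat_ecef):
    """Return (azimuth_deg, elevation_deg) of a satellite seen from a terminal."""
    tx, ty, tz = geodetic_to_ecef(term_lat, term_lon, term_alt_m)
    dx, dy, dz = sat_ecef[0] - tx, sat_ecef[1] - ty, sat_ecef[2] - tz
    lat, lon = math.radians(term_lat), math.radians(term_lon)
    # Rotate the terminal-to-satellite vector into local east/north/up axes.
    east = -math.sin(lon) * dx + math.cos(lon) * dy
    north = (-math.sin(lat) * math.cos(lon) * dx
             - math.sin(lat) * math.sin(lon) * dy
             + math.cos(lat) * dz)
    up = (math.cos(lat) * math.cos(lon) * dx
          + math.cos(lat) * math.sin(lon) * dy
          + math.sin(lat) * dz)
    azimuth = math.degrees(math.atan2(east, north)) % 360  # 0 = north, 90 = east
    elevation = math.degrees(math.atan2(up, math.hypot(east, north)))
    return azimuth, elevation
```

A satellite is a candidate for a slot when its elevation from `look_angles` exceeds the 25° minimum described in the background section.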
Method: Uncovering gRPC obstruction map parameters. As seen in Figures 3b and 3c, the 2-d obstruction maps are plain: they contain only white pixels, which indicate the trajectories of recently connected satellites without any further context (e.g., the angle of elevation and azimuth). Recovering these parameters from the map is crucial for identifying the connected satellite. To do this, we align the recorded 2-d maps with the 3-d maps observed on the Starlink app. From this alignment, we find that: (1) the 2-d obstruction map is a polar plot centered at pixel (62, 62); (2) the radius of the polar plot represents the angle of elevation and ranges from 25° to 90°; and (3) the angle θ of the polar plot represents the azimuth, where θ = 0 represents North. Finally, since the obstruction map is a square which contains a polar plot, we also need the boundaries of the polar plot within this square. We obtain these by keeping the terminal online for two consecutive days. Over a 2-day period, the terminal establishes connections with satellites from practically all the regions of the sky that are within its field of view. Since the gRPC map does not reset (unless the terminal goes offline), this results in the polar plot region of the gRPC map being essentially fully colored. Using this fully colored map (an example is shown in Figure 3e), we find that the radius of the contained polar plot is 45 pixels.
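The recovered parameters suggest a simple pixel-to-angle conversion, sketched below. The center, the 45-pixel radius, and the 25° to 90° elevation range come from the text; everything else is an assumption made for illustration: that the center corresponds to the zenith, that elevation varies linearly with radius, that north points up in the image with azimuth increasing clockwise, and that (62, 62) is given as (row, column). The function name is hypothetical.

```python
import math

CENTER = (62, 62)      # polar-plot centre (row, col) as reported in the text
PLOT_RADIUS_PX = 45    # radius of the polar plot as reported in the text
MIN_ELEVATION = 25.0   # elevation at the rim, degrees
MAX_ELEVATION = 90.0   # elevation at the centre, degrees (assumed zenith)


def pixel_to_angles(row, col):
    """Map a pixel to (azimuth_deg, elevation_deg); None if outside the plot."""
    dx = col - CENTER[1]   # positive to the right (assumed east)
    dy = CENTER[0] - row   # positive upward (assumed north)
    r = math.hypot(dx, dy)
    if r > PLOT_RADIUS_PX:
        return None
    # Assumed linear mapping from radius to elevation.
    elevation = MAX_ELEVATION - (r / PLOT_RADIUS_PX) * (MAX_ELEVATION - MIN_ELEVATION)
    azimuth = math.degrees(math.atan2(dx, dy)) % 360  # 0 = north, clockwise
    return azimuth, elevation
```

If the true mapping is not linear or the image is oriented differently, only the two lines computing `elevation` and `azimuth` need to change.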
Method: Isolating satellite trajectory. After recovering the parameters of the gRPC obstruction map, we use them to identify the trajectory of the satellite that is allocated to the terminal for a specific 15-second slot (denoted by t). To obtain this trajectory, we perform an XOR operation on the obstruction maps from slots t and t − 1 (i.e., the prior 15-second slot). This erases all satellite trajectories that are common to the two maps, leaving visible only the trajectory associated with the satellite connected to the terminal during slot t. We note that for this method to work, satellite trajectories must not overlap with the trajectories of previously connected satellites (since an XOR would erase the overlaps). To ensure that this condition is met, we perform a terminal reset every 10 minutes (since resetting the terminal starts a fresh gRPC map).
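A minimal sketch of the XOR step, assuming each obstruction map has already been thresholded into a list of rows of 0/1 values (the helper names are hypothetical):

```python
def xor_maps(previous, current):
    """Pixel-wise XOR of two equally sized binary maps (lists of rows of 0/1)."""
    if len(previous) != len(current) or any(
        len(p) != len(c) for p, c in zip(previous, current)
    ):
        raise ValueError("maps must have the same shape")
    return [
        [p ^ c for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def lit_pixels(binary_map):
    """(row, col) coordinates of the pixels that are set to 1."""
    return [
        (r, c)
        for r, row in enumerate(binary_map)
        for c, value in enumerate(row)
        if value
    ]
```

Calling `lit_pixels(xor_maps(previous_map, current_map))` yields only the pixels added during the current slot, which is why overlapping trajectories would be lost.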
Method: Identifying serving satellite. From the prior step, we have the angle of elevation and azimuth trajectories of the satellite that served the terminal for a specific 15-second slot. To identify the specific ID of the serving satellite, we use the computed relative locations of the satellites in the area (as previously explained) to obtain the IDs of all satellites which are visible to our terminal during the specific time slot. Next, we compute the angles of elevation and azimuths for each of these satellites during the given slot. Finally, we convert the trajectories to Cartesian coordinates and then use the Dynamic Time Warping (DTW) [21] distance measure to compute the similarity of trajectories. We select the satellite whose angle of elevation and azimuth trajectories are the most similar to those recorded from the gRPC maps. We validate our similarity matching via a manual (visual) pilot study, in which the authors manually identified the best match between 500 sets of gRPC and TLE trajectories. The DTW similarity method and our manual tests agreed on over 99% of all outcomes.
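The matching step can be sketched with a textbook dynamic-programming DTW. This is an illustrative implementation, not the authors' code: the projection used to obtain Cartesian coordinates (zenith at the origin, north along +y) and the function names are assumptions.

```python
import math


def to_cartesian(azimuth_deg, elevation_deg):
    """Project (azimuth, elevation) onto a plane with the zenith at the origin."""
    r = 90.0 - elevation_deg
    az = math.radians(azimuth_deg)
    return r * math.sin(az), r * math.cos(az)


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance between two point sequences."""
    if not a or not b:
        raise ValueError("trajectories must be non-empty")
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def best_match(observed, candidates):
    """ID of the candidate whose trajectory is closest to `observed` under DTW.

    observed: list of (azimuth_deg, elevation_deg) points from the obstruction map.
    candidates: dict mapping satellite ID to a list of (azimuth_deg, elevation_deg).
    """
    observed_xy = [to_cartesian(az, el) for az, el in observed]
    return min(
        candidates,
        key=lambda sat_id: dtw_distance(
            observed_xy, [to_cartesian(az, el) for az, el in candidates[sat_id]]
        ),
    )
```

DTW tolerates the two trajectories being sampled at different rates, which matters here because the map pixels and the propagated positions are not sampled on the same time grid.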

STARLINK'S GLOBAL SCHEDULER
The Starlink network uses a global scheduler to allocate user terminals to individual satellites every 15 seconds (Section 3). However, the scheduling algorithm is not publicly known. In this section, we analyze characteristics of the satellites that were allocated to our terminals with the goal of reverse-engineering the algorithm of the global scheduler. On average, there are 35-44 satellites in the field of view of a user terminal in any 15-second slot. Having identified the satellite that a terminal is connected to during a slot (Section 4), we compare properties of the satellites that are selected by the scheduler with those that were available but not chosen.

Impact of satellite position
Angle of elevation. First, we compare the positions in the sky of satellites that were available but not selected to the positions of satellites that were selected by the global scheduler. The position of a satellite in the sky is defined by its angle of elevation (AOE) and azimuth with respect to the user terminal. Figure 4 shows that the median angle of elevation of selected satellites (solid lines) is 22.9° higher than that of the available but unselected satellites (dotted lines). Although only 30% of all available satellites had their AOEs in the 45° to 90° range, the global scheduler picked 80% of satellites from this range (averaged over all locations).
Direction. Figure 5 shows the distribution of azimuths of the two sets of satellites. The plot is divided into 4 quadrants which represent the direction (stated at the top of each quadrant) of the satellites relative to the face of the user terminal. Although the azimuths of available satellites (dotted lines) are evenly distributed throughout the 4 quadrants, azimuths of selected satellites (solid lines) are skewed towards the north of the dish, except for the dish in Ithaca, NY. We investigated this difference and found that our user terminal in Ithaca was severely obstructed by trees towards the northwest, causing it to be assigned fewer satellites from that direction. The terminal in Ithaca was assigned only 9.7% of its satellites from this region, compared to 55.4% on average for user terminals in other locations. In other locations, 58% of satellites on average were available towards the north of the user terminals; however, the user terminal was mapped to satellites from the north 82% of the time.
Rationale: The International Telecommunication Union has imposed a mandatory geo-stationary orbit exclusion zone, which prohibits LEO satellites from transmitting to or receiving from a ground station while being in the protected part of the sky [1]. This mandate forces terminals at latitudes above 40° N, the approximate latitude of our terminals, to point much higher than required by the minimum angle of elevation constraint. This is why the global scheduler assigns satellites higher up in the fields of view of our terminals. Moreover, satellites with a higher AOE can communicate with terminals in a more energy-efficient way. The distance between the user terminal and the satellite decreases as the AOE increases. As received radio frequency (RF) power falls off with distance, satellites farther away need to use significantly more power to communicate with user terminals.
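The relationship between angle of elevation and distance can be made concrete with a small worked example. The sketch below assumes a spherical Earth of radius 6371 km and a 550 km orbital altitude, neither of which is stated in the text, and uses the free-space inverse-square law to express the extra path loss relative to a satellite directly overhead. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
ALTITUDE_KM = 550.0  # assumed shell altitude; not stated in the text


def slant_range_km(elevation_deg, altitude_km=ALTITUDE_KM):
    """Terminal-to-satellite distance for a given angle of elevation."""
    e = math.radians(elevation_deg)
    r = EARTH_RADIUS_KM
    # Law of cosines on the triangle Earth centre, terminal, satellite.
    return math.sqrt((r + altitude_km) ** 2 - (r * math.cos(e)) ** 2) - r * math.sin(e)


def relative_path_loss_db(elevation_deg, reference_elevation_deg=90.0):
    """Extra free-space path loss (dB) relative to the reference elevation."""
    ratio = slant_range_km(elevation_deg) / slant_range_km(reference_elevation_deg)
    return 20 * math.log10(ratio)
```

Under these assumptions, a satellite at the 25° minimum elevation is roughly twice as far away as one at the zenith, which corresponds to about 6 dB of additional free-space path loss.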

Impact of satellite launch dates
The Starlink constellation consists of more than 4,000 satellites which have been launched in batches since 2018. We analyze whether the global scheduler prefers satellites from certain batches over others. To do this, we bin satellites by the year and month of their launch batch. We then compute, for each launch bin, the percentage of slots in which a satellite from that launch was picked out of the slots in which a satellite from that launch was available, over all 15-second slots in our observation. Figure 6 shows the distributions of these probabilities with the launch dates of satellites binned by year and month. Averaged over all locations, the probability of picking a satellite increases with each one-month increase in the satellite's launch date.
Rationale: As Starlink satellites are launched in batches, the service times of the latest and oldest satellites can differ by years. As a Starlink satellite has a service lifetime of about 5 years, some parts of the constellation would be out of service years before others. This would require consistent replacement efforts to maintain the constellation's coverage. Hence, using satellites launched later
[Figure 6. x-axis: launch date of satellites, 2020-01 to 2023-01.]
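The per-launch selection percentage described above can be computed as in the following sketch. The input layout (one tuple per 15-second slot holding the launch bins of the available satellites and the launch bin of the picked satellite) and the function name are assumptions made for illustration.

```python
from collections import Counter


def selection_rate_by_launch(slots):
    """Percentage of slots in which each launch bin was picked when available.

    slots: iterable of (available_launches, picked_launch), where
    available_launches lists the 'YYYY-MM' launch bin of every satellite in
    view during the slot and picked_launch is the bin of the serving satellite.
    """
    available = Counter()
    picked = Counter()
    for available_launches, picked_launch in slots:
        # A launch counts once per slot, however many of its satellites are in view.
        for launch in set(available_launches):
            available[launch] += 1
        picked[picked_launch] += 1
    return {launch: 100.0 * picked[launch] / n for launch, n in available.items()}
```

Plotting the returned percentages against the launch bins in chronological order gives the kind of trend summarized in Figure 6.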

Impact of being sunlit
Starlink satellites are equipped with solar panels to provide power for their operations. However, as satellites orbit the earth, they periodically go in and out of sunlight. To serve user terminals while it is not in sunlight, a satellite has to conserve energy to stay functional until it is sunlit again. For this reason, we analyze whether Starlink's global scheduler has a preference for sunlit satellites.
For every 15-second slot, we calculate the positions of all available satellites relative to the sun using the Skyfield library. We then extract the 15-second slots during which at least one sunlit and at least one dark satellite was available. During such slots, the global scheduler opts for a sunlit satellite 72.3% of the time, averaged over all locations. We also find that the global scheduler only picks dark satellites during 15-second slots in which at least 35% of the available satellites are dark (averaged over all locations). Next, we compare the positions of the dark satellites that were picked by the global scheduler with the positions of their sunlit counterparts. We find that the AOE of dark satellites picked by the scheduler was 25 degrees higher than that of their sunlit counterparts.
Rationale: As mentioned earlier, received RF power falls off with distance, so satellites that are farther away expend more energy in communicating with the user terminal. As dark satellites have limited battery to stay functional, the global scheduler assigns user terminals to dark satellites that are higher up in the field of view in order to conserve energy.
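For intuition, the sunlit check can be approximated with a cylindrical Earth-shadow model, sketched below. This is a deliberate simplification and not the computation performed by the Skyfield library: it ignores the penumbra and Earth's oblateness, and it assumes the satellite position and the Earth-to-Sun direction are given in the same Earth-centered frame. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def is_sunlit(sat_pos_km, sun_dir):
    """Cylindrical-shadow test: True unless the satellite is behind Earth.

    sat_pos_km: Earth-centred satellite position (x, y, z) in km.
    sun_dir: vector pointing from Earth towards the Sun, same frame.
    """
    norm = math.sqrt(sum(c * c for c in sun_dir))
    unit = [c / norm for c in sun_dir]
    along = sum(p * u for p, u in zip(sat_pos_km, unit))
    if along >= 0:
        return True  # satellite is on the Sun-facing side of Earth
    # Squared distance from the satellite to the Earth-Sun axis.
    perp_sq = sum(p * p for p in sat_pos_km) - along * along
    return perp_sq > EARTH_RADIUS_KM ** 2


def is_mixed_slot(sunlit_flags):
    """True if a slot offers at least one sunlit and one dark satellite."""
    flags = list(sunlit_flags)
    return any(flags) and not all(flags)
```

Only slots for which `is_mixed_slot` holds are informative about the scheduler's preference, since in all other slots it has no choice between sunlit and dark satellites.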

MODELING THE GLOBAL SCHEDULER
Our analysis has shown that the Starlink global scheduler has specific preferences, i.e., it selects newer satellites that are sunlit, located towards the northwest, and at a high angle of elevation. We now use these preferences and our data to build an offline model of the scheduler. The goal of this model is to predict the characteristics of the satellite allocated to serve a terminal in a given location and time.
Feature selection. Given a specific location and time, we first identify the satellites that are available to serve the user terminal using our TLE dataset and the SGP4 algorithm. Next, we cluster the available satellites based on their azimuth (θ), angle of elevation (φ), age (a), and sunlit status (u) as follows: given a set of satellites (S) available at time t for location l, the satellite s ∈ S with parameters (θ_s, φ_s, a_s, u_s) is placed in the cluster:
$$\left( \left\lfloor \frac{|\theta_s - \mu(\theta)|}{\sigma(\theta)} \right\rfloor,\; \left\lfloor \frac{|\phi_s - \mu(\phi)|}{\sigma(\phi)} \right\rfloor,\; \left\lfloor \frac{|a_s - \mu(a)|}{\sigma(a)} \right\rfloor,\; u_s \right)$$
Here μ(f) and σ(f) denote the mean and standard deviation of the feature f (computed from S). Effectively, this approach clusters satellites by how many standard deviations away from the group mean each of their features is. For example, a satellite in the cluster (1, 0, 2, 1) is further than 1 standard deviation away from the mean azimuth and 2 standard deviations away from the mean age of all the satellites in S. As features for our scheduler, we present the local time and the count of satellites available in each cluster at the start of the nearest 15-second interval.
Training, testing, and validating our model. Our goal is to construct a model which takes the above features as input and returns the cluster to which the allocated satellite belongs. We train a random forest model because of its robustness to overfitting and the explainability of its predictions. We selected the hyperparameters of this model using grid search and five-fold cross-validation. We use the data gathered from each location and the measured satellite allocations (Section 4) to construct our model: 80% of the data is used to create a training/testing dataset for a five-fold cross-validation evaluation and the remaining 20% is used to create a holdout dataset to validate our model's robustness to overfitting.
Evaluating the model. To measure the accuracy of our model, we use a top-k accuracy metric, i.e., we analyze the accuracy of the k most likely predictions made by the model. To put this accuracy in context, we compare it to a baseline model which simply returns the (top-k) cluster(s) with the largest number of available satellites as its prediction. Our results obtained over our holdout set are shown in Figure 7. The proposed model significantly outperforms the baseline model and predicts the correct allocated satellite characteristics 65% of the time (k = 5 guesses), in comparison to the baseline of 22%.
Limitations. Our model is constructed using only publicly measurable data related to Starlink's satellites. However, based on disclosures in SpaceX's FCC filings [5,14], we expect that other publicly unavailable features such as terminal density in a region and satellite load characteristics will also impact the global scheduler. Therefore, the performance of our model is constrained by the unavailability of data. Despite this, our model is demonstrably robust to overfitting and is able to make reasonably accurate and explainable predictions at a rate far higher than the baseline.
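The clustering, the count-based baseline, and the top-k accuracy metric can be sketched as follows. This is a hedged illustration rather than the released model: it assumes the cluster index of each numeric feature is the floor of its absolute z-score (which is consistent with the (1, 0, 2, 1) example), uses the population standard deviation, encodes the sunlit status as 0 or 1, and omits the random forest itself. The dictionary keys and function names are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev

NUMERIC_FEATURES = ("azimuth", "elevation", "age")


def cluster_of(sat, sats):
    """Cluster label of `sat` relative to the group `sats` (list of dicts)."""
    label = []
    for feature in NUMERIC_FEATURES:
        values = [s[feature] for s in sats]
        mu, sigma = mean(values), pstdev(values)
        # Whole standard deviations away from the group mean (assumed form).
        label.append(0 if sigma == 0 else math.floor(abs(sat[feature] - mu) / sigma))
    label.append(int(sat["sunlit"]))
    return tuple(label)


def cluster_counts(sats):
    """Number of available satellites per cluster: the model's count features."""
    return Counter(cluster_of(s, sats) for s in sats)


def baseline_top_k(sats, k):
    """Baseline prediction: the k clusters with the most available satellites."""
    return [cluster for cluster, _ in cluster_counts(sats).most_common(k)]


def top_k_accuracy(predictions, truths):
    """Fraction of slots whose true cluster is among the k predicted clusters."""
    hits = sum(1 for preds, truth in zip(predictions, truths) if truth in preds)
    return hits / len(truths)
```

A trained classifier would replace `baseline_top_k` by returning its k most probable clusters for each slot; `top_k_accuracy` then scores both on the same holdout slots.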
Model release. To facilitate future simulations and evaluations of the Starlink network, our model will be made publicly available upon paper acceptance.

RELATED WORK
Previous work has studied Starlink's performance along the axes of latency and throughput with coarse-grained measurements [20,24].Researchers have found geographic variability [16] in Starlink performance.Others have developed simulations of LEO constellations [17,25] to suggest improvements to the network [8].Researchers have proposed to improve end host performance in Starlink using better routing protocols [11,12], better beamforming techniques [18], satellite network topology reconfiguration [6], clean-slate protocol design [10] and efficient satellite handoff techniques [15,19].Security researchers have also found potential attack surfaces in Starlink [9,23].

CONCLUSIONS
In this paper, we leveraged high-frequency and high-fidelity measurements from four globally distributed vantage points to uncover the presence of a hierarchical traffic controller: a global controller for allocating satellites to user terminals and an on-satellite local controller for scheduling user flows. Using a novel technique for identifying the satellites allocated to terminals, we identified the characteristics and preferences of the global scheduler and developed an offline approximation. Taken together, our work is a first step towards understanding Starlink's traffic engineering decisions and supporting future efforts to model and evaluate scheduling algorithms for the Starlink network.

Figure 3: Obstruction maps (a) obtained from the Starlink app, (b, c) obtained from gRPC for two consecutive 15-second slots t − 1 and t, (d) their XOR, and (e) the gRPC map after two days without a terminal reset.

Figure 5: CDF of azimuths (AZs) of satellites available (dotted) vs. AZs of satellites chosen (solid). The x-axis is divided into 4 quadrants of 90 degrees each; the direction of the quadrant relative to the user terminal's face is listed on top of each quadrant.

Figure 7: Accuracy of our model compared to the baseline using the top-k accuracy metric.