Predicting Human Teammate's Workload

High-pressure environments (e.g., disaster response) can result in variable workload that decreases human performance and degrades the overall mission performance of human-robot teams. Preemptive human workload prediction enables the robot to adapt its behavior or the mission plan as a means of optimizing human performance. However, state-of-the-art workload prediction research predicts only cognitive workload, and only up to five seconds into the future. An approach for addressing this research gap is to employ multi-step prediction across the workload components (i.e., cognitive, auditory, speech, visual, gross motor, fine motor, tactile).


INTRODUCTION
Uncertain, highly dynamic environments (e.g., disaster response) require robots to account for their human teammate's internal workload state. Knowledge of the human's cognitive workload alone is insufficient to determine the human's overall workload in peer-based teams [20,47]; multi-dimensional workload estimates of each component (i.e., cognitive, auditory, speech, visual, gross motor, fine motor, and tactile) [6,7,24,25] are required.
Workload is the ratio between the resources required to perform a task, and the human's resources available to dedicate to the task [51]. Workload may be impacted by internal factors (e.g., fatigue, or experience [23]), is susceptible to individual differences [36], and varies day-to-day [13] or from task-to-task [53]. Workload can also significantly impact performance [50]. An overload state may occur when the human has insufficient resources for the task demands [51], and underload may occur when the task demands are very low, both of which can negatively impact performance [50].
Disaster response domains often begin with a mission plan, and humans will train to perform anticipated tasks with their human or robot teammates, as shown in Figure 1. Unstructured environments prevent the use of environmentally embedded sensors (e.g., cameras), necessitating wearable sensors to capture physiological signals for workload estimation algorithms. Physiological signals (e.g., heart rate) from the training sessions may be used to estimate current workload. Proper adaptation by a robot partner requires predictions into the near future (e.g., minutes) to enable actions to redistribute team members' task assignments, or adjust the human-robot interactions to mitigate a human's degraded workload state (e.g., overload or underload).
State-of-the-art workload prediction models are limited to structured environments with embedded sensors [52], largely focus on long-term forecasts (24 hours or more) [49], and often only on cognitive workload [8,40]. A research gap exists for predicting future overall workload and its components, especially for unstructured, dynamic and uncertain environments.
The manuscript provides: related research in Section 2, a description of predictive horizon limitations in Section 3, and forecasting methods in Section 4. Final remarks are presented in Section 5.

RELATED WORK
Preemptively predicting workload levels is critical for taking actions to avoid deteriorated human-robot team performance, and maintain optimal human teammate workload levels. Existing prediction horizons may be broadly separated into long-term predictions for pre-mission planning [49], and short-term predictions for mid-mission adaptation [52]. Long-term predictions may use mission schedulers detailing expected workload due to task ordering and expected task duration, while short-term predictions use statistical or machine learning algorithms to predict moment-to-moment (e.g., regression values over the next several seconds) workload values.
Existing methods for long-term workload predictions utilize pre-mission planning to create task orderings. Mission schedules that incorporate individual workload consider each task with an expected duration and average workload level [16]. Schedulers may create mission schedules where the requisite tasks can be completed without overloading the human, while maintaining order dependencies between tasks [17,49] (e.g., a blocked road cannot be traversed without first being cleared). Workload levels may be predicted by tracking the individual's progress towards achieving the mission plan, either by tracking the human based on the time elapsed, or actions completed [10]. These methods do not account for the individual differences between personnel [36], or an individual's day-to-day differences in workload response [13]. Mission schedules that fail to adapt may be unresponsive to unexpected events, place additional demands on the human, or disturb task execution timing, resulting in poor predictions [15].
Semi-static methods may supplement long-term predictions made the day before, which lack sufficient information to make fine-grain predictions on their own, with secondary rescheduling operations mid-mission [43]. Mid-mission task allocation and rescheduling may improve system performance by accounting for fatigue [43], predicting the task completion time [3], or preemptively allocating tasks to reduce uncertainty in dynamic environments [42].
Predictions over an hour in advance are not practical for mid-mission/mid-task adaptation due to compounding error and uncertainty [33], and loss of fine-grain fidelity (e.g., a task executed at a moment in time vs. task density over a period of time) [41]. Longer horizons are more likely to encounter unexpected events that impact predictions [39], resulting in decreased accuracy. Short-term workload prediction algorithms make mid-mission predictions within the next minute. Existing algorithms predict workload levels within a single task, and only for a single time step ahead for cognitive workload [8,40,44,52], as shown in Table 1.
Prior workload values [52], historical task loads [40,52], and performance metrics [8] have been used as predictors for short-term workload prediction algorithms. Linear regression classified future cognitive workload as overload, normal load or underload within one second [52]. A two-part model predicted whether a human will be cognitively overloaded in the next five seconds, and the likelihood of the human being overloaded based on the previous three seconds of performance metrics [8]. Evolving Graph Convolutional Networks predicted cognitive workload on a seven-point scale using the previous five seconds of domain-specific (i.e., air traffic controller) predictive features [40]. Most existing algorithms employ prior workload estimates for predicting future workload values. Alternatively, the physiological data (e.g., heart rate) used as estimation features may also be forecasted. Predicted future physiological data may be used as inputs for workload estimation algorithms, resulting in predictions of future workload values. However, this approach has only predicted cognitive workload one second into the future using the prior five seconds of EEG data [44], which is unreliable in unstructured environments due to the human's physical activity [21].

Research Gap
Current research uses scheduling planners to predict tasks' ordering and duration, while using each task's expected workload values, when making predictions more than one minute ahead. Sub-minute prediction horizons use machine learning algorithms to predict the next time step for the same task. Currently, these predictions are only for cognitive workload and only for five seconds or less, which is an insufficient horizon length for robot adaptation.
A system is needed that predicts the remaining workload components (i.e., visual, speech, auditory, gross motor, fine motor, tactile), while also providing longer multiple-step predictions. Additionally, a workload prediction gap exists for time horizons between five seconds and 15 minutes, which is needed to enable preemptive adaptation. Open research questions include analyzing workload predictions, such as how lag horizons impact prediction horizon accuracy for each workload component, and how model choice impacts the maximum prediction horizon.

WORKLOAD CONTENT HORIZONS
Content horizons describe the threshold at which predictions provide no further information relative to an unconditional mean estimate [18,19]. Knowing these inflection points is critical for designing robot adaptation algorithms; however, the level of prediction specificity is a significant factor in determining predictions' duration reliability. Workload prediction must account for elements of the task performed at the predicted point, as well as the level of resources that will be available at that point in time. The level of detail related to the tasks the human will be performing (e.g., a task executed at a particular moment in time vs. task density over a period of time) places limits on the prediction horizon, as finer-grain predictions require greater accuracy.
Four time horizon categories of interest are: naive persistence, point prediction, average/peak workload prediction over a window, and unconditional mean prediction, as shown in Table 2. The time horizons separating these categories represent inflection points (e.g., the final horizon represents when a prediction is equivalent to using the unconditional mean).
Naive persistence models are common baseline forecast models [38] that treat workload as a fixed value. The workload value at time $t+1$ is considered equal to the workload value at time $t$. This method establishes a minimum prediction horizon, where predictions become useful for robot adaptation, rather than just relying on current workload values.
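The persistence baseline can be sketched in a few lines. The function names and the example workload values below are illustrative assumptions, not part of any published implementation.

```python
from statistics import fmean


def persistence_forecast(history, horizon):
    """Repeat the last observed workload value for every future step."""
    if not history:
        raise ValueError("history must contain at least one value")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return [history[-1]] * horizon


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    return fmean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical workload estimates sampled once per second.
workload = [0.42, 0.45, 0.50, 0.58]
baseline = persistence_forecast(workload, horizon=3)  # [0.58, 0.58, 0.58]
```

Any candidate forecasting model would need a lower error than this baseline at a given horizon before its predictions add value for adaptation.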
Instantaneous, or point, workload predictions must predict active task workload components at prediction time $t+h$, as well as the available human resources. This level of specificity may impact the prediction's accuracy, as well as the content horizon, due to the dependency on task prediction.
Prediction accuracy may be improved, and the content horizon may be increased, by reducing the level of task specificity. Rather than predicting the task being performed at each time point, the average workload may be predicted over a window of a given length. However, predicting averages may obscure workload peaks and valleys resulting in overload, or underload, respectively. Therefore, when predicting average workload over a window, the peak and valley workload during the window duration must also be predicted.
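As a minimal sketch of the window-level prediction targets described above, the following computes the average, peak, and valley workload over a future window. The symbol `w` for the window length, the index convention, and the sample values are assumptions made for illustration.

```python
from statistics import fmean


def window_targets(series, t, h, w):
    """Return (average, peak, valley) over the w samples starting at index t + h."""
    if h < 1 or w < 1:
        raise ValueError("h and w must be at least 1")
    start = t + h
    window = series[start:start + w]
    if len(window) < w:
        raise ValueError("series is too short for the requested window")
    return fmean(window), max(window), min(window)


# Hypothetical workload estimates; the window covers indices 2..4.
series = [0.2, 0.3, 0.9, 0.4, 0.1, 0.5]
average, peak, valley = window_targets(series, t=0, h=2, w=3)
```

In this example the average is moderate while the peak is 0.9 and the valley is 0.1, which illustrates how an average alone can hide both an overload and an underload within the same window.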
Sufficiently large window sizes will approach the mean over the entire high-level task. This point represents predicting the unconditional mean, at which point predictions provide no further value, lacking the requisite information to perform differently than predicting noise over the mean.
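One way to locate a content horizon empirically, sketched here under stated assumptions, is to compare a model's error at each horizon against the error of always predicting the unconditional mean. The use of mean squared error, the function names, and the example error values are assumptions for illustration; the formal definition is given in [18,19].

```python
from statistics import fmean


def unconditional_mean_mse(train, test):
    """MSE obtained by always predicting the training-set mean."""
    mu = fmean(train)
    return fmean((y - mu) ** 2 for y in test)


def content_horizon(model_mse_by_horizon, baseline_mse):
    """Return the largest horizon (1-indexed) reached before the model's MSE
    first stops beating the unconditional-mean baseline; 0 if it never does."""
    last_useful = 0
    for h, mse in enumerate(model_mse_by_horizon, start=1):
        if mse >= baseline_mse:
            break
        last_useful = h
    return last_useful


# Hypothetical per-horizon errors for horizons 1..5 and a baseline error of 0.4.
horizon = content_horizon([0.10, 0.20, 0.35, 0.50, 0.40], baseline_mse=0.4)  # 3
```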

WORKLOAD TIME SERIES FORECASTING
Workload may be treated as a time series with the current level of workload dependent on prior workload conditions [26,31,45], with many workload estimation algorithms further inheriting physiological features' temporal dependency [46]. The remainder of this manuscript will model the multi-dimensional workload estimates [6,7,24,25] as a series of discrete-time series, $y_{1:t}$, or a set of sequential, uniformly spaced data points, where each data point is dependent on the preceding data point(s).
Time series forecasting may be univariate (i.e., only prior workload component values are used as predictors), or multivariate (i.e., additional time dependent variables are used as predictors) [11,37]. Time series forecasting predicts the distribution of future observations given previous data points (i.e., $p(y_{t+h} \mid y_{1:t}, x_{1:t})$ for multivariate time series, where $y$, $h$, and $x$ refer to the workload time series, horizon length, and input covariates, respectively), by detecting trends and patterns in historical data in order to extrapolate future values. Common methods for time series forecasting are statistical methods (e.g., exponential smoothing [14,28], or autoregressive integrated moving average models [29]), or machine learning algorithms (e.g., recurrent neural networks [2]).
Forecasting methods may be divided into three categories to be investigated: one-step, iterative multi-step, and direct multi-step based on the number of horizons predicted [1,22].Two modeling approaches are also presented for performing workload prediction.

One-Step Forecast Methods
One-step forecasts are the simplest forecast horizon, predicting just one time step ahead. The time step size may be manipulated (e.g., one second vs. five seconds), changing the prediction horizon.
One-step forecasts may be most useful in robot adaptation of interaction modalities (e.g., communicating visually rather than verbally, based on predicted high speech and auditory workload), as the prediction horizon may be too short for task redistribution. Additionally, multi-step forecasts (e.g., iterative methods) may require prior implementation of single-step forecasting methods.
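The effect of the time step size on a one-step forecast can be sketched as a resampling step followed by the construction of lagged training pairs. Block averaging is only one possible resampling choice, and the function names are illustrative assumptions.

```python
from statistics import fmean


def resample_mean(series, step):
    """Average consecutive non-overlapping blocks of `step` samples.
    A trailing partial block is dropped."""
    if step < 1:
        raise ValueError("step must be at least 1")
    n_blocks = len(series) // step
    return [fmean(series[i * step:(i + 1) * step]) for i in range(n_blocks)]


def one_step_pairs(series, n_lags):
    """Build (lagged inputs, next value) training pairs for a one-step model."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    return [(series[i - n_lags:i], series[i]) for i in range(n_lags, len(series))]


# Hypothetical 1 Hz estimates resampled to 3-second steps: one step now spans 3 s.
coarse = resample_mean([1, 2, 3, 4, 5, 6, 7], step=3)  # [2.0, 5.0]
pairs = one_step_pairs([1, 2, 3, 4], n_lags=2)         # [([1, 2], 3), ([2, 3], 4)]
```

A model trained on the resampled series still predicts a single step, but that step covers a longer span of time, at the cost of coarser resolution.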

Multi-Step Forecast Methods
Three strategies for multiple-step horizon forecasts are considered: iterative, direct and Multi-Input Multi-Output (MIMO).
Iterative, or recursive, forecasting uses a series of one-step predictions, where each one-step forecast output is used as input to predict the next time step. This method is relatively simple, with low computational cost, but the error present in each input value provided by the one-step prediction compounds over the multiple predictions [34]. However, iterative methods provide detailed predictions for each time step, enabling finer-grain adaptation that may provide more optimal workload states. Compounding error from each step will limit the maximum usable prediction horizon.
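The iterative strategy can be illustrated with a deliberately simple one-step model. The first-order autoregressive model fit by least squares below is a stand-in chosen for brevity, not a model proposed in this manuscript, and all names are illustrative.

```python
from statistics import fmean


def fit_ar1(series):
    """Least-squares fit of y[t+1] = a + b * y[t]; returns (a, b)."""
    if len(series) < 3:
        raise ValueError("need at least three samples")
    x, y = series[:-1], series[1:]
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return my, 0.0  # constant input: fall back to predicting the mean
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def iterative_forecast(series, horizon):
    """Apply the one-step model repeatedly, feeding each output back as input."""
    a, b = fit_ar1(series)
    forecasts, last = [], series[-1]
    for _ in range(horizon):
        last = a + b * last  # any error in `last` carries into the next step
        forecasts.append(last)
    return forecasts
```

Because each prediction is computed from the previous prediction rather than from an observation, an error made at the first step propagates through every later step, which is the compounding behavior described above.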
Direct forecasting is comparable to one-step forecasting, but predicts multiple steps ahead rather than a single step. This more complex and computationally expensive method requires less training data [9]. Iterative forecasting provides multiple horizons using a single model, while direct forecasting requires a separate model to be trained for each $h$-step-ahead forecast [34]. A weakness of this method is that each time horizon is treated independently, which excludes future data trend information.
A potentially significant downside to the direct method is that it does not explicitly account for each task performed over the prediction horizon, unlike the iterative method. However, direct forecasting methods generally have less error, and may predict longer horizons. Direct methods' longer duration prediction horizons provide the robot more time to plan preemptive intervention. The longer prediction horizon also enables a greater variety of adaptation actions (e.g., task redistribution).
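The requirement of one trained model per $h$-step-ahead forecast can be sketched as follows. The single-lag linear model is again a stand-in chosen for brevity, and the function names are illustrative assumptions.

```python
from statistics import fmean


def fit_direct(series, h):
    """Least-squares fit of y[t+h] = a + b * y[t] for one fixed horizon h."""
    if h < 1 or len(series) - h < 2:
        raise ValueError("series is too short for this horizon")
    x, y = series[:-h], series[h:]
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return my, 0.0  # constant input: fall back to predicting the mean
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def direct_forecast(series, horizons):
    """Train a separate model per horizon; return {h: predicted y[t+h]}."""
    predictions = {}
    for h in horizons:
        a, b = fit_direct(series, h)  # independent of the other horizons
        predictions[h] = a + b * series[-1]
    return predictions
```

Each horizon's prediction is computed directly from observed data, so no predicted value is fed back as input, but nothing ties the prediction for one horizon to the prediction for another.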
Direct methods are further split into two categories based on what is predicted: instantaneous predictions, or average/peak window predictions. Instantaneous, or point, predictions predict exact workload values at time $t+h$, which allows for the specific targeting of suboptimal workload conditions. Prediction time horizons and predicted information resolution are inversely correlated, with longer predictions containing greater uncertainty [33,41]. Reducing the prediction specificity, such as predicting average and peak workload over a time window at time $t+h$, where $h$ represents the prediction horizon, may enable increased certainty and prediction horizon length. Average workload predictions provide knowledge of extended time periods in suboptimal conditions (i.e., overload, underload), while peak predictions provide the severity of suboptimal (e.g., overload) conditions. Increased prediction horizons provide the robot more time to take adaptive measures.
Direct forecasting methods can be expanded to use MIMO methods. Rather than generating a single output, MIMO methods forecast a series of outputs, thereby maintaining dependencies between future outputs. The output dependence maintenance helps avoid the direct forecasting conditional independence assumption, and lessens the accumulation of errors in the iterative method [9].
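A minimal MIMO sketch is shown below using a nearest-neighbor forecaster, which returns the whole observed trajectory that followed the most similar past window. This technique is a stand-in chosen because it makes the joint output explicit; it is not the model proposed in this manuscript, and the function name and parameters are illustrative.

```python
def mimo_knn_forecast(series, n_lags, horizon):
    """Return, as one joint output, the `horizon` values that followed the
    past window most similar to the most recent `n_lags` values."""
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be at least 1")
    query = series[-n_lags:]
    best_dist, best_future = None, None
    # Candidate windows must be followed by `horizon` observed values.
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        window = series[start:start + n_lags]
        dist = sum((w - q) ** 2 for w, q in zip(window, query))
        if best_dist is None or dist < best_dist:
            best_dist = dist
            best_future = series[start + n_lags:start + n_lags + horizon]
    if best_future is None:
        raise ValueError("series is too short for the requested lags and horizon")
    return best_future
```

Because the output is a single trajectory, the dependencies between consecutive future values are preserved, and no predicted value is fed back as an input.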

Design: Workload Prediction Method
Two designs are considered for the multi-dimensional workload prediction algorithm for each discussed step horizon (i.e., one-step, iterative, direct). Historical workload values and actions taken by the human will serve as predictors. Workload estimates for each workload component (i.e., cognitive, auditory, speech, visual, gross motor, fine motor, tactile) are provided via previously developed multi-dimensional workload estimation models [6,7,24,25]. The action taken is provided by a task recognition model [4,5].

Neural Network Forecasting Models.
Artificial neural networks can capture complex, nonlinear relationships between a forecast output and its predictors. Neural networks have forecast glucose levels [32], blood pressure [35], and heart rate [48]. These networks can be trained to provide regression values for each workload component, using a "global" dataset of similar time series (i.e., other humans executing the mission plan) [37]. Neural networks may also incorporate multiple variables changing with time in dynamic environments, enabling multivariate time series forecasting.
Feed forward neural networks consist of an input layer for predictors, at least one hidden layer, and an output layer that provides the predicted workload value(s). An autoregressive feed forward neural network using lagged workload component values as predictors may be used to predict workload. These models have less vulnerability to overtraining than more complicated models (e.g., recurrent neural networks), but often suffer from fixed lag horizons, and struggle to deal with missing values. Long Short Term Memory (LSTM) networks are a subset of recurrent neural networks, capable of learning long-term dependencies, and are able to learn when prior data (e.g., workload estimates) are no longer applicable and can be forgotten. LSTMs also intrinsically capture time-dependent properties of variables, and are more robust against missing intermittent data. LSTMs have successfully forecast physiological input signals used for workload estimation [44]. However, LSTMs are vulnerable to overtraining, and may require large amounts of data, which can be difficult to acquire for human teammates.
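The autoregressive feed forward design can be sketched with a very small network written in plain Python. The layer sizes, learning rate, activation function, and class name are all illustrative assumptions; a practical system would more likely rely on an established deep learning library.

```python
import math
import random


def make_lagged(series, n_lags):
    """Build (lagged inputs, next value) training pairs."""
    return [(series[i - n_lags:i], series[i]) for i in range(n_lags, len(series))]


class TinyARNet:
    """One tanh hidden layer and a linear output, trained with per-sample SGD."""

    def __init__(self, n_lags, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_lags)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        hidden = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, hidden)) + self.b2

    def mse(self, pairs):
        return sum((self.predict(x) - y) ** 2 for x, y in pairs) / len(pairs)

    def train(self, pairs, epochs=200, lr=0.05):
        for _ in range(epochs):
            for x, y in pairs:
                hidden = self._hidden(x)
                err = sum(w * hj for w, hj in zip(self.w2, hidden)) + self.b2 - y
                for j, hj in enumerate(hidden):
                    # Gradient w.r.t. the hidden pre-activation (uses old w2[j]).
                    grad_pre = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * grad_pre
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_pre * xi
                self.b2 -= lr * err
```

The fixed `n_lags` input size makes the fixed lag horizon limitation noted above concrete: the network cannot use information older than its input window, and every input slot must hold a value.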
A single predicted instance has limited use in preemptive adaptation, as the prediction quality is unknown [30]. The uncertainty in the data and prediction must be considered when taking preemptive actions. Prediction intervals represent the level of prediction uncertainty by providing a range of values the prediction is within (e.g., the predicted value has a 90% chance of being within the interval) [12]. Prediction intervals may be calculated for the neural forecasting methods by using the standard deviation of the residuals scaled by the square root of the number of horizon steps, and a value associated with the confidence coverage [27]. The inclusion of workload prediction intervals enables intervention planning.
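The interval calculation described above can be sketched as the point forecast plus or minus a coverage multiplier times the residual standard deviation times the square root of the horizon. Taking the multiplier from a standard normal distribution assumes approximately normal residuals, which is an assumption of this sketch; the function name and example residuals are illustrative.

```python
from math import sqrt
from statistics import NormalDist, stdev


def prediction_interval(point_forecast, residuals, h, coverage=0.90):
    """Return (lower, upper) bounds for an h-step-ahead forecast."""
    if h < 1:
        raise ValueError("h must be at least 1")
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be between 0 and 1")
    sigma = stdev(residuals)                          # residual standard deviation
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)    # two-sided multiplier
    half_width = z * sigma * sqrt(h)                  # widens with the horizon
    return point_forecast - half_width, point_forecast + half_width


# Hypothetical one-step residuals from a fitted forecasting model.
residuals = [-0.1, 0.1, -0.1, 0.1]
lower, upper = prediction_interval(0.5, residuals, h=4)
```

A robot could, for example, act only when the entire interval lies above an overload threshold, rather than reacting to the point forecast alone.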

Structural Time Series Models.
Structural time series consist of four components: level (a baseline level), trend (often linear data behavior over time), seasonality (patterns or cycles), and noise (variability in the data). Time series models may be represented as a sum of component functions and noise [37], as shown in Equation 1.
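A generic additive form consistent with this description is sketched below. The symbols are assumptions chosen for illustration and may differ from the notation of Equation 1: $y_t$ is the observed workload value at time $t$, $\mu_t$ the level, $\delta_t$ the trend, $\gamma_t$ the seasonality, and $\epsilon_t$ the noise, taken here to be Gaussian.

```latex
y_t = \underbrace{\mu_t}_{\text{level}}
    + \underbrace{\delta_t}_{\text{trend}}
    + \underbrace{\gamma_t}_{\text{seasonality}}
    + \underbrace{\epsilon_t}_{\text{noise}},
\qquad \epsilon_t \sim \mathcal{N}(0, \sigma^2)
```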
Each component (e.g., local level, linear trend, seasonality cycle) may be modeled using linear-Gaussian State-Space Models (SSMs). SSMs are partially observed Markov models containing latent hidden states that evolve over time, generating observations using prior values of the hidden state and observed inputs (e.g., human actions). SSMs can forecast time series with hidden variables (e.g., workload), while differentiating between the hidden state and observations of the hidden state. Disaster response environments may produce dynamic, missing or irregularly spaced data, which SSMs may take into consideration. SSMs may also incorporate known (e.g., mission plan), or predicted, future human actions, to improve forecasts at time horizon $t+h$. Probabilistic SSM forecasts also provide confidence intervals for use in adaptation planning. The trade-off for improved capabilities is increased implementation/model complexity, and computational expense, compared to the neural forecasting models. Further issues may also arise if sampling rate differences between the hidden state model and the observation model exist.
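The simplest linear-Gaussian SSM, a local level model, can be filtered with a scalar Kalman filter, as sketched below. The variances, the use of `None` to mark a missing sample, and the function names are illustrative assumptions; trend, seasonality, and observed inputs are omitted for brevity.

```python
def local_level_filter(observations, process_var, obs_var,
                       init_mean=0.0, init_var=1e6):
    """Kalman filter for level[t] = level[t-1] + w, y[t] = level[t] + v.
    Returns the filtered (mean, variance) of the hidden level after the
    final time step. A `None` observation marks a missing sample."""
    mean, var = init_mean, init_var
    for y in observations:
        var = var + process_var           # predict: the level may have drifted
        if y is None:
            continue                      # missing data: skip the update step
        gain = var / (var + obs_var)      # update: weigh observation vs. prior
        mean = mean + gain * (y - mean)
        var = (1.0 - gain) * var
    return mean, var


def forecast_local_level(mean, var, h, process_var, obs_var):
    """Predictive mean and variance of the observation h steps ahead."""
    if h < 1:
        raise ValueError("h must be at least 1")
    return mean, var + h * process_var + obs_var
```

The hidden level and its noisy observation are kept separate, missing samples only increase the state uncertainty, and the forecast variance grows with $h$, which yields the widening intervals used for adaptation planning.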

CONCLUSION
High risk, uncertain and dynamic environments (e.g., disaster response) may result in variable human workload across the workload components that can decrease human-robot team performance and potentially result in injury, loss of life or property damage. Predicting a human teammate's overall and individual workload components can enable a robot to take preemptive actions to adapt to mitigate the human's adverse workload conditions.
Current workload prediction approaches focus solely on cognitive workload [8,40], and only predict a single time step [44,52]. These approaches fail to predict the other workload components (i.e., auditory, speech, visual, gross motor, fine motor, tactile) and the overall workload necessary for preemptive adaptation. Accounting only for cognitive workload will not provide sufficient information to reliably balance the human's workload in an uncertain and dynamic environment. These approaches also fail to investigate multiple-timestep prediction problems that provide predictions into the future (i.e., 1-5 seconds is insufficient).
This manuscript proposes several methods for obtaining multi-step predictions using iterative and direct time series forecasting. These methods will provide predictions for the missing workload components, with sufficient prediction horizons for preemptive adaptation. The neural network and Structural Time Series models can be used to explore the content horizon prediction limitations to determine adaptable time horizons that can mitigate adverse workload conditions and improve human-robot team performance.

ACKNOWLEDGMENTS
The presented work was partially supported by ONR grants N00024-20-F-8705 and N00014-21-1-2052. The views, opinions, and findings expressed are those of the authors and are not to be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

Table 1: Short-term workload prediction algorithms.