Short-term Energy Forecasting using the Regression Tsetlin Machine

Due to the unpredictable nature of the energy generated by distributed energy systems such as rooftop solar panels, energy forecasting plays a vital role in energy-related applications. Machine learning models have been widely adopted for this purpose. However, inaccurate forecasts lead to losses, so forecasting accuracy is essential. The Tsetlin Machine (TM) is an emerging machine learning model that offers fast learning, low memory consumption, interpretability, and low energy consumption. In this paper, we propose an efficient short-term energy prediction model using the Regression Tsetlin Machine (RTM). We evaluate short-term energy forecasting using Ausgrid consumer data, and the experimental results show that the RTM achieves better results than simple multi-layered artificial neural networks.


INTRODUCTION
In recent years, there has been a significant increase in renewable energy integration among energy consumers. As a result, the electricity grid is becoming a more intelligent system through the integration of smart metering, optimization techniques, communication techniques, automation, and distributed energy systems (DESs) such as solar energy [15,18]. However, the electricity generated by DESs is unpredictable and intermittent, causing voltage and power variations in the grid. The users connected to the electricity grid can be divided into two categories, namely prosumers and consumers. Prosumers are consumers who both produce and consume electricity.
Energy forecasting is a mechanism that uses previous energy consumption or production data to determine future energy requirements. Energy forecasting methods can be divided into four groups, namely very short-term load forecasting (seconds to hours), short-term load forecasting (hours to weeks), medium-term load forecasting (days to months), and long-term load forecasting (months to years). Short-term load forecasting (STLF) aims to forecast energy demand for hours, days, or weeks ahead. STLF is important for efficient and accurate power system operations [14,26].
Various factors affect energy demand. These include weather changes that affect load, lighting, and heating; seasonal changes; the day of the week (weekday or weekend); and the time of day. Days with extreme weather and other anomalous days make energy forecasting extremely difficult. With the adoption of artificial intelligence (AI) in energy forecasting, the forecasting results can be improved.
Over the past few years, many energy forecasting systems have been proposed, including regression, statistical, and AI models. Among the different AI models, Artificial Neural Networks (ANNs) have received considerable attention due to their clear model structure, easy implementation, and good performance [8,22]. Furthermore, ANNs can learn complex and non-linear relationships. However, overtraining may cause an ANN to memorize patterns in the training data rather than discover the underlying relationships. To avoid this, adjusting the amount of training data, random restarts, and pruning of nodes can be adopted. In addition, ANNs lack interpretability, which is essential for understanding machine learning models and comprehending why certain decisions or predictions have been made.
Energy forecasting models help to predict future energy demand while ensuring that consumers have access to enough energy in the near future. However, the values predicted for energy demand or production by machine learning models may be inaccurate, which may in turn lead to inappropriate production or consumption of energy.
A machine learning model that provides reasons for its output, thereby increasing transparency, is called an interpretable machine learning model. Decision trees, decision rules, linear regression, and logistic regression are some of the traditional interpretable machine learning models. However, the degree of interpretability varies among these algorithms [20]. On complex problems, deep learning models achieve better accuracy than the above-stated algorithms. However, the results of deep learning models cannot be interpreted easily [19]. Thus, in this paper, we propose an energy forecasting method using the TM [11], which is not only efficient but also interpretable.
The TM is an interpretable machine learning model that uses a propositional logic-based approach and produces decision rules (e.g., if X satisfies condition A and not condition B then Y = 1) [5]. Moreover, these interpretable rules are reported to achieve better prediction accuracy than several other machine learning algorithms; an accuracy comparison on different datasets is given in the original TM study [11]. The TM has also been extended in different directions for many applications, such as the Convolutional Tsetlin Machine [13], the Multi-Layered Tsetlin Machine [12], the Tsetlin Machine with multi-granular clauses [10], and the Regression Tsetlin Machine (RTM) [1,2].
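As an illustration of the kind of decision rule mentioned above, the following minimal Python sketch evaluates a single conjunctive clause over binary features. The function name `evaluate_clause` and the feature names A and B are hypothetical and chosen only for this example; they are not taken from any TM implementation.

```python
def evaluate_clause(features, required_true, required_false):
    """Return 1 if every feature in `required_true` is 1 and every
    feature in `required_false` is 0; otherwise return 0."""
    positives_ok = all(features[name] == 1 for name in required_true)
    negatives_ok = all(features[name] == 0 for name in required_false)
    return int(positives_ok and negatives_ok)


# Rule from the text: if X satisfies condition A and not condition B then Y = 1.
rule = {"required_true": ["A"], "required_false": ["B"]}

samples = [
    {"A": 1, "B": 0},  # satisfies A and not B
    {"A": 1, "B": 1},  # violates "not B"
    {"A": 0, "B": 0},  # violates A
]
predictions = [evaluate_clause(x, **rule) for x in samples]
print(predictions)  # [1, 0, 0]
```

Because the rule is a plain conjunction of features and negated features, it can be read directly as an if-then statement, which is the sense in which such rules are interpretable.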
The rest of the paper is organized as follows. Section 2 presents the related work, while Section 3 introduces the TM and its operating principles. Section 4 presents how the TM can be used for energy forecasting, and Section 5 provides the performance evaluation. The paper concludes in Section 6.

RELATED WORK
Several studies in the literature address energy forecasting using machine learning models. In this section, we summarize the existing work related to short-term energy forecasting (STEF).
Short-term energy forecasting (STEF) is an important research area within time-series forecasting. One of the main focuses of STEF in power systems is to analyze and predict the behavior of the system based on historical observations. Normally, a forecasting system needs to be fairly complex since the behavior patterns of the system depend on external data such as weather. Moreover, to obtain better predictions, data collection, accuracy, computational power, and implementation need to be improved.
ANNs are inspired by the human biological nervous system, in particular by how the human brain processes information and communicates it through the nervous system. An ANN consists of three types of layers, namely the input layer, the hidden layer, and the output layer. These layers consist of a highly interconnected set of nodes referred to as neurons, which transform a set of inputs into a set of desired outputs. The transformation depends on the characteristics of the elements and the weights associated with the connections. Given that the weights influence the output, it is necessary to adjust the weights and the thresholds accordingly, which is referred to as the learning process of an ANN [9,23]. The electricity consumption prediction for the heating, ventilating, and air conditioning (HVAC) system in a hotel is evaluated in [3] using an ANN and random forests. The results show that the ANN performs slightly better than the random forests. Furthermore, several building energy forecasting methods using ANNs are proposed in [17,21,28].
To increase the accuracy of energy forecasting models, the decision tree (DT) algorithm is used in [4,6,25,29]. A DT consists of a set of decision nodes and a set of leaf nodes. A decision node represents an input parameter and, depending on the attributes of that parameter, can have two or more branches. A leaf node represents a decision. DT models may be highly unstable with noisy input data [4]. Even though the DT has been identified as an interpretable and efficient method for classification, it is not the most promising solution for regression.
Deep convolutional neural networks, deep recurrent neural networks, and long short-term memory (LSTM) networks are some of the other machine learning models that have been used for energy forecasting, as shown by the research work in [16,24,27]. Many of the above-stated algorithms have shown great success in forecasting. However, they fall short in achieving a balance between accuracy and interpretability. Therefore, as a solution to this issue, we propose an interpretable and efficient TM-based algorithm for energy forecasting.

THE TSETLIN MACHINE (TM)
The TM is an emerging classification mechanism introduced by Granmo [11]. The TM is a team of Tsetlin Automata (TAs) that manipulates expressions in propositional logic. A TA is a fixed-structure deterministic automaton that can learn the optimal action from a set of actions in an environment. Fig. 1 shows an instance of a simple TA with 2N states and two actions. The TM is built as a collection of TAs and operates as follows. To represent patterns, propositional formulas in disjunctive normal form are used. The propositional formulas are then organized in the form of a game and learned from labeled data with the help of a set of TAs. The payoff matrix of the game has been designed so that the Nash equilibria correspond to the optimal configuration of the TM. This leads to a simple TM architecture, which helps in achieving interpretability and transparency in both the learning and the classification process. The TM is designed for bit-wise operation: it takes bits as inputs and uses bit manipulation operators for learning. For continuous (regression) problems, the RTM can be used.
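The behavior of a two-action TA with 2N states can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the class name `TsetlinAutomaton`, the state numbering (states 1 to N select action 0, states N+1 to 2N select action 1), and the initial state are all choices made for this example.

```python
class TsetlinAutomaton:
    """Two-action automaton with 2N states (assumed numbering:
    states 1..N select action 0, states N+1..2N select action 1)."""

    def __init__(self, n):
        self.n = n
        self.state = n  # start next to the decision boundary, on the action-0 side

    def action(self):
        return 0 if self.state <= self.n else 1

    def reward(self):
        # A reward moves the state deeper into the half of the current action,
        # making that action harder to abandon.
        if self.action() == 0:
            self.state = max(1, self.state - 1)
        else:
            self.state = min(2 * self.n, self.state + 1)

    def penalize(self):
        # A penalty moves the state one step toward the other action;
        # at the boundary this flips the selected action.
        if self.action() == 0:
            self.state += 1
        else:
            self.state -= 1


ta = TsetlinAutomaton(n=3)
print(ta.action())  # 0 (state 3)
ta.penalize()
print(ta.action())  # 1 (state 4, the boundary was crossed)
ta.reward()
ta.reward()
ta.reward()
print(ta.state)     # 6 (saturates at 2N)
```

The depth of the state within one half acts as a confidence counter: an action that has been rewarded repeatedly needs several penalties before the automaton switches to the other action.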
In the following, we briefly present the details of the RTM. We start by defining the architecture of the RTM and then give a brief description of its learning mechanism.

The Regression Tsetlin Machine (RTM)
The RTM uses clauses as additive building blocks to calculate a continuous output. In other words, the polarities of the clauses are removed, and the total sum of the clause outputs is mapped to a continuous value between 0 and the maximum training output value ŷ_max. The exact expression for this mapping is given in [1,2]. To determine the feedback for the RTM, the resulting output is compared with the target output ŷ. Depending on the outcome of this comparison, Type I or Type II feedback is used to update the TAs in the clauses, as described below.
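For reference, the output mapping and the feedback selection can be written as follows. This is a reconstruction that assumes the notation of the original RTM formulation [1,2]: m is the number of clauses, C_j(X) is the binary output of clause j for input X, T is a threshold hyper-parameter, y is the RTM output, ŷ is the target output, and ŷ_max is the maximum training output value.

```latex
% Assumed notation, following the RTM formulation in [1,2].
y = \frac{\sum_{j=1}^{m} C_j(X)}{T} \, \hat{y}_{\max},
\qquad
\text{Feedback} =
\begin{cases}
\text{Type I},  & \text{if } y < \hat{y},\\
\text{Type II}, & \text{if } y > \hat{y}.
\end{cases}
```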
Here, Type I feedback reinforces true positive clause outputs, while Type II feedback turns false positive clause outputs to 0. However, the clauses must be updated in a controlled manner. When the error of the RTM is high, a large number of clauses should be updated to close the error gap. On the other hand, if the error is close to 0, a few clause updates might be enough.
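The output mapping and the error-dependent updating described above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names, the clipping of the clause sum at a threshold, and the choice of an update probability proportional to the normalized absolute error are assumptions made for this example.

```python
import random


def rtm_output(clause_outputs, threshold, y_max):
    """Map the sum of binary clause outputs to a value in [0, y_max].
    Clipping the sum at `threshold` is an assumption of this sketch."""
    votes = min(sum(clause_outputs), threshold)
    return votes / threshold * y_max


def feedback_type(y_pred, y_target):
    """Choose the feedback: Type I raises the output, Type II lowers it."""
    if y_pred < y_target:
        return "Type I"
    if y_pred > y_target:
        return "Type II"
    return None  # no error, no update


def update_probability(y_pred, y_target, y_max):
    """Assumed rule: each clause is updated with a probability
    proportional to the normalized absolute error."""
    return min(1.0, abs(y_pred - y_target) / y_max)


def select_clauses(num_clauses, y_pred, y_target, y_max, rng):
    """Pick the clauses to update; a larger error selects more clauses on average."""
    p = update_probability(y_pred, y_target, y_max)
    return [j for j in range(num_clauses) if rng.random() < p]


clause_outputs = [1, 0, 1, 1, 0, 0, 1, 0]  # 4 of 8 clauses output 1
y_pred = rtm_output(clause_outputs, threshold=8, y_max=2.0)
print(y_pred)                                 # 1.0
print(feedback_type(y_pred, 1.5))             # Type I
print(update_probability(y_pred, 1.5, 2.0))   # 0.25
chosen = select_clauses(len(clause_outputs), y_pred, 1.5, 2.0, random.Random(0))
```

With this rule, an error equal to ŷ_max updates every clause, while an error of zero updates none, which matches the qualitative behavior described in the text.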

EMPIRICAL EVALUATION
In this section, we present how the RTM is used to forecast energy. First, we introduce the dataset, and then we present how the dataset is pre-processed before it is applied to the TM. Finally, the experimental results are presented.

The Dataset
In order to assess the feasibility of the TM for energy forecasting, we used the Ausgrid dataset (available to download at https://www.ausgrid.com.au/Industry/Our-Research/Data-to-share). The Ausgrid dataset contains the electricity data of 300 randomly selected solar customers.

Data Pre-processing
For this study, we focus on energy generation and electricity consumption data along with the related time and date. We considered the general energy consumption data of one consumer in the system and used 12 feature attributes for the prediction. These features are the year, the week, the day of the week, a holiday indicator (weekday or weekend), and the energy consumption data for the past week at the considered time.
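The feature construction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the input layout (one reading per day at the considered time), the function name `build_features`, and the use of seven daily lags are hypothetical, and the sketch does not attempt to reproduce the exact 12 attributes used in the study.

```python
from datetime import date, timedelta


def build_features(readings, lags=7):
    """Build (features, target) pairs from a dict mapping a date to the
    consumption measured at the considered time of that day.

    Assumed feature layout: year, ISO week, ISO weekday (1 = Monday),
    weekend flag, followed by the readings of the previous `lags` days."""
    rows = []
    for day in sorted(readings):
        past_days = [day - timedelta(days=k) for k in range(1, lags + 1)]
        if not all(p in readings for p in past_days):
            continue  # skip days without a full history window
        _, iso_week, iso_weekday = day.isocalendar()
        features = [day.year, iso_week, iso_weekday, int(iso_weekday >= 6)]
        features += [readings[p] for p in past_days]
        rows.append((features, readings[day]))
    return rows


# Synthetic readings for ten consecutive days (illustrative values only).
start = date(2012, 7, 1)
readings = {start + timedelta(days=i): float(i) for i in range(10)}
rows = build_features(readings)
print(len(rows))   # 3 days have a full week of history
print(rows[0])
```

Since the TM operates on bits, continuous features such as these would still need to be converted to a binary representation before training; that step is not shown here.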

The RTM on Ausgrid Dataset
We train several RTMs while varying the number of epochs to analyze the forecasting performance on the pre-processed dataset. We then analyze the performance while varying the number of clauses (m = 10, 50, 100, 200, 500, 1000, 2000). The hyper-parameters are selected using binary search for each distinct RTM setup.

PERFORMANCE EVALUATION
In this section, we present the performance of the RTM-based energy forecasting algorithm with the help of the Ausgrid dataset.
For the experiment, we used the energy consumption data up to week 24 as training data for the RTM, which amounts to 351 data samples, while the rest of the data is used for testing. The performance was measured in terms of the mean absolute error (MAE) while varying the number of clauses and the number of epochs in the RTM. The variation of the MAE with the number of epochs is tabulated in Table 1. According to Table 1, we obtain the lowest MAE when the number of epochs is 200. Thereafter, we set the number of epochs to 200, varied the number of clauses, and observed the effect on the MAE. The variation of the MAE with the number of clauses is tabulated in Table 2. Furthermore, Figure 2 and Figure 3 illustrate the aforementioned variations graphically.
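The evaluation protocol above relies on a chronological train/test split and on the MAE. A minimal sketch of both is given below; the function names and the toy values are assumptions made for illustration and are not taken from the experiments.

```python
def chronological_split(samples, n_train):
    """Use the first `n_train` samples for training and the rest for testing,
    preserving the temporal order of the data."""
    return samples[:n_train], samples[n_train:]


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy example with illustrative values only.
series = [0.5, 1.0, 1.5, 1.0, 2.0, 3.0]
train, test = chronological_split(series, n_train=3)
predictions = [1.5, 2.0, 2.0]
mae = mean_absolute_error(test, predictions)
print(train, test)  # [0.5, 1.0, 1.5] [1.0, 2.0, 3.0]
print(mae)          # 0.5
```

Splitting by position rather than at random keeps future observations out of the training set, which matters for time-series data.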
To compare the performance of the proposed method with a deep learning model, we implemented the deep learning model presented in [7] (Listing 4.6) and ran it under similar conditions. We used three ANN architectures with the following parameter settings.
• ANN-1: 1 hidden layer with 20 neurons
• ANN-2: 3 hidden layers with 20, 150, and 100 neurons
• ANN-3: 5 hidden layers with 20, 200, 150, 100, and 50 neurons
In Table 3, we have tabulated the MAE values for the above-stated three ANN architectures along with the MAE for the RTM with 200 epochs and 500 clauses, given that this setting leads to the minimum error. From this tabulation, it is evident that the RTM-based approach achieves a lower MAE than the simple multi-layered ANNs.
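To give a sense of scale for the three ANN architectures listed above, the following sketch counts the trainable parameters of each network. It assumes fully connected layers with biases, 12 input features, and a single output neuron; the input and output sizes and the layer type are assumptions, since the text only specifies the hidden layer widths.

```python
def count_dense_parameters(layer_sizes):
    """Number of weights and biases in a fully connected network whose
    layer widths (input, hidden..., output) are given in `layer_sizes`."""
    return sum(
        fan_in * fan_out + fan_out  # weight matrix plus bias vector
        for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:])
    )


N_INPUTS = 12   # assumed: one input per feature attribute
N_OUTPUTS = 1   # assumed: a single regression output

hidden_layers = {
    "ANN-1": [20],
    "ANN-2": [20, 150, 100],
    "ANN-3": [20, 200, 150, 100, 50],
}

parameter_counts = {
    name: count_dense_parameters([N_INPUTS] + hidden + [N_OUTPUTS])
    for name, hidden in hidden_layers.items()
}
print(parameter_counts)  # {'ANN-1': 281, 'ANN-2': 18611, 'ANN-3': 54811}
```

Under these assumptions, the two deeper networks have far more parameters than there are training samples (351), which is one possible reason why adding layers does not reduce the MAE much in this comparison.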

CONCLUSION
In this paper, we propose an efficient energy forecasting mechanism using the Regression Tsetlin Machine (RTM). This mechanism addresses the problem of achieving accuracy, which is of significant importance for forecasting. With the help of the Ausgrid dataset, we have evaluated the forecasting performance of the proposed RTM-based mechanism and compared it with simple multi-layered artificial neural network (ANN) architectures. The results show that the RTM-based approach achieves a lower mean absolute error (MAE) than the considered ANN architectures, which provides evidence of the superiority of the RTM-based approach on this dataset.

Figure 1: Transition graph of a two-action TA.

Figure 2: Variation of MAE with the number of epochs

Figure 3: Variation of MAE with the number of clauses

Table 1: Performance variation with number of epochs

Table 2: Performance variation with number of clauses

Table 3: Performance comparison between RTM and ANN
Machine Learning Algorithm    ANN-1     ANN-2     ANN-3     RTM
MAE                           0.2706    0.2594    0.2599    0.1840