Chaotic Time Series Forecasting Using Higher Order Neural Networks

Abstract: This study presents a novel application and comparison of higher order neural networks (HONNs) for forecasting benchmark chaotic time series. Two HONN models were implemented: the functional link neural network (FLNN) and the pi-sigma neural network (PSNN). These models were tested on two benchmark time series: the monthly smoothed sunspot numbers and the Mackey-Glass time-delay differential equation time series. The forecasting performance of the HONNs is compared against that of models previously used in the literature, such as fuzzy and neural network models. Simulation results showed that FLNN and PSNN offer good performance compared to many previously used hybrid models.


I. INTRODUCTION
Time series forecasting is important in many applications, such as financial, weather and traffic forecasting. It aims to build a model that uses past observations to forecast the future. Time series found in nature are usually non-linear or chaotic [1]. According to [2], a chaotic system has four fundamental characteristics: it is aperiodic, bounded, sensitive to initial conditions and deterministic. Aperiodic means that the same state is never repeated; bounded means that neighbouring states stay within a finite range and do not approach infinity; sensitivity to initial conditions means that small changes in the initial conditions cause two close points to diverge as the state of the system progresses; and deterministic means that a rule with no random term governs the future state of the system. Chaotic behaviour has been observed in time series from many areas, such as power load [3], marketing systems [4] and exchange rates [5].
A number of methods have been used in the literature to forecast chaotic time series, such as support vector echo-state machines [1], self-organizing maps [6], and fuzzy and neuro-fuzzy models [7]-[10].
Artificial neural networks (ANNs) have also been used to forecast chaotic time series. An ANN is an intelligent approach inspired by biological nervous systems; it can learn from historical data and adjust its weight matrices to build a model that can predict the future.

Different types of ANN have been applied to chaotic time series with varying degrees of success, including Elman-Nonlinear Autoregressive with eXogenous input neural networks [2], beta basis function neural networks [11], orthogonal function neural networks [12], radial basis function networks [13] and multilayer perceptron networks [14].
A multi-layered ANN structure needs a large number of units to deal with complex mapping problems, which results in slow learning and poor generalization [15]. To overcome the drawbacks of multi-layered networks, different higher order neural networks (HONNs) with a single layer were introduced.
HONNs utilize higher order terms (i.e., product units), which allow them to transform the input space into a higher dimensional space in which linear separability is possible, thus reducing the complexity of the network [16]. Unlike multi-layered ANN structures, HONNs have only a single layer of hidden nodes, which helps to accelerate training.
Different types of HONNs have been used for time series forecasting [16]-[20], but little attention has been paid to applying HONNs to benchmark chaotic time series and comparing their performance with that of other existing models.
In summary, the contributions of this work are as follows:
• Application of two HONNs, namely the Functional Link Neural Network (FLNN) and the Pi-Sigma Neural Network (PSNN), to forecast two benchmark chaotic time series: the monthly smoothed sunspot numbers and the Mackey-Glass time-delay differential equation time series.
• Comparison of the forecasting performance of these models with other existing models.
The remainder of this paper is organized as follows. Section II briefly describes HONNs, FLNN and PSNN, and also discusses the experimental design used in this work. Section III presents the results and discussion. Finally, the conclusion is given in Section IV.

II. PROPOSED MODELS AND METHOD
This section briefly describes HONNs, FLNN and PSNN. It also presents the steps of the experimental design used in this work, including the time series, data preprocessing, network topology and training, and the evaluation metrics.

A. Higher Order Neural Networks (HONNs):
HONNs are feedforward neural networks with a combination of summing and product units. They can expand the non-linear input space into a higher dimensional space where linear separability is possible [16].
Using product units can increase the information capacity of the network, which helps it deal with complex problems using a smaller network structure. As a result of their simple architectures, HONNs have fewer free parameters and can therefore learn faster [16]. However, some HONNs suffer from a combinatorial explosion of higher order terms and exhibit slow learning [16].
This paper uses two HONN models, namely the Functional Link Neural Network and the Pi-Sigma Neural Network. These networks have different strengths and capabilities; their characteristics and structures are presented below.

B. Functional Link Neural Networks (FLNN):
FLNN was introduced by Giles and Maxwell [21]. It extends the structure of the feedforward network by introducing supplementary inputs to the network. As a result, the hyperplane generated by the FLNN provides greater discrimination capability in the input pattern space [22]. FLNN has been used for different problems such as classification [23], system identification [24] and time series forecasting [18].
There are two common models of the FLNN: tensor product and functional expansion [22]. In the former model, shown in Fig. 1, each component of the input vector is multiplied by other components of the same input vector. In other words, instead of describing an input pattern in terms of a set of components {x_i}, it is described as {x_i, x_i x_j}, where j ≥ i, or as {x_i, x_i x_j, x_i x_j x_k}, where k ≥ j ≥ i, and so on. No new information is added, but joint activations are made available to the network. The latter model, shown in Fig. 2, expands the dimension of the inputs by choosing an appropriate set of functions for the problem at hand. The difficulty with this model is that choosing a good set of functions to expand the input dimensions is hard [25]. Therefore, in this work we only consider the FLNN with tensor product.
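The tensor product expansion can be illustrated with a minimal Python sketch. The function name `tensor_product_expand` is hypothetical, and the sketch assumes that only products of distinct components are formed, which matches the count of three external inputs and four higher order inputs given for the third order FLNN of Fig. 1; it is not taken from the authors' implementation.

```python
from itertools import combinations
from math import prod


def tensor_product_expand(x, order):
    """Return the original inputs followed by the products of distinct
    components, for every product size from 2 up to `order`.

    Assumption: only distinct indices (i < j < k) are combined.
    """
    features = list(x)
    for size in range(2, order + 1):
        for indices in combinations(range(len(x)), size):
            features.append(prod(x[i] for i in indices))
    return features


# Three inputs expanded to third order: 3 inputs + 3 pairs + 1 triple.
expanded = tensor_product_expand([2.0, 3.0, 5.0], order=3)
print(expanded)  # [2.0, 3.0, 5.0, 6.0, 10.0, 15.0, 30.0]
```

The expanded vector is what the single layer of trainable weights sees, so the number of weights grows with the number of product terms.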
The FLNN in Fig. 1 is an example of a third order FLNN. It consists of three external inputs and four higher order inputs. The learning algorithm for FLNN using the incremental backpropagation algorithm is as follows:

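For a given input,
• Calculate the output as follows:
$$Y = \sigma\left( W_0 + \sum_{i} W_i x_i + \sum_{i,j} W_{ij} x_i x_j + \sum_{i,j,k} W_{ijk} x_i x_j x_k \right)$$
where σ is an activation function, W_0 is the bias, W_i, W_ij and W_ijk are the weights that link the input nodes with the output node, and x_i is a component of the input vector X.
• Compute the weight changes from the output error, where η is the learning rate and d is the desired output.
• Update the weights by adding the computed changes.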
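The steps above can be sketched in Python as follows. This is a minimal illustration and not the authors' code: it assumes a sigmoid activation, a squared-error cost (so the weight change is η (d − Y) Y (1 − Y) times the corresponding expanded input), products of distinct components only, and zero initial weights. The names `flnn_forward` and `flnn_train_step` and the sample values are hypothetical.

```python
import math
from itertools import combinations


def expand(x, order):
    """Bias input, original inputs, then products of distinct components."""
    features = [1.0] + list(x)
    for size in range(2, order + 1):
        for indices in combinations(range(len(x)), size):
            features.append(math.prod(x[i] for i in indices))
    return features


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def flnn_forward(weights, x, order):
    """Output of the single trainable layer on the expanded input."""
    features = expand(x, order)
    z = sum(w * f for w, f in zip(weights, features))
    return sigmoid(z), features


def flnn_train_step(weights, x, d, order, eta):
    """One incremental update (assumed squared-error delta rule)."""
    y, features = flnn_forward(weights, x, order)
    delta = (d - y) * y * (1.0 - y)
    return [w + eta * delta * f for w, f in zip(weights, features)]


order = 3
x, d = [0.3, 0.5, 0.7], 0.6          # hypothetical normalized sample
weights = [0.0] * len(expand(x, order))
err_before = abs(d - flnn_forward(weights, x, order)[0])
for _ in range(200):
    weights = flnn_train_step(weights, x, d, order, eta=0.5)
err_after = abs(d - flnn_forward(weights, x, order)[0])
print(err_before, err_after)
```

In practice the update would be applied to each training pattern in turn and repeated over many epochs; a single pattern is used here only to keep the sketch short.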

C. Pi-Sigma Neural Networks (PSNN):
PSNN is a feedforward neural network with one layer of trainable weights. It calculates its output as a product of sums of the input components [26]. PSNN was developed to retain fast learning and powerful mapping capability while avoiding the combinatorial explosion in the number of free parameters that occurs in FLNN.
As shown in Fig. 3, PSNN consists of two layers: the product layer and the summing layer. The trainable weights are found only between the inputs and the summing units. The structure of PSNN is highly regular because summing units can be added incrementally until a specified goal is achieved. Although PSNN is not a universal approximator [27], it has demonstrated a competent ability to deal with many problems, such as classification [28], time series forecasting [19], image coding [29] and visual cryptography [30].
The learning algorithm for PSNN using the incremental backpropagation algorithm is as follows:
For a given input,
• Calculate the output as follows:
$$Y = \sigma\left( \prod_{j} \left( W_{0j} + \sum_{i} W_{ij} x_i \right) \right)$$
where σ is an activation function, W_0j are the biases, W_ij are the weights that link the input nodes with the summing nodes, and x_i is a component of the input vector X.
• Compute the weight changes from the output error, where η is the learning rate and d is the desired output.
• Update the weights by adding the computed changes.
• Continue until the termination condition is satisfied.
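A minimal Python sketch of these steps is given below. It is an illustration under stated assumptions rather than the authors' implementation: a sigmoid activation, a squared-error cost, and a simultaneous gradient update of all summing units (the weight change for unit j is η (d − Y) Y (1 − Y) times the product of the other summing units' outputs times the input). The function names, initial weights and sample values are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def psnn_forward(W, x):
    """W holds one weight row per summing unit: [bias, w_1, ..., w_n]."""
    xb = [1.0] + list(x)
    h = [sum(w * v for w, v in zip(row, xb)) for row in W]
    return sigmoid(math.prod(h)), h, xb


def psnn_train_step(W, x, d, eta):
    """One incremental update of every summing unit (assumed rule)."""
    y, h, xb = psnn_forward(W, x)
    delta = (d - y) * y * (1.0 - y)
    new_W = []
    for j, row in enumerate(W):
        # Derivative of the product with respect to summing unit j.
        others = math.prod(h[l] for l in range(len(h)) if l != j)
        new_W.append([w + eta * delta * others * v for w, v in zip(row, xb)])
    return new_W


# Second order PSNN (two summing units) on one hypothetical sample.
W = [[0.1, 0.2, -0.1, 0.3],
     [0.2, -0.3, 0.1, 0.4]]
x, d = [0.3, 0.5, 0.7], 0.6
err_before = abs(d - psnn_forward(W, x)[0])
for _ in range(500):
    W = psnn_train_step(W, x, d, eta=0.5)
err_after = abs(d - psnn_forward(W, x)[0])
print(err_before, err_after)
```

The number of weights here grows linearly with the network order (one row per summing unit), in contrast to the growth in product terms of the tensor product FLNN.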

D. Experimental Design:
1) Time Series Benchmark Data: We used two benchmark chaotic time series, namely the monthly smoothed sunspot numbers and the Mackey-Glass time-delay differential equation time series.
The sunspot time series is a good indicator of solar activity over solar cycles [31]. Forecasting it is important because of the observed impact of solar activity on the earth, climate, weather, satellites and space missions [31].
In this paper, we downloaded the monthly smoothed sunspot time series from [32]. To compare the performance of FLNN and PSNN with the other models in [31], two thousand points from November 1834 to June 2001 were selected.
The Mackey-Glass time series is a benchmark problem that has been used by many researchers [11]-[14]. This time series is given by the following delay differential equation:
$$\frac{dx(t)}{dt} = \frac{a\,x(t-\tau)}{1 + x^{10}(t-\tau)} + b\,x(t)$$
where a = 0.2, b = −0.1, x(0) = 1.2 and τ = 17. With this setting the series produces chaotic behaviour, and we can compare the forecasting performance of FLNN and PSNN with other models in the literature. This time series can be found in mgdata.dat in MATLAB [33].
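For readers without access to mgdata.dat, a series with the same parameter setting can be approximated numerically. The sketch below is not the procedure used in this paper: it assumes a simple Euler integration with step dt = 0.1, a zero history for t < 0, and one sample per time unit, so its values will differ slightly from the MATLAB file. The function name `mackey_glass` is hypothetical.

```python
def mackey_glass(n_points, a=0.2, b=-0.1, tau=17.0, x0=1.2, dt=0.1):
    """Approximate the Mackey-Glass series with Euler steps.

    Assumptions: x(t) = 0 for t < 0, x(0) = x0, one sample per time unit.
    """
    delay_steps = int(round(tau / dt))
    steps_per_sample = int(round(1.0 / dt))
    # history[-1] is the current state, history[-1 - delay_steps] is x(t - tau).
    history = [0.0] * delay_steps + [x0]
    samples = [x0]
    while len(samples) < n_points:
        for _ in range(steps_per_sample):
            x_t = history[-1]
            x_lag = history[-1 - delay_steps]
            dx = a * x_lag / (1.0 + x_lag ** 10) + b * x_t
            history.append(x_t + dt * dx)
        samples.append(history[-1])
    return samples


series = mackey_glass(500)
print(len(series), min(series), max(series))
```

A higher order integrator such as fourth order Runge-Kutta would be a natural replacement for the Euler step if closer agreement with published data were needed.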
The input-output data pairs and the numbers of training and testing samples used in this paper for these two time series are shown in Table I. Fig. 4 and Fig. 5 show the intervals used for the training and testing samples of both time series.
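Building input-output pairs from a series can be sketched as below. The lag and horizon values in the example are purely illustrative assumptions; the settings actually used in this paper are those listed in Table I. The function name `make_pairs` is hypothetical.

```python
def make_pairs(series, lags, horizon):
    """Build (inputs, target) pairs from a univariate series.

    `lags` are non-negative offsets into the past, so lags=(18, 12, 6, 0)
    gives the inputs x(t-18), x(t-12), x(t-6), x(t); the target is
    x(t + horizon).
    """
    start = max(lags)
    inputs, targets = [], []
    for t in range(start, len(series) - horizon):
        inputs.append([series[t - lag] for lag in lags])
        targets.append(series[t + horizon])
    return inputs, targets


# Toy series 0..29 so that each value equals its own time index.
toy_series = list(range(30))
X, y = make_pairs(toy_series, lags=(18, 12, 6, 0), horizon=6)
print(X[0], y[0])    # [0, 6, 12, 18] 24
print(X[-1], y[-1])  # [5, 11, 17, 23] 29
```

The resulting pairs would then be split chronologically into training and out-of-sample sets, so that no test target precedes a training input.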
2) Data Preprocessing: We scaled the points to the range [0.2, 0.8] because we used the sigmoid activation function. We used the minimum and maximum normalization method, which is given by:
$$\hat{x} = \frac{x - min_1}{max_1 - min_1}\,(max_2 - min_2) + min_2$$
where x̂ is the normalized value of x, min_1 and max_1 are the minimum and maximum values of all observations, and min_2 and max_2 refer to the minimum and maximum values of the new range.
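The scaling to [0.2, 0.8], together with the inverse mapping needed to de-normalize forecasts, can be sketched as follows. The function names and the toy data are hypothetical and only illustrate the minimum and maximum normalization described above.

```python
def min_max_scale(x, min1, max1, min2=0.2, max2=0.8):
    """Map x from [min1, max1] to [min2, max2]."""
    return (x - min1) / (max1 - min1) * (max2 - min2) + min2


def min_max_unscale(x_hat, min1, max1, min2=0.2, max2=0.8):
    """Inverse mapping, used to de-normalize network outputs."""
    return (x_hat - min2) / (max2 - min2) * (max1 - min1) + min1


data = [10.0, 20.0, 30.0]  # toy observations
lo, hi = min(data), max(data)
scaled = [min_max_scale(v, lo, hi) for v in data]
restored = [min_max_unscale(v, lo, hi) for v in scaled]
print(scaled)    # approximately [0.2, 0.5, 0.8]
print(restored)  # approximately [10.0, 20.0, 30.0]
```

Keeping the targets away from 0 and 1 avoids the saturated ends of the sigmoid, where its derivative, and hence the weight change, becomes very small.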

3) Network Topology and Training:
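The topology of the FLNN and PSNN that we used is shown in Table II. Most of the settings were selected empirically.
The normalized mean squared error (NMSE) and root mean squared error (RMSE) used for evaluation are computed as:
$$NMSE = \frac{1}{N\sigma^2}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$
$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
where N, y and ŷ represent the number of out-of-sample data, the actual output and the network output, respectively, and σ² is the variance of the actual output.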
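The two evaluation metrics can be computed as in the following sketch. It assumes that the NMSE divides the mean squared error by the sample variance of the actual values (with N − 1 in the variance denominator); that normalization choice and the function names are assumptions, not details confirmed by the text.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error over the out-of-sample points."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n)


def nmse(actual, predicted):
    """Mean squared error divided by the variance of the actual values.

    Assumption: sample variance with an (N - 1) denominator.
    """
    n = len(actual)
    mean = sum(actual) / n
    variance = sum((y - mean) ** 2 for y in actual) / (n - 1)
    squared_error = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return squared_error / (n * variance)


actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 2.5, 3.5]
print(rmse(actual, predicted))  # 0.5
print(nmse(actual, predicted))  # approximately 0.15
```

Because the NMSE is scaled by the variance of the series, it is comparable across series with different ranges, whereas the RMSE stays in the units of the data.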

III. RESULTS AND DISCUSSION
The forecasting models for FLNN and PSNN on the two time series were built using the experimental design settings. To obtain a fair comparison between FLNN and PSNN and to avoid the influence of weight initialization, the average performance over 30 simulations is reported in Table III and Table IV. Note that the results shown in these two tables are de-normalized: we de-normalized each forecast value and compared it with the original desired value before calculating the metrics. The best average results are in boldface.
As can be seen from Table III and Table IV, FLNN outperforms PSNN on the sunspot time series, while PSNN is better than FLNN on the Mackey-Glass time series. Therefore, the strength of each model depends on the properties of the time series. During the simulations, we noticed that increasing the network order of PSNN decreases its forecasting performance on the sunspot time series but improves it on the Mackey-Glass time series. Learning curves for the best simulations are shown in Fig. 6 and Fig. 7. It can be seen that there is not much improvement in learning after 500 epochs. Note that the RMSE in Fig. 6 and Fig. 7 is not de-normalized.
The best performance of FLNN and PSNN on the out-of-sample data for the sunspot and Mackey-Glass time series is shown in Fig. 8 to Fig. 11. As can be seen from these figures, FLNN and PSNN can, to some extent, follow the dynamic behaviour of the time series. Finally, a comparison of FLNN and PSNN with different models in the literature is shown in Table V and Table VI. It should be noted that, based on our search, we could not find studies that used the same normalization range as this work. For that reason, we used the de-normalized results of the best FLNN and PSNN simulations and compared them with de-normalized results in the literature or with studies that did not use any normalization method. The results show that FLNN and PSNN offer good performance compared to other hybrid models in the literature. Therefore, hybridizing other models with FLNN or PSNN could enhance the forecasting performance.

IV. CONCLUSIONS AND FUTURE WORKS
This paper presents the application of two higher order neural networks, namely the functional link neural network (FLNN) and the pi-sigma neural network (PSNN), to forecast two benchmark chaotic time series: the monthly smoothed sunspot numbers and the Mackey-Glass time series. Results showed that FLNN outperforms PSNN on the sunspot time series, while PSNN is better than FLNN on the Mackey-Glass time series. Furthermore, a comparison with different hybrid models in the literature showed that FLNN and PSNN offer good performance compared to these hybrid models. Future work could hybridize swarm intelligence techniques with FLNN or PSNN and apply them to forecast chaotic time series.

Fig. 1 Functional link neural network of type tensor product model.

Fig. 2 Functional link neural network of type functional expansion model.

Fig. 4 Sunspot time series. Blue points are used for training and red points for testing.

Fig. 5 Mackey-Glass time series. Blue points are used for training and red points for testing.

4) Performance Metrics: Because we aim to compare the forecasting performance of FLNN and PSNN with other models in the literature, we used the Normalized Mean Squared Error (NMSE) and the Root Mean Squared Error (RMSE) metrics, both computed on the out-of-sample data.

Fig. 6 Learning curve for best FLNN simulation on Sunspot time series.

Fig. 7 Learning curve for best PSNN simulation on Mackey-Glass time series.

Fig. 9 Out-of-sample forecasting for best PSNN simulation on Sunspot time series.

Fig. 10 Out-of-sample forecasting for best FLNN simulation on Mackey-Glass time series.

Fig. 11 Out-of-sample forecasting for best PSNN simulation on Mackey-Glass time series.

TABLE V COMPARISON OF THE PERFORMANCE OF VARIOUS EXISTING MODELS ON SUNSPOT TIME SERIES

TABLE VI COMPARISON OF THE PERFORMANCE OF VARIOUS EXISTING MODELS ON MACKEY-GLASS TIME SERIES