Comparison between Cascade Forward and Multi-Layer Perceptron Neural Networks for NARX Functional Electrical Stimulation (FES)-Based Muscle Model

Abstract: This paper presents the development and comparison of muscle models based on Functional Electrical Stimulation (FES) parameters, using the Nonlinear Auto-Regressive model with Exogenous Inputs (NARX) implemented with Multi-Layer Perceptron (MLP) and Cascade Forward Neural Network (CFNN) structures. FES stimulations with varying frequency, pulse width and pulse duration were used to estimate the muscle torque. A total of 722 data points were used to create the muscle model. One Step Ahead (OSA) prediction, correlation tests and residual histogram analysis were performed to validate the model. The optimal MLP results were obtained with an input lag space of 1, an output lag space of 43 and 30 hidden units. A total of three terms were selected to construct the final MLP model, which produced a final Mean Square Error (MSE) of 1.1299. The optimal CFNN results were obtained with an input lag space of 1, an output lag space of 5 and 20 hidden units, with similar terms selected. The final MSE produced was 1.0320. The proposed approach approximated the behavior of the system well with unbiased residuals, with CFNN showing an 8.66% MSE improvement over MLP with 33.33% fewer hidden units.


I. INTRODUCTION
The spinal cord is a collection of nerves that sends commands to muscles to induce movement. Damage to the spinal cord causes paraplegia, which results in loss of sensation and voluntary movement. The level of spinal cord injury depends on the extent of the trauma. A complete spinal cord injury causes paralysis below the lesion, while incomplete injuries might retain some function below the injury level. This is because spinal cord injuries interrupt the neural pathways, making it impossible for physiological stimuli to reach the muscles innervated below the level of the lesion.
Paraplegic patients (who have suffered the loss of leg functionality) need rehabilitation to assist them in regaining lost functions and maximize their potential to reduce reliance on assistive devices such as wheelchairs [1].
Functional Electrical Stimulation (FES) has been identified as an excellent rehabilitation tool to restore the patient's walking ability [2], [3]. FES uses electrical pulses to induce skeletal muscle contraction and limb movement. The electrical stimulation is delivered to a group of muscles to induce movement and restore the functionality of the patient's legs [4].
Modelling the muscles to characterize their behavior is an important task before FES can be applied. Much research has been done to improve the muscle model. A muscle model was constructed by combining its activation and mechanical properties [5]. The activation model response was based on stimulation intensity, pulse width and frequency [6] that deal with mechanical behaviour [7]. However, there is currently little research on the construction of a muscle model that takes into account the effects of FES on the paraplegic muscle.
In this paper, we propose a muscle model that uses two neural network approaches to construct a Nonlinear Auto-Regressive model with Exogenous Inputs (NARX) of muscle behavior (torque) based on stimulation frequency, pulse width and pulse duration of muscle excitation. The models would benefit FES practitioners in creating rehabilitative devices for paraplegic patients, as they eliminate the possibility of injuries from trial-and-error experimentation during the development of such devices.
A Multi-Layer Perceptron (MLP) neural network consists of an input (sensory) layer, one or more hidden layers and an output layer. The input layer consists of several units that receive inputs from the real world, while the output layer returns the results back to the real world. The rest of the units are arranged in one or more hidden layers, which are responsible for extracting underlying patterns from the inputs [8], [9].
The Cascade Forward Neural Network (CFNN) is similar in structure to the MLP, except that the CFNN has a direct weighted connection from its input layer to its output layer, which enables it to learn highly complex patterns [10]. This allows the inputs to directly influence the output nodes by passing additional information and features to them.
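The structural difference between the two networks can be illustrated with a minimal forward-pass sketch. It assumes a single hidden layer with tangent-sigmoid hidden units and a linear output; the function names, weight names (W_h, W_o, W_d) and layer sizes are hypothetical and chosen only for illustration, not taken from the original implementation.

```python
import numpy as np

def mlp_forward(x, W_h, b_h, W_o, b_o):
    """Single-hidden-layer MLP: tanh hidden units, linear output."""
    h = np.tanh(W_h @ x + b_h)
    return W_o @ h + b_o

def cfnn_forward(x, W_h, b_h, W_o, b_o, W_d):
    """Same hidden path as the MLP, plus a direct weighted
    input-to-output connection W_d."""
    h = np.tanh(W_h @ x + b_h)
    return W_o @ h + W_d @ x + b_o

# Illustrative sizes only (assumed): 3 inputs, 20 hidden units, 1 output.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 20, 1
x = rng.normal(size=n_in)
W_h = rng.normal(size=(n_hidden, n_in))
b_h = rng.normal(size=n_hidden)
W_o = rng.normal(size=(n_out, n_hidden))
b_o = rng.normal(size=n_out)
W_d = rng.normal(size=(n_out, n_in))

y_mlp = mlp_forward(x, W_h, b_h, W_o, b_o)
y_cfnn = cfnn_forward(x, W_h, b_h, W_o, b_o, W_d)
print(y_mlp, y_cfnn)
```

With identical hidden and output weights, the two outputs differ only by the direct term W_d @ x, so setting W_d to zero reduces this CFNN sketch to the MLP.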

A. The Human Leg Musculoskeletal System
Muscles are soft tissues of the body that function to produce force and cause motion. They act as motors that drive the human kinematic system [11].
Muscles are generally divided into three types: skeletal, cardiac and smooth muscle. Cardiac muscles are found in the heart, forming the contractile walls of the organ [12], while smooth muscles are found in the hollow parts of the body such as the stomach, intestines, blood vessels, and bladder. Finally, skeletal muscles are attached to bones and can stretch or contract to produce movement.
An example of skeletal muscle is present in the thigh. This group of muscles is used for balance, posture and supporting body motion. The group of muscles at the back of the thigh forms the hamstring muscle group. Hamstring muscles cross two joints (hip and knee) and act as extensors of the thigh and flexors of the leg [12].
Quadriceps muscles extend (straighten) the leg, which is similar to the motion of rising from a chair from a sitting position. Damage to the spinal cord would result in loss of sensation and control of voluntary movement of the legs, as mentioned previously.

B. Functional Electrical Stimulation (FES)
FES is an excellent rehabilitation tool for physiotherapy, used to produce force or movement of the body in patients with conditions such as spinal cord injury, cerebral palsy and stroke [13], [14]. FES works by introducing current into specific motor neurons to generate contraction. The electrodes are placed on the skin so that the neurons receive a series of electrical pulses [15].
The intensity and frequency of the electrical current are the main parameters to produce the required tension in the electrically stimulated muscle. Stimulation intensity is defined as a function of the total charge transferred to the muscle, and it depends on pulse duration, pulse amplitude, and frequency. The resulting torque depends on the tension in the flexor and extensor, which can be controlled by varying the pulse amplitude, pulse duration, and frequency of the stimulation [15].
The FES system for leg movement is illustrated in Fig. 1. The knee moves in extension or flexion when stimulation from FES is applied to the muscle [11]. However, the task of finding suitable electrical stimulation is a difficult one since experiments are burdensome and time-consuming for the subjects. A better approach is to measure and calibrate the performance of the system, including using computer simulation prior to actual experimentation, to avoid or at least reduce the effect of these issues.
Fig. 1 Overview of the FES system [13]

II. MATERIAL AND METHOD

A. Data Collection
The data used for modelling was from a previous experiment [16]. Electrical stimulation was delivered via two gel surface electrodes, with the cathode placed on the upper thigh and the anode placed on the lower thigh. The electrodes were placed to maximize muscle contraction. Stimulation pulses were generated by FES through MATLAB software. More than 600 stimulation pulses were generated with the following specifications:
• Stimulation frequency: 10 to 50 Hz.
• Pulse width: 200 to 400 microseconds.
• Pulse duration: 1 to 5 seconds.
A dataset containing 731 readings was obtained from the experiment, of which 722 data points were used for the final experiment.

B. Create Regressor Matrix
This process constructs the regressor matrix, P, with a maximum lag space of 50 for both input and output. After the regressor matrices had been constructed, the Error Reduction Ratio (ERR) algorithm [17] was applied to select the terms to be used in the final model structure. The structure was determined by selecting the regressor terms with the highest ERR values that together account for 95% of the variability of the prediction data.
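The regressor construction and term selection described above can be sketched as follows. This is a minimal illustration, assuming a linear-in-the-regressors forward orthogonal least squares form of ERR; the exact ERR variant in [17] may differ. The function names, the synthetic data, and the reduced lag space of 3 (instead of 50) are assumptions made only to keep the example short.

```python
import numpy as np

def build_regressors(u, y, n_u, n_y):
    """Stack lagged inputs u(t-1..t-n_u) and outputs y(t-1..t-n_y) column-wise."""
    start = max(n_u, n_y)
    cols, names = [], []
    for k in range(1, n_u + 1):
        cols.append(u[start - k:len(u) - k])
        names.append(f"u(t-{k})")
    for k in range(1, n_y + 1):
        cols.append(y[start - k:len(y) - k])
        names.append(f"y(t-{k})")
    return np.column_stack(cols), y[start:], names

def err_select(P, target, threshold=0.95):
    """Forward selection: repeatedly pick the regressor with the highest ERR
    until the cumulative ERR reaches `threshold`."""
    sigma = target @ target
    selected, errs, basis = [], [], []
    remaining = list(range(P.shape[1]))
    while remaining and sum(errs) < threshold:
        best = None
        for j in remaining:
            p = P[:, j].astype(float)
            w = p.copy()
            for q in basis:  # Gram-Schmidt against already selected terms
                w -= (q @ p) / (q @ q) * q
            ww = w @ w
            if ww <= 1e-12 * (p @ p):  # skip (near) linearly dependent terms
                continue
            err = (w @ target) ** 2 / (ww * sigma)
            if best is None or err > best[1]:
                best = (j, err, w)
        if best is None:
            break
        j, err, w = best
        selected.append(j)
        errs.append(err)
        basis.append(w)
        remaining.remove(j)
    return selected, errs

# Synthetic stand-in data (not the FES dataset): y depends on y(t-1) and u(t-1).
rng = np.random.default_rng(0)
N = 722
u = rng.normal(size=N)
e = 0.01 * rng.normal(size=N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.8 * y[t - 1] + 0.8 * u[t - 1] + e[t]

P, target, names = build_regressors(u, y, n_u=3, n_y=3)
selected, errs = err_select(P, target)
chosen = [names[j] for j in selected]
print(chosen, [round(v, 3) for v in errs])
```

Each candidate column is orthogonalized against the terms already chosen, so its ERR reflects only the additional output variability it explains.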

C. Parameter Estimation
After the optimal model structure had been determined, training on both neural network models was performed. Both MLP and CFNN hidden units were varied from 5 to 30. The hidden and output activation functions are tangent-sigmoid and linear respectively, as the muscle modelling problem presented here is a function approximation problem. Prior to testing the various neural network structures, the Mersenne-Twister random number generator used to produce the initial weights was reset to a predefined seed to remove the influence of initial weights from the evaluation.
The optimal model was selected based on the Mean Square Error (MSE) criterion. As MSE values are calculated from the magnitude of the residuals, low values indicate a good model fit. The ideal case for MSE is zero (when the model outputs are exactly the same as the actual outputs). However, this rarely happens in actual modelling scenarios, and a sufficiently small value is acceptable.
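For reference, the MSE criterion can be written in its standard form. The symbols here are assumed notation rather than the paper's own: y(t) is the measured torque, ŷ(t) is the model output, and N is the number of data points.

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} \bigl( y(t) - \hat{y}(t) \bigr)^{2}
```

Because the residuals are squared, the MSE scales with the square of the output magnitude, so its absolute value should be read relative to the range of the measured data.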

D. Model Validation and Analysis
After the optimal model has been found, it needs to be validated and analyzed to ensure that it is valid and acceptable. Several tests, namely OSA prediction, correlation tests, and residual histogram analysis, were performed to select the best model that fulfils the validation criteria.
OSA is a test that measures the ability of a model to predict the next output value based on previous data. The predictive ability of the model is assessed by comparing these predictions with the actual data.
In System Identification (SI), the prediction model can only be accepted when the residuals are randomly distributed (appear as white noise). This type of residual indicates that the dynamics of the system have been fully captured by the SI model, leaving only un-modelled white noise as the residuals. Correlation tests and histogram tests are used to evaluate the randomness of model residuals. Correlation tests measure the correlation between two time-series sequences at different points in time. They are useful indicators of dependencies and correlation between two sequences. Correlation tests are done by shifting the signals at different lags and measuring the correlation coefficients (degree of correlation) between them [18].
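The lag-shifting procedure can be sketched as below. This is an illustrative example only: it assumes the commonly used large-sample 95% confidence limits of ±1.96/√N, uses synthetic white noise in place of the actual model residuals, and the function name normalized_xcorr is hypothetical.

```python
import numpy as np

def normalized_xcorr(a, b, max_lag):
    """Correlation coefficient between a(t) and b(t+lag), lag = -max_lag..max_lag."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    n = len(a)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.empty(len(lags))
    for i, k in enumerate(lags):
        if k >= 0:
            r[i] = np.dot(a[:n - k], b[k:]) / n
        else:
            r[i] = np.dot(a[-k:], b[:n + k]) / n
    return lags, r

# Synthetic white-noise residuals standing in for the model residuals.
rng = np.random.default_rng(1)
residuals = rng.normal(size=722)

lags, r_auto = normalized_xcorr(residuals, residuals, max_lag=20)
bound = 1.96 / np.sqrt(len(residuals))  # assumed 95% confidence limit
nonzero = lags != 0
inside_fraction = np.mean(np.abs(r_auto[nonzero]) <= bound)
print(f"95% bound: +/-{bound:.4f}, fraction of lags inside: {inside_fraction:.2f}")
```

Passing the model input as the first argument and the residuals as the second gives the corresponding cross-correlation test; for white residuals, most coefficients at nonzero lags are expected to fall inside the limits.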
A histogram is a graphical method to present a distribution summary of a univariate data set. It is drawn by segmenting the data into equal-sized bins (classes), then plotting the frequencies of data appearing in each bin. The horizontal axis of the histogram plot shows the bins, while the vertical axis depicts the data point frequencies. The purpose of histogram analysis is to view the distribution of the residuals. White noise appears in the histogram as a Gaussian distribution: a symmetric, bell-shaped curve with most of the frequency counts grouped in the middle and tapering off at both tails.
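A minimal sketch of the binning step is given below, using synthetic Gaussian residuals rather than the actual model residuals. The bin count of 15 and the text-based plot are arbitrary choices made for illustration.

```python
import numpy as np

# Synthetic Gaussian residuals standing in for the model residuals.
rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=1.0, size=722)

# Segment the data into equal-width bins and count the points in each bin.
counts, edges = np.histogram(residuals, bins=15)
centers = 0.5 * (edges[:-1] + edges[1:])

# Crude text histogram: bin centre on the left, frequency as a bar.
for c, n in zip(centers, counts):
    print(f"{c:6.2f} | {'#' * (int(n) // 4)}")
```

For white residuals, the tallest bars are expected near zero, with the counts tapering off towards both tails.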

III. RESULTS AND DISCUSSION
The optimal results of both neural networks are summarized in Table I. In the OSA test (Fig. 4 and Fig. 5), the dotted line indicates the predicted output of the model, while the solid line indicates the actual system output. Both models exhibited good predictive ability in the OSA tests, as both neural networks showed a close fit to the actual data.
Another important criterion of system identification is the whiteness of the residuals. This is because non-random residuals indicate model bias, as not all dynamics in the original system are sufficiently captured by the model. The correlation test results and residual histograms for both neural networks are shown in Fig. 6 to Fig. 9. Both correlation tests exhibited correlation coefficients within the 95% confidence limits (except for lag 0 in the autocorrelation test, which is expected to be 1). Additionally, the histogram tests showed a bell-shaped (Gaussian) curve. All these observations indicate that the residuals are sufficiently random. Based on the results of the OSA, correlation and histogram tests, both neural network models were therefore considered accurate, valid and acceptable, with uncorrelated residuals.
However, both models exhibited high MSE values. We believe this is due to the magnitude of the data used for the modelling. As can be seen from Fig. 4 and Fig. 5, the magnitude of the output ranges from approximately 5 to 30. This translates into higher residual magnitudes when producing the final model.
Based on the lower MSE and smaller hidden layer of the CFNN relative to the MLP, it appears that the additional direct connections from the input layer to the output layer have a positive impact on both the accuracy and the size of the neural network. The additional connections appear to carry system dynamics not captured by the hidden layer in the MLP. Overall, the CFNN proved to be a better choice for the construction of the model, as the network structure is more compact and the residuals are lower.
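The relative improvement can be checked directly from the MSE values (1.1299 for MLP, 1.0320 for CFNN) and hidden-unit counts (30 and 20) reported in the abstract:

```latex
\frac{1.1299 - 1.0320}{1.1299} = \frac{0.0979}{1.1299} \approx 0.0866 \quad (8.66\%),
\qquad
\frac{30 - 20}{30} \approx 0.3333 \quad (33.33\%)
```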

IV. CONCLUSION
NARX MLP and CFNN models were constructed to model quadriceps muscle torque based on FES stimulations with varying frequency, pulse width and pulse duration. The proposed approach approximated the behavior of the system well with unbiased residuals, with CFNN showing better performance than MLP [19].