An Application of Artificial Neural Networks and Fuzzy Logic on the Stock Price Prediction Problem

The financial industry has become increasingly dependent on advanced computing technologies in order to remain competitive in a global economy. Consequently, stock price prediction using data mining techniques is one of the most important issues in finance. This field has attracted great scientific interest and has become a crucial research area aimed at making the prediction process more precise. Fuzzy logic (FL) and Artificial Neural Networks (ANNs) are promising techniques with a wide scope of prediction applications. There is growing interest, both in fuzzy logic computing and in the financial world, in using fuzzy logic to predict future changes in the prices of stocks, exchange rates, commodities, and other financial time series. Fuzzy logic provides a way to draw definite conclusions from vague, ambiguous, or imprecise information. The ANN is a data mining technique that is widely accepted in business because of its ability to learn and detect relationships among nonlinear variables. ANNs have been reported to outperform statistical regression models and also allow deeper analysis of large data sets, especially those that tend to fluctuate within a short period of time. In this paper, we investigate the ability of fuzzy logic and the multilayer perceptron (MLP), which is a kind of ANN, to tackle the financial time series stock forecasting problem. The proposed approaches were tested on historical price data of different companies collected from Yahoo Finance. Furthermore, the two techniques are compared to examine their effectiveness.
Keywords— Fuzzy Logic, Fireworks algorithm, Back-propagation algorithm, stock price forecasting, Multilayer Perceptron Neural Network, Wavelet transform.


I. INTRODUCTION
The problem of stock price prediction is one of the essential topics in finance and business economics. A smart system would be able to predict the stock price and guide investors to buy a stock before its price rises, or sell it before its value declines. Although it is very difficult to replace the role of experts, an accurate prediction algorithm can directly result in high profits for investment firms. However, stock market tendencies are non-linear, uncertain, and non-stationary, and forecasting the stock price seems to involve more significant risks than ever before [1,2].
The purpose of investors is to attain high and stable profits. Therefore, a number of methods have been proposed to generate more accurate estimates, among which the multilayer perceptron (MLP) neural network [3] is one of the most widely used techniques for computing the stock price. Several studies indicated that the MLP is more efficient than statistical regression models and also allows deeper analysis of large data sets, especially those that tend to fluctuate within a short period of time [4][5][6]. To enhance the effectiveness of prediction models, pre-processing methods and optimization algorithms are often combined with ANNs [6] and fuzzy logic [7]. In this paper, the close price is predicted using two approaches: an MLP whose input data are historical close prices, and a type-2 fuzzy time series model whose input data are historical close, high, low, and open prices. In both methods, the Haar wavelet transform is employed for pre-processing. The Fireworks algorithm (FA) [8] is utilized to optimize the weights and biases of the MLP before the MLP is trained with the back-propagation algorithm, and it is also used to optimize the lengths of intervals in the type-2 fuzzy time series model. The wavelet transform (WT) decomposes the stock price time series and eliminates noise, because the wavelet representation can handle the non-stationarity involved in economic and financial time series [9]. The Fireworks algorithm is a novel swarm intelligence algorithm proposed by Ying Tan [10] with promising results in optimization accuracy and convergence speed. This algorithm can seek optimal solutions and balance exploration and exploitation, while giving accurate estimates of stock prices.
The goal of this study is to assess the efficiency of the MLP and the type-2 fuzzy time series model, improved by pre-processing and the FA, in solving regression problems in a specific field such as the stock market. The rest of this paper is organized as follows: Section II presents the proposed methodology. The experiments and results are shown in Section III. Section IV gives the conclusion and future work.

II. MATERIAL AND METHOD
In stock price prediction, there are two main analysis techniques: technical analysis and fundamental analysis. Technical analysis uses historical time series to produce its outcomes, while fundamental analysis focuses on the forces of supply, the past performance of the company, and the earnings forecast.
In this paper, we use technical analysis in combination with the MLP, the Haar wavelet transform, the FA, and the type-2 fuzzy time series model. Fig. 1 shows the overview of the proposed method. This paper uses the close price as the prediction target. For the MLP, the input data are historical close prices, while for the type-2 fuzzy time series model the historical close, open, high, and low prices are used as input data.

B. Noise filtering using Haar wavelet transforms
The first step of data pre-processing is to use the wavelet transform to decompose the financial time series and eliminate noise, as the wavelet representation can handle the non-stationarity involved in economic and financial time series [9].
Wavelets are mathematical tools that break data into various frequency components, each of which is then examined at a resolution matched to its scale. The wavelet transform can overcome the limitations of the Fourier transform when coping with unexpected and unforeseen disruptions [11].
There is a wide range of popular wavelet algorithms, including Daubechies wavelets, Mexican Hat wavelets, and Morlet wavelets. These algorithms have the advantage of better resolution for smoothly changing time series. Nonetheless, their computational expense is much higher than that of the Haar wavelet.
The Haar wavelet algorithm works on time series whose size is a power of two (e.g., 32, 64, 128...). Each step of the wavelet transform creates two sets of values: a set of averages and a set of differences known as wavelet coefficients. Each set is half the size of the input data. For instance, if the time series has 128 elements, the first step generates 64 averages and 64 coefficients. The set of averages then becomes the input for the next step (e.g., the 64 averages produce a new set of 32 averages and 32 coefficients). This process is iterated until one average and one coefficient are obtained.
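The averaging-and-differencing procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the unnormalized form (pairwise mean and half-difference) is an assumption, since some Haar variants scale by the square root of two instead.

```python
def haar_step(values):
    """One Haar step: pairwise averages and pairwise half-differences."""
    averages = [(values[i] + values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    coefficients = [(values[i] - values[i + 1]) / 2.0 for i in range(0, len(values), 2)]
    return averages, coefficients


def haar_decompose(series):
    """Repeat haar_step on the averages until a single average remains.

    Returns the final average and the coefficient bands, ordered from the
    highest-frequency band (first step) to the lowest-frequency band.
    """
    n = len(series)
    if n == 0 or (n & (n - 1)) != 0:
        raise ValueError("series length must be a power of two")
    bands = []
    averages = list(series)
    while len(averages) > 1:
        averages, coefficients = haar_step(averages)
        bands.append(coefficients)
    return averages[0], bands
```

For a series of 128 elements, this yields bands of 64, 32, 16, 8, 4, 2, and 1 coefficients plus one overall average, matching the halving described in the text.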
The strength of the coefficient bands generated by a wavelet calculation reflects the change in the time series at different resolutions. The first coefficient band describes the highest-frequency changes. This is the noisiest part of the time series, and this noise can be removed by employing threshold techniques. Each later band reflects changes at lower and lower frequencies.

C. Fireworks Algorithm
The Fireworks algorithm is a novel swarm intelligence algorithm inspired by observing fireworks explosions and is proposed for global optimization of complex functions [8].
1) The introduction of Fireworks algorithm
When a firework explodes, a shower of sparks is created around it. The explosion process can be considered a local search around a specific point. To seek a point $x_i$ such that $f(x_i) = y$, 'fireworks' are continually set off in the potential space until one 'spark' reaches the target or lands fairly close to the point $x_i$. Imitating the explosion process of fireworks, a rough framework of the FA is described in Fig. 2.

2) Types of Fireworks Explosion
In the process of observing fireworks, the fireworks explosion is divided into two specific types. A good firework explosion generates numerous sparks which are concentrated around the explosion center. In contrast, a bad firework explosion generates only a few sparks, which scatter around the explosion center, as shown in Fig. 3. The number of sparks generated by each firework $x_i$ is defined as in Eq. (2).
$$s_i = m \cdot \frac{y_{\max} - f(x_i) + \xi}{\sum_{j=1}^{n} \left( y_{\max} - f(x_j) \right) + \xi} \quad (2)$$
where $m$ is a parameter controlling the total number of sparks generated by the $n$ fireworks, $y_{\max} = \max(f(x_i))$ $(i = 1, 2, \ldots, n)$ is the maximum (worst) value of the objective function among the $n$ fireworks, and $\xi$ denotes the smallest constant, used to avoid a zero-division error.
In order to avoid overwhelming effects of gorgeous fireworks, bounds are defined for $s_i$ as specified in Eq. (3).
$$\hat{s}_i = \begin{cases} \mathrm{round}(a \cdot m) & \text{if } s_i < a m \\ \mathrm{round}(b \cdot m) & \text{if } s_i > b m,\ a < b < 1 \\ \mathrm{round}(s_i) & \text{otherwise} \end{cases} \quad (3)$$
where $a$ and $b$ are constant parameters.
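As a rough illustration of Eqs. (2) and (3), the sketch below computes bounded spark counts for a minimization problem. It assumes the form used in the original FA formulation [8], in which a firework's share of sparks grows with the gap between its objective value and the worst value; the function name and the sample parameter values are hypothetical.

```python
def spark_counts(fitness, m, a, b, xi=1e-12):
    """Bounded number of sparks per firework (minimization).

    fitness: objective values f(x_i) of the n fireworks
    m: parameter controlling the total number of sparks
    a, b: lower and upper bound factors, with a < b < 1
    xi: tiny constant that avoids division by zero
    """
    y_max = max(fitness)  # worst objective value among the fireworks
    denom = sum(y_max - f for f in fitness) + xi
    counts = []
    for f in fitness:
        s = m * (y_max - f + xi) / denom
        if s < a * m:
            s_hat = round(a * m)      # too few sparks: clamp to the lower bound
        elif s > b * m:
            s_hat = round(b * m)      # too many sparks: clamp to the upper bound
        else:
            s_hat = round(s)
        counts.append(s_hat)
    return counts
```

With this form, the firework with the lowest objective value receives the most sparks, and the worst firework still receives at least round(a·m) sparks.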

Amplitude of Explosion:
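In contrast to the design of the number of sparks, the amplitude of a good firework explosion is smaller than that of a bad one. The amplitude of the explosion for each firework is defined as in Eq. (4).
$$A_i = \hat{A} \cdot \frac{f(x_i) - y_{\min} + \xi}{\sum_{j=1}^{n} \left( f(x_j) - y_{\min} \right) + \xi} \quad (4)$$
where $\hat{A}$ is the maximum explosion amplitude, and $y_{\min} = \min(f(x_i))$ $(i = 1, 2, \ldots, n)$ denotes the minimum (best) value of the objective function among the $n$ fireworks.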
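The amplitude rule of Eq. (4) can be sketched in the same spirit. The form below is an assumption taken from the original FA formulation [8]; the function name is hypothetical.

```python
def explosion_amplitudes(fitness, a_max, xi=1e-12):
    """Explosion amplitude per firework (minimization).

    fitness: objective values f(x_i) of the n fireworks
    a_max: maximum explosion amplitude
    xi: tiny constant that avoids division by zero
    """
    y_min = min(fitness)  # best objective value among the fireworks
    denom = sum(f - y_min for f in fitness) + xi
    return [a_max * (f - y_min + xi) / denom for f in fitness]
```

Better fireworks (lower objective values) get smaller amplitudes and therefore search more locally, while worse fireworks explore a wider neighbourhood.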
Generating Sparks: In the process of explosion, sparks might be affected by the explosion in $z$ random dimensions. In the FA, the number of affected dimensions $z$ is given by Eq. (5).
$$z = \mathrm{round}(d \cdot \mathrm{rand}(0, 1)) \quad (5)$$
where $d$ is the number of dimensions of vector $x$, and $\mathrm{rand}(0, 1)$ is a random number in the range [0, 1].
The location of a spark of the firework $x_i$ is obtained by using Algorithm 1. Imitating the explosion process, a spark's location $\tilde{x}_j$ is initially produced. Then, if the obtained location is out of the potential space, it is mapped back into the potential space.
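The spark generation step can be sketched as below. This is a hedged reading of Algorithm 1: a single displacement is drawn and added to the randomly selected dimensions, and out-of-range coordinates are mapped back with a modulo rule. The function name and the `rng` parameter are hypothetical additions for reproducibility.

```python
import random


def explosion_spark(firework, amplitude, lower, upper, rng=random):
    """Create one explosion spark around `firework`.

    firework: list of d coordinates
    amplitude: explosion amplitude A_i of this firework
    lower, upper: per-dimension bounds of the potential space
    rng: the random module or a random.Random instance
    """
    d = len(firework)
    spark = list(firework)
    z = round(d * rng.random())             # number of affected dimensions, Eq. (5)
    dims = rng.sample(range(d), z)          # which dimensions are affected
    h = amplitude * rng.uniform(-1.0, 1.0)  # displacement shared by those dimensions
    for k in dims:
        spark[k] += h
        if spark[k] < lower[k] or spark[k] > upper[k]:
            # Map the coordinate back into the potential space.
            spark[k] = lower[k] + abs(spark[k]) % (upper[k] - lower[k])
    return spark
```

Passing a seeded `random.Random` instance as `rng` makes runs repeatable, which helps when comparing parameter settings.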

3) Selection of Locations
At the beginning of each explosion generation, $n$ locations are selected for the fireworks explosion. In the FA, the current best position $x^*$ is always kept for the next explosion generation. After that, $n - 1$ other locations are chosen based on their distance to other locations in order to keep the diversity of sparks. The general distance between a location $x_i$ and other locations is defined as in Eq. (6).
$$R(x_i) = \sum_{j \in K} d(x_i, x_j) = \sum_{j \in K} \lVert x_i - x_j \rVert \quad (6)$$
where $K$ is the set of all current locations of both fireworks and sparks.
The selection probability of a location $x_i$ is then specified as in Eq. (7).
$$p(x_i) = \frac{R(x_i)}{\sum_{j \in K} R(x_j)} \quad (7)$$
When assessing the distance, any distance measure can be used, including Manhattan distance, Euclidean distance, angle-based distance, etc. In this paper, Euclidean distance is used.
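The distance-based selection can be sketched as follows, assuming that a location's selection probability is its summed Euclidean distance to all other locations divided by the total over all locations, as in the original FA formulation [8]. The function names are hypothetical, and sampling the remaining locations with replacement is a simplification made here.

```python
import math
import random


def selection_probabilities(candidates):
    """Probability of each location, proportional to its summed distance to the others."""
    dist_sum = [sum(math.dist(c, other) for other in candidates) for c in candidates]
    total = sum(dist_sum)
    if total == 0:
        # All locations coincide: fall back to a uniform choice.
        return [1.0 / len(candidates)] * len(candidates)
    return [r / total for r in dist_sum]


def select_locations(candidates, fitness, n, rng=random):
    """Keep the best location, then draw n - 1 others by distance-based probability."""
    best = min(range(len(candidates)), key=lambda i: fitness[i])
    probs = selection_probabilities(candidates)
    others = [i for i in range(len(candidates)) if i != best]
    weights = [probs[i] for i in others]
    chosen = rng.choices(others, weights=weights, k=n - 1)  # with replacement
    return [candidates[best]] + [candidates[i] for i in chosen]
```

Locations far from the crowd get a higher probability, which is what keeps the population diverse.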

4) Summary contents of the algorithm
Algorithm 3 shows the framework of the FA. In each generation, two types of sparks are created, as shown in Algorithm 1 and Algorithm 2 respectively. For the first type, the explosion amplitude and the number of sparks depend on the quality of the corresponding firework. In contrast, the second type is created using a Gaussian explosion process, which searches a local Gaussian space around a firework.

Algorithm 3. Fireworks algorithm
Randomly choose $n$ locations for fireworks;
while stopping criteria is not met do
    Set off $n$ fireworks respectively at the $n$ locations:
    for each firework $x_i$ do
        Compute the number of sparks that the firework produces: $\hat{s}_i$, using Eq. (2) and Eq. (3);
        Find locations of the $\hat{s}_i$ sparks of the firework $x_i$ using Algorithm 1;
    end for
    for $k = 1 \rightarrow \hat{m}$ do
        Randomly choose a firework $x_j$;
        Create a specific spark for the firework using Algorithm 2;
    end for
    Choose the best location and keep it for the next explosion generation;
    Randomly select $n - 1$ locations from the two types of sparks and the current fireworks according to the probability given in Eq. (7);
end while

D. Multilayer Perceptron for the stock price prediction problem
The ANN is one of the techniques most widely used in trade and finance because of its capability to learn and identify relationships among non-linear variables. Some studies have shown that ANNs are more efficient than statistical regression models and allow deeper analysis of large data sets, especially those that tend to oscillate within a short period of time [4][5][6]. However, in financial prediction problems with huge time series data, specific pre-processing techniques and optimization algorithms have to be used to enhance the accuracy of the predicted results.
In this study, a multilayer perceptron, which is a kind of ANN, is combined with the Haar wavelet transform and the FA [8] to construct a stock price prediction system. The Haar wavelet described above is employed to analyse stock price data and to filter noise. The wavelet transform can handle unstable signals in the fields of economics and finance [9]. The FA is used for optimizing the weights and biases of the MLP in order to improve its accuracy and learning ability.

1) Neural Network Setting
In general, a multilayer perceptron may have many hidden layers, with no limit on the number of neurons in each layer. However, theoretical work has shown that an MLP with one hidden layer is sufficient to approximate any complex non-linear function [12]. In addition, many studies and experimental results also indicate that one hidden layer is sufficient for most forecasting problems [4,13,14]. Therefore, this work uses an MLP architecture with one hidden layer.
Choosing the number of hidden neurons and the activation function is another difficult task when selecting good parameters for the MLP. Setting an appropriate architecture for a particular problem is important because the network topology directly affects its computational complexity and generalization ability. If the training data set can be divided into groups with similar features, the number of these groups can be used as the number of hidden neurons. When the training data are scattered and do not share common features, the number of connections may need to be roughly equal to the number of training samples to maintain the convergence ability of the MLP.
However, too many hidden layers or hidden neurons drive the MLP to over-fitting, which means that the MLP performs well on training data but poorly on data it has not seen, so that it fails to generalize. Based on conducted experiments and other studies such as [7,15], an MLP with 8 neurons in the hidden layer and a bipolar sigmoid function (Fig. 4) as the activation function for both the hidden and output layers is suitable for forecasting the stock price.
2) Optimizing Weights and Biases of the MLP using the FA
In this paper, the FA is applied to optimize the weights and biases of the MLP before the training process. A firework individual is shown in Fig. 6.
The FA was presented in Section II.C; here $d$, the dimensionality of vector $x$, is computed as in Eq. (8), and its description is shown in Table I.
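To make the encoding concrete, the sketch below treats a firework as a flat vector holding every weight and bias of a one-hidden-layer MLP and evaluates the network from it. Several points are assumptions rather than the paper's stated design: the ordering of the vector (hidden weights, hidden biases, output weights, output biases), the bipolar sigmoid written as 2 / (1 + exp(-x)) - 1, and the resulting length of 257 for a 30-8-1 network, since Eq. (8) and Table I are not reproduced here. All function names are hypothetical.

```python
import math


def vector_length(n_in=30, n_hidden=8, n_out=1):
    """Number of weights and biases in a one-hidden-layer MLP."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def bipolar_sigmoid(x):
    """Bipolar sigmoid 2 / (1 + exp(-x)) - 1, computed as tanh(x / 2)."""
    return math.tanh(x / 2.0)


def forward(vector, inputs, n_hidden=8, n_out=1):
    """Evaluate the MLP encoded by `vector` on one input sample."""
    n_in = len(inputs)
    if len(vector) != vector_length(n_in, n_hidden, n_out):
        raise ValueError("vector length does not match the network shape")
    pos = 0
    hidden = []
    for _ in range(n_hidden):               # weighted sums of the hidden neurons
        weights = vector[pos:pos + n_in]
        pos += n_in
        hidden.append(sum(w * x for w, x in zip(weights, inputs)))
    for j in range(n_hidden):               # add hidden biases, then activate
        hidden[j] = bipolar_sigmoid(hidden[j] + vector[pos + j])
    pos += n_hidden
    outputs = []
    for _ in range(n_out):                  # weighted sums of the output neurons
        weights = vector[pos:pos + n_hidden]
        pos += n_hidden
        outputs.append(sum(w * h for w, h in zip(weights, hidden)))
    # Add output biases, then activate.
    return [bipolar_sigmoid(o + vector[pos + k]) for k, o in enumerate(outputs)]
```

Under this encoding, the objective function handed to the FA could be the training error of `forward` over all training samples, so that each firework is scored by how well its weights fit the data.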

3) Training the MLP by Back-propagation Algorithm
After the MLP has been optimized using the FA, the training process continues with the back-propagation algorithm for about 1000 more cycles.

E. Type-2 Fuzzy time series Model for the problem of stock price prediction
1) Some definitions of fuzzy sets and fuzzy time series
Definition 1: Fuzzy set
A fuzzy set $A$ of the universe of discourse $U$ is represented by all pairs of elements $(x, \mu_A(x))$ as follows:
$$A = \{ (x, \mu_A(x)) \mid x \in U \}$$
where:
$U$ is the universe of discourse of the fuzzy set $A$, which is discrete and finite;
$\mu_A$ is the membership function of the fuzzy set $A$;
$\mu_A(x)$ is the degree of membership of $x$ in the fuzzy set $A$.
Definition 2: Fuzzy time series
Let $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$ be a subset of $\mathbb{R}$ and the universe of discourse on which fuzzy sets $f_i(t)$ are defined. If $F(t)$ is a collection of $f_1(t), f_2(t), \ldots$, then $F(t)$ is called a fuzzy time series defined on $Y(t)$.
Definition 3: Fuzzy logical relationship
If $F(t)$ is caused by $F(t-1)$, this relationship is expressed as
$$F(t) = F(t-1) * R(t, t-1)$$
where:
$R(t, t-1)$ is the fuzzy logical relationship between $F(t-1)$ and $F(t)$;
$*$ represents an operator in the fuzzy set.
Let $F(t-1) = A_i$ and $F(t) = A_j$. The relationship between $F(t-1)$ and $F(t)$ (referred to as the FLR [16]) can be denoted by $A_i \rightarrow A_j$, where $A_i$ is called the left-hand side (LHS) and $A_j$ is called the right-hand side (RHS).
Definition 4: Fuzzy logical relationship group (FLRG)
Suppose there are the following FLRs with the same LHS: $A_i \rightarrow A_{j_1}$, $A_i \rightarrow A_{j_2}$, …, $A_i \rightarrow A_{j_k}$. Following Chen's model [17], these FLRs can be grouped into an FLRG as: $A_i \rightarrow A_{j_1}, A_{j_2}, \ldots, A_{j_k}$.
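The FLR and FLRG definitions can be illustrated with a short sketch. It assumes the series has already been fuzzified into fuzzy-set indices (so `2` stands for the fuzzy set with index 2) and that repeated right-hand sides are kept only once within a group, which is how Chen-style models are usually described. The function names are hypothetical.

```python
def build_flrs(fuzzified):
    """List the FLRs F(t-1) -> F(t) of a fuzzified series as (lhs, rhs) pairs."""
    return [(fuzzified[t - 1], fuzzified[t]) for t in range(1, len(fuzzified))]


def group_flrs(flrs):
    """Group FLRs that share a left-hand side into FLRGs.

    Returns a dict mapping each LHS to its list of distinct RHS values,
    in order of first appearance.
    """
    groups = {}
    for lhs, rhs in flrs:
        rhs_list = groups.setdefault(lhs, [])
        if rhs not in rhs_list:
            rhs_list.append(rhs)
    return groups
```

For the fuzzified series 1, 2, 2, 3, 2, 3 this produces the groups 1 → 2; 2 → 2, 3; and 3 → 2.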

2) A type-2 fuzzy time series for the stock price prediction problem
The type-2 fuzzy time series model was proposed by Huarng and Yu [18]. This model extends the type-1 model so that more observations can be used. It has two important improvements: the definition of operations for utilizing extra observations, and the method used to compute the forecasts.
a. Some definitions for the type-2 model
A number of earlier type-1 fuzzy time series models used only one variable for forecasting, and only some of the observations related to that variable were applied. We refer to these observations as type-1 observations, such as the closing value of the stock index. The type-2 model uses extra observations such as open, high, and low prices.
Definition 5: Type-2 fuzzy time series model
A type-2 fuzzy time series model can be considered an extension of a type-1 fuzzy time series model. The type-2 model utilizes the FLRs established by a type-1 model based on type-1 observations. Fuzzy operators such as union and intersection are used to establish the new FLRs obtained from type-1 and type-2 observations. Type-2 forecasts are obtained from these FLRs [18].
Step 2: Pick variables and type-1 observations. This work uses the close price for the problem, so the close price is selected as the type-1 observation.
Step 3: Apply the type-1 model to type-1 observations and obtain FLRGs
Following Chen's model, the process of forecasting is carried out as follows:
+ There are many previous works introducing approaches to compute the lengths of intervals [19][20][21]. In this paper, the number of intervals $n$ is calculated using the distribution-based technique [19], and then the length of each interval is optimized by the FA.
+ Step 3-3: Define fuzzy sets for observations
Fuzzy sets $A_i$ are defined through their membership functions, and we use triangular fuzzy sets with membership degrees 0, 0.5, and 1.
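One common way to realize fuzzy sets with degrees 0, 0.5, and 1 is sketched below: each fuzzy set has degree 1 on its own interval, 0.5 on the neighbouring intervals, and 0 elsewhere, and an observation is fuzzified to the set whose interval contains it. This pattern is an assumption borrowed from Chen-style models, and the function names and the boundary list are hypothetical.

```python
import bisect


def membership_row(i, n):
    """Membership degrees of fuzzy set A_i over the n intervals u_1..u_n (0-based i)."""
    row = [0.0] * n
    row[i] = 1.0
    if i > 0:
        row[i - 1] = 0.5
    if i < n - 1:
        row[i + 1] = 0.5
    return row


def fuzzify(value, boundaries):
    """Index of the fuzzy set whose interval contains `value`.

    boundaries: ascending list of n + 1 interval edges; values outside the
    universe of discourse are clamped to the first or last interval.
    """
    n = len(boundaries) - 1
    index = bisect.bisect_right(boundaries, value) - 1
    return min(max(index, 0), n - 1)
```

The interval edges do not need to be equally spaced, so the same functions apply after the interval lengths have been adjusted by an optimizer.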
Step 5: Map out-of-sample observations to FLRGs for type-1 and type-2 observations and obtain forecasts.

Step 6: Apply the operators in Definition 6 to the FLRGs for all the observations.
For example, at 7/3/2016: Close price $A_2 \rightarrow A_3$. Applying the union operator, we have: Applying the intersection operator, we obtain:
Step 7: Defuzzify the forecasts
Supposing the forecast when using an operator (union or intersection) is $A_1, A_2, \ldots, A_k$, and the midpoints (arithmetic averages) of the intervals $u_1, u_2, \ldots, u_k$ are $m_1, m_2, \ldots, m_k$ respectively [17], the defuzzified forecast for that operator is computed as follows:
$$\text{forecast} = \frac{m_1 + m_2 + \cdots + m_k}{k}$$
Step 8: Calculate forecasts for the type-2 model

3) Improving the type-2 fuzzy time series model using the FA
In a fuzzy time series model, the selection of interval lengths is essential. Intervals with suitable lengths can increase the forecasting accuracy of the model. The lengths of intervals should be neither too large nor too small. When the length of intervals is too large, there will be no fluctuations in the fuzzy time series. By contrast, when the length is too small, the meaning of the fuzzy time series will be diminished [19]. In this paper, the FA is used to optimize the lengths of intervals without modifying the number of intervals, in order to improve the forecasting accuracy of the proposed type-2 model.
The way the FA optimizes the lengths of intervals is similar to the optimization of the weights and biases of the ANN.
Let $d = n - 1$ be the dimensionality of the vector $x$; a firework individual is shown in Fig. 7.

B. Evaluation criteria
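The proposed approaches were evaluated according to the root mean squared error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE) criteria. These criteria are defined as Eqs. (9)-(11).
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (Y_i - O_i)^2} \quad (9)$$
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \lvert Y_i - O_i \rvert \quad (10)$$
$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left\lvert \frac{Y_i - O_i}{Y_i} \right\rvert \quad (11)$$
where $N$ is the size of the testing set.
These criteria measure how close the predicted value $O$ is to the real value $Y$. The lower these measures, the better the result.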
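The three criteria can be computed as in the sketch below, using their standard definitions. Expressing MAPE as a percentage (multiplying by 100) is an assumption, and the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between real values Y and predictions O."""
    n = len(actual)
    return math.sqrt(sum((y - o) ** 2 for y, o in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(y - o) for y, o in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; real values must be non-zero."""
    n = len(actual)
    return 100.0 * sum(abs((y - o) / y) for y, o in zip(actual, predicted)) / n
```

For real prices 100 and 200 with predictions 110 and 190, both RMSE and MAE are 10, while MAPE is 7.5%, which shows how MAPE weights the same absolute error more heavily on lower-priced observations.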

C. Evaluating the efficiency of the proposed methods
1) The effect of data pre-processing
As mentioned above, the Haar wavelet transform is capable of removing noise. Therefore, this transform is suited to oscillating and aperiodic time series in finance.
a. On the MLP neural network
Training the network on denoised data gives lower error rates and faster convergence compared with the original data, which contain many jagged fluctuations.
Table V shows that when the wavelet transform is used, the values of the evaluation criteria decrease by about 10% on the three data sets. Fig. 8 illustrates that, with the wavelet transform, the MLP converges faster and reaches lower errors. For the type-2 fuzzy time series model, Table VI shows that the evaluation criteria are about 4% lower on the three data sets when the wavelet transform is used.
2) The effect of the FA on obtained results
a. An experimental result for the MLP
The FA is used for optimizing the weights and biases of the MLP. Table VII indicates that the use of the FA reduces the error rate by about 4% on the three data sets. Fig. 9 shows that the MLP converges faster with the FA than without it.

b. Experimental results for the type-2 fuzzy time series model
The FA is employed to optimize the lengths of intervals in the type-2 fuzzy time series model. Table VIII shows that all evaluation criteria decrease by approximately 14% on the three data sets.

3) The comparison of effectiveness of the MLP and the type-2 fuzzy time series model
Some predicted and actual stock prices are presented in Table IX. It can be seen that the results of both methods are quite close to the actual values. Table X compares both approaches on the evaluation criteria. On the GOOG and YHOO data sets, the type-2 fuzzy time series model outperforms the MLP, but on the AAPL data set the results of the MLP are better. These outcomes indicate that both methods perform well on the stock price prediction problem.

IV. CONCLUSION
Mathematical modeling of the stock price prediction problem is the process of determining the variation pattern of the variables of the problem from the analysis of figures and historical data. Due to the complexity of the problem, with its many practical factors, common mathematical modeling techniques exhibit a large number of limitations. Therefore, this paper explores soft computing methods such as fuzzy logic and artificial neural networks to deal with the stock price prediction problem, producing suggested and reference values for traders.
As for the MLP, though the obtained results and estimation effectiveness are quite positive, the stock price prediction problem usually requires more reliable approaches to enhance the efficiency of the training process. At the same time, the open, high, and low prices need to be used alongside the close price to improve the quality of the estimates. However, the current MLP uses only the close price, and this is its main drawback.
Fuzzy logic might overcome a number of disadvantages of conventional techniques and the MLP. The construction of a fuzzy prediction system can rely on both expert knowledge and time series data to produce its results. Fuzzy logic provides a flexible approach with fewer assumptions about time series data in the field of finance. In addition, fuzzy logic has proved to be a good substitute for tools that need real-time stock price data. Moreover, the type-2 fuzzy time series model in this work used four factors, namely the open, high, low, and close prices, to produce its estimates, so its outcomes are better than those of the MLP using only the close price on two of the three data sets.

Fig. 1 The overview of the proposed approach
A. Choosing data formatting and prediction targets
In the problem of predicting stock prices, several factors are used for analysis, such as the moving average (MA), the relative strength index (RSI), Bollinger bands, close/open prices, and the volume oscillator. The selection of appropriate indexes and factors depends on the experience of traders and the kinds of shares. This paper uses the close price as the prediction target. For the MLP, the input data are historical close prices, while for the type-2 fuzzy time series model the historical close, open, high, and low prices are used as input data.

Fig. 2 A framework of the fireworks algorithm

Fig. 3 Two kinds of fireworks explosion
Number of Sparks: Suppose that the FA is designed for the general optimization problem:
$$\text{Minimize } f(x) \in \mathbb{R}, \quad x_{\min} \le x \le x_{\max} \quad (1)$$
where $x = [x_1, x_2, \ldots, x_d]$ is a position in the potential space of solutions, $f(x)$ is an objective function, $x_{\min}$ and $x_{\max}$ are the lower and upper bounds of the potential space, and $d$ is the dimensionality of vector $x$. The number of sparks generated by each firework $x_i$ is defined as in Eq. (2).

Algorithm 1. Obtain the location of a spark
Initialize the location of the spark: $\tilde{x}_j = x_i$;
$z = \mathrm{round}(d \cdot \mathrm{rand}(0, 1))$;
Randomly choose $z$ dimensions of $\tilde{x}_j$;
Compute $h = A_i \cdot \mathrm{rand}(-1, 1)$;
for each $\tilde{x}_j^k \in$ {pre-selected $z$ dimensions of $\tilde{x}_j$} do
    $\tilde{x}_j^k = \tilde{x}_j^k + h$;
    if $\tilde{x}_j^k < x_{\min}^k$ or $\tilde{x}_j^k > x_{\max}^k$ then
        Map $\tilde{x}_j^k$ into the potential space: $\tilde{x}_j^k = x_{\min}^k + |\tilde{x}_j^k| \bmod (x_{\max}^k - x_{\min}^k)$;
    end if
end for

To maintain the diversity of sparks, there is another way of generating sparks called Gaussian explosion, which is shown in Algorithm 2. The function $\mathrm{Gaussian}(1, 1)$, a Gaussian distribution with mean 1 and standard deviation 1, is used to define the coefficient of the explosion. $\hat{m}$ sparks of this kind are generated in each explosion generation.

Algorithm 2. Find the position of a specific spark
Initialize the location of the spark: $\hat{x}_j = x_i$;
$z = \mathrm{round}(d \cdot \mathrm{rand}(0, 1))$;
Randomly choose $z$ dimensions of $\hat{x}_j$;
Compute the coefficient of Gaussian explosion: $g = \mathrm{Gaussian}(1, 1)$;

Fig. 4 Bipolar sigmoid function
Fig. 5 describes the architecture of the MLP used in our work. The input layer contains 30 neurons corresponding to the close prices of the 30 latest days. The output layer consists of one neuron, which gives the close price of the next day.
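The input and target layout described above (the 30 latest close prices in, the next day's close price out) amounts to a sliding window over the price series, which can be sketched as follows. The function name is hypothetical, and any normalization of prices before training is left out.

```python
def make_windows(close_prices, window_size=30):
    """Turn a close-price series into (inputs, targets) training pairs.

    Each input is `window_size` consecutive close prices and the matching
    target is the close price of the following day.
    """
    inputs, targets = [], []
    for t in range(window_size, len(close_prices)):
        inputs.append(close_prices[t - window_size:t])
        targets.append(close_prices[t])
    return inputs, targets
```

A series of length L yields L - 30 training pairs, so the first 30 days of data serve only as inputs.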

Fig. 5 An architecture of the MLP for the stock prediction system (windowSize-8-1)

Fig. 7 The graphical representation of a firework individual

III. RESULTS AND DISCUSSION
A. Test suites
Test suites contain historical data of Google Inc. (GOOG), Apple Inc. (AAPL), and Yahoo! Inc. (YHOO) in the period of 2011-2016, taken from Yahoo Finance [22]. The data from 1/2011 to 5/2015 are used for training and the data from 6/2015 to 12/2015 are employed for testing. The close price is chosen as the type-1 observation, and the open, high, and low prices are selected as type-2 observations.

Fig. 8 RMSE of the training stage with and without the wavelet transform (GOOG)
b. On the type-2 fuzzy time series model
Using denoised data for the type-2 fuzzy time series model yields lower error rates than using the original data. Table VI shows that the evaluation criteria are about 4% lower on the three data sets when the wavelet transform is used.

TABLE V EXPERIMENTAL RESULTS OF USING WAVELET AND NON-USING WAVELET FOR MLP

TABLE VI EXPERIMENTAL RESULTS OF USING WAVELET AND NON-USING WAVELET FOR THE TYPE-2 FUZZY TIME SERIES

TABLE VII EXPERIMENTAL RESULTS OF USING AND NON-USING FA FOR MLP

TABLE VIII EXPERIMENTAL RESULTS OF THE TYPE 2 FUZZY TIME SERIES MODEL USING FA AND WITHOUT USING FA

TABLE IX THE ESTIMATION PRICES AND ACTUAL PRICES USING TWO METHODS (GOOG)

TABLE X A COMPARISON OF EXPERIMENTAL RESULTS USING THE ANN AND THE TYPE 2 FUZZY TIME SERIES MODEL