# Stock Market Prediction Using Multi-Layer Perceptrons With TensorFlow

This post is part of a series on artificial neural networks (ANN) in TensorFlow and Python.

In this post a multi-layer perceptron (MLP) class based on the TensorFlow library is discussed. The class is then applied to the problem of performing stock prediction given historical data. Note: This post is not meant to characterize how stock prediction is actually done; it is intended to demonstrate the TensorFlow library and MLPs.

### Data Setup

The data used in this post was collected from finance.yahoo.com. It consists of historical stock data for Yahoo Inc. covering the period from the 12th of April 1996 to the 19th of April 2016, and can be downloaded as a CSV file from the provided link. Note: YHOO is no longer traded, as the company's name has changed; feel free to use the provided link or historical data from another company.

To pre-process the data for the neural network, first transform the dates into integer values using LibreOffice's DATEVALUE function. A screenshot of the transformed data follows:

Figure 1: Pre-Processing Data Using LibreOffice
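If you would rather do this conversion in Python than in LibreOffice, here is a minimal sketch of an equivalent DATEVALUE computation. The 1899-12-30 epoch matches LibreOffice's default; the ISO date format is an assumption about how the dates appear in the CSV.

```python
from datetime import date

#LibreOffice serial dates count days from 1899-12-30 by default
EPOCH = date(1899, 12, 30)

def datevalue(s):
    """Convert an ISO date string (YYYY-MM-DD) to a serial day number."""
    y, m, d = (int(p) for p in s.split('-'))
    return (date(y, m, d) - EPOCH).days

#The first and last dates in the data set
serials = [datevalue(s) for s in ('1996-04-12', '2016-04-19')]
```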

For the sake of simplicity, the “High” value will be predicted from the “Date Value” alone. Thus, the goal is to create an MLP that takes as input a date in the form of an integer and returns a predicted high value of the Yahoo Inc. stock price for that day.

With the date values saved in the spreadsheet, the data is next loaded into Python. To improve the performance of the MLP, the data is first scaled so that both the input and output have mean 0 and variance 1. This can be accomplished as follows (note that “Date Value” is in column index 1 and “High” is in column index 4):

```python
import numpy as np
from TFANN import MLPR
import matplotlib.pyplot as mpl
from sklearn.preprocessing import scale

#filePath is the directory containing the CSV file
pth = filePath + 'yahoostock.csv'
A = np.loadtxt(pth, delimiter=",", skiprows=1, usecols=(1, 4))
A = scale(A)
#y is the dependent variable (the "High" column)
y = A[:, 1].reshape(-1, 1)
#A contains the independent variable (the "Date Value" column)
A = A[:, 0].reshape(-1, 1)
#Plot the high value of the stock price
mpl.plot(A[:, 0], y[:, 0])
mpl.show()
```
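For reference, scale standardizes each column independently; an equivalent NumPy sketch makes the transformation explicit:

```python
import numpy as np

def scale_cols(X):
    """Subtract each column's mean and divide by its standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

#Toy matrix standing in for the (date value, high) columns
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
Xs = scale_cols(X)
```

After this transformation each column has mean 0 and variance 1, which is what the MLP expects.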

The produced plot is as follows:

Figure 2: Scaled Yahoo Stock Data

Next, an MLP is constructed and trained on the scaled data.

### Creating the MLP

The MLP class that will be used follows a simple interface similar to that of the scikit-learn library. The source code is available here. The interface is as follows:

```python
#Fit the MLP to the data
#param A: numpy matrix where each row is a sample
#param y: numpy matrix of target values
def fit(self, A, y):
    ...

#Predict the output given the input (only run after calling fit)
#param A: The input values for which to predict outputs
#return: The predicted output values (one row per input sample)
def predict(self, A):
    ...

#Predicts the outputs for input A and then computes the RMSE between
#the predicted values and the actual values
#param A: The input values for which to predict outputs
#param y: The actual target values
#return: The RMSE
def score(self, A, y):
    ...
```

The first step is to create an MLPR object. This can be done as follows:

```python
#Number of neurons in the input layer
i = 1
#Number of neurons in the output layer
o = 1
#Number of neurons in each hidden layer
h = 32
#The list of layer sizes
layers = [i, h, h, h, h, h, h, h, h, h, o]
mlpr = MLPR(layers, maxItr=1000, tol=0.40, reg=0.001, verbose=True)
```

With this code, an MLPR object is initialized with the given layer sizes, a training iteration limit of 1000, an error tolerance of 0.40 (for the RMSE), a regularization weight of 0.001, and verbose output enabled. The source code for the MLPR class shows how this is accomplished.
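As a sanity check on the architecture, the number of trainable parameters implied by the layers list can be computed directly; each pair of adjacent layers contributes a weight matrix and a bias vector:

```python
def num_params(layers):
    """Count weights and biases in a fully-connected network."""
    return sum(a * b + b for a, b in zip(layers, layers[1:]))

i, h, o = 1, 32, 1
layers = [i] + [h] * 9 + [o]
n = num_params(layers)
```

With nine hidden layers of 32 neurons, this comes to 8,545 parameters for a one-dimensional regression, which is part of why regularization is needed.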

```python
#Create the MLP variables for the TF graph
#_X: The input matrix
#_W: The weight matrices
#_B: The bias vectors
#_AF: The activation function
def _CreateMLP(_X, _W, _B, _AF):
    n = len(_W)
    for i in range(n - 1):
        _X = _AF(tf.matmul(_X, _W[i]) + _B[i])
    return tf.matmul(_X, _W[n - 1]) + _B[n - 1]

#Add L2 regularizers for the weight and bias matrices
#_W: The weight matrices
#_B: The bias vectors
#return: tensorflow variable representing the L2 regularization cost
def _CreateL2Reg(_W, _B):
    n = len(_W)
    regularizers = tf.nn.l2_loss(_W[0]) + tf.nn.l2_loss(_B[0])
    for i in range(1, n):
        regularizers += tf.nn.l2_loss(_W[i]) + tf.nn.l2_loss(_B[i])
    return regularizers

#Create weight and bias vectors for an MLP
#layers: The number of neurons in each layer (including input and output)
#return: A tuple of lists of the weight and bias matrices respectively
def _CreateVars(layers):
    weight = []
    bias = []
    n = len(layers)
    for i in range(n - 1):
        #Fan-in for the layer; used as the standard deviation
        lyrstd = np.sqrt(1.0 / layers[i])
        curW = tf.Variable(tf.random_normal([layers[i], layers[i + 1]], stddev=lyrstd))
        weight.append(curW)
        curB = tf.Variable(tf.random_normal([layers[i + 1]], stddev=lyrstd))
        bias.append(curB)
    return (weight, bias)
```

...

```python
#The constructor
#param layers: A list of layer sizes
#param actvFn: The activation function to use: 'tanh', 'sig', or 'relu'
#param learnRate: The learning rate parameter
#param decay: The decay parameter
#param maxItr: Maximum number of training iterations
#param tol: Maximum error tolerated
#param batchSize: Size of training batches to use (use all if None)
#param verbose: Print training information
#param reg: Regularization weight
def __init__(self, layers, actvFn='tanh', learnRate=0.001, decay=0.9, maxItr=2000,
             tol=1e-2, batchSize=None, verbose=False, reg=0.001):
    #Parameters
    self.tol = tol
    self.mItr = maxItr
    self.vrbse = verbose
    self.batSz = batchSize
    #Input placeholder
    self.x = tf.placeholder("float", [None, layers[0]])
    #Output placeholder
    self.y = tf.placeholder("float", [None, layers[-1]])
    #Setup the weight and bias variables
    weight, bias = _CreateVars(layers)
    #Create the tensorflow MLP model
    self.pred = _CreateMLP(self.x, weight, bias, _GetActvFn(actvFn))
    #Use L2 as the cost function
    self.loss = tf.reduce_sum(tf.nn.l2_loss(self.pred - self.y))
    #Use regularization to prevent over-fitting
    if reg is not None:
        self.loss += _CreateL2Reg(weight, bias) * reg
    #Use the Adam method to minimize the loss function
    self.optmzr = tf.train.AdamOptimizer(learnRate).minimize(self.loss)
```

As seen above, TensorFlow placeholder variables are created for the input (x) and the output (y). Next, TensorFlow variables for the weight matrices and bias vectors are created using the _CreateVars() function. The weights are initialized as random normal values with mean 0 and standard deviation $1/\sqrt{f}$, where $f$ is the fan-in to the layer.
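This fan-in scaling keeps the variance of each layer's pre-activations roughly constant, which helps tanh units avoid saturating early in training. A quick NumPy check of the initialization scale (the seed and shapes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
f = 32                     #fan-in of a hidden layer
std = np.sqrt(1.0 / f)     #standard deviation used by _CreateVars
W = rng.normal(0.0, std, size=(f, 32))
```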

Next, the MLP model is constructed using its definition as discussed in an earlier post. After that, the loss function is defined as the L2 loss, and an L2 regularization term is added; regularization penalizes large values in the weight matrices and bias vectors to help prevent over-fitting. Lastly, TensorFlow's AdamOptimizer is employed as the training optimizer, with the goal of minimizing the loss function. Note that at this stage no learning has been done yet; the TensorFlow graph has only been initialized with the necessary components of the MLP.
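Note that tf.nn.l2_loss(t) computes sum(t ** 2) / 2, which is why the fit method recovers the reported error as sqrt(loss * 2 / m); this matches the RMSE exactly only when the regularization term is excluded. A NumPy sketch of the relationship:

```python
import numpy as np

pred = np.array([1.0, 2.0, 3.0])
y = np.array([1.5, 2.0, 2.0])
m = len(y)
l2 = np.sum((pred - y) ** 2) / 2.0    #what tf.nn.l2_loss computes
rmse = np.sqrt(l2 * 2.0 / m)          #the error reported during training
```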

Next, the MLP is trained with the Yahoo stock data. A hold-out period is used to assess how well the MLP is performing. This can be accomplished as follows:

```python
#Length of the hold-out period
nDays = 5
n = len(A)
#Learn the data, excluding the hold-out period
mlpr.fit(A[0:(n - nDays)], y[0:(n - nDays)])
```
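To judge whether the MLP is actually learning anything, it helps to compare its hold-out RMSE against a trivial baseline. Here is a self-contained sketch using a "persistence" predictor that always predicts the last training value; the toy numbers are made up for illustration:

```python
import numpy as np

def rmse(yHat, y):
    """Root-mean-square error, as computed by MLPR.score."""
    return np.sqrt(np.mean((yHat - y) ** 2))

#Toy scaled "high" series standing in for y
y = np.array([1.0, 1.2, 1.1, 1.3, 1.4]).reshape(-1, 1)
nDays = 2
yTrain, yHold = y[:-nDays], y[-nDays:]
#Persistence baseline: predict the last training value for every hold-out day
yHat = np.full_like(yHold, yTrain[-1, 0])
err = rmse(yHat, yHold)
```

A model whose score on the hold-out period does not beat this kind of baseline has not learned anything useful.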

When the fit function is called, the actual training process begins. First, a TensorFlow session is created and all variables defined in the constructor are initialized. Then training iterations are performed, up to the limit provided, updating the weights and recording the error. The feed_dict parameter specifies the values of the inputs (x) and outputs (y). If the error falls below the tolerance level, training stops early; otherwise it continues until the maximum number of iterations is exhausted.

```python
#Fit the MLP to the data
#param A: numpy matrix where each row is a sample
#param y: numpy matrix of target values
def fit(self, A, y):
    m = len(A)
    #Start the tensorflow session and initialize all variables
    self.sess = tf.Session()
    init = tf.initialize_all_variables()
    self.sess.run(init)
    #Begin training
    for i in range(self.mItr):
        #Batch mode or all at once
        if self.batSz is None:
            self.sess.run(self.optmzr, feed_dict={self.x: A, self.y: y})
        else:
            for j in range(0, m, self.batSz):
                batA, batY = _NextBatch(A, y, j, self.batSz)
                self.sess.run(self.optmzr, feed_dict={self.x: batA, self.y: batY})
        err = np.sqrt(self.sess.run(self.loss, feed_dict={self.x: A, self.y: y}) * 2.0 / m)
        if self.vrbse:
            print("Iter " + str(i + 1) + ": " + str(err))
        if err < self.tol:
            break
```

With the MLP network trained, prediction can be performed and the results plotted using matplotlib.

```python
#Begin prediction
yHat = mlpr.predict(A)
#Plot the results
mpl.plot(A, y, c='#b0403f')
mpl.plot(A, yHat, c='#5aa9ab')
mpl.show()
```

Figure 3: Actual vs Predicted Stock Data

As can be seen, the MLP smooths the original stock data. The amount of smoothing depends on the MLP parameters, including the number of layers, the size of the layers, the error tolerance, and the amount of regularization. In practice, a fair amount of parameter tuning is required to get decent results from a neural network.
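That tuning process can be mechanized; here is a minimal grid-search sketch, where score_fn is a dummy standing in for training an MLPR with the given parameters and calling score on the hold-out period (lower RMSE is better):

```python
from itertools import product

def grid_search(score_fn, grid):
    """Return the parameter setting from grid with the lowest score."""
    combos = [dict(zip(grid, c)) for c in product(*grid.values())]
    return min(combos, key=score_fn)

#Dummy score function; in practice, train an MLPR with these
#parameters and return its RMSE on the hold-out period
def dummy_score(params):
    return abs(params['reg'] - 0.001) + abs(params['tol'] - 0.4)

best = grid_search(dummy_score, {'reg': [0.0001, 0.001, 0.01],
                                 'tol': [0.2, 0.4, 0.8]})
```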

## 38 thoughts on “Stock Market Prediction Using Multi-Layer Perceptrons With TensorFlow”

1. Paul says:

Hi, how to predict the stock price beyond the training data and plot the curve, like one or two months after?


1. Nicholas T Smith says:

Hi Paul,

The naive model in this post simply takes a date as argument and then outputs a value which is the predicted stock price for that day. To make a prediction further into the future, simply input the dates desired.

N


1. Paul says:

Hi Nicholas, would you please give an example of how to input the dates as the parameters in codes? It is interesting to see the prediction curve followed on the training dates.

Cheers


2. Nicholas T Smith says:

Hi Paul,

Because the original data used date strings, it requires a bit of code to allow dates to be input directly. I will make a follow-up post to this one with an improved model and code.

In the meantime, you can try something like this to extend the plot:

```python
B = np.concatenate([np.arange(A[0], 2.0, d).reshape(-1, 1), A], axis=0)
mpl.plot(B, mlpr.predict(B), c='#5aa9ab')
```

N


2. Nicholas T Smith says:

Hello Paul,

I have provided more code in my latest post. I hope you find it useful!

N


1. Paul says:

Thanks. Brilliant work!


2. curious says:


1. Nicholas T Smith says:

For this example, a holdout period was used which could be considered as a testing set; the hold-out period comes from the original data set but it is not used during training. This is common for time-series data.


1. Nicholas T Smith says:

Cool! Did you make that site?


3. JUAN says:

Hi, I'm having a problem with (from FTMLP import MLPR), can I get any help please?


1. Nicholas T Smith says:

Hi Juan, it should be: from TFMLP import MLPR.


4. Judy says:

Hey, thanks for your well-written post and codes! In this post you predict “High” value based on the “Date” value for simplicity. But I want to predict a value based on “multiple” values. How can I do that?
I tried to change y = A[:, 1].reshape(-1, 1) and A = A[:, 0].reshape(-1, 1) to y = A[:, 385] and A = A[:, 1:385] since my csv file for training has this format each line: “id,value0,value1,…,value383,reference” where value0,value1,…,value383 are the features. “reference” row indicates target value which should be computed by value0-383.
But I got error message like “ValueError: Cannot feed value of shape (99, 384) for Tensor u’Placeholder:0′, which has shape ‘(?, 99)'”. I tried to reshape the matrix but didn’t find the right way.
Thanks!


1. Judy says:

Btw, “99” is the number of rows in my file.


2. Nicholas T Smith says:

Hi Judy,

The first parameter to the MLP constructor is a list containing the layer sizes. The first value in that list is the number of input neurons, which must match the number of features per sample. The relevant line in the above code is:

```python
#The list of layer sizes
layers = [i, h, h, h, h, h, h, h, h, h, o]
```

If your samples have 384 features (i.e., your data matrix has 99 rows and 384 columns), then i should be 384.

Hope this helps,

N


1. Judy says:

Hi Nicholas,
Thanks a lot, that helps! But could you please provide the code of predict function in this post? The code of your GitHub link is different from the code in this post which caused error if I directly use it. Thanks!


2. Nicholas T Smith says:

Hi Judy,

You should be able to use the predict function for the MLPR class in TFANN.py on my GitHub. I updated the name of the file after I added CNNs. I updated the code on this page to reflect that.

N


5. Valerie says:

Hi! I’m currently having a problem with _NextBatch method. Where can I find it?


1. Nicholas T Smith says:

Hi Valerie,

The latest version of the code no longer uses the _NextBatch function. You can see the latest code for the MLPR class in the TFANN.py file on my github page.

N


6. Hi, could you point me to the link you used to get the historical stock data? Thanks in advance!


7. Nicholas T Smith says:

You can use the historical data tab in Yahoo finance. Google finance also makes historical stock data available for download. For instance: https://finance.yahoo.com/quote/%5EGSPC/history?p=%5EGSPC

I think the link to Yahoo doesn’t work anymore because YHOO is no longer being traded. I’ll probably update the post later with a direct link to the data.

N
