Recurrent Neural Networks (RNNs) are a class of deep learning networks designed for sequence modeling problems. Whenever the order of the data matters, as in language translation, speech-to-text conversion, or DNA sequence analysis, an RNN is a natural choice. The core idea of an RNN is that the output at the current step depends not only on the current input but also on the previous state stored inside the cell, which summarizes the past inputs.
In this tutorial, we are going to cover the following topics:
A regular deep learning network consists of an input layer, hidden layers, and an output layer. RNNs follow the same layout with one modification: the hidden layer feeds its state back into itself at the next time step. Consider this diagram, showing the input, hidden layer, and output with a temporal loop.
Suppose we are solving a language translation problem and we want to know the translation of a sentence.
Note: Machine translation is actually performed with an encoder-decoder architecture; we use this example only to get an idea of how an RNN works.
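The recurrence described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Keras implementation: the function name simple_rnn_forward, the weight names W_x, W_h, and b, the tanh activation, and the random weights are all assumptions made for the example.

```python
import numpy as np

def simple_rnn_forward(inputs, W_x, W_h, b):
    """Run one vanilla RNN layer over a single sequence.

    inputs has shape (time_steps, input_dim).
    Returns every hidden state, shape (time_steps, units).
    """
    units = W_h.shape[0]
    h = np.zeros(units)  # initial state: nothing remembered yet
    states = []
    for x_t in inputs:
        # The new state mixes the current input with the previous state.
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    return np.array(states)

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
input_dim, units, time_steps = 1, 4, 5
W_x = rng.normal(scale=0.5, size=(input_dim, units))
W_h = rng.normal(scale=0.5, size=(units, units))
b = np.zeros(units)

sequence = rng.normal(size=(time_steps, input_dim))
all_states = simple_rnn_forward(sequence, W_x, W_h, b)
last_state = all_states[-1]
print(all_states.shape, last_state.shape)
```

Every state is computed from the state before it, which is how information from early inputs can reach later steps.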
We will be predicting the closing price of Netflix stocks with the help of RNN. For this, we will be using data from the year 2010 to 2022.
pandas-datareader helps fetch data for various stocks from various sources. If you are using Google Colab, upgrade the preinstalled version with the first command below; otherwise, install it with the second:
pip install --upgrade pandas-datareader
pip install pandas-datareader
import numpy as np
import pandas as pd
import pandas_datareader as pdr
import matplotlib.pyplot as plt
from datetime import datetime
pandas-datareader provides a function, get_data_yahoo, that fetches the data for a given ticker from Yahoo Finance over a given period. We are extracting the Netflix stock price from 2010-04-01 to 2022-04-25.
df_netflix = pdr.get_data_yahoo('NFLX', start = '2010-04-01',end = '2022-04-25')
df_netflix
Let us plot the Closing price of the stock for the past 12 years.
df = df_netflix['Close']
plt.figure(figsize = (15,4))
plt.plot(df)
plt.title("Closing Price")
plt.show()
Since this is time-series data, we cannot use the scikit-learn train_test_split() function with its default shuffling, because we need to preserve the order of prices. So we will use the first 80% of the values as the training dataset and the rest for testing. We also reshape the datasets into column vectors to make them compatible with the next step, normalization.
last_index = int(len(df) * 0.8)
train = df[:last_index].values.reshape(-1,1)
test = df[last_index:].values.reshape(-1,1)
train.shape, test.shape
((2430, 1), (608, 1))
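Normalization rescales the values to the range 0 to 1. It improves convergence and hence reduces training time. The scaler is fitted on the training data only, so test prices above the training maximum map to values greater than 1.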
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
train = scaler.fit_transform(train)
test = scaler.transform(test)
train[:10]
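To make the scaling step concrete, here is a minimal NumPy sketch of the min-max formula that MinMaxScaler applies with its default range, (x - min) / (max - min), where min and max come from the training data. The toy prices and the helper name min_max are made up for the illustration.

```python
import numpy as np

# Toy prices (hypothetical), shaped as column vectors like train and test.
train_prices = np.array([10.0, 20.0, 30.0, 40.0]).reshape(-1, 1)
test_prices = np.array([35.0, 50.0]).reshape(-1, 1)

# The bounds are learned from the training data only.
lo, hi = train_prices.min(), train_prices.max()

def min_max(values):
    # Same formula for both sets, always with the training bounds.
    return (values - lo) / (hi - lo)

train_scaled = min_max(train_prices)
test_scaled = min_max(test_prices)
print(train_scaled.ravel())
print(test_scaled.ravel())
```

The training values land exactly in the range 0 to 1, while the test price of 50 lies above the training maximum and is scaled to a value above 1.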
The function below turns a series into supervised samples: time_step specifies how many previous values are used as input to predict the next price. Accordingly, both the training and testing datasets are split into X and Y.
def create_dataset(dataset, time_step=1):
    data_X, data_Y = [], []
    for i in range(len(dataset)-time_step-1):
        a = dataset[i:(i + time_step), 0]
        data_X.append(a)
        data_Y.append(dataset[i + time_step, 0])
    return np.array(data_X), np.array(data_Y)
time_step = 100
X_train, y_train = create_dataset(train, time_step)
X_test, y_test = create_dataset(test, time_step)
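A small worked example may help to see what the windowing produces. The sketch below repeats create_dataset with the same logic so that it runs on its own, and applies it to a made-up series of eight values with time_step = 3; the names toy, X_toy, and y_toy are only for illustration.

```python
import numpy as np

def create_dataset(dataset, time_step=1):
    # Same logic as the tutorial's function: each sample is a window of
    # time_step values, and the label is the value right after the window.
    data_X, data_Y = [], []
    for i in range(len(dataset) - time_step - 1):
        data_X.append(dataset[i:(i + time_step), 0])
        data_Y.append(dataset[i + time_step, 0])
    return np.array(data_X), np.array(data_Y)

# Hypothetical series 10, 11, ..., 17 as a column vector.
toy = np.arange(10, 18, dtype=float).reshape(-1, 1)
X_toy, y_toy = create_dataset(toy, time_step=3)

for window, target in zip(X_toy, y_toy):
    print(window, "->", target)
```

With this loop bound, the last possible window (14, 15, 16 with label 17) is not generated, so the series of eight values yields four samples rather than five.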
To make the shape of the input compatible with the model, we need to reshape it to three dimensions: (samples, time steps, features).
X_train = X_train.reshape(X_train.shape[0],X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0],X_test.shape[1], 1)
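The reshape only adds a trailing axis of size 1, because each time step carries a single feature (the closing price). A minimal sketch on a made-up array, with hypothetical names windows and rnn_input:

```python
import numpy as np

# 4 samples, each a window of 3 time steps (hypothetical values).
windows = np.arange(12, dtype=float).reshape(4, 3)

# Same pattern as in the tutorial: (samples, time steps) -> (samples, time steps, 1).
rnn_input = windows.reshape(windows.shape[0], windows.shape[1], 1)

# An equivalent way to add the feature axis.
also_rnn_input = windows[..., np.newaxis]

print(windows.shape, rnn_input.shape)
```

The values are unchanged; only the shape differs, so rnn_input[i, t, 0] equals windows[i, t].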
To build the RNN model, we stack three recurrent layers using the Keras SimpleRNN layer. Setting return_sequences = True makes a layer output a 3D array containing the outputs for all time steps, ready to be fed into the next recurrent layer (this is not needed for the last recurrent layer because we only want its final output). The model is compiled with the adam optimizer and mean_squared_error as the loss function. Finally, the summary of the model is shown.
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense
model = Sequential([SimpleRNN(20, return_sequences = True, input_shape = [None,1]),
                    SimpleRNN(20, return_sequences = True),
                    SimpleRNN(20),
                    Dense(1)])
model.compile(loss = 'mean_squared_error', optimizer = 'adam')
model.summary()
Output:
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 simple_rnn (SimpleRNN)      (None, None, 20)          440
 simple_rnn_1 (SimpleRNN)    (None, None, 20)          820
 simple_rnn_2 (SimpleRNN)    (None, 20)                820
 dense (Dense)               (None, 1)                 21
=================================================================
Total params: 2,101
Trainable params: 2,101
Non-trainable params: 0
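The parameter counts in the summary can be checked by hand. A simple recurrent layer has one weight per input-to-unit connection, one per unit-to-unit recurrent connection, and one bias per unit, giving units * (input_dim + units + 1). The helper names below are hypothetical and only reproduce this arithmetic.

```python
def simple_rnn_params(input_dim, units):
    # input weights + recurrent weights + biases
    return units * (input_dim + units + 1)

def dense_params(input_dim, units):
    # weights + biases
    return units * (input_dim + 1)

counts = [
    simple_rnn_params(1, 20),   # first layer sees 1 feature per time step
    simple_rnn_params(20, 20),  # second layer sees the 20 outputs of the first
    simple_rnn_params(20, 20),  # third layer sees the 20 outputs of the second
    dense_params(20, 1),        # output layer maps 20 values to 1 price
]
print(counts, sum(counts))  # [440, 820, 820, 21] 2101
```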
For training, we pass the training data and labels to fit(), pass the testing data and labels as validation_data, and specify the number of epochs (passes over the training data).
model.fit(X_train, y_train, validation_data = (X_test, y_test), epochs = 80)
Output:
Epoch 1/80
73/73 [==============================] - 4s 58ms/step - loss: 1.6642e-04 - val_loss: 0.0030
Epoch 2/80
73/73 [==============================] - 4s 57ms/step - loss: 1.6279e-04 - val_loss: 0.0020
Epoch 3/80
73/73 [==============================] - 4s 59ms/step - loss: 1.8406e-04 - val_loss: 0.0022
Epoch 4/80
73/73 [==============================] - 4s 59ms/step - loss: 1.6228e-04 - val_loss: 0.0020
Epoch 5/80
73/73 [==============================] - 4s 59ms/step - loss: 1.5072e-04 - val_loss: 0.0025
.
.
.
Epoch 75/80
73/73 [==============================] - 4s 58ms/step - loss: 1.5850e-04 - val_loss: 0.0018
Epoch 76/80
73/73 [==============================] - 4s 59ms/step - loss: 1.9614e-04 - val_loss: 0.0023
Epoch 77/80
73/73 [==============================] - 4s 59ms/step - loss: 1.5352e-04 - val_loss: 0.0017
Epoch 78/80
73/73 [==============================] - 4s 60ms/step - loss: 1.5618e-04 - val_loss: 0.0016
Epoch 79/80
73/73 [==============================] - 4s 58ms/step - loss: 1.8878e-04 - val_loss: 0.0017
Epoch 80/80
73/73 [==============================] - 4s 57ms/step - loss: 1.7536e-04 - val_loss: 0.0019
<keras.callbacks.History at 0x7fc3132469d0>
After training, the model is ready to make predictions on the testing dataset.
test_predict = model.predict(X_test)
test_predict[:10]
array([[1.0299232 ],
[1.0298218 ],
[1.0012373 ],
[1.0180799 ],
[1.0012494 ],
[1.0095133 ],
[0.97529024],
[0.98867863],
[1.0026649 ],
[0.99097246]], dtype=float32)
Since we normalized the data earlier, we now invert the transformation to get the predictions back as actual prices.
test_predict=scaler.inverse_transform(test_predict)
Here we plot the actual and predicted stock prices for the test dataset. We can see that the model performs reasonably well.
plt.figure(figsize = (15,4))
plt.plot(scaler.inverse_transform(y_test.reshape(-1,1)), color = 'r', label = 'actual')
plt.plot(test_predict, color = 'b',label = 'predicted')
plt.legend()
plt.show()
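For evaluation, we will use the root mean squared error (RMSE). Both arrays must be on the same scale, so y_test is converted back to prices first, as test_predict already was.
from sklearn.metrics import mean_squared_error
y_test_actual = scaler.inverse_transform(y_test.reshape(-1,1))
np.sqrt(mean_squared_error(y_test_actual, test_predict))
Passing the normalized y_test directly, as in mean_squared_error(y_test,test_predict, squared = False), compares values near 1 with prices in dollars; that call gives:
507.0118272626747
This figure mostly reflects the price level, not the model's error.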
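The effect of mixing scales on RMSE can be illustrated with a minimal sketch. The prices and the scaling bounds lo and hi below are made up, and rmse is a hypothetical helper, not part of the tutorial's code.

```python
import numpy as np

def rmse(actual, predicted):
    # Flatten both inputs so a (n,) array and a (n, 1) array
    # are compared element by element instead of being broadcast.
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Hypothetical training bounds and test prices.
lo, hi = 100.0, 500.0
actual_prices = np.array([300.0, 320.0, 310.0])
predicted_prices = np.array([[305.0], [315.0], [312.0]])

# The same actual prices after min-max scaling.
actual_scaled = (actual_prices - lo) / (hi - lo)

same_scale = rmse(actual_prices, predicted_prices)
mixed_scale = rmse(actual_scaled, predicted_prices)
print(same_scale, mixed_scale)
```

Here same_scale is a few dollars, while mixed_scale is roughly as large as the prices themselves, even though the predictions are identical in both calls.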
In this tutorial, we have seen how a Recurrent Neural Network works. The feedback mechanism allows it to hold memory, and this recurrent nature makes it well suited to sequence modeling problems. We have also implemented an RNN in Python to predict stock prices. In the upcoming session, we will focus on LSTM, GRU, and other deep learning topics. If you want to explore the Convolutional Neural Network (CNN), follow this article.