Many-to-One Sequence Problem

The input to an LSTM is a 3D array of shape [batch, timesteps, features]:
batch = number of samples
timesteps = lookback window (e.g. 60 means use the past 60 days to make a prediction)
features = number of variables per timestep (e.g. all OHLC columns vs. the closing price alone)

The output of an LSTM layer is either 2D or 3D, depending on the return_sequences argument:

If return_sequences=False, the output is a 2D array containing only the final hidden state.
Otherwise, the output is a 3D array with one hidden state per timestep, which can be passed to the next LSTM layer.
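
For illustration only (not part of the notebook), a minimal sketch of how return_sequences changes the output shape; the 10-unit layer and the random input are arbitrary placeholders:

import numpy as np
from tensorflow.keras.layers import LSTM

# toy input of shape [batch, timesteps, features] = [4, 60, 1]
x = np.random.rand(4, 60, 1).astype('float32')

print(LSTM(10, return_sequences=False)(x).shape)   # (4, 10)      -> 2D: last hidden state only
print(LSTM(10, return_sequences=True)(x).shape)    # (4, 60, 10)  -> 3D: one hidden state per timestep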

Import and Settings

In [1]:
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
from pathlib import Path

from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, r2_score

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dropout, Dense, LSTM
from tensorflow.keras.optimizers import Adam, RMSprop
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

import matplotlib.pyplot as plt
plt.style.use('seaborn')

Toy Example

X is a toy dataset of 20 samples with 3 timesteps each, where every timestep consists of a single feature. The target for each sample is the sum of its 3 inputs. For example, if a sample is the sequence 1, 2, 3, the output is 1 + 2 + 3 = 6.

This is a many (inputs)-to-one (output) sequence problem.

X

In [2]:
# create toy dataset
X = np.array([i+1 for i in range(60)])

# reshape into 3D: [samples, timesteps, features]
X = X.reshape(20, 3, 1)

print(f'X: {X[0]}')
print(f'\nX: {X[1]}')
X: [[1]
 [2]
 [3]]

X: [[4]
 [5]
 [6]]
In [3]:
# check the shape
X.shape
Out[3]:
(20, 3, 1)

y

In [4]:
# Create y
# y is the sum of the values in the timesteps

y = []
for each in X:
    y.append(each.sum())
    
# convert to array
y = np.array(y)

# check the output
print(f'y: {y[0]}')
print(f'y: {y[1]}')
y: 6
y: 15
In [5]:
# check the shape
y.shape
Out[5]:
(20,)

Building the Model

In [6]:
# compile the model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(3, 1)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
print(model.summary())
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 lstm (LSTM)                 (None, 50)                10400     
                                                                 
 dense (Dense)               (None, 1)                 51        
                                                                 
=================================================================
Total params: 10,451
Trainable params: 10,451
Non-trainable params: 0
_________________________________________________________________
None

Training the Model

In [7]:
# fit model
model.fit(X, y, batch_size=5, epochs=2000, validation_split=0.2, verbose=0)

# predict the outcome
test_input = np.array([70,71,72])
test_input = test_input.reshape((1, 3, 1))
test_output = model.predict(test_input, verbose=0)
print(test_output)
[[213.61253]]

Applying Neural Networks to Financial Time Series

Data Preprocessing

In [8]:
data = pd.read_csv('data/spy.csv', index_col=0, parse_dates=True)[['Adj Close']]['2000':'2019']
In [9]:
plt.figure(figsize=(14,6))
plt.title('SPY Price')
plt.plot(data);
In [10]:
scaler = MinMaxScaler()
data_scaled = pd.Series(scaler.fit_transform(data).squeeze(), index=data.index)
In [11]:
def generate_data(data, window_size):
    n = len(data)
    # targets: the value that follows each window
    y = data[window_size:]
    # make the series 2D so the shifted slices can be stacked column-wise
    data = data.values.reshape(-1, 1)
    # column k of X holds the observation (window_size - k) steps before the target,
    # so each row is the full trailing window preceding its target
    X = np.hstack(tuple([data[i: n-j, :] for i, j in enumerate(range(window_size, 0, -1))]))
    return pd.DataFrame(X, index=y.index), y
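
As a quick sanity check of the windowing logic (illustrative only, not part of the original notebook), generate_data can be applied to a tiny series:

toy = pd.Series(np.arange(5.0), index=pd.date_range('2020-01-01', periods=5))
X_toy, y_toy = generate_data(toy, window_size=2)
print(X_toy.values)   # [[0. 1.] [1. 2.] [2. 3.]]  -> each row is a trailing window of length 2
print(y_toy.values)   # [2. 3. 4.]                 -> the value that follows each window
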
In [12]:
window_size = 60
In [13]:
X, y = generate_data(data_scaled, window_size=window_size)
In [14]:
X.head(2)
Out[14]:
0 1 2 3 4 5 6 7 8 9 ... 50 51 52 53 54 55 56 57 58 59
Date
2000-03-29 0.169410 0.154845 0.155485 0.149723 0.170210 0.171491 0.167009 0.163328 0.168290 0.173331 ... 0.155005 0.171731 0.174208 0.172282 0.179984 0.182311 0.188890 0.191217 0.187045 0.184798
2000-03-30 0.154845 0.155485 0.149723 0.170210 0.171491 0.167009 0.163328 0.168290 0.173331 0.170370 ... 0.171731 0.174208 0.172282 0.179984 0.182311 0.188890 0.191217 0.187045 0.184798 0.185199

2 rows × 60 columns

In [15]:
y.head(2)
Out[15]:
Date
2000-03-29    0.185199
2000-03-30    0.178701
dtype: float64
In [16]:
X.shape, y.shape
Out[16]:
((4971, 60), (4971,))

Train Test Split

In [17]:
# split the data before and after 2015
X_train = X[:'2015'].values.reshape(-1, window_size, 1)
y_train = y[:'2015']

# keep the last five years for testing
X_test = X['2015':].values.reshape(-1, window_size, 1)
y_test = y['2015':]
In [18]:
X_train.shape
Out[18]:
(3965, 60, 1)

Training an LSTM Model

In [19]:
# unpack the input dimensions
# (note: the first value is the number of training samples, not the batch size)
batch, timesteps, features = X_train.shape
batch, timesteps, features
Out[19]:
(3965, 60, 1)
In [20]:
# model architecture
model = Sequential()
model.add(LSTM(units=10, input_shape=(timesteps, features), activation='relu', name='LSTM'))
model.add(Dense(units=1, name='Output'))
In [21]:
# summary
model.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 LSTM (LSTM)                 (None, 10)                480       
                                                                 
 Output (Dense)              (None, 1)                 11        
                                                                 
=================================================================
Total params: 491
Trainable params: 491
Non-trainable params: 0
_________________________________________________________________
In [22]:
# specify optimizer separately (preferred method)
opt = RMSprop(lr=0.001, rho=0.9, epsilon=1e-08, decay=0.0)
In [23]:
# compile the model
model.compile(optimizer=opt, loss='mean_squared_error')
In [24]:
results_path = Path('results', 'lstm_time_series')
if not results_path.exists():
    results_path.mkdir(parents=True)

# save the best weights at checkpoints
model_path = (results_path / 'model.h5').as_posix()
checkpointer = ModelCheckpoint(filepath=model_path,
                               verbose=1,
                               monitor='val_loss',
                               save_best_only=True)
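
If you later want to reload the best checkpointed weights instead of relying on restore_best_weights below, a minimal sketch (assumes training has already run and written the file at model_path):

from tensorflow.keras.models import load_model

# reload the best model saved by the ModelCheckpoint callback above
best_model = load_model(model_path)
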
In [25]:
# define early stopping (stop if there's no improvement in val_loss for 20 consecutive epochs)
early_stopping = EarlyStopping(monitor='val_loss', 
                              patience=20,
                              restore_best_weights=True)
In [26]:
# train the model
lstm_training = model.fit(X_train,
                          y_train,
                          batch_size=64,
                          epochs=500,
                          verbose=1,
                          callbacks=[early_stopping, checkpointer],
                          validation_data=(X_test, y_test),
                          shuffle=False)
Epoch 1/500
60/62 [============================>.] - ETA: 0s - loss: 0.0047
Epoch 00001: val_loss improved from inf to 11.31635, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 2s 16ms/step - loss: 0.0046 - val_loss: 11.3163
Epoch 2/500
60/62 [============================>.] - ETA: 0s - loss: 0.0021
Epoch 00002: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 14ms/step - loss: 0.0020 - val_loss: 2142.5442
Epoch 3/500
60/62 [============================>.] - ETA: 0s - loss: 7.2860e-04
Epoch 00003: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 14ms/step - loss: 7.2365e-04 - val_loss: 34122.3242
Epoch 4/500
62/62 [==============================] - ETA: 0s - loss: 3.4285e-04
Epoch 00004: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 13ms/step - loss: 3.4285e-04 - val_loss: 46763.2891
Epoch 5/500
61/62 [============================>.] - ETA: 0s - loss: 2.3995e-04
Epoch 00005: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 14ms/step - loss: 2.4178e-04 - val_loss: 34585.9805
Epoch 6/500
59/62 [===========================>..] - ETA: 0s - loss: 1.8447e-04
Epoch 00006: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 13ms/step - loss: 2.0608e-04 - val_loss: 18491.0879
Epoch 7/500
61/62 [============================>.] - ETA: 0s - loss: 1.7586e-04
Epoch 00007: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 13ms/step - loss: 1.7812e-04 - val_loss: 9403.9014
Epoch 8/500
60/62 [============================>.] - ETA: 0s - loss: 1.5469e-04
Epoch 00008: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 14ms/step - loss: 1.6148e-04 - val_loss: 4209.4946
Epoch 9/500
62/62 [==============================] - ETA: 0s - loss: 1.5077e-04
Epoch 00009: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 13ms/step - loss: 1.5077e-04 - val_loss: 1768.2617
Epoch 10/500
60/62 [============================>.] - ETA: 0s - loss: 1.3571e-04
Epoch 00010: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 13ms/step - loss: 1.4160e-04 - val_loss: 718.3263
Epoch 11/500
60/62 [============================>.] - ETA: 0s - loss: 1.3146e-04
Epoch 00011: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 13ms/step - loss: 1.3763e-04 - val_loss: 280.8399
Epoch 12/500
61/62 [============================>.] - ETA: 0s - loss: 1.3286e-04
Epoch 00012: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 14ms/step - loss: 1.3469e-04 - val_loss: 106.8439
Epoch 13/500
60/62 [============================>.] - ETA: 0s - loss: 1.2483e-04
Epoch 00013: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 13ms/step - loss: 1.3115e-04 - val_loss: 41.1862
Epoch 14/500
61/62 [============================>.] - ETA: 0s - loss: 1.2544e-04
Epoch 00014: val_loss did not improve from 11.31635
62/62 [==============================] - 1s 13ms/step - loss: 1.2705e-04 - val_loss: 16.2547
Epoch 15/500
60/62 [============================>.] - ETA: 0s - loss: 1.1860e-04
Epoch 00015: val_loss improved from 11.31635 to 5.93819, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 1.2356e-04 - val_loss: 5.9382
Epoch 16/500
62/62 [==============================] - ETA: 0s - loss: 1.2095e-04
Epoch 00016: val_loss improved from 5.93819 to 2.08882, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 1.2095e-04 - val_loss: 2.0888
Epoch 17/500
60/62 [============================>.] - ETA: 0s - loss: 1.1373e-04
Epoch 00017: val_loss improved from 2.08882 to 0.84995, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 13ms/step - loss: 1.1839e-04 - val_loss: 0.8499
Epoch 18/500
62/62 [==============================] - ETA: 0s - loss: 1.1595e-04
Epoch 00018: val_loss improved from 0.84995 to 0.42249, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 1.1595e-04 - val_loss: 0.4225
Epoch 19/500
59/62 [===========================>..] - ETA: 0s - loss: 1.0435e-04
Epoch 00019: val_loss improved from 0.42249 to 0.24415, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 1.1353e-04 - val_loss: 0.2441
Epoch 20/500
62/62 [==============================] - ETA: 0s - loss: 1.1134e-04
Epoch 00020: val_loss improved from 0.24415 to 0.15573, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 13ms/step - loss: 1.1134e-04 - val_loss: 0.1557
Epoch 21/500
62/62 [==============================] - ETA: 0s - loss: 1.0932e-04
Epoch 00021: val_loss improved from 0.15573 to 0.10599, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 1.0932e-04 - val_loss: 0.1060
Epoch 22/500
58/62 [===========================>..] - ETA: 0s - loss: 9.1835e-05
Epoch 00022: val_loss improved from 0.10599 to 0.07558, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 1.0732e-04 - val_loss: 0.0756
Epoch 23/500
59/62 [===========================>..] - ETA: 0s - loss: 9.6447e-05
Epoch 00023: val_loss improved from 0.07558 to 0.05615, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 1.0506e-04 - val_loss: 0.0561
Epoch 24/500
59/62 [===========================>..] - ETA: 0s - loss: 9.5938e-05
Epoch 00024: val_loss improved from 0.05615 to 0.04292, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 1.0284e-04 - val_loss: 0.0429
Epoch 25/500
62/62 [==============================] - ETA: 0s - loss: 1.0102e-04
Epoch 00025: val_loss improved from 0.04292 to 0.03319, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 1.0102e-04 - val_loss: 0.0332
Epoch 26/500
58/62 [===========================>..] - ETA: 0s - loss: 9.0627e-05
Epoch 00026: val_loss improved from 0.03319 to 0.02585, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 9.9215e-05 - val_loss: 0.0259
Epoch 27/500
59/62 [===========================>..] - ETA: 0s - loss: 9.2489e-05
Epoch 00027: val_loss improved from 0.02585 to 0.01996, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 9.7284e-05 - val_loss: 0.0200
Epoch 28/500
62/62 [==============================] - ETA: 0s - loss: 9.4947e-05
Epoch 00028: val_loss improved from 0.01996 to 0.01470, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 9.4947e-05 - val_loss: 0.0147
Epoch 29/500
61/62 [============================>.] - ETA: 0s - loss: 9.2986e-05
Epoch 00029: val_loss improved from 0.01470 to 0.01099, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 9.3323e-05 - val_loss: 0.0110
Epoch 30/500
60/62 [============================>.] - ETA: 0s - loss: 8.9369e-05
Epoch 00030: val_loss improved from 0.01099 to 0.00865, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 9.2342e-05 - val_loss: 0.0087
Epoch 31/500
61/62 [============================>.] - ETA: 0s - loss: 9.0679e-05
Epoch 00031: val_loss improved from 0.00865 to 0.00684, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 9.1137e-05 - val_loss: 0.0068
Epoch 32/500
60/62 [============================>.] - ETA: 0s - loss: 8.7221e-05
Epoch 00032: val_loss improved from 0.00684 to 0.00547, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 9.0074e-05 - val_loss: 0.0055
Epoch 33/500
58/62 [===========================>..] - ETA: 0s - loss: 8.2130e-05
Epoch 00033: val_loss improved from 0.00547 to 0.00434, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 8.8899e-05 - val_loss: 0.0043
Epoch 34/500
59/62 [===========================>..] - ETA: 0s - loss: 8.3812e-05
Epoch 00034: val_loss improved from 0.00434 to 0.00340, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 8.7691e-05 - val_loss: 0.0034
Epoch 35/500
62/62 [==============================] - ETA: 0s - loss: 8.6454e-05
Epoch 00035: val_loss improved from 0.00340 to 0.00263, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 8.6454e-05 - val_loss: 0.0026
Epoch 36/500
61/62 [============================>.] - ETA: 0s - loss: 8.4776e-05
Epoch 00036: val_loss improved from 0.00263 to 0.00203, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 8.5305e-05 - val_loss: 0.0020
Epoch 37/500
59/62 [===========================>..] - ETA: 0s - loss: 8.0858e-05
Epoch 00037: val_loss improved from 0.00203 to 0.00158, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 8.4216e-05 - val_loss: 0.0016
Epoch 38/500
61/62 [============================>.] - ETA: 0s - loss: 8.2763e-05
Epoch 00038: val_loss improved from 0.00158 to 0.00122, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 8.3280e-05 - val_loss: 0.0012
Epoch 39/500
58/62 [===========================>..] - ETA: 0s - loss: 7.6794e-05
Epoch 00039: val_loss improved from 0.00122 to 0.00095, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 8.2410e-05 - val_loss: 9.4724e-04
Epoch 40/500
62/62 [==============================] - ETA: 0s - loss: 8.1430e-05
Epoch 00040: val_loss improved from 0.00095 to 0.00074, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 8.1430e-05 - val_loss: 7.3893e-04
Epoch 41/500
59/62 [===========================>..] - ETA: 0s - loss: 7.7394e-05
Epoch 00041: val_loss improved from 0.00074 to 0.00058, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 8.0587e-05 - val_loss: 5.7929e-04
Epoch 42/500
62/62 [==============================] - ETA: 0s - loss: 7.9712e-05
Epoch 00042: val_loss improved from 0.00058 to 0.00046, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 7.9712e-05 - val_loss: 4.6032e-04
Epoch 43/500
59/62 [===========================>..] - ETA: 0s - loss: 7.5591e-05
Epoch 00043: val_loss improved from 0.00046 to 0.00037, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 14ms/step - loss: 7.8751e-05 - val_loss: 3.7490e-04
Epoch 44/500
60/62 [============================>.] - ETA: 0s - loss: 7.5013e-05
Epoch 00044: val_loss improved from 0.00037 to 0.00032, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 7.7560e-05 - val_loss: 3.2193e-04
Epoch 45/500
61/62 [============================>.] - ETA: 0s - loss: 7.6046e-05
Epoch 00045: val_loss improved from 0.00032 to 0.00030, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 7.6444e-05 - val_loss: 2.9560e-04
Epoch 46/500
59/62 [===========================>..] - ETA: 0s - loss: 7.2380e-05
Epoch 00046: val_loss improved from 0.00030 to 0.00029, saving model to results/lstm_time_series\model.h5
62/62 [==============================] - 1s 15ms/step - loss: 7.5566e-05 - val_loss: 2.8942e-04
Epoch 47/500
59/62 [===========================>..] - ETA: 0s - loss: 7.1473e-05
Epoch 00047: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 7.4669e-05 - val_loss: 3.0016e-04
Epoch 48/500
59/62 [===========================>..] - ETA: 0s - loss: 7.0085e-05
Epoch 00048: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 15ms/step - loss: 7.3368e-05 - val_loss: 3.2926e-04
Epoch 49/500
60/62 [============================>.] - ETA: 0s - loss: 6.9456e-05
Epoch 00049: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 7.2018e-05 - val_loss: 3.7472e-04
Epoch 50/500
58/62 [===========================>..] - ETA: 0s - loss: 6.4962e-05
Epoch 00050: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 17ms/step - loss: 7.0030e-05 - val_loss: 4.2411e-04
Epoch 51/500
61/62 [============================>.] - ETA: 0s - loss: 6.8500e-05
Epoch 00051: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 6.8705e-05 - val_loss: 4.9172e-04
Epoch 52/500
62/62 [==============================] - ETA: 0s - loss: 6.7332e-05
Epoch 00052: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 6.7332e-05 - val_loss: 5.8319e-04
Epoch 53/500
62/62 [==============================] - ETA: 0s - loss: 6.5827e-05
Epoch 00053: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 6.5827e-05 - val_loss: 6.8722e-04
Epoch 54/500
59/62 [===========================>..] - ETA: 0s - loss: 6.1113e-05
Epoch 00054: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 6.4242e-05 - val_loss: 8.3399e-04
Epoch 55/500
58/62 [===========================>..] - ETA: 0s - loss: 5.9391e-05
Epoch 00055: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 16ms/step - loss: 6.2609e-05 - val_loss: 9.6483e-04
Epoch 56/500
61/62 [============================>.] - ETA: 0s - loss: 6.1010e-05
Epoch 00056: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 6.1192e-05 - val_loss: 7.5847e-04
Epoch 57/500
59/62 [===========================>..] - ETA: 0s - loss: 5.8728e-05
Epoch 00057: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 16ms/step - loss: 6.1406e-05 - val_loss: 3.4834e-04
Epoch 58/500
61/62 [============================>.] - ETA: 0s - loss: 6.0776e-05
Epoch 00058: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 16ms/step - loss: 6.1062e-05 - val_loss: 7.6530e-04
Epoch 59/500
61/62 [============================>.] - ETA: 0s - loss: 5.8647e-05
Epoch 00059: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 13ms/step - loss: 5.9140e-05 - val_loss: 5.1445e-04
Epoch 60/500
59/62 [===========================>..] - ETA: 0s - loss: 5.6945e-05
Epoch 00060: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 16ms/step - loss: 5.9088e-05 - val_loss: 6.8917e-04
Epoch 61/500
59/62 [===========================>..] - ETA: 0s - loss: 5.5854e-05
Epoch 00061: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 15ms/step - loss: 5.8068e-05 - val_loss: 8.5753e-04
Epoch 62/500
62/62 [==============================] - ETA: 0s - loss: 5.7439e-05
Epoch 00062: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 16ms/step - loss: 5.7439e-05 - val_loss: 0.0010
Epoch 63/500
62/62 [==============================] - ETA: 0s - loss: 5.7072e-05
Epoch 00063: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 15ms/step - loss: 5.7072e-05 - val_loss: 0.0012
Epoch 64/500
61/62 [============================>.] - ETA: 0s - loss: 5.6002e-05
Epoch 00064: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 5.6208e-05 - val_loss: 0.0012
Epoch 65/500
61/62 [============================>.] - ETA: 0s - loss: 5.5490e-05
Epoch 00065: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 5.5713e-05 - val_loss: 0.0012
Epoch 66/500
59/62 [===========================>..] - ETA: 0s - loss: 5.3238e-05
Epoch 00066: val_loss did not improve from 0.00029
62/62 [==============================] - 1s 14ms/step - loss: 5.5451e-05 - val_loss: 9.7760e-04

Evaluating the Model

In [27]:
fig, ax = plt.subplots(figsize=(12, 4))

# derive rmse for the training/validation loss
loss_history = pd.DataFrame(lstm_training.history).pow(.5)
loss_history.index += 1

# get the best rmse and iteration of the same
best_rmse = loss_history.val_loss.min()
best_epoch = loss_history.val_loss.idxmin()

# plot rolling 5-iteration rmse
loss_history.columns=['Training RMSE', 'Validation RMSE']
title = f'5-Epoch Rolling RMSE (Best Validation RMSE: {best_rmse:.4%})'
loss_history.rolling(5).mean().plot(logy=True, lw=2, title=title, ax=ax)

# location of best iteration
ax.axvline(best_epoch, ls='--', lw=1, c='k')

# save figure
fig.tight_layout()
fig.savefig(results_path / 'lstm_error', dpi=300);
In [28]:
train_rmse_scaled = np.sqrt(model.evaluate(X_train, y_train, verbose=0))
test_rmse_scaled = np.sqrt(model.evaluate(X_test, y_test, verbose=0))

print(f'Train RMSE: {train_rmse_scaled:.4f} | Test RMSE: {test_rmse_scaled:.4f}')
Train RMSE: 0.0147 | Test RMSE: 0.0170
In [29]:
# predictions
y_pred = model.predict(X_test)
y_pred = pd.Series(scaler.inverse_transform(y_pred).squeeze(), index=y_test.index)
In [30]:
# inverse-transform y_test back to the original price scale
y_test_rescaled = pd.Series(scaler.inverse_transform(y_test.to_frame()).squeeze(), index=y_test.index)
In [31]:
print(f'R-square: {r2_score(y_test_rescaled, y_pred):0.4}')
R-square: 0.9872
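
Since mean_squared_error is already imported, the test error can also be expressed in price units rather than in scaled units; a quick sketch using the rescaled series from the cells above:

# test RMSE in adjusted-close price units (uses y_pred and y_test_rescaled from above)
rmse_price = np.sqrt(mean_squared_error(y_test_rescaled, y_pred))
print(f'Test RMSE (price units): {rmse_price:.2f}')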

Plot Results

In [32]:
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(20,6))

ax[0].plot(y_test_rescaled, color='red', label='actual')
ax[0].plot(y_pred, color='blue', label='predicted')
ax[1].hist(y_pred - y_test_rescaled, bins=50, density=True, label='spread')

ax[0].legend()
ax[1].legend()

plt.suptitle('SPY LSTM Prediction');