Introducing neural networks to predict stock prices

Vivek Palaniappan

Last update: Jan 4, 2023

Related tags

Deep Learning python finance data-science machine-learning tutorial neural-network trading guide prediction stock-price-prediction trading-strategies quantitative-finance stock-prices algorithmic-trading regression-models yahoo-finance lstm-neural-networks keras-tensorflow mlp-networks prediction-mod

Overview

IntroNeuralNetworks in Python: A Template Project

IntroNeuralNetworks is a project that introduces neural networks and illustrates an example of how one can use neural networks to predict stock prices. It is built with the goal of allowing beginners to understand the fundamentals of how neural network models are built and go through the entire workflow of machine learning. This model is in no way sophisticated, so do improve upon this base project in any way.

The core steps are: download stock price data from Yahoo Finance, preprocess the dataframes into the format the neural network libraries expect, and finally train the neural network model and backtest it over historical data.

This model is not meant to be used for live trading. However, with further extensions, it can be used to support your trading strategies.

I hope you find this project useful in your journey as a trader or a machine learning engineer. Personally, this is my first major machine learning and Python project, so I'd appreciate it if you left a star.

As a disclaimer, this is a purely educational project. Backtested results do not guarantee performance in live trading. Do live trading at your own risk. This guide and further analysis have been cross-posted on my blog, Engineer Quant.

Contents
Overview
Getting Started
Requirements
Stock Price Data
Preprocessing
- Preparing Train Dataset
- Preparing Test Dataset
Neural Network Models
- Multilayer Perceptron Model
- LSTM Model
Backtesting
Stock Predictions
Extensions
Contributing

Overview

The overall workflow for this project is as follows:

Acquire the stock price data - this will give us our features for the model.
Preprocess the data - make the train and test datasets.
Use the neural network to learn from the training data.
Backtest the model across a date range.
Make useful stock price predictions.
Supplement your trading strategies with the predictions.

Although this is very general, it is essentially what you need to build your own machine learning or neural network model.

Getting Started

For those of you who do not want to learn about the construction of the model (although I highly suggest you do), clone or download the project, unzip it to your preferred folder and run the following commands on your computer.

pip install -r requirements.txt
python LSTM_model.py

It's as simple as that!

Requirements

For those who want a more detailed manual, this program is built in Python 3.6. If you are using an earlier version of Python, such as Python 3.5, you will run into syntax errors because f-strings were only introduced in Python 3.6. I suggest that you update to Python 3.6 or later.

pip install -r requirements.txt
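
If you are unsure whether your interpreter is recent enough, a quick check along these lines may help. This is only a sketch: the helper name supports_f_strings is made up for this example and is not part of the project.

```python
import sys


def supports_f_strings(version_info=sys.version_info):
    """Return True if the given Python version understands f-strings (3.6+)."""
    return tuple(version_info[:2]) >= (3, 6)


# Stop early with a readable message instead of a SyntaxError later on
if not supports_f_strings():
    raise SystemExit("Python 3.6 or newer is required (the project uses f-strings).")
```

The check itself avoids f-strings on purpose, so it can still run on an older interpreter.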

Stock Price Data

Now we come to the most dreaded part of any machine learning project: data acquisition and data preprocessing. As tedious and hard as it might be, it is vital to have high-quality data to feed into your model. As the saying goes, "Garbage in, garbage out." This applies especially to machine learning models, as your model is only as good as the data it is fed. Processing the data comes in two parts: downloading the data, and forming our datasets for the model. Thanks to the Yahoo Finance API, downloading the stock price data is relatively simple (though sadly I doubt it will stay that way for long).

To download the stock price data, we originally used pandas_datareader, which stopped working after a while. As a fix, we use fix_yahoo_finance (since renamed to yfinance) instead. If this fails too (possibly in the near future), you can download the stock data directly from Yahoo for free and save it as stock_price.csv.
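
For the manual CSV fallback, a loader could look roughly like the sketch below. It uses only the standard library and assumes the file has the usual Yahoo export header with an "Adj Close" column and writes "null" for missing values. The function name load_adj_close is hypothetical and does not exist in the project.

```python
import csv


def load_adj_close(path="stock_price.csv", column="Adj Close"):
    """Read one price column from a Yahoo-style CSV export as a list of floats."""
    prices = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            value = row.get(column)
            # Assumption: missing days are exported as empty cells or "null"
            if not value or value == "null":
                continue
            prices.append(float(value))
    return prices
```

The returned list is in file order, so it stays chronological as long as the CSV itself is sorted by date.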

Preprocessing

Once we have the stock price data for the stocks we are going to predict, we now need to create the training and testing datasets.

Preparing Train Dataset

The goal for our training dataset is to have rows of a given length (the number of prices used to predict), each paired with the correct next price to evaluate our model against. When you create the Preprocessing class, you can choose how much of the stock price data to use for training. Generating the training data is done quite simply using NumPy arrays and a for loop. You can perform this by running:

Preprocessing.get_train(seq_len)
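
To give a feel for what such a method produces, here is a minimal sliding-window sketch with NumPy and a for loop. It is not the project's actual get_train implementation; the function name make_windows is an assumption made for illustration.

```python
import numpy as np


def make_windows(prices, seq_len):
    """Split a 1-D price series into (inputs, targets) pairs."""
    prices = np.asarray(prices, dtype=float)
    if len(prices) <= seq_len:
        raise ValueError("need more than seq_len prices")
    X, y = [], []
    for start in range(len(prices) - seq_len):
        X.append(prices[start:start + seq_len])  # seq_len consecutive prices
        y.append(prices[start + seq_len])        # the price that follows them
    return np.array(X), np.array(y)


X_train, y_train = make_windows([1, 2, 3, 4, 5, 6], seq_len=3)
print(X_train.shape, y_train)  # (3, 3) [4. 5. 6.]
```

Each row of X_train holds seq_len prices and the matching entry of y_train is the price the model should predict from them.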

Preparing Test Dataset

The test dataset is prepared in precisely the same way as the training dataset, except that it covers a different length of the data. This is done with the following code:

Preprocessing.get_test(seq_len)
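
One common way to decide which prices go to training and which to testing is a chronological split, sketched below. The name split_series and the train_fraction parameter are assumptions for this example; the project's Preprocessing class may split the data differently.

```python
def split_series(prices, train_fraction=0.8):
    """Chronological split: the earliest prices train, the latest ones test."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(prices) * train_fraction)
    return prices[:cut], prices[cut:]


train, test = split_series([10, 11, 12, 13, 14, 15, 16, 17], train_fraction=0.75)
print(len(train), len(test))  # 6 2
```

Keeping the split chronological (no shuffling) means the test prices always come after the training prices, which mirrors how the model would be used in practice.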

Neural Network Models

Since the main goal of this project is to get acquainted with machine learning and neural networks, I will explain which models I have used and why they may be effective at predicting stock prices. If you want a more detailed explanation of neural networks, check out my blog.

Multilayer Perceptron Model

A multilayer perceptron is one of the most basic neural networks: layers of fully connected units that use backpropagation to learn from the training dataset. If you want more details about how the multilayer perceptron works, do read this article.
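
To make the idea of backpropagation concrete, here is a tiny multilayer perceptron written directly in NumPy. It is a toy sketch and not the project's model: the data is synthetic, and the layer size, learning rate and epoch count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: each row holds 3 scaled "prices"; the target is their mean
X = rng.random((64, 3))
y = X.mean(axis=1, keepdims=True)

# One hidden layer with 8 units, then a single linear output
W1 = rng.normal(0.0, 0.5, (3, 8))
b1 = np.zeros(8)
W2 = rng.normal(0.0, 0.5, (8, 1))
b2 = np.zeros(1)

learning_rate = 0.05
losses = []
for epoch in range(500):
    # Forward pass
    hidden = np.tanh(X @ W1 + b1)
    pred = hidden @ W2 + b2
    error = pred - y
    losses.append(float(np.mean(error ** 2)))  # mean squared error

    # Backpropagation: push the error gradient back through each layer
    grad_pred = 2 * error / len(X)
    grad_W2 = hidden.T @ grad_pred
    grad_b2 = grad_pred.sum(axis=0)
    grad_hidden = (grad_pred @ W2.T) * (1 - hidden ** 2)  # tanh derivative
    grad_W1 = X.T @ grad_hidden
    grad_b1 = grad_hidden.sum(axis=0)

    # Gradient descent step
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

print(losses[0], losses[-1])
```

The printed pair is the loss before and after training; a library such as Keras performs the same forward and backward passes for you.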

LSTM Model

The benefit of using a Long Short-Term Memory (LSTM) neural network is its extra element of long-term memory: the network carries information from earlier steps of the input sequence forward as a 'memory', which allows the model to find relationships within the data itself as well as between the data and the output. Again, for more details, please read this article.
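
The 'memory' can be seen in a single LSTM step, sketched below in NumPy. This is the standard LSTM cell update with random, untrained weights, not the project's Keras model; the sizes and the sample prices are made up for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U and b stack the parameters of all four gates."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b      # pre-activations for all four gates
    f = sigmoid(z[0:n])             # forget gate: how much old memory to keep
    i = sigmoid(z[n:2 * n])         # input gate: how much new information to store
    o = sigmoid(z[2 * n:3 * n])     # output gate: how much memory to expose
    g = np.tanh(z[3 * n:4 * n])     # candidate values for the memory
    c = f * c_prev + i * g          # updated long-term memory (cell state)
    h = o * np.tanh(c)              # new hidden state / output
    return h, c


rng = np.random.default_rng(0)
hidden_size, input_size = 4, 1
W = rng.normal(0.0, 0.5, (4 * hidden_size, input_size))
U = rng.normal(0.0, 0.5, (4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for price in [0.50, 0.52, 0.51, 0.55]:  # a short, already scaled price sequence
    h, c = lstm_step(np.array([price]), h, c, W, U, b)
print(h.shape, c.shape)  # (4,) (4,)
```

The cell state c is what gets carried from one price to the next, so the output for the last price depends on everything the network has seen before it.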

Backtesting

My backtest system is simple in the sense that it only evaluates how well the model predicts the stock price. It does not consider how to trade based on these predictions (that belongs to developing trading strategies on top of this model). To run just the backtesting, you will need to run:

back_test(strategy, seq_len, ticker, start_date, end_date, dim)

The dim variable is the dimensionality of the dataset you want, and it must be set correctly for the models to train successfully.
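
A backtest of this kind boils down to comparing predicted prices with the prices that actually followed. The sketch below shows one way to score that comparison; it does not reproduce the project's back_test function, and the name prediction_errors and the sample numbers are assumptions.

```python
import math


def prediction_errors(actual, predicted):
    """Return (mean absolute error, root mean squared error)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


mae, rmse = prediction_errors([100.0, 102.0, 101.0], [101.0, 101.0, 103.0])
print(mae, rmse)
```

Both numbers are in the same units as the price, which makes them easier to interpret for a regression model than a classification-style accuracy.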

Stock Predictions

Now that your model has been trained and backtested, we can use it to make stock price predictions. To do so, you need to download the current data and call the predict method of the Keras model. Run the following code after training and backtesting the model:

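# pdr, np and model are assumed to be defined already by the training script
data = pdr.get_data_yahoo("AAPL", "2017-12-19", "2018-01-03")
stock = data["Adj Close"]
# Keep only the last 10 prices so the reshape does not depend on exactly how
# many trading days the date range returns (it must still return at least 10)
X_predict = np.array(stock)[-10:].reshape((1, 10)) / 200
print(model.predict(X_predict) * 200)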
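
Because a date range does not always return exactly 10 trading days, it can help to build the input from the last seq_len prices explicitly. The helper below is a sketch under that assumption; make_prediction_input is a hypothetical name, and the (1, seq_len, 1) shape for the LSTM is inferred from the reshape used in LSTM_model.py.

```python
import numpy as np


def make_prediction_input(prices, seq_len=10, scale=200.0, lstm=False):
    """Scale the last seq_len prices and shape them as a single sample."""
    prices = np.asarray(prices, dtype=float)
    if len(prices) < seq_len:
        raise ValueError(f"need at least {seq_len} prices, got {len(prices)}")
    window = prices[-seq_len:] / scale
    # MLP input: (samples, seq_len); LSTM input: (samples, seq_len, features)
    shape = (1, seq_len, 1) if lstm else (1, seq_len)
    return window.reshape(shape)


X_predict = make_prediction_input(range(170, 182), seq_len=10)
print(X_predict.shape)  # (1, 10)
```

With a trained model, the prediction would then be scaled back with model.predict(X_predict) * 200, matching the division by 200 on the way in.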

Extensions

As mentioned before, this project is highly extendable, and here are some ideas for improving it.

Getting Data

Getting data is pretty standard using Yahoo Finance. However, you may want to look into clustering data in terms of trends of stocks (maybe by sector, or if you want to be really precise, use k-means clustering?).

Neural Network Model

This neural network can be improved in many ways:

Tuning hyperparameters: find the hyperparameters that give the best predictions.
Backtesting: make the backtesting system more robust (I have left certain important aspects out for you to figure out). Maybe include buying and shorting?
Try different neural networks: there are plenty of options, so see which works best for your stocks.
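
For the hyperparameter idea, a plain grid search is an easy starting point. The sketch below is generic: evaluate stands for a hypothetical function that trains a model with the given settings and returns a validation error, and fake_evaluate is only a stand-in so the example runs on its own.

```python
from itertools import product


def grid_search(evaluate, grid):
    """Try every combination in grid; return (best_params, best_score).

    evaluate is a hypothetical callback: it should train a model with the
    given parameters and return an error where lower is better.
    """
    best_params, best_score = None, float("inf")
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Stand-in for "train and backtest"; replace it with a real training run
def fake_evaluate(seq_len, units):
    return abs(seq_len - 10) + abs(units - 32) / 32


best, score = grid_search(fake_evaluate, {"seq_len": [5, 10, 20], "units": [16, 32, 64]})
print(best, score)  # {'seq_len': 10, 'units': 32} 0.0
```

Grid search gets expensive quickly because every combination means a full training run, so it is worth keeping the grid small at first.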

Supporting Trade

As I said earlier, this model can be used to support trading by using this prediction in your trading strategy. Examples include:

Simple long-short strategy: buy if the predicted price is higher than the current price, and sell or short if it is lower.
Intraday trading: if you can get your hands on minute data or even tick data, you can use this predictor to trade.
Statistical arbitrage: you can also use the predictions of various stock prices to find the correlation between stocks.
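
The simple long-short idea can be written down in a few lines. This is only an illustration and not trading advice: the function name long_short_signal and the threshold parameter (a minimum expected move before trading) are assumptions added for this example.

```python
def long_short_signal(last_price, predicted_price, threshold=0.0):
    """Return +1 (buy), -1 (sell or short) or 0 (no trade)."""
    expected_return = (predicted_price - last_price) / last_price
    if expected_return > threshold:
        return 1
    if expected_return < -threshold:
        return -1
    return 0


print(long_short_signal(100.0, 103.0, threshold=0.01))  # 1
print(long_short_signal(100.0, 99.5, threshold=0.01))   # 0
```

A non-zero threshold keeps the strategy from trading on predicted moves that are smaller than the model's typical error.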

Contributing

Feel free to fork this and submit PRs. I am open and grateful for any suggestions or bug fixes. Hope you enjoy this project!

For more content like this, check out my academic blog at https://medium.com/engineer-quant

Comments

Fix output file name

Great code! But I get an error when running:

get_prices.py attempts to create an output file with stock_data.to_csv(f"{ticker}_prices.csv"), which doesn't work, and LSTM_model.py looks for a file called "stock_prices.csv". This would correct the conflict by saving all stock data as "stock_prices.csv", although one might alternatively change the LSTM_model.py file to look for the correct file.

opened by dkozuch 0

New complementary tool

My name is Luis, I'm a big-data machine-learning developer, I'm a fan of your work, and I usually check your updates.

I was afraid that my savings would be eaten by inflation, so I have created a powerful tool that is based on past technical patterns (volatility, moving averages, statistics, trends, candlesticks, support and resistance, stock index indicators). It covers all the ones you know (RSI, MACD, STOCH, Bollinger Bands, SMA, DeMark, Japanese candlesticks, Ichimoku, Fibonacci, Williams %R, balance of power, Murrey math, etc.) and more than 200 others.

The tool creates prediction models of correct trading points (buy signals and sell signals, so that every stock is traded well in timing and direction). For this I have used big data tools like pandas in Python and stock market libraries like tablib, TAcharts and pandas_ta for data collection and calculation, along with powerful machine-learning libraries such as Sklearn.RandomForest, Sklearn.GradientBoosting, XGBoost, Google TensorFlow and Google TensorFlow LSTM.

With the models trained on a selection of the best technical indicators, the tool is able to predict trading points (where to buy, where to sell) and send real-time alerts to Telegram or mail. The points are calculated by learning the correct trading points of the last 2 years (including the change to a bear market after the rate hike).

I think it could be useful to you, so I would like to give it to you to improve. If you are interested in improving it and collaborating, I am willing to as well; if not, I will file it away in a drawer.

opened by Leci37 0

Accuracy is zero

I'm running the code as is, and although it's running fine, the evaluation returns a loss and when I ask it for an accuracy it returns 0. Also, the prediction is at about 40. Any idea why this happens?

opened by dinos66 1

fix dependencies

Changed your code to import yfinance instead of fix_yahoo_finance

The fix_yahoo_finance library was renamed to yfinance; the old name is kept only for backward compatibility: https://pypi.org/project/fix-yahoo-finance/#description

opened by leMedi 0

TypeError: download() missing 1 required positional argument: 'tickers'

Good day. When I run python get_prices.py, the following error appears:

Traceback (most recent call last):
  File "get_prices.py", line 4, in <module>
    fix.pdr_override()
TypeError: download() missing 1 required positional argument: 'tickers'

opened by wa1era 4

LSTM_model.py and MLP_model.py doesnt print

Hello! I am trying to figure out why neither model prints any output. I ran get_prices.py and it created the correct csv file with data. I then ran preprocessing.py, again with no problems. However, when I run LSTM_model.py or MLP_model.py, it runs through the epochs and finishes with no error messages, but it doesn't print anything either.

opened by sword134 1

Getting this error while running as it is.

[*****************100%*******************] 1 of 1 downloaded
Traceback (most recent call last):
  File "LSTM_model.py", line 37, in <module>
    X_predict = np.array(stock).reshape((1, 10, 1)) / 200
ValueError: cannot reshape array of size 9 into shape (1,10,1)

opened by sampatel012 1
