Neural Networks and the Stock Market Pt. 1

A hobby of mine, or at the very least an interest, is stock trading and investing. I’m an engineer, and we’re good with numbers, so how hard could stock trading be?

Relevant XKCD: https://xkcd.com/1570/

Joking aside, quantitative and algorithmic trading (using an automated program to trade stocks) is an interesting topic. The biggest advantage of a trading algorithm seems to be that it can remove emotions like fear and greed from the decision-making process.

In this series of posts, I want to figure out how successful you can be at stock forecasting using a neural network. Artificial neural networks can be used for a wide variety of applications, such as natural language processing, cluster analysis, and pattern recognition (or any other classification problem). I think stock forecasting can be formulated as a classification problem too: is a stock’s value going to go up or down?
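To make that framing a bit more concrete, here is a minimal sketch of how up/down labels could be built with Pandas. The prices are made up, and the column names Next_Close and Up are my own placeholders, not something the data source provides.

```python
import pandas as pd

# Made-up closing prices in chronological order (illustrative only)
prices = pd.DataFrame({"Close": [100.0, 101.5, 101.0, 102.2, 102.2]})

# Line up each day's close with the following day's close
prices["Next_Close"] = prices["Close"].shift(-1)

# Label is 1 if the price rises the next day, 0 otherwise
prices["Up"] = (prices["Next_Close"] > prices["Close"]).astype(int)

# The final row has no "next day" to compare against, so drop it
labeled = prices.dropna(subset=["Next_Close"])
print(labeled)
```

This assumes the rows are sorted oldest first. A flat day counts as "not up" here, which is an arbitrary choice.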

Getting Started

From a high level, artificial neural networks ‘train’ on the data they encounter and, based on that data, modify themselves (i.e. learn), usually by adjusting the weights given to their input variables. Since our network will need some data about a stock, like price or volume, the first step should be setting up a function to download stock data.
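As a rough illustration of what "learning by adjusting weights" means, here is a single artificial neuron trained with the classic perceptron update rule on a toy problem (logical AND). This is only a sketch: the data, the learning rate, and the number of passes are all made-up choices, and it is not the network this series will build.

```python
import numpy as np

# Toy inputs (two features per sample) and 0/1 targets: logical AND
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

weights = np.zeros(2)
bias = 0.0
learning_rate = 0.5  # arbitrary choice for this sketch

for epoch in range(20):
    for inputs, target in zip(X, y):
        # Predict with a step function applied to the weighted sum
        prediction = 1.0 if np.dot(weights, inputs) + bias > 0 else 0.0
        # Learn: nudge the weights in proportion to the error
        error = target - prediction
        weights += learning_rate * error * inputs
        bias += learning_rate * error

# Check what the trained neuron predicts for each input
predictions = [1.0 if np.dot(weights, x) + bias > 0 else 0.0 for x in X]
print(weights, bias, predictions)
```

When a prediction is already correct the error is zero, so nothing changes. The weights only move when the neuron gets a sample wrong.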

There is a lot of stock market data available, some of it free and some for a fee. Since I’m cheap, I think a good place to start is the historical price information available through Yahoo! Finance.

Imports

What’s nice about using Yahoo! Finance is that there’s already an existing Python library for getting the data we want. In addition to that library, I’ll also be using Pandas and NumPy.

import pandas as pd
import numpy as np
import yahoo_finance as yf

import os

Getting the data

The basic calls needed to get the data from Yahoo are shown here. The three things you need are the ticker symbol, a start date, and an end date.

For this example, I’ve chosen an exchange-traded fund (ETF) that tracks the performance of the S&P 500 (ticker: SPY).

symbol   = yf.Share("SPY")
spy_data = symbol.get_historical("2011-01-01", "2016-06-30")
spy_df   = pd.DataFrame(spy_data)

spy_df.info()
spy_df.head()

RangeIndex: 1383 entries, 0 to 1382
Data columns (total 8 columns):
Adj_Close 1383 non-null object
Close 1383 non-null object
Date 1383 non-null object
High 1383 non-null object
Low 1383 non-null object
Open 1383 non-null object
Symbol 1383 non-null object
Volume 1383 non-null object
dtypes: object(8)
memory usage: 86.5+ KB

Adj_Close Close Date High Low Open Symbol Volume
0 207.205862 209.479996 2016-06-30 209.539993 206.559998 207.210007 SPY 165021900
1 204.416484 206.660004 2016-06-29 206.929993 204.720001 204.839996 SPY 137328600
2 200.994039 203.199997 2016-06-28 203.229996 201.119995 201.479996 SPY 159382400
3 197.43313 199.600006 2016-06-27 201.600006 198.649994 201.589996 SPY 230775800
4 201.033613 203.240005 2016-06-24 210.850006 202.720001 203.630005 SPY 333444400
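One detail worth noticing in the info() output: every column has dtype object, meaning the values are stored as strings rather than numbers. Here is a minimal sketch of one way to convert them, using Pandas’ to_numeric and to_datetime on two rows copied from the table above. The names raw and clean are placeholders of mine.

```python
import pandas as pd

# Two rows copied from the SPY output, stored as strings like the download
raw = pd.DataFrame({
    "Date":   ["2016-06-30", "2016-06-29"],
    "Close":  ["209.479996", "206.660004"],
    "Volume": ["165021900", "137328600"],
})

clean = raw.copy()
clean["Date"] = pd.to_datetime(clean["Date"])
for column in ["Close", "Volume"]:
    clean[column] = pd.to_numeric(clean[column])

# The download is newest-first; sort so the oldest day comes first
clean = clean.sort_values("Date").reset_index(drop=True)
print(clean.dtypes)
```

The same loop could be extended to the other price columns. Sorting oldest first is optional, but it makes any later time-based calculation easier to reason about.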

The data itself is pretty straightforward. For each day in the period between the start and end dates, you are provided the symbol, opening price, daily low, daily high, trading volume, closing price, and adjusted closing price. I think wrapping this into a function like the one I made below is a good idea.

def getHistoricalData(symbol_name, start_date, end_date, save_data=False, pathname=""):
    # Download the daily price history and load it into a DataFrame
    symbol = yf.Share(symbol_name)
    price_data = symbol.get_historical(start_date, end_date)
    price_df = pd.DataFrame(price_data)

    if save_data:
        filename = symbol_name + "_" + start_date + "_" + end_date + ".csv"
        if len(pathname) > 0:
            # Create the target directory if it doesn't exist yet
            if not os.path.exists(pathname):
                os.makedirs(pathname)
            filename = os.path.join(pathname, filename)
            print("Ticker data for {} saved to: {}".format(symbol_name, pathname))
        else:
            print("Ticker data for {} saved to local directory".format(symbol_name))
        # Write the CSV in both cases, with or without a pathname
        price_df.to_csv(filename)

    return price_df

#Testing out the function
start = "2016-01-01"
end = "2016-12-31"
path = "data"

getHistoricalData("SPY",start,end)
getHistoricalData("SPY",start,end,True,path)
getHistoricalData("AAPL",start,end,True)

Ticker data for SPY saved to: data
Ticker data for AAPL saved to local directory

Adj_Close Close Date High Low Open Symbol Volume
0 115.82 115.82 2016-12-30 117.199997 115.43 116.650002 AAPL 30253100
1 116.730003 116.730003 2016-12-29 117.110001 116.400002 116.449997 AAPL 14963300
2 116.760002 116.760002 2016-12-28 118.019997 116.199997 117.519997 AAPL 20582000
3 117.260002 117.260002 2016-12-27 117.800003 116.489998 116.519997 AAPL 18071900
4 116.519997 116.519997 2016-12-23 116.519997 115.589996 115.589996 AAPL 14181200
5 116.290001 116.290001 2016-12-22 116.510002 115.639999 116.349998 AAPL 25789800
6 117.059998 117.059998 2016-12-21 117.400002 116.779999 116.800003 AAPL 24216900
7 116.949997 116.949997 2016-12-20 117.50 116.68 116.739998 AAPL 20905800
8 116.639999 116.639999 2016-12-19 117.379997 115.75 115.800003 AAPL 27675400
9 115.970001 115.970001 2016-12-16 116.50 115.650002 116.470001 AAPL 44055400
10 115.82 115.82 2016-12-15 116.730003 115.230003 115.379997 AAPL 46232200
11 115.190002 115.190002 2016-12-14 116.199997 114.980003 115.040001 AAPL 33433200
12 115.190002 115.190002 2016-12-13 115.919998 113.75 113.839996 AAPL 43167500
13 113.300003 113.300003 2016-12-12 115.00 112.489998 113.290001 AAPL 26149100
14 113.949997 113.949997 2016-12-09 114.699997 112.309998 112.309998 AAPL 34274100
15 112.120003 112.120003 2016-12-08 112.43 110.599998 110.860001 AAPL 26818500
16 111.029999 111.029999 2016-12-07 111.190002 109.160004 109.260002 AAPL 29853000
17 109.949997 109.949997 2016-12-06 110.360001 109.190002 109.50 AAPL 26075900
18 109.110001 109.110001 2016-12-05 110.029999 108.25 110.00 AAPL 34037300
19 109.900002 109.900002 2016-12-02 110.089996 108.849998 109.169998 AAPL 26409800
20 109.489998 109.489998 2016-12-01 110.940002 109.029999 110.370003 AAPL 36825800
21 110.519997 110.519997 2016-11-30 112.199997 110.269997 111.599998 AAPL 35765000
22 111.459999 111.459999 2016-11-29 112.029999 110.07 110.779999 AAPL 28459300
23 111.57 111.57 2016-11-28 112.470001 111.389999 111.43 AAPL 27026600
24 111.790001 111.790001 2016-11-25 111.870003 110.949997 111.129997 AAPL 11424400
25 111.230003 111.230003 2016-11-23 111.510002 110.330002 111.360001 AAPL 27387900
26 111.800003 111.800003 2016-11-22 112.419998 111.400002 111.949997 AAPL 25922600
27 111.730003 111.730003 2016-11-21 111.989998 110.010002 110.120003 AAPL 29119100
28 110.059998 110.059998 2016-11-18 110.540001 109.660004 109.720001 AAPL 27404300
29 109.949997 109.949997 2016-11-17 110.349998 108.830002 109.809998 AAPL 26964600
... ... ... ... ... ... ... ... ...
222 95.049622 96.639999 2016-02-16 96.849998 94.610001 95.019997 AAPL 49057900
223 92.44323 93.989998 2016-02-12 94.50 93.010002 94.190002 AAPL 40351400
224 92.158002 93.699997 2016-02-11 94.720001 92.589996 93.790001 AAPL 50074700
225 92.718621 94.269997 2016-02-10 96.349998 94.099998 95.919998 AAPL 42343600
226 93.426774 94.989998 2016-02-09 95.940002 93.93 94.290001 AAPL 44331200
227 93.446449 95.010002 2016-02-08 95.699997 93.040001 93.129997 AAPL 54021400
228 92.472736 94.019997 2016-02-05 96.919998 93.690002 96.519997 AAPL 46418100
229 95.010279 96.599998 2016-02-04 97.330002 95.190002 95.860001 AAPL 46471700
230 94.252947 96.349998 2016-02-03 96.839996 94.080002 95.00 AAPL 45964300
231 92.423652 94.480003 2016-02-02 96.040001 94.279999 95.419998 AAPL 37357200
232 94.331208 96.43 2016-02-01 96.709999 95.400002 96.470001 AAPL 40943500
233 95.221398 97.339996 2016-01-29 97.339996 94.349998 94.790001 AAPL 64416500
234 92.042134 94.089996 2016-01-28 94.519997 92.389999 93.790001 AAPL 55678800
235 91.386718 93.419998 2016-01-27 96.629997 93.339996 96.040001 AAPL 133369700
236 97.813722 99.989998 2016-01-26 100.879997 98.07 99.93 AAPL 75077000
237 97.275697 99.440002 2016-01-25 101.529999 99.209999 101.519997 AAPL 51794500
238 99.212599 101.419998 2016-01-22 101.459999 98.370003 98.629997 AAPL 65800500
239 94.20404 96.300003 2016-01-21 97.879997 94.940002 97.059998 AAPL 52161500
240 94.683373 96.790001 2016-01-20 98.190002 93.419998 95.099998 AAPL 72334400
241 94.556205 96.660004 2016-01-19 98.650002 95.50 98.410004 AAPL 53087700
242 95.015969 97.129997 2016-01-15 97.709999 95.360001 96.199997 AAPL 79010000
243 97.35395 99.519997 2016-01-14 100.480003 95.739998 97.959999 AAPL 63170100
244 95.270312 97.389999 2016-01-13 101.190002 97.300003 100.32 AAPL 62439600
245 97.784376 99.959999 2016-01-12 100.690002 98.839996 100.550003 AAPL 49154200
246 96.3855 98.529999 2016-01-11 99.059998 97.339996 98.970001 AAPL 49739400
247 94.849671 96.959999 2016-01-08 99.110001 96.760002 98.550003 AAPL 70798000
248 94.350769 96.449997 2016-01-07 100.129997 96.43 98.68 AAPL 81094400
249 98.508268 100.699997 2016-01-06 102.370003 99.870003 100.559998 AAPL 68457400
250 100.474523 102.709999 2016-01-05 105.849998 102.410004 105.75 AAPL 55791000
251 103.057063 105.349998 2016-01-04 105.370003 102.00 102.610001 AAPL 67649400

252 rows × 8 columns

So now we can grab data and, if desired, save that data locally to a directory of our choosing, which seems good enough for now. The obvious issue with this function is that I do no error checking of the ticker symbol or the dates. Since this code is most likely just for my own consumption, I’ll let that pass. I can see going back to this function later, after the network has been implemented and tested with the variables that the above code fetches. Depending on how accurate the network is at forecasting, I may add other factors derived from our data, such as simple moving averages.
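For reference, a simple moving average is easy to compute with Pandas’ rolling window. This is a minimal sketch, assuming made-up closing prices that are already numeric and sorted oldest first, with an arbitrary 3-day window:

```python
import pandas as pd

# Made-up closing prices, oldest first (illustrative only)
closes = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="Close")

# 3-day simple moving average; the first two days lack enough history
sma_3 = closes.rolling(window=3).mean()
print(sma_3.tolist())
```

The first two entries come out as NaN because there aren’t three days of history yet, so those rows would need to be dropped or filled before being fed to a network.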

This entry feels pretty heavy on words and light on code, which seems bad for a coding blog. That should change, hopefully, with the next entry. For part 2, I’ll be writing the code that implements a basic neural network.

Thanks for reading! Please feel free to leave any comments or point out any bugs in the code above!

See Part 2 of the series here.

6 thoughts on “Neural Networks and the Stock Market Pt. 1”

  1. Hi,

    Very interesting stuff. I’m working on training a neural network on stock market data too. I like your method. It seems to be the best way. However, I wonder if

  2. … I wonder if you can somehow save from Yahoo the historical data for moving averages, EPS, PE Ratio, things like that. The predictive ability of your neural network is only as good as your features (inputs), from my understanding. Yahoo has a list of commands to request these numbers individually, such as:

    get_percent_change_from_year_high()
    get_percent_change_from_year_low()
    get_change_from_year_low()
    get_change_from_year_high()
    get_percent_change_from_200_day_moving_average()
    get_change_from_200_day_moving_average()
    get_percent_change_from_50_day_moving_average()
    get_change_from_50_day_moving_average()
    get_EPS_estimate_next_quarter()
    get_EPS_estimate_next_year()
    get_ex_dividend_date()
    get_EPS_estimate_current_year()
    get_price_EPS_estimate_next_year()
    get_price_EPS_estimate_current_year()
    get_one_yr_target_price()
    get_change_percent_change()

    …But I wonder how to compile all of these numbers for each trading day during your specified time period. I’m very new to both python and programming, and predictive analytics as well, but I have some knowledge of trading in the stock market. Please, I’d like to e-mail with you to collaborate on this.

  3. One last comment.

    I’m following along, using the code from your examples (Very nice, easy to read), and I noticed an error from line 10:

    filename = pathname + "\\" + symbol_name + "_" + start_date + "_" end_date + ".csv"

    should actually be:

    filename = pathname + "\\" + symbol_name + "_" + start_date + "_" + end_date + ".csv"

    It works with the + sign before end_date.

    Ty,

    Matt
