How to Predict Stock Prices in Python with TensorFlow 2 and Keras - BFW Blog

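Predicting stock prices has long fascinated investors and researchers. Investors are always guessing whether a stock's price will rise, but the market involves many complex financial indicators that only investors and people with solid financial knowledge can interpret, so its movements are very hard for ordinary people to read.

For non-experts, machine learning offers a chance to make data-driven predictions and possibly earn steadier returns, and it can help experts find the most informative indicators and make better predictions.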

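The goal of this tutorial is to build a neural network in TensorFlow 2 and Keras that predicts stock market prices. More specifically, we will build a recurrent neural network with LSTM cells, a widely used technique for time series forecasting.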

1. Setting Up the Environment

All right, let's get started. First, you need to install TensorFlow 2 and the other libraries:

pip3 install tensorflow pandas numpy matplotlib yahoo_fin scikit-learn
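To confirm the install worked before moving on, a small check like the one below can report which packages are importable. This is a minimal sketch: `check_packages` is a hypothetical helper written for this example, not part of any library. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# import names of the packages installed above
# (scikit-learn is imported as `sklearn`)
REQUIRED = ["tensorflow", "pandas", "numpy", "matplotlib", "yahoo_fin", "sklearn"]

def check_packages(names):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: find_spec(name) is not None for name in names}

status = check_packages(REQUIRED)
for name, ok in status.items():
    print(f"{name:12s} {'ok' if ok else 'MISSING'}")
```

Any package reported as MISSING can be installed again with pip before continuing.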

Once everything is set up, open a new Python file (or bfwstudio) and import the following libraries:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from yahoo_fin import stock_info as si
from collections import deque

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import time
import os
import random

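We are using the yahoo_fin module, which is essentially a Python scraper that extracts financial data from the Yahoo Finance platform, so it is not a reliable API. Feel free to use another data source such as Alpha Vantage.

We also need to make sure we get stable results across training/testing runs; setting a seed helps: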

# set seed, so we can get the same results after rerunning several times
np.random.seed(314)
tf.random.set_seed(314)
random.seed(314)
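As a quick illustration of why seeding helps, the sketch below reseeds Python's `random` module and NumPy and checks that the same draws come back. `draw` is a hypothetical helper for this example. TensorFlow is left out only to keep the example light; `tf.random.set_seed` is intended to play the same role for TensorFlow's random ops.

```python
import random
import numpy as np

def draw(seed):
    """Reseed both generators and return a few random numbers from each."""
    random.seed(seed)
    np.random.seed(seed)
    return [random.random() for _ in range(3)], np.random.rand(3)

py_a, np_a = draw(314)
py_b, np_b = draw(314)
py_c, np_c = draw(315)

# same seed gives the same numbers
print(py_a == py_b, np.array_equal(np_a, np_b))
# a different seed gives different numbers
print(py_a == py_c)
```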

2. Preparing the Dataset

As a first step, we need to write a function that downloads the dataset from the Internet and preprocesses it:

def load_data(ticker, n_steps=50, scale=True, shuffle=True, lookup_step=1,
              test_size=0.2, feature_columns=['adjclose', 'volume', 'open', 'high', 'low']):
    """
    Loads data from Yahoo Finance source, as well as scaling, shuffling, normalizing and splitting.
    Params:
        ticker (str/pd.DataFrame): the ticker you want to load, examples include AAPL, TSLA, etc.
        n_steps (int): the historical sequence length (i.e window size) used to predict, default is 50
        scale (bool): whether to scale prices from 0 to 1, default is True
        shuffle (bool): whether to shuffle the data, default is True
        lookup_step (int): the future lookup step to predict, default is 1 (e.g next day)
        test_size (float): ratio for test data, default is 0.2 (20% testing data)
        feature_columns (list): the list of features to use to feed into the model, default is everything grabbed from yahoo_fin
    """
    # see if ticker is already a loaded stock from yahoo finance
    if isinstance(ticker, str):
        # load it from yahoo_fin library
        df = si.get_data(ticker)
    elif isinstance(ticker, pd.DataFrame):
        # already loaded, use it directly
        df = ticker
    else:
        raise TypeError("ticker can be either a str or a `pd.DataFrame` instance")
    # this will contain all the elements we want to return from this function
    result = {}
    # we will also return the original dataframe itself
    result['df'] = df.copy()
    # make sure that the passed feature_columns exist in the dataframe
    for col in feature_columns:
        assert col in df.columns, f"'{col}' does not exist in the dataframe."
    if scale:
        column_scaler = {}
        # scale each feature column (prices and volume) from 0 to 1
        for column in feature_columns:
            scaler = preprocessing.MinMaxScaler()
            df[column] = scaler.fit_transform(np.expand_dims(df[column].values, axis=1))
            column_scaler[column] = scaler
        # add the MinMaxScaler instances to the result returned
        result["column_scaler"] = column_scaler
    # add the target column (label) by shifting by `lookup_step`
    df['future'] = df['adjclose'].shift(-lookup_step)
    # the last `lookup_step` rows contain NaN in the future column
    # get them before dropping NaNs
    last_sequence = np.array(df[feature_columns].tail(lookup_step))
    # drop NaNs
    df.dropna(inplace=True)
    sequence_data = []
    sequences = deque(maxlen=n_steps)
    for entry, target in zip(df[feature_columns].values, df['future'].values):
        sequences.append(entry)
        if len(sequences) == n_steps:
            sequence_data.append([np.array(sequences), target])
    # get the last sequence by appending the last `n_steps` sequence with `lookup_step` sequence
    # for instance, if n_steps=50 and lookup_step=10, last_sequence should be of 60 (that is 50+10) length
    # this last_sequence will be used to predict future stock prices not available in the dataset
    last_sequence = list(sequences) + list(last_sequence)
    last_sequence = np.array(last_sequence)
    # add to result
    result['last_sequence'] = last_sequence
    # construct the X's and y's
    X, y = [], []
    for seq, target in sequence_data:
        X.append(seq)
        y.append(target)
    # convert to numpy arrays
    X = np.array(X)
    y = np.array(y)
    # reshape X to (samples, n_features, n_steps) to fit the neural network
    # note: reshape reinterprets the flat data, it does not transpose the axes
    X = X.reshape((X.shape[0], X.shape[2], X.shape[1]))
    # split the dataset
    result["X_train"], result["X_test"], result["y_train"], result["y_test"] = train_test_split(
        X, y, test_size=test_size, shuffle=shuffle)
    # return the result
    return result

This function is long but handy; it accepts several arguments to make it as flexible as possible.

The ticker argument is the stock we want to load; for example, you can use TSLA for Tesla, AAPL for Apple, and so on.
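Since the docstring says `ticker` may also be an already loaded `pd.DataFrame`, data from another source can be passed in as long as it has the expected column names. The sketch below builds such a frame from CSV text. The sample rows and the capitalized source column names are made up for illustration; a real export may use different headers.

```python
import io
import pandas as pd

# made-up sample rows standing in for a CSV exported from another data source
csv_text = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,10.0,10.5,9.8,10.2,10.1,1000
2020-01-03,10.2,10.9,10.1,10.8,10.7,1500
2020-01-06,10.8,11.0,10.4,10.5,10.4,1200
"""

df = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# rename to the lowercase names that load_data's default feature_columns expect
df = df.rename(columns={
    "Open": "open", "High": "high", "Low": "low",
    "Close": "close", "Adj Close": "adjclose", "Volume": "volume",
})

expected = ['adjclose', 'volume', 'open', 'high', 'low']
missing = [col for col in expected if col not in df.columns]
print(df.head())
print("missing columns:", missing)
```

A frame like this could then be passed as `load_data(df)` instead of a ticker string.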

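n_steps is an integer giving the length of the historical sequence we want to use; some people call it the window size. Recall that we are going to use a recurrent neural network, so we need to feed sequence data into the network. Choosing 50 means we will use 50 days of stock prices to predict the next day.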
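To make the window idea concrete, here is a minimal sketch of the same deque-based windowing on a toy price list. `make_windows` is a hypothetical helper that mirrors the loop inside load_data, and the prices are made up.

```python
from collections import deque
import numpy as np

def make_windows(values, targets, n_steps):
    """Pair each full window of `n_steps` consecutive values with its target."""
    window = deque(maxlen=n_steps)  # the oldest entry drops out automatically
    pairs = []
    for value, target in zip(values, targets):
        window.append(value)
        if len(window) == n_steps:
            pairs.append((np.array(window), target))
    return pairs

prices = [10, 11, 12, 13, 14, 15]
# the target for each day is the next day's price (the last day has none)
pairs = make_windows(prices[:-1], prices[1:], n_steps=3)
for X, y in pairs:
    print(X, "->", y)
```

With a window of 3, the first pair uses the prices 10, 11, 12 to predict 13, and each later pair slides forward by one day.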

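scale is a boolean indicating whether to scale prices to the range 0 to 1. We set it to True because scaling large values into the 0 to 1 range helps the neural network learn faster and more effectively.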
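The scaling step can be tried on its own with a few made-up prices. This sketch uses scikit-learn's `MinMaxScaler` with its default range of 0 to 1, and also shows why load_data keeps the fitted scalers: they can map scaled values back to real prices.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# made-up prices for illustration
prices = np.array([100.0, 150.0, 125.0, 200.0])

scaler = MinMaxScaler()
# the scaler expects a 2D array of shape (n_samples, n_features)
scaled = scaler.fit_transform(prices.reshape(-1, 1))
print(scaled.ravel())

# the fitted scaler can map scaled values (e.g. model outputs) back to prices
restored = scaler.inverse_transform(scaled)
print(restored.ravel())
```

The minimum price maps to 0 and the maximum to 1, with everything else placed linearly in between.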

lookup_step is how many steps into the future to predict; the default is 1 (i.e. the next day).
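The effect of lookup_step is easiest to see on a tiny frame. The sketch below uses made-up prices and a lookup_step of 2 to show how shifting creates the label column and leaves NaN in the final rows.

```python
import pandas as pd

# made-up adjusted close prices
df = pd.DataFrame({"adjclose": [10.0, 11.0, 12.0, 13.0, 14.0]})
lookup_step = 2

# each row's label is the adjusted close `lookup_step` rows later
df["future"] = df["adjclose"].shift(-lookup_step)
print(df)

# the last `lookup_step` rows have no future value yet, so they are dropped
labeled = df.dropna()
print(len(df) - len(labeled), "rows dropped")
```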

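We will use all the features available in this dataset: the open, high, low, volume, and adjusted close. See this tutorial to learn more about these indicators.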

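The function above does the following:

First, it loads the dataset using the stock_info.get_data() function from the yahoo_fin module.
If the scale argument is passed as True, it scales all the feature columns (prices and volume) to the range 0 to 1 using sklearn's MinMaxScaler class. Note that each column has its own scaler.
Then, it adds the future column, which holds the target values (the labels, or y, to predict), by shifting the adjusted close column by lookup_step.
After that, it shuffles and splits the data and returns the result.
To better understand the code, I strongly suggest printing the output variable (result) by hand and seeing how the features and labels are made.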
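One way to inspect the returned dictionary is to print each entry's shape rather than its full contents. The sketch below is only an illustration: `summarize` is a hypothetical helper, and `fake_result` is a stand-in with made-up shapes so the example runs without downloading data.

```python
import numpy as np

def summarize(result):
    """Return one description line per entry: shape for arrays, type otherwise."""
    lines = []
    for key, value in result.items():
        if isinstance(value, np.ndarray):
            lines.append(f"{key}: array with shape {value.shape}")
        else:
            lines.append(f"{key}: {type(value).__name__}")
    return lines

# stand-in dict with made-up shapes; in practice pass the dict returned by load_data
fake_result = {
    "X_train": np.zeros((80, 5, 50)),
    "y_train": np.zeros(80),
    "column_scaler": {},
}
for line in summarize(fake_result):
    print(line)
```

In practice, `summarize(load_data("AAPL"))` would give a similar overview of the real arrays.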

3. Building the Model

Now that we have a proper function to load and prepare the dataset, we need another core function to build...
