Deep Learning for Event-Driven Stock Prediction
Reading time: 2019/3/30-2019/4/12 Theme: Deep learning; CNN; NLP
Abstract:
We propose a deep learning method for event-driven stock market prediction. First, events are extracted from news text, and represented as dense vectors, trained using a novel neural tensor network. Second, a deep convolutional neural network is used to model both short-term and long-term influences of events on stock price movements. Experimental results show that our model can achieve nearly 6% improvements on S&P 500 index prediction and individual stock prediction, respectively, compared to state-of-the-art baseline methods. In addition, market simulation results show that our system is more capable of making profits than previously reported systems trained on S&P 500 stock historical data.
Main content:
First part:Introduction
1.As web information grows, recent work has applied Natural Language Processing (NLP) techniques to explore financial news for predicting market volatility.
2.Pioneering work mainly uses simple features from news documents, such as bags-of-words, noun phrases, and named entities.
3.One disadvantage of structured representations of events is that they lead to increased sparsity, which potentially limits the predictive power.
4.Ding et al. [2014] show that the performance of daily prediction is better than weekly and monthly prediction.
5.Despite the relatively weaker effects of long-term events, the volatility of stock markets is still affected by them. However, little previous work quantitatively models the combined short-term and long-term effects of events. To fill this gap, we treat historical news as daily event sequences, using a convolutional neural network (CNN) to perform semantic composition over the input event sequence, and a pooling layer to extract the most representative global features. Then a feedforward neural network is used to associate the global features with stock trends through a shared hidden layer and an output layer.
6.Experiments on large-scale financial news datasets from Reuters and Bloomberg show that event embeddings can effectively address the problem of event sparsity.
Second part: Neural Tensor Network for Learning Event Embeddings
1.Our goal is to automatically learn embeddings for structured event tuples E = (O1, P, O2), which draw more fundamental relations between events, even if they do not share the same action, actor or object.
2.Our task is related to previous work on learning distributed representations of multi-relational data from knowledge bases [Bordes et al., 2011; Socher et al., 2013], which learns the embedding of (e1, R, e2), where e1 and e2 are named entities and R is the relation type. However, learning structured event embedding has two significant differences.
First, the number of relation types in knowledge bases is limited.
Second, the goal of relational database embedding is to be able to state whether two entities (e1, e2) are in a certain relation R. When R is symmetric, e1 and e2 have interchangeable roles.
Neural Tensor Network
3.As most event arguments consist of several words, we represent the actor, action and object each as the average of its word embeddings, allowing the sharing of statistical strength between the words describing each component.
4.We extract more than 10 million events from Reuters financial news and Bloomberg financial news as the training data for event embeddings. The training algorithm repeats for N iterations over the training examples, which are a set of event tuples E = (O1, P, O2), extracted from the training corpus using the method in Section 2.1. In each iteration, the training procedure is shown in Algorithm 1.
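To make the idea of the event embedding more concrete, here is a minimal NumPy sketch of a forward pass through a neural tensor network. It is only an illustration under my own assumptions: the sizes d and k, the toy vocabulary, the random weights, and the helper names (average_embedding, tensor_layer, init_layer, event_embedding) are all hypothetical, and the three-layer wiring (actor with action, action with object, then the two results combined) is my reading of the paper rather than something spelled out in these notes. Training (Algorithm 1) is not shown.

```python
import numpy as np

def average_embedding(words, vocab):
    """Represent a multi-word argument as the mean of its word vectors."""
    return np.mean([vocab[w] for w in words], axis=0)

def tensor_layer(a, b, T, W, bias):
    """Compute tanh(a^T T[1:k] b + W [a; b] + bias) for one tensor layer."""
    bilinear = np.einsum("i,kij,j->k", a, T, b)  # one scalar per tensor slice
    linear = W @ np.concatenate([a, b])          # standard feedforward term
    return np.tanh(bilinear + linear + bias)

def init_layer(n_in, k, rng):
    """Small random parameters for a layer whose two inputs have size n_in."""
    return {
        "T": rng.normal(0.0, 0.1, (k, n_in, n_in)),
        "W": rng.normal(0.0, 0.1, (k, 2 * n_in)),
        "bias": np.zeros(k),
    }

def event_embedding(o1, p, o2, layers):
    """Combine actor, action and object multiplicatively into one vector."""
    r1 = tensor_layer(o1, p, **layers["actor_action"])
    r2 = tensor_layer(p, o2, **layers["action_object"])
    return tensor_layer(r1, r2, **layers["combine"])

# Toy setup: sizes, words and weights are made up for illustration only.
rng = np.random.default_rng(0)
d, k = 8, 4
vocab = {w: rng.normal(size=d) for w in ["acme", "sues", "globex", "corp"]}
layers = {
    "actor_action": init_layer(d, k, rng),
    "action_object": init_layer(d, k, rng),
    "combine": init_layer(k, k, rng),
}

o1 = average_embedding(["acme"], vocab)            # actor
p = average_embedding(["sues"], vocab)             # action
o2 = average_embedding(["globex", "corp"], vocab)  # object (two words)
u = event_embedding(o1, p, o2, layers)
print(u.shape)
```

The bilinear term is what distinguishes this from a plain feedforward layer: each tensor slice multiplies the two arguments together directly, instead of only adding their weighted values.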
Deep Prediction Model
1.We model long-term events as events over the past month, mid-term events as events over the past week, and short-term events as events on the past day of the stock price change. As shown in Figure 3, the prediction model learns the effect of these three different time spans on stock prices based on the framework of a CNN.
2.For our task, it is necessary to utilize all local features and predict stock price movements globally. Hence, we use a max pooling layer on top of the convolutional layer, which forces the network to retain only the most useful local features produced by the convolutional layer. Note that the convolution operation is only applied to the long-term and mid-term event embeddings, because the unit of timing is one day.
3.To correlate the feature vector VC and stock prices, we use a feedforward neural network with one hidden layer and one output layer.
4.We use financial news from Reuters and Bloomberg over the period from October 2006 to November 2013, released by Ding et al. [2014]. Radinsky et al. [2012] and Ding et al. [2014] show that news titles are more useful for prediction than news contents. This paper extracts events only from news titles. We conduct our experiments on predicting the Standard & Poor’s 500 stock (S&P 500) index and its individual stocks, obtaining indices and prices from Yahoo Finance. Detailed statistics of the training, development (tuning) and test sets are shown in Table 1.
5.In contrast to the baselines, we use a neural tensor network to learn event embeddings for representing news documents, and build a prediction model based on a deep CNN.
6.Event embedding is useful for the task of stock market prediction. Given the same prediction model (CNN or NN), the event embeddings based methods (EB-NN and EB-CNN) achieve consistently better performance than the events-based methods (E-NN and E-CNN). This is likely due to the following reasons.
First, low-dimensional dense vector can effectively alleviate the problem of feature sparsity.
Second, we can learn deeper semantic relations between event embeddings, by modeling the semantic compositionality over word embeddings.
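The prediction model from items 1 to 3 of this section can be sketched as a forward pass in NumPy. This is a rough sketch under my own assumptions, not the authors' implementation: I assume one averaged event vector per day, 30 days for the long-term sequence and 7 for the mid-term one, a convolution window of 3 days, random untrained weights, and a single sigmoid output read as the probability of a price increase. The function and parameter names are hypothetical.

```python
import numpy as np

def conv_max_pool(days, W, window=3):
    """Slide a window over daily event vectors, then max-pool over time."""
    n = days.shape[0]
    # Each row is the concatenation of `window` consecutive daily vectors.
    windows = np.stack([days[j:j + window].ravel() for j in range(n - window + 1)])
    feature_maps = windows @ W        # shape: (n - window + 1, d_out)
    return feature_maps.max(axis=0)   # keep the strongest response per feature

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_up_probability(long_term, mid_term, short_term, params):
    """Join long-, mid- and short-term features and score them with one hidden layer."""
    v_long = conv_max_pool(long_term, params["W_long"])
    v_mid = conv_max_pool(mid_term, params["W_mid"])
    # No convolution for the short-term input: it is a single day.
    v_c = np.concatenate([v_long, v_mid, short_term])
    hidden = sigmoid(params["W_hidden"] @ v_c)
    return float(sigmoid(params["w_out"] @ hidden))

# Toy inputs and untrained weights, for illustration only.
rng = np.random.default_rng(0)
d, d_out, h, window = 8, 8, 6, 3
params = {
    "W_long": rng.normal(0.0, 0.1, (window * d, d_out)),
    "W_mid": rng.normal(0.0, 0.1, (window * d, d_out)),
    "W_hidden": rng.normal(0.0, 0.1, (h, 2 * d_out + d)),
    "w_out": rng.normal(0.0, 0.1, h),
}
long_term = rng.normal(size=(30, d))  # past month, one vector per day
mid_term = rng.normal(size=(7, d))    # past week
short_term = rng.normal(size=d)       # previous day

prob_up = predict_up_probability(long_term, mid_term, short_term, params)
print(prob_up)
```

Max pooling is what lets a sequence of any length collapse into a fixed-size feature vector, so the month-long and week-long inputs end up comparable to the single-day one.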
Market Simulation(a strategy)
7.We simulate real stock trading by following the strategy proposed by Lavrenko et al. [2000], which mimics the behavior of a daily trader who uses our model in a simple way. If the model indicates that an individual stock price will increase the next day, the fictitious trader will invest in $10,000 worth of that stock at the opening price. After a purchase, the trader will hold the stock for one day. During the holding time, if the stock can make a profit of 2% or more, the trader sells immediately. Otherwise, at the end of the day, the trader sells the stock at the closing price. The same strategy is used for shorting, if the model indicates that an individual stock price will decrease. If the trader can buy the stock at a price 1% lower than shorted, he/she buys the stock to cover. Otherwise, the trader buys the stock at the closing price.
8.We notice that if there is no news reported for an individual stock on the previous day, the previously reported systems cannot predict the trend of the stock price movements on that day, because they do not leverage long-term and mid-term news. This does not hurt the evaluation results of accuracy and MCC, but can hurt the real profit.
9.Despite 50 years of studies from the fields of finance, computer science and other research communities, the debate continues over what kinds of information can be useful for stock market prediction. In Artificial Intelligence (AI), three sources of information have been the most exploited for algorithmic stock market prediction.
First, some prediction techniques leverage historical and time-series data.
Second, with advances of NLP techniques, various studies have found that financial news can dramatically affect the share price of a security.
Apart from events, sentiment is another perspective of deep semantic analysis of news documents.
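Going back to the market simulation in item 7 above, the one-day trading rule can be written as a small function. This is a sketch under assumptions of my own: I only have daily open/high/low/close prices in mind, so the day's high and low are used to decide whether the 2% or 1% threshold was touched, and transaction costs are ignored. The function name and its parameters are hypothetical.

```python
def day_trade_profit(signal, open_, high, low, close,
                     capital=10_000.0, take_profit=0.02, cover_drop=0.01):
    """Profit in dollars for one day of the simple long/short strategy.

    signal > 0 means the model predicts a rise, signal < 0 a fall.
    """
    shares = capital / open_
    if signal > 0:
        # Long: sell as soon as the price is 2% above the opening price,
        # otherwise sell at the close.
        target = open_ * (1 + take_profit)
        exit_price = target if high >= target else close
        return shares * (exit_price - open_)
    if signal < 0:
        # Short: cover as soon as the price is 1% below the opening price,
        # otherwise cover at the close.
        target = open_ * (1 - cover_drop)
        exit_price = target if low <= target else close
        return shares * (open_ - exit_price)
    return 0.0  # no prediction, no trade

# Toy prices, made up for illustration.
long_hit = day_trade_profit(+1, open_=100.0, high=103.0, low=99.0, close=101.0)
short_miss = day_trade_profit(-1, open_=100.0, high=101.0, low=99.5, close=100.5)
print(long_hit, short_miss)
```

In the first toy call the long position reaches the 2% target, so the profit is about $200; in the second the short never falls 1%, so it is covered at the close for a loss of about $50.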
Conclusion
We demonstrated that deep learning is useful for event-driven stock price movement prediction by proposing a novel neural tensor network for learning event embeddings, and using a deep convolutional neural network to model the combined influence of long-term events and short-term events on stock price movements. Experimental results showed that event embeddings-based document representations are better than discrete events-based methods, and that a deep convolutional neural network can capture the longer-term influence of news events better than a standard feedforward neural network. In market simulation, a simple greedy strategy allowed our model to yield more profit compared with previous work.
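Some reflections on and a summary of this paper:
1.This paper, on applying deep learning to event-driven stock prediction, was published in 2015 at the top-tier conference IJCAI (International Joint Conference on Artificial Intelligence) by Ting Liu's team (whose main focus is NLP) at the main campus of Harbin Institute of Technology. Its main innovation is to extract events from news text and represent them as dense vectors trained with a novel neural tensor network. The data used is the S&P 500 index, which covers 500 US listed companies. NLP event extraction turns news into structured information (a structured representation), and a deep learning CNN makes the prediction. The results show an improvement of nearly 6% over state-of-the-art prediction methods.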
2.using open information extraction(actor+action+object)
3.structured representation => increase sparsity => limited predictive power => event embeddings => dense vectors.
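4.A novel neural tensor network: it can learn the semantic compositionality over event arguments by combining them multiplicatively, instead of only implicitly as in a standard neural network.
5.Although the effects of long-term events are relatively weaker, the volatility of the stock market is still affected by them, and few previous quantitative models combine the effects of short-term and long-term events.
6.(news to sequences, CNN) A CNN performs semantic composition over the input event sequence, a pooling layer extracts the most representative global features, and a feedforward neural network associates the global features with stock trends through a shared hidden layer and an output layer.
7.CNN + long-term history data: the goal is to automatically learn embeddings for structured event tuples (which draw more fundamental relations between events, even if they do not share the same action, actor or object).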
8.Neural tensor network:input(word embedding),output(event embedding)
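9.Our task is related to previous work on learning distributed representations of multi-relational data from knowledge bases. The differences: there, the number of relation types is limited, and the goal is to decide whether two entities are in a given relation.
10.As most event arguments consist of several words, we represent the actor, action and object each as the average of its word embeddings, which allows statistical strength to be shared between the words describing each component.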
11.news titles are more useful for prediction compared to news contents.
12.Experiment: S&P 500 index, data from Yahoo Finance; metrics are accuracy (Acc) and MCC (Matthews correlation coefficient); using bag-of-words, SVM.
13.In the baseline: a standard feedforward neural network. In the new model: a neural tensor network to learn event embeddings and a deep CNN.
Contrast: WB-NN, WB-CNN, E-NN, E-CNN, EB-NN, EB-CNN.
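14.The experimental validation uses a simulated market trade, mimicking a daily trader who uses the paper's model in a simple way: if the model indicates that a stock's price will rise the next day, the fictitious trader invests $10,000 in that stock at the opening price. (A strategy is fixed, and thresholds control how far each trade goes.) Long-term and mid-term news are each used in predicting the stock price on a given day.
15.Conclusion: the paper proposes a novel neural tensor network for learning event embeddings and uses a deep convolutional neural network to model the combined influence of long-term and short-term events on stock price movements, demonstrating that deep learning is useful for event-driven stock price movement prediction. The experimental results show that document representations based on event embeddings are better than methods based on discrete events, and that a deep convolutional neural network captures the longer-term influence of news events better than a standard feedforward neural network. In the market simulation, a simple greedy strategy allowed the model to yield more profit than previous work.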
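Item 12 above mentions that results are reported with accuracy and MCC. For reference, here is a small sketch of how both can be computed for up/down labels coded as +1 and -1. The function name and the toy labels are my own, and returning 0 when the denominator is zero is a common convention that I am assuming here, not something taken from the paper.

```python
import math

def accuracy_and_mcc(y_true, y_pred):
    """Accuracy and Matthews correlation coefficient for labels in {+1, -1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)    # predicted up, went up
    tn = sum(t == -1 and p == -1 for t, p in pairs)  # predicted down, went down
    fp = sum(t == -1 and p == 1 for t, p in pairs)   # predicted up, went down
    fn = sum(t == 1 and p == -1 for t, p in pairs)   # predicted down, went up
    acc = (tp + tn) / len(pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc

# Toy labels, made up for illustration.
acc, mcc = accuracy_and_mcc([1, 1, -1, -1], [1, -1, -1, -1])
print(acc, mcc)
```

Unlike accuracy, MCC stays near zero for a model that always predicts the majority direction, which is why it is a useful second metric when up days and down days are not balanced.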
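Summary: this paper mainly uses NLP and CNN techniques to study the influence of long-term and mid-term stock news on the US stock index and individual stocks. It was published in 2015, when deep learning with neural networks was developing rapidly, so it was fairly novel. At the end of the paper the authors also point out that the main current approaches to stock price prediction are:
(1)using a span of historical stock data as a time series;
(2)using natural language processing techniques, where various studies have found that financial news can dramatically affect share prices;
(3)deep sentiment analysis of news, which is another direction.
Because the authors did not provide source code or the dataset, the experimental results cannot be reproduced, but the authors' line of thought is very clear. Next, I will also do further research on news.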
Source: oschina
Link: https://my.oschina.net/u/4339501/blog/3576165