Classifying News Stories to Estimate the Direction of a Stock Market Index (Brett Drury, Luis
Torgo and J. J. Almeida)[1]
Hao Fu, Jiatong Ruan
Introduction
Background
Timely information from news -> Prediction of the prospects of economic
○ manually created rules
○ models learnt from manually selected data and manually constructed dictionaries
Manually organize news stories
to automatically classify
Alignment of news stories to market movement [6]
where the company names are on headlines.[6]
Manually constructed rules with automatically constructed dictionaries
Alignment of stories with sharp market movement
Self-training to construct a model to classify news stories
Fig 1: Proposed Classification
Amount: more than 300,000 news stories
News Source: Really Simple Syndication (RSS) feeds
Time Period: Oct. 2008 - Jun. 2010; the crawler ran at the same time each day
Database: RDBMS storing headline, description, published date and story text
Stock Data: Yahoo Finance
Data pre-processing:
indexes and company employees.
event or sentiment phrases
Economic Actor (company,
Verb/Adj.
Object (profits, unemployment, etc.)
Classified as positive or negative
Unclassified
Fig 2: Rule Classifier Model
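The actor / verb-or-adjective / object pattern in Fig 2 can be illustrated with a small sketch. This is not the authors' rule set: the dictionaries below are hypothetical and hand-written for illustration (the slides say the paper's dictionaries are constructed automatically), and the polarity logic is an assumption about how such a rule could combine a direction word with an object.

```python
# Hypothetical dictionaries, hand-written for illustration only.
ACTORS = {"company", "bank", "economy"}
# +1: an increase in this object is good news, -1: an increase is bad news.
OBJECTS = {"profits": 1, "unemployment": -1}
# +1: the word signals an increase, -1: it signals a decrease.
DIRECTION = {"rise": 1, "rises": 1, "higher": 1,
             "fall": -1, "falls": -1, "lower": -1}


def classify_phrase(tokens):
    """Return 'positive', 'negative', or 'unclassified' for a tokenized phrase."""
    words = [t.lower() for t in tokens]
    has_actor = any(w in ACTORS for w in words)
    direction = next((DIRECTION[w] for w in words if w in DIRECTION), None)
    polarity = next((OBJECTS[w] for w in words if w in OBJECTS), None)
    # A phrase is only classified when all three parts of the pattern are present.
    if not has_actor or direction is None or polarity is None:
        return "unclassified"
    return "positive" if direction * polarity > 0 else "negative"


if __name__ == "__main__":
    print(classify_phrase("Company profits rise".split()))
    print(classify_phrase("Economy unemployment rises".split()))
    print(classify_phrase("Weather is nice".split()))
```

Phrases that do not match the full pattern fall through to "unclassified", mirroring the two outcomes shown in the figure.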
news stories.
movement.
when the market loses more than 2.11%.
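A minimal sketch of alignment labelling, assuming the index movement is available as a daily percentage change. Only the 2.11% loss threshold comes from the slide; the gain threshold is not recoverable here, so it is left as a parameter the caller must supply, and the function name is hypothetical.

```python
def align_label(pct_change, loss_threshold=2.11, gain_threshold=None):
    """Label a day's stories from the index's percentage change.

    loss_threshold (2.11) is taken from the slide.
    gain_threshold is NOT given in the slide; it is an assumption and
    must be provided by the caller if positive labels are wanted.
    Returns 'negative', 'positive', or None when the move is not sharp enough.
    """
    if pct_change < -loss_threshold:
        return "negative"
    if gain_threshold is not None and pct_change > gain_threshold:
        return "positive"
    return None


if __name__ == "__main__":
    print(align_label(-2.5))                      # loss beyond 2.11%
    print(align_label(-1.0))                      # not a sharp move
    print(align_label(3.0, gain_threshold=2.0))   # 2.0 is a made-up value
```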
strategy.
Fig 3: Hybrid Strategy for equal labels (diagram: story, rule classifier, alignment, equal labels, training set)
training set.
Fig 4: Hybrid Strategy for contradictory labels (diagram: news story, rule classifier, alignment, contradictory labels, training set)
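The hybrid strategy in Figs 3 and 4 compares the rule classifier's label with the alignment label. The sketch below covers only what the figures make clear: equal labels send the story to the training set. How contradictory labels are resolved is not recoverable from the slides, so that decision is left to a caller-supplied `on_conflict` function, which is a hypothetical parameter and not part of the paper.

```python
def hybrid_label(rule_label, aligned_label, on_conflict=None):
    """Combine a rule-classifier label with an alignment label.

    Labels are 'positive', 'negative', or None (no label available).
    Equal labels are accepted as-is. For contradictory labels the slides
    do not say what happens, so the caller decides via on_conflict;
    with no policy given, the story is left unlabelled (None).
    """
    if rule_label is None or aligned_label is None:
        return None
    if rule_label == aligned_label:
        return rule_label
    if on_conflict is not None:
        return on_conflict(rule_label, aligned_label)
    return None


if __name__ == "__main__":
    print(hybrid_label("positive", "positive"))
    print(hybrid_label("positive", "negative"))
    # Example policy (an assumption): trust the rule classifier on conflict.
    print(hybrid_label("positive", "negative", on_conflict=lambda r, a: r))
```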
Fig 5: Flow Diagram for Proposed Algorithm
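The proposed approach uses self-training to grow the training set. The loop below is a generic self-training sketch, not a reconstruction of Fig 5: the word-count classifier, the 0.8 confidence threshold, and the round limit are all assumptions chosen to keep the example self-contained.

```python
from collections import Counter


def train(labeled):
    """Count words per label; a toy stand-in for a real text classifier."""
    counts = {"positive": Counter(), "negative": Counter()}
    for text, label in labeled:
        counts[label].update(text.lower().split())
    return counts


def predict(model, text):
    """Return (label, confidence); confidence is the winning share of word hits."""
    words = text.lower().split()
    pos = sum(model["positive"][w] for w in words)
    neg = sum(model["negative"][w] for w in words)
    total = pos + neg
    if total == 0:
        return None, 0.0  # no known words, so no prediction
    label = "positive" if pos >= neg else "negative"
    return label, max(pos, neg) / total


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Repeatedly add confidently predicted stories to the training set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for text in pool:
            label, conf = predict(model, text)
            if label is not None and conf >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
    return train(labeled), labeled


if __name__ == "__main__":
    seed = [("profits rise", "positive"), ("profits fall", "negative")]
    pool = ["shares rise", "shares fall sharply", "weather report"]
    model, final = self_train(seed, pool)
    for text, label in final:
        print(label, "-", text)
```

Stories the model cannot label with enough confidence stay in the pool, so the training set only grows with examples the current model is sure about.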
information.
Strategy    Headline   Text   Description
Rules       0.77       0.60   0.65
Alignment   0.57       0.57   0.57
Hybrid      0.66       0.57   0.58
Proposed    0.84       0.71   0.77
Fig 6: Estimated F-Measure for competing strategies
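For readers unfamiliar with the metric, the F-measure combines precision and recall into one score. The sketch below uses the standard F-beta formula; the slides do not state which beta the authors used, so the balanced default of beta = 1 is an assumption, and the input values in the usage lines are made up rather than taken from the paper.

```python
def f_measure(precision, recall, beta=1.0):
    """Standard F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing is correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Illustrative inputs only, not results from the paper.
    print(round(f_measure(0.5, 0.5), 2))
    print(round(f_measure(1.0, 0.5), 2))
```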
categories.
events which may influence the market is increased. The proposed method adds further documents with a self-training method.
data.
training a more robust model.
to evaluate, is not clearly presented in the paper.
not a good idea to ignore data that is contrary to the market trend.
[1] Drury, Brett, Luis Torgo, and J. J. Almeida. "Classifying news stories to estimate the direction of a stock market index." Information Systems and Technologies (CISTI), 2011 6th Iberian Conference on. IEEE, 2011.
[2] Taleb, Nassim Nicholas and Lane, Allen. The Black Swan (The impact of the highly improbable). Random House, 2008.
[3] Thomas, James D. News and Trading Rules. s.l.: CiteSeer, 2003.
[4] Mittermayer, M A and Knolmayer, G F. Text Mining Systems for Market Response to News: A Survey. University of Bern, 2006.
[5] Wuthrich, B, et al. Daily prediction of major stock indices from textual www data. International Conference on Knowledge Discovery and Data Mining, 1998.
[6] Lavrenko, Victor, et al. Language Models for Financial News Recommendation. ACM Press, 2000.
[7] Drury, Brett and Almeida, J J. Identification of Fine Grained Feature Based Event and Sentiment Phrases from Business News Stories. ACM, 2011.
[8] H. Cunningham, D. Maynard, K. Bontcheva, and V. Tablan. GATE: A framework and graphical development environment for robust NLP tools and applications. In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics, 2002.