Mining Frequent Episodes for Relating Financial Events and Stock Trends


  1. Anny Ng and Ada Wai-chee Fu, Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong. Email: {ang, adafu}@cse.cuhk.edu.hk

     Abstract. It is expected that stock prices can be affected by local and overseas political and economic events. We extract events from the financial news of Chinese local newspapers that are available on the web, match the news against stock price databases, and propose a new method for mining frequent temporal patterns.

     Introduction

     In the stock market, share prices can be influenced by many factors, ranging from news releases of companies and local politics to news of superpower economies. We call these incidences events. We assume that each event is of a certain event type and that each event has a time of occurrence, typically given by the date on which the event occurs or is reported. Each "event" therefore corresponds to a time point. We expect that events like "the Hong Kong government announcing a deficit" and "Washington deciding to increase the interest rate" may lead to fluctuations in Hong Kong stock prices within a short period of time.

     When a number of events occur within a short period of time, we assume that they possibly have some relationship. Such a period of time can be determined by the application experts and is called a window, usually limited to a few days. Roughly speaking, a set of events that occur within a window is called an episode instance. The set of event types in the instance is called an episode. For example, we may have the following statement in a financial report: "Telecommunications stocks pushed the Hang Seng Index [...] higher following the Star TV-HK Telecom and Orange-Mannesmann deals". This can be an example of an episode, in which all four events, "telecommunications stocks rise", "Hang Seng Index surges" and the two deals of "Star TV-HK Telecom" and "Orange-Mannesmann", happened within a period of [...] days. If there are many instances of the same episode, it is called a frequent episode.

     We are interested in finding frequent episodes related to stock movements. The stock movement need not be the last event occurring in the episode instance, because the movement of stocks may be caused by the investors' expectation that something will happen on the following days. For example, we can have a news report saying "Hong Kong shares slid yesterday in a market burdened by the fear of possible United States interest rate rises tomorrow". Therefore we do not assume an ordering of the events in an episode.
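The notions of event, window and episode described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the day indices, the event-type labels and the helper name `episode_in_window` are hypothetical and do not come from the paper.

```python
# Hypothetical event records: (day index, event type). All labels are illustrative.
events = [
    (1, "star_tv_hk_telecom_deal"),
    (2, "orange_mannesmann_deal"),
    (3, "telecom_stocks_rise"),
    (3, "hang_seng_index_surges"),
    (9, "us_interest_rate_rise"),
]


def episode_in_window(events, start_day, window_days):
    """Return the unordered set of event types seen in the window
    covering days start_day .. start_day + window_days - 1."""
    end_day = start_day + window_days
    return frozenset(etype for day, etype in events if start_day <= day < end_day)


# The order of occurrence is deliberately discarded: an episode is a set.
episode = episode_in_window(events, start_day=1, window_days=3)
print(sorted(episode))
```

Using a `frozenset` reflects the statement that no ordering of events is assumed within an episode: the same set results whether the stock movement precedes or follows the news events inside the window.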

  2. From the frequent episodes, we may discover the factors behind the fluctuation of stock prices. We are interested in a special type of episode that we call a stock-episode. It can be written as $\langle e_1, e_2, \ldots, e_n \ (t \text{ days}) \rangle$, where $e_1, \ldots, e_n$ are event types and at least one of them is the event of stock fluctuation. An instance of this stock-episode is an instance where events of the event types $e_1, \ldots, e_n$ appear in a window of $t$ days. Since we are only concerned with stock-episodes, we shall simply refer to stock-episodes as episodes.

     Definitions

     Let $E = \{E_1, E_2, \ldots, E_m\}$ be a set of event types. Assume that we have a database that records events for days 1 to $n$. We call this an event database, and we represent it as $DB = \langle D_1, D_2, \ldots, D_n \rangle$, where $D_i$ is for day $i$ and $D_i = \{e_{i1}, e_{i2}, \ldots, e_{ik}\}$, with $e_{ij} \in E$ ($j = 1, \ldots, k$). This means that the events that happen on day $i$ have event types $e_{i1}, e_{i2}, \ldots, e_{ik}$. Each $D_i$ is called a day record. The day records in the database are consecutive and arranged in chronological order, where $D_i$ is one day before $D_{i+1}$ for all $1 \le i \le n-1$.

     $P = \{e_{p1}, \ldots, e_{pb}\}$, where $e_{pi} \in E$ ($i = 1, \ldots, b$), is an episode if $P$ has at least two elements and at least one $e_{pj}$ is a stock event type.

     We assume that a window size of $x$ days is given; this is used to indicate a consecutive sequence of days. We are interested in events that occur within a short period as defined by a window. If the database consists of $m$ days and the window size is $x$ days, there are $m$ windows in the database. The first window contains exactly $x$ days: $D_1, D_2, \ldots, D_x$. The $i$-th window contains $D_i, D_{i+1}, \ldots$, with up to $x$ days. The second last window contains $D_{m-1}, D_m$, and the last window contains only $D_m$.

     In some previous work such as [...], the frequency of an episode is defined as the number of windows that contain the events in the episode. For our application, we notice a problem with this definition. Suppose we have a window size of $x$. If an episode occurs in a single day $i$, then all windows starting from day $i - x + 1$ up to day $i$ contain the episode, so the frequency of the episode will be $x$. However, the episode has actually occurred only once. Therefore we propose a different definition for the frequency of an episode.

     Definition. Given a window size of $x$ days for DB and an episode $P$, an episode instance of $P$ is an occurrence of all the event types in $P$ within a window $W$, where the record of the first day of the window $W$ contains at least one of the event types in $P$. Each window can be counted at most once as an episode instance for a given episode. The frequency of an event is the number of occurrences of the event in the database. The support or the frequency of an episode is the number of instances of the episode. Therefore, the frequency of an episode $P$ is the number of windows $W$ such that $W$ contains all the event types in $P$ and the first day of $W$ contains at least one of the event types in $P$. An episode is a frequent episode if its frequency is $\ge$ a given minimum support threshold.
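The difference between the two frequency definitions can be checked on a toy database. The following is a minimal sketch, not the authors' implementation: the function names, the event-type labels and the sample day records are all hypothetical, and each day record is modelled as a Python set of event types.

```python
def windows(db, x):
    """Yield one window per start day; windows near the end of the
    database hold fewer than x day records."""
    for i in range(len(db)):
        yield db[i:i + x]


def window_count_frequency(db, episode, x):
    """Earlier-style count: number of windows that contain every
    event type of the episode."""
    return sum(1 for w in windows(db, x) if episode <= set().union(*w))


def instance_frequency(db, episode, x):
    """Count following the definition above: the window contains all
    event types of the episode AND its first day record contains at
    least one of them."""
    count = 0
    for w in windows(db, x):
        if episode <= set().union(*w) and w[0] & episode:
            count += 1
    return count


def is_frequent(db, episode, x, min_support):
    """An episode is frequent if its frequency reaches the threshold."""
    return instance_frequency(db, episode, x) >= min_support


# Hypothetical database of 6 day records; the episode occurs once, on one day.
db = [set(), {"budget_deficit"}, set(),
      {"rate_rise", "stock_fall"}, set(), set()]
episode = {"rate_rise", "stock_fall"}

print(window_count_frequency(db, episode, x=3))  # 3
print(instance_frequency(db, episode, x=3))      # 1
```

With a window size of 3, the single occurrence is seen by three overlapping windows under the window-count definition, whereas requiring the first day of the window to contain one of the episode's event types counts it once.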
