
Semi-Supervised Time Series Classification

Mojdeh Jalali Heravi

Introduction

  • Time series are of interest to many communities:
      – Medicine
      – Aerospace
      – Finance
      – Business
      – Meteorology
      – Entertainment
      – …

Introduction

  • Current methods for time series classification need a large amount of labeled training data
      – Difficult or expensive to collect:
          • Time
          • Expertise

Introduction

  • On the other hand …
      – Copious amounts of unlabeled data are available
      – For example: the PhysioBank archive
          • More than 40 GB of ECG data
          • Freely available
          • In hospitals there is even more!

Semi-supervised classification takes advantage of large collections of unlabeled data.


The paper…

Li Wei and Eamonn Keogh. Semi-supervised time series classification. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2006.

Outline

  • Applications
  • Value of unlabeled data
  • Semi-supervised learning
  • Time series classification
  • Semi-supervised time series classification
  • Empirical Evaluation

Applications

  • Indexing of handwritten documents

– There is great interest in making large archives of handwritten text searchable.
– For indexing, the words must first be classified.
– Treating the words as time series is a competitive approach.

Applications

A sample of text written by George Washington

  • A classifier for George Washington will not generalize to Isaac Newton
  • Obtaining labeled data for each word is expensive
  • Having few training examples and using a semi-supervised approach would be great!

slide-3
SLIDE 3

Applications

  • Heartbeat Classification

– PhysioBank
    • More than 40 GB of freely available medical data
    • A potential goldmine for a researcher
– Again, having few training examples and using a semi-supervised approach would be great!

Outline

  • Applications
  • Value of unlabeled data
  • Semi-supervised learning
  • Time series classification
  • Semi-supervised time series classification
  • Empirical Evaluation

Value of unlabeled data


Outline

  • Applications
  • Value of unlabeled data
  • Semi-supervised learning
  • Time series classification
  • Semi-supervised time series classification
  • Empirical Evaluation

Semi-supervised Learning

  • Classification → supervised learning
  • Clustering → unsupervised learning

Learning from both labeled and unlabeled data is called semi-supervised learning:
  • Less human effort
  • Higher accuracy

Semi-supervised Learning

– Five classes of SSL:
  • 1. Generative models
      – The oldest methods
      – Assumption: the data are drawn from a mixture distribution that can be identified from a large amount of unlabeled data
      – Knowledge of the structure of the data can be naturally incorporated into the model
      – But there has been no discussion of the mixture-distribution assumption for time series data so far

Semi-supervised Learning

– Five classes of SSL:
  • 2. Low-density separation approaches
      – “The decision boundary should lie in a low-density region”: push the decision boundary away from the unlabeled data
      – Achieved with margin-maximization algorithms (e.g., TSVM)
      – But for time series, “(abnormal time series) do not necessarily live in sparse areas of n-dimensional space” and “repeated patterns do not necessarily live in dense parts” (Keogh et al. [1])


Semi-supervised Learning

– Five classes of SSL:
  • 3. Graph-based semi-supervised learning
      – Assumption: “the (high-dimensional) data lie (roughly) on a low-dimensional manifold”
      – Data → nodes; distances between the data points → edges
      – Examples: Graph mincut [2], Tikhonov regularization [3], Manifold regularization [4]
      – The graph encodes prior knowledge, so its construction must be hand-crafted for each domain; but we are looking for a general semi-supervised classification framework (a minimal construction is sketched below)
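The node/edge mapping above is simple to state in code. A minimal sketch, assuming plain Euclidean distance and a full pairwise graph (practical methods usually keep only k-nearest-neighbor edges; the function name is mine):

    import math

    def build_distance_graph(series_list):
        # Graph-based view: each time series becomes a node; each pairwise
        # Euclidean distance becomes the weight of the edge between two nodes.
        def dist(a, b):
            return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
        return {(i, j): dist(series_list[i], series_list[j])
                for i in range(len(series_list))
                for j in range(i + 1, len(series_list))}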

Semi-supervised Learning

– Five classes of SSL:
  • 4. Co-training
      – The features are split into two disjoint sets (e.g., shape and color)
          • Assumption: the feature sets are independent
          • Assumption: each set is sufficient to train a good classifier
      – One classifier is trained on each feature subset
          • The predictions of one classifier are used to enlarge the training set of the other
      – But time series have very high feature correlation

Semi-supervised Learning

– Five classes of SSL:

  • 5. Self-training
      – Train a classifier on the small amount of labeled data
      – Classify the unlabeled data
      – Add the most confidently classified examples, together with their predicted labels, to the training set
      – This procedure repeats; the classifier gradually refines itself
      – The classifier uses its own predictions to teach itself; it is general, with few assumptions (a minimal sketch follows)
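A minimal sketch of the self-training loop, assuming a scikit-learn-style base classifier with fit/predict_proba; the confidence threshold and round limit are illustrative assumptions, not values from the paper:

    import numpy as np

    def self_train(base_classifier, labeled_X, labeled_y, unlabeled_X,
                   confidence_threshold=0.95, max_rounds=10):
        # Self-training: the classifier repeatedly labels the unlabeled pool
        # and absorbs only its most confident predictions into the training set.
        X, y, pool = list(labeled_X), list(labeled_y), list(unlabeled_X)
        for _ in range(max_rounds):
            if not pool:
                break
            base_classifier.fit(np.array(X), np.array(y))
            probs = base_classifier.predict_proba(np.array(pool))
            confident = [(i, int(probs[i].argmax())) for i in range(len(pool))
                         if probs[i].max() >= confidence_threshold]
            if not confident:
                break  # nothing certain enough; stop refining
            for i, label in sorted(confident, reverse=True):
                X.append(pool.pop(i))  # pop from the back so indices stay valid
                y.append(label)
        return base_classifier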

Outline

  • Applications
  • Value of unlabeled data
  • Semi-supervised learning
  • Time series classification
  • Semi-supervised time series classification
  • Empirical Evaluation

Time Series

  • Definition 1. Time Series: A time series T = t1, …, tm is an ordered set of m real-valued variables.
      – Long time series
      – Short time series: subsequences of long time series
  • Definition 2. Euclidean Distance: Given two time series Q = q1, …, qn and C = c1, …, cn of the same length, the Euclidean distance between them is
        D(Q, C) = √( Σ_{i=1..n} (q_i − c_i)² )
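Definition 2 transcribes directly into code; a minimal sketch (the later sketches reuse this function):

    import math

    def euclidean_distance(q, c):
        # Definition 2: Euclidean distance between two equal-length series.
        assert len(q) == len(c), "series must have the same length"
        return math.sqrt(sum((qi - ci) ** 2 for qi, ci in zip(q, c)))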

Time Series Classification

  • Positive class
      – Has some common structure
      – Positively labeled examples are rare, but unlabeled data is abundant
      – Small number of ways to be in the class
  • Negative class
      – Little or no common structure
      – Essentially infinite number of ways to be in this class
  • We focus on binary time series classifiers

Outline

  • Applications
  • Value of unlabeled data
  • Semi-supervised learning
  • Time series classification
  • Semi-supervised time series classification
  • Empirical Evaluation

Semi-supervised Time Series Classification

  • 1-nearest-neighbor with Euclidean distance
  • Demonstrated on the Control-Chart dataset
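The 1-NN decision rule itself is tiny; a sketch, reusing euclidean_distance from above:

    def one_nn_label(query, training_set):
        # training_set: a list of (time_series, label) pairs. The query
        # simply takes the label of its single nearest neighbor.
        _, label = min(training_set,
                       key=lambda item: euclidean_distance(query, item[0]))
        return label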


Semi-supervised Time Series Classification

  • Training the classifier (example)

Semi-supervised Time Series Classification

  • Training the classifier (algorithm)

– P: the set of positively labeled examples
– U: the set of unlabeled examples
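My reading of the training loop, as a sketch reusing euclidean_distance from above: in each round, the unlabeled example closest to the current positive set is treated as positive and moved from U into P (how many rounds to run is exactly the stopping-criterion question on the next slides):

    def train(P, U, num_iterations):
        # P: positively labeled series; U: unlabeled series (both lists).
        for _ in range(num_iterations):
            if not U:
                break
            # The most confidently positive unlabeled example is the one
            # nearest, in Euclidean distance, to any current positive.
            best = min(range(len(U)),
                       key=lambda i: min(euclidean_distance(U[i], p)
                                         for p in P))
            P.append(U.pop(best))
        return P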

Semi-supervised Time Series Classification

  • Stopping criterion (example)

Semi-supervised Time Series Classification

  • Stopping criterion

Semi-supervised Time Series Classification

  • Using the classifier

– For each instance to be classified, check whether its nearest neighbor in the training set is labeled or not
– But the training set is huge: comparing each instance in the testing set to each example in the training set is untenable in practice

Semi-supervised Time Series Classification

  • Using the classifier

– A modification of the 1-NN classification scheme: use only the labeled positive examples in the training set
– To classify an instance:
    • Within distance r of any labeled positive example → positive
    • Otherwise → negative
– r: the average distance from a positive example to its nearest neighbor
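A sketch of this rule, reusing euclidean_distance from above. One assumption on my part: the nearest neighbor in the definition of r is taken among the other labeled positives:

    def fit_radius(P):
        # r = average distance from each labeled positive to its nearest
        # neighbor among the other labeled positives (needs len(P) >= 2).
        def nn_dist(i):
            return min(euclidean_distance(P[i], P[j])
                       for j in range(len(P)) if j != i)
        return sum(nn_dist(i) for i in range(len(P))) / len(P)

    def classify(query, P, r):
        # Positive if the query lies within r of any labeled positive,
        # negative otherwise.
        return any(euclidean_distance(query, p) <= r for p in P)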

Outline

  • Applications
  • Value of unlabeled data
  • Semi-supervised learning
  • Time series classification
  • Semi-supervised time series classification
  • Empirical Evaluation

Empirical Evaluation

  • The semi-supervised approach is compared to:
  • A naïve KNN approach
      – The k nearest neighbors of the positive examples → positive
      – All others → negative
      – Find the best k (a sketch follows)
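A sketch of this baseline, reusing euclidean_distance from above (the function name is mine; the best k is found by sweeping candidate values):

    def naive_knn_positives(positives, unlabeled, k):
        # Rank the unlabeled examples by distance to their nearest labeled
        # positive; declare the k closest positive, everything else negative.
        ranked = sorted(unlabeled,
                        key=lambda u: min(euclidean_distance(u, p)
                                          for p in positives))
        return ranked[:k]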


Empirical Evaluation

  • Performance measure
      – The class distribution is skewed, so accuracy is not a good measure
          • Example: 96% negative, 4% positive; simply classifying everything as negative gives accuracy = 96%
      – Instead: the precision-recall breakeven point, the point at which precision = recall
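One way to read a breakeven point off a ranked result list, as an illustrative sketch of the measure (not necessarily the paper's exact procedure): if the classifier predicts exactly n positives, where n is the number of true positives, then precision and recall must coincide.

    def pr_breakeven(ranked_labels):
        # ranked_labels: ground-truth labels (True = positive) of the test
        # instances, sorted from "most positive" to "least positive" score.
        n = sum(ranked_labels)           # total number of true positives
        hits = sum(ranked_labels[:n])    # true positives among the top n
        # With exactly n predicted positives, precision == recall == hits / n.
        return hits / n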

Empirical Evaluation

  • Stopping heuristic
      – Different from what was described before: keep training until the classifier achieves its highest precision-recall breakeven point, plus a few more iterations
  • Test and training sets
      – For most experiments: distinct
      – For small datasets: the same
          • Still non-trivial: most data in the training set are unlabeled

ECG Dataset

  • ECG data from the MIT-BIH Arrhythmia Database
      – Number of initial positive examples = 10
      – Run 200 times
  • In the plots: blue line = average; gray lines = 1 SD intervals
  • Precision-recall breakeven points:
      – Naïve KNN (k = 312): 81.29%
      – Semi-supervised: 94.97%

Word Spotting Dataset

  • Handwritten documents
      – Number of initial positive examples = 10
      – Run 25 times
  • In the plots: blue line = average; gray lines = 1 SD intervals
  • Precision-recall breakeven points:
      – Naïve KNN (k = 109): 79.52%
      – Semi-supervised: 86.2%


Word Spotting Dataset

  • Distance from the positive class → rank → probability of being in the positive class

Gun Dataset

  • 2D time series extracted from video
  • Class A: Actor 1 with gun
  • Class B: Actor 1 without gun
  • Class C: Actor 2 with gun
  • Class D: Actor 2 without gun
      – Number of initial positive examples = 1
      – Run 27 times
  • Precision-recall breakeven points:
      – Naïve KNN (k = 27): 55.93%
      – Semi-supervised: 65.19%

Wafer Dataset

  • A collection of time series containing measurements recorded by one vacuum-chamber sensor during the etch process of silicon wafers for semiconductor fabrication
      – Number of initial positive examples = 1
  • Precision-recall breakeven points:
      – Naïve KNN (k = 381): 46.87%
      – Semi-supervised: 73.17%

Yoga Dataset

  • Number of initial positive examples = 1
  • Precision-recall breakeven points:
      – Naïve KNN (k = 156): 82.95%
      – Semi-supervised: 89.04%


Conclusion

  • An accurate semi-supervised learning framework for time series classification with a small set of labeled examples
  • The reduction in the number of labeled training examples needed is dramatic

References

[1] Keogh, E., Lin, J., & Fu, A. (2005). HOT SAX: Efficiently finding the most unusual time series subsequence. In Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), pp. 226-233, 2005.

[2] Blum, A. & Chawla, S. (2001). Learning from labeled and unlabeled data using graph mincuts. In Proceedings of the 18th International Conference on Machine Learning, 2001.

[3] Belkin, M., Matveeva, I., & Niyogi, P. (2004). Regularization and semi-supervised learning on large graphs. COLT, 2004.

[4] Belkin, M., Niyogi, P., & Sindhwani, V. (2004). Manifold regularization: A geometric framework for learning from examples. Technical Report TR-2004-06, University of Chicago.

Thanks …

“Thanks for your patience.” Any questions?