IRDM ‘15/16
Jilles Vreeken
Chapter 7-2: Discrete Sequential Data
26 Nov 2015
Time Series
1. Basic Ideas
2. Prediction
3. Motif Discovery

Discrete Sequences
4. Basic Ideas
5. Pattern Discovery
6. Hidden Markov Models

You'll find this covered in Aggarwal Ch. 3.4, 14, 15
Aggarwal Ch. 14.4, 3.4
DTW stretches the time axis of one series to enable better matches.
(Aggarwal Ch. 3.4)
Let DTW(i, j) be the optimal distance between the first i elements of time series X of length m and the first j elements of time series Y of length n:

DTW(i, j) = dist(x_i, y_j) + min { DTW(i, j−1),      (repeat x_i)
                                   DTW(i−1, j),      (repeat y_j)
                                   DTW(i−1, j−1) }   (repeat neither)

We initialise as follows:
DTW(0, 0) = 0
DTW(0, j) = ∞ for all j ∈ {1, …, n}
DTW(i, 0) = ∞ for all i ∈ {1, …, m}

We can then simply iterate by increasing i and j.
(Aggarwal Ch. 3.4)
From the initialised values, we can simply iterate by increasing i and j:

for i = 1 to m
  for j = 1 to n
    compute DTW(i, j)

We can also compute it recursively, by dynamic programming with memoisation. Both naïve strategies cost O(mn), however.
(Aggarwal Ch. 3.4)
We can speed up computation by imposing constraints,
e.g. a window constraint: compute DTW(i, j) only when |i − j| ≤ w.
For each i, the inner loop over j then only runs from max(0, i − w) to min(n, i + w).
(Aggarwal Ch. 3.4)
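As an illustration, here is a minimal Python sketch of the DTW recurrence above, with the optional window constraint; the function name, the squared-difference element distance, and the NumPy table are our own choices, not part of the slides.

```python
import numpy as np

def dtw(x, y, w=None):
    """Dynamic-time-warping distance between series x (length m) and y (length n).

    A minimal sketch of the DTW recurrence from the slides; `w` is an optional
    window constraint (compute DTW(i, j) only when |i - j| <= w).
    """
    m, n = len(x), len(y)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0                          # DTW(0,0) = 0; rest of row/column 0 stays infinite
    for i in range(1, m + 1):
        lo = 1 if w is None else max(1, i - w)
        hi = n if w is None else min(n, i + w)
        for j in range(lo, hi + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2           # element-wise distance
            D[i, j] = cost + min(D[i, j - 1],           # repeat x_i
                                 D[i - 1, j],           # repeat y_j
                                 D[i - 1, j - 1])       # repeat neither
    return D[m, n]

# e.g. dtw([1, 2, 3, 4], [1, 1, 2, 3, 4], w=2)
```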
Even smarter is to speed up DTW using a lower bound, LB_Keogh:

LB_Keogh(X, Y) = ∑_{i=1}^{n} c_i, where
  c_i = (y_i − U_i)²  if y_i > U_i
  c_i = (y_i − L_i)²  if y_i < L_i
  c_i = 0             otherwise

with U_i = max{x_{i−r}, …, x_{i+r}} and L_i = min{x_{i−r}, …, x_{i+r}},
where r is the reach, the allowed range of warping.

(figure: series X and Y, with the lower/upper envelope L, U built around X)
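A small Python sketch of this lower bound, under the same naming assumptions as before (function and variable names are ours):

```python
import numpy as np

def lb_keogh(x, y, r):
    """LB_Keogh lower bound on DTW(x, y); a minimal sketch.

    The envelope (L_i, U_i) is built around x with reach r; elements of y that
    fall outside the envelope contribute their squared distance to it.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    total = 0.0
    for i in range(len(y)):
        lo, hi = max(0, i - r), min(len(x), i + r + 1)
        U = x[lo:hi].max()                  # U_i = max{x_{i-r}, ..., x_{i+r}}
        L = x[lo:hi].min()                  # L_i = min{x_{i-r}, ..., x_{i+r}}
        if y[i] > U:
            total += (y[i] - U) ** 2
        elif y[i] < L:
            total += (y[i] - L) ** 2
    return total
```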
Aggarwal Ch. 14.1-14.2
Continuous real-valued time series have their downsides:
mining results rely on either a distance function or an assumption;
indexing, pattern mining, summarisation, clustering, classification, and outlier detection results hence rely on arbitrary choices.

Discrete sequences are often easier to deal with:
mining results rely mostly on counting.

How to transform a time series into an event sequence? Discretisation.
Symbolic Aggregate Approximation (SAX)
the most well-known approach to discretise a time series;
a type of piecewise aggregate approximation (PAA).

How to do SAX:
divide the data into w frames,
compute the mean per frame,
perform equal-height binning into an alphabet of a characters.
(Lin et al. 2002, 2007)
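A minimal Python sketch of the idea, assuming z-normalised input. Note that the breakpoints here come from quantiles of the frame means, whereas the original SAX uses fixed Gaussian breakpoints, so this is a simplification rather than the reference method.

```python
import numpy as np

def sax(series, n_frames, alphabet="abcd"):
    """A simplified SAX sketch (our own naming, not the reference implementation).

    1. z-normalise the series
    2. split it into n_frames equal-width frames and take each frame's mean (PAA)
    3. equal-height binning: assign symbols so that each bin is (roughly) equally likely
    """
    x = np.asarray(series, float)
    x = (x - x.mean()) / x.std()                         # z-normalise
    means = np.array([f.mean() for f in np.array_split(x, n_frames)])   # PAA
    # breakpoints from quantiles of the frame means (original SAX: Gaussian breakpoints)
    qs = np.quantile(means, np.linspace(0, 1, len(alphabet) + 1)[1:-1])
    symbols = np.searchsorted(qs, means)
    return "".join(alphabet[s] for s in symbols)

# e.g. sax(np.sin(np.linspace(0, 6, 60)), n_frames=10, alphabet="abcd")
```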
A discrete sequence Y_1 … Y_n of length n and dimensionality d contains d discrete feature values at each of n different timestamps t_1 … t_n. Each of the n components Y_i contains d behavioural attributes (y_i^1 … y_i^d) collected at the i-th timestamp. The actual timestamps are usually ignored: they only induce an order on the components, or events.
In many applications, the dimensionality is 1,
e.g. strings, such as text or genomes:
for AATCGTAC over an alphabet Σ = {A, C, G, T}, each Y_i ∈ Σ.

In some applications, each Y_i is not a vector but a set,
e.g. a supermarket transaction, Y_i ⊆ Σ; there is no order within Y_i.

We will consider the set setting, as it is the most general.
Aggarwal Ch. 15.2
A sequential pattern is a sequence; to occur in the data, it has to be a subsequence of the data.

Definition: Given two sequences 𝒴 = Y_1 … Y_n and 𝒶 = A_1 … A_k, where all elements Y_i and A_j are sets, the sequence 𝒶 is a subsequence of 𝒴 if k elements Y_{i_1} … Y_{i_k} can be found in 𝒴 such that i_1 < i_2 < ⋯ < i_k and A_j ⊆ Y_{i_j} for each j ∈ {1, …, k}.

(slide example: a short pattern 𝒶 highlighted as a subsequence of a longer single-item sequence 𝒴)
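The definition translates directly into a greedy containment check; this Python sketch (our own helper, with sets as elements) is reused by the later sketches.

```python
def is_subsequence(pattern, sequence):
    """Check whether `pattern` is a subsequence of `sequence` (both lists of sets).

    A minimal sketch of the definition above: greedily match each pattern
    element A_j to the earliest remaining data element Y_i with A_j ⊆ Y_i.
    """
    i = 0
    for a in pattern:
        while i < len(sequence) and not a <= sequence[i]:   # set containment A_j ⊆ Y_i
            i += 1
        if i == len(sequence):
            return False
        i += 1                                              # indices must strictly increase
    return True

# e.g. is_subsequence([{"a"}, {"b", "c"}], [{"a"}, {"d"}, {"b", "c", "e"}])  -> True
```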
Depending on whether we have a database 𝑬 of sequences or a single long sequence, we have to define the support of a sequential pattern differently.

Standard, or 'per sequence' support counting:
given a database 𝑬 = {𝒴_1, …, 𝒴_N}, the support of a subsequence 𝒶 is the number of sequences in 𝑬 that contain 𝒶.

Window-based support counting:
given a single sequence 𝒴, the support of a subsequence 𝒶 is the number of windows over 𝒴 that contain 𝒶.

(we can define frequency analogously, as relative support)
A window 𝒴[s; t] is a strict, contiguous subsequence of sequence 𝒴:
𝒴[s; t] = { Y_i ∈ 𝒴 ∣ s ≤ i ≤ t }

Window-based support counting: we choose a window length w and sweep a window of that length over the data, counting how many windows contain 𝒶.

(slide example: a pattern 𝒶 and a long single-item sequence 𝒴)
(slide animation: the length-w window sweeps over the example sequence from left to right, incrementing the support count whenever the window contains 𝒶)

The support now depends on w. What happens with longer windows w?
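A sketch of window-based support counting, reusing the is_subsequence helper from above (the naming is ours):

```python
def window_support(pattern, sequence, w):
    """Window-based support: the number of length-w windows of `sequence`
    that contain `pattern` as a subsequence (sketch; builds on is_subsequence).
    """
    count = 0
    for s in range(len(sequence) - w + 1):
        if is_subsequence(pattern, sequence[s:s + w]):
            count += 1
    return count
```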
Fixed window lengths lead to double counting:
if 𝒴[s; t] supports sequence 𝒶, then so do 𝒴[s; t + l] and 𝒴[s − l; t].

We can avoid this by counting only minimal windows:
w = 𝒴[s; t] is a minimal window of pattern 𝒶 if w contains 𝒶 but no proper sub-window of w contains 𝒶.

For efficiency, or fun, we may want to set a maximal window size.

(slide example: the minimal windows of 𝒶 in the example sequence)
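A naïve Python sketch of minimal-window counting, again building on is_subsequence. It only checks the two largest proper sub-windows, which suffices because containment is monotone under taking sub-windows; the names and the quadratic enumeration are our own choices.

```python
def minimal_windows(pattern, sequence):
    """Count the minimal windows of `pattern` in `sequence` (naïve sketch).

    A window is minimal if it contains the pattern but neither of its two
    largest proper sub-windows (drop first or last element) does.
    """
    count = 0
    n = len(sequence)
    for s in range(n):
        for t in range(s + 1, n + 1):
            win = sequence[s:t]
            if (is_subsequence(pattern, win)
                    and not is_subsequence(pattern, win[1:])
                    and not is_subsequence(pattern, win[:-1])):
                count += 1
    return count
```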
Like for itemsets, the per-sequence and per-window definitions of support are monotone: we can employ level-wise search!

We can modify
APRIORI to get GSP (Agrawal & Srikant, 1995; Mannila, Toivonen & Verkamo, 1995),
ECLAT to get SPADE (Zaki, 2000), and
FP-GROWTH to get PREFIXSPAN (Pei et al., 2001).
Algorithm GSP(sequence database 𝑬, minimal support τ)
begin
  k ← 1; ℱ_k ← { all frequent 1-item elements }
  while ℱ_k is not empty do
    generate 𝒞_{k+1} by joining pairs of sequences in ℱ_k, such that removing an item from the first element of one sequence matches the sequence obtained by removing an item from the last element of the other
    prune sequences from 𝒞_{k+1} that violate downward closure
    determine ℱ_{k+1} by support counting on (𝒞_{k+1}, 𝑬), retaining the sequences from 𝒞_{k+1} with support at least τ
    k ← k + 1
  end
  return ⋃_{i=1}^{k} ℱ_i
end
(Agrawal & Srikant, 1995; Mannila, Toivonen & Verkamo, 1995)

(slide example: two sequences from ℱ_k joined into a candidate in 𝒞_{k+1})
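A much-simplified Python sketch of the level-wise idea for the special case where every element is a single item, so the join reduces to matching a length-(k−1) suffix against a prefix. It reuses the is_subsequence helper, omits the downward-closure pruning step, and all names are our own; it is an illustration of the strategy, not the full GSP above.

```python
def frequent_sequences_single_items(database, tau):
    """Level-wise mining of frequent sequential patterns over single-item elements.

    `database` is a list of sequences (e.g. strings or lists of items);
    support is counted per sequence, patterns are returned as tuples of items.
    """
    def support(cand):
        return sum(is_subsequence([{c} for c in cand],
                                  [{x} for x in seq]) for seq in database)

    items = {x for seq in database for x in seq}
    freq = [(i,) for i in sorted(items) if support((i,)) >= tau]
    result = list(freq)
    while freq:
        # join step: the suffix of one frequent sequence matches the prefix of another
        candidates = {a + (b[-1],) for a in freq for b in freq if a[1:] == b[:-1]}
        freq = [c for c in sorted(candidates) if support(c) >= tau]
        result.extend(freq)
    return result

# e.g. frequent_sequences_single_items(["abcab", "acb", "abb"], tau=2)
```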
There are many types of sequential patterns. The most well-known are:
n-grams, l-mers, or strict subsequences, where we do not allow gaps;
serial episodes, or subsequences, where we do allow gaps.

(slide example: occurrences of a gap-free and a gapped pattern in a sequence 𝒴)
Each element can contain one or more items.

(slide example: occurrences of patterns with set-valued elements in a sequence 𝒴)
Serial episodes are still restrictive:
not everything always happens exactly in sequential order.

Parallel episodes acknowledge this:
a parallel episode defines a partial order; for a match it requires all parallel events to happen, but does not specify their exact order,
e.g. first one event, then two other events in any order, and then a final event.

We can also combine the two into generalised episodes.
Aggarwal Ch. 15.5
Hidden Markov Models are probabilistic, generative models for discrete sequences. An HMM is a graphical model in which nodes correspond to system states, and edges to state changes. In an HMM the states of the system are hidden: they are not directly visible to the user. We only observe a sequence over symbols Σ that the system generates when it switches between states.
This HMM can generate sequences such as:
VVVVVVVMVV   Veggie (common)
MVMVVMMVM    Omni (common)
MMVMVVVVVV   Omni-turned-Veggie (not very common)
MMMMMMMM     Carnivore (rare)

(figure: a two-state HMM with states Vegetarian and Omnivore; self-transition probabilities 0.99 and 0.90, cross-transition probabilities 0.01 and 0.10; meal distribution V = 99%, M = 1% for Vegetarian and V = 50%, M = 50% for Omnivore)
(figure: a four-state HMM with states Flexitarian, Omnivore, Vegetarian, and Carnivore; meal distributions V = 80% / M = 20%, V = 50% / M = 50%, V = 99% / M = 1%, and V = 1% / M = 99% respectively, with various transition probabilities between the states)
A Hidden Markov Model over alphabet Σ = {σ_1, …, σ_|Σ|} is a directed graph over n states S = {s_1, …, s_n}. The initial state probabilities are π_1, …, π_n. The (directed) edges correspond to state transitions; the probability of a transition from state s_i to state s_j is denoted by p_{ij}. For every visit to a state, a symbol from Σ is generated with probability P(σ_i ∣ s_j).
There are three main things to do with an HMM:

1. Training.
Given the topology and a database 𝑬, learn the initial state probabilities, the transition probabilities, and the symbol emission probabilities.

2. Explanation.
Given an HMM, determine the most likely state sequence that generated a test sequence 𝒴.

3. Evaluation.
Given an HMM, determine the probability of a test sequence 𝒴.
We want to know the fit probability that sequence 𝒴 = y_1 … y_m was generated by the given HMM.

Naïve approach:
compute all n^m possible paths over the HMM,
for each, determine the probability of generating 𝒴,
and sum these probabilities; this is the fit probability of 𝒴.
The fit probability of the first r symbols (and a fixed value of the r-th state) can be computed recursively from the fit probability of the first (r − 1) symbols (and a fixed value of the (r − 1)-th state).

Let α_r(𝒴, s_j) be the probability that the first r symbols of 𝒴 are generated by the model, and the last state is s_j:

α_r(𝒴, s_j) = ∑_{i=1}^{n} α_{r−1}(𝒴, s_i) · p_{ij} · P(y_r ∣ s_j)

That is, we sum over all paths up to the different final nodes.
We initialise with α_1(𝒴, s_j) = π_j · P(y_1 ∣ s_j), and then iteratively compute α_r for each r = 1 … m.

The fit probability of 𝒴 is the sum over all end states,
F(𝒴) = ∑_{j=1}^{n} α_m(𝒴, s_j)

The complexity of the Forward Algorithm is O(n²m).
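A compact NumPy sketch of the Forward Algorithm. The parameterisation (pi, P_trans, P_emit, observations as symbol indices) is our own assumption, and the toy numbers below are read off the two-state Vegetarian/Omnivore figure with an assumed uniform initial distribution.

```python
import numpy as np

def forward_fit_probability(pi, P_trans, P_emit, obs):
    """Fit probability of an observed symbol sequence under an HMM (sketch).

    pi[j]          initial probability of state j
    P_trans[i, j]  transition probability from state i to state j
    P_emit[j, y]   probability that state j emits symbol y
    obs            observed sequence as a list of symbol indices
    """
    alpha = pi * P_emit[:, obs[0]]                   # alpha_1(Y, s_j) = pi_j * P(y_1 | s_j)
    for y in obs[1:]:
        alpha = (alpha @ P_trans) * P_emit[:, y]     # sum_i alpha_{r-1} * p_ij, times emission
    return alpha.sum()                               # F(Y) = sum_j alpha_m(Y, s_j)

# toy example: the Vegetarian/Omnivore HMM from the figure (V = 0, M = 1);
# the uniform initial distribution is an assumption, not given on the slide
pi = np.array([0.5, 0.5])
P_trans = np.array([[0.99, 0.01],
                    [0.10, 0.90]])
P_emit = np.array([[0.99, 0.01],
                   [0.50, 0.50]])
print(forward_fit_probability(pi, P_trans, P_emit, [0, 0, 1, 0]))    # P("VVMV")
```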
Good question to ask: why compute the fit probability?
classification, clustering, anomaly detection.

For the first two, we can now create group-specific HMMs, and assign each sequence to the group whose HMM fits it best. For the third, we have an HMM for our training data, and can now report poorly fitting sequences.
We want to know why a sequence 𝒴 fits our data:
the most likely state sequence gives an intuitive explanation.

Naïve approach:
compute all n^m possible paths over the HMM,
for each, determine the probability of generating 𝒴,
and report the path with maximum probability.

Instead of doing this naïvely, can we re-use the recursive approach?
Any subpath of an optimal state path must also be optimal for generating the corresponding subsequence.

Let δ_r(𝒴, s_j) be the probability of the best state sequence generating the first r symbols of 𝒴 and ending at state s_j, with

δ_r(𝒴, s_j) = max_{i ∈ [1, n]} δ_{r−1}(𝒴, s_i) · p_{ij} · P(y_r ∣ s_j)

That is, we recursively compute the maximum-probability path over all n different paths for the different final nodes. Overall, we initialise the recursion with δ_1(𝒴, s_j) = π_j · P(y_1 ∣ s_j), and then iteratively compute for r = 1 … m.
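This is the Viterbi recursion; below is a NumPy sketch under the same assumed parameterisation as the forward-algorithm sketch, including the traceback that recovers the most likely state sequence (all names are ours).

```python
import numpy as np

def most_likely_states(pi, P_trans, P_emit, obs):
    """Most likely state sequence for `obs` (Viterbi sketch).

    Returns the list of state indices with maximum probability of generating `obs`.
    """
    n, m = len(pi), len(obs)
    delta = pi * P_emit[:, obs[0]]                    # delta_1(Y, s_j)
    back = np.zeros((m, n), dtype=int)                # argmax bookkeeping for the traceback
    for r in range(1, m):
        scores = delta[:, None] * P_trans             # scores[i, j] = delta_{r-1}(i) * p_ij
        back[r] = scores.argmax(axis=0)
        delta = scores.max(axis=0) * P_emit[:, obs[r]]
    path = [int(delta.argmax())]                      # best final state
    for r in range(m - 1, 0, -1):                     # trace the best path backwards
        path.append(int(back[r, path[-1]]))
    return path[::-1]
```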
So far, we assumed the given HMM was trained. How do we train an HMM in practice?

Learning the parameters of an HMM is difficult:
no known algorithm is guaranteed to give the global optimum.

There do exist methods for reasonably effective solutions,
e.g. the Forward-Backward (Baum-Welch) algorithm.
We already know how to calculate the forward probability α_r(𝒴, s_j) for the first r symbols of a sequence 𝒴, ending at s_j.

Now, let β_r(𝒴, s_j) be the backward probability of the part of 𝒴 after, and not including, the r-th symbol, conditioned on the r-th state being s_j. We initialise β_m(𝒴, s_j) = 1, and compute β_r(𝒴, s_j) just as α_r(𝒴, s_j), but from back to front.

For the Baum-Welch algorithm, we'll also need
γ_r(𝒴, s_j) for the probability that the r-th state corresponds to s_j, and
ψ_r(𝒴, s_i, s_j) for the probability of the r-th state being s_i and the (r + 1)-th state being s_j.
We initialise the model parameters randomly. We then iteratively alternate

(E-step) estimate α(⋅), β(⋅), ψ(⋅), and γ(⋅) from the current model parameters;
(M-step) estimate the model parameters π, P(⋅ ∣ ⋅), and p_{ij} from the current α(⋅), β(⋅), ψ(⋅), and γ(⋅);

until the parameters converge. This is simply the EM strategy!
α(⋅): Forward algorithm.
β(⋅): Backward algorithm.
ψ(⋅): we can split this value into the part up to and including the r-th symbol, the transition, and the part from the (r + 1)-th symbol to the end:

ψ_r(𝒴, s_i, s_j) = α_r(𝒴, s_i) · p_{ij} · P(y_{r+1} ∣ s_j) · β_{r+1}(𝒴, s_j)

and normalise to probabilities over all pairs (i, j). So, easy after all.

γ(⋅): we obtain γ_r(𝒴, s_j) by summing ψ_r(𝒴, s_j, s_k) over varying s_k.
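A NumPy sketch of the E-step quantities, under the same assumed parameterisation as the earlier HMM sketches; it computes the forward and backward tables and from them the per-state and per-transition probabilities described above.

```python
import numpy as np

def e_step_quantities(pi, P_trans, P_emit, obs):
    """E-step sketch: forward, backward, per-state and per-transition probabilities.

    Returns (gamma, psi) with
      gamma[r, j]  = P(r-th state is j | obs)
      psi[r, i, j] = P(r-th state is i and (r+1)-th state is j | obs)
    """
    n, m = len(pi), len(obs)
    alpha = np.zeros((m, n))
    beta = np.ones((m, n))                              # beta_m(Y, s_j) = 1
    alpha[0] = pi * P_emit[:, obs[0]]
    for r in range(1, m):                               # forward pass
        alpha[r] = (alpha[r - 1] @ P_trans) * P_emit[:, obs[r]]
    for r in range(m - 2, -1, -1):                      # backward pass, back to front
        beta[r] = P_trans @ (P_emit[:, obs[r + 1]] * beta[r + 1])
    psi = np.zeros((m - 1, n, n))
    for r in range(m - 1):
        psi[r] = alpha[r][:, None] * P_trans * (P_emit[:, obs[r + 1]] * beta[r + 1])[None, :]
        psi[r] /= psi[r].sum()                          # normalise over all pairs (i, j)
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)           # equivalently: sum psi over the second state
    return gamma, psi
```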
Discrete sequences are a fun aspect of time series:
many interesting problems.

Mining sequential patterns:
more expressive than itemsets, more difficult to define support.

Hidden Markov Models:
can be used to predict, explain, and evaluate discrete sequences.