  1. Machine Learning for Signal Processing: Prediction and Estimation from Time Series. Bhiksha Raj. Class 22, 14 Nov 2013. 11-755/18797

  2. Administrivia
     • No class on Tuesday.
     • Project demos: 5 December (Thursday), before exams week.

  3. An automotive example
     • Determine automatically, by only listening to a running automobile, if it is:
       – Idling; or
       – Travelling at constant velocity; or
       – Accelerating; or
       – Decelerating
     • Assume (for illustration) that we only record the energy level (SPL) of the sound
       – The SPL is measured once per second

  4. What we know
     • An automobile that is at rest can accelerate, or continue to stay at rest
     • An accelerating automobile can hit a steady-state velocity, continue to accelerate, or decelerate
     • A decelerating automobile can continue to decelerate, come to rest, cruise, or accelerate
     • An automobile at a steady-state velocity can stay in steady state, accelerate or decelerate

  5. What else we know
     [Figure: probability densities P(x|idle), P(x|decel), P(x|cruise), P(x|accel), peaking near 45, 60, 65 and 70 dB SPL respectively]
     • The probability distribution of the SPL of the sound is different in the various conditions
       – As shown in the figure
       – In reality, it depends on the car
     • The distributions for the different conditions overlap
       – Simply knowing the current sound level is not enough to know the state of the car

  6. The Model!
     [Figure: a four-state graph over the Idling, Accelerating, Cruising and Decelerating states, with the emission densities P(x|idle), P(x|accel), P(x|cruise), P(x|decel) (peaks near 45, 70, 65, 60 dB) attached to the states and transition probabilities on the arcs]
     Transition probabilities (rows: current state; columns: next state):
          I     A     C     D
       I  0.5   0.5   0     0
       A  0     1/3   1/3   1/3
       C  0     1/3   1/3   1/3
       D  0.25  0.25  0.25  0.25
     • The state-space model
       – Assuming all transitions from a state are equally probable
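The transition table above is easy to write down in code; a minimal sketch in NumPy (the state ordering I, A, C, D is our own convention for illustration):

```python
import numpy as np

# Assumed state order: 0 = Idling, 1 = Accelerating, 2 = Cruising, 3 = Decelerating.
# A[i, j] = P(next state = j | current state = i); each row sums to 1.
A = np.array([
    [0.50, 0.50, 0.00, 0.00],   # from Idling
    [0.00, 1/3,  1/3,  1/3 ],   # from Accelerating
    [0.00, 1/3,  1/3,  1/3 ],   # from Cruising
    [0.25, 0.25, 0.25, 0.25],   # from Decelerating
])

assert np.allclose(A.sum(axis=1), 1.0)   # each row is a valid distribution
```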

  7. Estimating the state at T = 0-
     [Figure: P(state) = 0.25 for each of Idling, Accelerating, Cruising, Decelerating]
     • At T=0, before the first observation, we know nothing of the state
       – Assume all states are equally likely

  8. The first observation
     [Figure: the emission densities P(x|idle), P(x|decel), P(x|cruise), P(x|accel), with the observation marked at 67 dB]
     • At T=0 we observe the sound level x_0 = 67 dB SPL
       – The observation modifies our belief in the state of the system
         • P(x_0|idle) = 0
         • P(x_0|deceleration) = 0.0001
         • P(x_0|acceleration) = 0.7
         • P(x_0|cruising) = 0.5
       – Note, these don't have to sum to 1
       – In fact, since these are densities, any of them can be > 1

  9. Estimating the state after observing x_0
     • P(state | x_0) = C P(state) P(x_0|state)
       – P(idle | x_0) = 0
       – P(deceleration | x_0) = C · 0.000025
       – P(cruising | x_0) = C · 0.125
       – P(acceleration | x_0) = C · 0.175
     • Normalizing:
       – P(idle | x_0) = 0
       – P(deceleration | x_0) = 0.000083
       – P(cruising | x_0) = 0.42
       – P(acceleration | x_0) = 0.57
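This single Bayes update can be checked numerically; a sketch with the likelihoods from the slide (state order I, A, C, D is an assumed convention; exact normalization gives ≈0.58 / 0.42, which the slide quotes as 0.57 / 0.42):

```python
import numpy as np

prior = np.full(4, 0.25)                       # uniform over I, A, C, D
lik_x0 = np.array([0.0, 0.7, 0.5, 0.0001])     # P(x_0 | state) from the slide

posterior = prior * lik_x0                     # P(state) * P(x_0 | state)
posterior /= posterior.sum()                   # the normalizing constant C

print(posterior)   # ≈ [0, 0.583, 0.417, 0.0001]
```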

  10. Estimating the state at T = 0+
     [Figure: updated state probabilities — Idling 0.0, Accelerating 0.57, Cruising 0.42, Decelerating 8.3 x 10^-5]
     • At T=0, after the first observation, we must update our belief about the states
       – The first observation provided some evidence about the state of the system
       – It modifies our belief in the state of the system

  11. Predicting the state at T=1
     [Figure: the T=0+ state probabilities (I 0.0, A 0.57, C 0.42, D 8.3 x 10^-5) propagated through the transition matrix]
     • Predicting the probability of idling at T=1:
       – P(idling | idling) = 0.5
       – P(idling | deceleration) = 0.25
       – P(idling at T=1 | x_0) = P(I_T=0 | x_0) P(I|I) + P(D_T=0 | x_0) P(I|D) = 2.1 x 10^-5
     • In general, for any state S:
       – P(S_T=1 | x_0) = Σ_{S_T=0} P(S_T=0 | x_0) P(S_T=1 | S_T=0)
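The idling prediction above is just a two-term dot product; as a quick check with the slide's numbers:

```python
# P(I at T=1 | x_0) = P(I at T=0 | x_0) * P(I|I) + P(D at T=0 | x_0) * P(I|D)
p_idle_t1 = 0.0 * 0.5 + 8.3e-5 * 0.25
print(p_idle_t1)   # 2.075e-05, i.e. ~2.1 x 10^-5 as on the slide
```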

  12. Predicting the state at T = 1
     P(S_T=1 | x_0) = Σ_{S_T=0} P(S_T=0 | x_0) P(S_T=1 | S_T=0)
     [Figure: the T=0+ probabilities (I 0.0, A 0.57, C 0.42, D 8.3 x 10^-5) yield the predicted T=1 distribution: Idling 2.1 x 10^-5, Accelerating 0.33, Cruising 0.33, Decelerating 0.33]

  13. Updating after the observation at T=1
     [Figure: the emission densities P(x|idle), P(x|decel), P(x|cruise), P(x|accel), with the observation marked at 63 dB]
     • At T=1 we observe x_1 = 63 dB SPL
       • P(x_1|idle) = 0
       • P(x_1|deceleration) = 0.2
       • P(x_1|acceleration) = 0.001
       • P(x_1|cruising) = 0.5

  14. Update after observing x_1
     • P(state | x_0:1) = C P(state | x_0) P(x_1|state)
       – P(idle | x_0:1) = 0
       – P(deceleration | x_0:1) = C · 0.066
       – P(cruising | x_0:1) = C · 0.165
       – P(acceleration | x_0:1) = C · 0.00033
     • Normalizing:
       – P(idle | x_0:1) = 0
       – P(deceleration | x_0:1) = 0.285
       – P(cruising | x_0:1) = 0.713
       – P(acceleration | x_0:1) = 0.0014
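The whole T=0 → T=1 chain of updates and predictions can be reproduced in a few lines; a sketch using the slides' numbers (state order I, A, C, D assumed):

```python
import numpy as np

A = np.array([[0.50, 0.50, 0.00, 0.00],      # transitions; rows/cols: I, A, C, D
              [0.00, 1/3,  1/3,  1/3 ],
              [0.00, 1/3,  1/3,  1/3 ],
              [0.25, 0.25, 0.25, 0.25]])

liks = [np.array([0.0, 0.7,   0.5, 0.0001]),  # P(x_0 | state), x_0 = 67 dB
        np.array([0.0, 0.001, 0.5, 0.2   ])]  # P(x_1 | state), x_1 = 63 dB

belief = np.full(4, 0.25)                     # predicted distribution at T=0
for t, lik in enumerate(liks):
    belief = belief * lik                     # UPDATE with observation x_t
    belief /= belief.sum()                    # normalize (the constant C)
    if t + 1 < len(liks):
        belief = belief @ A                   # PREDICT the state at t+1

print(belief)   # ≈ [0, 0.0014, 0.713, 0.285] -- cruising, as on slide 15
```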

  15. Estimating the state at T = 1+
     [Figure: updated state probabilities — Idling 0.0, Accelerating 0.0014, Cruising 0.713, Decelerating 0.285]
     • The updated probability at T=1 incorporates information from both x_0 and x_1
       – It is NOT a local decision based on x_1 alone
       – Because of the Markov nature of the process, the state at T=0 affects the state at T=1
         • x_0 provides evidence for the state at T=1

  16. Estimating a unique state
     • What we have estimated is a distribution over the states
     • If we had to guess a state, we would pick the most likely state from the distributions:
       – At T=0 (I 0.0, A 0.57, C 0.42, D 8.3 x 10^-5): State(T=0) = Accelerating
       – At T=1 (I 0.0, A 0.0014, C 0.713, D 0.285): State(T=1) = Cruising

  17. Overall procedure
     PREDICT:  P(S_T | x_0:T-1) = Σ_{S_T-1} P(S_T-1 | x_0:T-1) P(S_T | S_T-1)
     UPDATE:   P(S_T | x_0:T) = C · P(S_T | x_0:T-1) P(x_T | S_T)
     (Predict the distribution of the state at T, update it after observing x_T, then set T = T+1 and repeat.)
     • At T=0 the predicted state distribution is the initial state probability
     • At each time T, the current estimate of the distribution over states considers all observations x_0 ... x_T
       – A natural outcome of the Markov nature of the model
     • The prediction+update is identical to the forward computation for HMMs to within a normalizing constant
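The predict/update loop above generalizes to any discrete-state model; a minimal sketch (the function name and interface are our own):

```python
import numpy as np

def filter_states(pi, A, likelihoods):
    """Recursive predict/update filter for a discrete-state Markov model.

    pi          : initial state distribution, shape (S,)
    A           : transition matrix, A[i, j] = P(S_t = j | S_t-1 = i)
    likelihoods : sequence of vectors P(x_t | state), one per observation
    Returns the filtered posteriors P(S_t | x_0:t), one row per time step.
    """
    belief = np.asarray(pi, dtype=float)   # predicted distribution at T=0
    out = []
    for lik in likelihoods:
        belief = belief * lik              # UPDATE with P(x_t | S_t)
        belief = belief / belief.sum()     # normalize (the constant C)
        out.append(belief)
        belief = belief @ A                # PREDICT the next state
    return np.array(out)
```

On the car example this reproduces slides 9-15: the T=0 row peaks at Accelerating and the T=1 row at Cruising.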

  18. Comparison to the Forward Algorithm
     PREDICT:  P(S_T | x_0:T-1) = Σ_{S_T-1} P(S_T-1 | x_0:T-1) P(S_T | S_T-1)
     UPDATE:   P(S_T | x_0:T) = C · P(S_T | x_0:T-1) P(x_T | S_T)
     • Forward algorithm:
       – P(x_0:T, S_T) = P(x_T | S_T) Σ_{S_T-1} P(x_0:T-1, S_T-1) P(S_T | S_T-1)
         (the sum is the PREDICT step; the multiplication by P(x_T | S_T) is the UPDATE step)
     • Normalized:
       – P(S_T | x_0:T) = ( Σ_{S'_T} P(x_0:T, S'_T) )^-1 P(x_0:T, S_T) = C P(x_0:T, S_T)

  19. Decomposing the forward algorithm
     P(x_0:T, S_T) = P(x_T | S_T) Σ_{S_T-1} P(x_0:T-1, S_T-1) P(S_T | S_T-1)
     • Predict:
       – P(x_0:T-1, S_T) = Σ_{S_T-1} P(x_0:T-1, S_T-1) P(S_T | S_T-1)
     • Update:
       – P(x_0:T, S_T) = P(x_T | S_T) P(x_0:T-1, S_T)
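Because normalization only rescales, running the forward recursion unnormalized and dividing once at the end gives the same posterior as the step-by-step filter; a sketch on the car example (state order I, A, C, D assumed):

```python
import numpy as np

A = np.array([[0.50, 0.50, 0.00, 0.00],      # transitions; rows/cols: I, A, C, D
              [0.00, 1/3,  1/3,  1/3 ],
              [0.00, 1/3,  1/3,  1/3 ],
              [0.25, 0.25, 0.25, 0.25]])
liks = [np.array([0.0, 0.7,   0.5, 0.0001]),  # P(x_0 | state)
        np.array([0.0, 0.001, 0.5, 0.2   ])]  # P(x_1 | state)

# Unnormalized forward recursion: alpha_T(S) = P(x_0:T, S_T)
alpha = np.full(4, 0.25) * liks[0]
for lik in liks[1:]:
    alpha = (alpha @ A) * lik                 # predict, then update, no normalizing

posterior = alpha / alpha.sum()               # one division recovers P(S_T | x_0:T)
print(posterior)   # ≈ [0, 0.0014, 0.713, 0.285] -- same as the normalized filter
```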

  20. Estimating the state
     Estimate(S_T) = argmax_{S_T} P(S_T | x_0:T)
     • The state is estimated from the updated distribution of the predict/update recursion
       – The updated distribution, not the state estimate, is propagated forward in time

  21. Predicting the next observation
     • The probability distribution for the observation at the next time is a mixture:
       – P(x_T | x_0:T-1) = Σ_{S_T} P(x_T | S_T) P(S_T | x_0:T-1)
     • The actual observation can be predicted from P(x_T | x_0:T-1)

  22. Predicting the next observation
     • MAP estimate:
       – argmax_{x_T} P(x_T | x_0:T-1)
     • MMSE estimate:
       – E[x_T | x_0:T-1]
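As a concrete illustration of the MMSE estimate: if each state's emission density has the mean suggested by the figure on slide 5 (45, 70, 65, 60 dB for I, A, C, D — an assumption for illustration), the MMSE prediction is just the probability-weighted mean of the predictive mixture:

```python
import numpy as np

means = np.array([45.0, 70.0, 65.0, 60.0])   # assumed emission means: I, A, C, D
pred = np.array([2.1e-5, 1/3, 1/3, 1/3])     # P(S_T | x_0:T-1) from slide 12

# MMSE estimate: E[x_T | x_0:T-1] = sum_S P(S | x_0:T-1) * E[x_T | S]
x_mmse = pred @ means / pred.sum()           # renormalize the rounded numbers
print(x_mmse)   # ≈ 65.0 dB
```

The MAP estimate would instead take the argmax of the mixture density P(x_T | x_0:T-1), which for a mixture of overlapping densities generally requires a numerical search rather than a closed form.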

  23. Difference from Viterbi decoding
     • We estimate only the current state at any time, not the state sequence
       – Although we are considering all past observations
     • The most likely state at T and the most likely state at T+1 may be such that there is no valid transition between S_T and S_T+1

  24. A known state model
     • The HMM assumes a very coarsely quantized state space
       – Idling / accelerating / cruising / decelerating
     • The actual state can be finer
       – Idling, accelerating at various rates, decelerating at various rates, cruising at various speeds
     • Solution: many more states (one for each acceleration/deceleration rate, cruising speed)?
     • Solution: a continuous-valued state
