Advanced Probabilistic Models for Speech and Language - PowerPoint PPT Presentation




SLIDE 1

Outline: Probabilistic Models · State-Space Models · Generalized SSMs · Nonlinear SSMs · Bayesian Learning · Inferring Trajectories · Learning Example · Conclusion

Advanced Probabilistic Models for Speech and Language

Mark Andrews

Gatsby Computational Neuroscience Unit
www.gatsby.ucl.ac.uk/~mark
28 June 2003

Abstract: This project proposal concerns the statistical and machine-learning models used in speech and language research. It aims to develop a more general theoretical framework for the use of state-space models in speech and language, and to expand the set of models currently used in these fields.

SLIDE 2


1. Probabilistic Models

Modeling speech and language demands the development and use of appropriate probabilistic models. These models are necessary for:

1. Machine Learning: Probabilistic models provide a general means by which to derive and analyse algorithms.
2. Neural-Network Modeling: Probabilistic models provide statistical interpretations of neural-network models of language processing.
3. Data Analysis: Probabilistic models facilitate data analysis in the experimental study of human language abilities.

Suitable models should describe, for example, the sequential and recursive structures found in these domains.

SLIDE 3


2. State-Space Models

In state-space models, the observed data are a function of a state-space that is evolving through time.

[Graphical model: a chain of latent states x_{t-1}, x_t, x_{t+1}, each emitting an observed variable y_{t-1}, y_t, y_{t+1}.]

Examples of state-space models include Hidden Markov models (HMMs), Kalman filter models (KFMs), their hybrids, and their variants. Despite their proven usefulness, many of these familiar models have limitations.
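The shared generative structure of these models can be sketched in code. Below is a minimal one-dimensional linear-Gaussian model of the Kalman-filter family; the parameter names and values (`a`, `q`, `r`) and the 1-D setting are illustrative choices, not taken from the slides:

```python
import math
import random

def simulate_lgssm(T, a=0.9, q=0.1, r=0.5, seed=0):
    """Simulate a 1-D linear-Gaussian state-space model:
        x_t = a * x_{t-1} + process noise N(0, q)
        y_t = x_t         + observation noise N(0, r)
    The observations y_t are a noisy function of a latent state x_t
    that evolves through time."""
    rng = random.Random(seed)
    xs, ys = [], []
    x = 0.0
    for _ in range(T):
        x = a * x + rng.gauss(0.0, math.sqrt(q))   # state transition
        y = x + rng.gauss(0.0, math.sqrt(r))       # emission
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = simulate_lgssm(100)
```

Replacing the Gaussian emission with a discrete one (and the continuous state with a finite one) gives the HMM counterpart of the same graphical structure.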

SLIDE 4


3. Generalized State-Space Models

To overcome these limitations, powerful yet flexible generalizations of current models are needed:

• Powerful enough to model, for example, non-regular recursive structures, arbitrarily nonlinear functions, continuous state topologies, and continuous-time dynamics.
• Flexible enough to yield tractable learning algorithms and to allow for tractable inference.

Possible candidates include nonlinear state-space models (NSSMs), which generalize HMMs, KFMs, and other state-space models.

SLIDE 5


4. Nonlinear State-Space Models

Figure 1: A nonlinear state-space model. The upper-left panel shows the state-space of a 2-dimensional dynamical system, where arrows represent the rate and direction of the dynamics. The three panels on the right represent a multinomial output for the dynamical system: the three surfaces give the probabilities of each of three discrete values, and they sum to one at each point in the state-space. The lower-left panel shows a nonlinear scalar-valued output function.
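A model of the kind shown in Figure 1 can be sketched generatively: a 2-dimensional nonlinear dynamical system driving a 3-way multinomial output through a softmax, which guarantees the output probabilities sum to one at every point of the state-space. The tanh transition and all weights and noise levels below are arbitrary illustrative assumptions, not the figure's actual parameters:

```python
import math
import random

def softmax(zs):
    """Map real scores to probabilities that sum to one."""
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    s = sum(es)
    return [e / s for e in es]

def simulate_nssm(T, seed=0):
    """Nonlinear SSM: 2-D latent state with tanh dynamics,
    3-way multinomial (softmax) emissions."""
    rng = random.Random(seed)
    x = [0.5, -0.5]
    W = [[1.2, -0.8], [0.6, 0.9]]                 # illustrative transition weights
    V = [[1.0, 0.0], [-0.5, 1.0], [0.3, -1.0]]    # illustrative output weights
    ys = []
    for _ in range(T):
        # nonlinear state transition with Gaussian process noise
        x = [math.tanh(sum(w * xi for w, xi in zip(row, x))) + rng.gauss(0.0, 0.1)
             for row in W]
        # multinomial emission: probabilities depend nonlinearly on the state
        p = softmax([sum(v * xi for v, xi in zip(row, x)) for row in V])
        ys.append(rng.choices([0, 1, 2], weights=p)[0])
    return ys

ys = simulate_nssm(50)
```

With a linear transition and Gaussian emission this reduces to a KFM; with a degenerate (near-discrete) state distribution it approximates an HMM, which is the sense in which NSSMs generalize both.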

SLIDE 6


5. Bayesian Learning in Probabilistic Models

Posterior:

    P(θ|D) = [P(θ) / P(D)] Σ_X P(D, X | θ)

Learning methods:

    MAP:               argmax_θ P(θ|D)
    Variational Bayes: argmin_Q Σ_θ Q(θ) log [ Q(θ) / P(θ|D) ]
    MCMC:              (1/N) Σ_{i=1}^{N} δ(θ − θ̃_i) ≈ P(θ|D)
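As a concrete toy instance of the MAP criterion above (not from the slides): a Bernoulli observation model with a flat prior, where the posterior is evaluated on a grid of θ values and its argmax taken:

```python
import math

# Toy model: observations D ~ Bernoulli(theta), flat prior on theta.
# MAP estimate = argmax_theta P(theta | D), found here by grid search.
def log_posterior(theta, heads, tails):
    """Unnormalised log posterior; the flat prior contributes a constant."""
    if theta <= 0.0 or theta >= 1.0:
        return float("-inf")
    return heads * math.log(theta) + tails * math.log(1.0 - theta)

def map_estimate(heads, tails, grid_size=1000):
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    return max(grid, key=lambda t: log_posterior(t, heads, tails))

theta_map = map_estimate(heads=7, tails=3)
# With a flat prior the MAP estimate coincides with the ML estimate 7/10.
```

For models with latent variables X, as in the posterior above, the sum Σ_X P(D, X|θ) replaces the simple likelihood term, which is what makes variational and Monte Carlo approximations necessary in practice.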

SLIDE 7


6. Inferring State-Space Trajectories

Accurate inference of state-space trajectories is essential for learning.

Figure 2: Weighted particles for a one-dimensional state-space. The particles shown here represent P(x_t | y_{0:T}, θ) for 0 ≤ t ≤ 10, for random parameter values of an NSSM. The multi-modal, non-Gaussian character of the posterior is clearly evident.
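One standard way to obtain such weighted particles is a bootstrap particle filter. The sketch below targets the simpler filtering posterior P(x_t | y_{0:t}, θ) rather than the smoothed posterior P(x_t | y_{0:T}, θ) of Figure 2 (smoothing adds a backward pass over these particles), and the specific tanh dynamics and noise variances are illustrative assumptions:

```python
import math
import random

def bootstrap_filter(ys, n_particles=500, seed=0):
    """Bootstrap particle filter for an assumed 1-D nonlinear SSM:
        x_t = tanh(2 * x_{t-1}) + N(0, 0.5),   y_t = x_t + N(0, 0.5).
    Returns, per time step, a list of (particle, weight) pairs
    approximating the filtering posterior P(x_t | y_{0:t})."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    history = []
    for y in ys:
        # propagate each particle through the nonlinear dynamics
        particles = [math.tanh(2.0 * x) + rng.gauss(0.0, math.sqrt(0.5))
                     for x in particles]
        # weight by the Gaussian observation likelihood
        weights = [math.exp(-(y - x) ** 2 / (2 * 0.5)) for x in particles]
        total = sum(weights) or 1.0
        weights = [w / total for w in weights]
        history.append(list(zip(particles, weights)))
        # multinomial resampling to avoid weight degeneracy
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return history

history = bootstrap_filter([0.1, -0.2, 0.3])
```

Because the particles are not constrained to a single Gaussian bump, they can represent exactly the multi-modal, non-Gaussian posteriors visible in Figure 2.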

SLIDE 8


7. Example of MAP Learning in an NSSM

Figure 3: Example of learning a nonlinear state-space model with discrete output. The system that generated the data is shown on the left; the learned system is on the right. Although the form of the learned dynamical mapping is almost identical to that of the data generator, the state-space itself has been rescaled considerably, and the multinomial distributions exhibit the same rescaling.

SLIDE 9


8. Conclusion

General and flexible models are necessary for all stages of language modelling. Nonlinear state-space models are a promising class of models:

1. Tractable learning and inference algorithms are feasible for NSSMs.
2. Recurrent neural-network models can be given a probabilistic interpretation in terms of NSSMs.
3. NSSMs provide methods for time-series analysis and psycholinguistic data analysis.

NSSMs have powerful modelling capabilities, yet have a probabilistic structure identical to other, more familiar, state-space models.

SLIDE 10


References

[1] Lawrence R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989.
[2] Michael Isard and Andrew Blake. A smoothing filter for Condensation. In Proceedings of the 5th European Conference on Computer Vision, pages 767–781, 1998.
[3] Zoubin Ghahramani and Sam T. Roweis. Learning nonlinear dynamical systems using the EM algorithm. In Advances in Neural Information Processing Systems 11, 1998.
[4] Harri Valpola and Juha Karhunen. An unsupervised ensemble learning method for nonlinear dynamic state-space models. Neural Computation, 14(11), 2002.