Review of Probability (PowerPoint PPT Presentation)



SLIDE 1

Review of Probability

SLIDE 2

Probability Theory:

 Many techniques in speech processing require the manipulation of probabilities and statistics.
 The two principal application areas we will encounter are:
  • Statistical pattern recognition.
  • Modeling of linear systems.

SLIDE 3

Events:

 It is customary to refer to the probability of an event.
 An event is a certain set of possible outcomes of an experiment or trial.
 Outcomes are assumed to be mutually exclusive and, taken together, to cover all possibilities.

SLIDE 4

Axioms of Probability:

 To any event A we can assign a number, P(A), which satisfies the following axioms:
  • P(A) ≥ 0.
  • P(S) = 1.
  • If A and B are mutually exclusive, then P(A+B) = P(A) + P(B).
 The number P(A) is called the probability of A.
SLIDE 5

Axioms of Probability (some consequences):

 Some immediate consequences:
  • If Ā is the complement of A, then P(Ā) = 1 - P(A).
  • P(∅), the probability of the impossible event, is 0.
  • P(A) ≤ 1.
 If two events A and B are not mutually exclusive, we can show that P(A+B) = P(A) + P(B) - P(AB).

SLIDE 6

Conditional Probability:

 The conditional probability of an event A, given that event B has occurred, is defined as:
  P(A|B) = P(AB) / P(B)
 We can infer P(B|A) by means of Bayes' theorem:
  P(B|A) = P(A|B) P(B) / P(A)
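A quick numeric sketch of Bayes' theorem; the probabilities below are made-up illustrative values, not taken from the slides:

```python
# Hypothetical values for illustration only.
P_A = 0.3          # prior P(A)
P_B_given_A = 0.8  # P(B|A)
P_B = 0.5          # P(B)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
P_A_given_B = P_B_given_A * P_A / P_B
print(P_A_given_B)  # ~0.48
```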

SLIDE 7

Independence:

 Events A and B may have nothing to do with each other; they are then said to be independent.
 Two events are independent if P(AB) = P(A)P(B).
 From the definition of conditional probability:
  P(A|B) = P(A)
  P(B|A) = P(B)
  P(A+B) = P(A) + P(B) - P(A)P(B)

SLIDE 8

Independence:

 Three events A, B and C are independent only if:
  P(AB) = P(A)P(B)
  P(AC) = P(A)P(C)
  P(BC) = P(B)P(C)
  P(ABC) = P(A)P(B)P(C)

SLIDE 9

Random Variables:

 A random variable is a number chosen at random as the outcome of an experiment.
 Random variables may be real or complex and may be discrete or continuous.
  • In speech processing, the random variables encountered are most often real and discrete.
 We can characterize a random variable by its probability distribution or by its probability density function (pdf).

SLIDE 10

Random Variables (distribution function):

 The distribution function for a random variable y is the probability that y does not exceed some value u:
  F_y(u) = P(y ≤ u)
 and
  P(u < y ≤ v) = F_y(v) - F_y(u)

SLIDE 11

Random Variables (probability density function):

 The probability density function is the derivative of the distribution:
  f_y(u) = (d/du) F_y(u)
 and
  P(u < y ≤ v) = ∫_u^v f_y(y) dy
  F_y(∞) = 1
  ∫_{-∞}^{∞} f_y(y) dy = 1

SLIDE 12

Random Variables (expected value):

 We can also characterize a random variable by its statistics.
 The expected value of g(x) is written E{g(x)} or <g(x)> and defined as:
  • Continuous random variable: <g(x)> = ∫_{-∞}^{∞} g(x) f(x) dx
  • Discrete random variable: <g(x)> = Σ_x g(x) p(x)
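The discrete expectation can be sketched directly as a weighted sum; the fair six-sided die below is a hypothetical pmf chosen for illustration:

```python
# <g(x)> = sum_x g(x) p(x) for a discrete random variable.
pmf = {x: 1 / 6 for x in range(1, 7)}  # fair die, hypothetical example

def expect(g, pmf):
    return sum(g(x) * p for x, p in pmf.items())

mean = expect(lambda x: x, pmf)        # first moment, 3.5
second = expect(lambda x: x * x, pmf)  # second moment, 91/6
print(mean, second)
```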

SLIDE 13

Random Variables (moments):

 The statistics of greatest interest are the moments of X.
 The kth moment of X is the expected value of X^k.
 For a discrete random variable:
  m_k = <X^k> = Σ_x x^k p(x)

SLIDE 14

Random Variables (mean & variance):

 The first moment, m_1, is the mean of x.
  • Discrete: m_1 = X̄ = <X> = Σ_x x p(x)
  • Continuous: X̄ = ∫_{-∞}^{∞} x f(x) dx
 The second central moment, also known as the variance of p(x), is given by:
  σ² = Σ_x (x - X̄)² p(x) = <X²> - m_1²

SLIDE 15

Random Variables …:

 To estimate the statistics of a random variable, we repeat the experiment which generates the variable a large number of times.
 If the experiment is run N times, then each value x will occur Np(x) times; thus
  x̂ = (1/N) Σ_{i=1}^{N} x_i
  m̂_k = (1/N) Σ_{i=1}^{N} x_i^k
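A minimal Monte Carlo sketch of these estimators, using N draws from a U(0,1) experiment (the distribution and seed are arbitrary choices, not from the slides):

```python
import random

random.seed(0)
N = 100_000
samples = [random.random() for _ in range(N)]  # U(0,1) trials

x_hat = sum(samples) / N                  # x_hat = (1/N) sum x_i
m2_hat = sum(x * x for x in samples) / N  # m_hat_2 = (1/N) sum x_i^2

# True values for U(0,1): mean = 1/2, second moment = 1/3.
print(x_hat, m2_hat)
```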

SLIDE 16

Random Variables (Uniform density):

 A random variable has a uniform density on the interval (a, b) if:
  f_X(x) = 1/(b - a), a ≤ x ≤ b; 0 otherwise
  F_X(x) = 0, x < a; (x - a)/(b - a), a ≤ x ≤ b; 1, x > b
  σ² = (1/12)(b - a)²
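The variance formula σ² = (b - a)²/12 can be checked by simulation; a, b, and the seed below are arbitrary choices for this sketch:

```python
import random

random.seed(1)
a, b = 2.0, 5.0
N = 200_000
xs = [random.uniform(a, b) for _ in range(N)]

mean = sum(xs) / N
var = sum((x - mean) ** 2 for x in xs) / N  # sample variance
print(var, (b - a) ** 2 / 12)  # both should be close to 0.75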

SLIDE 17

Random Variables (Gaussian density):

 The Gaussian, or normal, density function is given by:
  n(x; x̄, σ) = (1/(σ√(2π))) e^{-(x - x̄)²/2σ²}

SLIDE 18

Random Variables (…Gaussian density):

 The distribution function of a normal variable is:
  N(x; x̄, σ) = ∫_{-∞}^{x} n(u; x̄, σ) du
 If we define the error function as
  erf(x) = (1/√(2π)) ∫_{-∞}^{x} e^{-u²/2} du
 then
  N(x; x̄, σ) = erf((x - x̄)/σ)
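Note that the "erf" defined on this slide is the standard normal CDF, which differs from the conventional error function (Python's math.erf) by a change of scale. A minimal sketch relating the two:

```python
import math

def phi(x):
    # The slide's "erf": standard normal CDF.
    # Related to the library erf by Phi(x) = (1 + erf(x/sqrt(2))) / 2.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_cdf(x, mean, sigma):
    # N(x; mean, sigma) = Phi((x - mean) / sigma)
    return phi((x - mean) / sigma)

print(normal_cdf(0.0, 0.0, 1.0))  # 0.5
```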

SLIDE 19

Two Random Variables:

 If two random variables x and y are to be considered together, they can be described in terms of their joint probability density f(x, y) or, for discrete variables, p(x, y).
 Two random variables are independent if
  p(x, y) = p(x) p(y)

SLIDE 20

Two Random Variables (continued):

 Given a function g(x, y), its expected value is defined as:
  • Continuous: <g(x, y)> = ∫∫ g(x, y) f(x, y) dx dy
  • Discrete: <g(x, y)> = Σ_{x,y} g(x, y) p(x, y)
 The joint moment of two discrete random variables is:
  m_ij = Σ_{x,y} x^i y^j p(x, y)

SLIDE 21

Two Random Variables (continued):

 Moments are estimated in practice by averaging repeated measurements:
  m̂_ij = (1/N) Σ_{n=1}^{N} x_n^i y_n^j
 A measure of the dependence of two random variables is their correlation, and the correlation of two variables is their joint second moment:
  m_11 = <xy> = Σ_{x,y} x y p(x, y)

SLIDE 22

Two Random Variables (continued):

 The joint second central moment of x, y is their covariance:
  λ_xy = <(x - x̄)(y - ȳ)> = m_11 - x̄ ȳ
 If x and y are independent, then their covariance is zero.
 The correlation coefficient of x and y is their covariance normalized to their standard deviations:
  r_xy = λ_xy / (σ_x σ_y)

SLIDE 23

Two Random Variables (…Gaussian random variables):

 Two random variables x and y (taken here as zero-mean) are jointly Gaussian if their density function is:
  n(x, y) = (1/(2π σ_x σ_y √(1 - r²))) exp{ -(1/(2(1 - r²))) [ x²/σ_x² - 2 r x y/(σ_x σ_y) + y²/σ_y² ] }
 where
  r = λ_xy / (σ_x σ_y)

SLIDE 24

Two Random Variables (…sum of random variables):

 The expected value of the sum of two random variables is:
  <x + y> = <x> + <y>
 This is true whether x and y are independent or not.
 We also have:
  <Σ_i x_i> = Σ_i <x_i>
  <cx> = c<x>

SLIDE 25

Two Random Variables (…sum of random variables):

 The variance of the sum of two independent random variables is:
  σ²_{x+y} = σ_x² + σ_y²
 If two random variables are independent, the probability density of their sum is the convolution of the densities of the individual variables:
  • Continuous: f_{x+y}(z) = ∫_{-∞}^{∞} f_x(u) f_y(z - u) du
  • Discrete: p_{x+y}(z) = Σ_u p_x(u) p_y(z - u)
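The discrete convolution rule can be sketched directly; the two fair dice below are a hypothetical example, not from the slides:

```python
from collections import defaultdict

# p_{x+y}(z) = sum_u p_x(u) p_y(z - u) for independent x, y.
def convolve_pmf(px, py):
    pz = defaultdict(float)
    for u, pu in px.items():
        for v, pv in py.items():
            pz[u + v] += pu * pv
    return dict(pz)

die = {k: 1 / 6 for k in range(1, 7)}
two_dice = convolve_pmf(die, die)
print(two_dice[7])  # 6/36, the most likely total of two dice
```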

SLIDE 26

Central Limit Theorem:

 Central Limit Theorem (informal paraphrase): If many independent random variables are summed, the probability density function (pdf) of the sum tends toward the Gaussian density, no matter what their individual densities are.
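A small simulation sketch of the theorem: sums of 30 i.i.d. U(0,1) variables should have mean n/2 and variance n/12, and their histogram looks roughly Gaussian (the term count, trial count, and seed are arbitrary choices):

```python
import random
import statistics

random.seed(2)
n_terms, n_trials = 30, 20_000
sums = [sum(random.random() for _ in range(n_terms)) for _ in range(n_trials)]

# Expected for a sum of 30 U(0,1) terms: mean = 15, variance = 2.5.
print(statistics.mean(sums), statistics.pvariance(sums))
```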

SLIDE 27

Multivariate Normal Density:

 The normal density function can be generalized to any number of random variables.
 Let X be the random vector X = Col[X_1, X_2, ..., X_n]. Then
  N(x) = (2π)^{-n/2} |R|^{-1/2} exp{ -(1/2) Q(x) }
 where
  Q(x) = (x - x̄)^T R^{-1} (x - x̄)
  R = <(x - x̄)(x - x̄)^T>
 The matrix R is the covariance matrix of X (R is positive-definite).
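A sketch of this density for the special case of a diagonal covariance matrix R, where |R| and R^{-1} reduce to products and reciprocals of the per-component variances (the case and values are chosen for illustration):

```python
import math

def mvn_density(x, mean, var):
    # var holds the diagonal of R; off-diagonal terms are assumed zero.
    n = len(x)
    det = 1.0
    Q = 0.0  # Q(x) = (x - xbar)^T R^{-1} (x - xbar)
    for xi, mi, vi in zip(x, mean, var):
        det *= vi
        Q += (xi - mi) ** 2 / vi
    return (2 * math.pi) ** (-n / 2) * det ** -0.5 * math.exp(-0.5 * Q)

# At the mean with unit variances, the density is (2*pi)^{-n/2}.
print(mvn_density([0.0, 0.0], [0.0, 0.0], [1.0, 1.0]))  # ~0.159
```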

SLIDE 28

Random Functions:

 A random function is one arising as the outcome of an experiment.
 Random functions do not need to be functions of time, but in all cases of interest to us they will be.
 A discrete stochastic process is characterized by many probability density functions of the form
  p(x_1, x_2, x_3, ..., x_n; t_1, t_2, t_3, ..., t_n)

SLIDE 29

Random Functions:

 If the individual values of the random signal are independent, then
  p(x_1, x_2, ..., x_n; t_1, t_2, ..., t_n) = p(x_1, t_1) p(x_2, t_2) ... p(x_n, t_n)
 If these individual probability densities are all the same, then we have a sequence of independent, identically distributed (i.i.d.) samples.

SLIDE 30

mean & autocorrelation

 The mean is the expected value of x(t):
  x̄(t) = <x(t)> = Σ_x x p(x, t)
 The autocorrelation function is the expected value of the product x(t_1) x(t_2):
  r(t_1, t_2) = <x(t_1) x(t_2)> = Σ_{x_1, x_2} x_1 x_2 p(x_1, x_2; t_1, t_2)

SLIDE 31

ensemble & time average

 Mean and autocorrelation can be determined in two ways:
  • The experiment can be repeated many times and the average taken over all these functions. Such an average is called an ensemble average.
  • Take any one of these functions as being representative of the ensemble and find the average from a number of samples of this one function. This is called a time average.
SLIDE 32

ergodicity & stationarity

 If the time average and ensemble average of a random function are the same, it is said to be ergodic.
 A random function is said to be stationary if its statistics do not change as a function of time.
  • This is also called strict-sense stationarity (vs. wide-sense stationarity).
 Any ergodic function is also stationary.

SLIDE 33

ergodicity & stationarity

 For a stationary signal we have:
  x̄(t) = x̄
 Stationarity is defined as:
  p(x_1, x_2; t_1, t_2) = p(x_1, x_2; τ)
 where
  τ = t_2 - t_1
 And the autocorrelation function is:
  r(τ) = Σ_{x_1, x_2} x_1 x_2 p(x_1, x_2; τ)

SLIDE 34

ergodicity & stationarity

 When x(t) is ergodic, its mean and autocorrelation are:
  x̄ = lim_{N→∞} (1/(2N+1)) Σ_{t=-N}^{N} x(t)
  r(τ) = <x(t) x(t+τ)> = lim_{N→∞} (1/(2N+1)) Σ_{t=-N}^{N} x(t) x(t+τ)
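A time-average sketch of these formulas: estimate r(τ) from a single long realization of a hypothetical signal, here a cosine plus small noise (signal, length, and seed are arbitrary choices):

```python
import math
import random

random.seed(3)
N, omega = 50_000, 0.5
x = [math.cos(omega * t) + random.gauss(0.0, 0.1) for t in range(N)]

def r_hat(tau):
    # Finite-sample time average of x(t) * x(t + tau).
    M = N - tau
    return sum(x[t] * x[t + tau] for t in range(M)) / M

# For the cosine part alone, r(tau) ~ 0.5 * cos(omega * tau).
print(r_hat(0), r_hat(4))
```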

SLIDE 35

cross-correlation

 The cross-correlation of two ergodic random functions is:
  r_xy(τ) = <x(t) y(t+τ)> = lim_{N→∞} (1/(2N+1)) Σ_{t=-N}^{N} x(t) y(t+τ)
 The subscript xy indicates a cross-correlation.

SLIDE 36

Random Functions (power & cross-spectral density):

 The Fourier transform of r(τ) (the autocorrelation function of an ergodic random function) is called the power spectral density of x(t):
  S(ω) = Σ_τ r(τ) e^{-jωτ}
 The cross-spectral density of two ergodic random functions is:
  S_xy(ω) = Σ_τ r_xy(τ) e^{-jωτ}

SLIDE 37

Random Functions (…power density):

 For an ergodic signal x(t), r(τ) can be written as:
  r(τ) = x(τ) ⋆ x(-τ)   (assuming x(t) is real)
 Then, from elementary Fourier transform properties,
  S(ω) = X(ω) X(-ω) = X(ω) X*(ω) = |X(ω)|²

SLIDE 38

Random Functions (White Noise):

 If all values of a random signal are uncorrelated,
  r(τ) = σ² δ(τ)
 then this random function is called white noise.
 The power spectrum of white noise is constant:
  S(ω) = σ²
 White noise is a mixture of all frequencies.
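The defining property r(τ) = σ² δ(τ) can be checked on simulated i.i.d. samples: the estimated autocorrelation should be near σ² at lag 0 and near 0 elsewhere (σ and the seed are arbitrary choices):

```python
import random

random.seed(4)
sigma, N = 2.0, 100_000
w = [random.gauss(0.0, sigma) for _ in range(N)]  # i.i.d. = white noise

def r_hat(tau):
    M = N - tau
    return sum(w[t] * w[t + tau] for t in range(M)) / M

print(r_hat(0), r_hat(5))  # ~sigma^2 = 4.0 and ~0.0
```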

SLIDE 39

Random Signals in Linear Systems:

 Let T[·] represent the linear operation; then
  <T[x(t)]> = T[<x(t)>]
 Given a system with impulse response h(n),
  y(n) = x(n) ⋆ h(n),  ȳ(n) = x̄(n) ⋆ h(n)
 A stationary signal applied to a linear system yields a stationary output:
  r_yy(τ) = r_xx(τ) ⋆ h(τ) ⋆ h(-τ)
  S_yy(ω) = |H(ω)|² S_xx(ω)
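A simulation sketch of the output statistics: unit-variance white noise through a short FIR filter. For a white input, r_xx(τ) = δ(τ), so r_yy(0) = Σ_k h(k)²; the filter h and seed below are hypothetical choices:

```python
import random

random.seed(5)
N = 100_000
x = [random.gauss(0.0, 1.0) for _ in range(N)]  # unit-variance white noise
h = [0.5, 0.3, 0.2]                             # hypothetical impulse response

# y(n) = sum_k h(k) x(n - k), skipping the filter warm-up samples
y = [sum(h[k] * x[n - k] for k in range(len(h))) for n in range(len(h), N)]

r0 = sum(v * v for v in y) / len(y)  # estimated r_yy(0)
print(r0, sum(c * c for c in h))     # both ~0.38
```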