Probability Density (1) Let f ( x 1 , x 2 . . . x n ) be a - - PDF document


slide-1
SLIDE 1

Probability Density (1)

Let $f(x_1, x_2, \dots, x_n)$ be a probability density for the variables $\{x_1, x_2, \dots, x_n\}$. These variables can always be viewed as coordinates over an abstract space (a ‘manifold’).

The probability of a domain $A$ is computed via

$$P(A) = \int_A dx_1\, dx_2 \cdots dx_n \; f(x_1, x_2, \dots, x_n) .$$

Even when there is a volume element $dV(x_1, x_2, \dots, x_n)$ over the space, one should never integrate a probability density using

$$P(A) = \int_A dV(x_1, x_2, \dots, x_n) \; f(x_1, x_2, \dots, x_n) .$$
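
The warning can be checked numerically. The sketch below assumes the unit sphere with colatitude θ and longitude ϕ, the homogeneous density f(θ, ϕ) = sin θ / (4π), and the surface element dV = sin θ dθ dϕ. The function names and the midpoint-rule grid are illustrative choices, not part of the slides.

```python
import math

def f(theta, phi):
    """Homogeneous density over the unit sphere in the coordinates (theta, phi)."""
    return math.sin(theta) / (4.0 * math.pi)

def integrate(g, n_theta=200, n_phi=200):
    """Midpoint-rule integral of g over 0 <= theta <= pi, 0 <= phi <= 2*pi."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += g(theta, phi)
    return total * d_theta * d_phi

# Rule from the slide: integrate the density against dtheta dphi only.
p_correct = integrate(f)

# Rule the slide warns against: weight the density by dV = sin(theta) dtheta dphi.
p_wrong = integrate(lambda theta, phi: math.sin(theta) * f(theta, phi))
```

Under these assumptions `p_correct` comes out close to 1, while `p_wrong` comes out close to π/4, so the second rule does not even give a total probability of one.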

slide-2
SLIDE 2

Probability Density (2)

When changing from the variables $\{x_1, x_2, \dots, x_n\}$ to some other variables $\{y_1, y_2, \dots, y_n\}$, the probability distribution that was represented by the probability density $f(x_1, x_2, \dots, x_n)$ is now represented by a probability density $g(y_1, y_2, \dots, y_n)$, and one has (Jacobian rule)

$$g(y_1, y_2, \dots, y_n) = f(x_1, x_2, \dots, x_n) \, J ,$$

where $J$ is the absolute value of the determinant of the matrix of partial derivatives

$$\begin{pmatrix}
\dfrac{\partial x_1}{\partial y_1} & \cdots & \dfrac{\partial x_1}{\partial y_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial x_n}{\partial y_1} & \cdots & \dfrac{\partial x_n}{\partial y_n}
\end{pmatrix} .$$

slide-3
SLIDE 3

Marginal Probability Density

When the whole set of variables $\{x_1, x_2, \dots, x_n\}$ naturally separates into two groups of variables $\{u_1, u_2, \dots, u_p\}$ and $\{v_1, v_2, \dots, v_q\}$ (with $p + q = n$), all the information concerning the variables $\{u_1, u_2, \dots, u_p\}$ alone is contained in the marginal probability density

$$f_u(u_1, u_2, \dots, u_p) = \int_{\text{all range}} dv_1 \cdots dv_q \; f(u_1, u_2, \dots, u_p, v_1, v_2, \dots, v_q) .$$

Similarly, all the information concerning the variables $\{v_1, v_2, \dots, v_q\}$ alone is contained in the marginal probability density

$$f_v(v_1, v_2, \dots, v_q) = \int_{\text{all range}} du_1 \cdots du_p \; f(u_1, u_2, \dots, u_p, v_1, v_2, \dots, v_q) .$$
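
A minimal numerical sketch of the two marginals, assuming a hypothetical joint density f(u, v) = u + v on the unit square (one u variable, one v variable). The density and the helper names are illustrative and do not come from the slides.

```python
def f(u, v):
    """Hypothetical joint density on 0 <= u, v <= 1; it integrates to 1."""
    return u + v

def marginal_u(u, n=1000):
    """f_u(u): integrate f over the whole range of v (midpoint rule)."""
    dv = 1.0 / n
    return sum(f(u, (k + 0.5) * dv) for k in range(n)) * dv

def marginal_v(v, n=1000):
    """f_v(v): integrate f over the whole range of u (midpoint rule)."""
    du = 1.0 / n
    return sum(f((k + 0.5) * du, v) for k in range(n)) * du
```

For this density the integrals can also be done by hand, giving f_u(u) = u + 1/2 and f_v(v) = v + 1/2, which the numerical values reproduce.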

slide-4
SLIDE 4

Warning

This definition of marginal probability density is, except for some minor interpretation details, safe. The same is not true for the definition of conditional probability density. The simple definition one finds in most texts is usually overinterpreted and leads to paradoxes, the most famous of all being the ‘Borel paradox’.

slide-5
SLIDE 5

Reproduced from Kolmogorov’s Foundations of the Theory of Probability (1950, pp. 50–51).

§ 2. Explanation of a Borel Paradox

Let us choose for our basic set $E$ the set of all points on a spherical surface. Our $F$ will be the aggregate of all Borel sets of the spherical surface. And finally, our $P(A)$ is to be proportional to the measure set of $A$. Let us now choose two diametrically opposite points for our poles, so that each meridian circle will be uniquely defined by the longitude $\psi$, $0 \le \psi < \pi$. Since $\psi$ varies from $0$ only to $\pi$, in other words, we are considering complete meridian circles (and not merely semicircles), the latitude $\theta$ must vary from $-\pi$ to $+\pi$ (and not from $-\frac{\pi}{2}$ to $+\frac{\pi}{2}$). Borel set the following problem: Required to determine “the conditional probability distribution” of latitude $\theta$, $-\pi \le \theta < +\pi$, for a given longitude $\psi$.

It is easy to calculate that

$$P_\psi(\theta_1 \le \theta < \theta_2) = \frac{1}{4} \int_{\theta_1}^{\theta_2} |\cos\theta| \, d\theta .$$

The probability distribution of $\theta$ for a given $\psi$ is not uniform.

If we assume that the conditional probability distribution of $\theta$ “with the hypothesis that $\xi$ lies on the given meridian circle” must be uniform, then we have arrived at a contradiction.

This shows that the concept of a conditional probability with regard to an isolated given hypothesis whose probability equals 0 is inadmissible. For we can obtain a probability distribution for $\theta$ on the meridian circle only if we regard this circle as an element of the decomposition of the entire spherical surface into meridian circles with the given poles.
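
A short worked check of the quoted formula, using only the expression for $P_\psi$ given in the passage. The first line confirms that it is normalized over $-\pi \le \theta < +\pi$. The second line compares one interval (the choice $0 \le \theta < \pi/6$ is arbitrary) with the value a uniform distribution over the full meridian circle would give.

```latex
\[
\int_{-\pi}^{+\pi} \frac{1}{4}\,|\cos\theta|\,d\theta
  = \frac{1}{4}\cdot 4\int_{0}^{\pi/2}\cos\theta\,d\theta
  = 1 ,
\]
\[
P_\psi\!\left(0 \le \theta < \tfrac{\pi}{6}\right)
  = \frac{1}{4}\int_{0}^{\pi/6}\cos\theta\,d\theta
  = \frac{1}{4}\sin\frac{\pi}{6}
  = \frac{1}{8}
  \;\neq\;
  \frac{\pi/6}{2\pi} = \frac{1}{12} .
\]
```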

slide-6
SLIDE 6

Conditional Probability

(Not yet conditional probability density)

$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
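
A small discrete example of this definition, assuming two fair dice. The events A (the sum is 8) and B (the first die is even) are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def in_a(o):
    return o[0] + o[1] == 8      # A: the sum is 8

def in_b(o):
    return o[0] % 2 == 0         # B: the first die is even

p_a = prob(in_a)
p_b = prob(in_b)
p_a_and_b = prob(lambda o: in_a(o) and in_b(o))
p_a_given_b = p_a_and_b / p_b    # P(A|B) = P(A ∩ B) / P(B)
```

Here P(B) is strictly positive, so the ratio is well defined: P(A|B) = 1/6, compared with P(A) = 5/36.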

slide-7
SLIDE 7

$$f(u|v_0) = \frac{f(u, v_0)}{\int du \, f(u, v_0)}$$
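
The formula can be applied literally to a simple case. The sketch below assumes a hypothetical joint density f(u, v) = u + v on the unit square and the value v0 = 0.25. Both are illustrative choices, as are the function names.

```python
def f(u, v):
    """Hypothetical joint density on 0 <= u, v <= 1."""
    return u + v

def integrate_over_u(g, n=1000):
    """Midpoint-rule integral of g(u) over 0 <= u <= 1."""
    du = 1.0 / n
    return sum(g((k + 0.5) * du) for k in range(n)) * du

def conditional_given(v0):
    """Return u -> f(u|v0) = f(u, v0) / integral du f(u, v0)."""
    norm = integrate_over_u(lambda u: f(u, v0))
    return lambda u: f(u, v0) / norm

cond = conditional_given(0.25)
total = integrate_over_u(cond)   # the conditional density should integrate to 1
```

For this density the denominator equals 0.75, so f(u | v0 = 0.25) = (u + 0.25) / 0.75.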

slide-8
SLIDE 8

$$f(\, u \,|\, v = v(u) \,) = \frac{f(\, u , v(u) \,)}{\int du \, f(\, u , v(u) \,)}$$
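
A sketch that applies this formula as written, assuming a hypothetical joint density (a standard two-dimensional Gaussian) and a hypothetical relation v(u) = u. Neither choice comes from the slides, and the example says nothing about when the formula is valid.

```python
import math

def f(u, v):
    """Hypothetical joint density: standard two-dimensional Gaussian."""
    return math.exp(-0.5 * (u * u + v * v)) / (2.0 * math.pi)

def v_of_u(u):
    """Hypothetical relation v = v(u): the straight line v = u."""
    return u

def normalization(u_min=-8.0, u_max=8.0, n=4000):
    """Midpoint-rule value of the denominator, integral du f(u, v(u))."""
    du = (u_max - u_min) / n
    total = 0.0
    for k in range(n):
        u = u_min + (k + 0.5) * du
        total += f(u, v_of_u(u))
    return total * du

NORM = normalization()

def conditional(u):
    """f(u | v = v(u)) computed literally from the formula."""
    return f(u, v_of_u(u)) / NORM
```

With these choices f(u, v(u)) is proportional to exp(−u²), so the normalized result is exp(−u²) / √π.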

slide-9
SLIDE 9

The Borel Paradox

an example of the danger of overinterpreting the usual definition of conditional probability density

Arbitrary probability density over the sphere, using spherical coordinates: $f(\theta, \varphi)$.

$$P(A) = \int_A d\theta \, d\varphi \; f(\theta, \varphi)$$

The homogeneous probability density:

$$f(\theta, \varphi) = \frac{\sin\theta}{4\pi} ; \qquad \int_0^{\pi} d\theta \int_0^{2\pi} d\varphi \; f(\theta, \varphi) = 1$$

slide-10
SLIDE 10

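Marginal probability density for $\theta$:

$$f_\theta(\theta) = \int_0^{2\pi} d\varphi \; f(\theta, \varphi) = \frac{\sin\theta}{2}$$

Marginal probability density for $\varphi$:

$$f_\varphi(\varphi) = \int_0^{\pi} d\theta \; f(\theta, \varphi) = \frac{1}{2\pi}$$

→ interpretation O.K.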

slide-11
SLIDE 11

A point $P$ has materialized on the surface of the sphere, with homogeneous probability density, and we are told that it has materialized on the meridian defined by $\varphi = \varphi_0$. What is the probability density for the colatitude $\theta$?

Conditional probability density for $\theta$ given $\varphi = \varphi_0$:

$$f_{\theta|\varphi}(\theta \,|\, \varphi = \varphi_0) = \frac{f(\theta, \varphi_0)}{\int_0^{\pi} d\theta \; f(\theta, \varphi_0)} = \frac{\sin\theta}{2}$$
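
Why this result is delicate can be seen in a small Monte Carlo experiment. It is offered as an illustration only, not as the resolution the slides develop later. The sketch assumes the unit sphere, ϕ0 = 0, and two hypothetical ways of making "the point is on the meridian" approximate: a thin lune |ϕ| < ε, and a thin band of constant width around the same half great circle. The sample size, ε, and the seed are arbitrary.

```python
import math
import random

def near_pole_fractions(n=300_000, eps=0.05, seed=0):
    """Fraction of selected points with colatitude theta < pi/4, for two
    different thin regions around the meridian phi = 0."""
    rng = random.Random(seed)
    lune_hits = lune_total = 0
    band_hits = band_total = 0
    for _ in range(n):
        # Homogeneous point on the unit sphere: z = cos(theta) uniform in
        # [-1, 1], longitude uniform in [-pi, pi).
        z = rng.uniform(-1.0, 1.0)
        phi = rng.uniform(-math.pi, math.pi)
        theta = math.acos(z)
        near_pole = theta < math.pi / 4.0
        # Region 1: thin lune between the meridians phi = -eps and phi = +eps.
        if abs(phi) < eps:
            lune_total += 1
            lune_hits += near_pole
        # Region 2: band of constant half-width around the same half circle,
        # |y| < eps on the side x > 0.
        y = math.sin(theta) * math.sin(phi)
        if math.cos(phi) > 0.0 and abs(y) < eps:
            band_total += 1
            band_hits += near_pole
    return lune_hits / lune_total, band_hits / band_total

lune_fraction, band_fraction = near_pole_fractions()
```

Under these assumptions the lune fraction should land near (1 − cos(π/4)) / 2 ≈ 0.146, which is what the density sin θ / 2 predicts, while the band fraction should land near 0.25, which is what a uniform density in θ would predict. Both regions shrink to the same meridian as ε goes to zero.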

slide-12
SLIDE 12

Rather than developing here the theory that is totally free from those inconsistencies (and proposing more general formulas for the conditional probability density), I choose to take the formulas above as they are and to give (later on) the precise conditions for their validity (conditions that are not fulfilled in the Borel problem...).

slide-13
SLIDE 13

Bayes’ Theorem

Some variables $\{u, v\} = \{u_1, u_2, \dots, v_1, v_2, \dots\}$

‘Joint’ probability density: $f(u, v)$

Marginal probability density:

$$f_v(v) = \int du \; f(u, v)$$

Conditional probability density:

$$f(u|v_0) = \frac{f(u, v_0)}{\int du \; f(u, v_0)} , \qquad f(u|v) = \frac{f(u, v)}{\int du \; f(u, v)}$$

Using the definition of marginal probability density, the conditional probability density can be written

$$f(u|v) = \frac{f(u, v)}{f_v(v)}$$

Therefore,

$$f(u, v) = f(u|v) \, f_v(v)$$
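
The factorization can be checked numerically. The sketch below assumes a hypothetical joint density f(u, v) = (6/5)(u + v²) on the unit square; the density, the test point, and the helper names are illustrative. The last line applies the same definition with the roles of u and v exchanged, which is not shown on the slide.

```python
def f(u, v):
    """Hypothetical joint density on 0 <= u, v <= 1; it integrates to 1."""
    return 1.2 * (u + v * v)

def integrate(g, n=2000):
    """Midpoint-rule integral of g over the interval [0, 1]."""
    h = 1.0 / n
    return sum(g((k + 0.5) * h) for k in range(n)) * h

def f_v(v):
    """Marginal density of v: integrate the joint density over u."""
    return integrate(lambda u: f(u, v))

def f_u(u):
    """Marginal density of u: integrate the joint density over v."""
    return integrate(lambda v: f(u, v))

def f_u_given_v(u, v):
    """Conditional density f(u|v) = f(u, v) / f_v(v)."""
    return f(u, v) / f_v(v)

def f_v_given_u(v, u):
    """Same definition with the roles of u and v exchanged."""
    return f(u, v) / f_u(u)

u0, v0 = 0.3, 0.7
joint = f(u0, v0)
product_uv = f_u_given_v(u0, v0) * f_v(v0)   # f(u|v) f_v(v)
product_vu = f_v_given_u(v0, u0) * f_u(u0)   # f(v|u) f_u(u)
```

Both products reproduce the joint density at the chosen point, up to rounding.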