SLIDE 1

An introduction to the theory of coherent lower previsions

2nd SIPTA School on Imprecise Probabilities, Madrid, July 24-28, 2006

Enrique Miranda
Rey Juan Carlos University, Dep. of Statistics and O.R.

SLIDE 2

Overview, Part I

  • Some considerations about probability.
  • Coherent previsions and probabilities.
  • Coherent lower and upper previsions.
  • Sets of desirable gambles and linear previsions.
  • Natural extension.

SLIDE 3

Some considerations about probability

History. Interpretations. Subjective probability.

SLIDE 4

What is the goal of probability?

Probability seeks to determine the plausibility of the different outcomes of an experiment when these cannot be predicted beforehand. What is the probability of guessing the 6 winning numbers in the lottery? What is the probability of arriving in 30’ from the airport to Manuel Becerra by car? What is the probability of having a sunny day tomorrow?

SLIDE 5

A bit of history

Three main steps: the first works on probability appeared in the 17th century, in the correspondence between Pascal and Fermat; they, and later de Moivre and Bernoulli, used probability to model games of chance. In 1812, Laplace gave a mathematical formalisation of the previous works and applied probability to other types of problems. In 1933, Kolmogorov established an axiomatic definition.

SLIDE 6

σ-fields of events

Probability is usually defined on σ-fields of events. Given a space X, an event is any subset A ⊆ X. A class A of events is called a σ-field when it satisfies:

  • 1. ∅, X ∈ A.
  • 2. Ac ∈ A for any A ∈ A.
  • 3. Given a sequence (An)n in A, the union ∪n An ∈ A.

SLIDE 7

Kolmogorov’s definition of probability

A probability P on X is a mapping defined on a σ-field A of subsets of X satisfying:

  • 1. P(A) ≥ 0 for any A in A.
  • 2. P(X) = 1.
  • 3. Given (An)n pairwise disjoint, P(∪n An) = ∑n P(An).
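A quick sketch in Python (my own illustration, not part of the slides): on a finite space we can take the σ-field A to be the whole power set and verify the axioms by brute force; the mass function below is an arbitrary choice.

```python
# Brute-force check of Kolmogorov's axioms on a finite space, with A = power set.
from itertools import combinations

X = {1, 2, 3}
p = {1: 0.2, 2: 0.5, 3: 0.3}                        # illustrative mass function

def subsets(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

P = {A: sum(p[x] for x in A) for A in subsets(X)}   # P(A) = sum of masses in A

assert all(P[A] >= 0 for A in P)                    # axiom 1: non-negativity
assert abs(P[frozenset(X)] - 1) < 1e-12             # axiom 2: P(X) = 1
# axiom 3 (on a finite space, finite additivity is countable additivity):
for A in P:
    for B in P:
        if not (A & B):                             # disjoint events
            assert abs(P[A | B] - (P[A] + P[B])) < 1e-12
print("all axioms satisfied")
```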

SLIDE 8

Finite and σ-additive probabilities

Some people only require (3) to hold for finite unions, and this produces the so-called finitely additive probabilities. Advantages: they can be defined on all events, and are easier to justify. Disadvantages: unlike σ-additive probabilities, they need not satisfy continuity properties and limit results.

SLIDE 9

Interpretations of probability

Classical. Frequentist. Subjective.

. . .

SLIDE 10

Aleatory vs. epistemic

In some cases, the probability of an event A is a property of the event, meaning that it does not depend on the subject making the assessment. We talk then of aleatory probabilities. However, and especially in the framework of decision making, we may need to assess probabilities that represent our beliefs. These may vary depending on the subject, or on the amount of information he possesses at the time. We talk then of subjective probabilities.

SLIDE 11

The behavioural interpretation

One of the possible interpretations of subjective probability is the behavioural interpretation. We interpret the probability of an event A in terms of our betting behaviour: we are disposed to bet at most P(A) euros on the event A. If we consider the gamble IA where we win 1 euro if A happens and 0 if it doesn’t happen, then we accept the transaction IA − P(A), because the expected gain is

(1 − P(A)) ∗ P(A) + (0 − P(A))(1 − P(A)) = 0.

SLIDE 12

Gambles

More generally, we can consider our betting behaviour on gambles. A gamble is a bounded real-valued variable on X,

f : X → R.

It represents a reward that depends on the outcome of the experiment modelled by X. We shall denote the set of all gambles by L(X).

SLIDE 13

Example

Who shall win the next World Cup? Consider the outcomes a = an American team, b = a European team, c = an African/Asian team.

X = {a, b, c}.

Consider the gamble f(a) = 3, f(b) = −2, f(c) = 10. Depending on how likely I consider each of the outcomes I will accept the gamble or not.
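A tiny illustration in Python (my own; the probability assignments below are hypothetical): whether I accept the gamble depends on the beliefs I hold about a, b and c.

```python
# Acceptance of a gamble depends on the probabilities assigned to the outcomes.
f = {'a': 3, 'b': -2, 'c': 10}

def expectation(f, p):
    return sum(p[w] * f[w] for w in f)

p1 = {'a': 0.5, 'b': 0.4, 'c': 0.1}   # beliefs mildly favouring an American team
p2 = {'a': 0.1, 'b': 0.8, 'c': 0.1}   # beliefs strongly favouring a European team

print(expectation(f, p1))   #  1.7 -> positive expected gain, accept f
print(expectation(f, p2))   # -0.3 -> expected loss, reject f
```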

SLIDE 14

Betting on gambles

Consider now a gamble f on X. We may consider the supremum value µ such that we are disposed to pay µ for f, i.e., such that the reward f − µ is desirable: it will be the expectation E(f). For any µ < E(f), we expect to have a gain. For any µ > E(f), we expect to have a loss.

SLIDE 15

Buying and selling prices

I may also give money in order to get the reward: if I am disposed to pay x for the gamble f, then the gamble f − x is desirable to me. I may also sell a gamble f, meaning that if I am disposed to sell it at a price x then the gamble x − f is desirable to me. In the case of probabilities, the supremum buying price for a gamble f coincides with the infimum selling price.

SLIDE 16

Events or gambles?

In the case of probabilities, we are indifferent between betting on events or on gambles: our betting rates on events (a probability) determine our betting rates on gambles (its expectation). Since we want to be able to express our beliefs for any event, we consider here finitely additive probabilities. What happens if we are not certain of which probability to use?

SLIDE 17

Existence of indecision

When we don’t have much information, it may be difficult (and unreasonable) to give a fair price P(f): there may be some prices µ for which we would not be disposed to buy or sell the gamble f. In terms of desirable gambles, this means that we would be indifferent between two gambles. It is sometimes considered preferable to give different values P(f) < P(f) than to give a precise (and possibly wrong) value.

SLIDE 18

Lower and upper previsions

The lower prevision for a gamble f, P̲(f), is my supremum acceptable buying price for f, meaning that I am disposed to buy it for P̲(f) − ε (or to accept the reward f − (P̲(f) − ε)) for any ε > 0. The upper prevision for a gamble f, P̄(f), is my infimum acceptable selling price for f, meaning that I am disposed to sell f for P̄(f) + ε (or to accept the reward P̄(f) + ε − f) for any ε > 0.

SLIDE 19

Conjugacy of P̲ and P̄

Under this interpretation,

P̲(−f) = sup{x : −f − x acceptable} = − inf{x : x − f acceptable} = −P̄(f).

Hence, it suffices to work with one of these two functions.

SLIDE 20

Important remark

The domain K of P̲:
  • need not have any predefined structure;
  • may contain indicators of events.

SLIDE 21

Lower probabilities of events

The lower probability of A, P̲(A):

  • = lower prevision P̲(IA) of the indicator of A.
  • = supremum betting rate on A.
  • = measure of the evidence supporting A.
  • = measure of the strength of our belief in A.

SLIDE 22

Upper probabilities of events

The upper probability of A, P̄(A):

  • = upper prevision P̄(IA) of the indicator of A.
  • = measure of the lack of evidence against A.
  • = measure of the plausibility of A.

We have then a behavioural interpretation of upper and lower probabilities:
  • evidence in favour of A ↑ ⇒ P̲(A) ↑
  • evidence against A ↑ ⇒ P̄(A) ↓

SLIDE 24

Events or gambles (II)

In the imprecise case, the lower and upper previsions for events do not determine the lower and upper previsions for gambles uniquely. Hence, lower and upper previsions are more informative than lower and upper probabilities.

SLIDE 25

Precise previsions

If we knew the probability of each of the outcomes, then our prevision for a gamble f would be the expected value E(f): for any smaller value x, the gamble f − x would produce a gain in the long run (so we would be disposed to buy f for it), and for any greater value x, the gamble f − x would result in a loss in the long run (so we would accept x − f, that is, we would sell f at the price x). In general we are not able to determine exactly the probability of each outcome, and we end up with lower and upper previsions.

SLIDE 26

Consistency requirements

The assessments made by a lower prevision on a set of gambles should satisfy a number of consistency requirements: A combination of the assessments should not produce a net loss, no matter the outcome: avoiding sure loss. Our supremum buying price for a gamble f should not depend on our assessments for other gambles: coherence.

SLIDE 27

Avoiding sure loss

Concerning the match between Brazil and the USSR in the 1982 World Cup, a student represented his beliefs by saying that

P̄(W) = 0.65, P̄(D) = 0.25, P̄(L) = 0.4,
P̲(W) = 0.6, P̲(D) = 0.2, P̲(L) = 0.35,

where {W, D, L} = {Win, Draw, Loss} (for Brazil). This means that the gambles IW − 0.59, ID − 0.19 and IL − 0.34 are desirable for him. But if he accepts all of them, the sum is

[IW + ID + IL] − 1.12 = 1 − 1.12 = −0.12,

which produces a net loss of 0.12, no matter the outcome of the match.
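A quick numerical check (my own sketch, mirroring the slide) confirms the sure loss:

```python
# Accepting all three bets loses 0.12 whatever the outcome of the match.
outcomes = ['W', 'D', 'L']
prices = {'W': 0.59, 'D': 0.19, 'L': 0.34}   # prices just below the lower probabilities

for true_outcome in outcomes:
    gain = sum((1.0 if A == true_outcome else 0.0) - prices[A] for A in outcomes)
    print(true_outcome, round(gain, 2))       # -0.12 in every case: a sure loss
```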

SLIDE 28

Avoiding sure loss: general definition

Let P̲ be a lower prevision defined on a set of gambles K. It avoids sure loss iff

sup_{ω∈X} ∑_{k=1}^n [fk(ω) − P̲(fk)] ≥ 0

for any f1, . . . , fn ∈ K. Otherwise, there is some ε > 0 such that

∑_{k=1}^n [fk − (P̲(fk) − ε)] < −ε

no matter the outcome.

SLIDE 29

Coherence

The student thinks about it again and comes up with the assessments:

P̄(W) = 0.52, P̄(D) = 0.61, P̄(L) = 0.31,
P̲(W) = 0.27, P̲(D) = 0.27, P̲(L) = 0.21.

These assessments avoid sure loss. However, they imply that the transaction

IW − 0.26 + IL − 0.2 = 0.54 − ID

is acceptable for him, which means that he is disposed to bet against a draw at a rate of 0.54, smaller than P̄(D) = 0.61, which indicates that P̄(D) is too large.

SLIDE 30

Coherence: general definition

A lower prevision P̲ is called coherent when, given gambles f0, f1, . . . , fn in its domain and m ∈ N,

∑_{i=1}^n [fi(ω) − P̲(fi)] ≥ m[f0(ω) − P̲(f0)]

for some ω ∈ X. Otherwise, there is some ε > 0 such that

∑_{i=1}^n [fi − (P̲(fi) − ε)] < m[f0 − (P̲(f0) + ε)],

and P̲(f0) + ε would then be an acceptable buying price for f0.

SLIDE 31

Coherence on linear spaces

Suppose the domain K is a linear space of gambles: if f, g ∈ K then f + g ∈ K, and if f ∈ K, λ ∈ R then λf ∈ K. Then P̲ is coherent if and only if for any f, g ∈ K and λ ≥ 0:

  • 1. P̲(f) ≥ inf f.
  • 2. P̲(λf) = λP̲(f).
  • 3. P̲(f + g) ≥ P̲(f) + P̲(g).
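As a small numerical illustration (my own, not from the slides; the two mass functions are arbitrary): the lower envelope of a finite set of linear previsions on a finite space is coherent, so it must satisfy the three conditions above.

```python
# Check the three coherence conditions for a lower envelope on X = {0, 1, 2}.
import random

masses = [(0.2, 0.5, 0.3), (0.6, 0.1, 0.3)]      # two illustrative probability vectors

def lower_prev(f):
    return min(sum(p * fx for p, fx in zip(m, f)) for m in masses)

random.seed(0)
for _ in range(1000):
    f = [random.uniform(-1, 1) for _ in range(3)]
    g = [random.uniform(-1, 1) for _ in range(3)]
    lam = random.uniform(0, 2)
    assert lower_prev(f) >= min(f) - 1e-9                                          # P(f) >= inf f
    assert abs(lower_prev([lam * x for x in f]) - lam * lower_prev(f)) < 1e-9      # homogeneity
    assert lower_prev([x + y for x, y in zip(f, g)]) >= lower_prev(f) + lower_prev(g) - 1e-9  # superadditivity
print("coherence conditions hold on all sampled gambles")
```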

SLIDE 32

Consequences of coherence

Whenever the gambles involved belong to the domain of P̲, P̄:

  • 1. P̲(∅) = P̄(∅) = 0, P̲(X) = P̄(X) = 1.
  • 2. A ⊆ B ⇒ P̲(A) ≤ P̲(B), P̄(A) ≤ P̄(B).
  • 3. P̲(f) + P̲(g) ≤ P̲(f + g) ≤ P̲(f) + P̄(g) ≤ P̄(f + g) ≤ P̄(f) + P̄(g).
  • 4. P̲(λf) = λP̲(f), P̄(λf) = λP̄(f) for λ ≥ 0.
  • 5. If fn → f uniformly, then P̲(fn) → P̲(f) and P̄(fn) → P̄(f).

SLIDE 33

Consequences of coherence (II)

  • λP̲(f) + (1 − λ)P̲(g) ≤ P̲(λf + (1 − λ)g) for all λ ∈ [0, 1].
  • P̲(f + µ) = P̲(f) + µ for all µ ∈ R.
  • The lower envelope of a set of coherent lower previsions is coherent.
  • A convex combination of coherent lower previsions (with the same domain) is coherent.
  • The point-wise limit (inferior) of coherent lower previsions is coherent.

SLIDE 34

Linear previsions

When P̲(f) = P̄(f) for all f ∈ K and they are coherent, we call them linear previsions. A functional P on K is a linear prevision if and only if, for any f1, . . . , fn, g1, . . . , gm ∈ K,

sup_{ω∈X} [ ∑_{j=1}^n (fj(ω) − P(fj)) − ∑_{i=1}^m (gi(ω) − P(gi)) ] ≥ 0.

These are the previsions considered by de Finetti. We shall denote by P(X) the set of all linear previsions on X.

SLIDE 35

Linear previsions: equivalent conditions

If K = −K, a lower prevision P on K is a linear prevision if and only if it avoids sure loss and is self-conjugate (P(f) = −P(−f) ∀f). If K is a linear space of gambles, then P is a linear prevision if and only if for all f, g ∈ K,

  • 1. P(f) ≥ inf f.
  • 2. P(f + g) = P(f) + P(g).

SLIDE 36

Coherence and precise previsions

Given a lower prevision P̲ on K, we can consider

M(P̲) := {P ∈ P(X) : P(f) ≥ P̲(f) ∀f ∈ K}.

  • P̲ avoids sure loss ⇐⇒ M(P̲) ≠ ∅.
  • P̲ is coherent ⇐⇒ P̲(f) = min_{P∈M(P̲)} P(f) for all f ∈ K.

There is a 1-to-1 correspondence between coherent lower previsions and (closed and convex) sets of linear previsions. This correspondence gives a sensitivity analysis interpretation to coherent lower previsions.
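On a finite space, M(P̲) is a polytope of probability vectors, so both conditions above can be checked with linear programming. A minimal sketch (my own; it assumes numpy and scipy are available, and the gambles and assessments are illustrative):

```python
# Check avoiding sure loss (M(P) non-empty) and coherence (lower envelope
# reproduces the assessments) by linear programming.
import numpy as np
from scipy.optimize import linprog

n_outcomes = 3
gambles = np.array([[1.0, 0.0, 0.0],          # indicator of {0}
                    [0.0, 1.0, 1.0]])         # indicator of {1, 2}
lower = np.array([0.3, 0.5])                  # illustrative assessments P(f_k)

# p in M(P): p >= 0, sum p = 1, gambles @ p >= lower.
A_ub, b_ub = -gambles, -lower
A_eq, b_eq = np.ones((1, n_outcomes)), np.array([1.0])

def min_expectation(f):
    res = linprog(f, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n_outcomes)
    return res.fun if res.success else None

# Avoiding sure loss <=> M(P) is non-empty (feasibility of the LP).
print("avoids sure loss:", min_expectation(np.zeros(n_outcomes)) is not None)

# Coherence <=> P(f_k) = min over M(P) of P(f_k) for every assessed gamble.
for fk, lk in zip(gambles, lower):
    print("assessment", lk, "vs lower envelope", round(min_expectation(fk), 6))
```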

SLIDE 37

Sets of desirable gambles

Given a lower prevision P, we can consider the set of gambles

D := {f ∈ K : P(f) ≥ 0},

the set of associated desirable gambles. Conversely, given a set of gambles D we can define

P(f) := sup{µ : f − µ ∈ D}

SLIDE 38

Coherence of sets of desirable gambles

A set of desirable gambles is coherent if and only if

  • 1. If sup f < 0, then f ∉ D.
  • 2. If f ≥ 0, then f ∈ D.
  • 3. If f, g ∈ D, then f + g ∈ D.
  • 4. If f ∈ D, λ ≥ 0, then λf ∈ D.
  • 5. If f + ε ∈ D for all ε > 0, then f ∈ D.

If D is a coherent set of gambles, then the lower prevision it induces is coherent. Conversely, a coherent lower prevision P determines a coherent set of desirable gambles through the previous formula.

SLIDE 39

Is coherence too strong?

Some criticisms of the property of coherence are the following:
  • Descriptive decision theory shows that beliefs sometimes violate the notion of coherence.
  • Coherent assessments may be difficult to make for people not familiar with the behavioural theory of imprecise probabilities.
  • Other rationality criteria may also be interesting.

SLIDE 40

Inference: natural extension

Consider a coherent lower prevision P̲ with domain K. What are the consequences of the assessments on K for gambles outside K? The natural extension of P̲ to all gambles is given by

E(f) := sup{µ : ∃ fk ∈ K, λk ≥ 0, k = 1, . . . , n, with f − µ ≥ ∑_{k=1}^n λk [fk − P̲(fk)]}.

E(f) is the supremum acceptable buying price for f that can be derived from the assessments on the gambles in the domain.

SLIDE 41

Natural extension: properties

  • If P̲ does not avoid sure loss, then E(f) = +∞ for every gamble f.
  • If P̲ avoids sure loss, then E is the smallest coherent lower prevision on L(X) that dominates P̲ on K.
  • P̲ is coherent if and only if E coincides with P̲ on K. E is then the least-committal extension of P̲: if there are other extensions, they reflect stronger assessments than those in P̲.
  • The natural extension of a convex combination of coherent lower previsions (with the same domain) is the convex combination of the natural extensions.

SLIDE 42

In terms of sets of linear previsions

Given a lower prevision P̲ and its set of dominating linear previsions M(P̲), the natural extension E of P̲ is the lower envelope of M(P̲). Hence, P̲ is coherent if and only if M(E) = M(P̲). Moreover, we can calculate E as the lower envelope of the extreme points of the set M(P̲).
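A sketch of this computation on a finite space (my own; numpy and scipy assumed, and the assessments are the same illustrative ones as before): E(f) is the minimum of P(f) over the polytope M(P̲), which is a linear program.

```python
# Natural extension as the lower envelope of M(P), computed by linear programming.
import numpy as np
from scipy.optimize import linprog

gambles = np.array([[1.0, 0.0, 0.0],        # assessed gambles on X = {0, 1, 2}
                    [0.0, 1.0, 1.0]])
lower = np.array([0.3, 0.5])                # illustrative assessments P(f_k)

def natural_extension(f):
    res = linprog(f,
                  A_ub=-gambles, b_ub=-lower,
                  A_eq=np.ones((1, 3)), b_eq=[1.0],
                  bounds=[(0, None)] * 3)
    # If M(P) is empty (sure loss), the natural extension is +infinity.
    return res.fun if res.success else float('inf')

print(natural_extension(np.array([0.0, 1.0, 0.0])))   # E of the indicator of {1}
print(natural_extension(np.array([2.0, -1.0, 3.0])))  # E of an arbitrary gamble
```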

SLIDE 43

In terms of sets of gambles

Consider a coherent set of desirable gambles D. Its natural extension E is the set of gambles

E := {g ∈ L(X) : (∀δ > 0)(∃ n ≥ 0, λk ∈ R+, fk ∈ D) g ≥ ∑_{k=1}^n λk fk − δ}.

It is the smallest coherent set of desirable gambles that contains D. It is the smallest closed convex cone including D and all non-negative gambles.

SLIDE 44

All these procedures of natural extension agree with one another: if we consider a coherent lower prevision P̲, its set of desirable gambles DP, the natural extension EDP of this set, and then the coherent lower prevision associated to this set, we obtain the natural extension of P̲. Hence, we have three equivalent ways of representing our behavioural dispositions:

  • Coherent lower previsions.
  • Sets of linear previsions.
  • Sets of desirable gambles.

SLIDE 45

Some references

  • T. Fine, Theories of probability, Academic Press, 1973.
  • B. de Finetti, Theory of probability, John Wiley and Sons, 1974.
  • H. Kyburg and H. Smokler (eds.), Studies in subjective probability, Wiley, New York, 1980.
  • P. Williams, Notes on conditional previsions, Technical Report, University of Sussex, 1975.
  • P. Walley, Statistical reasoning with imprecise probabilities, Chapman and Hall, 1991.

SLIDE 46

Overview, Part II

Conditional lower previsions. Definition of coherence. Products. Other types of non-additive measures: 2- and n-monotone capacities. Belief functions. Possibility and maxitive measures.

SLIDE 47

Joint lower previsions

Consider now another experiment Y taking values in a set Y. Then, we may wonder about the joint experiment X × Y, which takes values in the set X × Y. A model for the value that X × Y takes may be given by a joint lower prevision P̲ on L(X × Y).

SLIDE 48

Marginal lower previsions

Assume we have a joint lower prevision P̲ on L(X × Y) that models the value taken by X × Y. We may want to model the information about the value that Y takes, irrespective of the value X takes. We may do so by means of the marginal lower prevision P̲Y, given by

P̲Y(g) = P̲(g′),

where g′(x, y) = g(y) for all x ∈ X, y ∈ Y.

SLIDE 49

Constant gambles

Although P̲ is defined on gambles on X × Y, the above reasoning allows us to define it for gambles on X or on Y. For instance, given f ∈ L(X), we can define the gamble f′ on X × Y by

f′(x, y) = f(x) ∀x ∈ X, y ∈ Y.

Such a gamble f′ is called X-constant: it is constant once we fix the value in X.

SLIDE 50

Conditional lower previsions

Assume we have a joint lower prevision P on L(X × Y) that models the value taken by X × Y . We may want to update our model assuming that we came to know the value that Y has taken in Y. We must do so by means of a conditional lower prevision

P(·|Y) on L(X × Y).

SLIDE 51

A two-place function

A conditional lower prevision is a two-place function. On the one hand, we have a mapping P̲(·|Y) on Y that assigns the functional P̲(·|y) to the element y of Y. On the other hand, the functional P̲(·|y) on L(X × Y) gives us, for any gamble f, the supremum acceptable buying price for f if we came to know that the outcome belongs to X × {y}.

SLIDE 52

Separate coherence

The first thing we must require of a conditional lower prevision P̲(·|Y) is that it be separately coherent:

  • P̲(y|y) = 1 for any y ∈ Y.
  • P̲(·|y) is a coherent lower prevision on L(X × Y).

Consequence: P̲(·|Y) is determined by its values on X-constant gambles: for any y ∈ Y,

h(x, y) = h′(x, y) ∀x ∈ X ⇒ P̲(h|y) = P̲(h′|y).

SLIDE 53

Separate coherence on linear spaces

If P̲(·|Y) is defined on a linear subspace K of L(X × Y) that contains IX×{y} for all y, then it is separately coherent if and only if for any f, g ∈ K, λ ≥ 0 and y ∈ Y:

  • 1. P̲(f|y) ≥ inf_{x∈X} f(x, y).
  • 2. P̲(λf|y) = λP̲(f|y).
  • 3. P̲(f + g|y) ≥ P̲(f|y) + P̲(g|y).

SLIDE 54

Joint coherence

Assume we have:

  • A coherent lower prevision P̲ on L(X × Y).
  • A separately coherent conditional lower prevision P̲(·|Y) on L(X × Y).

For the coherence of these assessments, it is necessary that

P̲(IX×{y}(h − P̲(h|y))) = 0 ∀h ∈ L(X × Y), y ∈ Y.

This is called the Generalized Bayes Rule (GBR).

SLIDE 55

The Generalized Bayes Rule

When Y is a finite set, the GBR is also sufficient for coherence. If P̲(y) > 0, then P̲(f|y) is the unique value that satisfies the GBR. In more general cases, the GBR is necessary but not sufficient: we need moreover

P̲(f − P̲(f|Y)) ≥ 0

for any f ∈ L(X × Y).
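A computational sketch for finite spaces (my own; it assumes numpy, and it takes the joint lower prevision to be the lower envelope of a few illustrative joint mass functions): when P̲(X × {y}) > 0, the GBR equation has a unique solution, which can be found by bisection.

```python
# Solve the GBR  P( 1_{X x {y}} (f - mu) | joint ) = 0  for mu by bisection.
import numpy as np

# Illustrative joint mass functions on X x Y, rows = x, columns = y.
joints = [np.array([[0.3, 0.2], [0.1, 0.4]]),
          np.array([[0.2, 0.3], [0.3, 0.2]])]

def lower(g):                                   # joint lower prevision on L(X x Y)
    return min(float(np.sum(p * g)) for p in joints)

def gbr(f, y, tol=1e-10):
    """P(f | y): the unique mu with lower(indicator of X x {y} times (f - mu)) = 0."""
    indicator = np.zeros_like(f)
    indicator[:, y] = 1.0
    lo, hi = float(f.min()), float(f.max())
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lower(indicator * (f - mid)) >= 0:
            lo = mid
        else:
            hi = mid
    return lo

f = np.array([[1.0, 5.0], [0.0, 2.0]])
print(gbr(f, y=0))
# Sanity check: with P(X x {y}) > 0 this coincides with the lower envelope of the
# conditional (Bayes) expectations of f given y over the mass functions above.
print(min(float((p[:, 0] / p[:, 0].sum()) @ f[:, 0]) for p in joints))
```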

SLIDE 56

GBR for linear previsions

When Y is finite and the joint prevision is linear, the GBR becomes

P(h|y) = P(h IX×{y}) / P({y})   if P({y}) > 0.

P(·|y) is then a precise prevision with mass function P(x|y) = P(x, y)/P({y}). We obtain then Bayes' rule.

SLIDE 57

Natural extension

As in the unconditional case, we can study the behavioural consequences of the assessments present in a conditional and an unconditional prevision which are jointly coherent. This can be done by means of their natural extension. In some cases there are no coherent extensions. If there are, the natural extension may only be a lower bound of the smallest coherent extensions. It is the smallest coherent extension when X, Y are finite.

SLIDE 58

Regular extension

The GBR uniquely determines the value P̲(f|y) when P̲(y) > 0. When P̲(y) = 0 but P̄(y) > 0, we can consider

P1(f|y) := inf{P(f|y) : P ∈ M(P̲), P(y) > 0},

where P(f|y) is defined using Bayes' rule. This is called the regular extension of P̲.

SLIDE 59

Marginal extension

Assume we have:

  • A coherent conditional lower prevision P̲(·|Y) on L(X × Y).
  • A coherent lower prevision P̲ on L(Y) (or on the Y-constant gambles).

Then, the smallest jointly coherent extension to L(X × Y) is given by the two-stage evaluation

P̲(P̲(·|Y)).

It is called the marginal extension of P̲ and P̲(·|Y).
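A sketch of the two-stage evaluation on finite spaces (my own; numpy assumed, and all the credal sets below are illustrative): first condition on each y, then apply the marginal on Y.

```python
# Marginal extension: P(f) = P_Y( y -> P(f(., y) | y) ) for f on X x Y.
import numpy as np

# Illustrative conditional credal sets: for each y, a list of mass functions on X.
conditionals = {0: [np.array([0.7, 0.3]), np.array([0.5, 0.5])],
                1: [np.array([0.1, 0.9])]}
# Illustrative marginal credal set on Y.
marginals = [np.array([0.4, 0.6]), np.array([0.5, 0.5])]

def lower_cond(f_col, y):                    # conditional lower prevision P(f | y)
    return min(float(p @ f_col) for p in conditionals[y])

def marginal_extension(f):
    g = np.array([lower_cond(f[:, y], y) for y in sorted(conditionals)])
    return min(float(q @ g) for q in marginals)

f = np.array([[1.0, 4.0],                    # a gamble on X x Y, rows indexed by x
              [3.0, 0.0]])
print(marginal_extension(f))
```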

SLIDE 60

Some types of non-additive measures

  • 2- and n-monotone capacities.
  • Belief functions.
  • Possibility and necessity measures.

SLIDE 61

Capacities

A mapping µ : P(X) → [0, 1] is called a capacity when it satisfies the following properties:

  • 1. µ(∅) = 0, µ(X) = 1 (boundary conditions).
  • 2. A ⊆ B ⇒ µ(A) ≤ µ(B) (monotonicity).

Capacities are also called fuzzy measures or Choquet capacities of first order. When they satisfy some additional property they are called upper (or lower) probabilities.

SLIDE 62

Lower and upper probabilities

Some of the properties that are required of capacities are the following:

  • µ(A ∪ B) ≥ µ(A) + µ(B) for disjoint A, B (superadditivity).
  • µ(A ∪ B) ≤ µ(A) + µ(B) (subadditivity).
  • µ(∪n An) = supn µ(An) for any increasing sequence (lower continuity).
  • µ(∩n An) = infn µ(An) for any decreasing sequence (upper continuity).

The choice between the different properties must be made in terms of the interpretation.

SLIDE 63

2-monotone capacities

Let P̲ be a lower probability defined on a field of events A. It is called a 2-monotone capacity when

P̲(A ∪ B) + P̲(A ∩ B) ≥ P̲(A) + P̲(B)

for any A, B ∈ A. 2-monotone capacities are also called supermodular or convex. They do not have an easy behavioural interpretation, but they do possess interesting mathematical properties.

SLIDE 64

Properties

  • A 2-monotone capacity is a coherent lower probability.
  • If P is a probability and f : [0, 1] → [0, 1] is a convex mapping with f(0) = 0, the lower probability defined by P̲(X) = 1 and P̲(A) = f(P(A)) for any A ≠ X is 2-monotone.
  • A 2-monotone capacity on a field A can be extended to all subsets of X by

µ∗(A) = sup{µ(B) : B ⊆ A}.
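A small sketch (my own, not from the slides) of the convex-distortion construction above: distorting an arbitrary probability with the convex function f(x) = x² and checking 2-monotonicity by brute force on a small space.

```python
# Check 2-monotonicity of a distorted probability P(A) = f(P0(A)), f convex.
from itertools import combinations

X = ['a', 'b', 'c', 'd']
P0 = {'a': 0.1, 'b': 0.2, 'c': 0.3, 'd': 0.4}          # illustrative precise probability

def distort(x):                                        # convex, f(0) = 0, f(1) = 1
    return x * x

def events():
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def P_low(A):
    return 1.0 if set(A) == set(X) else distort(sum(P0[x] for x in A))

for A in events():
    for B in events():
        assert P_low(A | B) + P_low(A & B) >= P_low(A) + P_low(B) - 1e-12, (A, B)
print("2-monotone: check passed")
```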

SLIDE 65

Example

In a roulette game there is an unknown dependence between the first two outcomes: the first one (red/black) is completely random, and the second is either always the same as the first or always the opposite. Let Hi = "the i-th outcome is red", i = 1, 2. Since P(red) = P(black) = 0.5, we should consider

P̲(H1) = P̄(H1) = P̲(H2) = P̄(H2) = 0.5,
P̲(H1 ∩ H2) = 0, P̄(H1 ∩ H2) = 0.5,
P̲(H1 ∪ H2) = 0.5, P̄(H1 ∪ H2) = 1.

In such a case

P̲(H1 ∪ H2) + P̲(H1 ∩ H2) = 0.5 < P̲(H1) + P̲(H2) = 1:

our beliefs would not be 2-monotone.

SLIDE 66

n-monotone capacities

Let P̲ be a lower probability defined on a field of events A. It is called n-monotone when

P̲(A1 ∪ · · · ∪ An) ≥ ∑_{∅≠I⊆{1,...,n}} (−1)^{|I|+1} P̲(∩_{i∈I} Ai)

for any A1, . . . , An in A. The conjugate of an n-monotone capacity is called n-alternating, and satisfies

P̄(A1 ∩ · · · ∩ An) ≤ ∑_{∅≠I⊆{1,...,n}} (−1)^{|I|+1} P̄(∪_{i∈I} Ai)

for any family {A1, . . . , An} in A.

SLIDE 67

Properties

An n-monotone capacity is also m-monotone for any m < n. In particular, it is a coherent lower probability. In some cases additional topological conditions are required on the field A, and then n-monotone capacities must satisfy some additional continuity condition, usually:

  • continuity for decreasing sequences of compact sets, or
  • lower continuity (continuity for increasing sequences).

SLIDE 68

Example

In many betting systems the benefit is bigger when fewer people have bet on the event. If P0(A) is the proportion of bets on A and τ ∈ (0, 1) is the percentage of benefit of the system, the winning rate for A is (1 − τ)/P0(A). We obtain

P̄(A) = min{P0(A)/(1 − τ), 1},
P̲(A) = max{(P0(A) − τ)/(1 − τ), 0}.

This is called the pari-mutuel model.

SLIDE 69

Example (II)

P̲ is 2-monotone, because f(x) = max{(x − τ)/(1 − τ), 0} is convex. It is not 3-monotone: consider

X = {a, b, c}, P0(a) = 0.5, P0(b) = 0.3, P0(c) = 0.2,

and take A1 = {a, b}, A2 = {a, c}, A3 = {b, c}.
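A quick numerical verification (my own; the slide does not fix τ, so τ = 0.1 is an illustrative choice):

```python
# Pari-mutuel lower probability: 2-monotone, but 3-monotonicity fails for A1, A2, A3.
from itertools import combinations

P0 = {'a': 0.5, 'b': 0.3, 'c': 0.2}
tau = 0.1

def P_low(A):
    return max((sum(P0[x] for x in A) - tau) / (1 - tau), 0.0)

def events():
    xs = list(P0)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

# 2-monotonicity holds on every pair of events:
assert all(P_low(A | B) + P_low(A & B) >= P_low(A) + P_low(B) - 1e-12
           for A in events() for B in events())

# ... but the 3-monotonicity inequality fails for A1 = {a,b}, A2 = {a,c}, A3 = {b,c}:
A1, A2, A3 = frozenset('ab'), frozenset('ac'), frozenset('bc')
lhs = P_low(A1 | A2 | A3)
rhs = (P_low(A1) + P_low(A2) + P_low(A3)
       - P_low(A1 & A2) - P_low(A1 & A3) - P_low(A2 & A3)
       + P_low(A1 & A2 & A3))
print(lhs, "<", rhs)        # 1.0 < 1.11...: the 3-monotonicity inequality fails
```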

SLIDE 70

Belief functions

If a lower probability P̲ is n-monotone for every n, it is called ∞-monotone, or a Choquet capacity of infinite order. Hence, for any n ≥ 0 and any family {A1, . . . , An} of elements of A,

P̲(A1 ∪ · · · ∪ An) ≥ ∑_{∅≠I⊆{1,...,n}} (−1)^{|I|+1} P̲(∩_{i∈I} Ai).

When X is finite, ∞-monotone functions are also called belief functions.

SLIDE 71

The evidential interpretation

Belief functions were developed mostly by Shafer, after some works by Dempster in the 1960s. The belief in an event A, P̲(A), represents the amount of evidence supporting A. It is usually assumed that there exists a true (and unknown) state for the experiment in X. However, this does not mean that P̲ should be defined on singletons only, or that it is characterised by its restriction to them.

SLIDE 72

Example

There has been a murder and the police have two suspects, A and B. An unreliable witness claims to have seen B at the scene of the crime. We think that either (1) he really saw B or (2) he saw nothing. In the first case, the list of suspects becomes just B, and in the second it remains the same. If we assess the probabilities of (1) and (2) as P((1)) = α, P((2)) = 1 − α, we obtain the belief function P̲ given by

P̲({B}) = α, P̲({A}) = 0, P̲({A, B}) = 1.

SLIDE 73

Probabilities

A particular case of ∞-monotone functions is given by the finitely additive and σ-additive probabilities. These functions satisfy the n-monotonicity condition with equality for every n, and are at the same time ∞-monotone and ∞-alternating. This means that all the models of non-additive measures we have considered so far (coherent lower probabilities, 2- and n-monotone capacities, belief functions) contain probabilities as a particular case.

SLIDE 74

Vacuous belief functions

The case where we have no information at all corresponds to the basic probability assignment m(X) = 1, m(A) = 0 for any A ≠ X. The associated belief and plausibility functions are

Bel(A) = 0 ∀A ≠ X,  Pl(A) = 1 ∀A ≠ ∅.

These are called the vacuous belief and plausibility functions.

SLIDE 75

Possibility and necessity measures

A possibility measure on X is a mapping Π : P(X) → [0, 1] satisfying, for any family of subsets (Ai)i∈I,

Π(∪_{i∈I} Ai) = sup_{i∈I} Π(Ai).

The conjugate of a possibility measure, defined by Nec(A) = 1 − Π(Ac), is said to be a necessity measure and satisfies

Nec(∩_{i∈I} Ai) = inf_{i∈I} Nec(Ai)

for any family of subsets (Ai)i∈I.

SLIDE 76

Properties

A possibility measure is ∞-alternating, and a necessity measure is ∞-monotone. A possibility measure is characterised by its possibility distribution π : X → [0, 1], given by π(ω) = Π({ω}). It holds that

Π(A) = sup_{ω∈A} π(ω) ∀A ⊆ X.

A possibility measure is lower continuous, and a necessity measure is upper continuous.
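A small sketch (my own; the distribution below is an arbitrary example on a finite space): recovering Π from its possibility distribution by taking suprema, and the conjugate necessity measure by complementation.

```python
# Possibility and necessity measures from a possibility distribution.
pi = {'sunny': 1.0, 'cloudy': 0.7, 'rainy': 0.3}    # illustrative possibility distribution

def Pi(A):
    return max((pi[w] for w in A), default=0.0)     # Pi(A) = sup of pi over A

def Nec(A):
    return 1.0 - Pi(set(pi) - set(A))               # Nec(A) = 1 - Pi(complement of A)

print(Pi({'cloudy', 'rainy'}))                      # 0.7
print(Nec({'sunny'}))                               # 1 - 0.7 = 0.3
print(Pi({'sunny', 'rainy'}), max(Pi({'sunny'}), Pi({'rainy'})))   # maxitivity
```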

SLIDE 77

Maxitive and minitive measures

A slightly more general, but less used, alternative to possibility measures are the maxitive measures: those upper probabilities satisfying

P̄(A ∪ B) = max{P̄(A), P̄(B)}

for any A, B ⊆ X. Their conjugates are the minitive measures, which satisfy

P̲(A ∩ B) = min{P̲(A), P̲(B)}

for any A, B ⊆ X.

SLIDE 78

Properties

A maxitive measure is ∞-alternating, and a minitive measure is ∞-monotone. A maxitive measure need not be lower continuous. Given P̄ : P(X) → [0, 1],

P̄ is a possibility measure ⇐⇒ P̄ is maxitive and condensable,

where condensability means continuity for increasing nets.

SLIDE 79

Lower probabilities on finite sets

When the space X is finite, we can give an alternative representation of 2-, n- and ∞-monotone functions via the basic probability assignment. A function m : P(X) → [0, 1] is a basic probability assignment when it satisfies the following properties:

  • 1. m(∅) = 0.
  • 2. ∑_{A⊆X} m(A) = 1.

SLIDE 80

Relationship with belief functions

Given a basic probability assignment m, the function P̲ : P(X) → [0, 1] given by

P̲(A) = ∑_{B⊆A} m(B)

is a belief function. Conversely, given a belief function P̲, the function m : P(X) → [0, 1] given by

m(A) = ∑_{B⊆A} (−1)^{|A\B|} P̲(B)

is a basic probability assignment. This is a 1-to-1 correspondence.
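A small sketch (my own, reusing the earlier murder example with the illustrative value α = 0.6): computing the belief function from m, and recovering m back via the Möbius inversion formula given on the next slide.

```python
# Belief function from a basic probability assignment, and its Moebius inverse.
from itertools import combinations

X = ['A', 'B']
alpha = 0.6
m = {frozenset({'B'}): alpha, frozenset({'A', 'B'}): 1 - alpha}

def events():
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def bel(A):
    return sum(v for B_, v in m.items() if B_ <= A)         # sum of masses of subsets of A

def mobius(A):                                              # recover m from bel
    return sum((-1) ** len(A - B_) * bel(B_) for B_ in events() if B_ <= A)

print({tuple(sorted(A)): bel(A) for A in events()})
print(all(abs(mobius(A) - m.get(A, 0.0)) < 1e-12 for A in events()))   # True
```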

SLIDE 81

m is also called the Möbius inverse of the belief function P̲. The concept generalises to 2-monotone capacities: the function m : P(X) → R given by

m(A) = ∑_{B⊆A} (−1)^{|A\B|} P̲(B)

is the Möbius inverse of P̲, and P̲(A) = ∑_{B⊆A} m(B). Note that m need not take only non-negative values; indeed,

P̲ is a belief function ⇐⇒ m is non-negative.

The events A such that m(A) > 0 are called the focal elements of m.

SLIDE 82

Relationship with upper probabilities

Given a belief function Bel : P(X) → [0, 1] and its conjugate plausibility measure Pl : P(X) → [0, 1], given by

Pl(A) = 1 − Bel(Ac),

both Pl and Bel can be obtained from the same basic probability assignment; in the case of Pl, through the formula

Pl(A) = ∑_{B∩A≠∅} m(B).

SLIDE 83

Probabilities and Möbius functions

Let P̲ : P(X) → [0, 1] be a belief function, and let P̄ be its conjugate plausibility function. The following are equivalent:

  • 1. P̲ is a probability measure.
  • 2. The focal elements of P̲ are singletons.
  • 3. P̲ = P̄.
  • 4. P̲(A) + P̲(Ac) = 1 for any A ⊆ X.

In this finite case, there is no distinction between finitely additive and σ-additive measures.

SLIDE 84

Maxitive and possibility measures

In the finite case, the following are equivalent:

  • P is a possibility measure.
  • P is maxitive.
  • The focal sets of P are nested.

Shafer calls necessity measures in the finite case consonant belief functions. Under the evidential interpretation, all the evidence points in the same direction, and it is heterogeneous only in the sense that the precision varies.

SLIDE 85

Some references

  • P. Walley, Statistical reasoning with imprecise probabilities, Chapman and Hall, 1991.
  • G. Shafer, A mathematical theory of evidence, Princeton University Press, 1976.
  • D. Denneberg, Non-additive measure and integral, Kluwer, 1994.
  • R. Kruse and H. Meyer, Statistics with vague data, Kluwer, 1987.
  • M. Grabisch, H.T. Nguyen and E.A. Walker, Fundamentals of uncertainty calculi with applications to fuzzy inference, Kluwer, 1995.
