Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions




slide-1
SLIDE 1

Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions

Prashanth L A∗, Krishna Jagannathan∗, Ravi Kolla†

Paper #2156

∗ IIT Madras, Chennai, India
† AB InBev

slide-6
SLIDE 6

One-Slide Summary

Objective: Estimate the Conditional Value-at-Risk (CVaR) $c_\alpha(X)$ of a r.v. $X$ from $n$ i.i.d. samples

Our Contributions: Concentration bounds on $P[|\hat c_{n,\alpha} - c_\alpha(X)| > \epsilon]$

  1) Sub-Gaussian distributions: Our bounds match an existing result, but with better constants.
  2) Light-tailed distributions: We derive an $O(\exp(-cn \min(\epsilon, \epsilon^2)))$ tail bound for an empirical CVaR estimator.
  3) Heavy-tailed distributions with bounded variance: We derive an $O(\exp(-cn\epsilon^2))$ tail bound for a truncated CVaR estimator.
  4) Bandit application: Best CVaR arm identification and error bounds.

slide-7
SLIDE 7

What is Conditional Value-at-Risk (CVaR)?

slide-10
SLIDE 10

VaR and CVaR are Risk Metrics

  • Widely used in financial portfolio optimization, credit risk assessment and insurance
  • Let X be a continuous random variable
  • Fix a `risk level' α ∈ (0, 1) (say α = 0.95)

Value at Risk:
\[ v_\alpha(X) = F_X^{-1}(\alpha) \]

Conditional Value at Risk:
\[ c_\alpha(X) = E[X \mid X > v_\alpha(X)] = v_\alpha(X) + \frac{1}{1-\alpha} E[X - v_\alpha(X)]^+ \]
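As a worked illustration of the two definitions (the exponential example is an assumption chosen for convenience, not taken from the slides), let X be exponential with unit rate, so that $F_X(x) = 1 - e^{-x}$ for $x \ge 0$:

```latex
\begin{align*}
v_\alpha(X) &= F_X^{-1}(\alpha) = -\ln(1-\alpha), \\
E[X - v_\alpha(X)]^+ &= \int_{v_\alpha(X)}^{\infty} (x - v_\alpha(X))\, e^{-x}\, dx = e^{-v_\alpha(X)} = 1 - \alpha, \\
c_\alpha(X) &= v_\alpha(X) + \frac{1-\alpha}{1-\alpha} = 1 - \ln(1-\alpha).
\end{align*}
```

For α = 0.95 this gives a VaR of about 2.996 and a CVaR of about 3.996. The conditional-expectation form agrees: by memorylessness of the exponential, E[X | X > v] = v + 1.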

slide-11
SLIDE 11

CVaR Estimation and Concentration bounds

slide-12
SLIDE 12

CVaR estimation

Problem: Given i.i.d. samples $X_1, \ldots, X_n$ from the distribution $F$ of r.v. $X$, estimate
\[ c_\alpha(X) = E[X \mid X > v_\alpha(X)]. \]

Nice to have: Sample complexity $O(1/\epsilon^2)$ for accuracy $\epsilon$

slide-14
SLIDE 14

Empirical VaR and CVaR Estimates

Empirical distribution function (EDF): Given samples $X_1, \ldots, X_n$ from distribution $F$,
\[ \hat F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I\{X_i \le x\}, \quad x \in \mathbb{R}. \]

Using the EDF and the order statistics $X_{[1]} \le X_{[2]} \le \ldots \le X_{[n]}$:

VaR estimate:
\[ \hat v_{n,\alpha} = \inf\{x : \hat F_n(x) \ge \alpha\} = X_{[\lceil n\alpha \rceil]}. \]

CVaR estimate:
\[ \hat c_{n,\alpha} = \hat v_{n,\alpha} + \frac{1}{n(1-\alpha)} \sum_{i=1}^{n} (X_i - \hat v_{n,\alpha})^+. \]
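The two estimators above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code; the function name `empirical_var_cvar` is hypothetical.

```python
import math

def empirical_var_cvar(samples, alpha):
    """Return (VaR estimate, CVaR estimate) at risk level alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # VaR estimate: the ceil(n * alpha)-th order statistic (1-indexed).
    var_hat = ordered[math.ceil(n * alpha) - 1]
    # CVaR estimate: VaR plus the scaled sum of excesses over the VaR estimate.
    excess = sum(max(x - var_hat, 0.0) for x in samples)
    cvar_hat = var_hat + excess / (n * (1.0 - alpha))
    return var_hat, cvar_hat
```

For example, `empirical_var_cvar([4.0, 1.0, 3.0, 2.0], 0.5)` returns `(2.0, 3.5)`: the second order statistic is 2, and the excesses 1 and 2 are divided by n(1 − α) = 2.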

slide-15
SLIDE 15

Empirical CVaR concentration: What is known?

Goal: Bound $P[|\hat c_{n,\alpha} - c_\alpha(X)| > \epsilon]$

Distribution type              Reference                          Salient feature
Bounded support                [Wang et al. ORL 2010]             $\exp(-cn\epsilon^2)$
Sub-Gaussian/sub-exponential   [Kolla et al. ORL 2019]            VaR conc.; one-sided CVaR
Sub-Gaussian                   [S. Bhat & P. L.A. NeurIPS 2019]   Wasserstein
Sub-exponential/heavy-tailed   This work

slide-17
SLIDE 17

VaR Concentration¹

Assumption (A1): X is a continuous r.v. with a CDF F that satisfies a condition of sufficient growth around the VaR $v_\alpha$: there exist constants $\delta, \eta > 0$ such that
\[ \min\left( F(v_\alpha + \delta) - F(v_\alpha),\; F(v_\alpha) - F(v_\alpha - \delta) \right) \ge \eta\delta. \]

Lemma (VaR concentration)

Suppose that (A1) holds. Then, for all $\epsilon \in (0, \delta)$,
\[ P[|\hat v_{n,\alpha} - v_\alpha| \ge \epsilon] \le 2 \exp\left(-2n\eta^2\epsilon^2\right). \]
The proof uses the DKW inequality; no tail assumptions are required.

¹ R. Kolla, L.A. Prashanth, S. P. Bhat, K. Jagannathan; Concentration bounds for empirical conditional value-at-risk: The unbounded case; Operations Research Letters, 2019
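To see what the lemma implies in practice, the bound can be inverted into a sample-size rule: 2 exp(−2nη²ε²) ≤ p holds once n ≥ ln(2/p) / (2η²ε²). The sketch below is only an illustration; the function names and the failure probability `fail_prob` are hypothetical, and the rule applies only for ε in (0, δ) as the lemma requires.

```python
import math

def var_tail_bound(n, eps, eta):
    """Right-hand side of the VaR concentration lemma, capped at 1."""
    return min(1.0, 2.0 * math.exp(-2.0 * n * eta**2 * eps**2))

def var_sample_size(eps, eta, fail_prob):
    """Smallest n for which the lemma's bound is at most fail_prob."""
    return math.ceil(math.log(2.0 / fail_prob) / (2.0 * eta**2 * eps**2))
```

With η = 1, ε = 0.1 and a 5% failure probability, `var_sample_size(0.1, 1.0, 0.05)` gives 185 samples; halving ε roughly quadruples that number, which is the O(1/ε²) sample complexity mentioned earlier.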
slide-18
SLIDE 18

Concentration for CVaRα Estimator

  • Obtaining concentration for the CVaRα estimator is more involved than for VaRα
  • Need to make some assumptions on the tail distribution
  • We work with three progressively broader distribution classes:
      (i) X is sub-Gaussian, or
      (ii) X is sub-exponential (i.e., light-tailed), or
      (iii) X has a bounded second moment
  • For (i) and (ii), we use the empirical CVaR estimator; for (iii) we use a truncated CVaR estimator

slide-22
SLIDE 22

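Sub-Gaussian and Sub-Exponential distributions

A random variable X is sub-Gaussian if $\exists\, \sigma > 0$ s.t.
\[ E[e^{\lambda X}] \le e^{\sigma^2\lambda^2/2}, \quad \forall \lambda \in \mathbb{R}. \]
Or equivalently, letting $Z \sim N(0, \sigma^2)$,
\[ P[X > \epsilon] \le c\, P[Z > \epsilon], \quad \forall \epsilon > 0. \]
Tail dominated by a Gaussian.

A random variable X is sub-exponential if $\exists\, c_0 > 0$ s.t.
\[ E[e^{\lambda X}] < \infty, \quad \forall |\lambda| < c_0. \]
Or equivalently, $\exists\, \sigma, b > 0$ s.t.
\[ E[e^{\lambda X}] \le e^{\sigma^2\lambda^2/2}, \quad \forall |\lambda| < \frac{1}{b}. \]
Or
\[ P[X > \epsilon] \le c_1 \exp(-c_2 \epsilon), \quad \forall \epsilon > 0. \]
Tail dominated by an exponential r.v.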

slide-24
SLIDE 24

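CVaR concentration for Sub-Gaussian case

Recall
\[ \hat v_{n,\alpha} = \inf\{x : \hat F_n(x) \ge \alpha\} = X_{[\lceil n\alpha \rceil]}, \qquad \hat c_{n,\alpha} = \hat v_{n,\alpha} + \frac{1}{n(1-\alpha)} \sum_{i=1}^{n} (X_i - \hat v_{n,\alpha})^+. \]

Theorem (CVaR concentration for sub-Gaussian)

Assume (A1). Suppose that $X_i$, $i = 1, \ldots, n$, are $\sigma$-sub-Gaussian. Then, for any $\epsilon \in (0, \delta)$, we have
\[ P[|\hat c_{n,\alpha} - c_\alpha| > \epsilon] \le 6 \exp[-n\psi_1(\epsilon)], \]
where
\[ \psi_1(\epsilon) = \frac{\epsilon^2 (1-\alpha)^2 \min(\eta^2, 1)}{8 \max(\sigma^2, 8)}. \]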
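A small sketch of how the sub-Gaussian bound could be evaluated numerically. It reads ψ1 as the fraction ε²(1 − α)² min(η², 1) / (8 max(σ², 8)), which is an interpretation of the slide layout; the function names and `fail_prob` are hypothetical.

```python
import math

def psi1(eps, alpha, eta, sigma):
    """Exponent rate in the sub-Gaussian CVaR bound (fraction reading assumed)."""
    numerator = eps**2 * (1.0 - alpha)**2 * min(eta**2, 1.0)
    return numerator / (8.0 * max(sigma**2, 8.0))

def cvar_subgaussian_bound(n, eps, alpha, eta, sigma):
    """Evaluate 6 * exp(-n * psi1(eps)), capped at 1."""
    return min(1.0, 6.0 * math.exp(-n * psi1(eps, alpha, eta, sigma)))

def cvar_subgaussian_sample_size(eps, alpha, eta, sigma, fail_prob):
    """Smallest n making the bound at most fail_prob: n >= ln(6/p) / psi1."""
    return math.ceil(math.log(6.0 / fail_prob) / psi1(eps, alpha, eta, sigma))
```

Because ψ1 scales with (1 − α)², the required sample size grows quickly as the risk level α approaches 1: moving from α = 0.9 to α = 0.95 multiplies it by about four.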

slide-25
SLIDE 25

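CVaR concentration for Sub-Exponential case

Recall
\[ \hat v_{n,\alpha} = \inf\{x : \hat F_n(x) \ge \alpha\} = X_{[\lceil n\alpha \rceil]}, \qquad \hat c_{n,\alpha} = \hat v_{n,\alpha} + \frac{1}{n(1-\alpha)} \sum_{i=1}^{n} (X_i - \hat v_{n,\alpha})^+. \]

Theorem (CVaR concentration for sub-exponential)

Assume (A1). Suppose that $X_i$, $i = 1, \ldots, n$, are sub-exponential with parameters $\sigma, b$. Then, for all $\epsilon \in (0, \delta)$, we have
\[ P[|\hat c_{n,\alpha} - c_\alpha| > \epsilon] \le 6 \exp[-n\psi_2(\epsilon)], \]
where
\[ \psi_2(\epsilon) = \min\left( \frac{\epsilon^2 (1-\alpha)^2 \min(\eta^2, 1)}{8 \max(\sigma^2, 8)},\; \frac{\epsilon(1-\alpha)}{8b} \right). \]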
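The min(ε, ε²) behaviour of the light-tailed bound comes from the two branches of ψ2. The sketch below evaluates both branches under the fraction reading of the slide (an interpretation, as the extracted layout is flat); the function name is hypothetical.

```python
def psi2(eps, alpha, eta, sigma, b):
    """Exponent rate in the sub-exponential CVaR bound (fraction reading assumed)."""
    # Quadratic branch: governs small deviations, same form as the sub-Gaussian rate.
    quadratic = (eps**2 * (1.0 - alpha)**2 * min(eta**2, 1.0)) / (8.0 * max(sigma**2, 8.0))
    # Linear branch: governs large deviations, reflecting the exponential tail.
    linear = eps * (1.0 - alpha) / (8.0 * b)
    return min(quadratic, linear)
```

With α = 0.5 and η = σ = b = 1, the quadratic branch is ε²/256 and the linear branch is ε/16, so the quadratic branch is the active one for ε below 16 and the linear branch takes over beyond that. Since the theorem only covers ε in (0, δ), the linear branch matters only when δ is large enough.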

slide-26
SLIDE 26

Handling Heavy-Tailed distributions

  • Heavy-tailed distributions occur commonly in finance applications
  • Tail of distribution decays slower than any exponential; characterised by atypically large sample values
  • Empirical estimates may be `thrown off' due to atypically large values occurring early in the aggregating process
  • Raw empirical estimates do not concentrate well around the true value
  • Key Idea: Truncated estimator!
  • Truncate large values, while slowly growing the truncation threshold²

² Bubeck et al., Bandits with Heavy Tail, IEEE Trans. Inf. Theory, 2013.

slide-29
SLIDE 29

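The Bounded (Second) Moment case: Truncated CVaR estimator

Assume
(A2) $\exists\, u$ such that $E[X^2] < u < \infty$.

Truncated CVaR Estimator
\[ \hat c_{n,\alpha} = \frac{1}{n(1-\alpha)} \sum_{i=1}^{n} X_i\, I\{\hat v_{n,\alpha} \le X_i \le B_i\}, \quad \text{where } B_i \propto \sqrt{u\,i}. \]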
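A minimal sketch of the truncated estimator, under two assumptions that the slide does not pin down: the threshold is taken as B_i = scale · sqrt(u · i) with a hypothetical proportionality constant `scale`, and the position of a sample in the input list is treated as its index i.

```python
import math

def truncated_cvar(samples, alpha, u, scale=1.0):
    """Truncated CVaR estimate; B_i = scale * sqrt(u * i) is an assumed form."""
    n = len(samples)
    # Empirical VaR: the ceil(n * alpha)-th order statistic (1-indexed).
    var_hat = sorted(samples)[math.ceil(n * alpha) - 1]
    total = 0.0
    for i, x in enumerate(samples, start=1):
        threshold = scale * math.sqrt(u * i)  # B_i grows slowly with i
        # Keep only samples between the VaR estimate and the truncation level.
        if var_hat <= x <= threshold:
            total += x
    return total / (n * (1.0 - alpha))
```

For instance, `truncated_cvar([1.0, 2.0, 3.0, 100.0], 0.5, 4.0)` returns 2.5: the outlier 100 exceeds its threshold B_4 = 4 and is dropped, whereas a very large `u` (say 1e6) keeps it and yields 52.5.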

slide-30
SLIDE 30

CVaR Concentration for the Bounded Moment case

Theorem (CVaR concentration: Bounded second moment case)

Let $\{X_i\}_{i=1}^{n}$ be a sequence of i.i.d. r.v.s satisfying (A1) and (A2). Let $\hat c_{n,\alpha}$ be the truncated CVaR estimate formed using the above set of samples. For all $\epsilon > 0$,
\[ P[|\hat c_{n,\alpha} - c_\alpha| > \epsilon] \le 2 \exp\left( -\frac{n(1-\alpha)^2\epsilon^2}{144\,(\sqrt{u} + v_\alpha)^2} \right) + 4 \exp\left( -\frac{n\eta^2(1-\alpha)^2 \min(\epsilon^2, \delta^2)}{144} \right), \]
where $\eta$ and $\delta$ are as defined in (A1).

slide-31
SLIDE 31

Bandit application

slide-33
SLIDE 33

CVaR-aware bandits: Model

Known: # of arms K and horizon n
Unknown: Distributions $F_k$, $k = 1, \ldots, K$, with CVaR-values (at fixed risk level α) $c_1, c_2, \ldots, c_K$
Interaction: In each round $t = 1, \ldots, n$
  • pull arm $I_t \in \{1, \ldots, K\}$
  • observe a sample loss from $F_{I_t}$
Recommendation: Arm $J_n$
Benchmark: $k^* = \arg\min_{k=1,\ldots,K} c_k$
Goal: Minimize probability of erroneous recommendation $P[J_n \ne k^*]$

slide-34
SLIDE 34

The Successive Rejects Algorithm³

Initialization: $A_1 = \{1, \ldots, K\}$, $n_k = \left\lceil \frac{1}{\log(K)} \frac{n - K}{K + 1 - k} \right\rceil$
Phase 1: Play each arm $j \in A_1$, $n_1$ times; set $A_2 = A_1 \setminus \arg\max_{j \in A_1} \hat c_{j,n_1}$
Phase 2: Play each arm $j \in A_2$, $n_2 - n_1$ times; eliminate ...
...
Phase K − 1: Play the remaining two arms $n_{K-1} - n_{K-2}$ times

  • One arm played $n_1$ times, ..., another played $n_{K-2}$ times
  • Two arms played $n_{K-1}$ times
  • $n_1 + \ldots + n_{K-1} + n_{K-1} \le n$
  • $n_k$ increases with k
  • Adaptive exploration: better than uniform (i.e., play each arm n/K times)

³ Audibert et al., Best Arm Identification in Multi-armed Bandits, COLT 2010
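The phases above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the names are hypothetical, arms are modelled as zero-argument callables returning one loss sample, and the log(K) in n_k is read as the normalizer 1/2 + Σ_{i=2}^{K} 1/i used in the cited Successive Rejects paper, which is the choice that keeps the total number of pulls within the budget.

```python
import math

def empirical_cvar(samples, alpha):
    """Empirical CVaR: empirical VaR plus the scaled sum of excesses over it."""
    n = len(samples)
    var_hat = sorted(samples)[math.ceil(n * alpha) - 1]
    excess = sum(max(x - var_hat, 0.0) for x in samples)
    return var_hat + excess / (n * (1.0 - alpha))

def successive_rejects_cvar(arms, budget, alpha):
    """Return the index of the arm recommended after K - 1 elimination phases."""
    K = len(arms)
    if budget <= K:
        raise ValueError("budget must exceed the number of arms")
    # Assumed normalizer: 1/2 + sum_{i=2}^{K} 1/i (see lead-in).
    log_bar = 0.5 + sum(1.0 / i for i in range(2, K + 1))
    # n_k[k] = cumulative pulls of each surviving arm by the end of phase k.
    n_k = [0] + [math.ceil((budget - K) / (log_bar * (K + 1 - k)))
                 for k in range(1, K)]
    active = list(range(K))
    losses = {j: [] for j in active}
    for k in range(1, K):
        for j in active:
            losses[j].extend(arms[j]() for _ in range(n_k[k] - n_k[k - 1]))
        # Reject the surviving arm with the largest estimated CVaR of the loss.
        worst = max(active, key=lambda j: empirical_cvar(losses[j], alpha))
        active.remove(worst)
    return active[0]
```

With three deterministic arms returning losses 1.0, 5.0 and 3.0, a budget of 60 and α = 0.9, the sketch pulls each arm 15 times, rejects the arm with loss 5.0, pulls the two survivors 7 more times each, and recommends arm 0.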

slide-35
SLIDE 35

Probability of error for Successive Rejects

  • Suppose the arm distributions are all 1-sub-exponential.
  • Given a simulation budget n, the probability that the SR algorithm identifies a suboptimal arm as being optimal can be bounded as
\[ P[J_n \ne k^*] \le 3K(K-1) \exp\left( -\frac{(n-K)(1-\alpha)^2 \beta}{H_2 \log(K)} \right), \]
    where β is a problem dependent constant (indep. of the gaps), and
\[ H_2 = \max_{k=1,2,\ldots,K} \frac{k}{\min(\Delta_k, \Delta_k^2, \delta_k^2)}, \]
    where $\delta_k$ is the constant from (A1) for arm k's distribution

slide-36
SLIDE 36

Concluding Remarks

  • Derived a concentration bound for empirical CVaRα estimator for sub-Gaussian and sub-exponential r.v.s
  • A truncated CVaR estimator to handle heavy-tailed distributions
  • Showed a bandit application for best CVaRα arm identification, and derived probability of error for SR algorithm

slide-37
SLIDE 37

Thank you!