Computational Information Games: A minitutorial, Part I (Houman Owhadi)




slide-1
SLIDE 1

Houman Owhadi Computational Information Games A minitutorial Part I

ICERM June 5, 2017

DARPA EQUiPS / AFOSR award no FA9550-16-1-0054 (Computational Information Games)

slide-2
SLIDE 2

Probabilistic Numerical Methods

http://probabilistic-numerics.org/ http://oates.work/samsi

Statistical Inference approaches to numerical approximation and algorithm design

slide-3
SLIDE 3

3 approaches to Numerical Approximation

3 approaches to inference and to dealing with uncertainty

slide-4
SLIDE 4

Game theory

John Von Neumann, John Nash

  • J. Von Neumann. Zur Theorie der Gesellschaftsspiele. Math. Ann., 100(1):295–320, 1928.
  • J. Von Neumann and O. Morgenstern. Theory of Games and Economic Behavior. Princeton University Press, Princeton, New Jersey, 1944.
  • J. Nash. Non-cooperative games. Ann. of Math., 54(2), 1951.
slide-5
SLIDE 5

Player I Player II

 3  -2
-2   1

Deterministic zero sum game

Player I’s payoff

How should I & II play the (repeated) game?

slide-6
SLIDE 6

Player I Player II

 3  -2
-2   1

II should play blue and lose 1 in the worst case

Worst case approach

slide-7
SLIDE 7

Player I Player II

 3  -2
-2   1

Worst case approach

I should play red and lose 2 in the worst case

slide-8
SLIDE 8

Player I Player II

 3  -2
-2   1

No saddle point
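
The worst case reasoning of the last three slides can be checked directly. This is a minimal sketch, assuming the payoff matrix has rows (3, −2) and (−2, 1), holds Player I's payoff, and lets Player I pick the row and Player II the column (the matrix is symmetric, so this convention does not change the result). The names `payoff`, `maximin` and `minimax` are illustrative.

```python
# Player I's payoff; rows = Player I's choice, columns = Player II's choice (assumed layout).
payoff = [[3, -2],
          [-2, 1]]

# Player I looks at the worst (smallest) payoff in each row and keeps the best row.
maximin = max(min(row) for row in payoff)
# Player II looks at the worst (largest) payoff in each column and keeps the best column.
minimax = min(max(col) for col in zip(*payoff))

# A saddle point in pure strategies exists only if the two values agree.
has_pure_saddle = (maximin == minimax)
print(maximin, minimax, has_pure_saddle)  # -2 1 False
```

Player I can guarantee no better than −2 and Player II no better than 1, so the two worst case values do not meet and no pure strategy is stable.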

slide-9
SLIDE 9

1/2 -1/2

Player II

Average case (Bayesian) approach

 3  -2
-2   1
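
The values 1/2 and −1/2 shown on this slide are consistent with Player I averaging each of his rows against a uniform prior on Player II's move. The sketch below makes that reading explicit; the uniform prior and the row/column layout are assumptions, not something the extracted slide states.

```python
from fractions import Fraction

# Player I's payoff; rows = Player I's choice, columns = Player II's choice (assumed layout).
payoff = [[3, -2],
          [-2, 1]]

# Assumed prior: Player II plays each column with probability 1/2.
prior = [Fraction(1, 2), Fraction(1, 2)]

# Expected payoff of each of Player I's rows under that prior.
average = [sum(p * a for p, a in zip(prior, row)) for row in payoff]

# Index of the row with the best average payoff.
best_row = max(range(len(payoff)), key=lambda i: average[i])
print(average, best_row)
```

Under this prior the first row looks best on average, but the answer depends entirely on the prior, which Player II is free to contradict.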
slide-10
SLIDE 10

Mixed strategy (repeated game) solution

 3  -2
-2   1

II should play red with probability 3/8 and win 1/8 on average

Player II

slide-11
SLIDE 11

Mixed strategy (repeated game) solution

 3  -2
-2   1

I should play red with probability 3/8 and lose 1/8 on average

Player I

slide-12
SLIDE 12

Game theory

John Von Neumann

Player I Player II

 3  -2
-2   1

Mixing probabilities: q, 1 − q and p, 1 − p

Optimal strategies are mixed strategies

Optimal way to play is at random

Saddle point
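
The mixed strategy saddle point of a 2 × 2 zero sum game follows from an indifference argument: each player mixes so that the opponent gains nothing by switching. The sketch below is a minimal illustration, assuming Player I's payoff matrix is [[a, b], [c, d]] with no saddle point in pure strategies; `solve_2x2` is a hypothetical helper name.

```python
from fractions import Fraction

def solve_2x2(a, b, c, d):
    """Mixed equilibrium of the zero sum game with Player I payoff [[a, b], [c, d]].

    Assumes there is no saddle point in pure strategies, so both players mix.
    Returns (p, q, value): p = P(Player I plays row 1), q = P(Player II plays column 1).
    """
    denom = Fraction(a + d - b - c)
    p = (d - c) / denom              # makes Player II indifferent between her columns
    q = (d - b) / denom              # makes Player I indifferent between his rows
    value = (a * d - b * c) / denom  # Player I's average payoff at equilibrium
    return p, q, value

p, q, value = solve_2x2(3, -2, -2, 1)
print(p, q, value)  # 3/8 3/8 -1/8
```

With the matrix of the previous slides, both players play their first option with probability 3/8 and Player I loses 1/8 per game on average.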

slide-13
SLIDE 13

The optimal mixed strategy is determined by the loss matrix

Player I

 5  -2
-2   1

p

1 − p

Player II

II should play red with probability 3/10 and lose 1/10 on average
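
As a worked check, assume the matrix on this slide has rows (5, −2) and (−2, 1), holds Player I's payoff, and that p is the probability with which II plays red (the first column). Player II's optimal p makes Player I indifferent between his two rows:

```latex
\begin{aligned}
5p - 2(1-p) &= -2p + (1-p) \\
10p &= 3 \quad\Longrightarrow\quad p = \tfrac{3}{10}, \\
\text{value} &= 5\cdot\tfrac{3}{10} - 2\cdot\tfrac{7}{10} = \tfrac{1}{10}.
\end{aligned}
```

The value is Player I's average payoff, so under these assumptions changing a single entry of the matrix moves the optimal probability from 3/8 to 3/10 and turns the game in Player I's favour by 1/10 per round.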

slide-14
SLIDE 14

Pioneering work

“These concepts and techniques have attracted little attention among numerical analysts” (Larkin, 1972)

Bayesian/probabilistic approach not new but appears to have remained overlooked

slide-15
SLIDE 15

Bayesian Numerical Analysis

  • P. Diaconis
  • A. O’Hagan
  • J. E. H. Shaw
slide-16
SLIDE 16

Information based complexity

  • H. Wozniakowski
  • G. W. Wasilkowski
  • J. F. Traub
  • E. Novak
slide-17
SLIDE 17

Compute

Numerical Analysis Approach

  • P. Diaconis
slide-18
SLIDE 18

Compute

Bayesian Approach

slide-19
SLIDE 19

E.g.

slide-20
SLIDE 20

E.g.

slide-21
SLIDE 21

− div(a∇u) = g, x ∈ Ω,
u = 0, x ∈ ∂Ω.    (1)

Ω ⊂ R^d, ∂Ω is piecewise Lipschitz

a is uniformly elliptic, a_{i,j} ∈ L^∞(Ω)

Approximate the solution space of (1) with a finite dimensional space

Q

slide-22
SLIDE 22

Numerical Homogenization Approach

HMM Harmonic Coordinates Babuska, Caloz, Osborn, 1994

Allaire Brizzi 2005; Owhadi, Zhang 2005

Engquist, E, Abdulle, Runborg, Schwab, et al. 2003-...

MsFEM

[Hou, Wu: 1997]; [Efendiev, Hou, Wu: 1999] Nolen, Papanicolaou, Pironneau, 2008

Flux norm Berlyand, Owhadi 2010; Symes 2012

Kozlov, 1979 [Fish - Wagiman, 1993]

Projection based method Variational Multiscale Method, Orthogonal decomposition

Work hard to find good basis functions

Harmonic continuation

slide-23
SLIDE 23

Bayesian Approach

Put a prior on g

Compute E[u(x) | finite number of observations]

Proposition

− div(a∇u) = g, x ∈ Ω, u = 0, x ∈ ∂Ω,

slide-24
SLIDE 24

Replace g by ξ

ξ: White noise

Gaussian field with covariance function Λ(x, y) = δ(x − y)

∀f ∈ L^2(Ω), ∫_Ω f(x)ξ(x) dx is N(0, ‖f‖²_{L^2(Ω)})
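
The defining property of white noise can be illustrated on a grid. This is a rough sketch under stated assumptions: Ω = (0, 1), a uniform grid with n cells, and white noise replaced by independent N(0, 1/h) values per cell so that a Riemann sum of f·ξ has variance close to the squared L^2 norm of f. The test function, grid size and sample count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100                        # number of grid cells on (0, 1) (illustrative choice)
h = 1.0 / n
x = (np.arange(n) + 0.5) * h   # cell midpoints
f = np.sin(2 * np.pi * x)      # test function; its squared L2 norm on (0, 1) is 1/2

# Discrete white noise: independent N(0, 1/h) values, one per cell,
# so that the covariance mimics delta(x - y).
samples = 20000
xi = rng.normal(0.0, 1.0 / np.sqrt(h), size=(samples, n))

# Riemann sum for the integral of f * xi over (0, 1), one value per sample.
integrals = (h * xi) @ f

# Discrete version of the squared L2 norm of f.
exact_variance = h * np.sum(f ** 2)
print(integrals.mean(), integrals.var(), exact_variance)
```

The empirical mean should be close to 0 and the empirical variance close to `exact_variance`, in line with the N(0, ‖f‖²) statement above.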

Bayesian approach

slide-25
SLIDE 25

Theorem

Let x_1, . . . , x_N ∈ Ω

a = Id: [Harder-Desmarais, 1972], [Duchon 1976, 1977, 1978]

a_{i,j} ∈ L^∞(Ω): [Owhadi-Zhang-Berlyand 2013]

slide-26
SLIDE 26

Theorem

Standard deviation of the statistical error bounds/controls the worst case error

slide-27
SLIDE 27

Summary

  • The Bayesian approach leads to old and new quadrature rules.
  • Statistical errors seem to imply/control deterministic worst case errors.

Questions

  • Why does it work?
  • How far can we push it?
  • What are its limitations?
  • How can we make sense of the process of randomizing a known function?

slide-28
SLIDE 28

L

u g

u and g live in infinite dimensional spaces

Direct computation is not possible

Given g find u

Given u find g

Direct Problem Inverse Problem

slide-29
SLIDE 29

Numerical implementation requires computation with partial information.

u ∈ B_1,  u_m ∈ R^m,  g_m ∈ R^m

φ_1, . . . , φ_m ∈ B_1^*

u_m = ([φ_1, u], . . . , [φ_m, u])

[Diagram: L relates u and g; the reduced operator relates u_m and g_m. Labels: Inverse Problem, Reduced operator, Missing information]

slide-30
SLIDE 30

Fast Solvers

Multigrid Methods

Multigrid: [Fedorenko, 1961, Brandt, 1973, Hackbusch, 1978]

Multiresolution/Wavelet based methods

[Brewster and Beylkin, 1995, Beylkin and Coult, 1998, Averbuch et al., 1998]

Robust/Algebraic multigrid

[Mandel et al., 1999, Wan-Chan-Smith, 1999, Xu and Zikatanov, 2004, Xu and Zhu, 2008], [Ruge-Stüben, 1987] [Panayot - 2010]

Stabilized Hierarchical bases, Multilevel preconditioners

[Vassilevski - Wang, 1997, 1998] [Panayot - Vassilevski, 1997] [Chow - Vassilevski, 2003] [Aksoylu - Holst, 2010]

Low rank matrix decomposition methods

Fast Multipole Method: [Greengard and Rokhlin, 1987]

Hierarchical Matrix Method: [Hackbusch et al., 2002], [Bebendorf, 2008]

slide-31
SLIDE 31

Common theme between these methods

Computation is done with partial information over hierarchies of levels of complexity

Restriction, Interpolation

To compute fast we need to compute with partial information

slide-32
SLIDE 32

The process of discovery of interpolation operators is based on intuition, brilliant insight, and guesswork

Missing information Problem

This is one entry point for statistical inference into numerical analysis and algorithm design

slide-33
SLIDE 33

A simple approximation problem

Φ: Known m × n rank m matrix (m < n)

y: Known element of R^m

Approximate x based on the information that Φx = y

slide-34
SLIDE 34

Worst case approach (Optimal Recovery)

Problem

slide-35
SLIDE 35

Solution

slide-36
SLIDE 36

Average case approach (IBC) Problem

slide-37
SLIDE 37

Solution
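
The formula on this slide did not survive extraction. As a hedged illustration of what an average case solution looks like, the sketch below assumes a Gaussian prior x ~ N(0, Q) with a hypothetical symmetric positive definite matrix Q and returns the conditional mean E[x | Φx = y] = QΦᵀ(ΦQΦᵀ)⁻¹y, the standard Gaussian conditioning formula. The matrices `Phi`, `Q` and the vector `y` are made-up test data, and `conditional_mean` is an illustrative name.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 8

Phi = rng.standard_normal((m, n))   # known m x n matrix (rank m with probability 1)
B = rng.standard_normal((n, n))
Q = B @ B.T + n * np.eye(n)         # hypothetical SPD prior covariance
x_true = rng.standard_normal(n)
y = Phi @ x_true                    # the only information available about x

def conditional_mean(Phi, Q, y):
    """E[x | Phi x = y] for x ~ N(0, Q)."""
    gram = Phi @ Q @ Phi.T          # covariance of the measurements Phi x
    return Q @ Phi.T @ np.linalg.solve(gram, y)

x_hat = conditional_mean(Phi, Q, y)

# The estimate is linear in y and reproduces the measurements exactly.
print(np.allclose(Phi @ x_hat, y))
```

The estimate depends on the prior only through Q, which is why the choice of covariance is the central modelling decision in this approach.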

slide-38
SLIDE 38

Player I Player II

Max   Min

Adversarial game approach

slide-39
SLIDE 39

Loss function

Player I Player II

No saddle point of pure strategies

slide-40
SLIDE 40

Player I Player II

Max   Min

Randomized strategy for player I

slide-41
SLIDE 41

Loss function

Saddle point

slide-42
SLIDE 42

Canonical Gaussian field

slide-43
SLIDE 43

Equilibrium saddle point Player I Player II

slide-44
SLIDE 44

Statistical decision theory

Abraham Wald

  • A. Wald. Statistical decision functions which minimize the maximum risk. Ann. of Math. (2), 46:265–280, 1945.
  • A. Wald. An essentially complete class of admissible decision functions. Ann. Math. Statistics, 18:549–555, 1947.
  • A. Wald. Statistical decision functions. Ann. Math. Statistics, 20:165–205, 1949.
slide-45
SLIDE 45

The game theoretic solution is equal to the worst case solution
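
One way to see the kind of identity claimed here is to assume the worst case problem is posed in the quadratic norm ‖x‖² = xᵀQ⁻¹x. Under that assumption the minimum norm solution of Φx = y coincides with the conditional mean of x ~ N(0, Q) given Φx = y. The sketch below checks numerically that moving away from QΦᵀ(ΦQΦᵀ)⁻¹y inside the solution set only increases the norm. `Q`, `Phi` and `y` are hypothetical test data; the norm actually used on the slides is not recoverable from the extracted text.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 2, 6

Phi = rng.standard_normal((m, n))
B = rng.standard_normal((n, n))
Q = B @ B.T + n * np.eye(n)          # hypothetical SPD matrix defining the norm
Q_inv = np.linalg.inv(Q)
y = rng.standard_normal(m)

def q_norm_sq(x):
    """Squared norm x^T Q^{-1} x."""
    return float(x @ Q_inv @ x)

# Candidate: conditional mean of x ~ N(0, Q) given Phi x = y.
x_star = Q @ Phi.T @ np.linalg.solve(Phi @ Q @ Phi.T, y)

# Basis of the null space of Phi: the last n - m right singular vectors.
_, _, Vt = np.linalg.svd(Phi)
null_basis = Vt[m:].T                # shape (n, n - m)

# Every other solution of Phi x = y has the form x_star + null_basis @ c.
increases = []
for _ in range(100):
    z = x_star + null_basis @ rng.standard_normal(n - m)
    increases.append(q_norm_sq(z) - q_norm_sq(x_star))

print(min(increases) >= 0.0)
```

The Bayesian estimate and the minimum norm interpolant are the same vector here, which is the finite dimensional shadow of the statement on this slide.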

slide-46
SLIDE 46

Generalization

slide-47
SLIDE 47

Examples

L

slide-48
SLIDE 48

Canonical Gaussian field

slide-49
SLIDE 49

Canonical Gaussian field

slide-50
SLIDE 50

Canonical Gaussian field

slide-51
SLIDE 51

Examples

L

slide-52
SLIDE 52

The recovery problem at the core of Algorithm Design and Numerical Analysis

Missing information Problem

Restriction, Interpolation

To compute fast we need to compute with partial information

slide-53
SLIDE 53

Player I Player II

Max   Min

slide-54
SLIDE 54

Examples

Player I Player II Player I Player II

slide-55
SLIDE 55

Loss function

Player I Player II

No saddle point of pure strategies

slide-56
SLIDE 56

Player I Player II

Max   Min

Randomized strategy for player I

slide-57
SLIDE 57

Loss function Theorem But

slide-58
SLIDE 58

Loss function Theorem Definition

slide-59
SLIDE 59

Theorem

slide-60
SLIDE 60

Theorem

slide-61
SLIDE 61
slide-62
SLIDE 62

Game theoretic solution = Worst case solution (Optimal Recovery Solution)

slide-63
SLIDE 63

Gamblets

Optimal bet of player II

slide-64
SLIDE 64

Gamblets = Optimal Recovery Splines

slide-65
SLIDE 65

Dual bases

slide-66
SLIDE 66

Example

− div(a∇u) = g, x ∈ Ω,
u = 0, x ∈ ∂Ω

slide-67
SLIDE 67

ψ_i: Your best bet on the value of u given the information that ∫_{τ_i} u = 1 and ∫_{τ_j} u = 0 for j ≠ i
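
A discrete one dimensional analogue of these best bets can be sketched as follows. The assumptions are loud ones: Ω = (0, 1), the operator −(a u′)′ is replaced by a finite difference matrix A, the subsets τ_i are m consecutive blocks of grid nodes, the integrals are Riemann sums, and the best bet is taken to be the Gaussian conditional mean with covariance A⁻¹, i.e. Ψ = A⁻¹Φᵀ(ΦA⁻¹Φᵀ)⁻¹. The coefficient `a`, the grid sizes and all names are illustrative.

```python
import numpy as np

n, m = 120, 6                       # interior grid nodes and number of subsets tau_i (illustrative)
h = 1.0 / (n + 1)

# Hypothetical rough coefficient a(x), sampled at the n + 1 cell midpoints.
rng = np.random.default_rng(3)
a = np.exp(rng.uniform(-1.0, 1.0, size=n + 1))

# Finite difference matrix for u -> -(a u')' with u(0) = u(1) = 0.
A = (np.diag(a[:-1] + a[1:]) - np.diag(a[1:-1], 1) - np.diag(a[1:-1], -1)) / h**2

# Row i of Phi is a Riemann sum approximating the integral of u over tau_i,
# where tau_i is the i-th block of n // m consecutive grid nodes.
Phi = np.zeros((m, n))
block = n // m
for i in range(m):
    Phi[i, i * block:(i + 1) * block] = h

# Columns of Psi are the discrete best bets psi_i = Q Phi^T (Phi Q Phi^T)^{-1} e_i, Q = A^{-1}.
Q_PhiT = np.linalg.solve(A, Phi.T)
Psi = Q_PhiT @ np.linalg.inv(Phi @ Q_PhiT)

# Discrete counterpart of: integral of psi_i over tau_j is 1 if i == j and 0 otherwise.
print(np.allclose(Phi @ Psi, np.eye(m)))
```

Plotting the columns of `Psi` against the grid would show functions concentrated around their own subset τ_i, shaped by the coefficient a rather than chosen by hand.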

slide-68
SLIDE 68

Example

slide-69
SLIDE 69

Example

slide-70
SLIDE 70

Example

Points x_1, . . . , x_m

φ_i(x) = δ(x − x_i)

ψ_i: Polyharmonic splines

[Harder-Desmarais, 1972] [Duchon 1976, 1977, 1978]

slide-71
SLIDE 71

Example

Points x_1, . . . , x_m

a_{i,j} ∈ L^∞(Ω)

φ_i(x) = δ(x − x_i)

ψ_i: Rough Polyharmonic splines

[Owhadi-Zhang-Berlyand 2013]

slide-72
SLIDE 72

Example

slide-73
SLIDE 73

Example

slide-74
SLIDE 74

Summary

  • Bayesian numerical analysis "works" because Gaussian priors form the optimal class of priors when losses are defined using quadratic norms and measurements are linear.
  • The game theoretic solution is equal to the classical worst case optimal recovery solution under the above conditions.
  • The canonical Gaussian field contains all the required information to bridge scales/levels of complexity in numerical approximation and it does not depend on the linear measurements.

Questions

  • Does the canonical Gaussian field remain optimal (or near optimal) beyond average relative errors (e.g. rare events/large deviations) or when measurements are not linear? This is a fundamental question if probabilistic numerical errors are to be merged with model errors in a unified Bayesian framework.
  • What are the properties of gamblets?
  • Can the game theoretic approach help us solve known open problems in numerical analysis and algorithm design?

slide-75
SLIDE 75

Thank you
