Project Proposal: Prediction by Compression



slide-1
SLIDE 1

Project Proposal: Prediction by Compression

Lasse Blaauwbroek

Czech Institute for Informatics, Robotics and Cybernetics
Czech Technical University in Prague
AITP 2018

March 30, 2018

slide-5
SLIDE 5

Compressor C such that C(s) is the length of the compression of s

s and t share all information ⟹ C(st) ≈ C(s) + b
s and t share no information ⟹ C(st) ≈ C(s) + C(t)

\[ \mathrm{NCD}_C(s,t) = \frac{C(st) - \min(C(s), C(t))}{\max(C(s), C(t))} \]

Under reasonable conditions for C, NCD_C approximates a metric

[Cilibrasi and Vitányi 2003], [Li et al. 2004]
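
The NCD formula can be tried directly with an off-the-shelf compressor. The following is a minimal sketch, assuming zlib as the compressor C (the slides do not name a compressor at this point). The function names `compressed_length` and `ncd` are illustrative.

```python
import random
import zlib


def compressed_length(s: bytes) -> int:
    """C(s): length in bytes of the zlib compression of s (assumed compressor)."""
    return len(zlib.compress(s, 9))


def ncd(s: bytes, t: bytes) -> float:
    """NCD_C(s, t) = (C(st) - min(C(s), C(t))) / max(C(s), C(t))."""
    cs, ct = compressed_length(s), compressed_length(t)
    return (compressed_length(s + t) - min(cs, ct)) / max(cs, ct)


if __name__ == "__main__":
    text = b"s and t share all information. " * 40
    rng = random.Random(0)
    # Pseudo-random bytes stand in for a string sharing no information with text.
    noise = bytes(rng.getrandbits(8) for _ in range(len(text)))
    print(ncd(text, text))   # shared information: small value
    print(ncd(text, noise))  # no shared information: value near 1
```

With a real compressor the values only approximate the ideal 0 and 1, because C(st) carries some overhead even when t repeats s.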

slide-8
SLIDE 8

Let P be the set of valid programs for programming language L

Kolmogorov complexity K:

\[ K(s) = \min_{p \in P \,\wedge\, L(p) = s} |p| \]

\[ \mathrm{NCD}_K(s,t) = \frac{K(st) - \min(K(s), K(t))}{\max(K(s), K(t))} \]

NCD_K is the distance metric: ∀ d, s, t. computable(d) ⟹ NCD_K(s,t) ≤ d(s,t)

[Cilibrasi and Vitányi 2003], [Li et al. 2004]

slide-9
SLIDE 9

No domain-specific knowledge necessary!

slide-14
SLIDE 14

Problem: Mathematical statements are short

Compression: Prediction by Partial Matching
Compress entire proof states
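
To illustrate why a context-modelling compressor suits short inputs, here is a rough sketch of a PPM-style code-length estimator. It is a simplified stand-in, not the compressor used in the talk: it backs off from the longest seen context with method-C-style escape probabilities, applies no exclusions, and returns an ideal code length in bits rather than producing a bitstream. The name `ppm_code_length_bits` and the default `max_order=3` are assumptions made for the example.

```python
import math
from collections import defaultdict


def ppm_code_length_bits(data: bytes, max_order: int = 3) -> float:
    """Estimate the adaptive code length of data in bits (simplified PPM sketch)."""
    # counts[context][symbol] = how often symbol followed context so far
    counts = defaultdict(lambda: defaultdict(int))
    total_bits = 0.0
    for i, sym in enumerate(data):
        prob = 1.0
        coded = False
        # Try the longest available context first, then back off.
        for order in range(min(max_order, i), -1, -1):
            seen = counts.get(data[i - order:i])
            if not seen:
                continue  # context never seen: back off at no cost
            total = sum(seen.values())
            distinct = len(seen)
            if sym in seen:
                prob *= seen[sym] / (total + distinct)
                coded = True
                break
            prob *= distinct / (total + distinct)  # escape to a shorter context
        if not coded:
            prob *= 1.0 / 256  # order -1 fallback: uniform over all byte values
        total_bits += -math.log2(prob)
        # Update every context order with the symbol just coded.
        for order in range(min(max_order, i) + 1):
            counts[data[i - order:i]][sym] += 1
    return total_bits


if __name__ == "__main__":
    state = b"forall x, x + 0 = x"
    print(ppm_code_length_bits(state))
    print(ppm_code_length_bits(state + state))  # well below twice the value above
```

Because the model has no header and adapts from the first symbol, shared structure between two short strings shows up in the code length almost immediately.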

slide-19
SLIDE 19

a ∧ b    ¬a ∨ c    ¬c ∨ ¬b    ¬c    c    ¬b

a    b    ¬a    c

“a ∧b¬a ∨c¬bab¬ac”

Database

[Kaliszyk, Urban and Vyskočil 2015]

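
The pipeline sketched on this slide (serialize a proof state to a string, then compare it against a database) can be approximated as a nearest-neighbour lookup under NCD. The following is a minimal sketch, assuming zlib as the compressor and a database of (serialized state, label) pairs. The example states, the labels, and the `predict` function are hypothetical and not taken from the talk.

```python
import zlib


def compressed_length(s: str) -> int:
    """C(s) for a serialized proof state, using zlib as an assumed compressor."""
    return len(zlib.compress(s.encode("utf-8"), 9))


def ncd(s: str, t: str) -> float:
    """Normalized compression distance between two serialized states."""
    cs, ct = compressed_length(s), compressed_length(t)
    return (compressed_length(s + t) - min(cs, ct)) / max(cs, ct)


def predict(query: str, database: list[tuple[str, str]], k: int = 3) -> list[str]:
    """Return the labels of the k database entries closest to query under NCD."""
    ranked = sorted(database, key=lambda entry: ncd(query, entry[0]))
    return [label for _, label in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical database: serialized states paired with made-up labels.
    database = [
        ("a ∧ b ¬a ∨ c ¬b a b ¬a c", "label-1"),
        ("p → q p q", "label-2"),
        ("¬c ∨ ¬b ¬c c ¬b", "label-3"),
    ]
    print(predict("a ∧ b ¬a ∨ c", database, k=2))
```

On strings this short a general-purpose compressor such as zlib gives noisy distances, which is the problem the talk addresses by using Prediction by Partial Matching.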
slide-20
SLIDE 20

[Plot: percentage (y-axis, 0.2 to 1) against k nearest neighbor (x-axis, 2 to 10). Labels: random, best case, choice available, compression]

slide-21
SLIDE 21

[Plot: percentage (y-axis, 0.6 to 0.8) against k nearest neighbor (x-axis, 2 to 10). Labels: best case, compression, compression randomized, compression reversed, feature based comparison]

slide-23
SLIDE 23

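About 30-40 compressions per second
No vector space: n compressions per prediction

Idea: Impose structure through an n-dimensional lattice

S_n = {X ⊆ S | |X| = n}

[Formula garbled in extraction: a function of s (name truncated to "ut(s)") is defined as an argmax over X ∈ S_n of an expression combining NCD(t,u) for t,u ∈ X with ∑_{t ∈ X} NCD(s,t)]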
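
The cost claim can be made concrete with a small counting sketch. It assumes zlib as the compressor and caches C(t) for every database entry, so that one prediction needs one compression for the query plus one per entry. The class `NCDIndex`, the helper `seconds_per_prediction`, and the default rate of 35 compressions per second (the midpoint of the 30-40 quoted above) are assumptions made for illustration.

```python
import zlib


class NCDIndex:
    """Database of byte strings with C(t) cached for every entry t."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.compressions = 0  # running count of compressor calls
        self.cached = [self._c(t) for t in self.entries]  # one-off cost

    def _c(self, s: bytes) -> int:
        self.compressions += 1
        return len(zlib.compress(s, 9))

    def distances(self, query: bytes) -> list[float]:
        """NCD from query to every entry: 1 + n compressions in total."""
        cq = self._c(query)
        result = []
        for t, ct in zip(self.entries, self.cached):
            cqt = self._c(query + t)  # cannot be cached: depends on the query
            result.append((cqt - min(cq, ct)) / max(cq, ct))
        return result


def seconds_per_prediction(n: int, rate: float = 35.0) -> float:
    """Rough time per prediction for n entries at rate compressions per second."""
    return (n + 1) / rate


if __name__ == "__main__":
    index = NCDIndex([b"a b c", b"p q r", b"x y z"])
    before = index.compressions
    index.distances(b"a b d")
    print(index.compressions - before)   # compressions used by one prediction
    print(seconds_per_prediction(10_000))
```

Since C(st) must be recomputed for every pair, the per-prediction cost grows linearly with the database, which is what motivates imposing additional structure on the data.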

slide-28
SLIDE 28

Pros
⊲ No domain-specific knowledge required
⊲ Predictions are competitive
⊲ Robust against different representations of proof states

Cons
⊲ Relatively slow
⊲ No vector space

Ideas
⊲ Adapt the PPM compressor for tree structures
⊲ Impose an n-dimensional lattice on the data

?