Project Proposal: Prediction by Compression

Lasse Blaauwbroek
Czech Institute for Informatics, Robotics and Cybernetics
Czech Technical University in Prague
AITP 2018, March 30, 2018

Compressor $C$ such that $C(s)$ is the length of the compression of $s$.

$s$ and $t$ share all information $\Longrightarrow C(st) \approx C(s) + b$
$s$ and $t$ share no information $\Longrightarrow C(st) \approx C(s) + C(t)$

$$\mathrm{NCD}_C(s, t) = \frac{C(st) - \min(C(s), C(t))}{\max(C(s), C(t))}$$

Under reasonable conditions for $C$, $\mathrm{NCD}_C$ approximates a metric [Cilibrasi and Vitányi 2003], [Li et al. 2004].
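The NCD formula above can be tried out with any off-the-shelf compressor. The following is a minimal sketch, assuming zlib as a stand-in for $C$; the slides do not use zlib, and the names `compressed_len` and `ncd` are illustrative only.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """C(s): length in bytes of the zlib-compressed form of s (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(s: bytes, t: bytes) -> float:
    """Normalized compression distance: (C(st) - min(C(s), C(t))) / max(C(s), C(t))."""
    cs, ct = compressed_len(s), compressed_len(t)
    cst = compressed_len(s + t)
    return (cst - min(cs, ct)) / max(cs, ct)


# Two inputs for illustration: a repetitive text and an unrelated byte pattern.
repetitive = b"forall x y, x + y = y + x " * 20
unrelated = bytes(range(256)) * 2

print(ncd(repetitive, repetitive))  # small: the second copy adds little
print(ncd(repetitive, unrelated))   # close to 1: almost nothing is shared
```

With a real compressor the distance of a string to itself is small but not exactly zero, which is one reason the slides say NCD only approximates a metric.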

Let $P$ be the set of valid programs for programming language $L$.

Kolmogorov complexity $K$, the length of the shortest program that outputs $s$:

$$K(s) = \min_{p \in P \,\wedge\, L(p) = s} |p|$$

$$\mathrm{NCD}_K(s, t) = \frac{K(st) - \min(K(s), K(t))}{\max(K(s), K(t))}$$

$\mathrm{NCD}_K$ is the distance metric, in that it lower-bounds every computable distance $d$:

$$\forall d, s, t.\ \mathrm{computable}(d) \Rightarrow \mathrm{NCD}_K(s, t) \le d(s, t)$$

[Cilibrasi and Vitányi 2003], [Li et al. 2004]

No domain-specific knowledge necessary!


Problem: Mathematical statements are short
Compression: Prediction by Partial Matching
Compress entire proof states
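Why short statements are a problem can be seen with a general-purpose compressor. The sketch below assumes zlib purely for illustration (it is not the PPM compressor named on the slide, and `overhead` is a hypothetical helper): on a single short formula the fixed format overhead dominates, whereas a longer serialized proof state gives the compressor something to work with.

```python
import zlib


def overhead(statement: str) -> int:
    """Bytes by which the zlib output exceeds the raw UTF-8 input (negative = saving)."""
    raw = statement.encode("utf-8")
    return len(zlib.compress(raw, 9)) - len(raw)


# A single short formula versus a longer, made-up proof state built from clauses.
short = "a ∧ b"
long_state = " ".join(["a ∧ b", "¬a ∨ c", "¬c ∨ ¬b"] * 30)

print(overhead(short))       # positive: the "compressed" form is longer than the input
print(overhead(long_state))  # negative: real compression happens
```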

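[Figure: a proof state containing a ∧ b, ¬a ∨ c, ¬c ∨ ¬b, ¬b, ¬c, c and a, b, ¬a, c is written out as the string “a ∧ b ¬a ∨ c ¬b a b ¬a c” and compared (⇔) against the entries of a database.] [Kaliszyk, Urban and Vyskočil 2015]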
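Comparing a serialized proof state against a database can be sketched as a k-nearest-neighbor lookup under NCD. This is a minimal sketch under stated assumptions: zlib stands in for the compressor, the database is a plain list of (state, label) pairs, and the labels (`"split"`, `"induction"`, `"intro"`) are invented placeholders, since the slides do not say what is being predicted.

```python
import zlib
from collections import Counter


def clen(data: bytes) -> int:
    """Compressed length in bytes (zlib is an assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(s: str, t: str) -> float:
    """Normalized compression distance between two serialized proof states."""
    bs, bt = s.encode("utf-8"), t.encode("utf-8")
    cs, ct = clen(bs), clen(bt)
    return (clen(bs + bt) - min(cs, ct)) / max(cs, ct)


def predict(state: str, database: list[tuple[str, str]], k: int = 3) -> str:
    """Majority label among the k database entries closest to `state` under NCD."""
    nearest = sorted(database, key=lambda entry: ncd(state, entry[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical database: serialized states paired with made-up labels.
database = [
    ("a ∧ b ¬a ∨ c ¬c ∨ ¬b ¬b ¬c c", "split"),
    ("forall n : nat, n + 0 = n", "induction"),
    ("exists x, P x -> Q x", "intro"),
]

print(predict("a ∧ b ¬a ∨ c ¬c ∨ ¬b ¬b ¬c c", database, k=1))
```

Each call to `predict` computes one NCD per database entry, which is the cost the later slides refer to.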

[Plot: percentage (y-axis, 0 to 1) against k nearest neighbor (x-axis, 2 to 10). Labels: choice available, best case, compression, random.]

[Plot: percentage (y-axis, 0.6 to 0.8) against k nearest neighbor (x-axis, 2 to 10). Series: best case, compression, compression randomized, compression reversed, feature based comparison.]

About 30-40 compressions per second
No vector space: $n$ compressions per prediction

Idea: Impose structure through an $n$-dimensional lattice

$$S_n = \{ X \subseteq S \mid |X| = n \}$$

$$\mathrm{out}(s) = \operatorname*{argmax}_{X \in S_n} \frac{\sum_{t, u \in X} \mathrm{NCD}(t, u)}{\sum_{t \in X} \mathrm{NCD}(s, t)}$$
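One possible reading of the objective above is: among all $n$-element subsets $X$ of $S$, pick the one whose members are far apart from each other but close to $s$. The brute-force sketch below follows that reading and rests on several assumptions not stated on the slide: zlib as the compressor, the numerator summed over unordered pairs of distinct elements, and a guard for a zero denominator. It enumerates every subset, so it is only usable for tiny $S$.

```python
import math
import zlib
from itertools import combinations


def clen(data: bytes) -> int:
    """Compressed length in bytes (zlib is an assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(s: bytes, t: bytes) -> float:
    cs, ct = clen(s), clen(t)
    return (clen(s + t) - min(cs, ct)) / max(cs, ct)


def score(s: bytes, X: tuple[bytes, ...]) -> float:
    """Pairwise spread of X divided by the total distance from s to X."""
    # Assumption: the sum over t, u ranges over unordered pairs of distinct elements.
    spread = sum(ncd(t, u) for t, u in combinations(X, 2))
    closeness = sum(ncd(s, t) for t in X)
    # Assumption: a non-positive denominator is treated as a perfect match.
    return math.inf if closeness <= 0 else spread / closeness


def out(s: bytes, S: list[bytes], n: int) -> tuple[bytes, ...]:
    """Brute-force argmax over all n-element subsets of S."""
    return max(combinations(S, n), key=lambda X: score(s, X))


# Tiny made-up data set, only to show the call shape.
S = [b"a and b", b"not a or c", b"not c or not b", b"c"]
chosen = out(b"a and c", S, 2)
print(chosen)
```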

Pros
⊲ No domain-specific knowledge required
⊲ Predictions are competitive
⊲ Robust against different representations of proof states

Cons
⊲ Relatively slow
⊲ No vector space

Ideas
⊲ Adapt the PPM compressor for tree structures
⊲ Impose an n-dimensional lattice on the data
