Learned Index Structures: paper by Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis

SLIDE 1

Learned Index Structures

paper by Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis

Bigtable Research Review Meeting
Presented by Deniz Altinbuken
January 29, 2018
go/learned-index-structures-presentation

SLIDE 2

Objectives

  • 1. Show that all index structures can be replaced with deep learning models: learned indexes.
  • 2. Analyze under which conditions learned indexes outperform traditional index structures and describe the main challenges in designing learned index structures.
  • 3. Show that the idea of replacing core components of a data management system through learned models can be very powerful.

SLIDE 3

Claims

  • Traditional indexes assume a worst case data distribution so that they can be general purpose.
    ○ They do not take advantage of patterns.
  • Knowing the exact data distribution enables highly optimizing any index the database system uses.
  • ML opens up the opportunity to learn a model that reflects the patterns and correlations in the data and thus enables the automatic synthesis of specialized index structures: learned indexes.

SLIDE 4

Main Idea

A model can learn the sort order or structure of lookup keys and use this signal to effectively predict the position or existence of records.

SLIDE 5

Background Learned Index Structures Results Conclusion

SLIDE 6

Background

SLIDE 7

Neural Networks: An Example

Recognizing handwriting

  • Very difficult to express our intuitions such as "9 has a loop at the top, and a vertical stroke in the bottom right".
  • Very difficult to create precise rules and solve this algorithmically.
    ○ Too many exceptions, special cases.

SLIDE 8

Neural Networks: An Example

Neural networks approach the problem in a different way.

  • Take a large number of handwritten digits: training data.
  • Develop a system which can learn from the training data.

SLIDE 9

Neural Networks: An Example

Neural networks approach the problem in a different way.

Automatically infer rules for recognizing handwritten digits by going through examples!

SLIDE 10

Neural Networks: An Example

Neural networks approach the problem in a different way.

Create a network of neurons that can learn! :)

SLIDE 11

Neurons: Perceptron

A perceptron takes several binary inputs, x1, x2, … and produces a single binary output. The output is computed as a function of the inputs, where weights w1, w2, … express the importance of inputs to the output.

[diagram: inputs x1, x2, x3 with weights w1, w2, w3 feeding a single output]

SLIDE 12

Neurons: Perceptron

The output is determined by whether the weighted sum ∑j wjxj is less than or greater than some threshold value. Just like the weights, the threshold is a number which is a parameter of the neuron. If the threshold is reached, the neuron fires.

[diagram: inputs x1, x2, x3 with weights w1, w2, w3 compared against threshold t to produce the output]

SLIDE 13

Neurons: Perceptron

The output is determined by whether the weighted sum ∑jwjxj is less than or greater than some threshold value. Just like the weights, the threshold is a number which is a parameter of the neuron. If the threshold is reached, the neuron fires.

output = 0 if ∑j wjxj ≤ threshold
         1 if ∑j wjxj > threshold
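The threshold rule above can be sketched directly in code (a minimal illustration, not from the paper; `perceptron` is a hypothetical helper name):

```python
def perceptron(x, w, threshold):
    # Fire (output 1) only if the weighted sum of inputs exceeds the threshold.
    weighted_sum = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if weighted_sum > threshold else 0
```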

SLIDE 14

Neurons: Perceptron

A more common way to describe a perceptron is with the dot product w⋅x = ∑j wjxj and a bias = -threshold:

output = 0 if w⋅x + bias ≤ 0
         1 if w⋅x + bias > 0

(equivalent to: output = 0 if ∑j wjxj ≤ threshold, 1 if ∑j wjxj > threshold)

Bias describes how easy it is to get the neuron to fire.
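The bias form is the same rule with bias = -threshold moved to the other side of the comparison; a minimal sketch (hypothetical helper name, not from the paper):

```python
def perceptron_bias(x, w, bias):
    # Equivalent formulation of the perceptron: bias = -threshold,
    # so firing happens when the dot product plus bias is positive.
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + bias > 0 else 0
```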

SLIDE 15

Neurons: Perceptron

  • By varying the weights and the threshold, we get different models of decision-making.
  • A complex network of perceptrons that uses layers can make quite subtle decisions.

[diagram: network of perceptrons mapping inputs to an output]

SLIDE 16

Neurons: Perceptron

  • By varying the weights and the threshold, we get different models of decision-making.
  • A complex network of perceptrons that uses layers can make quite subtle decisions.

[diagram: inputs feeding a 1st layer, then a 2nd layer, then the output]

SLIDE 17

Neurons: Perceptron

  • By varying the weights and the threshold, we get different models of decision-making.
  • A complex network of perceptrons that uses layers can make quite subtle decisions.

[diagram: input layer, hidden layers, output layer]

SLIDE 18

Neurons: Perceptron

Perceptrons are great for decision making.

SLIDE 19

Neurons: Perceptron

How about learning?

SLIDE 20

Neurons: Perceptron

Earlier

Automatically infer rules for recognizing handwritten digits by going through examples!

SLIDE 21

Learning

  • A neural network goes through examples to learn weights and biases so that the output from the network correctly classifies a given digit.
  • If a small change in some weight or bias causes only a small corresponding change in the output from the network, the network can learn.

SLIDE 22

Learning

  • A neural network goes through examples to learn weights and biases so that the output from the network correctly classifies a given digit.
  • If a small change in some weight or bias causes only a small corresponding change in the output from the network, the network can learn. The goal is to create the right mapping for all cases.

SLIDE 23

Learning

The neural network is “trained” by adjusting weights and biases to find the perfect model that would generate the expected output for the “training data”.

SLIDE 24

Learning

Through training you minimize the prediction error. (But having perfect output is difficult.)

SLIDE 25

Neurons: Sigmoid

  • Sigmoid neurons are similar to perceptrons, but modified so that small changes in their weights and bias cause only a small change in their output.

[diagram: a small Δ in any weight or bias (w + Δw) causes a small Δ in the output (output + Δoutput)]

SLIDE 26

Neurons: Sigmoid

  • A sigmoid takes several inputs, x1, x2, … which can be any real number between 0 and 1 (e.g. 0.256) and produces a single output, which can also be any real number between 0 and 1.

output = σ(w⋅x + bias)

SLIDE 27

Neurons: Sigmoid

  • A sigmoid takes several inputs, x1, x2, … which can be any real number between 0 and 1 (e.g. 0.256) and produces a single output, which can also be any real number between 0 and 1.

output = σ(w⋅x + bias)

σ(z) = 1 / (1 + e^-z)   (the sigmoid function)
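The sigmoid neuron above can be sketched in a few lines (illustrative helper names, not from the paper):

```python
import math

def sigmoid(z):
    # Squashing function: maps any real z into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_neuron(x, w, bias):
    # Same weighted sum as a perceptron, but squashed instead of thresholded.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + bias)
```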

SLIDE 28

Neurons: Sigmoid

  • A sigmoid takes several inputs, x1, x2, … which can be any real number between 0 and 1 (e.g. 0.256) and produces a single output, which can also be any real number between 0 and 1.

output = σ(w⋅x + bias)

Great for representing probabilities!

SLIDE 29

Neurons: ReLU (Rectified Linear Unit)

  • Better for deep learning because it preserves the information from earlier layers better as it goes through hidden layers.

SLIDE 30

Neurons: ReLU (Rectified Linear Unit)

  • Better for deep learning because it preserves the information from earlier layers better as it goes through hidden layers.

output = 0 if x ≤ 0
         x if x > 0
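The ReLU rule is a one-liner (illustrative helper name):

```python
def relu(x):
    # Pass positive inputs through unchanged; clamp everything else to 0.
    return x if x > 0 else 0
```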

SLIDE 31

Activation Functions (Transfer Functions)

To get an intuition about the neurons, it helps to see the shape of the activation function.

SLIDE 32

Learned Index Structures

SLIDE 33
Index Structures as Neural Network Models

  • Indexes are already to a large extent learned models like neural networks.
  • Indexes predict the location of a value given a key.
    ○ A B-tree is a model that takes a key as an input and predicts the position of a data record.
    ○ A bloom filter is a binary classifier, which given a key predicts if a key exists in a set or not.

SLIDE 34

B-tree

The B-tree provides a mapping from a lookup key into a position inside the sorted array of records.

SLIDE 35

B-tree

The B-tree provides a mapping from a lookup key into a position inside the sorted array of records. For efficiency, index to page granularity.

SLIDE 36

B-tree

The B-tree provides a mapping from a lookup key into a position inside the sorted array of records. Map a key to a position with a min and max error.

SLIDE 37

Replace B-trees with ML Models!

  • We can replace the index with ML models that provide similar strong guarantees about the min and max error.
  • The B-tree only provides this guarantee over the stored data, not for all possible data.
    ○ The min and max error is the maximum error of the model over the training data.
    ○ Execute the model for every key and remember the worst over- and under-prediction of a position.
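The error-bound step above can be sketched as follows, assuming `model` is any callable mapping a key to a predicted position over the sorted stored keys (`min_max_error` is a hypothetical helper, not the paper's code):

```python
def min_max_error(model, keys):
    # Run the model over every stored key (sorted) and record the worst
    # under-prediction (min) and over-prediction (max) of its position.
    errors = [true_pos - model(key) for true_pos, key in enumerate(keys)]
    return min(errors), max(errors)
```

At lookup time these two numbers bound the local search range around the model's prediction.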

SLIDE 38
  • B-trees have a bounded cost for inserts and lookups and are good at taking advantage of the cache.
  • B-trees can map keys to pages which are not continuously mapped to memory or disk.
  • If a lookup key does not exist in the set, certain models might return positions outside the min/max error range if they are not monotonically increasing models.

Challenges

SLIDE 39
  • Using ML models has the potential to transform the cost of a log n B-tree lookup into a constant operation (in the best case).
  • Neural networks are able to learn a wide variety of data distributions, mixtures and other data peculiarities and patterns and make use of these.
    ○ Have to balance the complexity of the model with its accuracy.

Advantages

SLIDE 40

A First, Naïve Learned Index

  • Use 200M web-server log records to build a secondary index over the timestamps using Tensorflow.
    ○ Two-layer fully-connected NN with 32 neurons per layer using ReLU activation functions; the timestamps are the inputs and the positions are the outputs.
    ○ Lookup time ≈ 80,000 ns (model execution only).
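The shape of that model can be sketched as a plain forward pass. This is only an illustration of the architecture (two hidden ReLU layers of 32 units, scalar timestamp in, scalar position out) with untrained random weights; the actual experiment used a trained TensorFlow model, and `make_layer`/`forward` are hypothetical names:

```python
import random

def make_layer(n_in, n_out, rng):
    # One fully-connected layer: a weight matrix and a bias vector.
    w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    b = [0.0] * n_out
    return w, b

def forward(x, layers):
    a = x
    for i, (w, b) in enumerate(layers):
        z = [sum(wj * aj for wj, aj in zip(row, a)) + bj
             for row, bj in zip(w, b)]
        # ReLU on hidden layers; linear output on the last layer.
        a = z if i == len(layers) - 1 else [max(0.0, v) for v in z]
    return a[0]

rng = random.Random(0)
net = [make_layer(1, 32, rng), make_layer(32, 32, rng), make_layer(32, 1, rng)]
position = forward([123456.0], net)  # predicted position for one timestamp
```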

SLIDE 41

A First, Naïve Learned Index

  • Use 200M web-server log records to build a secondary index over the timestamps using Tensorflow.
    ○ Two-layer fully-connected NN with 32 neurons per layer using ReLU activation functions; the timestamps are the inputs and the positions are the outputs.
    ○ Lookup time ≈ 80,000 ns (model execution only).
  • CPU and space efficient to narrow down the position for an item from the entire data set to a region of thousands, but inefficient for the “last mile”.

SLIDE 42

A First, Naïve Learned Index

We want to map each of 100M keys to a position in a sorted array. With a single model, that model has to be “complex enough” to figure out an accurate mapping for every key.

SLIDE 43

The Recursive Model Index

It is much easier to have a model that can say that a given key from 100M keys maps to the first 10k, second 10k, etc. positions!

SLIDE 44

The Learning Index Framework (LIF)

  • The LIF can be regarded as an index synthesis system; given an index specification, LIF generates different index configurations, optimizes them, and tests them automatically.
  • Given a trained Tensorflow model, LIF automatically extracts all weights from the model and generates efficient index structures in C++ based on the model specification.

SLIDE 45
The Recursive Model Index

  • Improve last-mile accuracy.
    ○ Reducing min/max error to 100 from 100M records using a single model is very hard.
    ○ Reducing the error to 10k from 100M is much easier to achieve even with simple models.
    ○ Reducing the error from 10k to 100 is simpler as the model can focus only on a subset of the data.

SLIDE 46
The Recursive Model Index

  • Improve last-mile accuracy.
    ○ Reducing min/max error to 100 from 100M records using a single model is very hard.
    ○ Reducing the error to 10k from 100M is much easier to achieve even with simple models.
    ○ Reducing the error from 10k to 100 is simpler as the model can focus only on a subset of the data.
  💢 Use a hierarchical approach where we can have models focus on smaller subsets of data.

SLIDE 47

The Recursive Model Index

Take a layered approach and have models focus on limited layers:

SLIDE 48

The Recursive Model Index

Take a layered approach and have models focus on limited layers:

Reduce from 100M to 1M → Reduce from 1M to 10k → Reduce from 10k to 100
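The staged reduction above can be sketched as a toy two-stage RMI. This is an illustrative reconstruction, not the paper's implementation: `LinearModel`, `TwoStageRMI`, and the fixed fanout are assumptions, and both stages are simple least-squares linear fits whose recorded min/max error bounds the final binary search:

```python
import bisect
import math

class LinearModel:
    """Least-squares fit of position ≈ a*key + b (closed form)."""
    def __init__(self, keys, positions):
        n = len(keys)
        mean_k = sum(keys) / n
        mean_p = sum(positions) / n
        var = sum((k - mean_k) ** 2 for k in keys)
        cov = sum((k - mean_k) * (p - mean_p) for k, p in zip(keys, positions))
        self.a = cov / var if var else 0.0
        self.b = mean_p - self.a * mean_k

    def predict(self, key):
        return self.a * key + self.b

class TwoStageRMI:
    def __init__(self, keys, fanout=4):
        self.keys = sorted(keys)
        self.fanout = fanout
        positions = list(range(len(self.keys)))
        # Stage 1: one root model routes each key to a stage-2 model.
        self.root = LinearModel(self.keys, positions)
        buckets = [([], []) for _ in range(fanout)]
        for k, p in zip(self.keys, positions):
            ks, ps = buckets[self._route(k)]
            ks.append(k)
            ps.append(p)
        # Stage 2: fit one model per bucket, recording its worst
        # under-/over-prediction (the min/max error bound).
        self.leaves, self.bounds = [], []
        for ks, ps in buckets:
            if not ks:
                self.leaves.append(None)
                self.bounds.append((0.0, 0.0))
                continue
            m = LinearModel(ks, ps)
            errs = [p - m.predict(k) for k, p in zip(ks, ps)]
            self.leaves.append(m)
            self.bounds.append((min(errs), max(errs)))

    def _route(self, key):
        i = int(self.root.predict(key) * self.fanout / len(self.keys))
        return max(0, min(self.fanout - 1, i))

    def lookup(self, key):
        m = self.leaves[self._route(key)]
        if m is None:
            return None
        lo_e, hi_e = self.bounds[self._route(key)]
        pos = m.predict(key)
        lo = max(0, math.floor(pos + lo_e))                    # worst under-prediction
        hi = min(len(self.keys), math.floor(pos + hi_e) + 1)   # worst over-prediction
        j = bisect.bisect_left(self.keys, key, lo, max(lo, hi))
        return j if j < len(self.keys) and self.keys[j] == key else None
```

On smooth data the leaf models are accurate and the final search range is tiny; that is exactly the "last mile" argument made above.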

SLIDE 49

The Recursive Model Index

Take a layered approach and have models focus on limited layers:

Check out the math in the paper if you’re interested in the details! :)

Reduce from 100M to 1M → Reduce from 1M to 10k → Reduce from 10k to 100

SLIDE 50

Hybrid End-to-End Training

With a layered approach we can build mixtures of models!

[diagram: stage 1 (100M → 1M) is a small ReLU NN; stage 2 (1M → 10k) is linear regression models; stage 3 (10k → 100) mixes linear regression models and B-trees]

SLIDE 51

Hybrid End-to-End Training

Starting from the entire dataset (line 3), it trains first the top-node model. Based on the prediction of this model, it then picks the model from the next stage (lines 9 and 10) and adds all keys which fall into that model (line 10). Finally, in the case of hybrid indexes, the index is optimized by replacing NN models with B-trees if the absolute min-/max-error is above a predefined threshold (lines 11-14).

SLIDE 52

Hybrid End-to-End Training

Worst case is a B-tree!

SLIDE 53

To find the record, either binary search or scanning is used. Models might generate more information than the page location.

  • Model Binary Search
    ○ Set the first middle point to the pos predicted by the model.
  • Biased Search
    ○ Use the standard deviation σ of the last stage model to set the middle point.
  • Biased Quaternary Search
    ○ Pick three middle points as pos − σ, pos, pos + σ.

Search Strategies
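The first strategy, model binary search, can be sketched as follows (an illustration under the assumption that `pos` is the model's predicted index; `model_binary_search` is a hypothetical name):

```python
def model_binary_search(arr, key, pos, lo=0, hi=None):
    # Standard binary search, except the FIRST probe is at the model's
    # predicted position instead of the midpoint; if the prediction is
    # close, this converges in far fewer probes.
    hi = len(arr) if hi is None else hi
    mid = max(lo, min(hi - 1, pos))  # clamp the prediction into range
    while lo < hi:
        if arr[mid] == key:
            return mid
        if arr[mid] < key:
            lo = mid + 1
        else:
            hi = mid
        mid = (lo + hi) // 2          # fall back to normal halving
    return -1
```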

SLIDE 54
  • Turn strings into inputs the NN model can use.
    ○ Represent a string as a vector, where each element is the decimal ASCII value of a char.
    ○ Limit the size of the vector to N to have equally-sized inputs.
  • Vector inputs slow the model down significantly.
  • Further research is needed to speed this case up :)

Indexing strings
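The fixed-length encoding described above is a few lines (illustrative helper; the pad value of 0 is an assumption):

```python
def string_to_vector(s, n=8, pad=0):
    # Each char becomes its decimal ASCII value; truncate or zero-pad
    # to length n so every input vector has the same size.
    vec = [ord(c) for c in s[:n]]
    return vec + [pad] * (n - len(vec))
```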

SLIDE 55

Inserts and Updates

  • Appends
    ○ No need to relearn if the model can learn the key trend for the new items.
  • Inserts in the middle
    ○ If inserts follow roughly a similar pattern as the learned CDF, retraining is not needed since the index “generalizes” over the new items and inserts become an O(1) operation.

SLIDE 56

If we have a model that is more general, it is cheaper to insert new values, since they will follow the trend.

Inserts and Updates

SLIDE 57

Hashmap

Hashmaps use a hash function to deterministically map keys to random positions inside an array.

SLIDE 58

Hashmap

Main challenge is to reduce conflicts.

  • Use a linked-list to handle the “overflow”.
  • Use linear or quadratic probing.
  • Most solutions allocate significantly more memory than records and combine it with additional data structures.
    ○ Dense hashmap: typical overhead of 78% memory.
    ○ Sparse hashmap: only 4 bits overhead, but up to 3-7 times slower because of its search and data placement strategy.

SLIDE 59
  • If we could learn a model which uniquely maps every key into a unique position inside the array, we could avoid conflicts.
  • Learned models are capable of reaching higher utilization of the hashmap depending on the data distribution.
  • Scale the learned distribution F by the targeted size M of the hashmap and use h(K) = F(K) ∗ M as the hash function for key K.
  • If the model F perfectly learned the distribution, no conflicts would exist.

Hashmap
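The h(K) = F(K) ∗ M idea above can be sketched with the model F approximated by the empirical CDF of a key sample (a stand-in for a trained model; `make_learned_hash` is a hypothetical helper):

```python
import bisect

def make_learned_hash(sample_keys, m):
    # h(K) = F(K) * M, where F is approximated here by the empirical CDF
    # of a sorted key sample. A well-learned F spreads keys near-uniformly
    # over the m array slots, which is what reduces conflicts.
    sorted_keys = sorted(sample_keys)
    n = len(sorted_keys)

    def h(key):
        cdf = bisect.bisect_right(sorted_keys, key) / n  # empirical F(key)
        return min(m - 1, int(cdf * m))

    return h
```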

SLIDE 60

Bloom filter

Bloom filters are probabilistic data structures used to test whether an element is a member of a set.

[diagram: Bloom filter insertion vs. learned Bloom filter insertion]

SLIDE 61

Bloom filter

  • A bloom filter index needs to learn a function that separates keys from everything else.
    ○ A good hash function for a bloom filter should have lots of collisions among keys and lots of collisions among non-keys, but few collisions of keys and non-keys.
  • As a classification problem: learn a model f that can predict if an input x is a key or non-key.

SLIDE 62
  • As a classification problem: learn a model f that can predict if an input x is a key or non-key.
    ○ Use sigmoid neurons to find a probability between 0 and 1.
    ○ The output of the NN is the probability that input x is a key in our database.
    ○ Choose a threshold t above which we will assume the key exists in our database.
    ○ Tune threshold t to achieve the desired false positive rate.
    ○ To prevent false negatives, use an overflow bloom filter.

Bloom filter
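The classifier-plus-overflow design above can be sketched as follows. This is an illustration, not the paper's code: the scoring `model` passed in is a stand-in for the trained NN, and the class/helper names are assumptions. Keys the model scores at or below the threshold go into a small conventional overflow Bloom filter, so the combined structure never returns a false negative:

```python
import hashlib

class BloomFilter:
    """Plain Bloom filter: k deterministic hash positions per item."""
    def __init__(self, size, num_hashes):
        self.size, self.k = size, num_hashes
        self.bits = bytearray(size)

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        return all(self.bits[p] for p in self._positions(item))

class LearnedBloomFilter:
    def __init__(self, keys, model, threshold, overflow_size=1024):
        self.model, self.t = model, threshold
        self.overflow = BloomFilter(overflow_size, 3)
        for key in keys:
            # Any key the model would miss is caught by the overflow filter.
            if model(key) <= threshold:
                self.overflow.add(key)

    def __contains__(self, key):
        return self.model(key) > self.t or key in self.overflow
```

Raising t shrinks the model's false positives but pushes more keys into the overflow filter; that trade-off is what the threshold tuning bullet above refers to.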

SLIDE 63

Results

SLIDE 64
  • 4 datasets to compare the performance of learned index structures with B-trees.
    ○ Compare lookup-time (model execution time + local search time).
    ○ Compare index structure size.
    ○ Compare model error and error variance.
  • These results focus on read performance only; loading and insertion time are not included.
    ○ A model without hidden layers can be trained on over 200M records in just a few seconds.

B-tree Results

SLIDE 65

200M log entries for requests to a major university website. Index over all unique timestamps.

Web Log Dataset

SLIDE 66

200M log entries for requests to a major university website. Index over all unique timestamps.

Web Log Dataset

The model error is the averaged standard error over all models on the last stage, whereas the error variance indicates how much this standard error varies between the models.

SLIDE 67

Model is 3× faster and up to an order-of-magnitude smaller.

Web Log Dataset

SLIDE 68

Quaternary search only helps a little bit.

Web Log Dataset

SLIDE 69

The error is high, which influences the search time.

Web Log Dataset

SLIDE 70

Maps Dataset

Index of the longitude of ≈ 200M user-maintained features across the world. Relatively linear.

SLIDE 71

Maps Dataset


Model is 3× faster and up to an order-of-magnitude smaller.

SLIDE 72

Maps Dataset

Quaternary search does not help.

SLIDE 73

Lognormal Dataset

Synthetic dataset of 190M unique values to test how the index works on heavy-tail distributions. Highly non-linear, making the distribution more difficult to learn.

SLIDE 74

Lognormal Dataset


The error is high, which influences the search time.

SLIDE 75

Important Observations

  • Learned indexes are 3× faster and up to an order-of-magnitude smaller.
  • Quaternary search only helps for some datasets.
  • The model accuracy varies widely. It is most noticeable for the synthetic dataset and the weblog data, where the error is much higher.
  • Second stage size has a significant impact on the index size and lookup performance.
    ○ This is not surprising as the second stage determines how many models have to be stored. Worth noting is that our second stage uses 10,000 or more models.

SLIDE 76

Web Document Dataset

The web-document dataset consists of the 10M non-continuous document-ids of a large web index used as part of a real product at a large internet company.

SLIDE 77

Web Document Dataset

Speedups for learned indexes are not as prominent, so hybrid indexes, which replace badly performing models with B-trees, actually help to improve performance.

SLIDE 78

Web Document Dataset

Because the cost of searching is higher, the different search strategies make a bigger difference. The reason why biased search and quaternary search perform better is that they can take the standard error into account.

SLIDE 79
  • Use 3 int datasets.
  • Model hash has similar performance and utilizes the memory better.
  • When there are extra slots, the improvement disappears.

Hashmap Results

SLIDE 80
  • Blacklisted phishing URLs dataset: 1.7M unique URLs.
  • The more accurate the model is, the better the savings in bloom filter size.

Bloom filter Results

SLIDE 81
  • A normal Bloom filter with a desired 1% false positive rate requires 2.04MB.
  • For a 16-dim GRU with a 32-dim embedding for each character, the model is 0.0259MB; with the spillover it is 1.07MB.

Bloom filter Results

SLIDE 82

Conclusion

SLIDE 83
  • Multi-Dimensional Indexes: Extend learned indexes to multi-dimensional index structures. Models, especially neural nets, are extremely good at capturing complex high-dimensional relationships.
  • Learned Algorithms: A model can also speed up sorting and joins, not just indexes.
  • GPU/TPUs: GPU/TPUs will make the idea of learned indexes even more viable.

Conclusion and Future Work

SLIDE 84

Next time

  • Is this a good idea?
  • Related work
  • Some Notes on "Learned Bloom Filters"
  • Don't Throw Out Your Algorithms Book Just Yet
