

  1. DATA ANALYTICS USING DEEP LEARNING // GT 8803 // FALL 2018 // CHRISTINE HERLIHY
LECTURE #04: SEQ2SQL: GENERATING STRUCTURED QUERIES FROM NATURAL LANGUAGE USING REINFORCEMENT LEARNING

  2. TODAY’S PAPER
• Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
• Authors:
  – Victor Zhong, Caiming Xiong, Richard Socher
  – All are affiliated with Salesforce Research
• Areas of focus:
  – Machine translation; deep learning and reinforcement learning for query generation and validation

  3. TODAY’S AGENDA
• Problem Overview
• Context: Background Info on Relevant Concepts
• Key Idea
• Technical Details
• Experiments
• Discussion Questions

  4. PROBLEM OVERVIEW
• Status Quo:
  – A lot of interesting data is stored in relational databases
  – To access this data, you have to know SQL
• Objective:
  – Make it easier for end-users to query relational databases by translating natural language questions to SQL queries
  – Example: “What is the capital of the United States?” → SELECT capital WHERE country = “United States”
• Key contributions:
  – Seq2SQL model: a DNN to translate NL questions to SQL
  – WikiSQL: an annotated corpus containing 80,654 questions mapped to SQL queries and tables from Wikipedia

  5. CONTEXT: SQL CONCEPTS
• SQL is a declarative query language used to extract information from relational databases; results are returned as rows and columns
• A schema is a collection of database objects (here, tables)
• Even basic queries may include several clauses (see the sketch below):
  – Aggregation operation(s) (e.g., COUNT, MIN, MAX)
  – SELECT column(s) FROM schema.table
  – WHERE (condition1) AND (condition2)
Image: https://arxiv.org/pdf/1709.00103.pdf
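A minimal sketch (not the paper's code) of how those clauses fit together; AGG_OPS and render_query are hypothetical names used only to illustrate the aggregation / SELECT / WHERE structure of a WikiSQL-style query:

```python
AGG_OPS = ["", "MAX", "MIN", "COUNT", "SUM", "AVG"]  # index 0 = no aggregation

def render_query(agg_idx, select_col, conditions, table="mytable"):
    """Render an (aggregation, select column, where conditions) triple as SQL."""
    col = f"{AGG_OPS[agg_idx]}({select_col})" if agg_idx else select_col
    sql = f"SELECT {col} FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(f"{c} {op} '{v}'" for c, op, v in conditions)
    return sql

print(render_query(3, "player", [("country", "=", "United States")]))
# SELECT COUNT(player) FROM mytable WHERE country = 'United States'
```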

  6. CONTEXT: DEEP LEARNING CONCEPTS
• Recurrent Neural Networks (RNNs):
  – Neural network architecture containing self-referential loops
  – Intended to allow knowledge/information learned in previous steps to influence the current prediction/output; well-suited for sequential/temporal data
Image: https://colah.github.io/posts/2015-08-Understanding-LSTMs/
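A minimal vanilla-RNN step as an illustrative toy (not the paper's model): the hidden state h is what carries information from earlier steps into each prediction.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrence: new hidden state from current input and previous state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
d_in, d_h = 4, 8                                 # toy dimensions
W_xh = rng.normal(size=(d_in, d_h)) * 0.1
W_hh = rng.normal(size=(d_h, d_h)) * 0.1
b_h, h = np.zeros(d_h), np.zeros(d_h)
for x_t in rng.normal(size=(5, d_in)):           # a length-5 input sequence
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)        # h now summarizes the prefix
```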

  7. CONTEXT: DEEP LEARNING CONCEPTS
• Long Short-Term Memory (LSTM) architecture:
  – Intended to mitigate the vanishing/exploding gradient problem associated with RNNs
  – Better suited for longer-term temporal dependencies
  – Incorporates a memory cell and forget gate
  – Notation: h_t = hidden state; z_t = prediction at time step t
Image: https://www.researchgate.net/publication/319770438_Long-Term_Recurrent_Convolutional_Networks_for_Visual_Recognition_and_Description
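For reference, the standard LSTM cell equations (general background, not reproduced on the slide): the forget gate f_t decides what to erase from the memory cell c_t, the input gate i_t what to write, and the output gate o_t what to expose as h_t.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```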

  8. CONTEXT: DEEP LEARNING CONCEPTS
• How are LSTMs used in this paper?
  – Seq2SQL generates SQL queries token-by-token
  – Similar to Seq2Seq, except that every output token is an element of the input sequence
  – They use LSTMs for encoding the embeddings associated with each word in the input sequence, and for decoding each query token y_s as a function of the most recently generated token y_{s-1}
Image: https://google.github.io/seq2seq/
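Greedy token-by-token decoding, sketched (score_fn and the token ids are assumptions, not the paper's interface): each output token is an input position, and the token just produced is fed back as the next decoder input.

```python
import numpy as np

def greedy_decode(score_fn, start_tok, end_tok, max_steps=50):
    y_prev, output = start_tok, []
    for _ in range(max_steps):
        scores = score_fn(y_prev)        # one score per input position
        y_s = int(np.argmax(scores))     # next token y_s from y_{s-1}
        if y_s == end_tok:
            break
        output.append(y_s)
        y_prev = y_s                     # feed the chosen token back in
    return output

# Toy usage: 6 input positions, with position 5 acting as the end token.
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 6))
print(greedy_decode(lambda y: W[y], start_tok=0, end_tok=5))
```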

  9. CONTEXT: DEEP LEARNING CONCEPTS
• Activation functions define the output of individual neurons in a DNN, given a set of input(s)
• Relevant activation functions from this paper:
  – Hyperbolic tangent (tanh): outputs values in (-1, 1); less likely to get “stuck” than the logistic sigmoid
Images and information: https://en.wikipedia.org/wiki/Activation_function
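For reference, the tanh formula (standard background, not from the slide); its outputs lie strictly in (-1, 1), and it is a rescaled logistic sigmoid:

```latex
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = 2\,\sigma(2x) - 1
```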

  10. CONTEXT: DEEP LEARNING CONCEPTS
• The loss function of a DNN represents the error to be minimized
• Cross-entropy loss:
  – Measures the performance of a classifier whose output is a probability value in [0,1]
  – When the number of classes = 2 (e.g., {0,1}): −(y log(p) + (1 − y) log(1 − p))
  – For number of classes M > 2, compute the loss for each class label c per observation o, and sum: −Σ_{c=1..M} y_{o,c} log(p_{o,c})
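A worked check of both cross-entropy forms above, with toy values:

```python
import numpy as np

y, p = 1.0, 0.8
binary_ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
print(binary_ce)                          # ~0.223: a confident, correct prediction

y_onehot = np.array([0.0, 1.0, 0.0])      # true class is class 1 of M = 3
p_hat = np.array([0.1, 0.7, 0.2])         # classifier's predicted distribution
print(-np.sum(y_onehot * np.log(p_hat)))  # ~0.357
```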

  11. CONTEXT: REINFORCEMENT LEARNING
• Reinforcement learning: “learning what to do—how to map situations to actions—so as to maximize a numerical reward signal”
  – The agent must explore the state space and exploit knowledge gained
  – Evaluative feedback based on actions, rather than action-independent instructional feedback
Source: Richard S. Sutton and Andrew G. Barto. 1998. Introduction to Reinforcement Learning (1st ed.). MIT Press, Cambridge, MA, USA.

  12. CONTEXT: REINFORCEMENT LEARNING
• Policy (π):
  – “[D]efines the agent’s way of behaving at a given time, and is a mapping from perceived states of the environment to actions to be taken when in those states”
  – The classical example is the grid-world problem (see the sketch below)
Image: https://slideplayer.com/slide/4757729/
Source: Richard S. Sutton and Andrew G. Barto. 1998. Introduction to Reinforcement Learning (1st ed.). MIT Press, Cambridge, MA, USA.
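A toy grid-world policy (purely illustrative): here the policy π is literally a mapping from perceived states (grid cells) to actions.

```python
policy = {
    (0, 0): "right", (0, 1): "right", (0, 2): "down",
    (1, 2): "down",  (2, 2): "stay",          # (2, 2) is the goal cell
}
MOVES = {"right": (0, 1), "down": (1, 0)}

state = (0, 0)
while policy[state] != "stay":
    dr, dc = MOVES[policy[state]]             # a = pi(s)
    state = (state[0] + dr, state[1] + dc)    # environment transition
print(state)                                  # (2, 2)
```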

  13. CONTEXT: REINFORCEMENT LEARNING
• As applied in the paper (see the framing sketch below):
  – States correspond to the portion of the query generated thus far
  – Actions correspond to the selection of the next term in the output sequence, conditional on the input sequence and all terms selected so far
  – Rewards are assigned when the generated queries are executed; they depend on validity, correctness, and string match
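The slide's framing, sketched with assumed names: a state is the partial query, an action appends one candidate token, and the reward is only known once the finished query is executed (see the reward sketch further below).

```python
state = ["SELECT", "capital", "WHERE"]           # query generated thus far
candidates = ["country", "capital", "="]         # possible next terms
action = candidates[0]                           # the agent picks a token
next_state = state + [action]                    # deterministic transition
print(" ".join(next_state))                      # SELECT capital WHERE country
```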

  14. CONTEXT: REINFORCEMENT LEARNING
• Teacher forcing:
  – Refers to the training scenario where the actual (ground-truth) output sequence token at time step t is used as input when predicting token t+1, instead of using the output generated by the DNN
  – In the paper, teacher forcing is used as an initial step when training the model for WHERE clause output
    • The policy is not learned from scratch
    • Rather, with teacher forcing as a foundation, they continue on to policy learning
    • Why? (A sketch of the training loop follows below.)
Information: https://machinelearningmastery.com/teacher-forcing-for-recurrent-neural-networks/
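Teacher forcing, sketched with a stand-in decoder (every name here is an assumption, not the paper's code): during training, the ground-truth token at step t is fed in when predicting token t+1, rather than the model's own previous output.

```python
import numpy as np

rng = np.random.default_rng(0)
D, V = 8, 10                                    # toy hidden / vocab sizes
W_in, W_out = rng.normal(size=(V, D)), rng.normal(size=(D, V))

def decode_step(tok, h):
    """Stand-in one-step decoder: new hidden state and logits over the vocab."""
    h = np.tanh(h + W_in[tok])
    return h, h @ W_out

def nll(logits, target):
    """Negative log-likelihood of the target token under softmax(logits)."""
    m = logits.max()
    return np.log(np.exp(logits - m).sum()) + m - logits[target]

gold = [3, 1, 4, 1, 5]                          # a gold output sequence
h, loss = np.zeros(D), 0.0
for t in range(len(gold) - 1):
    h, logits = decode_step(gold[t], h)         # feed the gold token at step t
    loss += nll(logits, gold[t + 1])            # score the gold token at t+1
print(loss)
```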

  15. CONTEXT: FOUNDATIONAL WORKS
• Semantic parsing:
  – Converting a natural language utterance to a logical/machine-interpretable representation
• Baseline model:
  – Attentional sequence-to-sequence neural semantic parser: Dong & Lapata (2016)
  – The goal of that paper was also to develop a generalized approach to query generation requiring minimal domain knowledge
  – They develop a sequence-to-tree model to incorporate the hierarchical nature of semantic information
Image: https://arxiv.org/pdf/1601.01280.pdf

  16. CONTEXT: FOUNDATIONAL WORKS
• Augmented pointer network:
  – Seq2SQL extends the work of Vinyals et al. (2015)
  – The referenced paper introduced Ptr-Net, a “neural architecture to learn the conditional probability of an output sequence with elements that are discrete tokens corresponding to positions in a [variable-length] input sequence”
Image (showing the embedded input and the generating network): https://arxiv.org/pdf/1506.03134.pdf

  17. KEY IDEA
• Objective: Ingest a natural language question, a set of table column names, and the set of unique words in the SQL vocabulary; output a valid SQL query that returns correct results when compared to the results of the ground-truth query, q_g
• How?
Image: https://arxiv.org/pdf/1709.00103.pdf

  18. TECHNICAL DETAILS
• Input sequence:
  – Concatenation of {column names, the terms that form the natural language question, limited SQL vocabulary terms} (see the sketch below)
Images: https://arxiv.org/pdf/1709.00103.pdf
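The concatenated input sequence, sketched; the sentinel token names (<col>, <sep>, <sql>, <question>, <end>) are assumptions for illustration, standing in for the boundary tokens the paper places between the three segments.

```python
def build_input(columns, question, sql_vocab):
    seq = ["<col>"]
    for col in columns:
        seq += col.split() + ["<sep>"]         # multi-word column names allowed
    seq += ["<sql>"] + sql_vocab               # limited SQL vocabulary terms
    seq += ["<question>"] + question.split() + ["<end>"]
    return seq

x = build_input(
    columns=["pick number", "player", "country"],
    question="how many players are from the united states",
    sql_vocab=["SELECT", "WHERE", "AND", "COUNT", "MIN", "MAX", "="],
)
print(x[:8])
```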

  19. TECHNICAL DETAILS
• Query generation:
  – SQL queries are generated token-by-token
  – Seq2SQL has 3 component parts:
    • Aggregation operator (Does the query need one or not? Which one?)
    • SELECT column (the input column tokens provide the alphabet; a softmax function produces a distribution over possible columns, as sketched below)
    • Construction of the WHERE clause (RL is used for this)
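The SELECT-column choice, sketched: a softmax over per-column scores yields a distribution over the table's columns. The scores below are toy stand-ins for what the network actually computes from the column encodings.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

columns = ["player", "country", "wins"]
scores = np.array([0.2, 2.1, -0.5])            # one scalar score per column
p_cols = softmax(scores)                       # distribution over columns
print(columns[int(np.argmax(p_cols))])         # -> "country"
```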

  20. TECHNICAL DETAILS
• Role of deep learning:
  – LSTM networks are used to encode vector embeddings of items from the input sequence, and decoded to obtain tokens that, when strung together, constitute the SQL query
  – Decoder output: α^ptr_{s,t} = scalar attention score for each position t of the input sequence
  – The next token selected: y_s = argmax_t(α^ptr_{s,t})
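The pointer-style decoder score above, sketched with toy shapes: one scalar attention score per input position t, with the argmax picking the next token. Weight names follow the α^ptr notation; all values here are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
T, d = 6, 8                                    # input length, hidden size
H = rng.normal(size=(T, d))                    # encoder states h_t
g_s = rng.normal(size=d)                       # decoder state at step s
U, V_ = rng.normal(size=(d, d)), rng.normal(size=(d, d))
w = rng.normal(size=d)

alpha = np.array([w @ np.tanh(U @ g_s + V_ @ h_t) for h_t in H])
y_s = int(np.argmax(alpha))                    # next token = best input position
print(alpha.shape, y_s)                        # (6,) and an index in [0, 5]
```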

  21. TECHNICAL DETAILS
• Role of RL:
  – Intended to address the fact that the component pieces of a WHERE clause form an unordered set
  – As a result, it is possible for some generated queries to yield correct results when executed even when they are not perfect string matches with their corresponding ground-truth queries
  – Reward function used (from the paper): -2 if the generated query is not valid SQL; -1 if it is valid but returns an incorrect result; +1 if it returns the correct result
Images: https://arxiv.org/pdf/1709.00103.pdf
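The execution-based reward, sketched; execute_query is a hypothetical helper that runs a query against the table and returns its result set, raising an exception when the SQL is invalid.

```python
def reward(generated_sql, gold_sql, execute_query):
    try:
        result = execute_query(generated_sql)
    except Exception:
        return -2                              # not a valid SQL query
    if result != execute_query(gold_sql):
        return -1                              # valid SQL, but wrong result
    return +1                                  # correct execution result
```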

  22. TECHNICAL DETAILS
• Resulting objective function:
  – The model is trained using gradient descent to minimize: L = L_agg + L_sel + L_whe
• The total gradient is the equally weighted sum of:
  – The gradient from the cross-entropy loss in predicting the SELECT column
  – The gradient from the cross-entropy loss in predicting the aggregation operation (AGG)
  – The gradient from policy learning (the WHERE clause)
Image and information: https://arxiv.org/pdf/1709.00103.pdf
