In-Database Factorized Learning

Dan Olteanu Joint work with

  • M. Schleich, J. Zavodny & FDB Team
  • M. Abo-Khamis, H. Ngo, X. Nguyen

http://www.cs.ox.ac.uk/projects/FDB/

Recent Trends in Knowledge Compilation, Dagstuhl, Sept 2017


We Work on In-Database Analytics

In-database analytics = solve optimization problems inside the database engine. Why in-database analytics?

  • 1. Bring analytics close to data

⇒ Save non-trivial export/import time

  • 2. Large chunks of analytics code can be rewritten into database queries

⇒ Use scalable systems and low complexity for query processing

  • 3. Used by LogicBlox retail-planning and forecasting applications

Unified in-database analytics solution for a host of optimization problems.

Problem Formulation

A typical machine learning task is to solve θ∗ := arg min_θ J(θ), where

J(θ) := Σ_{(x,y)∈D} L(⟨g(θ), h(x)⟩, y) + Ω(θ).

  • θ = (θ1, . . . , θp) ∈ ℝ^p are the parameters of the learned model
  • D is the training dataset, with features x and response y
    ◮ Typically, D is the result of a feature extraction query over a database.
  • L is a loss function and Ω is the regularizer
  • g : ℝ^p → ℝ^m and h : ℝ^n → ℝ^m are functions over the parameters and over the n numeric features, respectively (m > 0)
    ◮ g = (gj)_{j∈[m]} is a vector of multivariate polynomials
    ◮ h = (hj)_{j∈[m]} is a vector of multivariate monomials

Example problems: ridge linear regression, degree-d polynomial regression, degree-d factorization machines; logistic regression, SVM; PCA.

Ridge Linear Regression

General problem formulation:

J(θ) := Σ_{(x,y)∈D} L(⟨g(θ), h(x)⟩, y) + Ω(θ).

Under square loss L and ℓ2-regularization, with data points x = (x0, x1, . . . , xn) and p = n + 1 parameters θ = (θ0, . . . , θn),

  ◮ x0 = 1 corresponds to the bias parameter θ0,

and with g and h the identity functions g(θ) = θ and h(x) = x,

  ◮ ⟨g(θ), h(x)⟩ = ⟨θ, x⟩ = Σ_{k=0}^{n} θk · xk,

we obtain the following formulation for ridge linear regression:

J(θ) := 1/(2|D|) · Σ_{(x,y)∈D} ( Σ_{k=0}^{n} θk · xk − y )² + (λ/2) · ‖θ‖₂².
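To make the objective concrete, here is a minimal numpy sketch that evaluates J(θ) directly over a materialized training dataset. The arrays X (with a bias column x0 = 1 prepended) and y and the value of λ are illustrative stand-ins, not part of the talk:

```python
import numpy as np

def ridge_objective(theta, X, y, lam):
    """J(theta) = 1/(2|D|) * sum((<theta, x> - y)^2) + lam/2 * ||theta||_2^2.

    X: |D| x (n+1) matrix whose first column is the bias feature x0 = 1.
    y: length-|D| response vector; lam: regularization strength.
    """
    residuals = X @ theta - y          # <theta, x> - y for every point in D
    return residuals @ residuals / (2 * len(X)) + lam / 2 * (theta @ theta)

# Illustrative toy data standing in for a materialized join result.
X = np.array([[1.0, 6.0], [1.0, 2.0], [1.0, 2.0], [1.0, 4.0]])
y = np.array([3.0, 1.0, 1.0, 2.0])
print(ridge_objective(np.zeros(2), X, y, lam=0.1))   # J at theta = 0: 1.875
```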

Rewriting the Objective Function J

We decouple the parameters θ from the data-dependent features x in J. We can rewrite the loss function

J(θ) := 1/(2|D|) · Σ_{(x,y)∈D} ( Σ_{k=0}^{n} θk · xk − y )² + (λ/2) · ‖θ‖₂²

as follows:

J(θ) = ½ · θ⊤Σθ − ⟨θ, c⟩ + sY/2 + (λ/2) · ‖θ‖₂²,

where

  Σ = (σi,j)_{i,j∈[n]}, with σi,j = 1/|D| · Σ_{(x,y)∈D} xi · xj,
  c = (ci)_{i∈[n]}, with ci = 1/|D| · Σ_{(x,y)∈D} y · xi,
  sY = 1/|D| · Σ_{(x,y)∈D} y².
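A sketch of the same objective after the rewriting, reusing the illustrative X, y, and λ from above: Σ, c, and sY are computed in one pass over the data, after which evaluating J(θ) never touches D again.

```python
import numpy as np

def precompute_aggregates(X, y):
    """One pass over the data: Sigma = E[x x^T], c = E[y x], sY = E[y^2]."""
    m = len(X)
    return X.T @ X / m, X.T @ y / m, y @ y / m

def ridge_objective_decoupled(theta, Sigma, c, sY, lam):
    """J(theta) = 1/2 theta^T Sigma theta - <theta, c> + sY/2 + lam/2 ||theta||^2."""
    return 0.5 * theta @ Sigma @ theta - theta @ c + sY / 2 + lam / 2 * (theta @ theta)

# Agrees with the direct evaluation of J for any theta.
X = np.array([[1.0, 6.0], [1.0, 2.0], [1.0, 2.0], [1.0, 4.0]])
y = np.array([3.0, 1.0, 1.0, 2.0])
Sigma, c, sY = precompute_aggregates(X, y)
print(ridge_objective_decoupled(np.zeros(2), Sigma, c, sY, lam=0.1))  # 1.875 again
```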

Batch Gradient Descent for Parameter Computation

Repeatedly update θ in the direction of the gradient until convergence:

θ := θ − α · ∇J(θ).

Since

J(θ) = ½ · θ⊤Σθ − ⟨θ, c⟩ + sY/2 + (λ/2) · ‖θ‖₂²,

the gradient vector ∇J(θ) becomes:

∇J(θ) = Σθ − c + λθ.
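A sketch of the resulting gradient descent loop, assuming the Sigma and c computed in the previous sketch; the step size α and stopping threshold are illustrative choices:

```python
import numpy as np

def batch_gradient_descent(Sigma, c, lam, alpha=0.01, tol=1e-8, max_iters=100_000):
    """Minimize J via theta := theta - alpha * grad(theta).

    grad(theta) = Sigma @ theta - c + lam * theta; the data D is never
    touched inside the loop, so one step costs O(n^2) regardless of |D|.
    """
    theta = np.zeros(len(c))
    for _ in range(max_iters):
        grad = Sigma @ theta - c + lam * theta
        if np.linalg.norm(grad) < tol:   # stop once the gradient (nearly) vanishes
            break
        theta -= alpha * grad
    return theta

# With Sigma and c from the previous sketch:
# theta_star = batch_gradient_descent(Sigma, c, lam=0.1)
```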

Key Insights

  • The computation of the training dataset entails a high degree of redundancy, which can be avoided by factorized joins.
  • Factorized joins are compressed, lossless representations of the query result; they are deterministic Decomposable Ordered Multi-Valued Diagrams (d-DOMDs).
  • Aggregates can be computed directly over factorized joins.

Factorization Example

Orders (O for short):

customer  day     dish
Elise     Monday  burger
Elise     Friday  burger
Steve     Friday  hotdog
Joe       Friday  hotdog

Dish (D for short):

dish    item
burger  patty
burger  onion
burger  bun
hotdog  bun
hotdog  onion
hotdog  sausage

Items (I for short):

item     price
patty    6
onion    2
bun      2
sausage  4

Consider the join of the above relations, O(customer, day, dish), D(dish, item), I(item, price):

customer  day     dish    item   price
Elise     Monday  burger  patty  6
Elise     Monday  burger  onion  2
Elise     Monday  burger  bun    2
Elise     Friday  burger  patty  6
Elise     Friday  burger  onion  2
Elise     Friday  burger  bun    2
. . .

A relational algebra expression encoding the above query result is:

  Elise × Monday × burger × patty × 6
∪ Elise × Monday × burger × onion × 2
∪ Elise × Monday × burger × bun × 2
∪ Elise × Friday × burger × patty × 6
∪ Elise × Friday × burger × onion × 2
∪ Elise × Friday × burger × bun × 2
∪ . . .

It uses relational product (×), union (∪), and data (singleton relations). The attribute names are not shown to avoid clutter.
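To quantify the redundancy that factorization removes, here is a small Python sketch over the example relations; the singleton counting is our illustration of the idea, not a measurement from the talk:

```python
# The example relations, hard-coded from the tables above.
O = [("Elise", "Monday", "burger"), ("Elise", "Friday", "burger"),
     ("Steve", "Friday", "hotdog"), ("Joe", "Friday", "hotdog")]
D = [("burger", "patty"), ("burger", "onion"), ("burger", "bun"),
     ("hotdog", "bun"), ("hotdog", "onion"), ("hotdog", "sausage")]
I = [("patty", 6), ("onion", 2), ("bun", 2), ("sausage", 4)]

# Flat join: every output tuple stores all five of its values.
flat = [(c, d, s, i, p)
        for (c, d, s) in O
        for (s2, i) in D if s2 == s
        for (i2, p) in I if i2 == i]
print(len(flat), "tuples =", 5 * len(flat), "singletons in the flat join")  # 12, 60

# Factorized over dish -> {day -> customer, item -> price}: count the
# singletons actually stored (one per value occurrence in the tree).
stored = 0
for dish in sorted({s for (_, _, s) in O}):
    stored += 1                                           # the dish value
    for day in sorted({d for (_, d, s) in O if s == dish}):
        stored += 1                                       # the day value
        stored += sum(1 for (c, d, s) in O if s == dish and d == day)  # customers
    stored += sum(2 for (s, i) in D if s == dish)         # each item + its price
print(stored, "singletons in the factorized join")        # 21
```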

This is What a Factorized Join Looks Like

[Figure: the factorized join as a tree over the variable order dish → {day → customer, item → price}: a root ∪ over the dish values burger and hotdog; under each dish, a × combining a ∪ of its days (each with a ∪ of its customers) and a ∪ of its items (each with a ∪ of its price).]

The figure shows the variable order and its grounding over the input database. There are several algebraically equivalent factorized joins, defined by the distributivity of product over union and their commutativity, as groundings of different join trees.

The factorized join is a deterministic Decomposable Ordered Multi-Valued Diagram (d-DOMD):

  • deterministic: each union has children representing distinct domain values of a variable
  • Decomposable: each product has children over disjoint sets of variables
  • Ordered: each path has values of variables following a global variable order
  • Multi-Valued: each variable has a finite (but not necessarily Boolean) domain
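As a concrete data structure, here is a minimal Python sketch of d-DOMD nodes; the class names and the encoding of union children as (value, subtree) pairs are our illustration, since the talk does not prescribe an implementation:

```python
from dataclasses import dataclass
from typing import Any, List, Tuple

@dataclass
class UnionNode:
    """A ∪ node: its children carry the distinct values of one variable
    (determinism), each paired with the grounding below that value."""
    var: str
    children: List[Tuple[Any, Any]]

@dataclass
class ProductNode:
    """A × node: its children ground disjoint sets of variables (decomposability)."""
    children: List[Any]

@dataclass
class LeafNode:
    """End of a root-to-leaf path: every variable on the path is grounded."""

# The hotdog branch of the factorized join above, spelled out:
hotdog = ProductNode([
    UnionNode("day", [
        ("Friday", ProductNode([
            UnionNode("customer", [("Joe", LeafNode()), ("Steve", LeafNode())]),
        ])),
    ]),
    UnionNode("item", [
        ("bun",     ProductNode([UnionNode("price", [(2, LeafNode())])])),
        ("onion",   ProductNode([UnionNode("price", [(2, LeafNode())])])),
        ("sausage", ProductNode([UnionNode("price", [(4, LeafNode())])])),
    ]),
])
```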

.. Now with Further Compression

[Figure: the same factorized join, with the ∪-of-price subtrees for bun and onion computed once and shared between burger and hotdog.]

Observation: price is under item, which is under dish, but price only depends on item, so the same price appears under an item regardless of the dish.

Idea: cache the price for a specific item and avoid the repetition!

Same Data, Different Factorization

[Figure: a factorization of the same join over the variable order day → customer → dish → item → price: a root ∪ over Monday and Friday, with the dish–item–price subtrees repeated under each customer.]

.. and Further Compressed

[Figure: the same factorization with the item–price subtrees cached per dish, so the burger and hotdog subtrees are stored once and shared across customers and days.]

Grounding Variable Orders to Factorized Joins

Our join O(customer, day, dish), D(dish, item), I(item, price) can be grounded to a factorized join as follows:

⋃_{dish: O(·,·,dish), D(dish,·)} dish
    × ( ⋃_{day: O(·,day,dish)} day × ⋃_{customer: O(customer,day,dish)} customer )
    × ( ⋃_{item: D(dish,item)} item × ⋃_{price: I(item,price)} price )

This grounding follows the variable order below:

        dish
       /    \
     day    item
      |       |
  customer  price

Relations are sorted following any topological order of the variable order. The intersection of relations O and D on dish takes time O(min(|πdish O|, |πdish D|)). The remaining operations are lookups in the relations, where we first fix the dish value and then the day and item values.
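A runnable sketch of the grounding procedure under simplifying assumptions: relations live in memory as lists of dicts and value lookups are recomputed by scanning, whereas the actual algorithm sorts the relations and intersects them within the stated time bounds. The variable order is passed as a nested dict:

```python
O = [{"customer": c, "day": d, "dish": s} for (c, d, s) in [
    ("Elise", "Monday", "burger"), ("Elise", "Friday", "burger"),
    ("Steve", "Friday", "hotdog"), ("Joe", "Friday", "hotdog")]]
D = [{"dish": s, "item": i} for (s, i) in [
    ("burger", "patty"), ("burger", "onion"), ("burger", "bun"),
    ("hotdog", "bun"), ("hotdog", "onion"), ("hotdog", "sausage")]]
I = [{"item": i, "price": p} for (i, p) in [
    ("patty", 6), ("onion", 2), ("bun", 2), ("sausage", 4)]]

RELATIONS = [O, D, I]
ORDER = {"dish": {"day": {"customer": {}}, "item": {"price": {}}}}

def matches(t, context):
    """Does tuple t agree with the values fixed for the ancestor variables?"""
    return all(t[v] == val for v, val in context.items() if v in t)

def ground(var, subtree, context):
    """Union over the values of var consistent with context, each value
    followed by the groundings of var's children in the variable order."""
    rels = [r for r in RELATIONS if var in r[0]]    # relations mentioning var
    values = set.intersection(
        *({t[var] for t in r if matches(t, context)} for r in rels))
    return [(v, [ground(child, child_subtree, {**context, var: v})
                 for child, child_subtree in subtree.items()])
            for v in sorted(values, key=str)]

# Nested (value, [children...]) lists mirroring the ∪/× structure above.
factorized = ground("dish", ORDER["dish"], {})
print(factorized[0][0], factorized[1][0])           # burger hotdog
```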

Factorizing the Computation of Aggregates (1/2)

[Figure: the compressed factorized join, with every value replaced by 1, every ∪ by +, and every × by ∗; the root then evaluates to COUNT(*) = 12.]

SQL aggregates can be computed in one pass over the factorization. COUNT(*):

  ◮ values → 1,
  ◮ ∪ → +,
  ◮ × → ∗.
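Continuing the node classes from the sketch above, COUNT(*) becomes a single bottom-up pass that applies exactly these three substitutions:

```python
def count(node):
    """COUNT(*) over a factorized join: values → 1, ∪ → +, × → ∗."""
    if isinstance(node, LeafNode):
        return 1
    if isinstance(node, UnionNode):
        # each value maps to 1, multiplied into the count of its subtree
        return sum(1 * count(child) for _value, child in node.children)
    if isinstance(node, ProductNode):
        result = 1
        for child in node.children:
            result *= count(child)
        return result

print(count(hotdog))  # 6: {Joe, Steve} × {bun, onion, sausage}
```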

Factorizing the Computation of Aggregates (2/2)

[Figure: the compressed factorized join, with dish values mapped through f, price values kept as themselves, and all other values replaced by 1; the root then evaluates to SUM(dish ∗ price) = 20 ∗ f(burger) + 16 ∗ f(hotdog).]

SQL aggregates can be computed in one pass over the factorization. SUM(dish ∗ price):

  ◮ Assume there is a function f that turns dish values into reals.
  ◮ All values except for dish & price → 1,
  ◮ ∪ → +,
  ◮ × → ∗.
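The same one-pass evaluator, again continuing the earlier sketches, with the value mapping specialized per variable; the encoding f of dish values is the assumed placeholder from the slide:

```python
def sum_dish_price(node, f):
    """SUM(dish ∗ price): dish → f(dish), price → its value, all else → 1."""
    if isinstance(node, LeafNode):
        return 1
    if isinstance(node, UnionNode):
        total = 0
        for value, child in node.children:
            if node.var == "dish":
                weight = f(value)        # dish values go through the encoding f
            elif node.var == "price":
                weight = value           # price values stand for themselves
            else:
                weight = 1               # every other value contributes 1
            total += weight * sum_dish_price(child, f)
        return total
    if isinstance(node, ProductNode):
        result = 1
        for child in node.children:
            result *= sum_dish_price(child, f)
        return result

# On the full tree with a root UnionNode("dish", ...), this yields
# 20·f(burger) + 16·f(hotdog); on the hotdog branch alone:
print(sum_dish_price(hotdog, f=lambda dish: 1.0))  # (1 + 1 customers) × (2+2+4) = 16
```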

Complexity

Ridge Linear Regression: Complexity

Given: a training dataset defined by a feature extraction join query Q with n variables over a database D. Let it denote the number of iterations (runs) until convergence, and fhtw(Q) the fractional hypertree width of Q.

Case 1: Q has continuous variables only.
  ◮ Learn time: O(n² · |D|^fhtw(Q) + n² · it).
  ◮ For acyclic joins, learn time is O(|D|) data complexity.

Case 2: Q has both continuous and categorical variables.
  ◮ Learn time: O(n² · |D|^(fhtw(Q)+1) + n² · |D|² · it).
  ◮ For acyclic joins, learn time is O(|D|²) data complexity.

Experiments

Learning Regression Models in Practice

Goal of the experiment: show the performance gap between systems for the same model accuracy.

Real-world retailer dataset: predict the amount of inventory units.

Competing systems (next slide reports times in seconds, running in one thread on one machine):

  • F: our learner over factorized joins
  • R: R (QR decomposition)
    ◮ Exact method, but the fastest among all available in R.
  • M: MADlib (ordinary least squares)
    ◮ Exact method, but the fastest among all available in MADlib.

Performance

Times are in seconds; number of runs in parentheses.

Retailer dataset (records)        excerpt (17M)   full (86M)

Linear regression
  Features (cont+categ)           33+55           33+3,653
  Aggregates (cont+categ)         595+2,418       595+145k
  M: Learn                        1,898.35        > 22h
  R: Join (PSQL)                  50.63           –
     Export/Import                308.83          –
     Learn                        490.13          –
  F: Aggregate+Join               25.51           380.31
     Converge (runs)              0.02 (343)      8.82 (366)

Polynomial regression, degree 2
  Features (cont+categ)           562+2,363       562+141k
  Aggregates (cont+categ)         158k+742k       158k+37M
  M: Learn                        > 22h           –
  F: Aggregate+Join               132.43          1,819.80
     Converge (runs)              3.27 (321)      219.51 (180)

Results At a Glance

Factorized joins (d-DOMDs) can be computed worst-case optimally.
  ◮ They can take arbitrarily less time and space than standard joins (DNFs).

Aggregates can be computed in linear time over factorized joins.
  ◮ There are restrictions on variable orders in case of free variables.

Our optimization problems are reducible to polynomially many aggregates in the number of query variables.
  ◮ The aggregates Σ, c, and sY need only be computed once over the factorized join.
  ◮ For linear regression, the n gradient aggregates Σθ can be computed together in O(n) time over the factorized join.

Functional dependencies in the input database can reduce the number of parameters of the optimization problem.

⇒ Orders-of-magnitude performance improvements for LogicBlox analytics over state-of-the-art systems.


Thank you!
