In-Database Factorized Learning
Dan Olteanu
Joint work with
- M. Schleich, J. Zavodny & FDB Team
- M. Abo-Khamis, H. Ngo, X. Nguyen
http://www.cs.ox.ac.uk/projects/FDB/
Recent Trends in Knowledge Compilation, Dagstuhl, Sept 2017
◮ Typically, D is the result of a feature extraction query over a database.
◮ $g = (g_j)_{j \in [m]}$ is a vector of multivariate polynomials
◮ $h = (h_j)_{j \in [m]}$ is a vector of multivariate monomials
◮ $x_0 = 1$ corresponds to the bias parameter $\theta_0$
◮ $\langle g(\theta), h(x) \rangle = \langle \theta, x \rangle = \sum_{k=0}^{n} \theta_k x_k$
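The inner product above can be sketched in a few lines of Python. This is only an illustration: the function name `predict` and the parameter values are hypothetical, and both g and h are taken to be the identity, as in the linear regression case.

```python
def predict(theta, features):
    """Return <theta, x> for x = (1, features...).

    The leading 1 plays the role of x_0, so theta[0] acts as the bias.
    """
    x = [1.0] + list(features)
    if len(theta) != len(x):
        raise ValueError("theta needs one entry per feature plus the bias")
    return sum(t * v for t, v in zip(theta, x))


# Hypothetical parameters: theta_0 = 0.5 is the bias.
theta = [0.5, 2.0, -1.0]
print(predict(theta, [3.0, 4.0]))  # 0.5*1 + 2*3 - 1*4 = 2.5
```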
Orders (O for short)
customer  day     dish
Elise     Monday  burger
Elise     Friday  burger
Steve     Friday  hotdog
Joe       Friday  hotdog

Dish (D for short)
dish    item
burger  patty
burger  onion
burger  bun
hotdog  bun
hotdog  onion
hotdog  sausage

Items (I for short)
item     price
patty    6
onion    2
bun      2
sausage  4
O(customer, day, dish), D(dish, item), I(item, price)
customer  day     dish    item   price
Elise     Monday  burger  patty  6
Elise     Monday  burger  onion  2
Elise     Monday  burger  bun    2
Elise     Friday  burger  patty  6
Elise     Friday  burger  onion  2
Elise     Friday  burger  bun    2
. . .     . . .   . . .   . . .  . . .
Elise × Monday × burger × patty × 6 ∪
Elise × Monday × burger × onion × 2 ∪
Elise × Monday × burger × bun × 2 ∪
Elise × Friday × burger × patty × 6 ∪
Elise × Friday × burger × onion × 2 ∪
Elise × Friday × burger × bun × 2 ∪
. . .
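For reference, the flat join of O, D and I can be reproduced with a short Python sketch. The function name `join_flat` is hypothetical, and the blank item in the extracted tables is taken to be onion, the item that appears in the factorization figures.

```python
orders = [("Elise", "Monday", "burger"), ("Elise", "Friday", "burger"),
          ("Steve", "Friday", "hotdog"), ("Joe", "Friday", "hotdog")]
dish = [("burger", "patty"), ("burger", "onion"), ("burger", "bun"),
        ("hotdog", "bun"), ("hotdog", "onion"), ("hotdog", "sausage")]
items = [("patty", 6), ("onion", 2), ("bun", 2), ("sausage", 4)]


def join_flat(orders, dish, items):
    """Natural join of O(customer, day, dish), D(dish, item), I(item, price)."""
    items_of = {}   # dish -> list of items
    for d, i in dish:
        items_of.setdefault(d, []).append(i)
    prices_of = {}  # item -> list of prices
    for i, p in items:
        prices_of.setdefault(i, []).append(p)
    return [(c, day, d, i, p)
            for c, day, d in orders
            for i in items_of.get(d, [])
            for p in prices_of.get(i, [])]


flat = join_flat(orders, dish, items)
print(len(flat), "tuples,", 5 * len(flat), "values")  # 12 tuples, 60 values
```

Every tuple repeats the customer, day and dish values once per item, which is the redundancy that the factorized representation avoids.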
Variable order: dish → (day → customer, item → price)

burger × (Monday × Elise ∪ Friday × Elise) × (patty × 6 ∪ bun × 2 ∪ onion × 2)
∪
hotdog × (Friday × (Joe ∪ Steve)) × (bun × 2 ∪ onion × 2 ∪ sausage × 4)
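The same factorization with shared subexpressions: the price unions under bun and onion are stored once, under burger, and the hotdog branch refers to them instead of repeating them (↑ marks such a reference).

burger × (Monday × Elise ∪ Friday × Elise) × (patty × 6 ∪ bun × 2 ∪ onion × 2)
∪
hotdog × (Friday × (Joe ∪ Steve)) × (sausage × 4 ∪ bun × ↑ ∪ onion × ↑)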
Variable order: day → customer → dish → item → price

Monday × Elise × burger × (patty × 6 ∪ bun × 2 ∪ onion × 2)
∪
Friday × (Elise × burger × (patty × 6 ∪ bun × 2 ∪ onion × 2)
          ∪ Joe × hotdog × (bun × 2 ∪ onion × 2 ∪ sausage × 4)
          ∪ Steve × hotdog × (bun × 2 ∪ onion × 2 ∪ sausage × 4))
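With shared subexpressions under the variable order day → customer → dish → item → price, repeated subtrees are stored once: the burger under Friday × Elise refers to the item union already stored under Monday × Elise, the hotdog under Steve refers to the one stored under Joe, and the price unions of bun and onion are shared between burger and hotdog.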
◮ values → 1,
◮ ∪ → +,
◮ × → ∗.

Applied to the factorization over the variable order dish → (day → customer, item → price), this counts the tuples in the join:

1 ∗ (1 ∗ 1 + 1 ∗ 1) ∗ (1 ∗ 1 + 1 ∗ 1 + 1 ∗ 1) + 1 ∗ (1 ∗ (1 + 1)) ∗ (1 ∗ 1 + 1 ∗ 1 + 1 ∗ 1) = 2 ∗ 3 + 2 ∗ 3 = 6 + 6 = 12
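The three rewriting rules amount to one bottom-up pass over the factorization. The sketch below encodes the example as nested tuples; the helpers `val`, `union`, `product`, `item_union` and `count` are hypothetical names chosen for illustration, not part of FDB.

```python
from math import prod


def val(v):
    return ("val", v)


def union(*kids):
    return ("union", kids)


def product(*kids):
    return ("product", kids)


def item_union(*pairs):
    """Union over items, each item multiplied by the union of its prices."""
    return union(*(product(val(i), union(val(p))) for i, p in pairs))


# Factorization over the variable order dish -> (day -> customer, item -> price).
factorization = union(
    product(val("burger"),
            union(product(val("Monday"), union(val("Elise"))),
                  product(val("Friday"), union(val("Elise")))),
            item_union(("patty", 6), ("bun", 2), ("onion", 2))),
    product(val("hotdog"),
            union(product(val("Friday"), union(val("Joe"), val("Steve")))),
            item_union(("bun", 2), ("onion", 2), ("sausage", 4))),
)


def count(node):
    """values -> 1, union -> +, product -> *."""
    kind, payload = node
    if kind == "val":
        return 1
    parts = [count(k) for k in payload]
    return sum(parts) if kind == "union" else prod(parts)


print(count(factorization))  # 12
```

The pass visits each node of the factorization once, so its cost follows the size of the factorization and not the size of the flat join.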
◮ Assume there is a function f that turns dish into reals.
◮ All values except for dish & price → 1,
◮ ∪ → +,
◮ × → ∗.

With each dish value replaced by f(dish) and each price kept as is, the factorization over the variable order dish → (day → customer, item → price) evaluates to:

f(burger) ∗ (1 ∗ 1 + 1 ∗ 1) ∗ (1 ∗ 6 + 1 ∗ 2 + 1 ∗ 2) + f(hotdog) ∗ (1 ∗ (1 + 1)) ∗ (1 ∗ 2 + 1 ∗ 2 + 1 ∗ 4) = f(burger) ∗ 2 ∗ 10 + f(hotdog) ∗ 2 ∗ 8 = 20 ∗ f(burger) + 16 ∗ f(hotdog)
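The same bottom-up pass computes this sum of f(dish) ∗ price once each value carries its attribute name. The sketch below is an illustration under assumptions: the helper names and the concrete choice of f (burger to 1.0, hotdog to 10.0) are hypothetical.

```python
from math import prod


def val(attr, v):
    return ("val", (attr, v))


def union(*kids):
    return ("union", kids)


def product(*kids):
    return ("product", kids)


def item_union(*pairs):
    return union(*(product(val("item", i), union(val("price", p)))
                   for i, p in pairs))


def day(d, *customers):
    return product(val("day", d),
                   union(*(val("customer", c) for c in customers)))


factorization = union(
    product(val("dish", "burger"),
            union(day("Monday", "Elise"), day("Friday", "Elise")),
            item_union(("patty", 6), ("bun", 2), ("onion", 2))),
    product(val("dish", "hotdog"),
            union(day("Friday", "Joe", "Steve")),
            item_union(("bun", 2), ("onion", 2), ("sausage", 4))),
)


def aggregate(node, lift):
    """Map each value through lift, union -> +, product -> *."""
    kind, payload = node
    if kind == "val":
        return lift(*payload)
    parts = [aggregate(k, lift) for k in payload]
    return sum(parts) if kind == "union" else prod(parts)


def make_lift(f):
    """dish -> f(dish), price -> price, every other value -> 1."""
    def lift(attr, v):
        if attr == "dish":
            return f(v)
        if attr == "price":
            return v
        return 1
    return lift


f = {"burger": 1.0, "hotdog": 10.0}.get  # hypothetical f
print(aggregate(factorization, make_lift(f)))  # 20*1.0 + 16*10.0 = 180.0
```

Setting f to 1 for one dish and 0 for the other recovers the two coefficients, 20 and 16, separately.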
◮ Next slide: Times (sec) for running in one thread on one machine.
◮ Exact method, but fastest among all available in R
◮ Exact method, but fastest among all available in M
◮ Factorized joins can take arbitrarily less time and space than standard joins (DNFs)
◮ There are restrictions on variable orders in case of free variables
◮ Aggregates $\Sigma$, $c$, $s_Y$ need only be computed once over the factorized join
◮ For linear regression, the n gradient aggregates Σθ can be computed together in O(n) time over the factorized join
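To illustrate why computing the aggregates once is enough, here is a minimal Python sketch of gradient descent that touches the data only while building Σ, c and s_Y. It rests on assumptions that go beyond the bullets above: a plain least-squares objective without regularization, Σ, c and s_Y defined as the averages of x xᵀ, y·x and y², a tiny made-up dataset, and aggregates computed over a flat list for brevity instead of over a factorized join. All function names are hypothetical.

```python
def aggregates(rows):
    """rows: list of (x, y); each x is a feature list starting with x0 = 1."""
    n, size = len(rows[0][0]), len(rows)
    sigma = [[sum(x[i] * x[j] for x, _ in rows) / size for j in range(n)]
             for i in range(n)]
    c = [sum(y * x[i] for x, y in rows) / size for i in range(n)]
    s_y = sum(y * y for _, y in rows) / size
    return sigma, c, s_y


def gradient(theta, sigma, c):
    """Gradient of the assumed objective: sigma * theta - c."""
    n = len(theta)
    return [sum(sigma[i][j] * theta[j] for j in range(n)) - c[i]
            for i in range(n)]


def loss(theta, sigma, c, s_y):
    """Assumed objective: 1/2 theta' sigma theta - <theta, c> + s_y / 2."""
    n = len(theta)
    quad = sum(theta[i] * sigma[i][j] * theta[j]
               for i in range(n) for j in range(n))
    return 0.5 * quad - sum(t * ci for t, ci in zip(theta, c)) + 0.5 * s_y


def fit(sigma, c, steps=2000, lr=0.1):
    """Plain gradient descent; every step uses only the aggregates."""
    theta = [0.0] * len(c)
    for _ in range(steps):
        g = gradient(theta, sigma, c)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


# Made-up data following y = 1 + 2 * x1.
rows = [([1.0, 0.0], 1.0), ([1.0, 1.0], 3.0),
        ([1.0, 2.0], 5.0), ([1.0, 3.0], 7.0)]
sigma, c, s_y = aggregates(rows)
theta = fit(sigma, c)
print([round(t, 6) for t in theta])  # approximately [1.0, 2.0]
```

Each descent step costs time that depends only on the number of parameters, not on the number of rows, which is what makes a one-off aggregate computation over the factorized join worthwhile.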