people (name, id, age) - PowerPoint PPT Presentation


  3. Table people (name, id, age): stephanie, 1, 19; dylan, 2, 26; mary kate, 3, 17.
     Table pets (name, owner): catsidy, 2; gigi, 3.
     Query 1: people.filter{p => p.age > 18}
     Query 2: people.join(pets, "id === owner").filter(people.age > 18)
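
The two queries on slide 3 can be reproduced without Spark. What follows is a minimal sketch using plain Scala collections rather than Spark's Dataset API; the names SlideQueries, Person, and Pet are assumptions introduced for illustration, and the join is spelled as a for-comprehension instead of people.join(pets, "id === owner").

```scala
// Plain-Scala stand-in for the slide's tables. SlideQueries, Person and Pet
// are illustrative names, not part of Spark or of the original deck.
object SlideQueries {
  case class Person(name: String, id: Int, age: Int)
  case class Pet(name: String, owner: Int)

  val people: List[Person] = List(
    Person("stephanie", 1, 19),
    Person("dylan", 2, 26),
    Person("mary kate", 3, 17)
  )
  val pets: List[Pet] = List(Pet("catsidy", 2), Pet("gigi", 3))

  // Query 1: keep people older than 18.
  val query1: List[Person] = people.filter(p => p.age > 18)

  // Query 2: join people and pets on id === owner, then keep people older than 18.
  val query2: List[(Person, Pet)] =
    (for {
      p   <- people
      pet <- pets
      if p.id == pet.owner
    } yield (p, pet)).filter { case (p, _) => p.age > 18 }
}
```

With the rows from the slide, query1 keeps stephanie and dylan, and query2 keeps only the (dylan, catsidy) pair, because gigi's owner mary kate is 17.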

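  4. Query: people.filter(age > 18)
     Cache: filter { p => p.age > 18 } over table people
     Pipeline: Cache Substitution -> Optimization -> Physical Planning
     Plan under each stage:
     Cache Substitution: filter { p => p.age > 18 } over table people
     Optimization: filter { p => p.age > 18 } over table people
     Physical Planning: filter { p => p.age > 18 } over FileScan people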

  5. Query: people.join(pets, "id === owner").filter(people.age > 18)
     Cache: filter { p => p.age > 18 } over table people
     Pipeline: Cache Substitution -> Optimization -> Physical Planning
     Plan under each stage, root first:
     Cache Substitution: select * over filter people.age > 18 over join (owner, id) of table people and table pets
     Optimization: select * over join (owner, id) of (filter people.age > 18 over table people) and table pets
     Physical Planning: select * over hashjoin (owner, id) of (filter people.age > 18 over filescan people) and filescan pets
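
One way to read slides 4 and 5 is that the cached plan only becomes recognizable inside the join query once the filter has been pushed below the join. The sketch below illustrates that reading with a toy plan tree. Plan, CachedScan, substitute, and pushDown are hypothetical names, predicates are compared as plain strings, and the mismatch between a UDF and a declarative predicate is ignored here. None of this is Spark's Catalyst API.

```scala
object CacheDemo {
  // A toy logical plan; these classes are illustrative, not Spark's.
  sealed trait Plan
  case class Table(name: String) extends Plan
  case class Filter(predicate: String, child: Plan) extends Plan
  case class Join(left: Plan, right: Plan, on: (String, String)) extends Plan
  case class CachedScan(id: String) extends Plan

  // Cache substitution: replace every subtree equal to the cached plan.
  def substitute(plan: Plan, cached: Plan, id: String): Plan =
    if (plan == cached) CachedScan(id)
    else plan match {
      case Filter(p, c)   => Filter(p, substitute(c, cached, id))
      case Join(l, r, on) => Join(substitute(l, cached, id), substitute(r, cached, id), on)
      case other          => other
    }

  // One hand-written optimization: push a filter on people below a join
  // whose left input is the people table.
  def pushDown(plan: Plan): Plan = plan match {
    case Filter(p, Join(l @ Table("people"), r, on)) if p.startsWith("people.") =>
      Join(Filter(p, l), r, on)
    case other => other
  }

  val cached: Plan = Filter("people.age > 18", Table("people"))
  val query2: Plan =
    Filter("people.age > 18", Join(Table("people"), Table("pets"), ("id", "owner")))

  // Cache substitution first: the unoptimized plan has no matching subtree.
  val cacheFirst: Plan = pushDown(substitute(query2, cached, "c1"))
  // Optimization first: the pushed-down filter equals the cached plan.
  val optimizeFirst: Plan = substitute(pushDown(query2), cached, "c1")
}
```

In this toy setting cacheFirst still contains the filter over table people, while optimizeFirst reads the people side from CachedScan("c1").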

  8. Current pipeline: Cache -> Optimization -> Physical Planning
     Optimization-first pipeline: Optimization -> Cache -> Physical Planning

  10. Current pipeline: Cache -> Optimization -> Physical Planning
      Optimization-first pipeline (slow!): Optimization -> Cache -> Physical Planning
      Insight: not all optimizations help caching!
      Partial Optimization -> Cache -> Optimization -> Physical Planning

  11. Boolean Simplification, Constant Propagation, ID Reassignment, Filter Pruning, Object Elimination, Custom Rules, ...

  14. UDFs are black boxes that hide caching opportunities.
      Plan 1: select * over where age > 18 over table people
      Plan 2: select * over { p => p.age > 18 } over table people
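
The black-box point can be seen in a few lines of plain Scala. This is only a sketch: BlackBoxUdf and its Expr classes are illustrative names (Attribute, Literal, and GreaterThan are borrowed from the later slides), and the comparison stands in for whatever plan matching a cache would perform.

```scala
object BlackBoxUdf {
  case class Person(name: String, id: Int, age: Int)

  // Two UDFs with identical source text are still distinct function objects.
  val udfA: Person => Boolean = p => p.age > 18
  val udfB: Person => Boolean = p => p.age > 18

  // A declarative predicate is plain data, so structural equality works.
  sealed trait Expr
  case class Attribute(name: String) extends Expr
  case class Literal(value: Int) extends Expr
  case class GreaterThan(left: Expr, right: Expr) extends Expr

  val exprA: Expr = GreaterThan(Attribute("age"), Literal(18))
  val exprB: Expr = GreaterThan(Attribute("age"), Literal(18))

  val udfsComparable: Boolean  = udfA == udfB   // false: compared by reference
  val exprsComparable: Boolean = exprA == exprB // true: compared by structure
}
```

A cache keyed on plan equality can therefore recognize the second pair as the same predicate, but not the first.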

  19. Columns: Program Synthesis | User Annotation | Froid | Acorn
      Correct: ✓ ✓ ✓ ✓
      Transparent: X ✓ ✓ ✓
      General: X X ✓ ✓ (Java, Scala)
      Fast: X ✓ ✓ ✓

  20. Scala Native Spark

  21. person.filter(p => p.age > 18)
      Bytecode:
      1 aload_1
      2 invokeinterface
      3 dload_1
      4 ldc2_w
      5 dcmpg
      6 ifge 18
      7 iconst_1
      8 goto 10
      9 iconst_0
      10 aload_0
      11 aload_1
      Intermediate representation:
      1 Person r1 := @param0
      2 double $d0 = r1.age()
      3 int $d1 = 18
      4 if $d0 < $d1
      5 goto 8
      6 boolean $zo = 1
      7 goto 9
      8 $zo = 0
      9 return $zo

  28. Intermediate representation:
      1 Person r1 := @param0
      2 double $d0 = r1.age()
      3 int $d1 = 18
      4 if $d0 > $d1
      5 goto 8
      6 boolean $zo = 1
      7 goto 9
      8 $zo = 0
      9 return $zo
      Symbol table (Name, Type, Expression):
      r1, class[Person], this
      d0, double, Attribute("age")
      d1, int, Literal(18)
      Expression tree: If with condition GreaterThan(Attribute("age"), Literal(18)) and branches cast(0) as boolean, cast(1) as boolean
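
The symbol-table walk on this slide can be approximated as a single pass over the statements. The following is a rough sketch under several assumptions: Stmt and its subclasses are a hypothetical, structured stand-in for the slide's IR (the if/goto lines are folded into one Branch statement), types are omitted from the symbol table, and cast(1) is assumed to be the value produced when the comparison holds. It is not the deck's actual translator.

```scala
object UdfTranslation {
  // Target expression language; names echo the slide, not Spark's classes.
  sealed trait Expr
  case object This extends Expr
  case class Attribute(name: String) extends Expr
  case class Literal(value: Int) extends Expr
  case class GreaterThan(left: Expr, right: Expr) extends Expr
  case class CastAsBoolean(child: Expr) extends Expr
  case class If(cond: Expr, whenTrue: Expr, whenFalse: Expr) extends Expr

  // Hypothetical structured form of the IR statements.
  sealed trait Stmt
  case class BindParam(local: String) extends Stmt                           // r1 := @param0
  case class ReadField(local: String, obj: String, field: String) extends Stmt // $d0 = r1.age()
  case class LoadConst(local: String, value: Int) extends Stmt              // $d1 = 18
  case class Branch(left: String, right: String, thenValue: Int, elseValue: Int) extends Stmt

  type Symbols = Map[String, Expr]

  // Walk the statements once, filling the symbol table; the branch becomes an If.
  def translate(stmts: List[Stmt]): (Symbols, Option[Expr]) =
    stmts.foldLeft((Map.empty[String, Expr], Option.empty[Expr])) {
      case ((syms, result), BindParam(l)) =>
        (syms + (l -> This), result)
      case ((syms, result), ReadField(l, obj, f)) =>
        require(syms.get(obj).contains(This), s"$obj is not the UDF parameter")
        (syms + (l -> Attribute(f)), result)
      case ((syms, result), LoadConst(l, v)) =>
        (syms + (l -> Literal(v)), result)
      case ((syms, _), Branch(a, b, t, e)) =>
        val cond = GreaterThan(syms(a), syms(b))
        (syms, Some(If(cond, CastAsBoolean(Literal(t)), CastAsBoolean(Literal(e)))))
    }

  val udfBody: List[Stmt] = List(
    BindParam("r1"),
    ReadField("d0", "r1", "age"),
    LoadConst("d1", 18),
    Branch("d0", "d1", thenValue = 1, elseValue = 0)
  )
}
```

Calling UdfTranslation.translate(UdfTranslation.udfBody) yields a symbol table with the three entries from the slide (r1, d0, d1) and an If whose condition is GreaterThan(Attribute("age"), Literal(18)).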

  29. Query: person.filter(p => p.age > 18)
      Plan: select age over filterUDF{ p => p.age > 18 } over table people
      Expression tree: IF with condition GreaterThan(Attribute("age"), Literal(18)) and branches cast(1) as boolean, cast(0) as boolean

  30. Query: person.filter(p => p.age > 18)
      Plan: select age over filter(If(GreaterThan("age", 18), cast 0 as boolean, cast 1 as boolean)) over table people
      Expression tree: IF with condition GreaterThan(Attribute("age"), Literal(18)) and branches cast(1) as boolean, cast(0) as boolean

  31. person.filter(p => p.age > 18): select * over filter(If(GreaterThan("age", 18), cast 0 as boolean, cast 1 as boolean)) over table people
      Partial Optimizer
      person.filter(age > 18): select * over filter "age" > 18 over table people
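
How a translated UDF might collapse into a plain comparison can be sketched with two small rewrite rules. Which rules the Partial Optimizer actually applies is not stated on the slide; the sketch assumes constant folding of the integer casts (nonzero is true) followed by boolean simplification, and it assumes cast(1) sits in the branch taken when the comparison holds. PartialOptimizerSketch, BoolLiteral, Not, and simplify are hypothetical names.

```scala
object PartialOptimizerSketch {
  sealed trait Expr
  case class Attribute(name: String) extends Expr
  case class Literal(value: Int) extends Expr
  case class BoolLiteral(value: Boolean) extends Expr
  case class GreaterThan(left: Expr, right: Expr) extends Expr
  case class Not(child: Expr) extends Expr
  case class CastAsBoolean(child: Expr) extends Expr
  case class If(cond: Expr, whenTrue: Expr, whenFalse: Expr) extends Expr

  def simplify(e: Expr): Expr = e match {
    // Constant folding: a cast of an integer literal becomes a boolean literal.
    case CastAsBoolean(c) =>
      simplify(c) match {
        case Literal(v) => BoolLiteral(v != 0)
        case s          => CastAsBoolean(s)
      }
    case Not(c)            => Not(simplify(c))
    case GreaterThan(l, r) => GreaterThan(simplify(l), simplify(r))
    // Boolean simplification: If(c, true, false) is c; If(c, false, true) is Not(c).
    case If(c, t, f) =>
      (simplify(c), simplify(t), simplify(f)) match {
        case (cond, BoolLiteral(true), BoolLiteral(false)) => cond
        case (cond, BoolLiteral(false), BoolLiteral(true)) => Not(cond)
        case (cond, st, sf)                                 => If(cond, st, sf)
      }
    case other => other
  }
}
```

Under these assumptions, If(GreaterThan(Attribute("age"), Literal(18)), CastAsBoolean(Literal(1)), CastAsBoolean(Literal(0))) simplifies to GreaterThan(Attribute("age"), Literal(18)), the same predicate a declarative age > 18 filter would carry. With the two branches swapped, the same rules produce Not(...) of that comparison instead.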
