 
The 45th International Conference on Very Large Data Bases (VLDB 2019)
An Experimental Evaluation of Garbage Collectors on Big Data Applications
Lijie Xu (1), Tian Guo (2), Wensheng Dou (1), Wei Wang (1), Jun Wei (1)
(1) Institute of Software, Chinese Academy of Sciences
(2) Worcester Polytechnic Institute
August 2019
Popular big data frameworks rely on garbage-collected languages to manage in-memory objects
• Big data frameworks are written in garbage-collected languages.
• They rely on the JVM garbage collector to manage the in-memory objects generated by big data applications.
[Figure: logos of big data frameworks and of garbage-collected languages; an object graph reachable from GC roots]
GC inefficiency
• Big data applications suffer from heavy GC overhead.
• GC time takes up to ~30% of Spark application execution time [StackOverflow][1].
• GC: when JVM memory is nearly full, the garbage collector pauses the application to reclaim unused objects.
[Figure: memory usage over time, plotted against the memory size; high memory usage leads to frequent and long GC pauses]
[1] https://stackoverflow.com/questions/38965787/spark-executor-gc-taking-long
Question
Q: What are the causes of heavy GC overhead?
The causes of heavy GC overhead
Example: Spark Join Application
[Figure: join dataflow. map() turns the records of two input tables, (1 a, 2 b, 3 c, 1 d) and (1 A, 2 B, 3 C, 2 D), into key-value pairs. Shuffle write partitions the pairs in memory, shuffle read aggregates them in memory, and reduce() emits the joined records 1,(a,A); 1,(d,A); 3,(c,C); 2,(b,B); 2,(b,D). The results are kept in memory if cached.]
Application dataflow: User code → Shuffle write/read → Cached data (if cached)
• User code: large intermediate computing results
• Shuffle write/read: large intermediate data
• Cached data: large cached data
Cause 1: The big data application generates massive in-memory data objects. These objects are managed by the JVM ⇒ GC is time-consuming.
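The join dataflow on this slide can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce steps, not Spark's implementation: the function names, the "L"/"R" tags, and the choice of two partitions with `key % num_partitions` routing are assumptions made for the sketch. The input records are the ones shown in the figure.

```python
from collections import defaultdict
from itertools import product

def map_stage(records, tag):
    # map(): turn each input record into (key, (tag, value)).
    # The tag (hypothetical) remembers which table a value came from.
    return [(key, (tag, value)) for key, value in records]

def shuffle(pairs, num_partitions):
    # Shuffle write: route each pair to a partition by key.
    # Shuffle read: group the values of each key in memory.
    partitions = [defaultdict(list) for _ in range(num_partitions)]
    for key, tagged in pairs:
        partitions[key % num_partitions][key].append(tagged)
    return partitions

def reduce_stage(partition):
    # reduce(): per key, emit the Cartesian product of left and right values.
    out = []
    for key, tagged in partition.items():
        left = [v for t, v in tagged if t == "L"]
        right = [v for t, v in tagged if t == "R"]
        out.extend((key, pair) for pair in product(left, right))
    return out

left_table = [(1, "a"), (2, "b"), (3, "c"), (1, "d")]
right_table = [(1, "A"), (2, "B"), (3, "C"), (2, "D")]

pairs = map_stage(left_table, "L") + map_stage(right_table, "R")
joined = [row for part in shuffle(pairs, 2) for row in reduce_stage(part)]
print(sorted(joined))
```

Every list and tuple built here lives on the heap until it becomes unreachable, which is the point of Cause 1: the grouped values held per key and the product output are all ordinary managed objects.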
The causes of heavy GC overhead
Example: Spark Join Application
[Figure: Spark join dataflow, as on the previous slide]
Cause 2: The framework only manages the intermediate and cached data in a logical space.
Memory space managed by the framework:
1. User space, for user code
2. Execution space, for shuffle write/read
3. Storage space, for cached data
The in-memory objects themselves are managed by the JVM ⇒ the framework relies on garbage collectors to manage the data objects.
The causes of heavy GC overhead
Example: Spark Join Application
[Figure: Spark join dataflow and the three framework-managed memory spaces (user, execution, storage), as on the previous slides]
Cause 3: Current garbage collectors are not designed for big data applications (they are not aware of the characteristics of big data objects).
Three popular garbage collectors
• The JVM has three popular garbage collectors: Parallel, CMS, and G1.
• One JVM uses only one collector at runtime.
JVM heap layout
• Parallel/CMS collector: contiguous generations
  - Young Generation (Survivor, Eden, Survivor): for storing short-lived objects
  - Old Generation: for storing long-lived objects
• G1 collector: non-contiguous generations made of equal-sized regions
  - Region types: E (Eden), S (Survivor), O (Old), H (Humongous), and non-allocated
[Figure: Parallel/CMS heap with contiguous Young and Old generations; G1 heap as a grid of equal-sized regions labeled E, S, O, H]
Three popular garbage collectors
GC process: Mark unused objects → Sweep unused objects → Compact the space (optional)
Different GC algorithms
• Parallel GC (stop-the-world GC): application threads are stopped while GC threads mark live objects and then sweep unused objects; application threads resume afterwards.
• CMS/G1 GC (concurrent GC): marking and sweeping of unused objects run concurrently with application threads, separated by stop-the-world phases.
[Figure: object graph with GC roots before GC; thread timelines for stop-the-world GC and concurrent GC]
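The mark, sweep, and optional compact steps can be illustrated with a toy heap in Python. This is a minimal single-threaded sketch and not how any JVM collector is implemented: the heap is modeled as a hypothetical dict from object name to the names it references, and `slots` is an assumed flat memory layout used only to show compaction.

```python
def mark(heap, roots):
    # Mark phase: collect every object reachable from the GC roots.
    live, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(heap[obj])
    return live

def sweep(heap, live):
    # Sweep phase: drop every object that was not marked as live.
    return {obj: refs for obj, refs in heap.items() if obj in live}

def compact(slots, live):
    # Optional compact phase: slide live objects to the front of the space.
    survivors = [obj for obj in slots if obj in live]
    return survivors + [None] * (len(slots) - len(survivors))

# Toy heap: C and D reference each other but are unreachable from the roots.
heap = {"A": ["B"], "B": [], "C": ["D"], "D": ["C"], "E": []}
slots = ["A", "C", "B", "D", "E", None]
roots = ["A", "E"]

live = mark(heap, roots)
heap_after = sweep(heap, live)
slots_after = compact(slots, live)
print(sorted(live), slots_after)
```

In a stop-the-world collector all three functions would run while application threads are paused; a concurrent collector overlaps most of the marking and sweeping work with the application, which this sequential sketch does not model.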
Research questions
Q1: Why are current garbage collectors inefficient for big data applications? What are the root causes?
Q2: Are there any GC optimization methods?
Methodology – Experimental evaluation
1. Select representative big data applications with different memory usage patterns (SQL, Graph, Machine Learning).
2. Run the applications on different garbage collectors (Parallel, CMS, G1) to identify GC patterns.
3. Analyze the correlation between memory usage patterns and GC patterns to identify the causes of GC inefficiency.
[Figure: memory usage over time against the memory size, with frequent and long GC pauses]
Application selection – memory usage patterns
1. GroupBy (from BigSQLBench)
   Map Stage → Reduce Stage, using map(), groupByKey(), spill(), merge()
   Example: records (1,a), (2,b), (3,c), (1,d), (1,A), (2,B), (3,C), (4,D) are grouped into 1,[a,d,A]; 2,[b,B]; 3,[c,C]; 4,[D]
   Memory usage pattern: long-lived accumulated records
2. Join (from BigSQLBench)
   Map Stage → Reduce Stage, using map(), join(), Cartesian product
   Example: the grouped records 1,[(a,d),A]; 3,[c,C]; 2,[b,(B,D)] are expanded into 1,(a,A); 1,(d,A); 3,(c,C); 2,(b,B); 2,(b,D)
   Memory usage pattern: long-lived accumulated records; massive temporary records
3. SVM (from Spark MLlib)
   Map Stage over a cached input matrix of features x_i and labels y_i. The sums of grad(w, x_i) and loss(w, x_i) over i = 1..n are aggregated at the driver, which computes w_new and broadcasts the new hyperplane vector w^T.
   Memory usage pattern: long-lived cached records; humongous data objects
4. PageRank (from Spark Graph library)
   Map Stage → 1st, 2nd, 3rd Iterative Stage → Reduce Stage, using join() and reduceByKey() on cached data
   Memory usage pattern: long-lived cached records; long-lived accumulated records
[Figure: dataflow diagrams of the four applications]
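To make the GroupBy pattern concrete, the following Python sketch groups the records from the figure with an in-memory buffer that spills and is merged at the end. It is an illustration under stated assumptions, not Spark's shuffle code: `SPILL_THRESHOLD`, the function name `group_by_key`, and the use of a Python list as a stand-in for spill files on disk are all hypothetical.

```python
from collections import defaultdict

SPILL_THRESHOLD = 4  # hypothetical: records buffered in memory before a spill

def group_by_key(records, threshold=SPILL_THRESHOLD):
    buffer, buffered, spills = defaultdict(list), 0, []
    for key, value in records:
        # Accumulated records stay referenced by the buffer, so they are
        # long-lived from the garbage collector's point of view.
        buffer[key].append(value)
        buffered += 1
        if buffered >= threshold:
            spills.append(dict(buffer))  # spill(): stand-in for a disk write
            buffer, buffered = defaultdict(list), 0
    spills.append(dict(buffer))  # flush whatever is left in memory

    merged = defaultdict(list)
    for run in spills:  # merge(): combine all spilled runs per key
        for key, values in run.items():
            merged[key].extend(values)
    return dict(merged)

records = [(1, "a"), (2, "b"), (3, "c"), (1, "d"),
           (1, "A"), (2, "B"), (3, "C"), (4, "D")]
grouped = group_by_key(records)
print(grouped)
```

The larger the threshold, the more records accumulate in memory between spills, which is what makes this pattern relevant to the old generation of the heap.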
Application selection – memory usage patterns
1. GroupBy (SQL)  2. Join (SQL)
[Figure: GroupBy and Join dataflows, as on the previous slide, next to a JVM heap split into Young Gen and Old Gen]
P1: Long-lived accumulated records
Shuffled records are accumulated in memory ⇒ long-lived objects ⇒ stored in Old Gen
Application selection – memory usage patterns
1. GroupBy (SQL)  2. Join (SQL)
[Figure: GroupBy and Join dataflows, as on the previous slides, next to a JVM heap split into Young Gen and Old Gen]
P1: Long-lived accumulated records
Shuffled records are accumulated in memory ⇒ long-lived objects ⇒ stored in Old Gen
P2: Massive temporary records
Temporary results are generated in user code (e.g., cartesian()) ⇒ short-lived objects ⇒ stored in Young Gen
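The difference between P1 and P2 can be shown with a toy generational model in Python. This is a deliberately simplified sketch and not the behavior of any specific JVM collector: the promotion rule (an object moves to the old generation after surviving `TENURING_THRESHOLD` young collections) and all names are assumptions made for the illustration.

```python
TENURING_THRESHOLD = 2  # hypothetical: young GCs survived before promotion

def young_gc(young, old, reachable):
    # Reclaim unreachable young objects, age the survivors, and promote
    # the ones that have survived enough collections.
    survivors = {}
    for obj, age in young.items():
        if obj not in reachable:
            continue  # short-lived object: reclaimed in Young Gen (P2)
        if age + 1 >= TENURING_THRESHOLD:
            old.add(obj)  # long-lived object: promoted to Old Gen (P1)
        else:
            survivors[obj] = age + 1
    return survivors

young, old = {}, set()
buffer = []  # accumulated shuffled records stay referenced here
for step in range(3):
    record = f"rec{step}"  # shuffled record, kept in the buffer
    temp = f"tmp{step}"    # temporary result, dropped right after use
    young[record] = 0
    young[temp] = 0
    buffer.append(record)
    young = young_gc(young, old, reachable=set(buffer))

print(sorted(old), young)
```

Temporary objects never leave the young generation in this model, while accumulated records steadily fill the old generation, matching the two arrows on the slide.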