Pharo VM performance, Clément Béra

SLIDE 1

Pharo VM performance

Clément Béra

SLIDE 2

Myself

  • Clément Béra
  • 2011-2013: Engineer on the Pharo VM
  • 2013-2017: PhD student
  • Optimisations of the Pharo VM JIT compiler
SLIDE 3

Binary tree benchmark

[Chart, axis ticks from 2 to 16, one entry per VM generation: Interpreter (2005), Stack (2009), Cog V1 (2010), Cog V2 (2011), Spur (2014), Sista (future)]

SLIDE 4

Binary tree benchmark

[Chart, axis ticks from 2 to 16, one entry per VM generation: Interpreter (2005), Stack (2009), Cog V1 (2010), Cog V2 (2011), Spur (2014), Sista (future); annotation: Pharo 5 (2016)]

SLIDE 5

Plan

  • Pharo 5 (stable)
      • First time we outperformed most competitors in benchmarks
  • Pharo 6 (released next week ???)
  • Pharo 7
SLIDE 6

[Diagram labels: Code execution, GC]

SLIDE 7

GC

  • Pharo 5
      • New memory manager Spur
  • Pharo 6
      • New compactor
  • Pharo 7
      • Incremental GC ???
SLIDE 8

Pharo 5: Spur

  • Efficient scavenges
  • In most applications, most GC time is now spent in scavenges

[Diagram labels: Code execution, GC]

SLIDE 9

Pharo 6: New compactor

Loading a 200 MB Moose model in a 250 MB image

                     February   April
Total time           2 min      1 min 2 sec
Time in Full GC      1 min      2 sec
Full GC avg pause    15 sec     0.5 sec
Time in scavenge     15 sec     15 sec

SLIDE 10

Pharo 6: New compactor

Loading a 200 MB Moose model in a 250 MB image

                     February   April
Total time           2 min      1 min 2 sec
Time in Full GC      1 min      2 sec
Full GC avg pause    15 sec     0.5 sec
Time in scavenge     15 sec     15 sec

<- GC tuning gets it down to 5 sec
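
The figures in the table can be cross-checked with a little arithmetic. The playground snippet below is only a sketch: it assumes that full GC time and scavenge time are disjoint parts of the total (the slide does not say so), and the variable names are illustrative.

```smalltalk
"Times in seconds, copied from the table (2 min = 120, 1 min 2 sec = 62)."
| feb apr nonGC |
feb := Dictionary new.
feb at: #total put: 120; at: #fullGC put: 60; at: #scavenge put: 15.
apr := Dictionary new.
apr at: #total put: 62; at: #fullGC put: 2; at: #scavenge put: 15.

"Assumption: time outside the GC = total - full GC - scavenge."
nonGC := [ :run | (run at: #total) - (run at: #fullGC) - (run at: #scavenge) ].

Transcript
    show: 'Non-GC time, February: '; show: (nonGC value: feb) printString; show: ' sec'; cr;
    show: 'Non-GC time, April: '; show: (nonGC value: apr) printString; show: ' sec'; cr;
    show: 'Full GC time ratio: '; show: ((feb at: #fullGC) / (apr at: #fullGC)) printString; show: 'x'; cr.
```

Under that assumption both runs spend 45 sec outside the GC, so the drop from 2 min to 1 min 2 sec comes entirely from the 58 sec saved in full GC (a 30x reduction).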

SLIDE 11

Pharo 7: Incremental GC ??

  • Full GC pauses: ~500 ms at ~500 MB
  • Java default GC: 200 ms soft real-time target
  • Solution
      • Incremental marking
      • Incremental compaction
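
To illustrate what incremental marking means, here is a toy sketch for a Pharo playground. It is not how Spur or any planned Pharo GC is implemented: the object graph is a hypothetical Dictionary of symbols, and `markStep` and its budget are names invented for this example. The point is only that marking proceeds in bounded slices instead of one long pause.

```smalltalk
"Toy object graph (hypothetical data): each key refers to the nodes in its value."
| graph marked worklist markStep steps unmarked |
graph := Dictionary new.
graph
    at: #root put: #(#a #b);
    at: #a put: #(#c);
    at: #b put: #(#c #d);
    at: #c put: #();
    at: #d put: #(#a);
    at: #garbage put: #(#a).

marked := Set new.
worklist := OrderedCollection with: #root.

"One bounded slice of marking: mark at most `budget` new nodes, then return
 so that the program could resume running."
markStep := [ :budget |
    | visited |
    visited := 0.
    [ visited < budget and: [ worklist notEmpty ] ] whileTrue: [
        | node |
        node := worklist removeFirst.
        (marked includes: node) ifFalse: [
            marked add: node.
            worklist addAll: (graph at: node).
            visited := visited + 1 ] ] ].

"Drive the slices until nothing is left to visit."
steps := 0.
[ worklist notEmpty ] whileTrue: [
    markStep value: 2.
    steps := steps + 1 ].

"Nodes never reached from #root are garbage; here only #garbage is left unmarked."
unmarked := graph keys reject: [ :each | marked includes: each ].
Transcript show: unmarked printString; cr.
```

A real incremental collector also needs a write barrier, because the program can change references between two slices; the sketch leaves that out.
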
SLIDE 12

Code execution

  • Pharo 5:
      • Spur gave a 1.8x speed-up
  • Pharo 6:
      • Polishing and micro-optimisations
  • Pharo 7:
      • Sista gives a 1.5x-5x speed-up
SLIDE 13

Pharo 5: Spur 1.8x

  • Class table speeds up look-up caches
  • New immediate objects
  • 22-bit hash
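
Two of these points can be observed from a playground in a Spur-based image. The snippet is a sketch that assumes the image provides `Behavior>>isImmediateClass` and `ProtoObject>>basicIdentityHash`; treat both selectors as assumptions to check in your own image.

```smalltalk
"Immediates are encoded in the object reference itself rather than allocated on the heap."
Transcript
    show: 42 class name; cr;
    show: $a class name; cr;
    show: 'Character immediate? '; show: Character isImmediateClass printString; cr.

"An ordinary object carries its identity hash in its header; with a 22-bit field
 the value is expected to stay below 2 raisedTo: 22 (4194304)."
Transcript
    show: 'Hash fits in 22 bits? ';
    show: (Object new basicIdentityHash < (2 raisedTo: 22)) printString; cr.
```
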
SLIDE 14

Pharo 6

  • Register allocation improvements
  • Two-path compilation
  • Frameless code for setter-like methods
SLIDE 15

Sista: Pharo 7 ?

  • Program introspection
  • Speculate on types based on previous runs
  • Optimize frequently used code
  • Deoptimize and reoptimize incorrectly speculated code
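
The speculate / guard / deoptimize cycle listed above can be pictured with a conceptual playground sketch. Plain blocks stand in for compiled code here; the names `generic`, `optimised` and `deoptCount` are invented for the example, and nothing below reflects how Sista is actually implemented.

```smalltalk
"Conceptual sketch only: plain blocks stand in for compiled code."
| generic optimised deoptCount results |
generic := [ :x | x + x ].                 "unoptimised version: ordinary message send"
deoptCount := 0.

"Speculated version: earlier runs only saw SmallIntegers, so guard on that class
 and fall back to the generic code when the guess turns out to be wrong."
optimised := [ :x |
    x class == SmallInteger
        ifTrue: [ x + x ]                  "fast path that a real JIT could specialise"
        ifFalse: [
            deoptCount := deoptCount + 1.  "guard failed: count a deoptimisation"
            generic value: x ] ].

results := #(1 2 3 4.5) collect: [ :each | optimised value: each ].
Transcript
    show: results printString; cr;
    show: 'Guard failures: '; show: deoptCount printString; cr.
```

Only the last element (a Float) fails the guard, so the sketch records one guard failure; a real system would use such failures to decide when to discard and recompile the optimised code.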

SLIDE 16

Goals

  • Program readability
  • Performance
SLIDE 17

Program readability

1 to: array size do: [ :i | (array at: i) yourself ].
array do: [ :elem | elem yourself ].
array do: #yourself.

SLIDE 18

Program readability

1 to: array size do: [ :i | (array at: i) yourself ].
array do: [ :elem | elem yourself ].
array do: #yourself.

[Measurements from the slide, in their original order; labels on the slide: 2, 5, 20]

87M/sec   28M/sec   13M/sec   3.7M/sec
15M/sec   21M/sec   10M/sec   3.9M/sec
94M/sec   40M/sec   22M/sec   6.5M/sec
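
To compare the three loop styles yourself, a minimal timing sketch for a Pharo playground could look like this. The array size, the repetition count and the `time` helper block are arbitrary choices made for the example, not the setup used for the measurements on the slide.

```smalltalk
| array repeats time |
array := Array new: 20 withAll: 0.    "array size 20 is an arbitrary choice"
repeats := 1000000.                   "arbitrary repetition count"

"Answer the milliseconds needed to run aBlock `repeats` times."
time := [ :aBlock | Time millisecondsToRun: [ repeats timesRepeat: aBlock ] ].

Transcript
    show: 'to:do: loop   '; show: (time value: [ 1 to: array size do: [ :i | (array at: i) yourself ] ]) printString; show: ' ms'; cr;
    show: 'do: block     '; show: (time value: [ array do: [ :elem | elem yourself ] ]) printString; show: ' ms'; cr;
    show: 'do: #yourself '; show: (time value: [ array do: #yourself ]) printString; show: ' ms'; cr.
```

Absolute numbers depend on the machine and VM version, so only the ratios between the three lines are meaningful.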

SLIDE 19

Performance

[Chart comparing Sista and Pharo, axis ticks from 1 to 6, on the benchmarks A*, ThreadRing, SpectralNorm, JSJSON, BinaryTree, DeltaBlue, Richards, TCAP, Kmeans]

SLIDE 20

Getting stable

  • Supports most development workflows
  • Supports image recompilation
  • Integration has started
SLIDE 21

In-image design

  • Smalltalk image
      • Scorch: CompiledCode to CompiledCode, Smalltalk-specific optimisations
      • CompiledCode (persisted across start-ups)
  • Virtual machine
      • Cogit: CompiledCode to native code, machine-specific optimisations
      • native functions (discarded on shut-down)
  • Diagram labels: Baseline JIT, Optimising JIT

SLIDE 22

Missing

  • IDE support
  • Debugger
  • Methods to show
  • Stability, testing
SLIDE 23

Are you interested ?

  • Incremental GC ?
  • VM performance ?
  • VM features ?
  • Come and talk to us !
SLIDE 24

We are looking for…

  • Use-cases showing what to improve
  • Large real-world benchmarks
  • Contributors
  • Investment
SLIDE 25

Conclusion

  • Pharo 5: Fastest VM
  • Pharo 6: Polishing
  • Pharo 7: Going further