JSZap: Compressing JavaScript Code. Martin Burtscher, UT Austin; Ben Livshits & Ben Zorn, Microsoft Research; Gaurav Sinha, IIT Kanpur. PowerPoint PPT Presentation

SLIDE 1

JSZap: Compressing JavaScript Code

Martin Burtscher, UT Austin
Ben Livshits & Ben Zorn, Microsoft Research
Gaurav Sinha, IIT Kanpur

SLIDE 2

A Web 2.0 Application Dissected

  • 70,000+ lines of JavaScript code downloaded
  • 2,855 functions
  • 1+ MB of code
  • Talks to 14 backend services (traffic, images, directions, ads, …)

SLIDE 3

Lots of JavaScript being Transmitted

[Bar chart: fraction of download that is JavaScript (0% to 100%) for www.live.com, spreadsheets.google, maps.live, chi.lexigame, hotmail, gmail, dropthings, maps.google, pageflakes, and bunny hunt]

Up to 85% of a Web 2.0 app is JavaScript code!

SLIDE 4

AJAX: Tension Headaches

  • Execution can’t start without the code
  • Move code to client for responsiveness

SLIDE 5

JavaScript on the Wire

[Pipeline diagram with stages: JavaScript, crunch, gzip, gzip -d, parser, AST; JSZap is shown as an added stage]

SLIDE 6

JSZap Approach

  • Represent JavaScript as AST instead of source
  • Serialize the compressed AST
  • Decompress directly into AST on client
  • Use gzip as 2nd-level (de-)compressor

SLIDE 7

Benefits of AST-based Compression

Reduced Latency

  • Compression: less to transmit
  • ASTs are blasted directly into the browser

Reduced Network Bandwidth

  • Reduces mobile charges
  • Reduces operator network costs: better for servers

Correctness, Security, and other Benefits

  • Ensures well-formedness of code
  • Can be used to check language subsets easily (AdSafe)
  • Caching and incremental updates
  • Unblocking the HTML parser

SLIDE 8

JSZap Compression

[Diagram: JavaScript → JSZap → gzip]

SLIDE 9

JSZap Compression

[Diagram: JavaScript is split into three streams (productions, identifiers, literals), numbered 1 to 3, which feed into gzip]

SLIDE 10

GZIP is a formidable opponent

SLIDE 11

JSZap vs. GZIP

[Stacked bar chart: size in KB (axis 0 to 40) for JSZap and gzip, broken down into literals, identifiers, and productions; data labels on the slide: 5.4, 5.4, 18.4, 19.0, 8.4, 11.5]

SLIDE 12

Talk Outline

[Diagram: the three streams (productions, identifiers, literals), numbered 1 to 3, followed by evaluation on real code]

SLIDE 13

Background: ASTs

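Expression

a * b + c

Grammar

1) E → E + T
2) E → T
3) T → T * F
4) T → F
5) F → id

Tree

[Tree diagram: + at the root with children * and c; * has children a and b. Production numbers shown on the nodes: 1 (+), 3 (*), 5 (a, b, c)]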
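
To make the link between the tree and a sequence of production numbers concrete, here is a minimal sketch that hand-builds the tree for a * b + c and walks it in pre-order. The node layout, the `productionStream` and `identifierStream` helpers, and the choice of pre-order are assumptions for illustration, not the traversal JSZap itself uses. Chain rules 2 and 4 do not show up because the tree has no nodes for them.

```javascript
// Hypothetical AST for a * b + c; each node records the grammar rule it stands for.
const ast = {
  rule: 1, op: "+",                          // 1) E -> E + T
  children: [
    { rule: 3, op: "*",                      // 3) T -> T * F
      children: [
        { rule: 5, id: "a", children: [] },  // 5) F -> id
        { rule: 5, id: "b", children: [] },
      ] },
    { rule: 5, id: "c", children: [] },
  ],
};

// Pre-order walk: emit the node's rule, then its children's rules left to right.
function productionStream(node) {
  return [node.rule, ...node.children.flatMap((child) => productionStream(child))];
}

// Collect identifier names in the same traversal order.
function identifierStream(node) {
  const own = node.id === undefined ? [] : [node.id];
  return [...own, ...node.children.flatMap((child) => identifierStream(child))];
}

console.log(productionStream(ast).join(" ")); // 1 3 5 5 5
console.log(identifierStream(ast).join(" ")); // a b c
```

The two outputs together are enough to rebuild the tree, which is the property the stream-based encoding relies on.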

SLIDE 14

A Simple Javascript Example

    var y = 2;
    function foo () {
      var x = "jscrunch";
      var z = 3;
      z = y + y;
    }
    x = "jszap";

Identifier Stream

y foo x z z y y x

Literal Stream

"jscrunch" 2 3 "jszap"

Production Stream

1 3 4 ... 1 3 4 ...
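
The split into streams can be approximated with a few lines of code. The sketch below is only an illustration: `splitStreams` is a hypothetical helper, and a regex tokenizer stands in for the real parser, so it handles this example and little else. Keywords are dropped because they would be carried by the production stream.

```javascript
// Hypothetical helper: split source text into identifier and literal streams.
// A regex tokenizer stands in for a real parser and only covers this example.
const KEYWORDS = new Set(["var", "function"]);

function splitStreams(source) {
  // Double-quoted strings first, then integers, then identifier-like words.
  const tokenPattern = /"[^"]*"|\d+|[A-Za-z_$][\w$]*/g;
  const identifiers = [];
  const literals = [];
  for (const token of source.match(tokenPattern) ?? []) {
    if (token.startsWith('"') || /^\d/.test(token)) {
      literals.push(token);        // string or numeric literal
    } else if (!KEYWORDS.has(token)) {
      identifiers.push(token);     // keywords are left to the productions
    }
  }
  return { identifiers, literals };
}

const source = `
var y = 2;
function foo () {
  var x = "jscrunch";
  var z = 3;
  z = y + y;
}
x = "jszap";`;

const { identifiers, literals } = splitStreams(source);
console.log(identifiers.join(" ")); // y foo x z z y y x
console.log(literals.join(" "));    // 2 "jscrunch" 3 "jszap"
```

The identifier output matches the slide. The literals come out in source order here, whereas the slide lists "jscrunch" first; the sketch makes no attempt to reproduce that ordering.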

SLIDE 15

Benchmarking JSZap

Benchmark name   Source lines   Source bytes
gmonkey                   922         17,382
getDOMHash              1,136         25,467
bing1                   3,758         77,891
bingmap1                3,473         80,066
livemsg1                5,307         93,982
bingmap2                9,726        113,393
facebook1               5,886        141,469
livemsg2                7,139        156,282
officelive1            22,016        668,051

  • JavaScript files up to 22K LOC
  • Variety of app types
  • Both hand-generated and machine-generated
  • gzipped everything

SLIDE 16

Components of JavaScript Source

[Stacked bar chart: share of productions, identifiers, and literals (0% to 100%) in each benchmark: gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1]

  • None of the categories can be ignored
  • Identifiers become more prominent with code growth

SLIDE 17

Compressing the Production Stream

  • Frequency-based production renaming
  • Differential encoding: 26 and 57 => 2 and 3
  • Chain rule: eliminate predictable productions
  • Tree-based prediction-by-partial-match
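
As a rough illustration of the first bullet, the sketch below renames productions so that the most frequent one receives the smallest ID. `renameByFrequency` and the input IDs are made up for this example; the slides do not spell out how JSZap breaks ties or stores the mapping, so both choices here are assumptions.

```javascript
// Hypothetical sketch: give the most frequent production the smallest new ID.
function renameByFrequency(productions) {
  const counts = new Map();
  for (const p of productions) counts.set(p, (counts.get(p) ?? 0) + 1);

  // Sort by descending frequency; break ties by original ID for determinism.
  const ordered = [...counts.keys()].sort(
    (a, b) => counts.get(b) - counts.get(a) || a - b
  );
  const newId = new Map(ordered.map((p, i) => [p, i]));

  return {
    table: ordered,                               // new ID -> original ID (needed to decode)
    stream: productions.map((p) => newId.get(p)), // renamed stream
  };
}

const { table, stream } = renameByFrequency([40, 7, 40, 40, 12, 7]);
console.log(table);  // [ 40, 7, 12 ]
console.log(stream); // [ 0, 1, 0, 0, 2, 1 ]
```

Small, frequently repeated values are what a byte-oriented second-stage compressor such as gzip handles well, which is presumably the motivation for the renaming.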

SLIDE 18

PPMC

  • Consider compressing

– if (P) then X else X

  • Should be very compressible
  • if (P) then ...abc... else ...abc...

[Tree diagram: a node with children P, X, X]

  • Tree context used to build a predictor
  • Provides the next likely child node given context C and child position p
  • Arithmetic coding: more likely = shorter IDs
  • See paper for details
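
The idea of predicting a child from its tree context can be sketched with a simple frequency table. This is not PPMC: real PPM blends several context lengths with escape probabilities and drives an arithmetic coder, whereas the hypothetical `TreeContextModel` below only counts which child rule appears under a given (parent rule, child position) context and ranks the candidates. The rule numbers are invented for the example.

```javascript
// Hypothetical order-1 tree-context model: context = (parent rule, child position).
class TreeContextModel {
  constructor() {
    this.counts = new Map(); // "parentRule:position" -> Map(childRule -> count)
  }

  // Record every parent/child pair in the tree.
  train(node) {
    node.children.forEach((child, position) => {
      const key = `${node.rule}:${position}`;
      if (!this.counts.has(key)) this.counts.set(key, new Map());
      const seen = this.counts.get(key);
      seen.set(child.rule, (seen.get(child.rule) ?? 0) + 1);
      this.train(child);
    });
  }

  // Candidate child rules for a context, most frequent first.
  rank(parentRule, position) {
    const seen = this.counts.get(`${parentRule}:${position}`) ?? new Map();
    return [...seen.entries()].sort((a, b) => b[1] - a[1]).map(([rule]) => rule);
  }
}

// Made-up rule numbers: 1 = block, 10 = if, 20 = P, 30 and 31 = two kinds of X.
const leaf = (rule) => ({ rule, children: [] });
const block = {
  rule: 1,
  children: [
    { rule: 10, children: [leaf(20), leaf(30), leaf(30)] },
    { rule: 10, children: [leaf(20), leaf(30), leaf(31)] },
    { rule: 10, children: [leaf(20), leaf(31), leaf(30)] },
  ],
};

const model = new TreeContextModel();
model.train(block);
console.log(model.rank(10, 1)); // [ 30, 31 ]
console.log(model.rank(10, 0)); // [ 20 ]
```

A child's position in the ranked list can serve as its ID, so the most likely child gets the shortest code, in the spirit of the "more likely = shorter IDs" bullet.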
SLIDE 19

Production Compression with PPMC

[Bar chart: production compression relative to gzip (gzip = 1), axis 50% to 100%, per benchmark: gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; data label on the slide: 0.6772]

SLIDE 20

Compressing the Identifier Stream

  • Symbol tables instead of identifier stream:

– Compress redundancy: offset into table
– Global or local symbol tables
– Use variable-length encoding

  • Other techniques:

– Sort symbols by frequency
– Rename local variables
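
A minimal sketch of the symbol-table idea follows, using the identifier stream from the earlier example. It builds a single frequency-sorted table and replaces each identifier with its offset. The helper names and the byte layout in `encodeOffsets` (one byte below 128, otherwise two bytes with the high bit set) are assumptions chosen for brevity; they are not the bit layout JSZap uses, and local tables and renaming are left out.

```javascript
// Hypothetical sketch: replace each identifier with its offset into a
// frequency-sorted symbol table, so common names get the smallest offsets.
function buildSymbolTable(identifiers) {
  const counts = new Map();
  for (const name of identifiers) counts.set(name, (counts.get(name) ?? 0) + 1);

  // Most frequent first; ties keep first-appearance order (stable sort).
  const table = [...counts.keys()].sort((a, b) => counts.get(b) - counts.get(a));
  const offsetOf = new Map(table.map((name, i) => [name, i]));
  return { table, offsets: identifiers.map((name) => offsetOf.get(name)) };
}

// Assumed variable-length scheme: offsets below 128 take one byte,
// larger ones take two bytes with the high bit of the first byte set.
function encodeOffsets(offsets) {
  const bytes = [];
  for (const offset of offsets) {
    if (offset < 0x80) {
      bytes.push(offset);
    } else if (offset < 0x8000) {
      bytes.push(0x80 | (offset >> 8), offset & 0xff);
    } else {
      throw new RangeError("offset too large for this sketch");
    }
  }
  return Uint8Array.from(bytes);
}

const ids = "y foo x z z y y x".split(" ");
const { table, offsets } = buildSymbolTable(ids);
console.log(table);                         // [ 'y', 'x', 'z', 'foo' ]
console.log(offsets);                       // [ 0, 3, 1, 2, 2, 0, 0, 1 ]
console.log(encodeOffsets(offsets).length); // 8
```

Sorting by frequency keeps the most common identifiers within the one-byte range, so the two-byte form is only paid for rare names.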

SLIDE 21

Variable-length Encoding for Identifiers

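[Decision tree: the tests "is global?", "is renamed local?", and "fits in 1 byte?" select among the encodings that start with the bit prefixes 00…, 01…, 10…, and 11…]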

SLIDE 22

Variable-Length Identifier Encoding

[Stacked bar chart: breakdown of identifier encodings (0% to 100%) per benchmark: gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; legend text as extracted: parent local 2byte local 1byte local builtin global 2byte global 1byte]

SLIDE 23

Symbol Tables: Effectiveness

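[Bar chart: identifier stream size relative to no symbol table (NoST = 1), axis 80% to 100%, per benchmark: gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; series: Global ST, VarEnc; data labels on the slide: 0.943, 89%]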

SLIDE 24

Compressing Literals

  • Symbol tables
  • Grouping literals by type
  • Prefixes and postfixes
  • These techniques result in 5-10% savings compared to gzip
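
Two of these ideas can be sketched briefly. The code below groups literals by type and then applies front coding to the sorted strings, storing only the length of the prefix shared with the previous string plus the remainder. Front coding is a standard technique swapped in here as one plausible reading of "prefixes"; the slides do not say that JSZap does exactly this, and postfixes are not covered. `groupByType` and `frontCode` are hypothetical names.

```javascript
// Hypothetical sketch: group literals by type (numbers and strings only).
function groupByType(literals) {
  const groups = { number: [], string: [] };
  for (const lit of literals) groups[typeof lit].push(lit);
  return groups;
}

// Front coding: for each sorted string, keep the length of the prefix it
// shares with the previous string and the remaining suffix.
function frontCode(strings) {
  const sorted = [...strings].sort();
  let previous = "";
  return sorted.map((s) => {
    let shared = 0;
    while (shared < previous.length && shared < s.length && previous[shared] === s[shared]) {
      shared++;
    }
    previous = s;
    return [shared, s.slice(shared)];
  });
}

const groups = groupByType([2, "jscrunch", 3, "jszap"]);
console.log(groups.number);            // [ 2, 3 ]
console.log(frontCode(groups.string)); // [ [ 0, 'jscrunch' ], [ 2, 'zap' ] ]
```

Grouping puts similar values next to each other, and sharing prefixes removes repetition before the second-stage compressor sees the data.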

SLIDE 25

Average JSZap Compression: 10%

[Bar chart: JSZap compression relative to gzip (gzip = 1), axis 80% to 100%, per benchmark: gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; data label on the slide: 0.8792]

[Breakdown: Productions, 26%; Identifiers, 57%; Literals, 17%]

13% savings

SLIDE 26

Summary and Conclusions

  • JSZap: AST-based compression for JavaScript
  • Propose a range of techniques for compressing

– Productions
– Identifiers
– Literals

  • Preliminary results are encouraging: 10% savings over gzip
  • Future focus

– Latency measurements
– Browser integration

SLIDE 27

[Diagram: AST representation at the center, surrounded by its benefits: well-formedness, security (AdSafe), unblocking the HTML parser, caching and incremental updates, and compression with JSZap]

Questions?