JSZap: Compressing JavaScript Code
Martin Burtscher, UT Austin Ben Livshits & Ben Zorn, Microsoft Research Gaurav Sinha, IIT Kanpur
A Web 2.0 Application Dissected
– 70,000+ lines of JavaScript code downloaded
– 2,855 functions
– 1+ MB code
– Talks to 14 backend services (traffic, images, directions, ads, …)
Lots of JavaScript being Transmitted
[Figure: fraction of download that is JavaScript (0% to 100%) for www.live.com, spreadsheets.google, maps.live, chi.lexigame, hotmail, gmail, dropthings, maps.google, pageflakes, and bunny hunt]
Up to 85% of a Web 2.0 app is JavaScript code!
– Execution can’t start without the code
– Move code to client for responsiveness
[Diagram labels: JavaScript, crunch, gzip -d, parser, AST; JSZap; gzip]
Benefits of AST-based Compression
Reduced Latency
Reduced Network Bandwidth
Correctness, Security, and other Benefits
[Diagram: JavaScript, JSZap, gzip]
[Diagram: JavaScript is split into three streams (productions, identifiers, literals), which are passed to gzip]
[Figure: size in KB (axis 5 to 40), JSZap vs. gzip, broken down into literals, identifiers, and productions; data labels: 5.4, 5.4, 18.4, 19.0, 8.4, 11.5]
Expression:
  a * b + c
Grammar:
  1) E → E + T
  2) E → T
  3) T → T * F
  4) T → F
  5) F → id
Tree (production number in parentheses):
        + (1)
       /     \
    * (3)    c (5)
    /    \
 a (5)   b (5)
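To make the link between the tree and a production stream concrete, here is a minimal sketch that walks the tree for a * b + c in pre-order and emits one production number per node. It assumes the node labels read off the figure (+ is production 1, * is production 3, each identifier is production 5) and that the unit productions 2 and 4 do not appear in the tree. The node shape and the function names are hypothetical, not taken from JSZap.

```javascript
// AST for a * b + c; `prod` is the grammar production assumed to build the node.
const tree = {
  prod: 1, // E -> E + T
  children: [
    {
      prod: 3, // T -> T * F
      children: [
        { prod: 5, id: "a", children: [] }, // F -> id
        { prod: 5, id: "b", children: [] },
      ],
    },
    { prod: 5, id: "c", children: [] },
  ],
};

// Pre-order walk: emit the node's production, then visit children left to right.
function productionStream(node, out = []) {
  out.push(node.prod);
  for (const child of node.children) productionStream(child, out);
  return out;
}

// Identifiers are collected separately, in the same traversal order.
function identifierStream(node, out = []) {
  if (node.id !== undefined) out.push(node.id);
  for (const child of node.children) identifierStream(child, out);
  return out;
}

console.log(productionStream(tree).join(" ")); // 1 3 5 5 5
console.log(identifierStream(tree).join(" ")); // a b c
```

Keeping the production numbers and the identifiers in separate arrays mirrors the split into streams described on the following slides.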
A Simple JavaScript Example
var y = 2;
function foo () {
  var x = "jscrunch";
  var z = 3;
  z = y + y;
}
x = "jszap";
Identifier Stream
y foo x z z y y x
Literal Stream
"jscrunch" 2 3 "jszap"
Production Stream
1 3 4 ... 1 3 4 ...
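The split into streams can be approximated with a small tokenizer. This is only a sketch under simplifying assumptions: it uses a regular expression instead of a real parser, recognizes only the two keywords that occur in the example, and handles only double-quoted strings and integer literals. The names `splitStreams`, `KEYWORDS`, and `TOKEN` are hypothetical. It emits literals in source order.

```javascript
const source = `var y = 2;
function foo () {
  var x = "jscrunch";
  var z = 3;
  z = y + y;
}
x = "jszap";`;

// Only the keywords used by the example; a real tool would use the full grammar.
const KEYWORDS = new Set(["var", "function"]);
// String literal, integer literal, or identifier/keyword.
const TOKEN = /"[^"]*"|\d+|[A-Za-z_$][\w$]*/g;

function splitStreams(code) {
  const identifiers = [];
  const literals = [];
  for (const [token] of code.matchAll(TOKEN)) {
    if (token.startsWith('"') || /^\d/.test(token)) literals.push(token);
    else if (!KEYWORDS.has(token)) identifiers.push(token);
  }
  return { identifiers, literals };
}

const streams = splitStreams(source);
console.log(streams.identifiers.join(" ")); // y foo x z z y y x
console.log(streams.literals.join(" "));    // 2 "jscrunch" 3 "jszap"
```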
Benchmark name   Source lines   Source bytes
gmonkey                   922         17,382
getDOMHash              1,136         25,467
bing1                   3,758         77,891
bingmap1                3,473         80,066
livemsg1                5,307         93,982
bingmap2                9,726        113,393
facebook1               5,886        141,469
livemsg2                7,139        156,282
                       22,016        668,051
to 22K LOC
generated, and machine-generated
[Figure: breakdown into productions, identifiers, and literals (0% to 100%) for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, and livemsg2]
Compressing the Production Stream
– if (P) then X else X
[Tree diagram: nodes P, X, X, …, …]
predictor
child node given context C and child position p
likely=shorter IDs
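The idea of predicting a child node from its context and position, and giving likelier children shorter IDs, can be sketched with a simple frequency table. This is a deliberately simplified stand-in, not PPMC: it keeps one count table per (context, position) pair and uses the rank within that table as the ID. The class name `RankPredictor` and the production names ("if", "compare", "call", "block") are hypothetical.

```javascript
// Counts how often each child production appears for a (context, position) pair.
class RankPredictor {
  constructor() {
    this.counts = new Map();
  }

  key(context, position) {
    return `${context}@${position}`;
  }

  // Record that `production` was seen as child number `position` under `context`.
  observe(context, position, production) {
    const k = this.key(context, position);
    if (!this.counts.has(k)) this.counts.set(k, new Map());
    const table = this.counts.get(k);
    table.set(production, (table.get(production) || 0) + 1);
  }

  // Rank 0 is the most frequent production seen so far in this context.
  // Returns -1 for a production never seen there (a real coder needs an escape).
  rank(context, position, production) {
    const table = this.counts.get(this.key(context, position));
    if (!table) return -1;
    const ordered = [...table.entries()]
      .sort((a, b) => b[1] - a[1])
      .map(([name]) => name);
    return ordered.indexOf(production);
  }
}

const predictor = new RankPredictor();
for (let i = 0; i < 3; i++) predictor.observe("if", 0, "compare");
predictor.observe("if", 0, "call");

console.log(predictor.rank("if", 0, "compare")); // 0
console.log(predictor.rank("if", 0, "call"));    // 1
console.log(predictor.rank("if", 1, "block"));   // -1
```

Small ranks dominate when the prediction is good, which is what makes the resulting stream easier to compress.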
Production Compression with PPMC
[Figure: Production Compression (gzip = 1), 50% to 100%, for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, and livemsg2; labeled value: 0.6772]
– Compress redundancy: offset into table
– Global or local symbol tables
– Use variable-length encoding
– Sort symbols by frequency
– Rename local variables
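A minimal sketch of the symbol-table idea: each distinct identifier is stored once, the table is sorted by frequency so common names get small offsets, and the identifier stream becomes a list of offsets. It uses a single table and does not model the global/local split or local renaming; `buildSymbolTable` is a hypothetical name.

```javascript
function buildSymbolTable(identifiers) {
  // Count occurrences; Map keeps first-appearance order for its keys.
  const freq = new Map();
  for (const name of identifiers) freq.set(name, (freq.get(name) || 0) + 1);

  // Most frequent first; the stable sort keeps first-appearance order on ties.
  const table = [...freq.keys()].sort((a, b) => freq.get(b) - freq.get(a));

  // Replace every identifier by its offset into the table.
  const index = new Map(table.map((name, i) => [name, i]));
  const offsets = identifiers.map((name) => index.get(name));
  return { table, offsets };
}

const stream = ["y", "foo", "x", "z", "z", "y", "y", "x"];
const { table, offsets } = buildSymbolTable(stream);
console.log(table.join(" "));   // y x z foo
console.log(offsets.join(" ")); // 0 3 1 2 2 0 0 1
```

The input here is the identifier stream from the simple JavaScript example; the most frequent name, y, ends up at offset 0.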
Variable-length Encoding for Identifiers
[Diagram: decision tree with the questions "is global?", "is renamed local", and "fits in 1 byte?", leading to the code prefixes 00…, 01…, 10…, 11…]
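The "fits in 1 byte?" decision can be illustrated with a generic one-or-two-byte index encoding. This sketch does not reproduce the two-bit prefix layout of the slide, whose exact mapping is not recoverable here; it assumes a single flag bit instead: a leading 0 bit means a 7-bit index in one byte, a leading 1 bit means a 15-bit index in two bytes. The function names are hypothetical.

```javascript
// Encode a table index in one byte if it fits in 7 bits, else in two bytes.
function encodeIndex(index) {
  if (!Number.isInteger(index) || index < 0 || index >= 1 << 15) {
    throw new RangeError("index must be an integer in [0, 32768)");
  }
  if (index < 1 << 7) return [index];          // 0xxxxxxx
  return [0x80 | (index >> 8), index & 0xff];  // 1xxxxxxx xxxxxxxx
}

// Decode one index starting at `pos`; `length` is the number of bytes consumed.
function decodeIndex(bytes, pos = 0) {
  const first = bytes[pos];
  if ((first & 0x80) === 0) return { index: first, length: 1 };
  return { index: ((first & 0x7f) << 8) | bytes[pos + 1], length: 2 };
}

console.log(encodeIndex(5));            // [ 5 ]
console.log(encodeIndex(300));          // [ 129, 44 ]
console.log(decodeIndex([129, 44]));    // { index: 300, length: 2 }
```

Combined with a frequency-sorted table, the most common symbols land in the one-byte range.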
Variable-Length Identifier Encoding
[Figure: breakdown of identifier encodings (0% to 100%) for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, and livemsg2; legend: parent local 2byte local 1byte local builtin global 2byte global 1byte]
[Figure: Identifiers (NoST = 1), 80% to 100%, for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, and livemsg2; series: Global ST, VarEnc; labeled values: 0.943, 89%]
compared to gzip
[Figure: JSZap Compression (gzip = 1), 80% to 100%, for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, and livemsg2; labeled value: 0.8792]
Productions: 26%, Identifiers: 57%, Literals: 17%
13% savings
– Productions
– Identifiers
– Literals
– Latency measurements
– Browser integration
AST representation
– Well-formedness
– Security (AdSafe)
– Unblocking HTML parser
– Caching and incremental updates
– Compression with JSZap
?