JSZap: Compressing JavaScript Code - PowerPoint PPT Presentation


  1. JSZap: Compressing JavaScript Code
     Martin Burtscher, UT Austin
     Ben Livshits & Ben Zorn, Microsoft Research
     Gaurav Sinha, IIT Kanpur

  2. A Web 2.0 Application Dissected
     • Talks to 14 backend services (traffic, images, directions, ads, …)
     • 1+ MB code
     • 70,000+ lines of JavaScript code
     • 2,855 functions downloaded

  3. Lots of JavaScript being Transmitted
     • Up to 85% of a Web 2.0 app is JavaScript code!
     [Bar chart: fraction of download that is JavaScript (0% to 100%) for www.live.com, spreadsheets.google, maps.live, chi.lexigame, hotmail, gmail, dropthings, maps.google, pageflakes, bunny hunt.]

  4. AJAX: Tension Headaches
     • Move code to client for responsiveness
     • Execution can’t start without the code

  5. JavaScript on the Wire
     [Diagram labels: JSZap, JavaScript, crunch, gzip, gzip -d, parser, AST.]

  6. JSZap Approach
     • Represent JavaScript as AST instead of source
     • Serialize the compressed AST
     • Decompress directly into AST on client
     • Use gzip as second-level (de-)compressor

  7. Benefits of AST-based Compression
     Reduced Latency
     • Compression: less to transmit
     • ASTs are blasted directly into the browser
     Reduced Network Bandwidth
     • Reduces mobile charges
     • Reduces operator network costs: better for servers
     Correctness, Security, and other Benefits
     • Ensures well-formedness of code
     • Can be used to check language subsets easily (AdSafe)
     • Caching incremental updates
     • Unblocking HTML parser

  8. JSZap Compression
     [Diagram: JavaScript → JSZap → gzip.]

  9. JSZap Compression
     [Diagram: JavaScript is split into three streams, (1) productions, (2) identifiers, (3) literals, which are then passed to gzip.]

  10. GZIP is a formidable opponent

  11. JSZap vs. GZIP
     [Stacked bar chart, size in KB (0 to 40), series: literals, identifiers, productions. Segment labels for gzip: 11.5, 19.0, 5.4; for JSZap: 8.4, 18.4, 5.4.]

  12. Talk Outline
     • (1) Productions
     • (2) Identifiers
     • (3) Literals
     • Evaluation on real code

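  13. Background: ASTs
     Expression: a * b + c
     Grammar:
       1) E → E + T
       2) E → T
       3) T → T * F
       4) T → F
       5) F → id
     Tree: the root is the + node (production 1); its children are the * node (production 3) and the leaf c (production 5); the children of * are the leaves a and b (production 5 each).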

  14. A Simple JavaScript Example
     Source: var y = 2; function foo () { var x = "jscrunch"; var z = 3; z = y + y; } x = "jszap";
     Production Stream: 1 3 4 ... 1 3 4 ...
     Identifier Stream: y foo x z z y y x
     Literal Stream: "jscrunch" 2 3 "jszap"
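The split into three streams can be illustrated with a short sketch. This is only an illustration, assuming a hand-built toy AST: the node shape (`prod`, `id`, `lit`, `kids`), the production numbers, and the function name `splitStreams` are hypothetical and are not taken from JSZap or from the numbering on the slide.

```javascript
// Toy AST for:  var y = 2;  x = "jszap";
// Node shape and production numbers are hypothetical, for illustration only.
const ast = {
  prod: 1, // Program
  kids: [
    {
      prod: 2, // VarDecl
      kids: [
        { prod: 5, id: "y" }, // Identifier
        { prod: 6, lit: 2 },  // Literal
      ],
    },
    {
      prod: 3, // Assignment
      kids: [
        { prod: 5, id: "x" },
        { prod: 6, lit: "jszap" },
      ],
    },
  ],
};

// Pre-order walk that separates the tree into three streams.
function splitStreams(root) {
  const streams = { productions: [], identifiers: [], literals: [] };
  (function visit(node) {
    streams.productions.push(node.prod);
    if ("id" in node) streams.identifiers.push(node.id);
    if ("lit" in node) streams.literals.push(node.lit);
    for (const kid of node.kids || []) visit(kid);
  })(root);
  return streams;
}

const streams = splitStreams(ast);
console.log(streams);
```

Each stream then holds one kind of data (small integers, names, or constants), which is what lets the later slides apply a different technique to each.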

  15. Benchmarking JSZap
     • JavaScript files up to 22K LOC
     • Variety of app types
     • Both hand-generated and machine-generated
     • gzipped everything

     | Benchmark name | Source lines | Source bytes |
     |----------------|-------------:|-------------:|
     | gmonkey        |          922 |       17,382 |
     | getDOMHash     |        1,136 |       25,467 |
     | bing1          |        3,758 |       77,891 |
     | bingmap1       |        3,473 |       80,066 |
     | livemsg1       |        5,307 |       93,982 |
     | bingmap2       |        9,726 |      113,393 |
     | facebook1      |        5,886 |      141,469 |
     | livemsg2       |        7,139 |      156,282 |
     | officelive1    |       22,016 |      668,051 |

  16. Components of JavaScript Source
     • None of the categories can be ignored
     • Identifiers become more prominent with code growth
     [Stacked bar chart: share of productions, identifiers, and literals (0% to 100%) for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1.]

  17. Compressing the Production Stream
     • Frequency-based production renaming
     • Differential encoding: 26 and 57 => 2 and 3
     • Chain rule: eliminate predictable productions
     • Tree-based prediction-by-partial-match
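Frequency-based renaming, the first technique listed for the production stream, can be sketched as follows. This is a minimal illustration under assumptions, not JSZap's implementation: the function name `renameByFrequency`, the tie-breaking rule, and the sample production IDs are hypothetical.

```javascript
// Renumber production IDs so that the most frequent production gets ID 0,
// the next most frequent gets ID 1, and so on.
function renameByFrequency(productions) {
  const counts = new Map();
  for (const p of productions) counts.set(p, (counts.get(p) || 0) + 1);

  // Sort by descending count; break ties by original ID for determinism.
  const ranked = [...counts.keys()].sort(
    (a, b) => counts.get(b) - counts.get(a) || a - b
  );
  const newId = new Map(ranked.map((p, i) => [p, i]));

  return {
    table: ranked, // table[newId] === original ID, needed by the decoder
    renamed: productions.map((p) => newId.get(p)),
  };
}

// Hypothetical production stream.
const original = [57, 26, 57, 57, 104, 26];
const { table, renamed } = renameByFrequency(original);
const decoded = renamed.map((i) => table[i]);
console.log(table, renamed, decoded);
```

The decoder needs `table` to map the small IDs back, so in practice the table would have to be shipped with the stream or fixed in advance.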

  18. PPMC
     • Tree context used to build a predictor
     • Provides the next likely child node given context C and child position p
     • Arithmetic coding: more likely = shorter IDs
     • See paper for details
     • Consider compressing
       – if (P) then X else X
     • Should be very compressible
     • if (P) then ...abc... else ...abc...
     [Diagram: tree fragment with a node P and two X subtrees.]
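The idea of predicting a child node from its tree context can be sketched with a simple frequency model. This is a deliberately reduced illustration, not PPMC itself: it keeps counts for a single context (parent production plus child position) and omits both the back-off to shorter contexts and the arithmetic coder. The class name `TreeContextModel` and the production numbers are hypothetical.

```javascript
// Simplified context model: counts which child production appears in a given
// (parent production, child position) context. Real PPMC also backs off to
// shorter contexts and drives an arithmetic coder; both are omitted here.
class TreeContextModel {
  constructor() {
    this.counts = new Map(); // "parent:pos" -> Map(childProd -> count)
  }

  update(parent, pos, child) {
    const key = `${parent}:${pos}`;
    if (!this.counts.has(key)) this.counts.set(key, new Map());
    const seen = this.counts.get(key);
    seen.set(child, (seen.get(child) || 0) + 1);
  }

  // Estimated probability of `child` in this context (0 if never seen).
  probability(parent, pos, child) {
    const seen = this.counts.get(`${parent}:${pos}`);
    if (!seen) return 0;
    let total = 0;
    for (const n of seen.values()) total += n;
    return (seen.get(child) || 0) / total;
  }
}

const model = new TreeContextModel();
// Hypothetical observations: under parent production 7, child position 1
// held production 12 three times and production 9 once.
model.update(7, 1, 12);
model.update(7, 1, 12);
model.update(7, 1, 12);
model.update(7, 1, 9);

console.log(model.probability(7, 1, 12)); // 0.75
console.log(model.probability(7, 1, 9));  // 0.25
```

An ideal arithmetic coder spends about -log2(p) bits per symbol, so the rare child here would cost 2 bits and the common one well under 1 bit, which is the "more likely = shorter IDs" point on the slide.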

  19. Production Compression with PPMC
     [Bar chart: production stream size relative to gzip (gzip = 1), axis 50% to 100%, for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; labeled value: 0.6772.]

  20. Compressing the Identifier Stream
     • Symbol tables instead of identifier stream:
       – Compress redundancy: offset into table
       – Global or local symbol tables
       – Use variable-length encoding
     • Other techniques:
       – Sort symbols by frequency
       – Rename local variables
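Replacing the identifier stream with a symbol table and table offsets can be sketched as follows. This is an illustration under assumptions: it uses a single global table and ignores scoping (so the two `x` occurrences in the earlier example share one entry), and the function name `buildSymbolTable` is hypothetical.

```javascript
// Replace an identifier stream with a symbol table plus a stream of indices.
// Symbols are sorted by descending frequency so common names get small indices.
function buildSymbolTable(identifiers) {
  const counts = new Map();
  for (const name of identifiers) counts.set(name, (counts.get(name) || 0) + 1);

  // Ties are broken alphabetically to keep the result deterministic.
  const table = [...counts.keys()].sort(
    (a, b) => counts.get(b) - counts.get(a) || (a < b ? -1 : a > b ? 1 : 0)
  );
  const index = new Map(table.map((name, i) => [name, i]));
  return { table, indices: identifiers.map((name) => index.get(name)) };
}

// Identifier stream from the simple example slide.
const identifiers = ["y", "foo", "x", "z", "z", "y", "y", "x"];
const { table, indices } = buildSymbolTable(identifiers);
const restored = indices.map((i) => table[i]);
console.log(table, indices);
```

Each name is stored once in `table`; every further occurrence costs only a small index, which is where sorting by frequency and a variable-length index encoding pay off.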

  21. Variable-length Encoding for Identifiers
     [Diagram: decision tree over the questions "is global?", "is renamed local", and "fits in 1 byte?", leading to the prefix codes 00…, 11…, 01…, 10….]
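The "fits in 1 byte?" decision can be illustrated with a generic variable-length index encoding. The bit layout below is an assumption chosen for the example and is not the prefix assignment from the slide, which also distinguishes global symbols from renamed locals; the names `encodeIndex` and `decodeIndex` are hypothetical.

```javascript
// Hypothetical layout (not the bit assignment from the slide):
//   0xxxxxxx            index 0..127 in one byte
//   1xxxxxxx xxxxxxxx   index 128..32767 in two bytes
function encodeIndex(i) {
  if (!Number.isInteger(i) || i < 0 || i > 0x7fff) {
    throw new RangeError(`index out of range: ${i}`);
  }
  return i < 0x80 ? [i] : [0x80 | (i >> 8), i & 0xff];
}

// Decode one index starting at `offset`; returns the value and bytes consumed.
function decodeIndex(bytes, offset = 0) {
  const first = bytes[offset];
  if (first < 0x80) return { value: first, length: 1 };
  return { value: ((first & 0x7f) << 8) | bytes[offset + 1], length: 2 };
}

console.log(encodeIndex(5));   // [ 5 ]
console.log(encodeIndex(300)); // [ 129, 44 ]
console.log(decodeIndex(encodeIndex(300)).value); // 300
```

With frequency-sorted symbol tables, most references fall in the one-byte range, so the two-byte form is only paid for rare symbols.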

  22. Variable-Length Identifier Encoding
     [Stacked bar chart (0% to 100%) per benchmark (gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1); categories: parent, local 2byte, local 1byte, local builtin, global 2byte, global 1byte.]

  23. Symbol Tables: Effectiveness
     [Bar chart: identifier stream size relative to no symbol table (NoST = 1), axis 80% to 100%, for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; series: Global ST, VarEnc; labeled values: 0.943 and 89%.]

  24. Compressing Literals
     • Symbol tables
     • Grouping literals by type
     • Prefixes and postfixes
     • These techniques result in 5-10% savings compared to gzip
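Grouping literals by type can be sketched as follows. This is an illustration under assumptions, not JSZap's format: it handles only string and number literals, and it records the original order in a separate tag string, whereas a real encoder might recover the order from the production stream. The names `groupLiterals` and `ungroupLiterals` are hypothetical.

```javascript
// Group a literal stream by type so similar values sit next to each other.
// A parallel tag string records the original order for the decoder.
function groupLiterals(literals) {
  const groups = { string: [], number: [] };
  const tags = [];
  for (const lit of literals) {
    const kind = typeof lit;
    if (!(kind in groups)) throw new TypeError(`unsupported literal: ${kind}`);
    groups[kind].push(lit);
    tags.push(kind === "string" ? "s" : "n");
  }
  return { groups, tags: tags.join("") };
}

// Inverse of groupLiterals: interleave the groups back into source order.
function ungroupLiterals({ groups, tags }) {
  const next = { s: 0, n: 0 };
  return [...tags].map((t) =>
    t === "s" ? groups.string[next.s++] : groups.number[next.n++]
  );
}

// Literal stream as listed on the simple example slide.
const literals = ["jscrunch", 2, 3, "jszap"];
const grouped = groupLiterals(literals);
console.log(grouped.groups, grouped.tags);
console.log(ungroupLiterals(grouped));
```

After grouping, strings such as "jscrunch" and "jszap" sit side by side, so a shared prefix like "js" is easier for a second-level compressor to exploit.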

  25. Average JSZap Compression: 10%
     [Bar chart: JSZap compressed size relative to gzip (gzip = 1), axis 80% to 100%, for gmonkey, getDOMHash, bing1, bingmap1, livemsg1, bingmap2, facebook1, livemsg2, officelive1; labeled values: 0.8792 and "13% savings".]
     [Pie chart: Productions 26%, Identifiers 57%, Literals 17%.]

  26. Summary and Conclusions
     • JSZap: AST-based compression for JavaScript
     • Propose a range of techniques for compressing
       – Productions
       – Identifiers
       – Literals
     • Preliminary results are encouraging: 10% savings over gzip
     • Future focus
       – Latency measurements
       – Browser integration

  27. Questions?
     [Diagram: "AST representation" at the center, surrounded by: Security (AdSafe), Well-formedness, Unblocking HTML parser, Caching and incremental updates, Compression with JSZap.]
