Staging Parser Combinators for Efficient Data Processing



  1. Staging Parser Combinators for Efficient Data Processing
     Parsing @ SLE, 14 September 2014
     Manohar Jonnalagedda

  2. What are they good for?
     ● Composable
       ○ Each combinator builds a new parser from a previous one
     ● Context-sensitive
       ○ We can make decisions based on a specific parse result
     ● Easy to write
       ○ DSL style of writing
       ○ Tight integration with host language

  3. Example: HTTP Response

         HTTP/1.1 200 OK
         Date: Mon, 23 May 2013 22:38:34 GMT
         Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
         Last-Modified: Wed, 08 Jan 2012 23:11:55 GMT
         Etag: "3f80f-1b6-3e1cb03b"
         Content-Type: text/html; charset=UTF-8
         Content-Length: 129
         Connection: close

         ... payload ...

  4. Example: HTTP Response

     Status:
         HTTP/1.1 200 OK
     Headers:
         Date: Mon, 23 May 2013 22:38:34 GMT
         Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
         Last-Modified: Wed, 08 Jan 2012 23:11:55 GMT
         Etag: "3f80f-1b6-3e1cb03b"
         Content-Type: text/html; charset=UTF-8
         Content-Length: 129
         Connection: close
     Content:
         ... payload ...

  5. Example: HTTP Response

         // Transform parse results on the fly
         def status = (
           ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
         ) map (_.toInt)

  6. Example: HTTP Response

         // Transform parse results on the fly
         def status = (
           ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
         ) map (_.toInt)

         // Make decision based on parse result
         def header = (headerName <~ ":") flatMap {
           key => (valueParser(key) <~ crlf) map {
             value => (key, value)
           }
         }

  7. Example: HTTP Response

         // Transform parse results on the fly
         def status = (
           ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
         ) map (_.toInt)

         // Make decision based on parse result
         def header = (headerName <~ ":") flatMap {
           key => (valueParser(key) <~ crlf) map {
             value => (key, value)
           }
         }

         // Make decision based on parse result
         def respWithPayload = response flatMap {
           r => body(r.contentLength)
         }
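The three combinators used on slides 5 to 7 can be sketched in a few lines of plain Scala. This is a minimal illustration, not the scala-parser-combinators library: the names MiniCombinators, number, char, take and lengthPrefixed are made up for the example, and the input is a String with an explicit position rather than the library's Input type. lengthPrefixed plays the role of respWithPayload, since the number parsed first decides how many characters the next parser reads.

```scala
object MiniCombinators {
  // A parse result: a value plus the next input position, or a failure.
  sealed trait ParseResult[+T]
  case class Success[T](value: T, next: Int) extends ParseResult[T]
  case object Failure extends ParseResult[Nothing]

  // A parser is a function from (input, position) to a result.
  abstract class Parser[T] extends ((String, Int) => ParseResult[T]) { self =>
    // Sequence: run this parser, then `that`, and pair the results.
    def ~[U](that: => Parser[U]): Parser[(T, U)] =
      flatMap(t => that.map(u => (t, u)))

    // Transform the parse result on the fly.
    def map[U](f: T => U): Parser[U] = new Parser[U] {
      def apply(in: String, pos: Int): ParseResult[U] = self(in, pos) match {
        case Success(v, next) => Success(f(v), next)
        case Failure          => Failure
      }
    }

    // Choose the next parser based on the previous parse result.
    def flatMap[U](f: T => Parser[U]): Parser[U] = new Parser[U] {
      def apply(in: String, pos: Int): ParseResult[U] = self(in, pos) match {
        case Success(v, next) => f(v)(in, next)
        case Failure          => Failure
      }
    }
  }

  // Accept a non-empty run of digits and return it as an Int.
  val number: Parser[Int] = new Parser[Int] {
    def apply(in: String, pos: Int): ParseResult[Int] = {
      var i = pos
      while (i < in.length && in.charAt(i).isDigit) i += 1
      if (i == pos) Failure else Success(in.substring(pos, i).toInt, i)
    }
  }

  // Accept one specific character.
  def char(c: Char): Parser[Char] = new Parser[Char] {
    def apply(in: String, pos: Int): ParseResult[Char] =
      if (pos < in.length && in.charAt(pos) == c) Success(c, pos + 1) else Failure
  }

  // Accept exactly n characters.
  def take(n: Int): Parser[String] = new Parser[String] {
    def apply(in: String, pos: Int): ParseResult[String] =
      if (pos + n <= in.length) Success(in.substring(pos, pos + n), pos + n)
      else Failure
  }

  // Context-sensitive: the parsed length decides how much payload is read.
  val lengthPrefixed: Parser[String] =
    (number ~ char(':')).map(_._1).flatMap(n => take(n))
}
```

With these definitions, MiniCombinators.lengthPrefixed("5:hello!", 0) yields Success("hello", 7). Every combinator call allocates a new Parser object, and every parse goes through those objects again, which is the overhead the following slides address.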

  8. Parser combinators are slow
     [Throughput chart: Standard Parser Combinators vs. Staged Parser Combinators (20x)]
     Staged Parser Combinators: topic of this talk.

  9. Parser Combinators are slow

         def status: Parser[Int] = (
           ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
         ) map (_.toInt)

         def header = (headerName <~ ":") flatMap {
           key => (valueParser(key) <~ crlf) map {
             value => (key, value)
           }
         }

         def respWithPayload = response flatMap {
           r => body(r.contentLength)
         }

     Underlying definition:

         class Parser[T] extends (Input => ParseResult[T]) ...

  10. Parser Combinators are slow

          def status: Parser[Int] = (
            ("HTTP/" ~ decimalNumber) ~> wholeNumber <~ (text ~ crlf)
          ) map (_.toInt)

          def header = (headerName <~ ":") flatMap {
            key => (valueParser(key) <~ crlf) map {
              value => (key, value)
            }
          }

          def respWithPayload = response flatMap {
            r => body(r.contentLength)
          }

      Underlying definitions:

          class Parser[T] extends (Input => ParseResult[T]) ...

          def ~[U](that: Parser[U]) = new Parser[(T,U)] {
            def apply(i: Input) = ...
          }

  11. Parser Combinators are slow
      ● Prohibitive composition overhead
      ● But: composition is mostly static
        ○ Let us systematically remove it!

  12. Staged Parser Combinators
      Composition of Parsers

  13. Staged Parser Combinators
      Composition of Parsers
      Composition of Code Generators

  14. Staging (LMS)

          def add3(a: Int, b: Int, c: Int) = a + b + c

      ‘Classic’ evaluation: add3(1, 2, 3) yields 6.

  15. Staging (LMS)

          def add3(a: Int, b: Int, c: Int) = a + b + c

      ‘Classic’ evaluation: add3(1, 2, 3) yields 6.

      Adding Rep types:

          def add3(a: Rep[Int], b: Int, c: Int) = a + b + c

      ● a: expression in the next stage
      ● b, c: executed at staging time, constant in the next stage

  16. Staging (LMS)

          def add3(a: Int, b: Int, c: Int) = a + b + c

      ‘Classic’ evaluation: add3(1, 2, 3) yields 6.

      Adding Rep types:

          def add3(a: Rep[Int], b: Int, c: Int) = a + b + c

      ● a: expression in the next stage
      ● b, c: executed at staging time, constant in the next stage

      Code generation: add3(x, 2, 3) generates

          def add$3$2$3(a: Int) = a + 5

      Evaluation of generated code: add$3$2$3(1)
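The add3 example can be imitated without LMS to show what happens at each stage. The sketch below is not the LMS API: Rep is reduced to a tiny hand-written expression tree, and Sym, Const, Plus, plus, show and generate are names invented for the illustration. One simplification is stated in the comments: the two static arguments are added first, so that the generated body is a + 5 as on the slide.

```scala
object MiniStaging {
  // Rep[T] stands for a value that is only known in the next stage.
  // Here it is a small expression tree; LMS itself is far richer.
  sealed trait Rep[T]
  case class Sym(name: String) extends Rep[Int]               // next-stage variable
  case class Const(value: Int) extends Rep[Int]               // known constant
  case class Plus(l: Rep[Int], r: Rep[Int]) extends Rep[Int]  // deferred addition

  // Staged addition: if both sides are constants, add them right now.
  def plus(l: Rep[Int], r: Rep[Int]): Rep[Int] = (l, r) match {
    case (Const(a), Const(b)) => Const(a + b)
    case _                    => Plus(l, r)
  }

  // a is dynamic; b and c are plain Ints and are evaluated while staging.
  // Simplification: b + c is computed first, unlike the slide's a + b + c.
  def add3(a: Rep[Int], b: Int, c: Int): Rep[Int] = plus(a, Const(b + c))

  // Render an expression tree as next-stage source code.
  def show(e: Rep[Int]): String = e match {
    case Sym(n)     => n
    case Const(v)   => v.toString
    case Plus(l, r) => s"${show(l)} + ${show(r)}"
  }

  // Wrap a staged body into the source of a one-argument function.
  def generate(name: String, body: Rep[Int]): String =
    s"def $name(a: Int) = ${show(body)}"
}
```

Calling MiniStaging.generate("add3Specialized", MiniStaging.add3(MiniStaging.Sym("a"), 2, 3)) produces the string def add3Specialized(a: Int) = a + 5, while add3(Const(1), 2, 3) folds all the way down to Const(6).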

  17. LMS
      ● User-written code, may contain Rep types
      ● LMS runtime: code generation
      ● Generated/optimized code

  18. Staging Parser Combinators: Composition of Code Generators

      Unstaged:

          class Parser[T] extends (Input => ParseResult[T])

      Staged:

          class Parser[T] extends (Rep[Input] => Rep[ParseResult[T]])

      ● dynamic inputs, dynamic input/output
      ● static function: application == inlining for free

  19. Staging Parser Combinators: Composition of Code Generators

      Unstaged:

          class Parser[T] extends (Input => ParseResult[T])
          def ~[U](that: Parser[U])
          def map[U](f: T => U): Parser[U]

      Staged:

          class Parser[T] extends (Rep[Input] => Rep[ParseResult[T]])
          def ~[U](that: Parser[U])                       // still a code generator
          def map[U](f: Rep[T] => Rep[U]): Parser[U]

      ● dynamic inputs, dynamic input/output
      ● static function: application == inlining for free

  20. Staging Parser Combinators: Composition of Code Generators

      Unstaged:

          class Parser[T] extends (Input => ParseResult[T])
          def ~[U](that: Parser[U])
          def map[U](f: T => U): Parser[U]
          def flatMap[U](f: T => Parser[U]): Parser[U]

      Staged:

          class Parser[T] extends (Rep[Input] => Rep[ParseResult[T]])
          def ~[U](that: Parser[U])                       // still a code generator
          def map[U](f: Rep[T] => Rep[U]): Parser[U]
          def flatMap[U](f: Rep[T] => Parser[U]): Parser[U]  // still a code generator

      ● dynamic inputs, dynamic input/output
      ● static function: application == inlining for free
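To make "composition of code generators" concrete, here is a small sketch in which combinators run at staging time and only leave plain code behind. It deliberately departs from the signatures on the slide: Rep is modeled as a string of next-stage source code, parsers are written in continuation-passing style instead of returning Rep[ParseResult[T]], and the emitted code omits bounds checks. The names MiniStagedParser, gen, char, ab and generated are assumptions made for the example, as are in, fail() and done(...) inside the emitted code.

```scala
object MiniStagedParser {
  // Rep[T] is modeled as a snippet of next-stage source code.
  final case class Rep[T](code: String)

  // A staged parser is a static function: given the next-stage position and
  // a continuation for the success case, it returns generated code.
  abstract class Parser[T] { self =>
    def gen(pos: Rep[Int])(k: (Rep[T], Rep[Int]) => String): String

    // Sequencing happens now; no Parser object survives in the output.
    def ~[U](that: Parser[U]): Parser[(T, U)] = new Parser[(T, U)] {
      def gen(pos: Rep[Int])(k: (Rep[(T, U)], Rep[Int]) => String): String =
        self.gen(pos) { (t, p1) =>
          that.gen(p1) { (u, p2) =>
            k(Rep[(T, U)](s"(${t.code}, ${u.code})"), p2)
          }
        }
    }

    // f maps one code snippet to another, so it is inlined into the output.
    def map[U](f: Rep[T] => Rep[U]): Parser[U] = new Parser[U] {
      def gen(pos: Rep[Int])(k: (Rep[U], Rep[Int]) => String): String =
        self.gen(pos)((t, next) => k(f(t), next))
    }

    // f receives a dynamic value but returns a static parser.
    def flatMap[U](f: Rep[T] => Parser[U]): Parser[U] = new Parser[U] {
      def gen(pos: Rep[Int])(k: (Rep[U], Rep[Int]) => String): String =
        self.gen(pos)((t, next) => f(t).gen(next)(k))
    }
  }

  // Emit a character test followed by the continuation's code.
  def char(c: Char): Parser[Char] = new Parser[Char] {
    def gen(pos: Rep[Int])(k: (Rep[Char], Rep[Int]) => String): String = {
      val p = pos.code
      val body = k(Rep[Char](s"'$c'"), Rep[Int](s"$p + 1"))
      s"if (in.charAt($p) == '$c') { $body } else fail()"
    }
  }

  val ab: Parser[(Char, Char)] = char('a') ~ char('b')

  // Generated code for parsing "ab" starting at next-stage variable pos.
  val generated: String =
    ab.gen(Rep[Int]("pos"))((r, p) => s"done(${r.code}, ${p.code})")
}
```

MiniStagedParser.generated is a pair of nested if expressions over in.charAt with no trace of ~ or of any Parser object, which is the sense in which application of a static function amounts to inlining.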

  21. A closer look

      User-written parser:

          def respWithPayload: Parser[..] =
            response flatMap { r => body(r.contentLength) }

      Generated code (after code generation):

          // code for parsing response
          val response = parseHeaders()
          val n = response.contentLength
          // parsing body
          var i = 0
          while (i < n) {
            readByte()
            i += 1
          }

  22. Gotchas
      ● Recursion
        ○ explicit recursion combinator (fix-point like)
      ● Diamond control flow
        ○ code generation blowup
      General solution
        ○ generate staged functions (Rep[Input => ParseResult])
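The diamond control flow problem can be reproduced with string-based generators. The sketch below is an assumption-laden toy, not the talk's implementation: Gen, alt, altShared, seq and occurrences are invented names, and the emitted snippets (a, b, fail(), done()) are placeholders. It contrasts inlining the continuation into both branches of an alternation with emitting it once as a next-stage function, which is the idea behind generating staged functions.

```scala
object DiamondBlowup {
  // A generator receives the code that follows it (its continuation)
  // and returns the complete code.
  type Gen = String => String

  // Naive alternation: the continuation is inlined into both branches.
  def alt(left: String, right: String): Gen =
    k => s"if ($left) { $k } else if ($right) { $k } else fail()"

  // Shared alternation: the continuation is emitted once as a next-stage
  // function named fn, and both branches only call it.
  def altShared(left: String, right: String, fn: String): Gen =
    k => s"def $fn() = { $k }; if ($left) $fn() else if ($right) $fn() else fail()"

  // Chain generators so that each continuation is the code of the rest.
  def seq(gens: List[Gen], last: String): String =
    gens.foldRight(last)((g, k) => g(k))

  // Count how often needle occurs in the generated code.
  def occurrences(code: String, needle: String): Int =
    code.sliding(needle.length).count(_ == needle)

  // Three alternations in a row, followed by done().
  val naive: String =
    seq(List.fill(3)(alt("a", "b")), "done()")
  val shared: String =
    seq(List("k1", "k2", "k3").map(altShared("a", "b", _)), "done()")
}
```

With three alternations in sequence, done() appears 8 times in DiamondBlowup.naive and once in DiamondBlowup.shared; each further alternation doubles the naive count.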

  23. Performance: Parsing JSON
      ● 20 times faster than Scala’s parser combinators
      ● 3 times faster than Parboiled2

  24. Performance
      [Charts: HTTP Response, CSV]

  25. If you want to know more
      ● Parser Combinators for Dynamic Programming [OOPSLA ‘14]
        ○ based on ADP
        ○ code gen for GPU
      ● Using Scala Macros [Scala ‘14]

  26. Desirable Parser Properties
      Columns: Hand-written, Parser Generators, Staged Parser Combinators
      Composable:         ✓  ✓  X
      Customizable:       ✓  X  X
      Context-Sensitive:  ✓  ✓  ~
      Fast:               ✓  ✓  ✓
      Easy to write:      ✓  ✓  X

  27. The people
      ● Eric Béguet
      ● Sandro Stucki
      ● Thierry Coppey
      ● Tiark Rompf
      ● Martin Odersky

  28. Thank you! Questions?

  29. Staging all the way down
      ● Staged structs
        ○ boxing of temporary results eliminated
      ● Staged strings
        ○ substring not computed all the time

  30. Optimizing String handling

          class InputWindow[Input](val in: Input, val start: Int, val end: Int) {
            override def equals(x: Any) = x match {
              case s: InputWindow[Input] =>
                s.in == in && s.start == start && s.end == end
              case _ => super.equals(x)
            }
          }

  31. Key performance impactors
      Standard Parser Combinators
      Beware!
      ● String.substring runs in linear time (since Java 7 update 6): it copies the characters instead of sharing the backing array.
      ● Parsers on Strings are therefore inefficient.
      ● Need to use a FastCharSequence, which mimics the original constant-time behaviour of substring.
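The slides name FastCharSequence but do not show it, so the following is only a guess at its shape: a java.lang.CharSequence that keeps a reference to the whole input plus two offsets, so that subSequence adjusts indices instead of copying characters. The wrapper object FastSeq and the parameter names source, start and stop are assumptions for the illustration.

```scala
object FastSeq {
  // A view over a shared String: taking a sub-sequence allocates one small
  // object and never copies characters.
  final class FastCharSequence(source: String, start: Int, stop: Int)
      extends CharSequence {

    // Convenience constructor covering the whole string.
    def this(source: String) = this(source, 0, source.length)

    def length(): Int = stop - start

    def charAt(index: Int): Char = {
      if (index < 0 || index >= length())
        throw new IndexOutOfBoundsException(s"index $index")
      source.charAt(start + index)
    }

    // Constant time: only the window boundaries change.
    def subSequence(from: Int, until: Int): CharSequence = {
      if (from < 0 || until > length() || from > until)
        throw new IndexOutOfBoundsException(s"range $from to $until")
      new FastCharSequence(source, start + from, start + until)
    }

    // Copying happens only when a real String is requested.
    override def toString: String = source.substring(start, stop)
  }
}
```

For example, new FastSeq.FastCharSequence("HTTP/1.1 200 OK").subSequence(9, 12).toString gives "200", and the copy is made only by the final toString.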

  32. Key performance impactors
      Standard Parser Combinators
      Standard Parser Combinators with FastCharSequence

  33. Key performance impactors
      Standard Parser Combinators
      Standard Parser Combinators with FastCharSequence
          ~7-8x
      FastParsers with error reporting and without inlining

  34. Key performance impactors
      Standard Parser Combinators
      Standard Parser Combinators with FastCharSequence
          ~7-8x
      FastParsers with error reporting and without inlining
          ~2x
      FastParsers without error reporting and without inlining

  35. Key performance impactors
      Standard Parser Combinators
      Standard Parser Combinators with FastCharSequence
          ~7-8x
      FastParsers with error reporting and without inlining
          ~2x
      FastParsers without error reporting and without inlining
          ~30%
      FastParsers without error reporting and with inlining
