Towards More Security in Data Exchange: Defining Unparsers with Context-Sensitive Encoders for Context-Free Grammars



SLIDE 1

Defining Unparsers with Context-Sensitive Encoders for Context-Free Grammars

Towards More Security in Data Exchange

Lars Hermerschmidt, Stephan Kugelmann, Bernhard Rumpe Software Engineering RWTH Aachen http://www.se-rwth.de/

SLIDE 2

Lars Hermerschmidt

Chair of Software Engineering RWTH Aachen


About Me

Background

  • Penetration Tester
  • Now Software Engineering

Research Focus

  • Model-Driven Software Development
  • Textual Modeling Languages
  • Security Architecture

Why is Cross-Site Scripting (XSS) protection so hard to get right?

SLIDE 3

Injection Attacks

SQL Injection

[Diagram: Attacker → Frontend via HTTP; Frontend → Target via SQL]

XSS: plenty of different contexts where JavaScript can be used

[Diagram: Attacker → Frontend via HTTP; Frontend → Target via HTML, ...]

SLIDE 4

Injection Attacks

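SQL Injection

[Diagram: Attacker → Frontend via HTTP; Frontend → Target via SQL]

XSS: plenty of different contexts where JavaScript can be used

[Diagram: Attacker → Frontend via HTTP; Frontend → Target via HTML, ...]

Injection Attack

[Diagram: Attacker → Frontend via Language1; Frontend → Target via Language2. The frontend unparses Language2, the target parses it.]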
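The pattern in the diagrams can be illustrated with a small Python sketch, assuming an in-memory SQLite database as the target. The table, the function names, and the attack string are illustrative and not taken from the talk. The unsafe frontend unparses SQL by string concatenation, so attacker input becomes SQL syntax; the safe variant binds the input as a value.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # The frontend "unparses" SQL by concatenation: attacker input is copied
    # into Language2 (SQL) without any encoding.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]

def find_user_safe(conn, name):
    # Parameter binding keeps the input a plain value, never SQL syntax.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

# The quote characters are control tokens of SQL hidden inside user data.
attack = "x' OR '1'='1"
leaked = find_user_unsafe(conn, attack)   # the condition is now always true
nothing = find_user_safe(conn, attack)    # no user has this literal name
```

With the unsafe variant the attacker's quotes end the string literal early and change the structure of the query, which is the failure the generic Language1/Language2 diagram describes.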

SLIDE 5

State of the art

In general: Do not trust user data; sanitize or encode it

SQL: Prepared Statements

HTML, JavaScript, CSS

  • context-aware encoding (HTML, <script>, JavaScript in HTML attribute, ...)
  • apply encoding automatically

What about all the other languages?

  • Enterprise backend communication, e.g. SAP systems
  • Cyber-physical systems like cars, industrial control systems
  • new or custom formats

[Weinberger2011]
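To make "context-aware encoding" concrete, here is a hedged Python sketch using only the standard library. The two function names and the payload are assumptions for illustration; real web frameworks distinguish more contexts than these two.

```python
import html
import json

def encode_html_text(value):
    # Encoding for the HTML text context: <, >, & and quotes become entities.
    return html.escape(value)

def encode_js_string(value):
    # Encoding for a JavaScript string literal inside a <script> block:
    # json.dumps produces a quoted, escaped literal; "<" is escaped as well
    # so the data cannot contain a closing script tag.
    return json.dumps(value).replace("<", "\\u003c")

payload = '</script><script>alert("xss")</script>'

as_html = encode_html_text(payload)
as_js = encode_js_string(payload)
```

The same untrusted string needs two different encodings depending on where it lands in the output, which is why a single generic "escape" function is not enough.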

SLIDE 6

It happens during unparsing

Correct roundtrip: $\forall x \in \mathrm{AST}:\ \mathrm{parse}(\mathrm{unparse}(x)) = x$

  • String representation: the document d
  • AST: program logic's interface to the document

Injection: malicious AST m containing control tokens within terminals, so that for $d = \mathrm{unparse}(m)$: $\mathrm{parse}(d) \neq m$

Correct roundtrip for malicious AST m, with encode applied during unparsing and decode during parsing: $\mathrm{parse}(\mathrm{unparse}(m)) = m$

SLIDE 7

Defining Context-sensitive encoding

MontiCoder

  • Generate (un)parser with context-sensitive (en/de)coder
  • Define encoding per token in the grammar

    // production rule
    Element = "tags" LCURLY TagsToken RCURLY;

    token LCURLY = "{";
    token RCURLY = "}";
    token TagsToken = (~('{' | '}' | ' '))+;

    encodeTable TagsToken = {
      "{" -> "&#x007B;",
      "}" -> "&#x007D;",
      "&" -> "&#x0026;",
      " " -> "&#x0020;"
    };
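What an encoder and decoder derived from such a table might do can be sketched in Python. This is not MontiCoder's generated code: the function names are hypothetical, and the replacement strings are assumed to be the hexadecimal character references of the four characters (any unambiguous replacement would serve the same purpose).

```python
import re

# Encoding table for TagsToken: control characters -> replacement strings.
ENCODE_TABLE = {
    "{": "&#x007B;",
    "}": "&#x007D;",
    "&": "&#x0026;",
    " ": "&#x0020;",
}
DECODE_TABLE = {encoded: char for char, encoded in ENCODE_TABLE.items()}
ENCODED = re.compile("|".join(re.escape(e) for e in DECODE_TABLE))

def encode(text):
    # One pass over the characters, so "&" is never encoded twice.
    return "".join(ENCODE_TABLE.get(ch, ch) for ch in text)

def decode(token):
    # One left-to-right pass that maps each replacement back to its character.
    return ENCODED.sub(lambda m: DECODE_TABLE[m.group(0)], token)

def unparse(tags):
    # Element = "tags" LCURLY TagsToken RCURLY, with the token encoded.
    return "tags{" + encode(tags) + "}"

def parse(document):
    m = re.fullmatch(r"tags\{([^{} ]+)\}", document)
    if m is None:
        raise ValueError("not a valid Element")
    return decode(m.group(1))

malicious = "a}tags{b & c"
document = unparse(malicious)
```

Encoding "&" itself is what keeps the mapping reversible: user data that already looks like a replacement string is encoded once more and decodes back to exactly what was put in.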

SLIDE 8

Language Composition

  • One grammar per language (enables reuse, lowers complexity)
  • Replace terminal from super-language with start symbol of sub-language

  • enables embedding of JavaScript in HTML
  • Encoding specified separately for each language

[Figure: composed languages L1, L2, L3 and L4]

Unparsing

  • Start encoding in the most nested language
  • Control characters from L2 get encoded when used in L4

Parsing

  • Start parsing super-language
  • Run decoder on tokens
  • Run subparser
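The unparsing and parsing order for composed languages can be sketched with two toy languages in Python. Everything here is assumed for illustration: an outer language `note{...}` whose token embeds an inner language of `key=value` pairs separated by `;`, each with its own encoding table.

```python
import re

def make_codec(table):
    # Builds an (encode, decode) pair from one encoding table.
    reverse = {encoded: char for char, encoded in table.items()}
    pattern = re.compile("|".join(re.escape(e) for e in reverse))

    def encode(text):
        return "".join(table.get(ch, ch) for ch in text)

    def decode(text):
        return pattern.sub(lambda m: reverse[m.group(0)], text)

    return encode, decode

# Encoding is specified separately for each language.
inner_encode, inner_decode = make_codec({";": "%3B", "=": "%3D", "%": "%25"})
outer_encode, outer_decode = make_codec({"{": "&#x7B;", "}": "&#x7D;", "&": "&#x26;"})

def unparse_inner(pairs):
    return ";".join(inner_encode(k) + "=" + inner_encode(v) for k, v in pairs.items())

def parse_inner(text):
    items = (item.split("=", 1) for item in text.split(";"))
    return {inner_decode(k): inner_decode(v) for k, v in items}

def unparse_outer(pairs):
    # Encoding starts in the most nested language; its output is then
    # encoded again as a token of the super-language.
    return "note{" + outer_encode(unparse_inner(pairs)) + "}"

def parse_outer(document):
    # Parse the super-language, run the decoder on the token, run the subparser.
    m = re.fullmatch(r"note\{([^{}]*)\}", document)
    if m is None:
        raise ValueError("not a valid note")
    return parse_inner(outer_decode(m.group(1)))

pairs = {"title": "a;b=c", "body": "}note{x=y & 100%"}
document = unparse_outer(pairs)
```

The value of `body` contains control characters of both languages. Each layer only needs to know its own table, which is the point of specifying the encoding per language.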

SLIDE 9

Reducing Language Features

Use case: include rich user input, e.g. HTML, in the output

  • Option 1: Reduce output language
  • Change production rules to match only tokens with special names, define encoding

  • not elegant, but more secure
  • Option 2: Reduce input language
  • Copy input into output AST
  • Program logic must not alter this input
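One possible reading of Option 1, sketched in Python under stated assumptions: the reduced output language accepts only a fixed set of tag names (here `b`, `i` and `em`, chosen arbitrarily), and everything else is treated as text and encoded. The function name and the allow-list are hypothetical and not part of MontiCoder.

```python
import html
import re

# The reduced output language knows only these exact tag tokens.
ALLOWED_TAG = re.compile(r"(</?(?:b|i|em)>)")

def unparse_rich_text(user_input):
    # Splitting with a capturing group keeps the matched tags in the result.
    parts = ALLOWED_TAG.split(user_input)
    return "".join(
        part if ALLOWED_TAG.fullmatch(part) else html.escape(part)
        for part in parts
    )

rendered = unparse_rich_text("<b>hi</b><script>alert(1)</script>")
```

Tags outside the allow-list, including variants with attributes such as `<b onclick=...>`, do not match a tag token and are therefore encoded as plain text.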

SLIDE 10

Using MontiCoder

Language Developer

  1. Define output grammar and encoding table
  2. Generate parser and unparser which include context-sensitive (de/en)coding

Language user a.k.a. application developer

  1. Construct an AST for the output document
     a) Create parsable template
     b) Parse template to preinitialized AST
  2. Add untrusted user data to AST nodes
  3. Run generated MontiCoder unparser
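The application developer's three steps can be mimicked in a few lines of Python. This is only an analogy: the `Text` and `Element` classes and the `unparse` function are hypothetical stand-ins, the AST is built by hand instead of being parsed from a template, and `html.escape` stands in for a generated encoder.

```python
from dataclasses import dataclass, field
import html

@dataclass
class Text:
    value: str

@dataclass
class Element:
    tag: str
    children: list = field(default_factory=list)

def unparse(node):
    # Text nodes are encoded for their context; structure is emitted as-is.
    if isinstance(node, Text):
        return html.escape(node.value)
    inner = "".join(unparse(child) for child in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"

# Step 1: construct a preinitialized AST (stand-in for parsing a template).
greeting = Text("")
page = Element("p", [Text("Hello, "), greeting])

# Step 2: add untrusted user data to an AST node.
greeting.value = "<script>alert(1)</script>"

# Step 3: run the unparser, which applies the encoding.
document = unparse(page)
```

The program logic never concatenates strings into the output. It only fills AST nodes, so the unparser always knows in which context each piece of data sits.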

SLIDE 11

Case Study: HTML and JavaScript

  • Implemented grammars and encoding tables for HTML and JavaScript

  • Web Application uses generated unparser
  • Performed XSS Scan with OWASP ZAP and FuzzDB
  • found no XSS
  • Manual penetration test
  • found an error in one encoding table definition: <Script> was not treated like <script>
  • added options: case-insensitive, ignore whitespaces
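Why the two options matter can be shown with a hedged Python sketch. It assumes the untrusted text lands inside a JavaScript string literal within a `<script>` block, and it uses hypothetical function names; the actual MontiCoder options are not shown in the slides.

```python
import re

# Matches a closing script tag regardless of case and trailing whitespace.
CLOSING_SCRIPT = re.compile(r"</script\s*>", re.IGNORECASE)

def encode_exact(js_text):
    # Table entry that matches only the exact spelling "</script>".
    return js_text.replace("</script>", "\\x3C/script>")

def encode_relaxed(js_text):
    # Same entry with the options case-insensitive and ignore whitespace:
    # the leading "<" of every variant is replaced by the escape \x3C.
    return CLOSING_SCRIPT.sub(lambda m: "\\x3C" + m.group(0)[1:], js_text)

payload = "</ScRiPt ><script>alert(1)</script>"

exact = encode_exact(payload)
relaxed = encode_relaxed(payload)
```

The exact-match entry leaves the `</ScRiPt >` variant untouched, which is the kind of gap a manual penetration test can find when automated scans do not.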

SLIDE 12

Conclusion

  • Injection attacks arise from unparsing without encoding
  • Encoding is a language property
  • defined by encoding table per grammar token
  • MontiCoder: Derive context-sensitive encoder from its definition within the grammar

  • NOT yet another HTML, JavaScript encoder
  • Templates considered harmful
  • Directly putting untrusted data into output
  • Context within the output is lost
  • Stop using IO APIs which have no idea of correct encoding
  • e.g. System.out.println()

SLIDE 13

Thank You

Comments? Questions?