Retrofitting Security in input parsing routines Jayakrishna Menon, - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Retrofitting Security in input parsing routines

Jayakrishna Menon, Christophe Hauser, Yan Shoshitaishvili, Stephen Schwab

{jmenon, hauser, schwab}@isi.edu yans@asu.edu

slide-2
SLIDE 2
Vulnerabilities

  • Many programs are still written in unsafe languages like C/C++.
  • Memory corruption vulnerabilities remain prominent.

Modern defenses

  • OS defenses (ASLR, DEP).
  • Compiler-level defenses (e.g., stack canaries).
  • Code audit tools.

slide-3
SLIDE 3

Parsers

  • Directly exposed to user input.
  • Many custom implementations in unsafe languages (C/C++).
  • Over 170 vulnerabilities reported in various parsing mechanisms since 1999.
  • Varying semantics and the abundance of string manipulations make their implementation error-prone.

slide-4
SLIDE 4

Solution space

slide-5
SLIDE 5
Post-design security

  • Code audits.
  • Refactoring/inserting correct parsers.
  • No source code?

Design-time security

  • Parser libraries.
  • Parser generators.
  • Formal methods.

slide-6
SLIDE 6
Binary-level approach

  • Source code is not always available (legacy code, uncooperative vendors, untrusted IoT devices).
  • What you see is not what you execute (WYSINWYX): compiler bugs and compiler “backdoors”, e.g., XcodeGhost (linking malicious code into executables).

slide-7
SLIDE 7

Challenges

slide-8
SLIDE 8

Scaling problem

Program analysis techniques are difficult to automate in a scalable and precise manner.

slide-9
SLIDE 9
Symbolic execution

  • Precise.
  • Unscalable.

Static analysis

  • Scalable.
  • Imprecise.

slide-10
SLIDE 10

Dynamic analysis

  • Precise.
  • Low coverage.
slide-11
SLIDE 11
Binary

  • Registers.
  • Memory locations.
  • Basic blocks.
  • ...

Source code

  • Types.
  • Variable names.
  • Functions.
  • ...

slide-12
SLIDE 12

How to scale to real world programs?

slide-13
SLIDE 13

Template-based approach

… to discover vulnerabilities based on templates corresponding to common classes of security bugs.

… to retrofit security by patching programs at the binary level.

slide-14
SLIDE 14
Initial approach

  • Focuses on overflows in buffers allocated statically on the stack.
  • Template-based: categorizes causes of vulnerabilities into three classes.
  • Combines static analysis and symbolic execution.

Classes/templates

  • Unconstrained input.
  • Under-constrained input size.
  • Unchecked termination condition.
  • ...

slide-15
SLIDE 15

Unconstrained input.

Improper use of functions that do not check sizes, such as strcpy and sprintf.

slide-16
SLIDE 16

Example 1: CVE-2003-0390

int opt_atoi(char *s)
{
    char buf[1024];
    char *fmt = "String [%s] is not valid";
    /* s is user-controlled and unbounded: sprintf can write past buf. */
    sprintf(buf, fmt, s);
}

slide-17
SLIDE 17

Under-constrained input size.

Improper validation of the size argument passed to functions such as memcpy.

slide-18
SLIDE 18

Example 2: CVE-2015-3329

void phar_set_inode(phar_entry_info *entry)
{
    char tmp[1024];
    /* In this excerpt, fname_len is not checked against sizeof(tmp). */
    memcpy(tmp, entry->phar->fname, entry->phar->fname_len);
}
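
The check missing from this template can be sketched as follows. This is a minimal illustration, not the PHP fix or the authors' patch: `copy_bounded` is a hypothetical helper that compares the input-controlled length against the destination capacity before calling memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: copy len bytes from src
 * into dst only if they fit in dst_size bytes.
 * Returns 0 on success, -1 if the copy would overflow dst
 * (dst is left untouched in that case). */
int copy_bounded(char *dst, size_t dst_size, const char *src, size_t len)
{
    if (len > dst_size) {
        return -1;  /* size field exceeds the destination buffer */
    }
    memcpy(dst, src, len);
    return 0;
}
```

In the example above, a caller would pass sizeof(tmp) as dst_size and the fname_len field as len, and decide what to do when -1 is returned.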

slide-19
SLIDE 19

Unchecked termination condition.

Performing operations on (possibly) incorrectly terminated strings.
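
One common way this class arises in C is strncpy, which does not write a terminating NUL when the source is at least as long as the size limit. The sketch below is an illustration under that assumption; `is_terminated` and `copy_terminated` are hypothetical helper names, not taken from the slides.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if buf contains a NUL byte within its first size bytes,
 * i.e. it is safe to treat buf as a C string of that capacity. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* strncpy-based copy that always terminates dst, truncating src
 * if needed. Does nothing when dst_size is 0. */
void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

Calling strlen or strcpy on a buffer for which is_terminated returns 0 reads past the end of the buffer, which is the behavior this template targets.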

slide-20
SLIDE 20

2-step analysis approach

Static analysis (CFG, DDG)

  • Identify string manipulation functions.
  • Analyze backward data-dependency.
  • Identify destination buffers (sinks).
  • Identify user input.

Output: dangerous program paths.

Symbolic analysis (SE)

Output: path constraints (memory corruption caused by unsafe buffer manipulation).

slide-21
SLIDE 21

Analysis results

                      Static analysis   Symbolic execution   Overall
False positive rate   6.6%              0%                   0% *
False negative rate   40%               0% *                 40%
Time                  1-260s            1-400s               2-660s

slide-22
SLIDE 22

New bugs

2 new bugs found in the binary code of common open-source projects and libraries (in a semi-automatic setting).

slide-23
SLIDE 23

Retrofitting security: binary patching

slide-24
SLIDE 24

Adding the missing checks

  • Remember: we focus on stack buffers.
  • On the identified program paths, we constrain the user input such that: user_input_size < stack_buffer_size

slide-25
SLIDE 25

Adding the missing checks

When the constraints are violated, we crash the program. This is comparable to the behavior of checked functions such as __sprintf_chk().
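
The crash-on-violation policy can be sketched at source level as follows. This is only an illustration of the idea, not glibc's __sprintf_chk implementation and not the binary patch the slides describe; `checked_format` is a hypothetical name, and the sketch assumes a format string with exactly one %s argument.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: format arg into dst, and abort the program
 * instead of overflowing when the result does not fit.
 * snprintf never writes more than dst_size bytes and returns the
 * length the full output would have had, so a return value that is
 * >= dst_size means the constraint was violated. */
void checked_format(char *dst, size_t dst_size, const char *fmt,
                    const char *arg)
{
    int n = snprintf(dst, dst_size, fmt, arg);
    if (n < 0 || (size_t)n >= dst_size) {
        abort();  /* crash rather than corrupt the stack */
    }
}
```

The design choice mirrors the slide: the program state after a violation is not trusted, so execution stops instead of continuing with truncated or corrupted data.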

slide-26
SLIDE 26

Patching the binary

Static reassembly problems: breaking internal program references.

Partial solution: inject trampoline gadgets in padding bytes between functions (up to 15 consecutive NOPs).
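
A minimal sketch of the two building blocks such a trampoline needs on x86, under stated assumptions: padding is made of single-byte NOPs (0x90), whereas real binaries may use multi-byte NOP encodings, and the jump is the 5-byte `jmp rel32` form (opcode 0xE9). The function names are hypothetical and do not come from the authors' tool.

```c
#include <stddef.h>
#include <stdint.h>

#define NOP_BYTE 0x90u
#define JMP_REL32_LEN 5

/* Returns the offset of the first run of at least min_run consecutive
 * single-byte NOPs in code, or -1 if there is none. */
long find_nop_run(const uint8_t *code, size_t len, size_t min_run)
{
    size_t run = 0;
    for (size_t i = 0; i < len; i++) {
        run = (code[i] == NOP_BYTE) ? run + 1 : 0;
        if (min_run > 0 && run >= min_run) {
            return (long)(i + 1 - min_run);
        }
    }
    return -1;
}

/* Encodes "jmp rel32" placed at address src and targeting dst.
 * The displacement is relative to the end of the 5-byte instruction.
 * Returns 0 on success, -1 if dst is out of rel32 range. */
int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint64_t src, uint64_t dst)
{
    int64_t rel = (int64_t)(dst - (src + JMP_REL32_LEN));
    if (rel < INT32_MIN || rel > INT32_MAX) {
        return -1;
    }
    uint32_t u = (uint32_t)(int32_t)rel;
    out[0] = 0xE9;                          /* jmp rel32 opcode */
    out[1] = (uint8_t)(u & 0xffu);          /* little-endian displacement */
    out[2] = (uint8_t)((u >> 8) & 0xffu);
    out[3] = (uint8_t)((u >> 16) & 0xffu);
    out[4] = (uint8_t)((u >> 24) & 0xffu);
    return 0;
}
```

A patcher along these lines would overwrite the unsafe call site with a jump into a padding region, place the check there, and jump back. The limited size of each padding region (at most 15 bytes in the slide) is why this is only a partial solution.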

slide-27
SLIDE 27

Inserting checks

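Before:

int opt_atoi(char *s)
    sprintf(buf, fmt, s);

After:

int opt_atoi(char *s)
    /* Inserted check (simplified: the fixed text in fmt also takes space in buf). */
    if (strlen(s) > 1024) exit(1);
    sprintf(buf, fmt, s);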

slide-28
SLIDE 28

More templates

slide-29
SLIDE 29

New template

Memory allocation errors
… authentication errors.
… misuses of cryptographic APIs.
… information leakage.

slide-30
SLIDE 30

New bugs

12 new bugs found in the binary code of common open-source programs and libraries (in a fully automated setting).

slide-31
SLIDE 31

Discussion

Lightweight and scalable approach.
… but high rate of false negatives.
… limited patching capabilities.

slide-32
SLIDE 32

Stumbling blocks

  • Data structure recovery.
  • Pointer aliasing.

slide-33
SLIDE 33

Future work

  • Improve data dependence tracking.
  • Leverage static reassembly techniques.
  • More vulnerability templates.
  • Apply to large corpus of IoT firmware.
slide-34
SLIDE 34

Key takeaways

  • Templates per vulnerability class.
  • Scalable, two-level approach based on a combination of static analysis + symbolic execution.
  • High-precision: we can infer semantic-agnostic patches for each class.
  • New bugs.
slide-35
SLIDE 35

?