SLIDE 1

Università degli Studi di Milano
Facoltà di Scienze Matematiche, Fisiche e Naturali
Dipartimento di Informatica e Comunicazione

A hybrid analysis framework for detecting web application vulnerabilities

Mattia Monga, Roberto Paleari, Emanuele Passerini

SESS 2009

  • M. Monga, R. Paleari, E. Passerini

A hybrid analysis framework for detecting . . . SESS 2009 1 / 15

SLIDE 2

Introduction

Web applications

  • many applications adopt the web paradigm: client-server model + HTTP protocol
  • web servers are augmented with modules for the execution of server-side code

Security issues

  • web applications are known to be subject to different attacks (e.g., SQL injection and cross-site scripting)
  • about 60% of software vulnerabilities are specific to web applications

Root cause

insufficient sanitization of user-supplied input

SLIDE 4

Taint analysis of web applications

How does it work?

  1. data from untrusted sources are marked as tainted
  2. the “taint” attribute is propagated along with the data
  3. an alert is raised if tainted data containing malicious characters reach a sink
  4. sanitization: tainted → untainted

Static analysis

  • complete
  • no run-time overhead
  • overly conservative: results can be imprecise

Dynamic analysis

  • accurate results
  • incomplete
  • high overhead (about 30%)

SLIDE 6

A hybrid approach

Goal

design and develop a hybrid analysis framework in order to obtain:

  • accurate results
  • low run-time overhead

Our idea

  1. off-line analysis
       • build a static model of the whole application
       • identify dangerous code statements
  2. on-line analysis
       • dynamic taint analysis over the dangerous statements only

SLIDE 8

Motivating example

     1  function get_product($id) {
     2    $q = "SELECT ... WHERE id=$id";
     3    mysql_connect(...);
     4    $res = mysql_query($q);
     5  }
     6  if (isset($_GET['product_id'])) {
     7    $a = $_GET['product_id'];
     8    get_product($a);
     9  } else {
    10    $msg = 'Invalid request';
    11    echo $msg;
    12  }


Vulnerability

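SQL injection: the value of $_GET['product_id'] reaches the query at line 4 without sanitization; the vulnerable path is control-dependent on the condition at line 6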
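
To make the vulnerability concrete, the following Python sketch mirrors line 2 of the PHP example and shows how a crafted `product_id` changes the structure of the query. The function names are hypothetical, the elided part of the query (`...`) is kept as in the slide, and the integer cast in the second function is only one possible mitigation, not something the slides propose.

```python
def get_product_query(product_id):
    # Mirrors line 2 of the PHP example: the value is interpolated verbatim.
    return "SELECT ... WHERE id=" + product_id


def get_product_query_checked(product_id):
    # Hypothetical mitigation: accept the value only if it parses as an integer.
    # int() raises ValueError for anything that is not a plain integer literal.
    return "SELECT ... WHERE id=" + str(int(product_id))


benign = get_product_query("42")
injected = get_product_query("1 OR 1=1")
```

With the attacker-controlled value `1 OR 1=1`, the `WHERE` clause of `injected` becomes `id=1 OR 1=1`, which is true for every row; the checked variant rejects the same input with a `ValueError`.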


Off-line analysis

identify dangerous statements


On-line analysis

taint-propagation only over dangerous statements

SLIDE 12

Phan: PHP Hybrid Analyzer

Off-line analysis: translate into IR → construct CFG/iCFG → dangerous statements

Execution loop: identify sources → propagate taint info → detect attack

SLIDE 13

Off-line analysis

Translation into IR

PHP source (lines 6 to 12 of the motivating example):

     6  if (isset($_GET['product_id'])) {
     7    $a = $_GET['product_id'];
     8    get_product($a);
     9  } else {
    10    $msg = 'Invalid request';
    11    echo $msg;
    12  }

IR (each instruction is prefixed with the PHP source line it was generated from):

     6  V0 := T0__GET
     6  P0 := V0[c("product_id")]
     6  P1 := c(1)
     6  T1 := CALL c("isset")
     6  JUMP ((T1 == c(0))) c(10)
     7  V2 := T0__GET
     7  V3 := V2[c("product_id")]
     7  C0_a := V3
     7  V4 := C0_a
     8  P1 := C0_a
     8  V5 := CALL c("get_product")
     9  JUMP c(12)
    10  C1_msg := c("Invalid...")
    10  V6 := C1_msg
    11  P0 := C1_msg
    11  CALL c("echo")
    12  RET c(1)

Intermediate language

  • RISC-like instructions
  • 5 instruction types, 4 expression types

SLIDE 14

Off-line analysis

CFG construction


IR of get_product:

    00  C0_id := P1
    01  T0 := c("")
    02  T0 := (T0 . c("SELECT ... WHERE id="))
    03  T0 := (T0 . C0_id)
    04  C1_q := T0
    04  V1 := C1_q
    05  D0 := c("mysql_query")
    06  P1 := C1_q
    07  V2 := CALL D0
    08  C2_res := V2
    08  V3 := C2_res
    09  RET c(None)

IR of the main script (listed in address order):

    00  NOP
    01  V0 := T0__GET
    02  P0 := V0[c("product_id")]
    02  P1 := c(1)
    02  T1 := CALL c("###isset###")
    03  JUMP ((T1 == c(0))) c(10)
    04  V2 := T0__GET
    05  V3 := V2[c("product_id")]
    06  C0_a := V3
    06  V4 := C0_a
    07  P1 := C0_a
    08  V5 := CALL c("get_product")
    09  JUMP c(12)
    10  C1_msg := c("Invalid request")
    10  V6 := C1_msg
    11  P0 := C1_msg
    11  CALL c("echo")
    12  RET c(1)
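
As an illustration of CFG construction, the Python sketch below splits the main-script IR into basic blocks and computes their successors with the textbook leader-based algorithm. This is an assumption about how such a step could look, not Phan's actual code: the tuple encoding, `JUMP_RE` and `build_cfg` are hypothetical names, and calls are not treated as block terminators here.

```python
import re

# Main-script IR from the slide as (address, instruction text) pairs.
MAIN_IR = [
    (0, "NOP"),
    (1, "V0 := T0__GET"),
    (2, 'P0 := V0[c("product_id")]'),
    (2, "P1 := c(1)"),
    (2, 'T1 := CALL c("###isset###")'),
    (3, "JUMP ((T1 == c(0))) c(10)"),
    (4, "V2 := T0__GET"),
    (5, 'V3 := V2[c("product_id")]'),
    (6, "C0_a := V3"),
    (6, "V4 := C0_a"),
    (7, "P1 := C0_a"),
    (8, 'V5 := CALL c("get_product")'),
    (9, "JUMP c(12)"),
    (10, 'C1_msg := c("Invalid request")'),
    (10, "V6 := C1_msg"),
    (11, "P0 := C1_msg"),
    (11, 'CALL c("echo")'),
    (12, "RET c(1)"),
]

# Group 1 is the optional condition, group 2 the target address.
JUMP_RE = re.compile(r"JUMP (?:(\(.*\)) )?c\((\d+)\)")


def build_cfg(ir):
    """Return {block start address: set of successor block addresses}."""
    first_index = {}
    for i, (addr, _) in enumerate(ir):
        first_index.setdefault(addr, i)

    # Leaders: the entry, every jump target, and whatever follows a JUMP/RET.
    leaders = {0}
    for i, (_, text) in enumerate(ir):
        jump = JUMP_RE.fullmatch(text)
        if jump:
            leaders.add(first_index[int(jump.group(2))])
        if (jump or text.startswith("RET")) and i + 1 < len(ir):
            leaders.add(i + 1)

    starts = sorted(leaders)
    edges = {}
    for start, end in zip(starts, starts[1:] + [len(ir)]):
        _, last = ir[end - 1]
        succ = set()
        jump = JUMP_RE.fullmatch(last)
        if jump:
            succ.add(int(jump.group(2)))
            if jump.group(1) and end < len(ir):
                succ.add(ir[end][0])  # conditional jump: fall-through edge
        elif not last.startswith("RET") and end < len(ir):
            succ.add(ir[end][0])  # plain fall-through
        edges[ir[start][0]] = succ
    return edges


cfg = build_cfg(MAIN_IR)
```

For this input the sketch yields four blocks starting at addresses 0, 4, 10 and 12: the entry block branches to 4 and 10, and both branches join at the `RET` in block 12.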

SLIDE 15

Off-line analysis

iCFG construction


get_product as it appears in the iCFG (the final RET of the CFG version is replaced by an assignment to V5 and a jump to address 12 of the main script):

    00  C0_id := P1
    01  T0 := c("")
    02  T0 := (T0 . c("SELECT ... WHERE id="))
    03  T0 := (T0 . C0_id)
    04  C1_q := T0
    04  V1 := C1_q
    05  D0 := c("mysql_query")
    06  P1 := C1_q
    07  V2 := CALL D0
    08  C2_res := V2
    08  V3 := C2_res
    09  V5 := c(None)
    09  JUMP c(12)

Main script as it appears in the iCFG (listed in address order):

    00  NOP
    01  V0 := T0__GET
    02  P0 := V0[c("product_id")]
    02  P1 := c(1)
    02  T1 := CALL c("###isset###")
    03  JUMP ((T1 == c(0))) c(10)
    04  V2 := T0__GET
    05  V3 := V2[c("product_id")]
    06  C0_a := V3
    06  V4 := C0_a
    07  P1 := C0_a
    08  CALL c("get_product")
    10  C1_msg := c("Invalid request")
    10  V6 := C1_msg
    11  P0 := C1_msg
    11  CALL c("echo")
    12  RET c(1)

  • constant propagation to handle iCTI (indirect control-transfer instructions)
  • handling of inclusion statements

SLIDE 17

Off-line analysis

Identification of dangerous statements


  • identify sources and sinks
  • find paths from sources to sinks
  • compute a backward slice over the sinks' arguments
  • flag only the dangerous statements
  • ignore sinks with constant input arguments
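
The slicing step can be sketched in Python on a hand-simplified version of the motivating example. The `Stmt` encoding, the variable names and the split into a "then" and an "else" path are assumptions made for illustration; the real analysis works on the IR and the iCFG. For each sink the sketch walks backwards, collects the statements that define the sink's arguments, and keeps the slice only if it depends on a source.

```python
from typing import NamedTuple, Optional, Tuple


class Stmt(NamedTuple):
    label: str                 # PHP source line of the statement
    dest: Optional[str]        # variable defined, if any
    uses: Tuple[str, ...]      # variables read
    is_sink: bool = False


SOURCES = {"$_GET"}

# Path taken when product_id is set (lines 7, 8, 2, 3, 4).
THEN_PATH = [
    Stmt("7", "$a", ("$_GET",)),
    Stmt("8", "$id", ("$a",)),            # argument passing into get_product
    Stmt("2", "$q", ("$id",)),
    Stmt("3", None, ()),                  # mysql_connect(...)
    Stmt("4", "$res", ("$q",), is_sink=True),
]

# Path taken otherwise (lines 10, 11).
ELSE_PATH = [
    Stmt("10", "$msg", ()),               # constant string
    Stmt("11", None, ("$msg",), is_sink=True),
]


def dangerous_statements(path, sources):
    """Backward slice over each sink's arguments along a straight-line path."""
    flagged = set()
    for i, stmt in enumerate(path):
        if not stmt.is_sink:
            continue
        needed = set(stmt.uses)
        in_slice = [stmt.label]
        for prev in reversed(path[:i]):
            if prev.dest in needed:
                needed.discard(prev.dest)
                needed |= set(prev.uses)
                in_slice.append(prev.label)
        # Keep the slice only if it bottoms out in an untrusted source;
        # sinks fed by constants are ignored.
        if needed & sources:
            flagged.update(in_slice)
    return flagged


then_flagged = dangerous_statements(THEN_PATH, SOURCES)
else_flagged = dangerous_statements(ELSE_PATH, SOURCES)
```

Under these assumptions lines 7, 8, 2 and 4 are flagged, while `mysql_connect` (line 3) and the `echo` of a constant message (lines 10 and 11) are not.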

SLIDE 25

On-line analysis

Dynamic taint analysis

  1. monitor only dangerous statements
  2. propagate taint
  3. raise an alert when tainted data reaches a sensitive sink

Alert raised on the motivating example: SQL injection
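
The three on-line steps can be sketched as a tiny Python monitor that does taint bookkeeping only for statements flagged as dangerous. The trace, the set of dangerous lines and the statement encoding are assumptions based on the motivating example (a request that supplies `product_id`), and taint is tracked per variable for brevity; the actual module hooks into the Zend VM instead.

```python
# Executed trace: (PHP line, variable defined, variables read, kind).
TRACE = [
    ("6", "$t", (), "assign"),            # isset(...) check
    ("7", "$a", (), "source"),            # $a = $_GET['product_id']
    ("8", "$id", ("$a",), "assign"),      # argument passing into get_product
    ("2", "$q", ("$id",), "assign"),      # query string construction
    ("3", None, (), "assign"),            # mysql_connect(...)
    ("4", "$res", ("$q",), "sink"),       # mysql_query($q)
]

# Assumed result of the off-line phase for this path.
DANGEROUS = {"7", "8", "2", "4"}


def monitor(trace, dangerous):
    """Return (lines that raised an alert, number of monitored statements)."""
    tainted = set()
    alerts = []
    monitored = 0
    for line, dest, uses, kind in trace:
        if line not in dangerous:
            continue  # step 1: statements not flagged run without bookkeeping
        monitored += 1
        if kind == "source":
            tainted.add(dest)
        elif kind == "sink":
            if any(u in tainted for u in uses):
                alerts.append(line)  # step 3: tainted data reached a sink
        elif any(u in tainted for u in uses):
            tainted.add(dest)        # step 2: propagate taint
        else:
            tainted.discard(dest)    # overwritten with untainted data
    return alerts, monitored


alerts, monitored = monitor(TRACE, DANGEROUS)
```

On this trace the monitor instruments four of the six executed statements and raises a single alert at line 4, the `mysql_query` call.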

SLIDE 26

Implementation

Off-line module

  • PHP extension module
  • bytecode-to-IR translator
  • IR analysis modules
  • size: about 6000 lines of Python and about 1500 lines of C

On-line module

  • hooks inside the Zend VM
  • self-contained module (easily portable)
  • size: about 1000 lines of C

SLIDE 27

Preliminary evaluation

Application          Type   Opcodes   Path opcodes   Dangerous opcodes (% of path opcodes)
Clean CMS 1.5        SQLI       221            104    56 (53.85%)
Goople CMS 1.8.2     SQLI        62             58    17 (29.31%)
MyForum 1.3          SQLI      1102            651   141 (21.66%)
Pizzis CMS 1.5.1     SQLI        91             38    11 (28.95%)
W2B phpGreetCards    XSS       1078            814   221 (27.15%)
WordPress            XSS        612             26    10 (38.46%)
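
The percentages in the last column are consistent with the ratio of dangerous opcodes to path opcodes. The short Python check below recomputes them from the counts in the table; the variable and function names are illustrative.

```python
# (application, type, opcodes, path opcodes, dangerous opcodes) from the table.
RESULTS = [
    ("Clean CMS 1.5", "SQLI", 221, 104, 56),
    ("Goople CMS 1.8.2", "SQLI", 62, 58, 17),
    ("MyForum 1.3", "SQLI", 1102, 651, 141),
    ("Pizzis CMS 1.5.1", "SQLI", 91, 38, 11),
    ("W2B phpGreetCards", "XSS", 1078, 814, 221),
    ("WordPress", "XSS", 612, 26, 10),
]


def dangerous_share(path_opcodes, dangerous_opcodes):
    """Percentage of path opcodes that were flagged as dangerous."""
    return round(100 * dangerous_opcodes / path_opcodes, 2)


shares = [dangerous_share(path, dangerous)
          for _, _, _, path, dangerous in RESULTS]
```

The recomputed values match the table, so between roughly a fifth and a half of the opcodes on each vulnerable path are flagged as dangerous.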

Experimental results

  • open-source applications with known vulnerabilities
  • high performance gain: between about 22% and 54% of the path opcodes are flagged as dangerous
  • future improvements can further reduce run-time overhead

SLIDE 28

Conclusions

Contributions

  • hybrid program analysis framework to detect input-driven security vulnerabilities in web applications
  • prototype implementation for PHP (at bytecode level)

Limitations

  • only 93 of 150 Zend opcodes are supported
  • limited support for aliasing and class constructs
  • second-order injections

Future Work

  • improve the static analysis module (e.g., static taint analysis)
  • support more Zend opcodes

SLIDE 29

Thank you for your attention!

Questions?
