VUDDY: A Scalable Approach for Vulnerable Code Clone Detection

IEEE S&P 2017. VUDDY: A Scalable Approach for Vulnerable Code Clone Detection. Seulbae Kim, Seunghoon Woo, Heejo Lee, and Hakjoo Oh. Korea University, May 23, 2017.


  1. IEEE S&P 2017
     VUDDY: A Scalable Approach for Vulnerable Code Clone Detection
     Seulbae Kim, Seunghoon Woo, Heejo Lee, and Hakjoo Oh
     Korea University
     May 23, 2017

  2. Question
     • Number of unpatched vulnerabilities in smartphone firmware’s source code?
       200+ unpatched vulnerable code clones detected!
     Computer & Communication Security Lab., Korea University

  3. Motivation
     • The number of open source software projects is increasing

  4. Motivation
     • Code clones – reused code fragments
     • Major cause of vulnerability propagation
       Example: CVE-2016-5195

  5. Problem: Scalable & Accurate Vulnerable Code Clone Discovery

  6. Scalable & Accurate Vulnerable Code Clone Discovery
     • Scalability
       Software systems are getting bigger
       Linux kernel – 25.4 MLoC
       “L” Smart TV – 35 MLoC
     [Figure: chart with axes "accuracy" and "scalability"]

  7. Scalable & Accurate Vulnerable Code Clone Discovery
     • Accuracy
       False positives (FP) == increased time and effort
     [Figure: chart with axes "accuracy" and "scalability"]

  8. Scalable & Accurate Vulnerable Code Clone Discovery
     • Previous approaches
       Line-level matching: Jang et al., ReDeBug (S&P’12)
       Token-level matching: Kamiya et al., CCFinder (TSE’02)
       Graph/tree matching: Jiang et al. (ICSE’07)
       Bag-of-tokens matching: Sajnani et al., SourcererCC (ICSE’16)
       File-level matching: Sasaki et al., FCFinder (MSR’10)
     [Figure: the approaches are plotted on a chart with axes "accuracy" and "scalability"]

  9. Scalable & Accurate Vulnerable Code Clone Discovery
     • Goal
     [Figure: the accuracy/scalability chart of previous approaches from slide 8, with a "?" marking the goal]

  10. Proposed Method: VUDDY

  11. Demonstration of VUDDY

  12–16. Proposed method: VUDDY
      • VUDDY: VUlnerable coDe clone DiscoverY
      • Searches for vulnerable code clones
      • Scales beyond a 1 BLoC target
      • Detects both known & unknown vulnerabilities
      • Low false positive rate

  17. Proposed method: VUDDY
      • Overview
        vulnerable functions → fingerprinting → fingerprint dictionary of vulnerable functions
        a target program → fingerprinting → fingerprint dictionary of target functions
        both dictionaries → dictionary comparison → vulnerable code clones

  18. Collecting vulnerable code
      • Vulnerability patching
        Old code (vulnerable) + CVE patch → New code (fixed)

  19. Collecting vulnerable code
      • Reconstructing vulnerability from security patch
        Software repository + CVE patch → Old code (vulnerable)
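
The slide only states that the vulnerable (old) code is recovered from the repository and the CVE patch. One way to picture this is to undo a unified-diff hunk: keep context lines and removed lines, and drop added lines. The following is a minimal sketch under that assumption; the function name `reconstruct_old_lines` and the sample hunk are hypothetical and do not come from the slides.

```python
def reconstruct_old_lines(hunk_lines):
    """Return the pre-patch lines of one unified-diff hunk body.

    Assumes `hunk_lines` holds only the hunk body: no '---'/'+++' file
    headers and no '@@' hunk header.
    """
    old = []
    for line in hunk_lines:
        if line.startswith("+"):
            continue  # added by the patch, so absent from the old code
        if line.startswith("\\"):
            continue  # "\ No newline at end of file" marker
        if line.startswith("-") or line.startswith(" "):
            old.append(line[1:])  # removed or unchanged: part of the old code
    return old


# Hypothetical patch hunk that adds a missing length check.
hunk = [
    " int copy(char *dst, const char *src, int n) {",
    "-    memcpy(dst, src, n);",
    "+    if (n > 0)",
    "+        memcpy(dst, src, n);",
    "     return n;",
    " }",
]
old_code = reconstruct_old_lines(hunk)
```

Here `old_code` holds the four lines of the unpatched function, which is the version a vulnerable-function collector would want to fingerprint.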

  20. Fingerprinting a program
      [Figure: a program]

  21. Fingerprinting a program
      1. Retrieve all functions from a program

          int sum (int a, int b)
          {
              return a + b;
          }

          void increment()
          {
              int num = 80;
              num++;
              // no return
          }

          void printer (char* src)
          {
              printf("%s", src);
          }

  22. Fingerprinting a program
      2. Apply abstraction and normalization to functions

          int sum (int a, int b)
          {
              return a + b;
          }
          → returnfparam+fparam;

          void increment()
          {
              int num = 80;
              num++;
              // no return
          }
          → dtypelvar=80;lvar++;

          void printer (char* src)
          {
              printf("%s", src);
          }
          → funccall("%s", fparam);

  23. Fingerprinting a program
      3. Compute length and hash value

          int sum (int a, int b)      → returnfparam+fparam;
              length: 20, hash val: C94D9910…

          void increment()            → dtypelvar=80;lvar++;
              length: 20, hash val: D6E77882…

          void printer (char* src)    → funccall("%s", fparam);
              length: 23, hash val: 9A45E4A1…

  24. Fingerprinting a program
      4. Store in a dictionary

          length: 20, hash val: C94D9910…
          length: 20, hash val: D6E77882…
          length: 23, hash val: 9A45E4A1…

          “Fingerprint dictionary”
              20: [C94D9910, D6E77882]
              23: [9A45E4A1]
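
Steps 3 and 4 above can be sketched as follows: each abstracted and normalized function body is reduced to a (length, hash) pair, and the hashes are grouped under their length. The slides do not name the hash function, so MD5 from Python's `hashlib` is used here purely as an assumption; the digests it produces are not claimed to match the values shown on the slides. The names `fingerprint` and `build_dictionary` are illustrative.

```python
import hashlib
from collections import defaultdict


def fingerprint(normalized_body):
    """Return (length, hex digest) for one normalized function body."""
    # Assumption: MD5 stands in for whatever hash the authors use.
    digest = hashlib.md5(normalized_body.encode("utf-8")).hexdigest()
    return len(normalized_body), digest


def build_dictionary(normalized_bodies):
    """Group the hash values of all function bodies by body length."""
    dictionary = defaultdict(set)
    for body in normalized_bodies:
        length, digest = fingerprint(body)
        dictionary[length].add(digest)
    return dict(dictionary)


# The three normalized bodies shown on the slides.
bodies = [
    "returnfparam+fparam;",
    "dtypelvar=80;lvar++;",
    'funccall("%s", fparam);',
]
target_dictionary = build_dictionary(bodies)
```

The resulting dictionary has two keys, 20 and 23, holding two hashes and one hash respectively, which mirrors the shape of the slide's fingerprint dictionary. A set is used per length (instead of the list drawn on the slide) so that membership checks during comparison stay cheap.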

  25. Abstraction
      • Transform function by replacing
        • Formal parameters
        • Data types
        • Local variables
        • Function names

      Level 0: No abstraction

          void avg (float arr[], int len) {
              static float sum = 0;
              unsigned int i;

              for (i = 0; i < len; i++) {
                  sum += arr[i];
              }

              printf("%f %d\n", sum/len, validate(sum));
          }

  26. Abstraction
      • Transform function by replacing
        • Formal parameters
        • Data types
        • Local variables
        • Function names

      Level 1: Formal parameter abstraction

          void avg (float FPARAM[], int FPARAM) {
              static float sum = 0;
              unsigned int i;

              for (i = 0; i < FPARAM; i++) {
                  sum += FPARAM[i];
              }

              printf("%f %d\n", sum/FPARAM, validate(sum));
          }

  27. Abstraction
      • Transform function by replacing
        • Formal parameters
        • Data types
        • Local variables
        • Function names

      Level 2: Local variable name abstraction

          void avg (float FPARAM[], int FPARAM) {
              static float LVAR = 0;
              unsigned int LVAR;

              for (LVAR = 0; LVAR < FPARAM; LVAR++) {
                  LVAR += FPARAM[LVAR];
              }

              printf("%f %d\n", LVAR/FPARAM, validate(LVAR));
          }

  28. Abstraction
      • Transform function by replacing
        • Formal parameters
        • Data types
        • Local variables
        • Function names

      Level 3: Data type abstraction

          DTYPE avg (DTYPE FPARAM[], DTYPE FPARAM) {
              DTYPE LVAR = 0;
              unsigned DTYPE LVAR;

              for (LVAR = 0; LVAR < FPARAM; LVAR++) {
                  LVAR += FPARAM[LVAR];
              }

              printf("%f %d\n", LVAR/FPARAM, validate(LVAR));
          }

  29. Abstraction
      • Transform function by replacing
        • Formal parameters
        • Data types
        • Local variables
        • Function names

      Level 4: Function call abstraction

          DTYPE avg (DTYPE FPARAM[], DTYPE FPARAM) {
              DTYPE LVAR = 0;
              unsigned DTYPE LVAR;

              for (LVAR = 0; LVAR < FPARAM; LVAR++) {
                  LVAR += FPARAM[LVAR];
              }

              FUNCCALL("%f %d\n", LVAR/FPARAM, FUNCCALL(LVAR));
          }
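
The abstraction levels above amount to replacing certain identifiers with fixed placeholder symbols. The following is a rough sketch of that idea, assuming the caller already knows which names are parameters, local variables, data types, and called functions (a real tool would get these from a parser, which the slides do not describe). The function `abstract` and its arguments are hypothetical. String literals are skipped so that words inside them are not rewritten.

```python
import re

# Matches either a double-quoted string literal or an identifier.
# Literals are listed first so identifiers inside them are never rewritten.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_]\w*')


def abstract(body, params=(), local_vars=(), types=(), calls=()):
    """Replace the given names in `body` with abstraction symbols."""
    mapping = {}
    mapping.update({name: "FPARAM" for name in params})
    mapping.update({name: "LVAR" for name in local_vars})
    mapping.update({name: "DTYPE" for name in types})
    mapping.update({name: "FUNCCALL" for name in calls})

    def replace(match):
        token = match.group(0)
        # String literals start with a quote and are never in the mapping.
        return mapping.get(token, token)

    return TOKEN.sub(replace, body)


level4_sum = abstract("return a + b;", params=["a", "b"])
level4_increment = abstract("int num = 80; num++;",
                            local_vars=["num"], types=["int"])
level4_printer = abstract('printf("%s", src);',
                          params=["src"], calls=["printf"])
```

Passing only `params` corresponds to Level 1, adding `local_vars` to Level 2, and so on. This sketch does not handle character literals, macros, or multi-word types such as `static float`, so it should not be expected to reproduce the slide output exactly for every function.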

  30. Normalization
      • Remove
        • comments
        • tabs
        • white spaces
        • CRLF
      • Convert into lowercase

          DTYPE avg (DTYPE FPARAM[], DTYPE FPARAM) {
              DTYPE LVAR = 0;
              unsigned DTYPE LVAR;

              for (LVAR = 0; LVAR < FPARAM; LVAR++) {
                  LVAR += FPARAM[LVAR];
              }

              FUNCCALL("%f %d\n", LVAR/FPARAM, FUNCCALL(LVAR));
          }

      Normalized function body:

          dtypelvar=0;unsigneddtypelvar;for(lvar=0;lvar<fparam;lvar++){lvar+=fparam[lvar];}funccall("%f %d\n", lvar/fparam, funccall(lvar));
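
The normalization rules listed above can be approximated with two regular-expression passes followed by lowercasing. This is a minimal sketch, not the authors' implementation: the name `normalize` is illustrative, and the regex-based comment removal is an assumption that would mis-handle comment markers appearing inside string literals.

```python
import re

# C block comments and line comments (naive: ignores string literals).
COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)


def normalize(body):
    """Strip comments and all whitespace, then convert to lowercase."""
    without_comments = COMMENT.sub("", body)
    # \s covers spaces, tabs, CR, and LF in one pass.
    without_whitespace = re.sub(r"\s+", "", without_comments)
    return without_whitespace.lower()


normalized_sum = normalize("return FPARAM /* add */ + FPARAM;")
normalized_increment = normalize("DTYPE LVAR = 80;\r\n\tLVAR++; // no return\r\n")
```

Both results are 20 characters long, matching the lengths given for `sum` and `increment` on the fingerprinting slides. Note that this sketch removes every whitespace character, including any inside string literals.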

  31–33. Vulnerable code clone detection
      • By comparing two fingerprint dictionaries

      repository → fingerprint dictionary of vulnerable functions
          20: [ABCDEF01, C94D9910]
          21: [D155F630]
          22: [C67F45FD, DDBF3838]

      target program → fingerprint dictionary of target functions
          20: [C94D9910, D6E77882]
          23: [9A45E4A1]
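
Comparing the two dictionaries can be sketched as a lookup by length followed by a set intersection of hash values. The function name `find_clones` is hypothetical; the data is copied from the slide.

```python
def find_clones(vulnerable, target):
    """Return {length: shared hashes} for every length present in both."""
    matches = {}
    for length, target_hashes in target.items():
        vulnerable_hashes = vulnerable.get(length)
        if vulnerable_hashes is None:
            continue  # no vulnerable function of this length: skip hashing work
        common = set(vulnerable_hashes) & set(target_hashes)
        if common:
            matches[length] = common
    return matches


# Fingerprint dictionaries as shown on the slide.
vulnerable_dictionary = {
    20: ["ABCDEF01", "C94D9910"],
    21: ["D155F630"],
    22: ["C67F45FD", "DDBF3838"],
}
target_dictionary = {
    20: ["C94D9910", "D6E77882"],
    23: ["9A45E4A1"],
}
clones = find_clones(vulnerable_dictionary, target_dictionary)
# clones == {20: {"C94D9910"}}
```

Only length 20 appears in both dictionaries, and `C94D9910` is the single hash they share, so that function is reported as a vulnerable clone. Keying by length first means hashes are compared only between functions of equal length.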
