  1. Motivation
     • Underlying question: How does software change?
       • In: Two versions of a program
       • Out: Picture of changes
     • Relevance
       • Software development
       • Software engineering
     Presentation: Understanding Source Code Evolution Using Abstract Syntax Tree Matching

  2. Objective and Approach
     • Summarize C program changes
       • Functions (body AST, prototype)
       • Global variables (type and initializer)
       • Types
         • Structs/Unions (fields deleted / added / type changed)
         • Typedefs
         • Enums
     • Our approach: AST matching
       • Accurate; handles renamings
       • Scales to real-world applications, e.g., Apache, Linux kernel, OpenSSH

  3. Raw Output (Linux 2.4.20 vs 2.4.21)
     struct "net_device": 1 fields changed type: "accept_fastpath"
     struct "reiserfs_journal": 1 fields deleted: "j_dummy_inode"
     struct "reiserfs_journal": 1 fields added: "j_dirty_buffers"
     function "block_read_full_page": 1 arguments changed type: "get_block"
     function "ext2_readdir": 1 arguments changed type: "filldir___0"
     + function "inetdev_changename"
     + function "__ide_dma_good_drive"
     + function "ide_unplugged_outbsync"
     + function "inode_init_once"
     - function "target_cpus"
     - function "ide_dmafunc_verbose"
     + typedef "cisco_proto"
     - typedef "ide_ioctl_proc"
     + global var "idecd"

  4. The Renaming Problem
     Same program, syntactic changes only

     Version 1                Version 2
     typedef int sz_t;        typedef int size_t;
     struct foo {             struct bar {
       int i;                   int i;
     };                       };
     int count;               int counter;
     void f(int a) {          void f(int b) {
       struct foo sf;           struct bar sb;
       sz_t c = 2;              size_t d = 2;
       sf.i = a + c;            sb.i = b + d;
       count++;                 counter++;
     }                        }

  5. Abstract Syntax Tree Matching
     Compare ASTs for functions with the same name
     [Figure: pipeline diagram. Program Version 1 and Program Version 2 are each parsed into an AST (AST 1, AST 2). AST Matching then runs two traversals, Renaming Map Generation followed by Change Detection, and produces Changes & Statistics.]

  6. AST Traversal - Name Map Generation
     Name Map (Version 1 -> Version 2):
       f     -> f
       a     -> b
       sf    -> sb
       c     -> d
       count -> counter
     Matched AST nodes:
       c = 2         <->  d = 2
       sf.i = a + c  <->  sb.i = b + d
       count++       <->  counter++

     Version 1                Version 2
     void f(int a) {          void f(int b) {
       struct foo sf;           struct bar sb;
       sz_t c = 2;              size_t d = 2;
       sf.i = a + c;            sb.i = b + d;
       count++;                 counter++;
     }                        }
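
The lockstep traversal on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple-based AST encoding, the node kinds (`func`, `decl_init`, `postinc`, and so on), and the function name `build_name_map` are all hypothetical and are not taken from the tool or from CIL. The sketch also keeps only the first candidate seen for each name, which is a simplification.

```python
# Hypothetical toy AST: an identifier is ("id", name); any other node is
# (kind, child, child, ...). Leaves such as 2 or "i" are plain values.

def build_name_map(old, new, name_map=None):
    """Walk two ASTs in lockstep and record old-name -> new-name pairs."""
    if name_map is None:
        name_map = {}
    # Plain leaves (constants, field names) carry no renameable identifier.
    if not (isinstance(old, tuple) and isinstance(new, tuple)):
        return name_map
    # If the shapes diverge, stop matching this subtree.
    if old[0] != new[0] or len(old) != len(new):
        return name_map
    if old[0] == "id":
        # Simplification: keep the first candidate seen for each old name.
        name_map.setdefault(old[1], new[1])
        return name_map
    for old_child, new_child in zip(old[1:], new[1:]):
        build_name_map(old_child, new_child, name_map)
    return name_map


# Function f from the slide, Version 1 and Version 2.
F_V1 = ("func", ("id", "f"), ("params", ("id", "a")),
        ("decl", ("id", "sf")),
        ("decl_init", ("id", "c"), ("const", 2)),
        ("assign", ("field", ("id", "sf"), "i"),
                   ("add", ("id", "a"), ("id", "c"))),
        ("postinc", ("id", "count")))
F_V2 = ("func", ("id", "f"), ("params", ("id", "b")),
        ("decl", ("id", "sb")),
        ("decl_init", ("id", "d"), ("const", 2)),
        ("assign", ("field", ("id", "sb"), "i"),
                   ("add", ("id", "b"), ("id", "d"))),
        ("postinc", ("id", "counter")))

NAME_MAP = build_name_map(F_V1, F_V2)
print(NAME_MAP)
```

Stopping at the first shape mismatch keeps the sketch conservative: names are paired only where the two trees still agree structurally.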

  7. AST Traversal - Type Map Generation
     Type Map (Version 1 -> Version 2):
       int        -> int
       struct foo -> struct bar
       sz_t       -> size_t
     Matched declarations:
       a : int          <->  b : int
       sf : struct foo  <->  sb : struct bar
       c : sz_t         <->  d : size_t

     Version 1                Version 2
     void f(int a) {          void f(int b) {
       struct foo sf;           struct bar sb;
       sz_t c = 2;              size_t d = 2;
       sf.i = a + c;            sb.i = b + d;
       count++;                 counter++;
     }                        }
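
One way to picture how the type map could follow from matched declarations is the Python sketch below. It assumes that the declarations of each version are available as name-to-type dictionaries and that a name map has already been computed. The function name `build_type_map` and this data layout are hypothetical, chosen for illustration rather than taken from the tool.

```python
def build_type_map(old_decls, new_decls, name_map):
    """Pair the declared types of variables that the name map matches."""
    type_map = {}
    for old_name, old_type in old_decls.items():
        new_name = name_map.get(old_name)
        if new_name in new_decls:
            # Simplification: keep the first candidate seen for each type.
            type_map.setdefault(old_type, new_decls[new_name])
    return type_map


# Declarations inside f, taken from the slide.
OLD_DECLS = {"a": "int", "sf": "struct foo", "c": "sz_t"}
NEW_DECLS = {"b": "int", "sb": "struct bar", "d": "size_t"}
NAME_MAP = {"a": "b", "sf": "sb", "c": "d"}

TYPE_MAP = build_type_map(OLD_DECLS, NEW_DECLS, NAME_MAP)
print(TYPE_MAP)
```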

  8. Abstract Syntax Tree Matching
     • Name/Type Maps -> Name/Type Bijections
     • Traverse the ASTs in parallel, computing changes
     [Figure: pipeline diagram. Program Version 1 and Program Version 2 are each parsed into an AST (AST 1, AST 2). AST Matching then runs two traversals, Renaming Map Generation followed by Change Detection, and produces Changes & Statistics.]
     A renamed to B iff
     • A -> B in the map
     • A deleted
     • B added
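
The three-part renaming rule on this slide can be written out directly. The Python sketch below is a minimal illustration, assuming each version's global names are given as sets and the name map as a dictionary. The function name `detect_renames` and the extra names `helper` and `log_init` are hypothetical. Refusing to let two old names claim the same new name is one simple way to keep the result one-to-one, and is an assumption of this sketch.

```python
def detect_renames(old_names, new_names, name_map):
    """Apply the rule: A renamed to B iff A -> B in map, A deleted, B added."""
    deleted = old_names - new_names   # present only in Version 1
    added = new_names - old_names     # present only in Version 2
    renames = {}
    claimed = set()                   # new names already used by a rename
    for a, b in name_map.items():
        if a in deleted and b in added and b not in claimed:
            renames[a] = b
            claimed.add(b)
    # Whatever is not explained by a rename is a true deletion or addition.
    return renames, deleted - set(renames), added - claimed


OLD_NAMES = {"f", "count", "helper"}       # "helper" is a made-up name
NEW_NAMES = {"f", "counter", "log_init"}   # "log_init" is a made-up name
NAME_MAP = {"f": "f", "count": "counter"}

RENAMES, DELETED, ADDED = detect_renames(OLD_NAMES, NEW_NAMES, NAME_MAP)
print(RENAMES, DELETED, ADDED)
```

Here `count` is reported as renamed to `counter`, while `helper` and `log_init` remain a plain deletion and a plain addition because the map gives no evidence linking them.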

  9. AST Traversal - Change Detection
     struct foo : struct foo
       i : int   <->  i : long long
       f : sz_t  <->  f : size_t    (sz_t -> size_t)
                      e : double
     Changes reported for struct foo:
       field i changed type: int -> long long
       field e added

     Version 1                Version 2
     typedef int sz_t;        typedef int size_t;
     struct foo {             struct foo {
       int i;                   long long i;
       sz_t f;                  size_t f;
     };                         double e;
                              };
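
The struct comparison on this slide can be sketched as follows. The Python code assumes each struct is given as an ordered field-name-to-type dictionary and that the type bijection is available as a dictionary of renamed types. The function name `diff_struct` and this data layout are hypothetical. The message strings mirror the wording shown on the slide.

```python
def diff_struct(old_fields, new_fields, type_renames):
    """Report field-level changes, treating renamed types as unchanged."""
    changes = []
    for name, old_type in old_fields.items():
        if name not in new_fields:
            changes.append(f"field {name} deleted")
            continue
        new_type = new_fields[name]
        # Translate the old type through the bijection before comparing.
        if type_renames.get(old_type, old_type) != new_type:
            changes.append(
                f"field {name} changed type: {old_type} -> {new_type}")
    for name in new_fields:
        if name not in old_fields:
            changes.append(f"field {name} added")
    return changes


OLD_FOO = {"i": "int", "f": "sz_t"}
NEW_FOO = {"i": "long long", "f": "size_t", "e": "double"}
TYPE_RENAMES = {"sz_t": "size_t"}

CHANGES = diff_struct(OLD_FOO, NEW_FOO, TYPE_RENAMES)
for line in CHANGES:
    print(line)
```

Field `f` produces no report because `sz_t -> size_t` is a known type renaming, which is the point of computing the bijections before detecting changes.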

  10. Implementation
      • Parsing via CIL toolkit
        • Merges whole program into single, preprocessed file
      • Fast
        • Scales linearly, 400,000 LOC in 1 minute
      • Generates different output formats
        • Raw differences, summaries, density trees

  11. Summary Statistics (Linux 2.4.20 vs 2.4.21)
      ------- Functions -------
      Version1                    : 7697
      Version2                    : 7881
      added                       : 232
      deleted                     : 48
      locals/formals changed name : 3
      arguments type changes      : 19
      return types changes        : 15
      ------- Structs/Unions -------
      Version1                    : 1214
      Version2                    : 1233
      added                       : 17
      deleted                     : 1
      field type changes          : 15
      field count changes         : 19
      ------- Typedefs -------
      Version1                    : 487
      Version2                    : 469
      added                       : 13
      deleted                     : 31
      base type changes           : 2
      ------- Global Variables -------
      Version1                    : 8027
      Version2                    : 8074
      added                       : 43
      deleted                     : 16
      var type changes            : 11
      var exp changes             : 20
      var val changes             : 51
      ------- Enums -------
      Version1                    : 33
      Version2                    : 31
      deleted                     : 2
      item count changes          : 1

  12. Density Trees
      Struct/Union field additions, Linux 2.4.20 vs 2.4.21
      / : 111
        include/ : 101
          linux/ : 96
            fs.h : 4
            ide.h : 80
            reiserfs_fs_sb.h : 1
            reiserfs_fs_i.h : 2
            sched.h : 1
            wireless.h : 1
            hdreg.h : 7
          net/ : 2
            tcp.h : 1
            sock.h : 1
          asm-i386/ : 3
            io_apic.h : 3
        drivers/ : 9
          char/ : 1
            agp/ : 1
              agp.h : 1
          ide/ : 8
            ide-pci.c : 8
        net/ : 1
          ipv4/ : 1
            ip_fragment.c : 1
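
A density tree of this kind is a per-file change count rolled up along the directory hierarchy. The Python sketch below reproduces the totals on this slide from the leaf counts. It is only an illustration: the function names `density_tree` and `render` and the flat path-to-count input are assumptions, not the tool's actual interface.

```python
from collections import defaultdict


def density_tree(file_counts):
    """Add each file's count to the file itself and every parent directory."""
    totals = defaultdict(int)
    for path, count in file_counts.items():
        parts = path.split("/")
        totals["/"] += count
        for depth in range(1, len(parts) + 1):
            prefix = "/".join(parts[:depth])
            if depth < len(parts):
                prefix += "/"          # mark directories with a trailing slash
            totals[prefix] += count
    return dict(totals)


def render(totals):
    """Print the tree with two spaces of indentation per directory level."""
    lines = []
    for path in sorted(totals):
        if path == "/":
            depth, label = 0, "/"
        else:
            trimmed = path.rstrip("/")
            depth = trimmed.count("/") + 1
            label = trimmed.split("/")[-1] + ("/" if path.endswith("/") else "")
        lines.append("  " * depth + f"{label} : {totals[path]}")
    return "\n".join(lines)


# Leaf counts copied from the slide (struct/union field additions).
FIELD_ADDITIONS = {
    "include/linux/fs.h": 4, "include/linux/ide.h": 80,
    "include/linux/reiserfs_fs_sb.h": 1, "include/linux/reiserfs_fs_i.h": 2,
    "include/linux/sched.h": 1, "include/linux/wireless.h": 1,
    "include/linux/hdreg.h": 7, "include/net/tcp.h": 1,
    "include/net/sock.h": 1, "include/asm-i386/io_apic.h": 3,
    "drivers/char/agp/agp.h": 1, "drivers/ide/ide-pci.c": 8,
    "net/ipv4/ip_fragment.c": 1,
}

TOTALS = density_tree(FIELD_ADDITIONS)
print(render(TOTALS))
```

The sketch lists entries in sorted path order rather than in the order shown on the slide. The rolled-up numbers are what matter: they show at a glance that most field additions sit under `include/linux/`, driven by `ide.h`.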

  13. Case Studies: OpenSSH, Vsftpd, Apache
      Functions & global variables: how often added and deleted?
      • OpenSSH changes most frequently
      • Deletions infrequent, relative to additions

  14. Case Studies: OpenSSH, Vsftpd, Apache
      How often do function bodies and prototypes change?
      • Function bodies do change a lot
      • Function prototypes do not change much

  15. Related Approaches
      • Standard diff
        • Low-level
        • Verbose: Linux 2.4.20 -> 2.4.21 patch: 21 MB
      • Release notes
        • High level
        • Possibly incomplete

  16. Summary
      • Approach for reporting changes to C programs
        • AST-matching
        • Variety of changes at several levels of detail
        • Accurate
        • Scalable
      • Soon to be available at http://www.cs.umd.edu/~neamtiu/evolution
