Semantic Slicing of Software Version Histories


slide-1
SLIDE 1

Semantic Slicing of Software Version Histories

Yi Li / U Toronto
Julia Rubin / MIT
Marsha Chechik / U Toronto

ASE 2015 / Lincoln, NE

slide-2
SLIDE 2

Motivation

History between two releases:

  • v1.3.6 (Nov, 2014): release [1.3.6]
  • v1.3.8 (Feb, 2015): release [1.3.8]

Commit messages shown for this range:

  • make ’groovy.sandbox.blacklist’ append-only
  • avoid NullPointerException if optional Groovy jar is removed
  • updated docs to use v1.3.6 as current
  • prepare for next development iteration (1.3.7-SNAPSHOT)
  • make groovy sandbox method blacklist dynamically additive
  • …

In total: 30 authors, 67 commits, 87 files changed.


slide-10
SLIDE 10

Why is it so hard?

Options?

  1. Pick target commits
  2. Pick the whole history
  3. Manually identify necessary commits

Existing version control tools:

  • Code is treated as plain text
  • Do not understand the semantics
  • User-provided semantic/logical grouping is inaccurate!

Example change, shown as a textual diff:

    // comment
    int boo1() {
  - {return 0;}
  + {return (new Bar()).y;}
    }
    class Bar {
  +   int y = 0;
      static int bar1(int x) {return x - 1;}

slide-11
SLIDE 11

What can we do?

Exploit existing artifacts:

  • Strictly structured data
  • Well-defined language syntax and semantics
  • Carefully designed test suites


slide-13
SLIDE 13

Solution: Semantic Slicing

History (sequence of commits) + Criterion (set of tests) yields a Sub-history that is:

  • well-formed: it compiles
  • semantics-preserving: it passes the tests
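The definition above can be read as a check on a candidate sub-history. The following is a minimal sketch, not CSlicer's API: commits are modeled as plain strings, and the `compiles` and `passesTests` predicates are hypothetical stand-ins that a caller would back with a real build and test run.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper (not part of CSlicer): checks the two slice properties
// for a candidate sub-history.
class SliceCheck {
    // True if `sub` contains only commits of `history`, in the same relative order.
    static boolean isSubHistory(List<String> history, List<String> sub) {
        int next = 0;
        for (String commit : history) {
            if (next < sub.size() && sub.get(next).equals(commit)) {
                next++;
            }
        }
        return next == sub.size();
    }

    // A semantic slice is a sub-history that is well-formed (compiles)
    // and semantics-preserving (passes the criterion tests).
    static boolean isSemanticSlice(List<String> history, List<String> sub,
                                   Predicate<List<String>> compiles,
                                   Predicate<List<String>> passesTests) {
        return isSubHistory(history, sub) && compiles.test(sub) && passesTests.test(sub);
    }

    public static void main(String[] args) {
        List<String> history = Arrays.asList("c1", "c2", "c3", "c4", "c5");
        List<String> candidate = Arrays.asList("c2", "c4", "c5");
        // Stand-in oracles chosen only for illustration.
        Predicate<List<String>> compiles = s -> !s.contains("c5") || s.contains("c4");
        Predicate<List<String>> passesTests = s -> s.contains("c2") && s.contains("c5");
        System.out.println(isSemanticSlice(history, candidate, compiles, passesTests));
    }
}
```

In practice the two predicates dominate the cost, which is why the approach computes dependencies up front instead of building and testing many candidate sub-histories.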

slide-14
SLIDE 14

Outline

  1. Introduction
  2. Dependency Hierarchy
  3. CSlicer Algorithm
  4. Evaluation
  5. Related Work & Conclusion

slide-15
SLIDE 15

Running Example

v1.0:

  class A {
    int g() {return 0;}
  }
  class B {
    static int f(int x) {return x + 1;}
  }


slide-20
SLIDE 20

Running Example

v1.1 (v1.0 plus commits C1 to C5):

  class A {
    int x;
    int g() {return B.f(x);}
    // comment
    int h() {return (new B()).y;}
  }
  class B {
    int y = 0;
    static int f(int x) {return x - 1;}
  }

Commits between v1.0 and v1.1:

C1:

    class A {
  +   // comment
      int g() {return 0;}

C2:

    class A {
      static int f(int x) {
  -   {return x + 1;}
  +   {return x - 1;}
    }

C3:

      // comment
      int g() {
  -   {return 0;}
  +   {return (new B()).y;}
    }
    class B {
  +   int y = 0;
      static int f(int x) {return x - 1;}

C4:

    class A {
  +   int x;
      // comment
      int g()

C5:

    class A {
      int x;
  +   int g()
  +   {return B.f(x);}
      // comment
      int h()

Test case (a.g() == -1):

  class TestA {
    public void t1() {
      A a = new A();
      assertEquals(-1, a.g());
    }
  }

slide-21
SLIDE 21

Running Example

The same history with C3 left out: h() still returns 0 and B has no field y, yet the test case a.g() == -1 still passes.

  class A {
    int x;
    int g() {return B.f(x);}
    // comment
    int h() {return 0;}
  }
  class B {
    static int f(int x) {return x - 1;}
  }
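The program on this slide can be run as is to confirm that the test still holds without C3. The sketch below wraps the two classes in a hypothetical outer class `SlicedExample` and replaces the JUnit-style `assertEquals` with a plain check so that it has no dependencies; the commit annotations in the comments follow the diffs shown for C1 to C5.

```java
// Running example with C1, C2, C4, and C5 applied and C3 left out.
class SlicedExample {
    static class A {
        int x;                       // added by C4; an int field defaults to 0
        int g() { return B.f(x); }   // added by C5
        // comment                   (added by C1)
        int h() { return 0; }        // unchanged because C3 is not applied
    }

    static class B {
        static int f(int x) { return x - 1; }  // updated by C2
    }

    public static void main(String[] args) {
        A a = new A();
        // Mirrors test t1 from the slides: a.g() == -1
        if (a.g() != -1) {
            throw new AssertionError("expected -1 but got " + a.g());
        }
        System.out.println("t1 passed");
    }
}
```

Dropping C2 instead would make `f` return `x + 1` and the test would fail, and dropping C4 would leave `x` undeclared, so `g()` would no longer compile.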


slide-26
SLIDE 26

Dependency Hierarchy

Dependency types, with definitions and examples:

  • Functional (functional core; correctness): required for maintaining the semantic behaviours (e.g., pass the same tests). Example, C2:

        class A {
          static int f(int x) {
      -   {return x + 1;}
      +   {return x - 1;}
        }

  • Compilation (structural glue code; well-formedness): required for maintaining the well-formedness of the program (e.g., free from compilation errors). Example, C4:

        class A {
      +   int x;
          // comment
          int g()

  • Hunk (textual contexts; applicability): specific to text-based version control systems (e.g., Git). Example, C1:

        class A {
      +   // comment
          int g() {return 0;}

slide-32
SLIDE 32

CSlicer Overview

[Figure: CSlicer pipeline. The history H = p0 … pk and the tests T = {t1, …, tm} feed "Compute Func. Set" (Λ) and "Compute Comp. Set" (Π); "AST Diff" of pi-1 and pi produces ∆i; "Slicing" turns each ∆i into ∆i’, giving H’ = <∆1’, …, ∆k’>.]

Input:

  • H = p0 … pk well-formed
  • T = {t1, …, tm} tests for pk

Slicing core:

  • FUNC set: Λ
  • COMP set: Π
  • Slicer(Λ, Π, ∆i) = ∆i’

Output:

  • H’ = <∆1’, …, ∆k’> slice

Steps:

  1. AST differencing
  2. Compute Functional set
  3. Compute Compilation set
  4. Changeset Slicing
slide-33
SLIDE 33

Language Model

Simplified language model:

  • Featherweight Java [Igarashi et al., ACM TOPLAS’01]
  • Core object-oriented features and type system
  • No reflection, abstract class, etc.
  • Advanced Java features can be handled as algorithmic extensions


slide-37
SLIDE 37

AST Differencing

Compare two abstract syntax trees:

  • Ignore cosmetic changes; match on unique names
  • Focus on structural nodes (class, method, and field)
  • Structural differencing [Fluri et al., IEEE TSE’07]

Edit Operations:

  + Ins((x,n,v),y)
  - Del(x)
  * Upd(x,v)

Example: ∆i = Ins(y:int,B), Upd(A.f(int))

[Figure: AST of Pi-1 with nodes foo, B, A, h(), f(int), and AST of Pi with nodes foo, B, A, y:int, h(), f(int).]
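The idea of matching structural nodes on unique names and ignoring cosmetic changes can be sketched as follows. This is a deliberately simplified illustration, not the structural differencing algorithm of Fluri et al.: each program version is assumed to be a map from an entity's unique name to its body text, and the class name `EntityDiff`, the method bodies, and the textual form of the operations are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified name-based diff over structural entities (classes, methods, fields).
class EntityDiff {
    // Collapses whitespace so purely cosmetic edits do not count as changes.
    static String normalize(String body) {
        return body.trim().replaceAll("\\s+", " ");
    }

    // Returns Ins/Del/Upd operations that turn `before` into `after`.
    static List<String> diff(Map<String, String> before, Map<String, String> after) {
        List<String> ops = new ArrayList<>();
        for (Map.Entry<String, String> e : after.entrySet()) {
            String old = before.get(e.getKey());
            if (old == null) {
                ops.add("Ins(" + e.getKey() + ")");
            } else if (!normalize(old).equals(normalize(e.getValue()))) {
                ops.add("Upd(" + e.getKey() + ")");
            }
        }
        for (String name : before.keySet()) {
            if (!after.containsKey(name)) {
                ops.add("Del(" + name + ")");
            }
        }
        return ops;
    }

    // Example loosely following the slide: a field is inserted and f(int) is updated.
    static List<String> example() {
        Map<String, String> before = new LinkedHashMap<>();
        before.put("A.h()", "return 0;");
        before.put("A.f(int)", "return x + 1;");

        Map<String, String> after = new LinkedHashMap<>();
        after.put("A.h()", "return   0;");        // cosmetic change only
        after.put("A.f(int)", "return x - 1;");   // real update
        after.put("B.y:int", "0");                // new field
        return diff(before, after);
    }

    public static void main(String[] args) {
        System.out.println(example()); // [Upd(A.f(int)), Ins(B.y:int)]
    }
}
```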


slide-42
SLIDE 42

Compute Functional Set

Functional Set:

  • Nodes directly traversed during test execution
  • Dynamic analysis
  • Ensure functional correctness

Pk:

  class A {
    int x;
    int g() {return B.f(x);}
    // comment
    int h() {return (new B()).y;}
  }
  class B {
    int y = 0;
    static int f(int x) {return x - 1;}
  }

Test case: a.g() == -1

[Figure: AST of Pk with nodes foo, B, A, x:int, y:int, h(), g(), f(int).]
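One way to picture the dynamic analysis is to record which methods run while the test executes. The sketch below instruments the running example by hand; a real tool would rely on a coverage or tracing facility instead. The class name `FuncSetDemo` and the `TRAVERSED` set are hypothetical, and only methods are recorded here; whether fields read at run time also belong to the set is left open.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hand-instrumented copy of Pk: every method records its own name when executed.
class FuncSetDemo {
    static final Set<String> TRAVERSED = new LinkedHashSet<>();

    static class A {
        int x;
        int g() { TRAVERSED.add("A.g()"); return B.f(x); }
        // comment
        int h() { TRAVERSED.add("A.h()"); return (new B()).y; }
    }

    static class B {
        int y = 0;
        static int f(int x) { TRAVERSED.add("B.f(int)"); return x - 1; }
    }

    // Runs the test a.g() == -1 and returns the methods it traversed.
    static Set<String> functionalSet() {
        TRAVERSED.clear();
        A a = new A();
        if (a.g() != -1) {
            throw new AssertionError("test t1 failed");
        }
        return new LinkedHashSet<>(TRAVERSED);
    }

    public static void main(String[] args) {
        System.out.println(functionalSet()); // [A.g(), B.f(int)]
    }
}
```

`A.h()` never appears in the result because the test does not reach it, which is what later allows changes to `h()` to be dropped.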


slide-46
SLIDE 46

Compute Compilation Set

Compilation Set:

  • Nodes referenced by the functional set
  • Static analysis
  • Ensure type safety

Inference Rules:

  • Enclosing classes should exist
  • Accessed fields should exist
  • etc.

Pk:

  class A {
    int x;
    int g() {return B.f(x);}
    // comment
    int h() {return (new B()).y;}
  }
  class B {
    int y = 0;
    static int f(int x) {return x - 1;}
  }

[Figure: AST of Pk with nodes foo, B, A, x:int, y:int, h(), g(), f(int).]

slide-47
SLIDE 47

Compute Compilation Set

Inference Rules:

  • Based on [Kästner & Apel, ASE’08]
  • Tailored for method-field level granularity
  • Complete for our language model
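The compilation set can be pictured as a closure over "references" edges starting from the functional set. The sketch below is an illustration under assumptions, not the inference rules themselves: the reference map is written by hand for the running example (enclosing classes and accessed fields become edges), and `CompSetDemo` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CompSetDemo {
    // Entities reachable from the functional set via reference edges,
    // excluding the functional set itself.
    static Set<String> compilationSet(Set<String> funcSet, Map<String, List<String>> references) {
        Set<String> seen = new LinkedHashSet<>(funcSet);
        Deque<String> work = new ArrayDeque<>(funcSet);
        Set<String> comp = new LinkedHashSet<>();
        while (!work.isEmpty()) {
            String entity = work.pop();
            for (String ref : references.getOrDefault(entity, Collections.<String>emptyList())) {
                if (seen.add(ref)) {   // first time this entity is reached
                    comp.add(ref);
                    work.push(ref);
                }
            }
        }
        return comp;
    }

    // Hand-written reference edges for the running example (an assumption).
    static Set<String> example() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("A.g()", Arrays.asList("A", "A.x", "B.f(int)")); // enclosing class, accessed field, callee
        refs.put("B.f(int)", Arrays.asList("B"));                 // enclosing class
        refs.put("A.x", Arrays.asList("A"));                      // enclosing class
        Set<String> func = new LinkedHashSet<>(Arrays.asList("A.g()", "B.f(int)"));
        return compilationSet(func, refs);
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

For this input the result contains `A`, `A.x`, and `B`: the declarations that must exist for `A.g()` and `B.f(int)` to type-check, even though their own bodies are not exercised by the test.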
slide-50
SLIDE 50

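Changeset Slicing

Change Matrix: maps atomic changes to commits

  • Cells are marked by change types (+ Ins, - Del, * Upd)
  • Atomic changes are color coded (Functional, Compilation)

[Figure: change matrix for the running example over the atomic changes A.g(), B.y, A.h(), // comment, B.f(int), A.x and the commits C1 to C5. As implied by the commit diffs, the marked cells are:]

  C1: + // comment
  C2: * B.f(int)
  C3: * A.h(), + B.y
  C4: + A.x
  C5: + A.g()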

slide-51
SLIDE 51

Changeset Slicing

[Figure: change matrix over atomic changes δ1 to δ5 and commits C1 to C5; cells are marked + (Ins), - (Del), or * (Upd) and colored as Functional or Compilation.]

General Slicing Rules:

  • Keep blue cells
  • Keep purple +, -
  • Drop white cells unless they affect method lookup
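The slicing rules can be sketched as a filter over atomic changes. The code below assumes that blue cells are changes to entities in the functional set and purple cells are changes to entities in the compilation set (a reading based on the legend order, not stated explicitly on the slide). The method-lookup exception for white cells is not modeled, and `ChangeSlicer` and its types are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ChangeSlicer {
    enum Kind {
        INS("Ins"), DEL("Del"), UPD("Upd");
        final String label;
        Kind(String label) { this.label = label; }
    }

    static final class Change {
        final Kind kind;
        final String entity;
        Change(Kind kind, String entity) { this.kind = kind; this.entity = entity; }
        @Override public String toString() { return kind.label + "(" + entity + ")"; }
    }

    static List<Change> slice(List<Change> changes, Set<String> func, Set<String> comp) {
        List<Change> kept = new ArrayList<>();
        for (Change c : changes) {
            if (func.contains(c.entity)) {
                kept.add(c);                              // functional: keep every change
            } else if (comp.contains(c.entity) && c.kind != Kind.UPD) {
                kept.add(c);                              // compilation: keep Ins and Del only
            }
            // everything else is dropped (method-lookup exception not modeled)
        }
        return kept;
    }

    // Atomic changes of the running example, in commit order C1..C5.
    static List<String> example() {
        List<Change> history = Arrays.asList(
            new Change(Kind.INS, "// comment"),   // C1
            new Change(Kind.UPD, "B.f(int)"),     // C2
            new Change(Kind.UPD, "A.h()"),        // C3
            new Change(Kind.INS, "B.y"),          // C3
            new Change(Kind.INS, "A.x"),          // C4
            new Change(Kind.INS, "A.g()"));       // C5
        Set<String> func = new HashSet<>(Arrays.asList("A.g()", "B.f(int)"));
        Set<String> comp = new HashSet<>(Arrays.asList("A", "B", "A.x"));
        List<String> kept = new ArrayList<>();
        for (Change c : slice(history, func, comp)) {
            kept.add(c.toString());
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(example()); // [Upd(B.f(int)), Ins(A.x), Ins(A.g())]
    }
}
```

Updates to compilation-only entities are dropped because only the declaration has to exist for type checking; the body in use never runs under the chosen tests.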


slide-55
SLIDE 55

Changeset Slicing

Side-effects (Git):

  • Keeping original commit
  • Dependencies between white cells
  • Detection and resolution

[Figure: two change matrices over atomic changes δ1 to δ5 and commits C1 to C5; the second also contains a commit labeled CN.]


slide-57
SLIDE 57

Hunk Dependency

[Figure: the CSlicer pipeline (history H = p0 … pk, tests T = {t1, …, tm}, Compute Func. Set, Compute Comp. Set, AST Diff, Slicing) extended with a final "Hunk Dependency" stage applied to H’.]

  • HunkDeps(H’): specific to text-based version control
  • Result: an applicable history slice

slide-59
SLIDE 59

Evaluation

Research questions

  • Accuracy: do we find what we want?
  • Effectiveness: reduction rate?
  • Efficiency: performance w.r.t. project scale & history length?

Subjects

  • Advanced Java features not tested: abstract class, reflection, etc.
  • Non-Java changes are included by default

  Project         # Java Files   LOC      # Authors
  Hadoop          5,861          1,291K   169
  Elasticsearch   3,865          616K     649
  Maven           1,048          142K     78
  CSlicer         141            18K      2


slide-62
SLIDE 62

Accuracy

[Figure: commit graph with a trunk branch and a feature branch.]

Feature branch

  • Merges with the main branch periodically
  • 42 feature commits + 47 merges
  • 58 accompanying test cases

Case Study:

  • Separate feature changes
  • Identified 65 out of 267 commits related to the feature
  • 41 match the original


slide-65
SLIDE 65

Effectiveness

Average Reduction: ~80%!

Reduction depends on:

  1. test complexity
  2. committing style

[Figure: bar chart of length(slice) / length(history), axis from 0.0% to 90.0%, for Hadoop 1-3, Elastic 1-3, Maven 1-3, and CSlicer 1-3; series SLICE(H') and HUNK; annotations "Large test suites" and "Good committing style".]

slide-66
SLIDE 66

Performance

  • Total CSlicer time: 2 to 65 s
  • Major part spent in functional & compilation set computation
  • History length has little effect on performance for large projects

CSlicer time breakdown: COMP 52%, HUNK 22%, FUNC 17%, SLICE 8%

slide-69
SLIDE 69

Related Work

Change Representation

  • Code change classification [Falleri et al., ASE’14; Chawathe, SIGMOD’96]
  • History granularity transformation [Muslu et al., ASE’15]

Change Impact Analysis

  • Compute affected regression tests [Ren et al., OOPSLA’04]
  • Fault localization [Zhang et al., ICSM’01]

slide-70
SLIDE 70

Conclusion & Future Work

CSlicer: history semantic slicing

  • Filling the gap between text and semantics
  • Adapted to existing version control tools
  • Many interesting applications: history comprehension; functionality transferring …

What’s next?

  • Handle distributed histories
  • Slice integration: the “paste” step

bitbucket.org/liyistc/gitslice

slide-71
SLIDE 71

Questions?

Experiments

Reduction depends on:

  1. test complexity
  2. coding style
  3. committing style

[Figure: bar chart of length(slice) / length(history), axis from 0.0% to 90.0%, for Hadoop 1-3, Elastic 1-3, Maven 1-3, and CSlicer 1-3; series SLICE(H') and HUNK.]