FHistorian: Locating Features in Version Histories

SLIDE 1

FHistorian: Locating Features in Version Histories

Yi Li / UToronto
Chenguang Zhu / UToronto
Julia Rubin / UBC
Marsha Chechik / UToronto
Sep 27, 2017

SLIDE 2

Feature Location

“Feature location is the activity of identifying an initial location in the source code that implements functionality in a software system.”

Dit, B., Revelle, M., Gethers, M. and Poshyvanyk, D. (2013), Feature location in source code: a taxonomy and survey. J. Softw. Evol. and Proc., 25: 53–95. doi:10.1002/smr.567

SLIDE 5

Feature Location for SPLE

  • The “top-down” approach: core assets (features), configurations + feature model, product outputs
  • The “bottom-up” approach: product variants
  • 1. feature implementations (assets): f1, f2, f3, f4
  • 2. feature relationships (feature models) among f1, f2, f3, f4

From “ad-hoc” to “systematic”

SLIDE 12

Feature Location from Product Variants

  • Variant 1: f1, f2, f3
  • Variant 2: f1, f3
  • Variant n-1: f1, f2, f4
  • Variant n: f1, f3, f5

Intersection-based feature location: compare the code elements of the variants to assign code elements to each feature (f1, f2, f3, f4, f5).

What if Variant 1 also has f6 and f7?
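
A minimal sketch of intersection-based feature location on the four variants above. The code elements (`a` to `e`, `x`, `y`) and the helper names are hypothetical, and the rule used here (keep elements shared by every variant with the feature and absent from every variant without it) is one simple reading of the technique, not a specific published algorithm.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A variant is a pair: (features it contains, code elements it contains).
Variant = Tuple[FrozenSet[str], FrozenSet[str]]


def locate_by_intersection(variants: List[Variant]) -> Dict[str, Set[str]]:
    """Map each feature to the elements shared by every variant that has it
    and absent from every variant that lacks it."""
    all_features: Set[str] = set()
    for feats, _ in variants:
        all_features |= feats

    located: Dict[str, Set[str]] = {}
    for feat in sorted(all_features):
        with_feat = [set(elems) for feats, elems in variants if feat in feats]
        without_feat = [elems for feats, elems in variants if feat not in feats]
        candidates = set.intersection(*with_feat)
        for elems in without_feat:
            candidates -= elems
        located[feat] = candidates
    return located


def variant(features: str, elements: str) -> Variant:
    """Hypothetical helper: build a variant from space-separated names."""
    return frozenset(features.split()), frozenset(elements.split())


# Feature sets follow the slide; the code elements are made up.
variants = [
    variant("f1 f2 f3", "a b c"),   # Variant 1
    variant("f1 f3", "a c"),        # Variant 2
    variant("f1 f2 f4", "a b d"),   # Variant n-1
    variant("f1 f3 f5", "a c e"),   # Variant n
]
located = locate_by_intersection(variants)

# What if Variant 1 also has f6 and f7 (with made-up elements x and y)?
variants[0] = variant("f1 f2 f3 f6 f7", "a b c x y")
ambiguous = locate_by_intersection(variants)
```

With only one variant containing f6 and f7, both features are mapped to the same element set, so the intersection cannot tell them apart.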

SLIDE 14

Pitfalls of Intersection-Based Approaches

Intersection-based FL:

  • Only works well with a large number of variants
  • Operates in a static manner
  • Feature labeling has to be exhaustive

Reality:

  • 3~10 products, ~50 features
  • Maintained in version control systems (e.g., Git)

SLIDE 18

Feature Location in Version Histories

New features: {f1, f2, f3, f4}, tests: {t1, t2, t3, t4}

[Figure: commit history on master; commits are mapped to features f1, f2, f3, f4 (feature 1 to feature 4), with tests test 1 to test 4.]

SLIDE 21

History-Based vs. Intersection-Based

History-based dynamic feature location

  • More flexible:
  • 1. Implicit feature labeling: release notes
  • 2. Traceability of evolution information
  • 3. Effective even with limited numbers of variants
  • More accurate:
  • 4. Captures runtime dependencies
  • 5. Focused search space: only considering changes within a history range
  • 6. Generates light-weight feature models

SLIDE 22

Outline

  • 1. Introduction
  • 2. Background
      • Semantics-Preserving History Slice
      • Semantic History Slicing
  • 3. FHistorian
      • FLocate: identifying feature implementations in histories
      • FHGraph: inferring feature relationships
  • 4. Evaluation
  • 5. Conclusion & Future Work

SLIDE 27

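Semantics-Preserving History Slice

(H)istory, (T)ests: H ⊨ T

[Figure: a history H satisfying tests T1, T2; a slice of H that still satisfies T1, T2; and separate slices for T1 and for T2.]

Minimal semantics-preserving slice = feature implementing changes?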
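
A toy sketch of what "a slice of H that still satisfies T" means. Everything here is an assumption made for illustration: commits are modeled as functions over a dictionary state, tests as predicates on the final state, and the slice is found by brute-force removal of one commit at a time. This is not the semantic history slicing technique the talk builds on; it only illustrates the H ⊨ T check and the notion of a slice from which no single commit can be dropped.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

State = Dict[str, int]
Test = Callable[[State], bool]


@dataclass(frozen=True)
class Commit:
    name: str
    apply: Callable[[State], State]  # hypothetical model of a change


def satisfies(history: Sequence[Commit], tests: Sequence[Test]) -> bool:
    """H ⊨ T: replaying the history yields a state that passes every test."""
    state: State = {}
    try:
        for commit in history:
            state = commit.apply(state)
    except KeyError:
        # A commit needed something that an omitted commit would have added.
        return False
    return all(test(state) for test in tests)


def one_minimal_slice(history: Sequence[Commit], tests: Sequence[Test]) -> List[Commit]:
    """Drop commits one at a time until no single commit can be removed."""
    kept = list(history)
    shrunk = True
    while shrunk:
        shrunk = False
        for i in range(len(kept)):
            candidate = kept[:i] + kept[i + 1:]
            if satisfies(candidate, tests):
                kept, shrunk = candidate, True
                break
    return kept


# Made-up history: c2 depends on c1, c3 is independent, c4 changes nothing.
history = [
    Commit("c1", lambda s: {**s, "x": 1}),
    Commit("c2", lambda s: {**s, "y": s["x"] + 1}),
    Commit("c3", lambda s: {**s, "z": 10}),
    Commit("c4", lambda s: dict(s)),
]
t1: Test = lambda s: s.get("y") == 2
t2: Test = lambda s: s.get("z") == 10

slice_t1 = [c.name for c in one_minimal_slice(history, [t1])]
slice_t2 = [c.name for c in one_minimal_slice(history, [t2])]
slice_both = [c.name for c in one_minimal_slice(history, [t1, t2])]
```

In this toy history the slice for T1 keeps c1 and c2, the slice for T2 keeps only c3, and the slice for both tests drops only the no-op commit c4.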

SLIDE 28

Semantic History Slicing

http://www.cs.toronto.edu/~liyi/cslicer [ASE’16]

SLIDE 31

FHistorian = FLocate + FHGraph

  • Input: history H and feature tests Tf1, …, Tfn
  • FLocate: produces the history slices Hf1, …, Hfn
  • FHGraph: produces a light-weight feature model (F, Er, Ed, h)

SLIDE 35

FLocate: Locating Feature Implementations

Based on Definer [ASE’16]

  • For each feature f, find a minimal slice Hf s.t. Hf ⊨ Tf
  • Factoring out other features: f = Hf \ Hf’ for all other f’
  • Hunk minimization (details in paper…)

[Figure: history H with slices Hf1, Hf2, Hf3 and the resulting feature implementations f1, f2, f3.]

SLIDE 38

FHGraph: Inferring Feature Relationships

Light-weight feature model:

  • Depends-on: reflecting runtime dependencies, (f2 → f1) ⇔ (Hf1 ⊆ Hf2)
  • Relates-to: revealing underlying connections, (f2 ↔ f1) ⇔ (Hf1 ∩ Hf2 ≠ ∅)

[Figure: slices Hf1, Hf2, Hf3 and the inferred graph over f1, f2, f3, with two depends-on edges and one relates-to edge.]
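
A minimal sketch of the two relationships computed over sets of commit identifiers. It reads depends-on as slice containment (Hf1 ⊆ Hf2 means f2 depends on f1) and relates-to as a non-empty overlap of slices. Two choices below are assumptions of this sketch rather than something the slides state: empty slices are ignored, and relates-to is reported only for pairs not already linked by depends-on. The commit names `c1` to `c4` and the function name are made up.

```python
from itertools import combinations, permutations
from typing import Dict, FrozenSet, Set, Tuple

Edge = Tuple[str, str]


def infer_relationships(
    slices: Dict[str, FrozenSet[str]],
) -> Tuple[Set[Edge], Set[Edge]]:
    """Return (depends_on, relates_to) edges inferred from feature slices."""
    depends_on: Set[Edge] = set()
    for a, b in permutations(sorted(slices), 2):
        # (a -> b) when the slice of b is contained in the slice of a.
        if slices[b] and slices[b] <= slices[a]:
            depends_on.add((a, b))

    relates_to: Set[Edge] = set()
    for a, b in combinations(sorted(slices), 2):
        linked = (a, b) in depends_on or (b, a) in depends_on
        # (a <-> b) when the slices overlap and no dependency links them.
        if not linked and slices[a] & slices[b]:
            relates_to.add((a, b))
    return depends_on, relates_to


# Hypothetical slices: f2 and f3 both contain the commits of f1.
slices = {
    "f1": frozenset({"c1", "c2"}),
    "f2": frozenset({"c1", "c2", "c3"}),
    "f3": frozenset({"c1", "c2", "c4"}),
}
depends_on, relates_to = infer_relationships(slices)
```

On these made-up slices, f2 and f3 each depend on f1, and f2 relates to f3 because their slices share commits without one containing the other.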

SLIDE 40

Evaluation

FHistorian:

  • Implementation: bitbucket.org/liyistc/gitslice
  • Data set [MSR’17]: github.com/Chenguang-Zhu/DoSC

Research questions:

  • How accurate are the feature location results?
  • Are the inferred feature relationships useful?

SLIDE 44

Evaluation Subjects

Preparing subjects:

  • Take a release history (ideally with JIRA issue tracking)
  • Go through each feature (64)
  • Identify feature tests (36)

[Figure: release notes give the features {f1, f2, …, fn}; each feature, e.g. f1, is paired with its feature tests Tf1.]

SLIDE 46

Results

Comparing with developer annotations:

  • Ground truth: extracted from change logs and release notes (not always perfect)

  • Perfect match on 15/36 features
  • Finding more changes, occasionally missing changes

Reasons for the differences:

  • Conceptual vs. operational
  • Missing minor optimizations: not affecting tests
  • Discovering hidden dependencies

SLIDE 48

Results: Feature Relationships

[Figure: two tables of feature pairs (A, B), one for relates-to and one for depends-on, over COMPRESS 374, COMPRESS 369, COMPRESS 373, COMPRESS 327, COMPRESS 368, COMPRESS 327’, COMPRESS 360 and Seekable; two entries are marked “Hidden feature”.]

SLIDE 49

Conclusion & Future Work

FHistorian: History-based feature location

  • More flexible and more accurate
  • Exploiting version control data
  • Identifying feature implementations dynamically
  • Inferring light-weight feature models

What’s next?

  • Extracting feature meta information automatically
  • Generating richer feature models

SLIDE 50

Questions?

Yi Li
University of Toronto
liyi@cs.toronto.edu
