Concept Location in Source Code
Feature: a requirement that a user can invoke and that has an observable behavior.
Feature Location Impact Analysis
Concept Location
- “… discovering human oriented concepts
and assigning them to their implementation instances within a program …” [Biggerstaff’93]
- Concept location is needed whenever a
change is to be made
- Change requests are most often formulated
in terms of domain concepts
– the programmer must find the locations in the code where a concept such as “paste” is implemented
– this is the starting point of the change
Concept Location = Point of Change
Assumption
- The programmer understands domain
concepts, but not the code
– the knowledge of domain concepts is based
on program use and is easier to acquire
- a user of a word processor learns about cut-and-paste, fonts, and other concepts of the domain
- All domain concepts map onto the
fragments of the code
– finding that fragment is concept location
Partial Comprehension of the Code
- Large programs cannot be completely
comprehended
– programmers seek the minimum essential understanding for the particular software task
– they use an as-needed strategy
– they attempt to understand how certain specific concepts are reflected in the code
Existing Feature Location Work
Software Reconnaissance, SPR, ASDGs, LSI, NLP, Cerberus, PROMESIR, SITIR, SNIAFL, DORA, FCA, SUADE – spanning static, textual, and dynamic analyses
Dit, B., Revelle, M., Gethers, M., and Poshyvanyk, D., “Feature Location in Source Code: A Taxonomy and Survey”, submitted to Journal of Software Maintenance and Evolution: Research and Practice.
Concepts vs. Features vs. Concerns
- Features correspond to user-visible behavior of the system
– e.g., print, open file, copy, paste, etc.
– usually captured by the functional requirements of the system
- All features are concepts but not the other way
around
– e.g., linked list – part of the solution domain, not the problem domain
– dynamic techniques cannot be used to locate such concepts
- Concerns are synonymous with concepts
– aspects = crosscutting concerns
Concept Location as Text Search
- Source code is regarded as text data
- Techniques differ by:
– source code pre-processing
– query/search mechanism
– granularity and structure of the results
Grep-based Concept Location
- Source code is not processed
- Queries are regular expressions (i.e., a
formal language): [hc]at, .at, .*at, [^b]at, ^[hc]at, [hc]at$, etc.
- Search mechanism is regular expression
matching
- Results are unordered lines of text where
the query is matched
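A hedged sketch of this style of search in Java (the file name Editor.java and the query pattern are illustrative, not from the slides):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.regex.Pattern;

    // Grep-style concept location: print every line of a source file that
    // matches a regular-expression query such as "[hc]at".
    public class GrepSearch {
        public static void main(String[] args) throws IOException {
            Pattern query = Pattern.compile("[hc]at");   // the regular-expression query
            for (String line : Files.readAllLines(Paths.get("Editor.java"))) {
                if (query.matcher(line).find()) {        // report unordered matching lines
                    System.out.println(line);
                }
            }
        }
    }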
Grep-based Concept Location in an IDE
How Can We Do Better?
What is Information Retrieval?
- An Information Retrieval (IR) system is
capable of storage, retrieval, and maintenance of information (e.g., text, images, audio, video, and other multimedia objects)
- IR methods: signature files, inversion,
clustering, probabilistic classifiers, vector space models, etc.
What is Text Retrieval?
- TR = IR of textual data
– a.k.a document retrieval
- Basis for internet search engines
- Search space is a collection of documents
- The search engine creates a cache consisting of indexes of each document
– different techniques create different indexes
Terminology
- Document – unit of text – set of words
- Corpus – collection of documents
- Term vs. word – the basic unit of text; not all terms are words
- Query
- Index
- Rank
- Relevance
TR-based Concept Location
- Source code is processed into documents
- Queries are sets of terms/words
- Search mechanism based on the TR
technique used
- Results are documents and are ranked
w.r.t. the query
TR-based Concept Location - Process
1. Creating a corpus of a software system
2. Indexing the corpus with the TR method (we used LSI, Lucene, GDS, LDA)
3. Formulating a query
4. Ranking methods
5. Examining results
6. Go to 3 if needed
Creating a Corpus of a Software System
- Parsing source code and extracting documents
– corpus – collection of documents (e.g., methods)
- Removing non-literals and stop words
– common words in English, standard function library names, programming language keywords
- Preprocessing:
– split_identifiers and SplitIdentifiers
- NLP methods can be applied such as stemming
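A minimal sketch of the preprocessing steps above (the stop-word list is a tiny illustrative subset, not the full list used in the studies):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    // Splits snake_case and CamelCase identifiers into lower-case terms
    // and drops stop words (common English words, language keywords).
    public class Preprocess {
        private static final Set<String> STOP_WORDS =
                new HashSet<>(Arrays.asList("the", "a", "if", "else", "public", "void", "new"));

        static List<String> split(String identifier) {
            List<String> terms = new ArrayList<>();
            for (String part : identifier.split("_")) {  // underscores separate terms
                // split at lower-to-upper transitions and before the last capital of a run
                for (String t : part.split("(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")) {
                    if (!t.isEmpty() && !STOP_WORDS.contains(t.toLowerCase())) {
                        terms.add(t.toLowerCase());
                    }
                }
            }
            return terms;
        }

        public static void main(String[] args) {
            System.out.println(split("IProgressMonitor")); // [i, progress, monitor]
            System.out.println(split("m_iFlag"));          // [m, i, flag]
        }
    }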
Parsing Source Code and Extracting Documents
- Documents can be at different granularities
(e.g., methods, classes, files)
Source Code is Text Too
public void run IProgressMonitor monitor throws InvocationTargetException InterruptedException if m_iFlag processCorpus monitor checkUpdate else if m_iFlag processCorpus monitor UD_UPDATECORPUS else processQueryString monitor if monitor isCancelled throw new InterruptedException the long running
Splitting Identifiers
- IProgressMonitor = i progress monitor
- InvocationTargetException = invocation target exception
- m_iFlag = m i flag
- UD_UPDATECORPUS = ud updatecorpus
Removing Stop Words
public void run IProgressMonitor monitor throws InvocationTargetException InterruptedException if m_iFlag the processCorpus monitor checkUpdate else if m_iFlag processCorpus monitor UD_UPDATECORPUS else a processQueryString monitor if monitor isCancelled throw new InterruptedException the long running
- Common words in English
- Programming language keywords
More Processing
- NLP methods can be used such as
stemming, part of speech tagging, etc.
- Example:
– fishing, fished, fish, fishes, and fisher all reduce to the root word fish
Vector Space Model
- Documents and queries are represented as term vectors
– rows (j): documents; columns (i): terms
– [i, j]: weighted frequency of term i in document j
- Typical weight: TF-IDF (term frequency–inverse document frequency)
- Similarity measure: cosine of the angle between the vectors
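The weighting and similarity formulas did not survive extraction; in standard notation, with tf_{i,j} the frequency of term i in document j, N the number of documents, and df_i the number of documents containing term i:

    w_{i,j} = tf_{i,j} \times \log\frac{N}{df_i}
    \qquad
    \cos(d_j, q) = \frac{\sum_i w_{i,j}\, w_{i,q}}{\sqrt{\sum_i w_{i,j}^2}\,\sqrt{\sum_i w_{i,q}^2}}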
Query and Ranking of Results
- Any unit of text
– one word, one sentence, entire documents, piece of code, change request, etc.
- The query is interpreted as a pseudo-
document and represented in the VSM
- The results are documents, ranked based
on the similarity to the query (pseudo-)
document
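A minimal sketch of this ranking step, assuming documents and the query are already reduced to weight vectors over a shared vocabulary (the vectors are illustrative):

    import java.util.HashMap;
    import java.util.Map;

    // Ranks documents by cosine similarity to the query pseudo-document.
    public class VsmRanking {
        static double cosine(double[] a, double[] b) {
            double dot = 0, na = 0, nb = 0;
            for (int i = 0; i < a.length; i++) {
                dot += a[i] * b[i];
                na += a[i] * a[i];
                nb += b[i] * b[i];
            }
            return dot / (Math.sqrt(na) * Math.sqrt(nb));
        }

        public static void main(String[] args) {
            double[] query = {1, 0, 1};                  // query as a pseudo-document
            Map<String, double[]> docs = new HashMap<>();
            docs.put("m1", new double[]{5, 1, 3});
            docs.put("m2", new double[]{0, 2, 0});
            docs.entrySet().stream()
                .sorted((x, y) -> Double.compare(cosine(y.getValue(), query),
                                                 cosine(x.getValue(), query)))
                .forEach(e -> System.out.printf("%s %.3f%n",
                                                e.getKey(), cosine(e.getValue(), query)));
        }
    }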
Evaluation Measures
- Precision - a measure of exactness or
fidelity
- Recall - a measure of completeness
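In the standard formulation over the sets of relevant and retrieved documents:

    \text{precision} = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{retrieved}|}
    \qquad
    \text{recall} = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{relevant}|}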
Tools: IRiSS, JIRiSS, GES
Textual Feature Location
- Information Retrieval (IR)
– Searching for documents or within docs for relevant information
- First used for feature location by Marcus
et al. in 2004*.
– Latent Semantic Indexing** (LSI)
- Utilized by many existing approaches:
PROMESIR, SITIR, HIPIKAT etc.
* Marcus, A., Sergeyev, A., Rajlich, V., and Maletic, J., "An Information Retrieval Approach to Concept Location in Source Code", in Proc. of Working Conference on Reverse Engineering, 2004, pp. 214-223.
** Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., and Harshman, R., "Indexing by Latent Semantic Analysis", Journal of the American Society for Information Science, vol. 41, no. 6, Jan. 1990, pp. 391-407.
Applying IR to Source Code
- Corpus creation
– Choose granularity
- Preprocessing
– Stop word removal, splitting, stemming
- Indexing
– Term-by-document matrix
– Singular Value Decomposition
- Querying
– User-formulated
- Generate results
– Ranked list
Example – a JUnit method and its corpus document:

    synchronized void print(TestResult result, long runTime) {
        printHeader(runTime);
        printErrors(result);
        printFailures(result);
        printFooter(result);
    }

Document after splitting and stop-word removal:
print test result result run time print header run time print errors result print failure result print footer result
Document after stemming:
print test result result run time print head run time print error result print fail result print foot result
Term-by-document matrix (fragment): the row for m1 holds the weighted frequencies of “print” (5), “test” (1), “result” (3), …
Feature Location with Software Reconnaissance - Dynamic Analysis
(Figure: two method-call traces over org.eclipse.swt.widgets – trace 1, a scenario NOT exercising the feature, contains calls such as readAndDispatch, checkDevice, isDisposed, drawMenuBars, runPopups, filterMessage, windowProc, and WM_TIMER; trace 2, a scenario exercising the feature, additionally contains runAsyncMessages (Display) and removeFirst (Synchronizer))
[Wilde’92]
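A minimal sketch of the reconnaissance comparison, assuming traces are reduced to sets of executed methods (the trace contents are abbreviated from the example above):

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    // Methods executed only by the scenario exercising the feature are
    // reported as candidate feature methods.
    public class Reconnaissance {
        public static void main(String[] args) {
            Set<String> trace1 = new HashSet<>(Arrays.asList(      // feature NOT exercised
                    "Display.readAndDispatch", "Display.checkDevice", "Display.runPopups"));
            Set<String> trace2 = new HashSet<>(Arrays.asList(      // feature exercised
                    "Display.readAndDispatch", "Display.checkDevice", "Display.runPopups",
                    "Display.runAsyncMessages", "Synchronizer.removeFirst"));

            Set<String> candidates = new HashSet<>(trace2);
            candidates.removeAll(trace1);                          // trace 2 minus trace 1
            System.out.println(candidates);
            // e.g. [Display.runAsyncMessages, Synchronizer.removeFirst]
        }
    }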
Feature Location with Scenario-based Probabilistic Ranking
(Figure: the same two traces – trace 1 NOT exercising the feature, trace 2 exercising it; trace 2 additionally contains WM_TIMER on org.eclipse.swt.widgets.ProgressBar, runAsyncMessages on Display, and removeFirst on Synchronizer)
[Antoniol’06]
Highlighted events: drawMenuBars, runPopups (org.eclipse.swt.widgets.Display), runAsyncMessages, removeFirst (org.eclipse.swt.widgets.Synchronizer)
Shortcomings of Dynamic Concept Location
Trace statistics (Min / Max / 25% / Median / 75% / µ / σ):
Eclipse – Methods:        88K / 1.5MM / 312K / 525K / 1MM / 666K / 406K
Eclipse – Unique methods: 1.9K / 9.3K / 3.9K / 5K / 6.3K / 5.1K / 2K
Eclipse – Size (MB):      9.5 / 290 / 55 / 98 / 202 / 124 / 83
Eclipse – Threads:        1 / 26 / 7 / 10 / 12 / 10 / 5
Rhino – Methods:          160K / 12MM / 612K / 909K / 1.8MM / 1.8MM / 2.3MM
Rhino – Unique methods:   777 / 1.1K / 870 / 917 / 943 / 912 / 54
Rhino – Size (MB):        18 / 1,668 / 71 / 104 / 214 / 210 / 273
Rhino – Threads:          1 / 1 / 1 / 1 / 1 / 1
- Execution traces are large even for small
systems
- Selecting multiple scenarios may be
difficult
- Filtering the traces is equally problematic
– the best filtering methods still return hundreds of methods
Probabilistic Ranking Of MEthodS and Information Retrieval (PROMESIR)
(Figure: two method-call traces over org.eclipse.swt.widgets – trace 1 from a scenario NOT exercising the feature, trace 2 from a scenario exercising the feature)
[Poshyvanyk’07]
Scenario-based Probabilistic Ranking
- Collecting Execution Traces
– multiple (ir)relevant scenarios are executed to collect traces
– processor emulation (VALGRIND for C++) to improve the precision of data collection
– bytecode instrumentation (JIKES for Java)
- Knowledge-based filtering to eliminate noisy
events
- Probabilistic ranking
– events are re-weighted (Wilde’s equation is renormalized)
Scenario-based Probabilistic Ranking
- A functionality, two scenarios => Two traces
- Comparison of the two traces to identify the
features related to the functionality
- Dynamic analysis
– traces = sequences of intervals
– intervals = sequences of events (method calls)
– events are (ir)relevant to the feature
Scenario-based Probabilistic Ranking
- Without noise, the comparison is simple
– set operations
- With noise, it is difficult
– imprecise locations of events
– imprecise beginning/end of intervals (C++ multi-threaded programs)
– imprecision of statistical profiling
– feature-relevant events are tangled or lost …
Scenario-based Probabilistic Ranking
- Knowledge-based filtering
– frequent (ir)relevant events
– application-specific events (middleware, code generators, external components …)
- Probabilistic ranking
– F( ) scenarios (not) exercising a feature
– intervals with (ir)relevant events
– Wilde’s relevance index for an event e_i
Combining the Experts
- SPR and LSI – “judgments” of the experts
- SPR – construct overlapping scenarios
- LSI – formulate a query that describes the
features
- Combined judgments:
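The combination formula did not survive extraction; as a hedged sketch, a PROMESIR-style combination is a weighted sum of the two experts’ scores for each method m, with λ expressing the relative confidence in each expert:

    score(m) = \lambda \cdot SPR(m) + (1 - \lambda) \cdot IR(m), \qquad \lambda \in [0, 1]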
Feature Location with PROMESIR
Example of using PROMESIR
- Locating a feature in JEdit
- Feature: “showing white-space as a visible
symbol in the text area”
- Steps:
– run two scenarios
– run the query
– explore the results
Scenario Exercising the Feature in JEdit
Start Tracing
Scenario Exercising the Feature in JEdit
Stop Tracing
Second Scenario NOT Exercising the Feature in JEdit
Start Tracing
Example of using PROMESIR – Results
(Figure: scenario-based probabilistic rankings and IR-based rankings combined into the PROMESIR ranking)
- Number of methods identified by SPR – 284
- Position of the first relevant method according to the IR ranking – 56
- Position of the first relevant method according to PROMESIR – 7
Case Studies
- Locating features associated with bugs:
- Case study objectives:
– Compare PROMESIR with stand-alone feature location approaches: LSI and SPR
           Mozilla   Eclipse
Classes      4,853     7,648
Methods     53,617    89,341
Words       85,439    56,861
SIngle Trace Information Retrieval (SITIR)
(Figure: a single execution trace – readAndDispatch, checkDevice, isDisposed, drawMenuBars, runPopups, filterMessage, windowProc, WM_TIMER over org.eclipse.swt.widgets classes – combined with the source code; only the executed methods are ranked against the query)
Collecting Execution Traces in SITIR
- Java Platform Debugger Architecture (JPDA)1
– Infrastructure to build end-user debugging applications for Java platform
- JPDA highlights:
– the debugger runs on a separate virtual machine
– minimal interference of the tracing tool with the subject program
– separate thread-based traces
– marked traces (start/stop recording)
1http://java.sun.com/javase/technologies/core/toolsapis/jpda/
Combining Structural, Dynamic & Textual Info
(Figure: software artifacts feed parsers, analysis tools, and a tracer producing structural information – AST, call graph, data flow, control flow – and an IR tool (LSI) producing a semantic space of semantic similarities; together these form the system representation)
Data Fusion Example
(Chart: estimated position vs. time – the INS estimate drifts away from the actual position over time, while GPS fixes are scattered around it)
Inertial Navigation System (INS)
+ continuous measurements
+ centimeter accuracy
+ low noise
– drifts over time
Global Positioning System (GPS)
- Discrete measurements
- Meter accuracy
- Noisy
+ No drift
Data Fusion for Feature Location
- Combining information from multiple
sources will yield better results than if the data is used separately
– Previous
- Textual, Dynamic, and Static (i.e., Cerberus)
– Current
- Textual info from IR
- Execution info from dynamic tracing
- Web mining
Web Mining
(Figure: a graph of methods m1–m20 connected by call edges, mined like a web graph)
Web Mining Algorithms
PageRank
– measures the relative importance of a web page
– used by the Google search engine
– a link from X to Y means a vote by X for Y
– a node’s PageRank depends on the number of incoming links and the PageRank of the nodes that link to it
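In the standard formulation, with damping factor d (commonly 0.85), N nodes, In(p) the pages linking to p, and |Out(q)| the number of outgoing links of q:

    PR(p) = \frac{1 - d}{N} + d \sum_{q \in In(p)} \frac{PR(q)}{|Out(q)|}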
Brin, S. and Page, L., "The Anatomy of a Large-Scale Hyper-textual Web Search Engine", in Proc. of 7th International Conference on World Wide Web, Brisbane, Australia, 1998, pp. 107-117.
Web Mining Algorithms
HITS
– Hyperlink-Induced Topic Search
– identifies hub and authority pages
– hubs point to many good authorities
– authorities are pointed to by many hubs
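In the standard formulation, hub and authority scores are iterated to a fixed point, normalizing after each step:

    a(p) = \sum_{q \to p} h(q) \qquad h(p) = \sum_{p \to q} a(q)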
Kleinberg, J. M., "Authoritative sources in a hyperlinked environment", Journal of the ACM, vol. 46, no. 5, 1999, pp. 604-632.
(Figure: hubs point to authorities; the method graph m1–m20 annotated with normalized edge weights such as 1/6, 1/4, 1/2)
Probabilistic Program Dependence Graph*
PPDG
– derived from a feature-specific trace
– binary weights
– execution frequency weights
(Figure: a PPDG over nodes 1–20 with execution-frequency edge weights such as 1/7, 2/7, 3/8)
* Baah, G. K., Podgurski, A., and Harrold, M. J., "The Probabilistic Program Dependence Graph and its Application to Fault Diagnosis", in Proc. of International Symposium on Software Testing and Analysis (ISSTA), 2008.
Incorporating Web Mining with Feature Location
LSI scores: m15 0.91, m16 0.88, m2 0.85, m6 0.79, m47 0.74, m52 0.60
PageRank scores: m15 0.14, m16 0.09, m20 0.07, m13 0.04, m17 0.001, …
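A minimal sketch of the fusion, assuming an LSI ranking, the set of executed methods, and web-mining scores are already computed (all scores and the cut-off are illustrative):

    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.stream.Collectors;

    // Keeps executed methods from the LSI ranking, then drops methods whose
    // PageRank/HITS score falls below a threshold (the "bottom" variants).
    public class FusedRanking {
        public static void main(String[] args) {
            Map<String, Double> lsi = Map.of("m15", 0.91, "m16", 0.88, "m2", 0.85, "m47", 0.74);
            Set<String> executed = Set.of("m15", "m16", "m2");      // from the single trace
            Map<String, Double> webScore = Map.of("m15", 0.14, "m16", 0.09, "m2", 0.001);
            double cutoff = 0.05;                                   // illustrative threshold

            List<String> ranked = lsi.entrySet().stream()
                    .filter(e -> executed.contains(e.getKey()))     // prune unexecuted
                    .filter(e -> webScore.getOrDefault(e.getKey(), 0.0) > cutoff) // prune bottom
                    .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList());
            System.out.println(ranked);                             // [m15, m16]
        }
    }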
Feature Location Techniques Evaluated
- Baseline: LSI+Dyn (LSI & dynamic analysis) – use LSI to rank methods, then prune unexecuted methods
- Web mining alone: PR(bin), PR(freq), HITS(h, bin), HITS(h, freq), HITS(a, bin), HITS(a, freq) – use a web mining algorithm to rank methods
- Combined (LSI, Dyn, & PageRank/HITS): LSI+Dyn+PR(bin/freq)top/bottom and LSI+Dyn+HITS(h/a, bin/freq)top/bottom – use LSI to rank methods, prune unexecuted methods, then use a web mining algorithm to prune the top- or bottom-ranked methods from LSI+Dyn’s results
Feature Location Techniques Explained
(Figure: the method graph m1–m20 illustrating how PR(bin) scores prune the LSI and LSI+Dyn rankings into LSI+Dyn+PR(bin)top and LSI+Dyn+HITS(h, bin)bottom)
Subject Systems
- Eclipse 3.0
– 10K classes, 120K methods, and 1.6 million LOC
– 45 features
– gold set: methods modified to fix a bug
– queries: short description from the bug report
– traces: steps to reproduce the bug
Subject Systems
- Rhino 1.5
– 138 classes, 1,870 methods, and 32,134 LOC
– 241 features
– gold set: Eaddy et al.’s dataset*
– queries: description in the specification
– traces: test cases
* http://www.cs.columbia.edu/~eaddy/concerntagger/
64
Research Questions
- RQ1
– Does combining web mining algorithms with an existing approach to feature location improve its effectiveness?
- RQ2
– Which web-mining algorithm, HITS or PageRank, produces better results?
Data Collection & Testing
- Effectiveness measure
– Descriptive statistics
- 45 Eclipse features
- 241 Rhino features
- Statistical Testing
– Wilcoxon rank sum test
– null hypothesis
- There is no significant difference between the
effectiveness of X and the baseline (LSI+Dyn).
– Alternative hypothesis
- The effectiveness of X is significantly better than
the baseline (LSI+Dyn).
Example – LSI scores: m15 0.91, m16 0.88, m2 0.85, m6 0.79, m47 0.74, m52 0.60; the first relevant method appears at rank 4, so effectiveness = 4
Results: Web Mining Techniques
(Box plots of effectiveness for Eclipse and Rhino: LSI, LSI+Dyn, PR(freq), PR(bin), HITS(a, freq), HITS(a, bin), HITS(h, freq), HITS(h, bin); data labels: 574, 637, 680, 655, 560, 471, 568, 702)
Results: IR, Dyn, & Web Mining
(Box plots of effectiveness for Eclipse and Rhino for the techniques T1–T13 listed below)
- T1. LSI+Dyn
- T2. LSI+Dyn+PR(freq)top [40, 60]%
- T3. LSI+Dyn+PR(freq)bot [20, 70]%
- T4. LSI+Dyn+PR(bin)top [40, 60]%
- T5. LSI+Dyn+PR(bin)bot [10, 70]%
- T6. LSI+Dyn+HITS(a, freq)top [30, 70]%
- T7. LSI+Dyn+HITS(a, freq)bot [40, 60]%
- T8. LSI+Dyn+HITS(h, freq)top [10, 70]%
- T9. LSI+Dyn+HITS(h, freq)bot [60, 50]%
- T10. LSI+Dyn+HITS(a, bin)top [20, 70]%
- T11. LSI+Dyn+HITS(a, bin)bot [40, 40]%
- T12. LSI+Dyn+HITS(h, bin)top [10, 70]%
- T13. LSI+Dyn+HITS(h, bin) bot [70, 60]%
A Case in Point: Eclipse exclusion filter
LSI = 1,696
LSI+Dyn = 61
LSI+Dyn+HITS(h, bin)bottom = 24
Technique                          Eclipse     Rhino       Null Hypothesis
PR(freq)                           1           1           Not Rejected
PR(bin)                            1           1           Not Rejected
HITS(a, freq)                      1           1           Not Rejected
HITS(a, bin)                       1           1           Not Rejected
HITS(h, freq)                      1           1           Not Rejected
HITS(h, bin)                       1           1           Not Rejected
LSI+Dyn+PR(freq)top                < 0.0001    < 0.0001    Rejected
LSI+Dyn+PR(freq)bottom             0.004       —           Rejected
LSI+Dyn+PR(bin)top                 < 0.0001    < 0.0001    Rejected
LSI+Dyn+PR(bin)bottom              < 0.0001    0.74        Not Rejected
LSI+Dyn+HITS(a, freq)top           < 0.0001    —           Rejected
LSI+Dyn+HITS(a, freq)bottom        < 0.0001    0.99        Not Rejected
LSI+Dyn+HITS(h, freq)top           1           —           Not Rejected
LSI+Dyn+HITS(h, freq)bottom        < 0.0001    < 0.0001    Rejected
LSI+Dyn+HITS(a, bin)top            < 0.0001    < 0.0001    Rejected
LSI+Dyn+HITS(a, bin)bottom         < 0.0001    1           Not Rejected
LSI+Dyn+HITS(h, bin)top            1           —           Not Rejected
LSI+Dyn+HITS(h, bin)bottom         < 0.0001    < 0.0001    Rejected
Results of the Wilcoxon rank sum test comparing these techniques to the baseline, LSI+Dyn (α = 0.05). Null hypothesis: there is no significant difference between the effectiveness of X and the baseline, LSI+Dyn.
Research Questions Revisited
- RQ1: Does combining web mining
algorithms with an existing approach to feature location improve its effectiveness?
– Yes
- RQ2: Which web-mining algorithm, HITS or PageRank, produces better results?
– HITS
Best Techniques
- LSI+Dyn+HITS(h, freq)bottom
- LSI+Dyn+HITS(h, bin)bottom
- Methods with low HITS hub values are
getters and setters
Other Work
- HITS and PageRank on static vs. dynamic
info
- Evaluation of first relevant vs. all relevant methods
- Evaluation against fan-in/fan-out and heuristics based on getters and setters
- Impact of thresholds on the filtering power
Impact of the Selection of a Threshold
(Charts: Eclipse – effectiveness of LSI+Dyn+HITS(h, freq)bottom and LSI+Dyn+HITS(h, bin)bottom as the filtering threshold varies)
Impact of the Selection of a Threshold
(Charts: Rhino – effectiveness of LSI+Dyn+HITS(h, freq)bottom and LSI+Dyn+HITS(h, bin)bottom as the filtering threshold varies)
Results: All of the Feature’s Methods
(Charts: Eclipse and Rhino, for the techniques listed below)
- T1. LSI
- T2. LSI+Dyn
- T3. PR(freq)
- T4. PR(bin)
- T5. HITS(a, freq)
- T6. HITS(a, bin)
- T7. HITS(h, freq)
- T8. HITS(h, bin)
Results: All of the Feature’s Methods
(Charts: Eclipse and Rhino, for the techniques listed below)
- T1. LSI+Dyn
- T2. LSI+Dyn+PR(freq)top [50, 30]%
- T3. LSI+Dyn+PR(freq)bot [20, 30]%
- T4. LSI+Dyn+PR(bin)top [50, 30]%
- T5. LSI+Dyn+PR(bin)bot [20, 40]%
- T6. LSI+Dyn+HITS(a, freq)top [20, 30]%
- T7. LSI+Dyn+HITS(a, freq)bot [40, 30]%
- T8. LSI+Dyn+HITS(h, freq)top [10, 30]%
- T9. LSI+Dyn+HITS(h, freq)bot [60, 40]%
- T10. LSI+Dyn+HITS(a, bin)top [20, 40]%
- T11. LSI+Dyn+HITS(a, bin)bot [40, 30]%
- T12. LSI+Dyn+HITS(h, bin)top [10, 30]%
- T13. LSI+Dyn+HITS(h, bin) bot [60, 40]%
Using a Static Call Graph (Eclipse)
(Charts: filtering top vs. bottom PR results, top vs. bottom HITS authority results, and top vs. bottom HITS hub results)
Using a Static Call Graph (Rhino)
(Charts: filtering top vs. bottom PR results, top vs. bottom HITS authority results, and top vs. bottom HITS hub results)
Results: Getters and Setters (Best Ranks)
(Charts: Eclipse and Rhino, for the techniques listed below)
- T1. LSI+Dyn
- T2. LSI+Dyn with getters and setters filtered
- T3. LSI+Dyn+HITS(h, bin) bot
Results: Getters and Setters (All Ranks)
(Charts: Eclipse and Rhino, for the techniques listed below)
- T1. LSI+Dyn
- T2. LSI+Dyn with getters and setters filtered
- T3. LSI+Dyn+HITS(h, bin) bot
Results: Fan-In (Best Ranks)
(Charts: Eclipse and Rhino, for the techniques listed below)
- T1. LSI+Dyn+HITS(h, bin) bot
- T2. LSI+Dyn, filter fan-in ≤1
- T3. LSI+Dyn, filter fan-in ≤2
- T4. LSI+Dyn, filter fan-in ≤3
- T5. LSI+Dyn, filter fan-in ≤4
- T6. LSI+Dyn, filter fan-in ≤5
- T7. LSI+Dyn, filter fan-in ≤10
Results: Fan-In (All Ranks)
(Charts: Eclipse and Rhino, for the techniques listed below)
- T1. LSI+Dyn+HITS(h, bin) bot
- T2. LSI+Dyn, filter fan-in ≤1
- T3. LSI+Dyn, filter fan-in ≤2
- T4. LSI+Dyn, filter fan-in ≤3
- T5. LSI+Dyn, filter fan-in ≤4
- T6. LSI+Dyn, filter fan-in ≤5
- T7. LSI+Dyn, filter fan-in ≤10
Tool Support
- FLAT3
– Eclipse plug-in
– Lucene-based IR
– execution tracing
– integration
– tagging
– metrics
http://www.cs.wm.edu/semeru/flat3/
Trevor Savage, Meghan Revelle, and Denys Poshyvanyk. "FLAT3: Feature Location and Textual Tracing Tool." In the Proceedings of the 32nd International Conference on Software Engineering (ICSE'10), Formal Research Tool Demonstration, Cape Town, South Africa, May 2-8, 2010.