Identifying Crosscutting Concerns Using Historical Code Changes
Bram Adams, Zhen Ming Jiang, Ahmed E. Hassan
SAIL, Queen's University
http://sailhome.cs.queensu.ca/~bram/
What are crosscutting concerns?
Crosscutting Concerns
multi-threading, tracing, exception handling, data persistence, security, memory cleanup, 3D rendering, performance, sound support
(Crosscutting) Concern Mining
- 1. Which concerns are implemented?
- 2. Where?
- 3. How are concerns composed together?
- 1. What is a Crosscutting Concern?
- 2. The Concern Mining Process and its Shortcomings
- 3. COMMIT
- 4. Case Study
- 5. Conclusion
Concern Mining Process
data source →(1) concern seeds →(2) concerns →(3) expanded concerns →(4) concern composition
Concern Mining Techniques :-)
MANUAL :-(
S1: Limited Context
thread() process() block() clean()
mutex semaphore_t address sender subject DEFINED_LINUX
CVS thread()
S2: Noise
S3: No Composition
random encrypt decrypt seed
COncern Mining using Mutual Information over Time
CVS
analyze historical changes to all code entities
statistical clustering based on mutual information
limited context, noise, no composition
- S1. Historical Data Sources
CVS transactions
function call or variable access added
intentional co-addition of calls and accesses
concern seed
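The co-addition idea above can be sketched in a few lines of Python. This is only an illustration, not COMMIT's implementation: it assumes each CVS transaction has already been reduced to the set of function calls and variable accesses it added, and both the helper name `co_addition_counts` and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_addition_counts(transactions):
    """Count, for every pair of entities, the transactions that added both."""
    pair_counts = Counter()
    for added in transactions:
        # Sort so that each unordered pair gets one canonical key.
        for a, b in combinations(sorted(set(added)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Hypothetical transactions: each is the set of calls/accesses added together.
transactions = [
    {"lock", "unlock", "log"},
    {"lock", "unlock"},
    {"log", "open"},
    {"lock", "unlock", "open"},
]

counts = co_addition_counts(transactions)
print(counts[("lock", "unlock")])
```

Pairs that are repeatedly added together, such as `lock` and `unlock` here, are the kind of candidates a seed could be built from; raw counts alone do not separate intentional co-addition from coincidence.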
- S2. Mutual Information
How much does the occurrence of one code entity reveal about the occurrence of another?
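The question on this slide corresponds to the textbook definition of mutual information, I(X;Y) = Σ p(x,y) log2( p(x,y) / (p(x) p(y)) ), summed over the four presence/absence combinations of two entities. The sketch below applies that standard definition to transactions represented as sets of added entities. The slides do not show the exact formula or normalization COMMIT uses, so the function name `mutual_information` and the data layout are assumptions made for illustration.

```python
import math


def mutual_information(transactions, a, b):
    """Mutual information (in bits) between the occurrence of a and of b."""
    n = len(transactions)
    # Joint counts over the four combinations (a present?, b present?).
    joint = {(x, y): 0 for x in (True, False) for y in (True, False)}
    for added in transactions:
        joint[(a in added, b in added)] += 1

    mi = 0.0
    for (x, y), count in joint.items():
        if count == 0:
            continue  # a term with zero joint probability contributes 0
        p_xy = count / n
        # Marginals are obtained by summing the joint counts.
        p_x = sum(c for (x2, _), c in joint.items() if x2 == x) / n
        p_y = sum(c for (_, y2), c in joint.items() if y2 == y) / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Hypothetical data: "lock" and "unlock" always appear together.
always_together = [{"lock", "unlock"}, {"lock", "unlock"}, set(), set()]
# Hypothetical data: "lock" and "log" occur independently of each other.
independent = [{"lock", "log"}, {"lock"}, {"log"}, set()]

mi_together = mutual_information(always_together, "lock", "unlock")
mi_independent = mutual_information(independent, "lock", "log")
```

A value near zero means that seeing one entity says almost nothing about the other; larger values indicate entities whose additions are statistically tied.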
- S3. Concern Relations
seed graph
composite concern, simple concern
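One way to picture the step from a seed graph to simple and composite concerns is sketched below. The slides do not say how COMMIT links seeds, so this rests on two loudly stated assumptions: two seeds are connected when they share at least one entity, and a connected group of several seeds is treated as a composite concern while an isolated seed is a simple one. The names `group_seeds`, `s1` to `s4` and the sample entities are hypothetical.

```python
from itertools import combinations


def group_seeds(seeds):
    """Group seeds into connected components of a seed graph.

    Assumption (not from the slides): two seeds are linked when they
    share at least one entity.
    """
    names = list(seeds)
    neighbours = {name: set() for name in names}
    for s, t in combinations(names, 2):
        if seeds[s] & seeds[t]:
            neighbours[s].add(t)
            neighbours[t].add(s)

    seen, components = set(), []
    for start in names:
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical seeds, each a set of co-added entities.
seeds = {
    "s1": {"random", "seed"},
    "s2": {"seed", "encrypt"},
    "s3": {"encrypt", "decrypt"},
    "s4": {"log"},
}

components = group_seeds(seeds)
composite = [c for c in components if len(c) > 1]
simple = [c for c in components if len(c) == 1]
```

In this toy input the three seeds touching `seed` and `encrypt` end up in one group, and the unrelated `log` seed stays on its own.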
Case Study
1996-2002 (800 kLOC)
1993-2003 (2 MLOC)
Comparative Study
CBFA: similar entity names ✖
HAM: identical set of callers ✖
COMMIT: mutual information ✔
limited context, noise, no composition
CVS CVS snapshot
Study Design
CBFA HAM COMMIT
top 20 top 20 top 20
concern?
- H1. Richer Data Sources Yield Richer Seeds
[Two bar charts comparing CBFA, HAM and COMMIT: #non-function entities and #functions]
50% 79% 83% 29% 88% 75%
- H2. COMMIT Identifies a Larger Percentage of Unique Concerns
[Two bar charts comparing CBFA, HAM and COMMIT]
56% 56%
87.5% 50%
- H3. COMMIT complements CBFA and HAM (1)
CBFA HAM COMMIT
1 8 14 9
CBFA HAM COMMIT
1 9 14 9
- H3. COMMIT complements CBFA and HAM (2)
kernel
device drivers: d1 d2 d3 d4 d5 d6 d7 d8 d9
CBFA concern (e.g., driver API)
HAM concern (e.g., cloned driver code)
COMMIT concern (e.g., driver + infrastructure)
ODBC Data Retrieval Composite Concern
- 1. connection configuration
- 2. connection error handling
- 3. data transfer
- 4. SQL-to-ODBC conversion
- 5. ODBC-to-ESQL conversion
- 6. conversion error handling
36 seeds ODBC Data Retrieval Concern
5 other composite concerns
Threats to Validity
- generalizability to other systems
- subjectivity ↔ substantial agreement (Kappa)
- seed quality not checked
- threshold optimization is task-specific
Crosscutting Concerns
multi-threading, tracing, exception handling, data persistence, security, memory cleanup, 3D rendering, performance, sound support
Concern Mining Shortcomings
- S1. limited seed context
- S2. noise between seeds
- S3. no composition of concerns
COMMIT
CVS transactions
function call or variable access added
intentional co-addition of calls and accesses
concern seed
COMMIT complements CBFA and HAM
CBFA HAM COMMIT
1 8 14 9
CBFA HAM COMMIT
1 9 14 9
QUESTIONS?