Scalability-First Pointer Analysis with Self-Tuning Context Sensitivity
Yue Li, Tian Tan, Anders Møller and Yannis Smaragdakis
Pointer Analysis
Concept: Which objects may a variable point to?
Importance:
Fundamental for virtually all static analyses, e.g., call graphs, alias analysis, etc.
Useful for many software engineering tasks, e.g., bug detection, security analysis, program understanding, etc.
Problem: Unpredictable Scalability
Context-sensitivity variants, from most precise to less precise but faster: 2obj, 2type, 1type, CI.
[Chart: analysis time in seconds (axis 2000 to 10000) of 2obj, 2type, and CI on the analyzed programs; timeout is >10800 seconds]
Scenario: pointer analysis as a part of a large-scale security analysis.
Precise 2obj: unscalable for many programs.
Scalable CI: imprecise for all?
2obj → 2type → 1type → CI?
Scaler
Good scalability and high precision, regardless of the program being analyzed.
Scalability: as good as CI.
Precision: comparable to or better than the best scalable CS.
[Chart: analysis time in seconds (axis 2000 to 10000) of 2obj, 2type, and Scaler on the analyzed programs; timeout is >10800 seconds]
Idea
$\#ctx_m^c$: number of contexts for method $m$ under context sensitivity (CS) variant $c$
$\#pts_m$: number of points-to facts for method $m$
$\#ctx_m^c \cdot \#pts_m$: number of worst-case CS points-to facts for method $m$
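As a purely illustrative instance of this product (the numbers are made up for the example and do not come from the talk), a method with 500 contexts under 2obj and 40 points-to facts would have the following worst-case number of context-sensitive points-to facts:

```latex
\#ctx_m^{2obj} \cdot \#pts_m = 500 \cdot 40 = 20\,000
```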
Insight
If $\#ctx_m^c \cdot \#pts_m > ST$ (Scalability Threshold), then $c$ is expensive for $m$.
In that case, choose a cheaper $c'$ such that $\#ctx_m^{c'} \cdot \#pts_m \le ST$.
How to identify scalability-critical methods? They are the methods with $\#ctx_m^c \cdot \#pts_m > ST$.
How to estimate $\#ctx_m^c \cdot \#pts_m$?
$\#pts_m$: pre-analysis, using the points-to results of CI.
$\#ctx_m^c$: context estimation problem → graph traversal problem*
*Making k-object-sensitive pointer analysis more precise with still k-limiting. Tan et al. SAS 2016
Example
ST: Scalability Threshold; $ST_p$ is the threshold chosen for program $p$.
[Figure: $\#ctx_m^c \cdot \#pts_m$ (y-axis, ticks at 100 000 and 1 000 000) for each method (x-axis, ticks at 1, 2000, 4000, and 10 000) under c = 2obj, c = 2type, and c = 1type, with $ST_p$ marked as the threshold; the regions of the method axis are labeled 2obj, 2type, and 1type]
For any scalability-critical method, use the most precise CS variant that can turn it into a non-scalability-critical method.
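The selection rule can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the Scaler implementation: the function name `select_variant`, the dictionary layout, the fallback to CI when no context-sensitive variant fits, and all numbers are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: names and numbers below are illustrative assumptions,
# not taken from the Scaler implementation.

# CS variants ordered from most precise to least precise.
VARIANTS = ["2obj", "2type", "1type"]


def select_variant(ctx_counts, pts, st):
    """Pick the most precise variant c with #ctx_m^c * #pts_m <= st.

    ctx_counts maps each variant name to the estimated number of contexts
    of the method under that variant; pts is the method's estimated number
    of points-to facts. Falls back to context insensitivity (CI) when no
    context-sensitive variant fits under the threshold (an assumption here).
    """
    for variant in VARIANTS:
        if ctx_counts[variant] * pts <= st:
            return variant
    return "CI"


if __name__ == "__main__":
    st = 100_000  # assumed scalability threshold
    methods = {
        "cheap": ({"2obj": 20, "2type": 5, "1type": 2}, 100),
        "medium": ({"2obj": 5_000, "2type": 400, "1type": 30}, 100),
        "costly": ({"2obj": 90_000, "2type": 8_000, "1type": 600}, 100),
        "huge": ({"2obj": 900_000, "2type": 80_000, "1type": 6_000}, 100),
    }
    for name, (ctx_counts, pts) in methods.items():
        print(name, select_variant(ctx_counts, pts, st))
```

With these made-up inputs, the four methods are assigned 2obj, 2type, 1type, and CI respectively, so only the methods that would exceed the threshold lose precision.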
Total Scalability Threshold (TST)
Goal: automatically choose an appropriate $ST_p$ for each program $p$.
How many points-to facts can the memory hold?
Memory → TST, which bounds $\sum_m \#ctx_m^c \cdot \#pts_m$ for each program (Program A, Program B).
[Figure: $\#ctx_m^c \cdot \#pts_m$ for each method under c = 2obj, c = 2type, and c = 1type, with $ST_p$ marked as the threshold and the areas of the selected regions labeled A1, A2, A3]
$E(ST_p) = \sum_{m \in P} \#ctx_m^c \cdot \#pts_m = A_1 + A_2 + A_3 \le TST$, where $P$ is the program and $c$ is the CS variant selected for $m$ under $ST_p$.
$ST_p$ is automatically computed based on the above inequality: it is the max value satisfying it.
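One way to picture the computation of $ST_p$ is the Python sketch below. It is an illustration under assumptions, not the authors' implementation: the names `cost_under`, `estimate_total`, and `choose_st`, the tuple layout of a method, the CI cost of one context per method, and all numbers are hypothetical. Because the estimated total only changes when the threshold crosses some method's cost, the sketch tries those costs as candidate thresholds and keeps the largest one that fits.

```python
# Illustrative sketch only: the data layout, names, and numbers are assumptions.

VARIANTS = ["2obj", "2type", "1type"]  # most precise first


def cost_under(method, st):
    """Worst-case facts of one method under the variant selected for st."""
    ctx_counts, pts = method
    for variant in VARIANTS:
        cost = ctx_counts[variant] * pts
        if cost <= st:
            return cost
    return pts  # assumed CI cost: a single context


def estimate_total(methods, st):
    """E(ST): sum of the per-method worst-case facts under threshold st."""
    return sum(cost_under(method, st) for method in methods)


def choose_st(methods, tst):
    """Return the largest candidate threshold ST with E(ST) <= tst.

    Candidates are 0 plus every per-method cost, since the variant
    selection can only change at those values. Returns 0 (every method
    analyzed context-insensitively) when no larger candidate fits.
    """
    candidates = {0}
    for ctx_counts, pts in methods:
        candidates.update(ctx_counts[v] * pts for v in VARIANTS)
    best = 0
    for st in sorted(candidates):
        if estimate_total(methods, st) <= tst:
            best = st
    return best


if __name__ == "__main__":
    methods = [
        ({"2obj": 20, "2type": 5, "1type": 2}, 100),
        ({"2obj": 5_000, "2type": 400, "1type": 30}, 100),
        ({"2obj": 90_000, "2type": 8_000, "1type": 600}, 100),
        ({"2obj": 900_000, "2type": 80_000, "1type": 6_000}, 100),
    ]
    print(choose_st(methods, tst=1_000_000))
```

The exhaustive scan over candidates avoids relying on the estimate being monotone in the threshold; a real implementation could search more cleverly.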
Scaler: Scalability-First Pointer Analysis with Self-Tuning Context Sensitivity
CS variants are self-tuned by $ST_p$, and $ST_p$ depends on TST.
Results
Settings: 10 popular Java programs, including Luindex and Chart.
Scalability: as good as CI.
Precision: comparable to or better than the best scalable CS.
[Chart: analysis time in seconds (axis 2000 to 10000) of 2obj, 2type, and Scaler on the 10 programs; timeout is >10800 seconds]
Detailed results follow for a complex program, a medium-complexity program, and a simple program (Luindex).
Complex program (time in seconds, 3h = 10800s; in all cases, lower is better)

Analysis                Time (s)            #may-fail casts  #poly calls  #reachable methods  #call graph edges
CI                      112                 2234             2778         12718               114856
2obj → 2type → 1type    >3h + >3h + 1997    2117             2577         12430               111834
Scaler                  452                 1852             2500         12167               107410

Medium-complexity program (time in seconds, 3h = 10800s; in all cases, lower is better)

Analysis                Time (s)            #may-fail casts  #poly calls  #reachable methods  #call graph edges
CI                      49                  2508             2925         13036               77370
2obj → 2type → 1type    2458                1409             2182         12657               65836
Scaler                  272                 1452             2195         12676               66177

Simple program: Luindex (time in seconds, 3h = 10800s; in all cases, lower is better)

Analysis                Time (s)            #may-fail casts  #poly calls  #reachable methods  #call graph edges
CI                      22                  734              940          6670                33130
2obj → 2type → 1type    53                  297              675          6256                29021
Scaler                  53                  297              675          6256                29021

Scaler: good scalability and high precision, regardless of the program being analyzed.
Want to have a good night’s sleep?
Scaler: good scalability and high precision.
Conclusion
Scaler: http://www.brics.dk/scaler
[Figure: $ST_p$ values 16607, 78699, and 35080733 (one program is labeled Luindex), each chosen so that $E(ST_p) \le TST$ with TST = 30M]
[Chart: effect of the TST value (20M, 30M, 60M, 80M, 150M; memory sizes 12GB, 48GB, 368GB) on analysis time (axis 2 000 to 12 000, with timeouts) and on precision (#may-fail casts, axis 2 000 to 2 200; bar labels 2 188, 2 176, 2 080, 2 080, 2 050)]