Efficient Fail-Fast Dynamic Subtype Checking
Rohan Padhye and Koushik Sen UC Berkeley VMIL 2019
Dynamic Subtype Checking
S obj = …;
if (obj instanceof T) { … }
(Figure: types S and T. Is obj an instance of T?)
S obj = new X();
if (obj instanceof T) { … }
(Figure: types S, X, and T.)
X <: T ?
(Figure, animated over several slides: a hierarchy containing the types S, X, A, B, T, and P, queried with X <: T ?)
First example: subtype test successful, X <: T.
Second example: subtype test fails, X is not a subtype of T.
Constant time?
Constant space (per class)?
Supports multiple inheritance?
Supports open hierarchies?
Pick up to 3.
class A implements I1 { … }
class B extends A implements I2 { … }
interface I5 extends I6, I7, I2 { … }
class C extends B implements I3, I4, I5 { … }

Metadata for C
Primary supers (by depth): 1 A, 2 B, 3 C, 4 (empty), 5 (empty), 6 (empty), 7 (empty)
Secondary supers: I1, I2, I6, I7, I5, I3, I4
(Figure: inheritance diagram over A, B, C, and I1 through I7.)
Extending the hierarchy of A, B, I5, and C:
class D extends C
class E extends D
class F extends E
class G extends F
class H extends G

Metadata for H
Primary supers (by depth): 1 A, 2 B, 3 C, 4 D, 5 E, 6 F, 7 G
Secondary supers: I1, I2, I6, I7, I5, I3, I4, H
H sits below the seven primary slots, so it is stored among its own secondary supers.
Subtype queries against the metadata for C and H:
X <: C ?   becomes   X.primary[3] == C?
X <: D ?   becomes   X.primary[4] == D?
X <: H ?   becomes   X.secondary_check(H)
X <: I5 ?  becomes   X.secondary_check(I5)
X.secondary_check(T) := {
  if (X.cache == T) return true;
  if (X == T) return true;
  foreach S in X.secondaries {
    if (S == T) {
      X.cache = S;
      return true;
    }
  }
  return false;
}
(Secondary supers in the metadata for H: I1, I2, I6, I7, I5, I3, I4, H)
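The primary and secondary checks above can be sketched in Java. This is a simplified model under stated assumptions, not HotSpot's actual data structures: `TypeInfo`, `iface`, `clazz`, and `DISPLAY_SIZE` are hypothetical names, the seven primary slots follow the slides, and the order of the secondary list is not meant to match the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-class metadata; names are illustrative, not HotSpot's.
final class TypeInfo {
    static final int DISPLAY_SIZE = 7; // primary slots, as drawn on the slides

    final String name;
    final int depth; // 1-based depth in the class chain; 0 for interfaces
    final TypeInfo[] primary = new TypeInfo[DISPLAY_SIZE + 1]; // index = depth
    final List<TypeInfo> secondaries = new ArrayList<>();
    TypeInfo cache; // last secondary supertype that matched

    static TypeInfo iface(String name, TypeInfo... superInterfaces) {
        return new TypeInfo(name, true, null, superInterfaces);
    }

    static TypeInfo clazz(String name, TypeInfo superclass, TypeInfo... interfaces) {
        return new TypeInfo(name, false, superclass, interfaces);
    }

    private TypeInfo(String name, boolean isInterface,
                     TypeInfo superclass, TypeInfo[] interfaces) {
        this.name = name;
        this.depth = isInterface ? 0 : (superclass == null ? 1 : superclass.depth + 1);
        if (superclass != null) {
            // Inherit the superclass's primary display and secondary list.
            System.arraycopy(superclass.primary, 0, primary, 0, primary.length);
            for (TypeInfo s : superclass.secondaries) addSecondary(s);
        }
        for (TypeInfo i : interfaces) {
            for (TypeInfo s : i.secondaries) addSecondary(s); // transitive super-interfaces
            addSecondary(i);
        }
        if (!isInterface) {
            if (depth <= DISPLAY_SIZE) primary[depth] = this;
            else addSecondary(this); // too deep for the primary display
        }
    }

    private void addSecondary(TypeInfo t) {
        if (!secondaries.contains(t)) secondaries.add(t);
    }

    boolean isSubtypeOf(TypeInfo t) {
        if (t.depth >= 1 && t.depth <= DISPLAY_SIZE) {
            return primary[t.depth] == t; // constant time, pass or fail
        }
        return secondaryCheck(t);
    }

    boolean secondaryCheck(TypeInfo t) {
        if (cache == t) return true;
        if (this == t) return true;
        for (TypeInfo s : secondaries) {
            if (s == t) {
                cache = s;
                return true;
            }
        }
        return false; // a failing test scans the whole list
    }
}
```

In this model a successful secondary test is fast once it is cached, while a failing one scans every secondary supertype each time.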
Observations:
Are there workloads where dynamic subtype tests often fail?
obj match {
  case x:A => x.method_on_A()
  case y:B => y.method_on_B()
  case z:C => z.method_on_C()
  …
}
Compile to JVM:
if (obj instanceof A) {
  A x = (A) obj;
  x.method_on_A();
} else if (obj instanceof B) {
  B y = (B) obj;
  y.method_on_B();
} else if (obj instanceof C) {
  C z = (C) obj;
  z.method_on_C();
} …
Small workload: scalac Hello.scala
47,597 instanceof tests
93% failed
Large workload: sbt compile # builds scalac
3.1 billion instanceof tests
76% failed
45 million secondary scans
static bool isLoopInvariant(const Value *V, const Loop *L) {
  if (isa<Constant>(V) || isa<Argument>(V) || isa<GlobalValue>(V))
    return true;
  // Otherwise, it must be an instruction...
  return !L->contains(cast<Instruction>(V)->getParent());
}

if (AllocationInst *AI = dyn_cast<AllocationInst>(Val)) {
  …
} else if (CallInst *CI = dyn_cast<CallInst>(Val)) {
  …
} else if …
Inheritance diagram: class CallInst
Small workload: clang++ Hello.cpp
5.5 million dyn_cast<T>/isa<T> operations
74% failed
Large workload: clang selfie.c # 10K LoC
93.7 million dyn_cast<T>/isa<T> operations
78% failed
But the fast path is optimized for successful tests.
Takeaway: In some workloads…
(with no overhead for the current fast path)
For each type T:
α(T) := k distinct integers, chosen randomly from [1..m]
β(T) := α(T) ∪ α(S1) ∪ α(S2) ∪ … ∪ α(Sn)
where S1, S2, …, Sn are all the (transitive) supertypes of T
Invariant: X <: T implies α(T) ⊆ β(X)
Example (figure): types S, T, A, B, X, and Y, each annotated with its α.
α values, in slide order: {1, 3}, {7, 9}, {11, 4}, {5, 8}, {1, 6}, {7, 11}
β = {1, 3, 4, 6, 7, 9, 11}
X <: T ? No
Y <: T ? Maybe
For each type T:
α(T) := compile_time_random(parity=k) // m-bit integer with k bits set
β(T) := α(T) | α(S1) | α(S2) | … | α(Sn)
where S1, S2, …, Sn are all the (transitive) supertypes of T
Invariant: X <: T implies (β(X) & α(T)) == α(T)
Example (figure): types S, T, A, B, X, and Y, each annotated with its α as a bit vector.
α values, in slide order: 0x000000000101, 0x000101000000, 0x010000001000, 0x000010010000, 0x000000100001, 0x010001000000
β = 0x010101101101
X <: T ? No
Y <: T ? Maybe
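The bit-vector scheme can be sketched in Java as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: `BloomType`, `mayBeSubtypeOf`, and `slowIsSubtypeOf` are hypothetical names, m = 64 and k = 2 are assumed values, a seeded `java.util.Random` stands in for `compile_time_random`, and a recursive walk stands in for the VM's existing slow path.

```java
import java.util.Random;

// Hypothetical sketch: alpha is the type's own k-bit signature, beta is the
// OR of alpha over the type and all of its transitive supertypes.
final class BloomType {
    static final int M = 64; // bits in the machine word (assumed)
    static final int K = 2;  // bits set per alpha (assumed)

    final String name;
    final long alpha;
    final long beta;
    final BloomType[] supers; // direct supertypes

    BloomType(String name, Random rng, BloomType... supers) {
        this.name = name;
        this.supers = supers;
        long a = 0L;
        while (Long.bitCount(a) < K) {
            a |= 1L << rng.nextInt(M); // set random bits until K distinct bits are set
        }
        this.alpha = a;
        long b = a;
        for (BloomType s : supers) {
            b |= s.beta; // each beta already covers that type's own supertypes
        }
        this.beta = b;
    }

    // Bloom filter test: false means "definitely not a subtype",
    // true means "maybe" (either a real subtype or a false positive).
    boolean mayBeSubtypeOf(BloomType t) {
        return (beta & t.alpha) == t.alpha;
    }

    // Exact but slow stand-in for the existing check: walk the supertype graph.
    boolean slowIsSubtypeOf(BloomType t) {
        if (this == t) return true;
        for (BloomType s : supers) {
            if (s.slowIsSubtypeOf(t)) return true;
        }
        return false;
    }

    // Fail-fast test: refute with one AND and one compare when possible,
    // otherwise fall through to the exact check.
    boolean isSubtypeOf(BloomType t) {
        if (!mayBeSubtypeOf(t)) return false;
        return slowIsSubtypeOf(t);
    }
}
```

Because the filter never reports "no" for a true subtype, `isSubtypeOf` always agrees with the exact check. A false positive only costs the time of the slow path.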
The worst case occurs only on a false positive in the Bloom filter.
m = size of machine word
k = parity ??
n = number of transitive supertypes
False positive rate:
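The formula on this slide did not survive extraction. As an assumption, the standard Bloom filter approximation is given here in the slide's symbols, treating β as a filter of m bits into which roughly n signatures of k bits each have been inserted. The slide's exact expression may differ.

```latex
p \;\approx\; \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}
  \;\approx\; \left(1 - e^{-kn/m}\right)^{k}
```

For illustration only, with assumed values m = 64, k = 2, and n = 10, this gives p ≈ (1 − e^{−0.3125})² ≈ 0.07.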
Rewrite if T is a secondary type
trait Base
trait A extends Base { def method_on_A(): Int }
trait B extends Base { def method_on_B(): Int }

… match {
  case x:A => x.method_on_A()
  case y:B => y.method_on_B()
}

With five cases (traits A, B, … declared as above):
… match {
  case x:A => x.method_on_A()
  case y:B => y.method_on_B()
  case z:C => z.method_on_C()
  case u:D => u.method_on_D()
  case v:E => v.method_on_E()
}

With cases up to H:
… match {
  case x:A => x.method_on_A()
  case y:B => y.method_on_B()
  case z:C => z.method_on_C()
  case u:D => u.method_on_D()
  case v:E => v.method_on_E()
  …
  case q:H => q.method_on_H()
}

With cases up to H and ten supertypes on Base:
trait Base extends N1, N2, N3, … N10
trait A extends Base { def method_on_A(): Int }
trait B extends Base { def method_on_B(): Int }
…
Dynamic subtype tests often fail (in some workloads)
Worst-case linear search occurs (in production VMs)
Bloom filters can enable fail-fast refutations (high probability)
expected constant time + constant space + multiple inheritance + open hierarchy