Efficient and Precise Points-to Analysis: Modeling the Heap by Merging Equivalent Automata
Tian Tan, Yue Li and Jingling Xue
PLDI 2017, June 2017

A New Points-to Analysis Technique for Object-Oriented Programs

Points-to Analysis
• Determines
  ◦ "which objects a variable can point to?"

Uses of Points-to Analysis
Clients
• Security analysis
• Bug detection
• Compiler optimization
• Program verification
• Program understanding
• …
Tools
• Chord
• …
Highlighted on the slide: Call Graph

Existing Call Graph Construction
• On-the-fly construction (run with points-to analysis)
  ◦ Precise
  ◦ Inefficient
• 3-object-sensitive points-to analysis
  ◦ Very precise
  ◦ Adopted by, e.g., Chord

3-Object-Sensitive Points-to Analysis
• Analyze Java programs
  ◦ Intel Xeon E5 3.70GHz, 128GB of memory
  ◦ Time budget: 5 hours (18000 secs)

Analysis time (sec.)
  pmd        14469 (4 hours)
  findbugs   Unscalable (> 5 hours)

Two Mainstreams of Points-to Analysis Techniques
• Model control-flow
  ◦ Context-sensitivity
    - Call-site-sensitivity (PLDI'04, PLDI'06)
    - Object-sensitivity (ISSTA'02, TOSEM'05, SAS'16)
    - Type-sensitivity (POPL'11)
    - …
• Model data-flow
  ◦ Heap abstraction
    - Allocation-site abstraction
    - Type-based abstraction
    - …

Heap Abstraction
• Dynamic execution: infinite-size heap
• Static analysis: finite (abstract) objects
• The infinite-size heap is abstracted or partitioned into the finite set of abstract objects

Allocation-Site Abstraction
• One object per allocation site
  ◦ Adopted by all mainstream points-to analyses

  1  A a1 = new A();
  2  A a2 = new A();
  3  B b = new B();

  Abstract objects: $o_1^A$ (line 1), $o_2^A$ (line 2), $o_3^B$ (line 3)

Allocation-Site Abstraction
• Over-partition for call graph construction

  1  A a1 = new A();
  2  A a2 = new A();
  3  foo(a1);
  4  foo(a2);

  void foo(Object o) {
      o.toString();
  }

  Abstract objects: $o_1^A$ (line 1), $o_2^A$ (line 2)
  In foo, o points to $o_1^A$ and $o_2^A$; o.toString() resolves to A::toString()

Allocation-Site Abstraction
• Over-partition for type-dependent clients
  ◦ Call graph construction
  ◦ Devirtualization
  ◦ May-fail casting
  ◦ …

  1  A a1 = new A();
  2  A a2 = new A();
  3  foo(a1);
  4  foo(a2);

  void foo(Object o) {
      o.toString();
      A a = (A) o;
  }

  Abstract objects: $o_1^A$ (line 1), $o_2^A$ (line 2); in foo, o points to both
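To see why two allocation-site objects of the same type are more partitioning than these clients need, here is a minimal Java sketch. The `site:type` string encoding, the class name `OverPartitionDemo`, and both helper methods are hypothetical and only illustrate the slide's point; class-hierarchy lookup and subtyping are deliberately left out.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class OverPartitionDemo {
    // Assumed encoding: an abstract object is "site:type", e.g. "1:A" for o1 of type A.
    static String typeOf(String obj) {
        return obj.substring(obj.indexOf(':') + 1);
    }

    // Resolve o.method() by dispatching on the type of each pointed-to object.
    // Simplification: every type is assumed to declare the method itself.
    static Set<String> callTargets(Set<String> pointsTo, String method) {
        Set<String> targets = new TreeSet<>();
        for (String obj : pointsTo) {
            targets.add(typeOf(obj) + "::" + method + "()");
        }
        return targets;
    }

    // A cast (T) o may fail if some pointed-to object has a type other than T.
    // Simplification: subtyping is ignored.
    static boolean castMayFail(Set<String> pointsTo, String target) {
        for (String obj : pointsTo) {
            if (!typeOf(obj).equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Points-to set of parameter o inside foo: both allocation sites flow in.
        Set<String> o = new TreeSet<>(Arrays.asList("1:A", "2:A"));
        System.out.println(callTargets(o, "toString")); // [A::toString()]
        System.out.println(castMayFail(o, "A"));        // false
    }
}
```

Both clients look only at the types of the pointed-to objects, so keeping the two objects apart does not change either answer in this example.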

Type-Based Abstraction
• One object per type

  1  A a1 = new A();
  2  A a2 = new A();
  3  B b = new B();

  Abstract objects: $o^A$ (lines 1 and 2), $o^B$ (line 3)

Type-Based Abstraction
• Precision loss for type-dependent clients

  A a1 = new A();
  A a2 = new A();
  B b = new B();
  C c = new C();
  a1.f = b;
  a2.f = c;
  Object o = a1.f;
  o.toString();

  Abstract objects: $o^A$, $o^B$, $o^C$
  $o^A$.f points to $o^B$ and $o^C$, so o points to $o^B$ and $o^C$
  o.toString() resolves to B::toString() and C::toString(); C::toString() is a false positive
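The precision loss can be reproduced with a small Java sketch that analyzes the snippet above under a pluggable heap abstraction. Everything here is an illustrative assumption rather than the authors' implementation: the class name `HeapAbstractionDemo`, the `typesOfO` method, and the naming of abstract objects as `o` followed by a line number or a type. The analysis is flow-insensitive and handles only the statements of this one snippet.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.BiFunction;

class HeapAbstractionDemo {
    // `heap` maps an allocation (line number, type) to the name of its abstract object.
    // Returns the types of the objects that variable o may point to.
    static Set<String> typesOfO(BiFunction<Integer, String, String> heap) {
        Map<String, Set<String>> pts = new HashMap<>();    // variable -> abstract objects
        Map<String, Set<String>> fieldF = new HashMap<>(); // abstract object -> targets of field f
        Map<String, String> typeOf = new HashMap<>();      // abstract object -> type

        // Allocations on lines 1..4: a1 = new A(); a2 = new A(); b = new B(); c = new C();
        String[][] allocs = {{"a1", "A"}, {"a2", "A"}, {"b", "B"}, {"c", "C"}};
        for (int i = 0; i < allocs.length; i++) {
            String obj = heap.apply(i + 1, allocs[i][1]);
            typeOf.put(obj, allocs[i][1]);
            pts.computeIfAbsent(allocs[i][0], k -> new TreeSet<>()).add(obj);
        }

        // Stores: a1.f = b; a2.f = c;
        String[][] stores = {{"a1", "b"}, {"a2", "c"}};
        for (String[] s : stores) {
            for (String base : pts.get(s[0])) {
                fieldF.computeIfAbsent(base, k -> new TreeSet<>()).addAll(pts.get(s[1]));
            }
        }

        // Load: Object o = a1.f; one pass suffices because nothing feeds back into the stores.
        Set<String> types = new TreeSet<>();
        for (String base : pts.get("a1")) {
            for (String target : fieldF.getOrDefault(base, Collections.<String>emptySet())) {
                types.add(typeOf.get(target));
            }
        }
        return types;
    }

    public static void main(String[] args) {
        // Allocation-site abstraction: one abstract object per line.
        System.out.println(typesOfO((line, type) -> "o" + line)); // [B]
        // Type-based abstraction: one abstract object per type.
        System.out.println(typesOfO((line, type) -> "o" + type)); // [B, C]
    }
}
```

Under the type-based abstraction the two A allocations collapse into one object, so the stores through a1 and a2 land in the same field and the load sees both B and C.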

Our Goal: Improve Efficiency, Preserve Precision

MAHJONG: A New Heap Abstraction

Improve Efficiency: analysis time (sec.)
  Program    MAHJONG    Allocation-site abstraction
  pmd        128        14469 (4 hours)
  findbugs   524        Unscalable (> 5 hours)
  (Allocation-site abstraction is adopted by all mainstream points-to analyses.)

Preserve Precision: #call graph edges
  Program    MAHJONG    Allocation-site abstraction
  pmd        44016      44004

How?

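• Merging objects alleviates over-partition
• Blindly merging objects causes precision loss

Example: $o_1^A$.f points to $o_3^B$, while $o_2^A$.f points to $o_4^C$ (inconsistent types).
Blindly merging $o_1^A$ and $o_2^A$ into $o^A$ makes $o^A$.f point to both $o^B$ and $o^C$.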

Type-Consistent Objects
• Definition
  $O_i^T$ and $O_j^T$ are type-consistent objects if, for every sequence of field names $\bar{f} = f_1.f_2.\cdots.f_n$:
  $O_i^T.\bar{f}$ and $O_j^T.\bar{f}$ point to objects of the same types.

MAHJONG only merges type-consistent objects

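Type-Consistent Objects
• Example

[Figure: two field points-to graphs. One is rooted at $o_1^T$ and contains $o_3^U$, $o_5^X$, $o_7^Y$, $o_9^Y$, $o_{11}^Y$; the other is rooted at $o_2^T$ and contains $o_4^U$, $o_6^X$, $o_8^Y$. Edges are labeled with the fields f, g, h, k.]

∵ the types reached along each field path agree:

  Field path    $O_1^T$    $O_2^T$
  (none)        T          T
  .f            U          U
  .f.h          Y          Y
  .g            X          X
  .g.k          Y          Y

∴ $O_1^T$ and $O_2^T$ are type-consistent objects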
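The per-path comparison in this example can be sketched in Java as follows. The edge list is read off the slide's figure as best the extracted text allows, so treat it as an assumption; the class name `FieldPathTypes` and its methods are hypothetical as well.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FieldPathTypes {
    static final Map<String, String> TYPE = new HashMap<>();        // object -> type
    static final Map<String, List<String>> EDGES = new HashMap<>(); // "object.field" -> targets

    static void edge(String src, String field, String dst) {
        EDGES.computeIfAbsent(src + "." + field, k -> new ArrayList<>()).add(dst);
    }

    static {
        // Assumed reconstruction of the figure's two graphs.
        String[][] objects = {
            {"o1", "T"}, {"o3", "U"}, {"o5", "X"}, {"o7", "Y"}, {"o9", "Y"}, {"o11", "Y"},
            {"o2", "T"}, {"o4", "U"}, {"o6", "X"}, {"o8", "Y"}};
        for (String[] o : objects) {
            TYPE.put(o[0], o[1]);
        }
        edge("o1", "f", "o3"); edge("o3", "h", "o7"); edge("o3", "h", "o9");
        edge("o1", "g", "o5"); edge("o5", "k", "o11");
        edge("o2", "f", "o4"); edge("o4", "h", "o8");
        edge("o2", "g", "o6"); edge("o6", "k", "o8");
    }

    // Types of all objects reached from `root` by following `path` field by field.
    static Set<String> typesAlong(String root, String... path) {
        Set<String> current = new TreeSet<>(Collections.singleton(root));
        for (String field : path) {
            Set<String> next = new TreeSet<>();
            for (String obj : current) {
                next.addAll(EDGES.getOrDefault(obj + "." + field, Collections.<String>emptyList()));
            }
            current = next;
        }
        Set<String> types = new TreeSet<>();
        for (String obj : current) {
            types.add(TYPE.get(obj));
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(typesAlong("o1", "f", "h")); // [Y]
        System.out.println(typesAlong("o2", "f", "h")); // [Y]
        System.out.println(typesAlong("o1", "g", "k").equals(typesAlong("o2", "g", "k"))); // true
    }
}
```

Checking paths one at a time like this only covers the paths that are listed; the definition quantifies over every field sequence.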

How to Check Type-Consistency?

Our Solution: Sequential Automata
Check type-consistency of objects → Test equivalence of automata

Sequential Automata
• 6-tuple $(Q, \Sigma, \delta, q_0, \Gamma, \gamma)$, where:
  ◦ $Q$ is a set of states
  ◦ $\Sigma$ is a set of input symbols
  ◦ $\delta$ is the next-state map: $Q \times \Sigma \to \mathcal{P}(Q)$
  ◦ $q_0$ is the initial state
  ◦ $\Gamma$ is a set of output symbols
  ◦ $\gamma$ is the output map: $Q \to \Gamma$
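One possible way to hold such a 6-tuple in code is sketched below in Java. The class `SequentialAutomaton` and its methods are hypothetical; $Q$, $\Sigma$ and $\Gamma$ are left implicit as the keys and values of the two maps, and every state is assumed to be declared with `state` before it is used.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SequentialAutomaton {
    final String q0;                                                     // initial state
    final Map<String, Map<String, Set<String>>> delta = new HashMap<>(); // Q x Sigma -> P(Q)
    final Map<String, String> gamma = new HashMap<>();                   // Q -> Gamma

    SequentialAutomaton(String q0) {
        this.q0 = q0;
    }

    // Declare a state together with its output symbol.
    SequentialAutomaton state(String q, String output) {
        gamma.put(q, output);
        return this;
    }

    // Add `to` to delta(from, symbol).
    SequentialAutomaton move(String from, String symbol, String to) {
        delta.computeIfAbsent(from, k -> new HashMap<>())
             .computeIfAbsent(symbol, k -> new TreeSet<>())
             .add(to);
        return this;
    }

    // Output symbols of all states reached from q0 after reading `word`.
    Set<String> outputs(String... word) {
        Set<String> current = new TreeSet<>(Collections.singleton(q0));
        for (String symbol : word) {
            Set<String> next = new TreeSet<>();
            for (String q : current) {
                Map<String, Set<String>> row = delta.get(q);
                if (row != null && row.containsKey(symbol)) {
                    next.addAll(row.get(symbol));
                }
            }
            current = next;
        }
        Set<String> out = new TreeSet<>();
        for (String q : current) {
            out.add(gamma.get(q));
        }
        return out;
    }

    public static void main(String[] args) {
        // A tiny made-up automaton: q0 outputs T and moves on f to q1 or q2, which both output U.
        SequentialAutomaton a = new SequentialAutomaton("q0")
            .state("q0", "T").state("q1", "U").state("q2", "U")
            .move("q0", "f", "q1").move("q0", "f", "q2");
        System.out.println(a.outputs());    // [T]
        System.out.println(a.outputs("f")); // [U]
        System.out.println(a.outputs("g")); // []
    }
}
```

Because $\delta$ maps into $\mathcal{P}(Q)$, reading a word yields a set of states, and `outputs` reports the set of their output symbols.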

Check type-consistency of objects → Test equivalence of automata
How?

Objects → Automata
• A set of objects → $Q$: a set of states
• A set of field names → $\Sigma$: a set of input symbols
• The field points-to map → $\delta$: the next-state map
• The object to be checked → $q_0$: the initial state
• A set of types → $\Gamma$: a set of output symbols
• The object-to-type map → $\gamma$: the output map

Example (the field points-to graph rooted at $o_2^T$):
  ◦ objects ↔ states: $O_2^T$, $O_4^U$, $O_6^X$, $O_8^Y$
  ◦ field names ↔ input symbols: f, g, h, k
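Putting the mapping to work, the Java sketch below treats the field points-to graph directly as the automaton and tests whether two objects yield equivalent automata. It uses an on-the-fly subset construction with a breadth-first search over pairs of state sets; this is one straightforward way to test equivalence and is not claimed to be the algorithm MAHJONG itself uses. The class `AutomataEquivalence`, its methods, the edges of the $o_1$/$o_2$ graphs (read off the earlier figure) and the objects `p1`..`p4` are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class AutomataEquivalence {
    static final Map<String, String> TYPE = new HashMap<>();                     // gamma: object -> type
    static final Map<String, Map<String, Set<String>>> FIELDS = new HashMap<>(); // delta: object x field -> objects

    static void obj(String o, String type) {
        TYPE.put(o, type);
    }

    static void pointsTo(String src, String field, String dst) {
        FIELDS.computeIfAbsent(src, k -> new TreeMap<>())
              .computeIfAbsent(field, k -> new TreeSet<>()).add(dst);
    }

    static {
        // Assumed reconstruction of the o1/o2 example graphs.
        String[][] objs = {{"o1", "T"}, {"o3", "U"}, {"o5", "X"}, {"o7", "Y"}, {"o9", "Y"}, {"o11", "Y"},
                           {"o2", "T"}, {"o4", "U"}, {"o6", "X"}, {"o8", "Y"},
                           {"p1", "A"}, {"p2", "A"}, {"p3", "B"}, {"p4", "C"}};
        for (String[] o : objs) obj(o[0], o[1]);
        pointsTo("o1", "f", "o3"); pointsTo("o3", "h", "o7"); pointsTo("o3", "h", "o9");
        pointsTo("o1", "g", "o5"); pointsTo("o5", "k", "o11");
        pointsTo("o2", "f", "o4"); pointsTo("o4", "h", "o8");
        pointsTo("o2", "g", "o6"); pointsTo("o6", "k", "o8");
        // Hypothetical pair with inconsistent types behind field f.
        pointsTo("p1", "f", "p3"); pointsTo("p2", "f", "p4");
    }

    static Set<String> outputs(Set<String> states) {
        Set<String> out = new TreeSet<>();
        for (String q : states) out.add(TYPE.get(q));
        return out;
    }

    static Set<String> symbols(Set<String> states) {
        Set<String> out = new TreeSet<>();
        for (String q : states) out.addAll(FIELDS.getOrDefault(q, Collections.<String, Set<String>>emptyMap()).keySet());
        return out;
    }

    static Set<String> step(Set<String> states, String field) {
        Set<String> next = new TreeSet<>();
        for (String q : states) {
            Map<String, Set<String>> row = FIELDS.get(q);
            if (row != null && row.containsKey(field)) next.addAll(row.get(field));
        }
        return next;
    }

    // True if the automata rooted at a and b produce the same type sets for every field sequence.
    static boolean equivalent(String a, String b) {
        Deque<List<Set<String>>> work = new ArrayDeque<>();
        Set<List<Set<String>>> seen = new HashSet<>();
        List<Set<String>> start = Arrays.<Set<String>>asList(
            new TreeSet<>(Collections.singleton(a)), new TreeSet<>(Collections.singleton(b)));
        work.add(start);
        seen.add(start);
        while (!work.isEmpty()) {
            List<Set<String>> pair = work.poll();
            Set<String> x = pair.get(0);
            Set<String> y = pair.get(1);
            if (!outputs(x).equals(outputs(y))) return false;
            Set<String> fields = symbols(x);
            fields.addAll(symbols(y));
            for (String f : fields) {
                List<Set<String>> next = Arrays.<Set<String>>asList(step(x, f), step(y, f));
                if (seen.add(next)) work.add(next);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalent("o1", "o2")); // true
        System.out.println(equivalent("p1", "p2")); // false
    }
}
```

The search visits each pair of state sets at most once, so it terminates even when the field points-to graph contains cycles.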
