Binary‐level program analysis: Static Disassembly
Gang Tan
CSE 597 Spring 2019 Penn State University
Disassemblers
– Disassembler: converts machine code in a binary file into assembly code or code in an equivalent IR
– Assume a …
next instruction
– “6A 03”
– “83 C4 0C”
– “B8 CC CC CC CC”
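The three byte strings above have different lengths (2, 3, and 5 bytes), which is what makes x86 decoding position-sensitive. A minimal sketch of a decoder for just these three encodings, assuming 32-bit x86; the function name `decode` and the output format are illustrative and not from the slides, and sign extension of the 8-bit immediates is ignored:

```python
# Toy decoder for the three encodings listed above (32-bit x86 assumed).
# `decode` and its return format are hypothetical, for illustration only.

def decode(code: bytes, offset: int = 0):
    """Return (length, text) for the instruction at `offset`, or None."""
    op = code[offset]
    if op == 0x6A and offset + 2 <= len(code):
        # 6A ib: push imm8
        return 2, f"push 0x{code[offset + 1]:x}"
    if op == 0x83 and offset + 3 <= len(code) and code[offset + 1] == 0xC4:
        # 83 /0 ib with ModRM byte C4 (mod=11, reg=000, rm=100): add esp, imm8
        return 3, f"add esp, 0x{code[offset + 2]:x}"
    if op == 0xB8 and offset + 5 <= len(code):
        # B8 id: mov eax, imm32 (immediate stored little-endian)
        imm = int.from_bytes(code[offset + 1:offset + 5], "little")
        return 5, f"mov eax, 0x{imm:x}"
    return None  # anything else is outside this toy decoder


for hex_bytes in ("6A03", "83C40C", "B8CCCCCCCC"):
    print(hex_bytes, decode(bytes.fromhex(hex_bytes)))
```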
– To accommodate variable-sized instruction sets
– May result in overlapping instructions
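To see how overlapping instructions can arise, here is a small sketch that reads the bullets above as describing decoding attempted at every byte offset (an assumption, since the slide heading did not survive). The length table `TOY_LENGTHS` is hypothetical and covers only three opcodes; a real decoder would need the full instruction set.

```python
# Hypothetical length table for a toy decoder: opcode byte -> instruction length.
# (0x83 is 3 bytes only for register-direct operands; this is a simplification.)
TOY_LENGTHS = {0x6A: 2, 0x83: 3, 0xB8: 5}


def decode_all_offsets(code: bytes) -> dict:
    """Map every offset that decodes to a complete instruction to its length."""
    found = {}
    for off in range(len(code)):
        length = TOY_LENGTHS.get(code[off])
        if length is not None and off + length <= len(code):
            found[off] = length
    return found


def overlapping_pairs(found: dict) -> list:
    """Return pairs of start offsets whose byte ranges intersect."""
    starts = sorted(found)
    return [(a, b)
            for i, a in enumerate(starts)
            for b in starts[i + 1:]
            if b < a + found[a]]  # b starts before a's last byte ends


code = bytes.fromhex("B86A0383C40C")
candidates = decode_all_offsets(code)
print(candidates)
print(overlapping_pairs(candidates))
```

In this input the 5-byte instruction at offset 0 contains the bytes of two shorter candidates, so the candidates cannot all be real instructions at once.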
direct conditional branch instructions are selected as jump candidates
jump candidates as the starting points using recursive traversal
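A rough sketch of recursive traversal from a set of starting addresses, written with an explicit worklist instead of recursion. The instruction map layout `(length, kind, target)` and the kind names `"jmp"`, `"jcc"`, `"ret"` are assumptions made for this example; the slides do not specify a representation.

```python
# Toy program model (hypothetical): address -> (length, kind, branch target or None)

def recursive_traversal(insns: dict, starts) -> set:
    """Collect addresses of instructions reachable from the start addresses."""
    seen, work = set(), list(starts)
    while work:
        addr = work.pop()
        if addr in seen or addr not in insns:
            continue  # already visited, or outside the decoded region
        seen.add(addr)
        length, kind, target = insns[addr]
        if kind in ("jmp", "jcc") and target is not None:
            work.append(target)          # follow the direct branch target
        if kind not in ("jmp", "ret"):
            work.append(addr + length)   # fall through to the next instruction
    return seen


toy = {
    0: (2, "jcc", 6),     # conditional branch: both target and fall-through
    2: (2, "other", None),
    4: (1, "ret", None),
    5: (1, "other", None),  # never reached from address 0
    6: (1, "ret", None),
}
print(sorted(recursive_traversal(toy, [0])))
```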
Initial control flow graph. Blue nodes represent nodes in the real CFG; red nodes represent spurious nodes; node A is the entry node; pink dashed lines indicate a conflict between two nodes; solid arrows represent edges in the initial CFG.
– Entry node must be valid
– Nodes reachable from a valid node must be valid
– Nodes in conflict with valid nodes must be invalid
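The three rules above can be sketched as a reachability pass followed by a conflict check. This is an illustrative sketch only: the graph representation (`succ` as a successor dictionary, `conflicts` as a list of node pairs) and the function name are assumptions, and the node names are made up rather than taken from the figure.

```python
def mark_from_entry(succ: dict, conflicts: list, entry):
    """Return (valid, invalid) node sets derived from the entry node."""
    # Rules 1 and 2: the entry is valid, and so is everything reachable from it.
    valid, work = set(), [entry]
    while work:
        node = work.pop()
        if node in valid:
            continue
        valid.add(node)
        work.extend(succ.get(node, ()))
    # Rule 3: a node in conflict with a valid node is invalid.
    invalid = set()
    for a, b in conflicts:
        if a in valid and b not in valid:
            invalid.add(b)
        if b in valid and a not in valid:
            invalid.add(a)
    return valid, invalid


succ = {"entry": ["x"], "x": ["y"], "z": ["y"]}
print(mark_from_entry(succ, [("x", "z")], "entry"))
```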
– Assumption: valid nodes do not overlap
– If two nodes in conflict share an ancestor, the ancestor must be invalid
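The reasoning is that a valid ancestor would make both of its descendants valid, contradicting the no-overlap assumption. A minimal sketch of the check, assuming the same hypothetical `succ` / `conflicts` representation as a successor dictionary and a list of node pairs:

```python
def descendants(succ: dict, node) -> set:
    """All nodes reachable from `node` by following at least one edge."""
    seen, work = set(), list(succ.get(node, ()))
    while work:
        n = work.pop()
        if n not in seen:
            seen.add(n)
            work.extend(succ.get(n, ()))
    return seen


def invalid_common_ancestors(succ: dict, conflicts: list) -> set:
    """Nodes that reach both members of some conflicting pair."""
    nodes = set(succ) | {m for targets in succ.values() for m in targets}
    bad = set()
    for node in nodes:
        reach = descendants(succ, node)
        if any(a in reach and b in reach for a, b in conflicts):
            bad.add(node)
    return bad


succ = {"p": ["a", "b"], "q": ["a"], "a": [], "b": []}
print(invalid_common_ancestors(succ, [("a", "b")]))
```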
– Assumes that valid nodes are more tightly integrated into a CFG
– A node with more predecessors is taken to be more tightly integrated
– Clearly a heuristic

– Assumes that valid nodes are more tightly integrated into a CFG
– A node with more direct successors is taken to be more tightly integrated
– Also a heuristic

– Pick one of the two conflicting nodes at random
– A last resort when nothing else decides
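The two heuristics and the random fallback can be combined into one tie-breaking function. This sketch reads the heuristics as "discard the node with fewer predecessors, then the one with fewer direct successors", which is an interpretation of the bullets above; the function name and the `preds` / `succs` dictionaries are hypothetical.

```python
import random


def pick_loser(a, b, preds: dict, succs: dict, rng=random):
    """Choose which of two conflicting nodes to discard."""
    # Compare predecessor counts first, then direct-successor counts.
    for degree in (preds, succs):
        da = len(degree.get(a, ()))
        db = len(degree.get(b, ()))
        if da != db:
            return a if da < db else b  # the less integrated node loses
    # Both heuristics tie: fall back to a random choice.
    return rng.choice([a, b])


preds = {"a": ["x", "y"], "b": ["x"]}
print(pick_loser("a", "b", preds, {}))
```

Passing a seeded `random.Random` instance as `rng` makes the fallback reproducible when comparing runs.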
Control flow graph after the first step (Node B is removed).
Control flow graph after the second step (Node J is removed).
Control flow graph after the third step (Node K is removed).
Control flow graph after the fourth step (Node C is removed).
Program      Objdump  Linn/Debray  IDA Pro  This paper
compress95     56.07        69.96    24.19       91.04
gcc            65.54        82.18    45.09       88.45
go             66.08        78.12    43.01       91.81
ijpeg          60.82        74.23    31.46       91.60
li             56.65        72.78    29.07       89.86
m88ksim        58.42        75.66    29.56       90.39
perl           57.66        72.01    31.36       86.93
vortex         66.02        76.97    42.65       90.71
Mean           60.91        75.24    34.55       90.10