SLIDE 1

LLDSAL A Low-Level Domain-Specific Aspect Language for Dynamic Code-Generation and Program Modification

Mathias Payer, Boris Bluntschli, & Thomas R. Gross Department of Computer Science ETH Zürich, Switzerland

SLIDE 2

2012-03-27 Mathias Payer, ETH Zürich

Motivation: program instrumentation

LLDSAL enables runtime code generation in a Dynamic Binary Translator (DBT)

SLIDE 3

Motivation: program instrumentation

LLDSAL enables runtime code generation in a Dynamic Binary Translator (DBT)

  • External aspects extend program functionality
  • Internal aspects to implement the instrumentation framework

[Figure: Dynamic Binary Translation (DBT) maps original blocks 1, 2, 3, 4 to translated blocks 1', 2', 3', 4']

SLIDE 4

Problem: code generation in DBT

DBT needs aspects that bridge between (translated) application and DBT world

  • No calling conventions, must store everything
  • Dynamic environment, no static addresses or locations
  • Code must be fast (JIT-able)

    char *code = ...;
    BEGIN_ASM(code)
      addl $5, %eax
      movl %eax, %edx
      incl {&my_var}
    END_ASM

SLIDE 5

Solution: LLDSAL

Low-level Domain Specific Aspect/Assembly Language

  • Aspects have access to high-level language constructs
  • Aspects adhere to low-level conventions

DBT and LLDSAL enable AOP without any hooks

  • JIT binary rewriting adds aspects on the fly

LLDSAL status: implemented and in use

  • LLDSAL used for internal aspects of a BT (fastBT)
  • LLDSAL guarantees security properties (libdetox security framework)

SLIDE 6

Outline

Motivation
Background: Binary Translation (BT)
Language design
Implementation
Related work
Conclusion

SLIDE 7

Binary Translation in a nutshell

  • Translates individual basic blocks
  • Checks branch targets and origins
  • Weaves aspects into translated code

[Figure: original code (blocks 1 to 4), translator, mapping table (1 → 1', 2 → 2', 3 → 3', ...), and code cache (blocks 1' to 3'); memory permission labels R and RX]

Indirect control flow transfers use a dynamic check to verify target and origin

SLIDE 8

Outline

Motivation
Binary Translation (BT)
Language design

  • Dynamic assembly language
  • Data (variable) access
  • Example: Dynamic code generation

Implementation
Related work
Conclusion

SLIDE 9

Language design

Usability: low-level / high-level trade-off

  • Mix assembly code plus access to high-level language constructs

Integration into host language

  • DSL integrates naturally into the host language

No runtime dependencies

  • Source-to-source translation (LLDSAL to C code)

LLDSAL defines a dynamic assembly language

  • Enables dynamic low-level code generation at runtime
SLIDE 10

Dynamic assembly language

LLDSAL combines assembly code with access to high-level data structures

  • Expressiveness and syntax comparable to inline assembler
  • JIT code generation at runtime, optimization for data-accesses
  • Parameters encoded (inlined) into instructions

    char *code = ...;       // pointer to code
    BEGIN_ASM(code)         // assembly block
      addl $5, %eax
      movl %eax, %edx
      incl {&my_var}        // variable access
    END_ASM
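To make "parameters encoded (inlined) into instructions" concrete, here is a minimal sketch of the kind of C code a source-to-source pass could produce for the assembly block above. It is not LLDSAL's actual output: the helper names `emit_u32` and `emit_example_block` are hypothetical, and the address of `my_var` is passed as a plain 32-bit number so the sketch compiles on any host. The byte values are the standard IA-32 encodings of the three instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append a 32-bit value in little-endian byte order, as IA-32 expects. */
static uint8_t *emit_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
    return p + 4;
}

/*
 * Hypothetical expansion of:
 *     BEGIN_ASM(code)
 *       addl $5, %eax
 *       movl %eax, %edx
 *       incl {&my_var}
 *     END_ASM
 * my_var_addr stands for the 32-bit address of my_var at generation time
 * (an assumption of this sketch). Returns the advanced code pointer.
 */
static uint8_t *emit_example_block(uint8_t *code, uint32_t my_var_addr)
{
    assert(code != NULL);
    *code++ = 0x83; *code++ = 0xc0; *code++ = 0x05;  /* addl $5, %eax   */
    *code++ = 0x89; *code++ = 0xc2;                  /* movl %eax, %edx */
    *code++ = 0xff; *code++ = 0x05;                  /* incl disp32     */
    code = emit_u32(code, my_var_addr);              /* inlined &my_var */
    return code;
}
```

Calling `emit_example_block(buf, addr)` writes 11 bytes. The last four are the address itself, so the generated `incl` touches the variable directly, without a parameter or a pointer load.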

SLIDE 11

Comparison LLDSAL vs. inline asm

Code generation

  • Inline asm executes code inline
  • LLDSAL generates code inline

Access to dynamic or thread local data

  • Inline asm uses indirect memory references (pointer chasing)
  • LLDSAL embeds direct pointers in generated code

Inline asm:

    asm ("incl %0\n" : "=a"(myvar) : "0"(myvar));

LLDSAL:

    char *code = ...;
    BEGIN_ASM(code)
      addl $5, %eax
      movl %eax, %edx
      incl {&my_var}
    END_ASM

SLIDE 12

Data (variable) access

JIT-compiled code enables new data access patterns

  • LLDSAL enables variable access in host space using {variable}

Variable addresses directly encoded in emitted code

  • No parameters are passed
  • No indirection or pointer chasing

    // inside indirect_call action
    BEGIN_ASM(code)
      incl {&tld->stat->nr_ind_calls}
    END_ASM

SLIDE 13

Dynamic code generation

    typedef void (*void_func)();
    long my_func(long a) { return a * a; }

    long result = 5;
    char *target = ...;
    void_func f = (void_func)target;
    {
      BEGIN_ASM(target)
        pushl ${result}         // pushes $5 to the stack
        call_abs {my_func}      // my_func(5)
        movl %eax, {&result}    // result = my_func(5)
        addl $4, %esp           // clean-up and return
        ret
      END_ASM
    }
    f();                        // execute dynamic code; result == 25
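The example above depends on values that are only known at generation time: the current value of `result`, the address of `result`, and the distance from the generated code to `my_func`. The following sketch emits the same five instructions by hand to show where each value ends up. It assumes that `call_abs` lowers to a relative `call` (opcode `E8` with a 32-bit displacement), which the slides do not state; the function name `emit_square_stub` and the explicit 32-bit address parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Append a 32-bit value in little-endian byte order. */
static uint8_t *emit_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
    return p + 4;
}

/*
 * buf         host buffer receiving the bytes
 * buf_addr    address at which the code will execute (assumed 32-bit)
 * result_val  value of `result` when the code is generated
 * result_addr address of `result`
 * func_addr   address of `my_func`
 * Returns the number of bytes written.
 */
static size_t emit_square_stub(uint8_t *buf, uint32_t buf_addr,
                               uint32_t result_val, uint32_t result_addr,
                               uint32_t func_addr)
{
    uint8_t *p = buf;
    uint32_t next;

    assert(buf != NULL);

    *p++ = 0x68;                                /* pushl $imm32        */
    p = emit_u32(p, result_val);

    *p++ = 0xe8;                                /* call rel32          */
    next = buf_addr + (uint32_t)(p - buf) + 4;  /* address after call  */
    p = emit_u32(p, func_addr - next);          /* wraps modulo 2^32   */

    *p++ = 0xa3;                                /* movl %eax, abs32    */
    p = emit_u32(p, result_addr);

    *p++ = 0x83; *p++ = 0xc4; *p++ = 0x04;      /* addl $4, %esp       */
    *p++ = 0xc3;                                /* ret                 */
    return (size_t)(p - buf);
}
```

The displacement of the call changes with the address of the code buffer, which is one reason such code cannot be assembled once ahead of time and copied into place.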


SLIDE 15

LLDSAL implementation

Translate LLDSAL to C code

[Figure: build pipeline. Source file (*.dsl) → GNU C preprocessor → LLDSAL processing (GNU assembler + objdump) → C output (*.c) → GNU C compiler → compiled object (*.o)]

SLIDE 16

LLDSAL alternatives

Macro-based approach

  • No additional compilation pass needed
  • Error prone, manual encoding

JIT code generation (GNU lightning, asmjit)

  • Very flexible, dynamic register allocation
  • High overhead, library dependencies

    #define PUSHL_IMM32(dst, imm) \
        *dst++=0x68; *((int32_t*)dst)=imm; dst+=4
    ...
    PUSHL_IMM32(code, 0xdeadbeef);
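One way the macro-based approach becomes error prone: the macro above expands to three statements, so using it as the body of an unbraced `if` makes only the first one conditional. The sketch below is a hedged variant, not taken from the slides, that wraps the body in `do { ... } while (0)` and writes the immediate byte by byte instead of through a cast pointer. The helper `emit_push_if` is hypothetical and exists only to show the macro in an unbraced `if`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Single-statement variant: safe in unbraced if/else, no unaligned store. */
#define PUSHL_IMM32(dst, imm)                            \
    do {                                                 \
        uint32_t imm32_ = (uint32_t)(imm);               \
        *(dst)++ = 0x68;                                 \
        *(dst)++ = (uint8_t)(imm32_ & 0xff);             \
        *(dst)++ = (uint8_t)((imm32_ >> 8) & 0xff);      \
        *(dst)++ = (uint8_t)((imm32_ >> 16) & 0xff);     \
        *(dst)++ = (uint8_t)((imm32_ >> 24) & 0xff);     \
    } while (0)

/* Emit "pushl $imm" only when enabled; returns the number of bytes written. */
static size_t emit_push_if(uint8_t *code, int enabled, uint32_t imm)
{
    uint8_t *p = code;

    assert(code != NULL);
    if (enabled)
        PUSHL_IMM32(p, imm);    /* expands to exactly one statement */
    return (size_t)(p - code);
}
```

Even in this form every opcode still has to be looked up and typed by hand, which is the manual encoding burden the slide refers to.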


SLIDE 18

Related work

Compile-time DSL parsing [Porkolab et al., GPCE'10]

  • LLDSAL is the first dynamic low-level DSAL for BT

Guyer and Lin describe an approach to optimize libraries for different environments [DSL'99]

  • Annotation based; LLDSAL uses assembly code with high-level data access

Khepera is an approach to source-to-source DSLs [Faith et al., DSL'97]

  • Full DSL parsing using syntax trees, too heavy-weight for LLDSAL
SLIDE 19

Conclusion

LLDSAL enables dynamic code generation for DBTs

  • Direct access to host variables and data structures
  • Low-overhead (no arguments passed, low-level encoding)
  • No library dependencies

LLDSAL raises level of interaction between developer and BT framework

  • Increased readability of code
  • Better maintainability due to automatic translation
SLIDE 20

Thank you for your attention

?

SLIDE 21

Data (variable) access

Use the address of the variable: ${&foo}

  • Instruction stores the current address as an immediate

Encode the (static) value of the variable: ${foo}

  • Instruction stores the current value as an immediate

Use the dynamic value of the variable: {&foo}

  • Instruction stores the address of the variable and encodes a memory dereference

Use the dynamic value at the address held in the variable: {foo}

  • Instruction stores the current value of the variable as an address and encodes a memory dereference
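The four forms differ in two independent choices: whether the embedded constant is the variable's address or its current value, and whether the instruction uses that constant as an immediate or as a memory operand. The sketch below shows this for a load into `%eax`. The enum, the function name, and the 32-bit parameters are hypothetical; `B8` (move immediate to `%eax`) and `8B 05` (load `%eax` from an absolute address) are standard IA-32 encodings, chosen here for illustration rather than taken from LLDSAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum access_form {
    IMM_ADDR,      /* movl ${&foo}, %eax : address of foo as immediate   */
    IMM_VALUE,     /* movl ${foo},  %eax : current value as immediate    */
    MEM_AT_ADDR,   /* movl {&foo},  %eax : load from memory at &foo      */
    MEM_AT_VALUE   /* movl {foo},   %eax : load from the address in foo  */
};

/* Emit "movl <operand>, %eax" for one form; returns the bytes written. */
static size_t emit_movl_to_eax(uint8_t *p, enum access_form form,
                               uint32_t foo_addr, uint32_t foo_value)
{
    uint8_t *start = p;
    /* The constant is either the address or the generation-time value. */
    uint32_t k = (form == IMM_ADDR || form == MEM_AT_ADDR) ? foo_addr
                                                           : foo_value;

    assert(p != NULL);
    if (form == IMM_ADDR || form == IMM_VALUE) {
        *p++ = 0xb8;                 /* movl $imm32, %eax  */
    } else {
        *p++ = 0x8b; *p++ = 0x05;    /* movl disp32, %eax  */
    }
    *p++ = (uint8_t)(k & 0xff);      /* little-endian constant */
    *p++ = (uint8_t)((k >> 8) & 0xff);
    *p++ = (uint8_t)((k >> 16) & 0xff);
    *p++ = (uint8_t)((k >> 24) & 0xff);
    return (size_t)(p - start);
}
```

Only the two memory forms observe later changes: `{&foo}` sees new values of `foo`, and `{foo}` sees new contents at the address `foo` held when the code was generated.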

SLIDE 22

Data (variable) access

pushl ${tld}

  • Push current value of tld onto stack

movl {tld->stack-1}, %esp

  • Read value from *(tld->stack-1) and store it in %esp

movl ${tld->stack-1}, %esp

  • Store the value of tld->stack-1 (an address) in %esp

movl %eax, {&tld->saved_eax}

  • Store %eax at &tld->saved_eax
SLIDE 23

Example (indirect lookup, inside BT)

    BEGIN_ASM(transl_instr)
      pushfl
      pushl %ebx
      pushl %ecx
      movl 12(%esp), %ebx     // Load target address
      movl %ebx, %ecx         // Duplicate RIP
      /* Load hashline (eip element) */
      andl ${MAPPING_PATTERN >> 3}, %ebx;
      cmpl {tld->mappingtable}(, %ebx, 8), %ecx;
      jne nohit
    hit:
      // Load target
      movl {tld->mappingtable+4}(, %ebx, 8), %ebx
      movl %ebx, {&tld->ind_target}
      popl %ecx
      popl %ebx
      popfl
      leal 4(%esp), %esp
      jmp *{&tld->ind_target}
    nohit:
      // recover mode - there was no hit!
      ...
    END_ASM
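For readers less familiar with scaled-index addressing, the lookup above can be read as the following C sketch. It assumes 8-byte table entries holding the original address at offset 0 and the translated address at offset 4, which is what the `(, %ebx, 8)` and `+4` operands suggest. The value of `MAPPING_PATTERN`, the struct layout, and the function names are hypothetical, and a miss simply returns 0 here because the slide elides the recovery path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAPPING_PATTERN 0x7f8u   /* hypothetical mask: 256 entries of 8 bytes */
#define MAPPING_ENTRIES ((MAPPING_PATTERN >> 3) + 1)

struct mapping_entry {
    uint32_t src;   /* address in the original code (offset 0)   */
    uint32_t dst;   /* address in the code cache     (offset 4)  */
};

/* andl ${MAPPING_PATTERN >> 3}, %ebx */
static uint32_t hash_index(uint32_t addr)
{
    return addr & (MAPPING_PATTERN >> 3);
}

/* Record a translation; this sketch overwrites on collision. */
static void mapping_insert(struct mapping_entry *table,
                           uint32_t src, uint32_t dst)
{
    assert(table != NULL);
    table[hash_index(src)].src = src;
    table[hash_index(src)].dst = dst;
}

/* cmpl ...; jne nohit; movl {mappingtable+4}(, %ebx, 8), %ebx */
static uint32_t mapping_lookup(const struct mapping_entry *table,
                               uint32_t target)
{
    const struct mapping_entry *e;

    assert(table != NULL);
    e = &table[hash_index(target)];
    return (e->src == target) ? e->dst : 0;   /* 0 stands for "nohit" */
}
```

The generated assembly performs the same compare and load, but with the table address embedded in the instructions through `{tld->mappingtable}` instead of being passed as an argument.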