The geometry of syntax and semantics for directed file transformations

IEEE S&P 2020 LangSec workshop, 21 May 2020. Steve Huntsman¹ and Michael Robinson². ¹FAST Labs / Cyber Technology; ²American University.


  1. The geometry of syntax and semantics for directed file transformations
     Steve Huntsman¹, Michael Robinson²
     ¹FAST Labs / Cyber Technology; ²American University
     IEEE S&P 2020 LangSec workshop, 21 May 2020

  2. string.h must be used carefully to prevent buffer overflows
     • X = strings of ASCII NULLs and printable characters
     • G = cyclic shifts on individual characters
     • Goal: remove NULLs and punctuation; make lowercase
     • This example is discussed in the paper
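The group action on this slide can be sketched as follows. This is a minimal illustration, not the construction from the paper: the 96-symbol alphabet (NUL plus printable ASCII) and the helper names `apply_shifts`, `inverse`, and `lowercasing_shifts` are assumptions made for the example, and only the "make lowercase" part of the goal is shown.

```python
# Assumed alphabet for this sketch: NUL plus the 95 printable ASCII characters.
ALPHABET = "\x00" + "".join(chr(c) for c in range(32, 127))
N = len(ALPHABET)
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def apply_shifts(text, shifts):
    """Act on `text` with one cyclic shift per character position."""
    if len(text) != len(shifts):
        raise ValueError("need exactly one shift per character")
    return "".join(ALPHABET[(INDEX[ch] + k) % N] for ch, k in zip(text, shifts))


def inverse(shifts):
    """Group inverse: undo each shift modulo the alphabet size."""
    return [(-k) % N for k in shifts]


def lowercasing_shifts(text):
    """Shifts that move uppercase letters to lowercase and fix everything else."""
    return [(INDEX[ch.lower()] - INDEX[ch]) % N for ch in text]


original = "Hello, World\x00"
g = lowercasing_shifts(original)
lowered = apply_shifts(original, g)
restored = apply_shifts(lowered, inverse(g))
print(repr(lowered), restored == original)
```

Because every element of G has an inverse, the original string can always be recovered from the transformed one together with the shifts that were applied.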

  3. Transform files to achieve language-theoretical security
     • X = space of files in some fixed format (e.g., PDF)
     • G = various invertible transformations
     • Goal: eliminate nondeterministic syntax
     • Input ambiguity = vulnerability

  4. Patch binary code to secure critical legacy systems
     • X = space of disassembled binary code
     • G = “sugar-neutral” lifts, translations, etc.
     • Goal: parsimoniously patch a known vulnerability
     • Compiler/build options and dependencies make this hard

  8. Principal bundles model syntax and semantics
     • X = space of documents
     • G = group of invertible transformations
     • Think of X like a manifold and get something akin to a principal bundle P(X, G)
     • Locally looks like X × G
     • G acts on P nicely
     • E.g., X = S¹ (time of day); G = ℤ (epoch); P = ℝ (as a helix above X)
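The helix example can be made concrete with a short sketch. It assumes time is measured in days, so that the circle S¹ is modelled as the interval [0, 1) with its ends identified; the function names are hypothetical, and exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction
import math


def project(t):
    """Bundle projection from the line P to the circle X: keep only the time of day."""
    return t % 1


def act(t, n):
    """Action of the integer n in G: move n whole days along the fiber."""
    return t + n


def trivialize(t):
    """Product coordinates (time of day, day count) for a point of P."""
    return (t % 1, math.floor(t))


def untrivialize(x, n):
    """Inverse of `trivialize`: rebuild the point on the line."""
    return x + n


t = Fraction(5, 2)              # noon, two days after the epoch
print(project(t), project(act(t, 3)))   # the action does not change the time of day
print(trivialize(act(t, 3)))            # only the day count moves
```

The group moves points only along the fiber over a fixed time of day, which is the sense in which P looks like X × G.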

  9. Principal bundles model syntax and semantics
     • X = space of documents
     • G = group of invertible transformations
     • Think of X like a manifold and get something akin to a principal bundle P(X, G)
     • Locally looks like X × G
     • G acts on P nicely
     • E.g., X = S¹; G = (0, 1) with x ⊞ y := f(f⁻¹(x) + f⁻¹(y)) for invertible f : ℝ → (0, 1)
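The operation ⊞ transports ordinary addition on ℝ to the interval (0, 1). The slide leaves f unspecified; the sketch below assumes the logistic function as one possible choice, for which the identity element is f(0) = 1/2 and the inverse of x is 1 − x.

```python
import math


def f(t):
    """Assumed choice of invertible map from the real line onto (0, 1): the logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def f_inv(x):
    """Inverse of the logistic function (the logit), defined for 0 < x < 1."""
    return math.log(x / (1.0 - x))


def boxplus(x, y):
    """Group operation on (0, 1) obtained by transporting addition through f."""
    return f(f_inv(x) + f_inv(y))


def inverse(x):
    """Group inverse: negate on the real line, then map back."""
    return f(-f_inv(x))


IDENTITY = f(0.0)

print(boxplus(0.3, IDENTITY), boxplus(0.3, inverse(0.3)))
```

Any other invertible f would give a group with the same structure as (ℝ, +); only the coordinates on (0, 1) change.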

  10. Principal bundles model syntax and semantics
      • X = space of documents
      • G = group of invertible transformations
      • Think of X like a manifold and get something akin to a principal bundle P(X, G)
      • Locally looks like X × G
      • G acts on P nicely
      • E.g., Hopf fibration S¹ → S³ → S²
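For the Hopf fibration, a numerical check shows the fiber structure directly. The sketch uses the standard formula for the Hopf map on pairs of complex numbers with |z1|² + |z2|² = 1; the function names and the sample point are chosen for illustration only.

```python
import cmath
import math


def hopf(z1, z2):
    """Hopf map: send a point (z1, z2) of the 3-sphere to a point of the 2-sphere."""
    w = 2 * z1 * z2.conjugate()
    return (w.real, w.imag, abs(z1) ** 2 - abs(z2) ** 2)


def rotate(z1, z2, theta):
    """Circle action: multiply both coordinates by the same unit phase."""
    phase = cmath.exp(1j * theta)
    return (phase * z1, phase * z2)


point = (complex(0.6, 0.0), complex(0.0, 0.8))   # lies on the 3-sphere
base = hopf(*point)
moved = hopf(*rotate(*point, 1.234))
print(base, moved)
```

Rotating by any phase moves the point around its circle fiber in S³ while its image in S² stays fixed.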

  14. Connections model geometry directing transformations
      • Principal bundles are a natural arena for geometry realized through a connection
      • I.e., a “vertical” and “horizontal” direct sum decomposition of tangent spaces ...
      • ... that is equivariant under group action
      • Connects local product geometries via parallel transport

  21. Syntactic transformations must be invertible
      • This requirement of the mathematical model is really a hint about how to perform file transformations
      • Record (or in reverse, delete) details of atomic transformations in ancillae:
        objend ⇒ objend % objend -> endobj ⇒ endobj % objend -> endobj
      • Sugar-neutral: transformations should handle sugar, but not introduce or eliminate it
      • Suggests using normal forms
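The objend/endobj example can be sketched as a pair of mutually inverse functions. This is an illustration under simplifying assumptions: it works on a single line that consists only of the keyword, it stores the ancilla as a trailing PDF comment (a `%` starts a comment in PDF), and the names `ancilla`, `forward`, and `backward` are hypothetical.

```python
def ancilla(old, new):
    """Comment text that records which atomic rewrite was applied."""
    return f" % {old} -> {new}"


def forward(line, old, new):
    """Record the rewrite in an ancilla, then apply it."""
    if line != old:
        raise ValueError(f"expected {old!r}, got {line!r}")
    recorded = line + ancilla(old, new)      # objend % objend -> endobj
    return new + recorded[len(old):]         # endobj % objend -> endobj


def backward(line, old, new):
    """Undo the rewrite, then delete the ancilla."""
    tag = ancilla(old, new)
    if line != new + tag:
        raise ValueError("line does not carry the expected ancilla")
    restored = old + tag                     # objend % objend -> endobj
    return restored[: -len(tag)]             # objend


fixed = forward("objend", "objend", "endobj")
print(fixed)
print(backward(fixed, "objend", "endobj"))
```

Keeping the record next to the rewritten token is what makes the repair reversible: without the ancilla, a corrected `endobj` could not be told apart from one that was correct all along.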

  22. Normal forms simplify and disambiguate
      [Figure: three code listings shown side by side; sources credited on the slide are Lacomis et al. and Zhang and D’Hollander. The listings, de-interleaved from the extracted text, read:]
      • C source: int i; for (i=0; i<10; i++) { z+=i; } int n=0; while (n<10) { x+=n; n++; }
      • Structured form: START; S; do while b; S; do while b; if b; S; do while b; S; enddo; endif; S; enddo; S; enddo; HALT
      • Jump form: jmp @5; @4: jmp @9; @8: jne @19; jmp @10; @19: jmp @14; @13:@14: jg @13; @9:@10: jge @20; jmp @8; @5:@20: jge @21; jmp @4; @21:

  23. Concrete syntax trees parameterize a principal bundle
      • G corresponds to semantics-preserving CST transformations
      • Equivalence class of CSTs corresponding to a given AST has group-theoretical and language security significance and indicates format redundancy
      • E.g., xref table in PDF (which nobody trusts)
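The idea of many concrete syntaxes sharing one abstract syntax can be illustrated with Python's own `ast` module. This is an analogy rather than a PDF example: Python source text stands in for the concrete syntax, the helper names are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast


def abstract_syntax(source):
    """Dump of the AST; whitespace, comments, and redundant parentheses are dropped."""
    return ast.dump(ast.parse(source))


def same_class(a, b):
    """True when two concrete texts share the same abstract syntax."""
    return abstract_syntax(a) == abstract_syntax(b)


def normal_form(source):
    """One canonical representative of the class (needs Python 3.9+ for ast.unparse)."""
    return ast.unparse(ast.parse(source))


variants = ["x = (1 + 2)", "x=1+2", "x = 1 + 2  # a comment"]
print(all(same_class(variants[0], v) for v in variants))
print({normal_form(v) for v in variants})
print(same_class("x = 1 + 2", "x = 2 + 1"))
```

The size of such an equivalence class is a rough measure of redundancy in the format, and choosing one representative per class is what a normal form does.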

  25. Dynamic concretization semantically enriches an AST
      “[Files] can be considered as an abstraction of their semantics. For example the syntax of [files] records the existence of [objects] and maybe their type but not [the trace of a parser or renderer], as defined by the semantics.”¹
      • Annotating (with, e.g., types) and cross-linking an AST gives a semantically rich derived graph
      • To understand a file, parse it ...
      • ... to understand it more, render/compile it
      ¹ [Cousot and Cousot], replacing “program” and “variable” with “file” and “object,” respectively.
