An Incremental Learner for Language-Based Anomaly Detection in XML
Harald Lampesberger
Department of Secure Information Systems, University of Applied Sciences Upper Austria, harald.lampesberger@fh-hagenberg.at
LangSec Workshop, 26 May 2016
Extensible Markup Language (XML)
Schema validation is a first-line defense
Two language-theoretic flaws
From http://schemas.xmlsoap.org/soap/envelope/
...
<xs:element name="Header" type="tns:Header"/>
<xs:complexType name="Header">
  <xs:sequence>
    <xs:any namespace="##other" minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
  </xs:sequence>
  <xs:anyAttribute namespace="##other" processContents="lax"/>
</xs:complexType>
...
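The wildcards in this schema fragment are the extension points that the talk is concerned with. As a minimal sketch, the following Python snippet lists them with the standard library only. The surrounding xs:schema root and the helper name extension_points are assumptions added so that the fragment parses on its own; they are not part of the quoted schema.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# The quoted fragment, wrapped in an assumed minimal xs:schema root.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tns="http://schemas.xmlsoap.org/soap/envelope/"
           targetNamespace="http://schemas.xmlsoap.org/soap/envelope/">
  <xs:element name="Header" type="tns:Header"/>
  <xs:complexType name="Header">
    <xs:sequence>
      <xs:any namespace="##other" minOccurs="0" maxOccurs="unbounded"
              processContents="lax"/>
    </xs:sequence>
    <xs:anyAttribute namespace="##other" processContents="lax"/>
  </xs:complexType>
</xs:schema>
"""


def extension_points(schema_text):
    """Return (type name, wildcard kind, processContents) for each wildcard."""
    root = ET.fromstring(schema_text.strip())
    wildcard_tags = (f"{{{XS}}}any", f"{{{XS}}}anyAttribute")
    found = []
    for ctype in root.iter(f"{{{XS}}}complexType"):
        for node in ctype.iter():
            if node.tag in wildcard_tags:
                kind = node.tag.split("}")[1]
                # XML Schema defaults processContents to "strict" when absent.
                found.append((ctype.get("name"), kind,
                              node.get("processContents", "strict")))
    return found


points = extension_points(SCHEMA)
for point in points:
    print(point)
```

Both wildcards use lax processing, so content from other namespaces passes validation unchecked when no declaration for it is available.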
Digitally signed part = processed part
Signed request:
  soap:Envelope / soap:Header / wsse:Security / ds:Signature / ds:SignedInfo / ds:Reference, @URI = #123
  soap:Envelope / soap:Body, @wsu:Id = 123 / MonitorInstances (verified, processed)
Attack:
  soap:Envelope / soap:Header / wsse:Security / ds:Signature / ds:SignedInfo / ds:Reference, @URI = #123
  Wrapper / soap:Body, @wsu:Id = 123 / MonitorInstances (verified)
  soap:Body, @wsu:Id / CreateKeyPair (processed)
Jensen et al. (2011): removing extension points is hard
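The mismatch between the verified and the processed element, commonly called XML signature wrapping, can be reproduced in a few lines. This is a sketch under assumptions: the signature elements are omitted, the Wrapper is placed inside soap:Header, the second body carries the hypothetical Id value "attack", and both lookup functions are illustrative stand-ins rather than the logic of any real SOAP stack.

```python
import xml.etree.ElementTree as ET

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "wsu": "http://docs.oasis-open.org/wss/2004/01/"
           "oasis-200401-wss-wssecurity-utility-1.0.xsd",
}
WSU_ID = f"{{{NS['wsu']}}}Id"

# Assumed attack message: the signed body is moved into a Wrapper element,
# and a new, unsigned body takes its place under the envelope.
ATTACK = f"""
<soap:Envelope xmlns:soap="{NS['soap']}" xmlns:wsu="{NS['wsu']}">
  <soap:Header>
    <Wrapper>
      <soap:Body wsu:Id="123"><MonitorInstances/></soap:Body>
    </Wrapper>
  </soap:Header>
  <soap:Body wsu:Id="attack"><CreateKeyPair/></soap:Body>
</soap:Envelope>
"""


def signed_element(root, uri):
    """Resolve a reference such as '#123' by wsu:Id anywhere in the tree."""
    wanted = uri.lstrip("#")
    for node in root.iter():
        if node.get(WSU_ID) == wanted:
            return node
    return None


def processed_element(root):
    """Application view: the soap:Body that is a direct child of the envelope."""
    return root.find("soap:Body", NS)


root = ET.fromstring(ATTACK.strip())
verified = signed_element(root, "#123")
processed = processed_element(root)
print("verified: ", verified[0].tag)
print("processed:", processed[0].tag)
```

The signature check follows the Id reference and finds MonitorInstances, while the application takes the body by position and runs CreateKeyPair.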
Approach: learn the acceptable language
Event stream alphabets
Stack alphabet = states
States partitioned into modules (schema types)
Transitions in and between modules
cXVPA representation
[Figure: cXVPA for the example document, with modules Order and Item, states q0, qf, e and x, and transitions labelled itm/eOrder, itm/xOrder, token, int]
Example document:
<ord>
  <itm>Product A</itm>
  <itm>8877955335</itm>
</ord>
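To make the notion of an event stream concrete, the example document can be turned into a sequence of start, text and end events with Python's xml.sax module. This is a minimal sketch: the event tuples and the class name EventCollector are assumptions for illustration, and text is kept as plain character data here rather than being mapped to datatypes.

```python
import xml.sax


class EventCollector(xml.sax.ContentHandler):
    """Collects (kind, value) events; whitespace-only text is dropped."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def _flush_text(self):
        # SAX may deliver text in several chunks, so join before emitting.
        text = "".join(self._text).strip()
        self._text = []
        if text:
            self.events.append(("text", text))

    def startElement(self, name, attrs):
        self._flush_text()
        self.events.append(("start", name))

    def endElement(self, name):
        self._flush_text()
        self.events.append(("end", name))

    def characters(self, content):
        self._text.append(content)


def event_stream(document):
    handler = EventCollector()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.events


DOC = "<ord> <itm>Product A</itm> <itm>8877955335</itm> </ord>"
events = event_stream(DOC)
for event in events:
    print(event)
```

Start events play the role of calls and end events the role of returns, which is what lets a pushdown automaton with a stack follow the nesting.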
Learner computes an updated dXVPA
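Validator checks acceptance
[Figure: learning loop. Training doc_i and the previous (A_{i-1}, ω_{i-1}) enter the Learner (incWeightedVPA, trim, genXVPA), which yields (A_i, ω_i), dXVPA_i and cXVPA_i; the Validator decides accept yes/no for a Document.]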
Every event stream prefix gets a unique state
Merge two states if they are k-l-locally the same
[Figure: example tree rooted at dealer with children usedcars and newcars, ad elements below them, and model/year children with values VW, 2014, Tesla]
Example: (dealer#usedcars · newcars, ad · ad) becomes (newcars, ad) when 1-1-local (k = 1, l = 1)
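The merging rule can be illustrated with a simplified state name: the pair of ancestor names and left-sibling names of an element, cut down to the last k ancestors and the last l siblings. This is only a sketch under assumptions. The naming scheme is a simplification of the one on the slide, the document below is a guess at the pictured tree, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def contexts(root):
    """Yield (element, ancestors, left_siblings) in document order."""
    def walk(node, ancestors):
        path = ancestors + (node.tag,)
        left = []
        for child in node:
            yield child, path, tuple(left)
            yield from walk(child, path)
            left.append(child.tag)
    yield root, (), ()
    yield from walk(root, ())


def kl_name(ancestors, left_siblings, k, l):
    """Keep only the last k ancestors and the last l left siblings."""
    return (ancestors[-k:] if k else ()), (left_siblings[-l:] if l else ())


def merged_states(root, k, l):
    """Group elements whose truncated names coincide."""
    groups = defaultdict(list)
    for elem, anc, left in contexts(root):
        groups[kl_name(anc, left, k, l)].append(elem.tag)
    return dict(groups)


# Assumed reconstruction of the example tree.
DOC = """
<dealer>
  <usedcars>
    <ad><model>VW</model><year>2014</year></ad>
    <ad><model>VW</model><year>2014</year></ad>
  </usedcars>
  <newcars>
    <ad><model>Tesla</model></ad>
    <ad><model>Tesla</model></ad>
    <ad><model>Tesla</model></ad>
  </newcars>
</dealer>
"""

root = ET.fromstring(DOC.strip())
third_ad = kl_name(("dealer", "newcars"), ("ad", "ad"), k=1, l=1)
states = merged_states(root, k=1, l=1)
print(third_ad)
print(len(states), "merged states")
```

With k = 1 and l = 1 the second and third ad under newcars fall into the same state, so a document with any number of further ads there would reach a state the learner already knows.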
ωi: frequencies of states and transitions from learning
Unlearning
Sanitization
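One way to picture how frequencies could support both operations is a counter over transitions. The sketch below rests on assumptions that the slide does not spell out: unlearning is modelled as subtracting the counts of a previously learned document, sanitization as dropping transitions seen fewer than a threshold number of times, and pairs of consecutive events stand in for automaton transitions. The class WeightedModel is hypothetical.

```python
from collections import Counter


class WeightedModel:
    """Toy stand-in for an automaton with counts: transition -> frequency."""

    def __init__(self):
        self.omega = Counter()

    @staticmethod
    def transitions(events):
        # Consecutive event pairs; a real automaton tracks states and a stack.
        return list(zip(["<start>"] + events, events + ["<end>"]))

    def learn(self, events):
        self.omega.update(self.transitions(events))

    def unlearn(self, events):
        # Assumption: unlearning subtracts what the document contributed.
        self.omega.subtract(self.transitions(events))
        self.omega = +self.omega  # unary plus drops counts that are <= 0

    def sanitize(self, min_count):
        # Assumption: sanitization trims rarely used transitions.
        self.omega = Counter(
            {t: c for t, c in self.omega.items() if c >= min_count})

    def accepts(self, events):
        return all(t in self.omega for t in self.transitions(events))


normal = ["ord", "itm", "itm", "/ord"]
odd = ["ord", "script", "/ord"]

model = WeightedModel()
for _ in range(3):
    model.learn(normal)
model.learn(odd)
accepted_before = model.accepts(odd)

model.unlearn(odd)
accepted_after_unlearn = model.accepts(odd)

poisoned = WeightedModel()
for _ in range(3):
    poisoned.learn(normal)
poisoned.learn(odd)
poisoned.sanitize(min_count=2)
accepted_after_sanitize = poisoned.accepts(odd)
print(accepted_before, accepted_after_unlearn, accepted_after_sanitize)
```

In both cases the normal document stays accepted, because its transitions were counted three times and survive the subtraction or the threshold.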
Two synthetic and two realistic datasets
Learning progress
[Figure: learning progress over training iterations for Catalog (k = 1, l = 2) and VulnShopAuthOrder (k = 1, l = 2); plotted series are F1 and FPR (0% to 100%) and MC]
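For reference, the two rates named in the plots follow the standard definitions, computed from confusion-matrix counts. The sketch assumes that an attack document is the positive class; the counts in the usage lines are hypothetical and not results from the talk.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN); defined as 0.0 when there are no negatives."""
    denom = fp + tn
    return fp / denom if denom else 0.0


# Hypothetical counts for one evaluation step.
f1 = f1_score(tp=8, fp=2, fn=2)
fpr = false_positive_rate(fp=1, tn=9)
print(f"F1 = {f1:.0%}, FPR = {fpr:.0%}")
```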
Learner outperformed schema validation
Contributions in the paper
Use cases
Learner needs a datatyped event stream instead of character data
Example
{boolean, unsignedByte}
[Figure: diagram of XML Schema datatypes with top element ⊤: boolean, unsignedByte, byte, language, NCName, duration, dayTimeDuration, yearMonthDuration, QName, Name, NMTOKEN, token, normalizedString, string, base64Binary, gMonth, gDay, gMonthDay, gYearMonth, double, decimal, integer, unsignedShort, unsignedInt, unsignedLong, nonNegativeInteger, short, int, long, gYear, nonPositiveInteger, negativeInteger, hexBinary, anyURI, dateTime, dateTimeStamp, time, date, positiveInteger, NMTOKENS, ENTITIES]
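A rough way to see how a text value could map to a set such as {boolean, unsignedByte} is to test it against the lexical spaces of a few datatypes and keep only the most specific matches. The sketch is heavily reduced and rests on assumptions: only five of the datatypes are modelled, the subset ordering in SUPERTYPE is hand-written for those five, and the input "1" is a guess at what the example on the slide stands for.

```python
import re

INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")

# Lexical membership tests for an assumed, reduced set of datatypes.
CHECKS = {
    "boolean": lambda s: s in {"true", "false", "1", "0"},
    "unsignedByte": lambda s: bool(INTEGER.fullmatch(s)) and 0 <= int(s) <= 255,
    "integer": lambda s: bool(INTEGER.fullmatch(s)),
    "decimal": lambda s: bool(DECIMAL.fullmatch(s)),
    "string": lambda s: True,
}

# Hand-written ordering: every value of the key also matches the value type.
SUPERTYPE = {
    "boolean": "string",
    "unsignedByte": "integer",
    "integer": "decimal",
    "decimal": "string",
    "string": None,
}


def supertypes(name):
    """All types strictly above the given type in the reduced ordering."""
    above = set()
    parent = SUPERTYPE[name]
    while parent is not None:
        above.add(parent)
        parent = SUPERTYPE[parent]
    return above


def minimal_datatypes(text):
    """Most specific datatypes, within the reduced set, that match the text."""
    text = text.strip()
    matching = {name for name, check in CHECKS.items() if check(text)}
    covered = set()
    for name in matching:
        covered |= supertypes(name)
    return matching - covered


for sample in ["1", "true", "8877955335", "3.14", "Product A"]:
    print(repr(sample), sorted(minimal_datatypes(sample)))
```

Replacing character data by such sets keeps the alphabet of the event stream finite, whatever concrete values the documents contain.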