What's behind this model? Fernando Martínez-Plumed, Raül Fabra, Cèsar Ferri, José Hernández-Orallo, Mª José Ramírez Quintana - PowerPoint PPT Presentation


SLIDE 1

What's behind this model?

Fernando Martínez-Plumed, Raül Fabra, Cèsar Ferri, José Hernández-Orallo, Mª José Ramírez Quintana

SLIDE 2

Context: Security Issues and Machine Learning

  • Machine learning is being increasingly used in confidential and security-sensitive applications (such as spam filtering, fraud detection, malware classification, network anomaly detection):
    • models are being deployed with publicly accessible query interfaces.
    • it is assumed that data can be actively manipulated by an intelligent, adaptive adversary.

An adversary that can learn the model can also often evade detection.

SLIDE 3

Adversarial Learning

  • The adversary knows the model type (logistic regression, decision tree, etc.):
  • Model extraction: The adversary's goal is to extract an equivalent or near-equivalent ML model. For instance, if f(x) is just a class label:
    • Traditional learning-theory settings with membership queries

SLIDE 4

Adversarial Learning

Membership queries (to find points close to f ’s decision boundary)

Black-box oracle access with membership queries that return just the predicted class label.

Idea: sample m points, query the oracle, and train a model f' on these labelled samples.
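The sample-query-retrain idea above can be sketched as follows. The `oracle` here is a hypothetical stand-in for the deployed target model (a fixed linear rule, invisible to the attacker), and the surrogate choice (logistic regression) is an assumption for illustration, not the slides' prescription:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def oracle(X):
    # Hypothetical target model behind the query interface.
    # A real adversary only sees the returned labels, not this rule.
    return (X[:, 0] + 2 * X[:, 1] > 1).astype(int)

# 1. Sample m query points from the input space.
m = 500
X_query = rng.uniform(-3, 3, size=(m, 2))

# 2. Query the black-box oracle: only class labels come back.
y_query = oracle(X_query)

# 3. Train the surrogate f' on the (query, label) pairs.
f_prime = LogisticRegression().fit(X_query, y_query)

# 4. Measure agreement between f' and the oracle on fresh points.
X_test = rng.uniform(-3, 3, size=(1000, 2))
agreement = (f_prime.predict(X_test) == oracle(X_test)).mean()
print(f"agreement: {agreement:.2f}")
```

With only label access and enough uniform queries, the surrogate closely matches the target on most of the input space, which is exactly why deployed query interfaces leak the model.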

SLIDE 5

Adversarial Learning

  • Examples of attack techniques based on different families of learning techniques:
  • SVMs:
    • Biggio, B., Corona, I., Nelson, B., Rubinstein, B. I., Maiorca, D., Fumera, G., ... & Roli, F. (2014). "Security evaluation of support vector machines in adversarial environments." In Support Vector Machines Applications (pp. 105-153). Springer International Publishing.
  • DTs and ensembles of DTs:
    • Cui, Zhicheng, et al. (2015). "Optimal action extraction for random forests and boosted trees." Proceedings of the 21th ACM SIGKDD International Conf. on Knowledge Discovery and Data Mining. ACM.
  • Deep neural networks:
    • Wang, Qinglong, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, and C. Lee Giles. "Learning Adversary-Resistant Deep Neural Networks." arXiv:1612.01401.
    • Radford, Alec, Luke Metz, and Soumith Chintala (2015). "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434.

SLIDE 6

Detecting the ML family

The adversary does NOT KNOW the ML model type.

  • Model characteristics extraction: The adversary's goal is to extract the type of ML model used, as well as its intrinsic characteristics, so that they can evade it or exploit its weaknesses, vulnerabilities or gaps.

SLIDE 7

Detecting the ML family

Model Characteristics extraction:

  • Machine learning family -> decision boundary layouts
  • Feature space significance -> which input attributes are more important:
    • varying the output (more discriminating)
    • requiring more magnitude or range (difficulty)
  • Attribute transformations -> and their effect on the boundaries and the model family.

We plan to start with a small set of ML families (decision trees, sets of rules, linear discriminants).
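A minimal sketch of how a decision-boundary-layout meta-feature could tell two of these families apart (the specific meta-feature and all names here are our own illustrative assumptions, not the authors' method): tree boundaries are axis-parallel, so scanning a probe grid row by row, the column where the predicted class flips takes only a few distinct values, while a tilted linear-discriminant boundary shifts that column on almost every row:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(400, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # tilted ground-truth boundary

# Two black-box oracles from different ML families, same task.
tree_oracle = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
linear_oracle = LogisticRegression().fit(X, y)

def distinct_flip_columns(model, n=50):
    """Meta-feature: number of distinct grid columns at which the
    predicted class first flips, scanning left-to-right along each
    grid row. Axis-parallel (tree-like) boundaries reuse the same
    few columns; a tilted linear boundary moves the flip every row."""
    xs = np.linspace(-1, 1, n)
    flips = set()
    for yv in xs:
        row = np.column_stack([xs, np.full(n, yv)])
        labels = model.predict(row)
        change = np.flatnonzero(np.diff(labels))
        if change.size:
            flips.add(int(change[0]))
    return len(flips)

meta_tree = distinct_flip_columns(tree_oracle)
meta_linear = distinct_flip_columns(linear_oracle)
print(meta_tree, meta_linear)
```

Feeding such boundary-layout meta-features to a classifier over families is one concrete way the "machine learning family -> decision boundary layouts" arrow could be realised.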

SLIDE 8

Detecting the ML family

[Pipeline diagram] The original dataset trains one of several ML algorithms (decision stump, decision tree, logistic regression, ...) to act as oracle models; query strategies (uniform, optimum size, Papernot) generate artificial mimetic datasets D1 Mimetic, D2 Mimetic, ..., DN Mimetic; mimetic classifiers (decision trees) M1, M2, ..., MN are learned on them; meta-features are extracted from the mimetic datasets (ß1, ß2, …, ßY) and from the mimetic models (λ1, λ2, …, λX), and fed to meta-learning for algorithm identification, followed by oracle/mimetic comparison, algorithm recommendation and evaluation.
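One pass of the pipeline above can be sketched as follows; the choice of oracle, the uniform query strategy's range, and the particular λ meta-features (tree depth, leaf count, fidelity) are illustrative assumptions standing in for the diagram's unspecified details:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)

# Oracle model: stand-in for the deployed classifier under inspection.
X_train = rng.normal(size=(300, 4))
y_train = (X_train @ np.array([1.0, -2.0, 0.5, 0.0]) > 0).astype(int)
oracle_model = LogisticRegression().fit(X_train, y_train)

# Uniform query strategy -> artificial mimetic dataset labelled by the oracle.
X_mimetic = rng.uniform(-3, 3, size=(2000, 4))
y_mimetic = oracle_model.predict(X_mimetic)

# Mimetic classifier: a decision tree trained to imitate the oracle.
mimetic = DecisionTreeClassifier(max_depth=8, random_state=0).fit(
    X_mimetic, y_mimetic
)

# λ meta-features extracted from the mimetic model, as input to the
# meta-learner that identifies the oracle's ML family.
lambdas = {
    "depth": mimetic.get_depth(),
    "n_leaves": mimetic.get_n_leaves(),
    "fidelity": (mimetic.predict(X_mimetic) == y_mimetic).mean(),
}
print(lambdas)
```

Repeating this for each oracle algorithm and each query strategy yields the table of (ß, λ) meta-feature vectors on which the algorithm-identification meta-learner is trained.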

SLIDE 9

Mimetic Models

SLIDE 10

Mimetic Models

SLIDE 11

Mimetic Models

SLIDE 12

Mimetic Models

SLIDE 13

Inspecting IP models

Model Characteristics extraction:

  • Are there relational patterns? -> X1 == X2
  • Is the model recursive? -> Exploiting recursive patterns can be a source of security issues
  • Attribute transformations -> Are complex features addressed by propositionalisation?

SLIDE 14

Any idea, collaboration, … will be welcome.