CS11-747 Neural Networks for NLP
Adversarial Methods
Graham Neubig
Site https://phontron.com/class/nn4nlp2020/
With many slides by Zihang Dai & Qizhe Xie
P(X|Y)
Z, and model $P(X) = \sum_{Z} P(X \mid Z)\, P(Z)$
conditioned on some other information using P(X|Y)
[Table: Non-Latent, VAE, and GAN models compared on Likelihood, Generation (image), and Inference; cell contents not recoverable]
maximum likelihood!
[Figure panels: Real, MLE, Adversarial] Image Credit: Lotter et al. 2015
some aspect of the generated output
generated output
generated features to find some trait
it is real or not → P(image is real)
the discriminator into answering “real”
[Diagram: GAN training loop]
sample minibatch → x_real
sample latent vars. → z
convert w/ generator: z → x_fake
predict w/ discriminator: x_real → y_real, x_fake → y_fake
discriminator loss (higher if predictions fail) → D gradient
generator loss (higher if predictions are correct) → G gradient
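$\ell_G(\theta_D, \theta_G) = -\frac{1}{2} \mathbb{E}_{z} \log D(G(z))$
$\ell_G(\theta_D, \theta_G) = -\ell_D(\theta_D, \theta_G)$
$\ell_D(\theta_D, \theta_G) = -\frac{1}{2} \mathbb{E}_{x \sim P_{\text{data}}} \log D(x) - \frac{1}{2} \mathbb{E}_{z} \log(1 - D(G(z)))$
First term of $\ell_D$: predict real for real data. Second term: predict fake for fake data, where P(fake) = 1 - P(real).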
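A minimal sketch of how these losses could be computed from discriminator outputs, using only the Python standard library. The function names, and the labels "zero-sum" and "non-saturating" for the two generator losses, are assumptions added for illustration; the inputs are lists of discriminator probabilities P(real) for real and generated samples, and expectations are approximated by minibatch means.

```python
import math

def discriminator_loss(d_real, d_fake):
    """-1/2 E_x log D(x) - 1/2 E_z log(1 - D(G(z))), with minibatch means."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -0.5 * real_term - 0.5 * fake_term

def generator_loss_zero_sum(d_real, d_fake):
    """Generator loss defined as the negation of the discriminator loss."""
    return -discriminator_loss(d_real, d_fake)

def generator_loss_non_saturating(d_fake):
    """-1/2 E_z log D(G(z)): rewards fooling the discriminator directly."""
    return -0.5 * sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs P(real) for a tiny minibatch.
d_real = [0.9, 0.8]   # scores on real samples
d_fake = [0.2, 0.1]   # scores on generated samples

print(discriminator_loss(d_real, d_fake))
print(generator_loss_zero_sum(d_real, d_fake))
print(generator_loss_non_saturating(d_fake))
```

When the discriminator is confident on fake data (D(G(z)) near 0), the non-saturating form grows large, which is why it tends to give the generator a stronger training signal than the zero-sum form.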
Process
Result
P(X), which is decided by both P(Z) and F
distribution P(X) with a reasonable P(Z) and a powerful enough F
Image Credit: He et al. 2018
P(Z) → x = F(z) → P(X)
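A small sketch of sampling from an implicit distribution: draw z from a simple P(Z), push it through a deterministic F, and treat the outputs as samples from P(X). The choices here are assumptions for illustration only: P(Z) is a standard normal and F is the exponential function, standing in for a neural network.

```python
import math
import random

def F(z):
    """Hypothetical deterministic transformation (a stand-in for a neural net)."""
    return math.exp(z)

def sample_x(n, seed=0):
    """Sample z ~ P(Z) = N(0, 1), then return x = F(z)."""
    rng = random.Random(seed)
    return [F(rng.gauss(0.0, 1.0)) for _ in range(n)]

samples = sample_x(1000)
# P(X) is never written down explicitly; we only ever see samples from it.
print(min(samples), max(samples))
```

Changing either P(Z) or F changes the resulting P(X), even though no density for X is ever computed.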
parameterized by powerful neural networks
blurriness, global inconsistency
inform it what and how to improve
et al. 2017)
[Diagram: GAN pipeline with discrete outputs]
sample minibatch → x_real
sample latent vars. → z
convert w/ generator: z → x_fake (Discrete! Can't backprop)
predict w/ discriminator: x_real, x_fake → y
(e.g. Yu et al. 2016)
Gumbel softmax (Gu et al. 2017)
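A minimal sketch of the standard Gumbel-softmax relaxation, which replaces a discrete sample with a continuous vector that can be backpropagated through. The function name, the `temperature` parameter, and the example logits are assumptions for illustration; this pure-Python version shows only the forward computation, not gradients.

```python
import math
import random

def gumbel_softmax(logits, temperature, rng):
    """Return softmax((logits + Gumbel noise) / temperature)."""
    eps = 1e-20  # guards against log(0)
    noise = [-math.log(-math.log(rng.random() + eps) + eps) for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, noise)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
logits = [1.0, 0.5, -1.0]  # hypothetical unnormalized word scores
soft = gumbel_softmax(logits, temperature=1.0, rng=rng)
sharp = gumbel_softmax(logits, temperature=0.1, rng=rng)
print(soft)
print(sharp)
```

Lower temperatures push the output toward a one-hot vector, at the cost of higher-variance gradients in a real implementation.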
Yu et al. 2017), or pairs of sentences (e.g. Wu et al. 2017)
[Figure panels: Type of Discriminator; Strength of Discriminator; Learning Rate for Generator; Learning Rate for Discriminator]
credit assignment problem
D(this), D(this is), D(this is a), D(this is a fake), D(this is a fake sentence)
is a problem
rollouts (Yu et al. 2016)
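Scoring every prefix, as in D(this), D(this is), and so on, gives a per-step signal instead of a single score for the whole sentence. The sketch below shows only the bookkeeping; `toy_discriminator` is a hypothetical stand-in for a trained model, and it does not implement rollouts.

```python
def prefixes(tokens):
    """All non-empty prefixes of a token sequence, shortest first."""
    return [tokens[:i] for i in range(1, len(tokens) + 1)]

def toy_discriminator(prefix):
    """Hypothetical P(real): drops once the word 'fake' has appeared."""
    return 0.1 if "fake" in prefix else 0.9

sentence = "this is a fake sentence".split()
scores = [toy_discriminator(p) for p in prefixes(sentence)]

for prefix, score in zip(prefixes(sentence), scores):
    print("D(" + " ".join(prefix) + ") =", score)
```

The step at which the score drops points to the token responsible, which is the information a single sentence-level score does not provide.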
(e.g. tense, sentiment)
[Diagrams: x → h → y pipelines with P(y), each marked with where the Adversary! is applied]
data (Kim et al. 2017)
representations for text classification
2017)
across tasks, others separate
(Qin et al. 2017)
marked, but would like to detect them if they are
the same as text without!
Adversary! x h y (sampled or true
sentences
spaces
(C), use supervised objective (D)
(Lample et al. 2017, Artetxe et al. 2017)
Zhu et al. 2017)
translated sentence
classification systems: calculate gradient to maximize loss
some success (e.g. Ebrahimi et al. 2018)
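A minimal sketch of "calculate gradient to maximize loss" on continuous inputs, using a fast-gradient-sign-style step on a logistic regression classifier. The model, its weights, and the step size `eps` are assumptions for illustration; for text, the same idea would apply to continuous features such as embeddings rather than to discrete tokens.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def loss(w, b, x, y):
    """Binary cross-entropy for a single example."""
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def sign(v):
    return (v > 0) - (v < 0)

def gradient_sign_attack(w, b, x, y, eps):
    """Move x by eps in the direction that increases the loss."""
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]  # d loss / d x for logistic regression
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]

w, b = [1.0, -2.0], 0.0      # hypothetical fixed classifier
x, y = [1.0, 0.5], 1         # hypothetical input with true label 1
x_adv = gradient_sign_attack(w, b, x, y, eps=0.1)
print(loss(w, b, x, y), loss(w, b, x_adv, y))
```

The perturbed input stays within `eps` of the original in every coordinate, yet the loss on it is higher.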
side, and "meaning destroying" on the target side
that means)
attacks!
training time and make sure that they are also classified correctly
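A sketch of adversarial training under simple assumptions: at each epoch, gradient-sign perturbations of the training points are generated against the current model and added to the batch, so the model is also pushed to classify them correctly. The logistic regression model, the toy data, and the hyperparameters `eps`, `lr`, and `epochs` are all hypothetical.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def perturb(w, b, x, y, eps):
    """Gradient-sign perturbation of x against the current model."""
    p = predict(w, b, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]

def adversarial_train(data, eps=0.5, lr=0.1, epochs=200):
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        # Train on clean examples plus freshly generated adversarial ones.
        batch = data + [(perturb(w, b, x, y, eps), y) for x, y in data]
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in batch:
            err = predict(w, b, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(batch) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(batch)
    return w, b

DATA = [([2.0, 2.0], 1), ([3.0, 2.0], 1), ([2.0, 3.0], 1),
        ([-2.0, -2.0], 0), ([-3.0, -2.0], 0), ([-2.0, -3.0], 0)]
w, b = adversarial_train(DATA)
print(w, b)
```

On this well-separated toy data both the clean points and their perturbed versions end up on the correct side of the decision boundary; on harder data there is usually a trade-off between clean accuracy and robustness.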
https://adversarial-ml-tutorial.org