
SLIDE 1

Adversarial Learned Molecular Graph Inference and Generation

Sebastian Pölsterl and Christian Wachinger

Artificial Intelligence in Medical Imaging, Ludwig-Maximilians-Universität, Munich

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, September 14–18, 2020

SLIDE 4

De Novo Chemical Design

Goal

Find a molecule with certain properties, e.g., an antiviral drug to inhibit SARS-CoV-2 replication.

Problem

  • 1. The space of molecules is extremely large – on the order of 10³³ drug-like molecules.¹
  • 2. Molecules are discrete in nature, which prevents the use of gradient-based optimization.

Solution

Use a deep generative model to project molecules into a continuous latent space and perform gradient-based optimization there.

  • ¹P. G. Polishchuk et al. (2013). “Estimation of the size of drug-like chemical space based on GDB-17 data”. In: Journal of Computer-Aided Molecular Design 27.8, pp. 675–679.

  • S. Pölsterl and C. Wachinger (AI-Med)

Adversarial Learned Molecular Graph Inference and Generation 2 of 18
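The latent-space optimization idea can be sketched with a toy stand-in (plain NumPy; the quadratic property_score below is a hypothetical surrogate with optimum z_star, not the paper's trained predictor):

```python
import numpy as np

# Sketch of gradient-based molecule optimization in a continuous latent
# space. Assumes a trained generative model whose latent codes z can be
# decoded into molecules; the differentiable property predictor here is
# a toy quadratic with its optimum at z_star.
z_star = np.array([1.0, -2.0])

def property_score(z):
    return -np.sum((z - z_star) ** 2)      # higher is better

def grad_score(z):
    return -2.0 * (z - z_star)             # analytic gradient of the toy score

z = np.zeros(2)                            # start at the prior mean
for _ in range(200):
    z = z + 0.05 * grad_score(z)           # plain gradient ascent on z

# z now lies close to z_star; decoding z would yield the optimized molecule.
print(np.round(z, 3))
```

In a real system the gradient comes from a neural property predictor, and the decoder turns the optimized z back into a molecular graph.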

SLIDE 5

Graph Variational Autoencoder

[Diagram: input graph G → encoder → latent space z, z ∼ prior distribution → decoder → output graph G̃]

Reconstruction loss L(G, G̃): requires solving an expensive graph isomorphism problem!

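Why the reconstruction loss is troublesome can be seen on a three-node toy graph (a NumPy sketch, independent of the paper's code): relabeling the nodes leaves the molecule unchanged, yet a naive elementwise loss between adjacency matrices is nonzero, so matching a reconstruction to its input requires aligning node orders, i.e., solving graph isomorphism.

```python
import numpy as np

# A path graph 0-1-2 and the same graph with nodes relabeled.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

P = np.eye(3)[[2, 0, 1]]        # permutation matrix for the relabeling
A_perm = P @ A @ P.T            # isomorphic graph, different matrix

# An elementwise reconstruction loss sees a difference although both
# matrices encode the same molecule.
naive_loss = np.abs(A - A_perm).sum()
print(naive_loss)               # nonzero despite isomorphism
```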


SLIDE 8

Prior Work I

Inference (Encoder): various graph convolutional neural networks.

Generation (Decoder):

  • In a single step using an MLP (De Cao and Kipf, 2018; Ma et al., 2018; Simonovsky and Komodakis, 2018).
  • Sequentially using an RNN (Bradshaw et al., 2019; Jin et al., 2018; Li, Zhang, et al., 2018; Li, Vinyals, et al., 2018; Liu et al., 2018; Podda et al., 2020; Samanta et al., 2019; You et al., 2018).



SLIDE 10

Prior Work II

Generative Models for Molecular Graphs:

  • Likelihood-based (VAEs): compute the reconstruction loss by (i) traversing nodes in a fixed order, (ii) Monte Carlo sampling, or (iii) graph matching.
  • Adversarial: MolGAN is the only such model, but it cannot do inference (De Cao and Kipf, 2018).

Generative Models for Continuous Data:

  • Adversarial Learned Inference (ALI) and its extension ALICE learn an encoder/decoder pair without optimizing an explicit reconstruction loss (Dumoulin et al., 2017; Li, Liu, et al., 2017).

  • ALI & ALICE are only applicable to continuous-valued data, such as images.

SLIDE 15

Our Contributions

  • We propose Adversarial Learned Molecular Graph Inference and Generation (ALMGIG), which
  • 1. does not require solving an expensive graph isomorphism problem,
  • 2. performs inference over graphs by extending the Graph Isomorphism Network to multi-graphs (Xu et al., 2019),
  • 3. generates discrete data (atoms and bonds) via the Gumbel-softmax trick (Jang et al., 2017; Maddison et al., 2017),
  • 4. generates chemically valid molecules by enforcing connectivity constraints via penalty terms (Ma et al., 2018).

  • We show that current evaluation metrics are flawed, and propose a better metric to assess the distribution-learning capabilities of generative models.

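Contribution 3 relies on the Gumbel-softmax trick; a minimal sketch (NumPy, not the paper's implementation) of drawing one relaxed categorical sample, e.g. for an atom type, looks like this:

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=1.0):
    """One relaxed categorical sample: argmax(logits + Gumbel noise) is an
    exact categorical draw; the temperature-tau softmax relaxes the argmax
    so that gradients can flow through the sample."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())                               # numerically stable softmax
    return y / y.sum()

atom_logits = np.log(np.array([0.1, 0.2, 0.7]))  # scores for 3 atom types
sample = gumbel_softmax(atom_logits, tau=0.1)    # low tau -> near one-hot
print(sample.round(3))
```

As tau → 0 the sample approaches a one-hot vector, so the generator can emit discrete atoms and bonds while remaining differentiable.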

SLIDE 23

Adversarial Learned Inference

Dumoulin et al. (2017)

[Diagram: the encoder gφ(G, ε) produces z̃ ∼ qφ(z | G) from data G ∼ q(G); the generator gθ(z, ε) produces G̃ ∼ qθ(G | z) from the prior z ∼ N(0, I); a joint discriminator Dψ distinguishes pairs (G, z̃) from pairs (G̃, z)]

  • Training: match the joint distributions over graphs and latent variables
  • 1. encoder joint distribution: qφ(G, z) = q(G) qφ(z | G)
  • 2. decoder joint distribution: pθ(G, z) = pz(z) qθ(G | z)
  • However, the reconstruction G̃′ remains unconstrained.

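The matching objective can be illustrated on scalar toy data (NumPy stand-ins for the encoder, generator, and data distribution; not the actual graph networks): the discriminator only ever sees joint pairs drawn from one of the two paths, so no per-sample reconstruction loss is needed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D stand-ins: x plays the role of a graph G, z the latent code.
def encoder(x):                     # z~ ~ q_phi(z | x)
    return 0.5 * x + 0.1 * rng.normal(size=x.shape)

def generator(z):                   # x~ ~ q_theta(x | z)
    return 2.0 * z + 0.1 * rng.normal(size=z.shape)

x_real = rng.normal(loc=3.0, size=1000)   # data distribution q(x)
z_prior = rng.normal(size=1000)           # prior p_z(z)

# The two kinds of joint samples the ALI discriminator must tell apart:
encoder_pairs = np.stack([x_real, encoder(x_real)], axis=1)        # (x, z~)
generator_pairs = np.stack([generator(z_prior), z_prior], axis=1)  # (x~, z)

# Training drives these two joint distributions together; at the optimum
# the discriminator cannot distinguish them.
print(encoder_pairs.mean(axis=0), generator_pairs.mean(axis=0))
```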

SLIDE 26

Adversarial Learned Inference

ALICE (Li, Liu, et al., 2017)

[Diagram: the ALI architecture extended with a cycle discriminator Dη that compares the input graph G with its reconstruction G̃′ ∼ qθ(G | z̃)]

  • ALICE adds a cycle discriminator on pairs of graphs to enforce consistent reconstruction.
  • At the optimum, the encoder and decoder joint distributions match, and G̃′ = G.
  • However, in practice reaching the optimum is extremely hard.

SLIDE 27

Adversarial Learned Inference

Unary Discriminator

[Diagram: the ALICE architecture extended with a unary discriminator Dξ on graphs alone]

  • Unary discriminator: match q(G) and qθ(G | z)
  • Joint discriminator: match q(G) qφ(z | G) and pz(z) qθ(G | z)
  • Cycle discriminator: match q(G) and qθ(G | z̃)

  • Unary discriminator facilitates training when the joint distribution is difficult to learn.


SLIDE 31

Experiments

Data: molecules from the QM9 dataset (≤9 heavy atoms, 4 atom types, 3 bond types).

Competing Methods

  • CGVAE (Liu et al., 2018), NeVAE (Samanta et al., 2019): graph-based VAEs with RNN decoder and valence constraints.
  • GrammarVAE (Kusner et al., 2017): SMILES-based VAE.
  • MolGAN (De Cao and Kipf, 2018): graph-based WGAN without an encoder.
  • Random: chooses atoms and bonds randomly, but honors valence constraints.

SLIDE 36

Simple Metrics are Flawed

  • Validity: Percentage of valid molecules.
  • Uniqueness: Percentage of unique molecules.
  • Novelty: Percentage of unique molecules not in the data.
  • Metrics do not capture what models learned from the training data.

[Figure: a series of near-identical fluorine-dominated molecules ⇒ 100% Validity, 100% Uniqueness, 100% Novelty]

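For concreteness, the three metrics can be computed from canonical molecule strings (a pure-Python sketch with made-up SMILES; real pipelines canonicalize with a chemistry toolkit and check valence rules for validity):

```python
# None marks a generation that failed the validity check.
generated = ["CCO", "CCO", "CC=O", None, "C#N"]
training = {"CCO", "CCN"}                 # canonical training-set strings

valid = [s for s in generated if s is not None]
unique = set(valid)

validity = len(valid) / len(generated)            # fraction of valid molecules
uniqueness = len(unique) / len(valid)             # fraction of distinct valid ones
novelty = len(unique - training) / len(unique)    # distinct ones unseen in training

print(validity, uniqueness, novelty)
```

A generator that emits endless trivial variants of one scaffold maxes out all three numbers, which is exactly the flaw the slide illustrates.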

SLIDE 40

Advanced Metrics

What we are actually interested in: Can we generate chemically meaningful molecules with similar properties as in the training data?

[Plot: histogram of a chemical descriptor for the reference distribution P and the generated distribution Q]

  • Brown et al. (2019) compared the distributions of 10 chemical descriptors in terms of the KL divergence D_KL(P ‖ Q).
  • We propose using the Earth Mover’s Distance (EMD) instead:

                                     EMD   KL div
    Indiscernibility of identicals    ✓      ✓
    Symmetry                          ✓      ✗
    Triangle inequality               ✓      ✗
    Quantify spatial shift            ✓      ✗
    Non-overlapping supports          ✓      ✗

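The practical difference shows up for histograms with disjoint supports (using SciPy's `wasserstein_distance` as an EMD stand-in on a toy descriptor; not the paper's evaluation code):

```python
import numpy as np
from scipy.stats import wasserstein_distance

support = np.arange(6)                          # bins of a toy descriptor
p = np.array([0.5, 0.5, 0.0, 0.0, 0.0, 0.0])    # reference distribution P
q = np.array([0.0, 0.0, 0.0, 0.0, 0.5, 0.5])    # generated distribution Q

# EMD: total mass times distance moved -> finite and interpretable.
emd = wasserstein_distance(support, support, u_weights=p, v_weights=q)
print(emd)   # 4.0: all mass shifted four bins

# KL divergence: infinite whenever P puts mass where Q has none.
with np.errstate(divide="ignore", invalid="ignore"):
    kl = np.sum(np.where(p > 0, p * np.log(p / q), 0.0))
print(kl)    # inf
```

This is why EMD still ranks models sensibly when generated and reference descriptor histograms barely overlap, while KL divergence saturates.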

SLIDE 41

Comparison – Advanced Metrics

[Bar chart: distribution learning w.r.t. the test set, mean exp(−EMD), higher is better: ALMGIG 0.753, CGVAE 0.746, MolGAN 0.569, NeVAE 0.457, GrammarVAE 0.187, Random 0.350. Detail panels compare ALMGIG and CGVAE against the observed distribution: molecular weight (EMD = 0.166 vs. 1.022) and number of aliphatic rings (EMD = 0.111 vs. 0.649)]




SLIDE 47

Comparison – Adversarial Learning Scheme

[Bar chart: ablation of the adversarial learning scheme, mean exp(−EMD) w.r.t. the test set, higher is better: ALMGIG 0.753, ALICE 0.613, ALI 0.614, WGAN 0.573. Detail panels for ALICE against the observed distribution: molecular weight (EMD = 1.328) and number of aromatic rings (EMD = 0.813)]


SLIDE 52

Conclusion

  • 1. ALMGIG can be trained without computing a reconstruction loss, which would require solving an expensive graph isomorphism problem.
  • 2. ALMGIG represents the distribution over the space of molecules more accurately than previous methods.
  • 3. The common validation metrics validity, novelty, and uniqueness are insufficient to properly assess the performance of methods.
  • 4. Distributions of chemical descriptors provide detailed insight into what types of molecules a model can generate.
  • 5. Code available at https://github.com/ai-med/almgig

SLIDE 53

References I

Bradshaw, J., B. Paige, M. J. Kusner, M. Segler, and J. M. Hernández-Lobato (2019). “A Model to Search for Synthesizable Molecules”. In: Advances in Neural Information Processing Systems 32, pp. 7937–7949.

Brown, N., M. Fiscato, M. H. Segler, and A. C. Vaucher (2019). “GuacaMol: Benchmarking Models for de Novo Molecular Design”. In: Journal of Chemical Information and Modeling 59.3, pp. 1096–1108.

De Cao, N. and T. Kipf (2018). “MolGAN: An implicit generative model for small molecular graphs”.

Dumoulin, V., I. Belghazi, B. Poole, O. Mastropietro, A. Lamb, M. Arjovsky, and A. Courville (2017). “Adversarially learned inference”. In: 5th International Conference on Learning Representations.

Jang, E., S. Gu, and B. Poole (2017). “Categorical Reparameterization with Gumbel-Softmax”. In: 5th International Conference on Learning Representations.

Jin, W., R. Barzilay, and T. Jaakkola (2018). “Junction Tree Variational Autoencoder for Molecular Graph Generation”. In: 35th International Conference on Machine Learning, pp. 2323–2332.

Kusner, M. J., B. Paige, and J. M. Hernández-Lobato (2017). “Grammar Variational Autoencoder”. In: 34th International Conference on Machine Learning, pp. 1945–1954.


SLIDE 54

References II

Li, C., H. Liu, C. Chen, Y. Pu, L. Chen, R. Henao, and L. Carin (2017). “ALICE: Towards Understanding Adversarial Learning for Joint Distribution Matching”. In: Advances in Neural Information Processing Systems 30, pp. 5495–5503.

Li, Y., L. Zhang, and Z. Liu (2018). “Multi-objective de novo drug design with conditional graph generative model”. In: Journal of Cheminformatics 10, p. 33.

Li, Y., O. Vinyals, C. Dyer, R. Pascanu, and P. Battaglia (2018). “Learning Deep Generative Models of Graphs”.

Liu, Q., M. Allamanis, M. Brockschmidt, and A. Gaunt (2018). “Constrained Graph Variational Autoencoders for Molecule Design”. In: Advances in Neural Information Processing Systems 31, pp. 7806–7815.

Ma, T., J. Chen, and C. Xiao (2018). “Constrained Generation of Semantically Valid Graphs via Regularizing Variational Autoencoders”. In: Advances in Neural Information Processing Systems 31, pp. 7113–7124.

Maddison, C. J., A. Mnih, and Y. W. Teh (2017). “The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables”. In: 5th International Conference on Learning Representations.


SLIDE 55

References III

Podda, M., D. Bacciu, and A. Micheli (2020). “A Deep Generative Model for Fragment-Based Molecule Generation”. In: Proc. of AISTATS.

Polishchuk, P. G., T. I. Madzhidov, and A. Varnek (2013). “Estimation of the size of drug-like chemical space based on GDB-17 data”. In: Journal of Computer-Aided Molecular Design 27.8, pp. 675–679.

Samanta, B., A. De, G. Jana, N. Ganguly, and M. Gomez-Rodriguez (2019). “NeVAE: A Deep Generative Model for Molecular Graphs”. In: 33rd AAAI Conference on Artificial Intelligence, pp. 1110–1117.

Simonovsky, M. and N. Komodakis (2018). “GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders”. In: ICANN, pp. 412–422.

Xu, K., W. Hu, J. Leskovec, and S. Jegelka (2019). “How Powerful are Graph Neural Networks?” In: 7th International Conference on Learning Representations.

You, J., B. Liu, R. Ying, V. Pande, and J. Leskovec (2018). “Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation”. In: Advances in Neural Information Processing Systems 31, pp. 6412–6422.