Representation and Generation of Molecular Graphs (Wengong Jin, MIT) - PowerPoint Presentation



SLIDE 1

Representation and Generation of Molecular Graphs

Wengong Jin MIT CSAIL

in collaboration with Tommi Jaakkola, Regina Barzilay, Kevin Yang, Kyle Swanson

SLIDE 2

Why are molecules interesting for ML?

  • E.g., antibiotic (cephalosporin)

(figure annotations: 3D information, node labels, edge labels, substructures/motifs)

SLIDE 3

Why are molecules interesting for ML?

  • E.g., antibiotic (cephalosporin)

(figure annotations: 3D information, node labels, edge labels, substructures/motifs)

Together these give rise to various chemical properties (e.g., solubility, toxicity, …)

SLIDE 4

Why are molecules interesting for ML?

  • Properties may depend on intricate structures;
  • The key challenges are to automatically predict chemical properties and to generate molecules with desirable characteristics

(figure: Daptomycin, an antibiotic)

SLIDE 5

Interesting ML Problems

  • Deeper into known chemistry
  • extract chemical knowledge from journals, notebooks (NLP)
  • Deeper into drug design
  • molecular property prediction (graph representation)
  • (multi-criteria) lead optimization (graph generation)
  • Deeper into reactions
  • forward reaction prediction (structured prediction)
  • forward reaction optimization (combinatorial optimization)
  • Deeper into synthesis
  • retrosynthesis planning (reinforcement learning)
SLIDE 7

Automating Drug Design

  • Key challenges:
  • 1. representation and prediction: learn to predict molecular properties
  • 2. generation and optimization: realize target molecules with better properties programmatically
  • 3. understanding: uncover principles (or diagnose errors) underlying complex predictions

SLIDE 8

GNNs for property prediction?

  • Are GNN models operating on molecular graphs sufficiently expressive for predicting molecular properties (in the presence of “property cliffs”)?
  • A number of recent results pertain to the power of GNNs (e.g., Xu et al. 2018, Sato et al. 2019, Maron et al. 2019, …)

(figure: molecule → GNN embedding → aggregation → prediction of solubility, toxicity, bioactivity, etc.)
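The embedding → aggregation → prediction pipeline above can be sketched in a few lines. This is a minimal illustrative toy, not the model from the talk; `W_msg` and `W_out` stand in for learned weights:

```python
import numpy as np

def gnn_predict(adj, feats, W_msg, W_out, layers=3):
    """Sketch of a GNN property predictor: neighbor message passing,
    a permutation-invariant sum readout, then a linear head."""
    h = feats
    for _ in range(layers):
        # each node aggregates its neighbors' embeddings, then transforms
        h = np.tanh((adj @ h) @ W_msg)
    graph_vec = h.sum(axis=0)        # aggregation (readout)
    return float(graph_vec @ W_out)  # predicted property (e.g., solubility)
```

Because the readout sums over nodes, relabeling the atoms leaves the prediction unchanged, which is exactly the permutation invariance the next slide's theorem is about.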

SLIDE 9

Are basic GNNs sufficiently expressive?

  • Theorem [Garg et al., 2019]: GNNs with permutation-invariant readout functions cannot “decide”
  • girth (length of the shortest cycle)
  • circumference (length of the longest cycle)
  • diameter, radius
  • presence of a conjoint cycle
  • total number of cycles
  • presence of a c-clique
  • etc.
  • (most results also apply to MPNNs)
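A concrete instance of this limitation, as an assumed but standard example (not one from the slide): a 6-cycle and two disjoint triangles differ in girth and in number of cycles, yet every node in both graphs is degree-2 with indistinguishable neighborhoods, so message passing with identical initial features plus a sum readout assigns them the same representation. A minimal numeric check with fixed stand-in weights:

```python
import numpy as np

def readout(adj, layers=3):
    # all nodes start with the same feature, as in an unlabeled graph
    h = np.ones((adj.shape[0], 4))
    W = np.full((4, 4), 0.1)  # fixed weights: the same "model" on both graphs
    for _ in range(layers):
        h = np.tanh((adj @ h) @ W)
    return h.sum(axis=0)      # permutation-invariant readout

def cycle(n):
    a = np.zeros((n, n))
    for i in range(n):
        a[i, (i + 1) % n] = a[(i + 1) % n, i] = 1
    return a

# C6 (girth 6, one cycle) vs. two disjoint C3s (girth 3, two cycles)
hexagon = cycle(6)
two_triangles = np.block([[cycle(3), np.zeros((3, 3))],
                          [np.zeros((3, 3)), cycle(3)]])
assert np.allclose(readout(hexagon), readout(two_triangles))
```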

SLIDE 10

Beyond simple GNNs: substructures

  • Learning to view molecules at multiple levels [Jin et al., 2019]
  • 1. original molecular graph

Hierarchical Graph-to-Graph Translation for Molecules (2019). W. Jin, R. Barzilay, and T. Jaakkola


SLIDE 12

Beyond simple GNNs: substructures

  • Learning to view molecules at multiple levels
  • 1. original molecular graph

(figure: a dictionary of substructures)

SLIDE 13

Beyond simple GNNs: substructures

  • Learning to view molecules at multiple levels
  • 1. original molecular graph
  • 2. substructure graph

(figure: a dictionary of substructures; pooling)

SLIDE 14

Beyond simple GNNs: substructures

  • Learning to view molecules at multiple levels
  • 1. original molecular graph
  • 2. substructure graph with attachments

(figure: a dictionary of substructures)


SLIDE 16

Beyond simple GNNs: substructures

  • Learning to view molecules at multiple levels
  • 1. original molecular graph
  • 2. substructure graph with attachments

Propagate atom embeddings

SLIDE 17

Beyond simple GNNs: substructures

  • Learning to view molecules at multiple levels
  • 1. original molecular graph
  • 2. substructure graph with attachments
  • 3. substructure graph

SLIDE 18

Multi-resolution representations

  • Learning to view molecules at multiple levels
  • 1. original molecular graph
  • 2. substructure graph with attachments
  • 3. substructure graph
  • Related to graph pooling (Ying et al., 2018, …)

Hierarchical message passing
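The hierarchical scheme (atom-level passing, pooling atom embeddings into substructure nodes, then substructure-level passing) might look roughly like the toy below. The `assign` lists and the weight-free updates are simplifications for illustration, not the paper's architecture:

```python
import numpy as np

def hierarchical_embed(adj_atoms, feats, assign, adj_subs, layers=2):
    """assign[s] lists the atom indices belonging to substructure s
    (e.g., a ring or bond motif from the dictionary)."""
    h = feats
    for _ in range(layers):                       # atom-level message passing
        h = np.tanh(adj_atoms @ h)
    # pooling: each substructure node sums the embeddings of its atoms
    s = np.stack([h[idx].sum(axis=0) for idx in assign])
    for _ in range(layers):                       # substructure-level passing
        s = np.tanh(adj_subs @ s + s)
    return s.sum(axis=0)                          # multi-resolution readout
```

The point of the design is that cycle- and motif-level information a flat GNN cannot decide is made explicit as nodes of the coarser graph.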

SLIDE 19

Experiments on solubility

  • ESOL dataset (averaged over 5 folds)

ESOL RMSE (lower is better): GNN 1.11, GNN-Feature 0.69, Hier-MPNN 0.65

SLIDE 20

Experiments on solubility

  • ESOL dataset (averaged over 5 folds)

ESOL RMSE (lower is better): GNN 1.11, GNN-Feature 0.69, Hier-MPNN 0.65

Raw GNN:
  • atom feature: only atom type label
SLIDE 21

Experiments on solubility

  • ESOL dataset (averaged over 5 folds)

ESOL RMSE (lower is better): GNN 1.11, GNN-Feature 0.69, Hier-MPNN 0.65

Raw GNN:
  • atom feature: only atom type label

GNN with features:
  • atom type label
  • degree
  • valence
  • whether an atom is in a cycle
  • whether an atom is in an aromatic ring (cycle information)
  • …

SLIDE 22

Experiments on solubility

  • ESOL dataset (averaged over 5 folds)

ESOL RMSE (lower is better): GNN 1.11, GNN-Feature 0.69, HierGNN 0.65

Hierarchical GNN:
  • atom features: still just atom type
  • but extra substructure information is built into the architecture

SLIDE 23

New Antibiotic Discovery

  • If we can accurately predict molecular properties, we can screen (select and repurpose) molecules from a large candidate set
  • Antibiotic Discovery [Stokes et al., 2019]
  • trained a model to predict growth inhibition of E. coli
  • data: ~2,000 measured compounds from the Broad Institute at MIT
  • screened ~100 million compounds in total
  • biologists tested 15 molecules (top predictions, structurally diverse) in the lab
  • 7 of them were validated to be inhibitory in vitro
  • 1 of them demonstrates strong inhibition against other bacteria (e.g., A. baumannii)
  • all of them are new antibiotics distinct from existing ones!

Learning to Discover Novel Antibiotics from Vast Chemical Spaces (2019), J. Stokes, K. Yang, K. Swanson, W. Jin, R. Barzilay, T. Jaakkola et al.
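The screening recipe above (rank an enormous library by predicted score, then keep a structurally diverse top set) can be sketched as follows. `predict`, `fingerprint`, and the set-based Tanimoto similarity are illustrative stand-ins for the trained property model and real chemical fingerprints, not the pipeline from the paper:

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints represented as sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

def screen(candidates, predict, fingerprint, top_k=15, max_sim=0.5):
    """Rank a candidate library by predicted inhibition score, then keep
    only structurally diverse hits via a greedy similarity filter."""
    ranked = sorted(candidates, key=predict, reverse=True)
    hits, fps = [], []
    for mol in ranked:
        fp = fingerprint(mol)
        # accept a hit only if it is dissimilar to everything kept so far
        if all(tanimoto(fp, f) <= max_sim for f in fps):
            hits.append(mol)
            fps.append(fp)
        if len(hits) == top_k:
            break
    return hits
```

The greedy diversity filter mirrors the "top prediction, structurally diverse" selection of the 15 lab-tested molecules.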

SLIDE 24

Automating Drug Design

  • Key challenges:
  • 1. representation and prediction: learn to predict molecular properties
  • 2. generation and optimization: realize target molecules with better properties programmatically
  • 3. understanding: uncover principles (or diagnose errors) underlying complex predictions

SLIDE 25

De novo molecule optimization

  • Goal: We aim to programmatically turn precursor molecules into molecules that satisfy given design specifications

SLIDE 26

De novo molecule optimization

  • Goal: We aim to programmatically turn precursor molecules into molecules that satisfy given design specifications
  • Similar but … better drug-likeness
SLIDE 27

De novo molecule optimization

  • Goal: We aim to programmatically turn precursor molecules into molecules that satisfy given design specifications
  • Similar but … better drug-likeness
  • Similar but … better solubility
SLIDE 28

De novo molecule optimization

  • Goal: We aim to programmatically turn precursor molecules into molecules that satisfy given design specifications
  • Similar but … better drug-likeness
  • Similar but … better solubility
  • Need to learn a molecule-to-molecule mapping (i.e., graph-to-graph)
SLIDE 29

Molecule optimization as Graph Translation

  • Goal: We aim to programmatically turn precursor molecules into molecules that satisfy given design specifications

(figure: source X → Encode → Decode → target Y)

SLIDE 30

Molecule optimization as Graph Translation

  • Goal: We aim to programmatically turn precursor molecules into molecules that satisfy given design specifications
  • The training set consists of (source, target) molecular pairs

(figure: source X → Encode → Decode → target Y, with example source/target pairs)

SLIDE 31

Molecule optimization as Graph Translation

  • Goal: We aim to programmatically turn precursor molecules into molecules that satisfy given design specifications
  • The training set consists of (source, target) molecular pairs
  • Key challenges: graph generation, diversity, multi-criteria optimization

(figure: source X → Encode → Decode → target Y, with example source/target pairs)

SLIDE 32

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

Hierarchical GNN encoder (more expressive power) → source vectors + target specs

SLIDE 33

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

Hierarchical GNN encoder (more expressive power) → source vectors → hierarchical graph decoder (the reverse of the encoding process)

SLIDE 34

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

(figure: original molecular graph; substructure graph; substructure graph with attachments)

SLIDE 36

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

(figure: a dictionary of substructures)

SLIDE 37

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

Predict the next substructure

SLIDE 38

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

Expand the substructure graph

SLIDE 39

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

How to attach them? The substructures are still “disconnected”

SLIDE 40

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

Predict attaching points


SLIDE 42

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

Predict attaching points in the neighbor substructure

SLIDE 43

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

Update the molecular graph

SLIDE 44

Graph-to-Graph Translation (Decoder)

  • Modifying a precursor to meet target specifications

Update atom / substructure representations
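The decoding steps walked through on these slides (predict the next substructure, predict attaching points, expand the graph, refresh representations) form an autoregressive loop. A structural sketch, where `predict_next`, `predict_attach`, and `update` stand in for the learned networks:

```python
def decode(source_vecs, dictionary, predict_next, predict_attach, update,
           max_steps=20):
    """Sketch of the hierarchical decoding loop; the three callables are
    placeholders for trained models, not real APIs."""
    graph = []                                 # partial molecule
    state = update(graph, source_vecs)         # initial representations
    for _ in range(max_steps):
        sub = predict_next(state, dictionary)  # next substructure, or stop
        if sub is None:
            break
        site_old, site_new = predict_attach(state, sub)  # attaching points
        graph.append((sub, site_old, site_new))          # expand the graph
        state = update(graph, source_vecs)     # refresh atom/substructure reps
    return graph
```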


SLIDE 46

De novo molecule optimization: diversity

  • Goal: We aim to programmatically turn precursor molecules into versions that satisfy given design specifications

(figure: input X → Encode → Decode with latent z ~ P(z) for diversity → Target 1; variational inference)
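Diversity comes from the latent variable: different draws of z ~ P(z) decode the same precursor into different targets. A minimal sketch, where `decode` is a stand-in for the learned hierarchical decoder:

```python
import numpy as np

def translate(x_vec, decode, n_samples=3, z_dim=8, seed=0):
    """Diverse translation: each latent draw z ~ N(0, I) yields a
    (potentially different) target for the same precursor X."""
    rng = np.random.default_rng(seed)
    outputs = []
    for _ in range(n_samples):
        z = rng.standard_normal(z_dim)      # diversity comes from z
        outputs.append(decode(np.concatenate([x_vec, z])))
    return outputs
```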


SLIDE 48

De novo molecule optimization: specs

  • Goal: We aim to programmatically turn precursor molecules into versions that satisfy given design specifications

(figure: input X → Encode → Decode with latent z ~ P(z) and design specs g, e.g., drug-like & DRD2-active → Target 1)

SLIDE 49

De novo molecule optimization: specs

(figure: the same model with design specs g = drug-like & DRD2-inactive yields different targets)

SLIDE 50

Example results (DRD2)

  • Single-property optimization: DRD2 success % (from inactive to active)

MMPA 46.4, Seq2Seq 75.9, JT-G2G 77.8, AtomG2G 75.8, HierG2G 85.9

SLIDE 51

Example results (DRD2)

  • Single-property optimization: DRD2 success % (from inactive to active)

MMPA 46.4, Seq2Seq 75.9, JT-G2G 77.8, AtomG2G 75.8 (node-by-node), HierG2G 85.9 (hierarchical)

SLIDE 52

Example results (drug-likeness)

  • Single-property optimization: drug-likeness (QED) success % (QED > 0.9)

MMPA 32.9, Seq2Seq 58.5, JT-G2G 59.9, AtomG2G 73.6, HierG2G 76.9

SLIDE 53

Example results (multiple design specs)

  • Multi-criteria success % (design-spec-driven generation)
  • Challenge: only 1.6% of training pairs are both drug-like and DRD2-active

Drug-like and DRD2-active: Seq2Seq 67.8, AtomG2G 74.5, HierG2G 78.5
Drug-like but DRD2-inactive: Seq2Seq 5, AtomG2G 12.5, HierG2G 13

SLIDE 54

Disentangling what’s important

  • Models are complicated; it is important to assess how individual parts contribute to performance

Method                 QED     DRD2
HierG2G                76.9%   85.9%
· atom-based decoder   76.1%   75.0%
· two-layer encoder    75.8%   83.5%
· one-layer encoder    67.8%   74.1%
· GRU MPN              72.6%   83.7%

SLIDE 55

Still many ways to improve

  • Generating complex objects (e.g., molecules) is hard
  • Assessing the quality of the object (property prediction) is substantially easier

Translate: hard to realize; check/predict: easier

Constraints:
  • molecular similarity: sim(X, Y) ≥ 0.4
  • drug-likeness: QED(Y) ≥ 0.9

SLIDE 56

Property-guided generation

  • Generating complex objects (e.g., molecules) is hard
  • Assessing the quality of the object (property prediction) is substantially easier
  • Target augmentation: we can use the property predictor to generate additional (self-supervised) data for the generative model

Constraints:
  • molecular similarity: sim(X, Y) ≥ 0.4
  • drug-likeness: QED(Y) ≥ 0.9

Iterative Target Augmentation for Effective Conditional Generation (2019). K. Yang, W. Jin, K. Swanson, R. Barzilay, and T. Jaakkola

SLIDE 57

Target augmentation = stochastic EM

  • Objective: maximize the log-probability that generated candidates satisfy the properties of interest (the structure is now a latent variable)

    Σ_{X ∈ source set}  log Σ_Y  P(target specs | Y) · P(Y | X; θ)

  • E-step: generate candidates from the current model; filter/reweight by the property predictor (~ posterior samples)
  • M-step: maximize the log-probability of new (weighted) targets
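The alternating E/M procedure above can be sketched as a short loop. `sample`, `accept`, and `fit` are stand-ins for the generative model, the property predictor used as a filter, and the training step, respectively:

```python
def target_augmentation(sources, sample, accept, fit, rounds=3, k=5):
    """Sketch of iterative target augmentation (stochastic EM).
    sample(x) draws a candidate Y from the current model; accept(x, y)
    is the property-predictor filter; fit(pairs) retrains the translator."""
    pairs = []
    for _ in range(rounds):
        pairs = []
        for x in sources:
            # E-step: draw candidates, keep those the predictor accepts
            pairs += [(x, y) for y in (sample(x) for _ in range(k))
                      if accept(x, y)]
        # M-step: refit on the accepted (pseudo-)target pairs
        fit(pairs)
    return pairs
```

Because the predictor only has to check candidates rather than generate them, even a noisy predictor supplies useful extra supervision, which is the robustness point made on slide 64.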


SLIDE 62

Example results: gains

  • Substantial gains in translation/optimization success %

DRD2 success: HierG2G 85.9 → HierG2G++ 95.6
QED success: HierG2G 76.6 → HierG2G++ 87.9

SLIDE 63

Example results: gains

  • Consistently improving across iterations

(figure: HierG2G validation-set success rate (80–100%) over augmentation iterations 1–18)

SLIDE 64

Example results: robustness

  • The gains are robust to errors in the property predictor
  • Note: curves are for a weaker seq2seq model; its baseline performance is much lower, but its final performance with augmentation is comparable to HierG2G

(figure: QED and DRD2 success (25–100%) vs. property-predictor RMSE, with and without augmentation)

SLIDE 65

Summary

  • Molecules as structured objects provide a rich domain for developing ML tools; the key underlying challenges are shared with other areas involving generation/manipulation of diverse objects
  • ML molecular design methods are rapidly becoming viable tools for drug discovery

  • Several key challenges remain, however:
  • effective multi-criteria optimization
  • incorporating 3D features, physical constraints
  • generalizing to new, unexplored chemical spaces (domain transfer)
  • explainability, etc.