Hierarchical Generation of Molecular Graphs using Structural Motifs

SLIDE 1

Hierarchical Generation of Molecular Graphs using Structural Motifs

Wengong Jin, Regina Barzilay, Tommi Jaakkola MIT CSAIL

SLIDE 2

Drug Discovery via Generative Models

  • Drug discovery: finding molecules with desired chemical properties
  • The primary challenge: large search space

[Figure: searching 10^30 potential candidates to find a molecule (Remdesivir?) that meets the criterion: safe, cures COVID]
SLIDE 3

Drug Discovery via Generative Models

  • Generative models can be used to search the chemical space efficiently
  • Given a specified criterion, the model generates a molecule with the desired properties.

[Figure: a generative model conditioned on the criterion (safe, cures COVID) generates Remdesivir]

SLIDE 4

Molecular Graph Generation

  • Consider connected graphs…
  • Different types of graphs require different generation methods.
  • What kind of generation method is suitable for molecules?

[Figure: graph types along a complexity axis: line graph (text), grid graph (images), fully connected graph, low tree-width graph (molecules)]

SLIDE 5

Previous Methods for Molecule Generation

  • Atom based methods: CG-VAE (Liu et al. 2018), DeepGMG (Li et al. 2018), GraphRNN (You et al. 2018), and more

[Figure: atom based generation of an example molecule]

SLIDE 6

Previous Methods for Molecule Generation

  • Atom based methods: CG-VAE (Liu et al. 2018), DeepGMG (Li et al. 2018), GraphRNN (You et al. 2018), and more
  • Substructure based methods: JT-VAE (Jin et al., 2018)
  • Incorporating inductive bias (i.e., low tree-width) into generation
  • Each step generates a cycle or an edge

[Figure: atom based vs. substructure based generation of the same example molecule]

SLIDE 7

Previous methods: limitation

  • Atom based methods: CG-VAE (Liu et al. 2018), DeepGMG (Li et al. 2018), GraphRNN (You et al. 2018), and more
  • Substructure based methods: JT-VAE (Jin et al., 2018)

[Figure: reconstruction accuracy w.r.t. molecule size for JT-VAE and CG-VAE]

SLIDE 8

Previous methods: limitation

  • Atom based methods: CG-VAE (Liu et al. 2018), DeepGMG (Li et al. 2018), GraphRNN (You et al. 2018), and more
  • Substructure based methods: JT-VAE (Jin et al., 2018)

[Figure: reconstruction accuracy w.r.t. molecule size for JT-VAE and CG-VAE; annotation: large molecules (e.g., peptides, polymers)]

SLIDE 9

Failure in Generating Large Molecules

  • Atom based methods: CG-VAE (Liu et al. 2018), DeepGMG (Li et al. 2018), GraphRNN (You et al. 2018), and more
  • Many generation steps: vanishing gradients and error accumulation

[Figure: example large molecule. CG-VAE needs 70 atom predictions + 70 bond predictions]
SLIDE 10

Failure in Generating Large Molecules

  • Atom based methods: CG-VAE (Liu et al. 2018), DeepGMG (Li et al. 2018), GraphRNN (You et al. 2018), and more
  • Substructure based methods: JT-VAE (Jin et al., 2018)
  • The JT-VAE decoder requires each substructure neighborhood to be assembled in one go, making it combinatorially challenging to handle large substructures.

[Figure: example large molecule. JT-VAE needs 35 substructure (ring/bond) predictions]

SLIDE 11

Larger Building Blocks: Motifs

  • JT-VAE only considered single rings and bonds as building blocks
  • How about using larger building blocks: motifs with flexible structures, not restricted to rings and bonds?
  • Large molecules such as polymers exhibit clear hierarchical structure, being built from repeated structural motifs.
  • Only 11 steps to generate this polymer structure.

[Figure: example polymer structure]
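A rough way to see why the number of generation steps matters: the sketch below assumes, purely for illustration, that every prediction is correct independently with the same probability. The 0.99 per-step accuracy is a made-up value and the function name is hypothetical; only the step counts (70 + 70, 35, and 11) come from the slides.

```python
def exact_reconstruction_prob(per_step_accuracy: float, num_steps: int) -> float:
    """Probability that every step is right, assuming independent steps."""
    return per_step_accuracy ** num_steps

# Step counts quoted on the slides for the same example molecule.
steps = {
    "atom based (CG-VAE)": 70 + 70,
    "substructure based (JT-VAE)": 35,
    "motif based": 11,
}

# 0.99 is an illustrative per-step accuracy, not a measured number.
probs = {name: exact_reconstruction_prob(0.99, n) for name, n in steps.items()}

for name, p in probs.items():
    print(f"{name}: {steps[name]} steps -> {p:.3f}")
```

Under this simplified model, fewer steps always gives a higher chance of an error-free reconstruction, which matches the direction of the accuracy curves shown in the talk.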

SLIDE 12

NLP Analogy

  • Atom-based generation == character-based generation
  • Substructure-based generation == word-based generation
  • Motif-based generation == phrase-based generation

[Figure: example substructures (rings and bonds only), analogous to word-based generation, and example motifs (structures can be flexible), analogous to phrase-based generation]
SLIDE 13

Our New Architecture: HierVAE

  • Generates molecules motif by motif
  • Faster and more efficient
  • Much higher reconstruction accuracy for large molecules

[Figure: reconstruction accuracy w.r.t. molecule size for motif based (ours), substructure based, and atom based generation]

SLIDE 14

Our New Architecture: HierVAE

  • Motif extraction from data
  • Motif extraction is based on heuristics
  • Later I will discuss how motifs can be learned (based on given properties).
  • Hierarchical Graph Encoder
  • Representing molecules at both motif and atom level.
  • Designed to match the decoding process
  • Hierarchical Graph Decoder
  • Each generation step needs to resolve:
  • 1. What’s the next motif?
  • 2. How should it be attached to the current graph?
SLIDE 15

Motif Extraction Algorithm

  • A molecule is decomposed into disconnected motifs as follows:
  • 1. Find all the bridge bonds (u, v) such that either u or v is part of a ring.

[Figure: example molecule with its bridge bonds marked]

SLIDE 16

Motif Extraction Algorithm

  • A molecule is decomposed into disconnected motifs as follows:
  • 1. Find all the bridge bonds (u, v) such that either u or v is part of a ring.
  • 2. Detach all bridge bonds from their neighbors.

[Figure: the bridge bonds of the example molecule are detached]

SLIDE 17

Motif Extraction Algorithm

  • A molecule is decomposed into disconnected components as follows:
  • 1. Find all the bridge bonds (u, v) such that either u or v is part of a ring.
  • 2. Detach all bridge bonds from their neighbors.
  • 3. Select a component as a motif if it occurs frequently in the training set.

[Figure: a component that occurs frequently is selected as a motif]

SLIDE 18

Motif Extraction Algorithm

  • A molecule is decomposed into disconnected components as follows:
  • 1. Find all the bridge bonds (u, v) such that either u or v is part of a ring.
  • 2. Detach all bridge bonds from their neighbors.
  • 3. Select a component as a motif if it occurs frequently in the training set.
  • 4. If a component is not selected, further decompose it into basic rings and bonds.

[Figure: components that were not selected are broken into bonds (one into three bonds, one into two bonds), each of which becomes a motif]
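Steps 1 and 2 above can be sketched on a plain graph, without a chemistry toolkit. This is a minimal illustration, not the authors' code: atoms are integers, bonds are pairs, and the function names `find_bridges` and `split_at_ring_bridges` are hypothetical. It assumes that "bridge bond" means a bond that lies on no ring, and that an atom is "part of a ring" if it has at least one non-bridge bond. Steps 3 and 4 (frequency-based selection and further decomposition) are not covered here.

```python
from collections import defaultdict

def find_bridges(n_atoms, bonds):
    """Bonds whose removal disconnects the graph, i.e. bonds on no ring."""
    adj = defaultdict(list)
    for idx, (u, v) in enumerate(bonds):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    disc, low, bridges = {}, {}, set()
    counter = 0

    def dfs(u, parent_edge):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        for v, idx in adj[u]:
            if idx == parent_edge:
                continue
            if v in disc:                      # back edge closes a ring
                low[u] = min(low[u], disc[v])
            else:
                dfs(v, idx)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:           # no ring passes through (u, v)
                    bridges.add(bonds[idx])

    for atom in range(n_atoms):
        if atom not in disc:
            dfs(atom, -1)
    return bridges

def split_at_ring_bridges(n_atoms, bonds):
    """Step 1: pick bridge bonds touching a ring. Step 2: detach them."""
    bridges = find_bridges(n_atoms, bonds)
    ring_atoms = {a for b in bonds if b not in bridges for a in b}
    cut = {(u, v) for (u, v) in bridges if u in ring_atoms or v in ring_atoms}

    adj = defaultdict(set)
    for u, v in bonds:
        if (u, v) not in cut:
            adj[u].add(v)
            adj[v].add(u)

    seen, components = set(), []
    for atom in range(n_atoms):
        if atom in seen:
            continue
        stack, comp = [atom], set()
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        components.append(sorted(comp))
    return cut, components

# Toy molecule: a six-membered ring (atoms 0-5) carrying a CF3-like group
# (atom 6 bonded to atoms 7, 8, 9).
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0),
         (0, 6), (6, 7), (6, 8), (6, 9)]
cut, components = split_at_ring_bridges(10, bonds)
print(cut)         # the single bond joining the ring to the side group
print(components)  # the ring and the side group as separate components
```

Only the bond between the ring and the side group is detached; the three bonds inside the side group are also bridges, but neither endpoint is on a ring, so they stay.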

SLIDE 19

Mark attaching points

  • Motif decomposition loses atom-level connectivity information
  • For ease of reconstruction, we propose to mark attaching points in each motif.

[Figure: motifs of the example molecule with their attaching points marked]

SLIDE 20

Motif Vocabulary

  • We can construct a motif vocabulary from a training set (usually fewer than 500 motifs)
  • Each motif also has a vocabulary of possible attaching point configurations.
  • Usually fewer than 10 configurations per motif, because motifs have regular attachment patterns.
  • The attachment vocabulary covers >97% of the molecules in the test set.

[Figure: example motifs and their attaching point configurations]
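The two vocabularies and the coverage figure can be illustrated with a small sketch. Everything here is hypothetical: the string labels stand in for real motif structures and attaching point configurations, the `min_count` threshold is an arbitrary illustrative value, and in the method described above infrequent components are decomposed further rather than simply left out.

```python
from collections import Counter, defaultdict

# Hypothetical decomposed molecules: lists of (motif, attachment configuration).
train = [
    [("ring_A", "A:1"), ("CF3", "C:0")],
    [("ring_A", "A:1"), ("ring_B", "B:2")],
    [("ring_A", "A:1,4"), ("CF3", "C:0")],
]
test = [
    [("ring_A", "A:1,4"), ("CF3", "C:0")],
    [("ring_A", "A:2"), ("CF3", "C:0")],   # attachment configuration unseen in training
]

def build_vocab(molecules, min_count=2):
    """Keep frequent motifs and record the attachment configurations seen for each."""
    counts = Counter(motif for mol in molecules for motif, _ in mol)
    motif_vocab = {m for m, c in counts.items() if c >= min_count}
    attach_vocab = defaultdict(set)
    for mol in molecules:
        for motif, config in mol:
            if motif in motif_vocab:
                attach_vocab[motif].add(config)
    return motif_vocab, dict(attach_vocab)

def coverage(molecules, motif_vocab, attach_vocab):
    """Fraction of molecules whose every (motif, configuration) pair is in the vocabulary."""
    covered = sum(
        all(m in motif_vocab and c in attach_vocab[m] for m, c in mol)
        for mol in molecules
    )
    return covered / len(molecules)

motif_vocab, attach_vocab = build_vocab(train)
test_coverage = coverage(test, motif_vocab, attach_vocab)
print(sorted(motif_vocab), attach_vocab, test_coverage)
```

A molecule counts as covered only if all of its motifs and attachment configurations were seen during training, which is one plausible reading of the coverage number quoted on the slide.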

SLIDE 21

Generation Process

[Figure: current state of the partially generated molecule]

During generation, we maintain all possible positions to which new motifs will be attached.

SLIDE 22

Generation Process

Step 1: Motif Prediction

[Figure: given the current state, the next motif is chosen from the motif vocabulary]

SLIDE 23

Generation Process

Step 2: Attachment Prediction

[Figure: given the current state and the chosen motif, its attaching points are chosen from the attachment vocabulary]

SLIDE 24

Generation Process

Step 3: Graph Prediction

[Figure: the new motif is joined to the current state by matching atoms]

SLIDE 25

Generation Process

[Figure: current state and next state, after the new motif has been attached]

SLIDE 26

Generation Process

[Figure: current state and next state, after the new motif has been attached]

  • JT-VAE assembles each neighborhood (multiple motifs) in one go.
  • HierVAE decomposes the assembly process into multiple “baby steps”
  • First predict attaching points, then matching atoms.
  • Assembles one motif at a time, not the entire neighborhood.
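The three-step decomposition can be sketched as a loop over open attachment positions. This is a simplified skeleton under stated assumptions, not the HierVAE decoder: the `PartialMolecule` class, the function names, and the three `predict_*` callables are hypothetical stand-ins for the learned classifiers, and the stopping rule (a predictor returning `None`) is an illustrative choice.

```python
from dataclasses import dataclass, field

@dataclass
class PartialMolecule:
    motifs: list = field(default_factory=list)      # motifs placed so far
    links: list = field(default_factory=list)       # (parent_idx, parent_atom, child_idx, child_atom)
    open_sites: list = field(default_factory=list)  # (motif_idx, atom) positions still available

def generation_step(state, site, predict_motif, predict_attachment, predict_match):
    """Attach one motif at one open site, resolving the three predictions in order."""
    motif = predict_motif(state, site)                  # step 1: which motif comes next?
    state.open_sites.remove(site)
    if motif is None:                                   # the predictor chose to close this site
        return state
    attach_atoms = predict_attachment(state, motif)     # step 2: which attaching points?
    child_atom = predict_match(state, site, attach_atoms)  # step 3: which atom matches the site?
    child_idx = len(state.motifs)
    state.motifs.append(motif)
    state.links.append((site[0], site[1], child_idx, child_atom))
    # Remaining attaching points of the new motif become open sites.
    state.open_sites.extend((child_idx, a) for a in attach_atoms if a != child_atom)
    return state

# Toy stand-ins for the learned classifiers (purely illustrative).
script = iter(["ring_B", None])
predict_motif = lambda state, site: next(script)
predict_attachment = lambda state, motif: [0, 3]
predict_match = lambda state, site, atoms: atoms[0]

state = PartialMolecule(motifs=["ring_A"], open_sites=[(0, 2)])
while state.open_sites:
    generation_step(state, state.open_sites[0],
                    predict_motif, predict_attachment, predict_match)

print(state.motifs, state.links)
```

Each call adds at most one motif, so the hard combinatorial choice of assembling a whole neighborhood at once never arises.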
SLIDE 27

Hierarchical Graph Encoder (bottom up)

  • Atom layer serves graph prediction (step 3)

[Figure: atom layer of the example molecule]

SLIDE 28

Hierarchical Graph Encoder (bottom up)

  • Attachment layer serves attachment prediction (step 2)
  • Atom layer serves graph prediction (step 3)

[Figure: atom layer and attachment layer of the example molecule]

SLIDE 29

Hierarchical Graph Encoder (bottom up)

  • Motif layer is designed for motif prediction (step 1)
  • Attachment layer is designed for attachment prediction (step 2)
  • Atom layer is designed for graph prediction (step 3)

[Figure: atom layer, attachment layer, and motif layer of the example molecule]

SLIDE 30

Hierarchical Graph Encoder (bottom up)

  • Run motif layer message passing network
  • Run attachment layer message passing network
  • Run atom layer message passing network

[Figure: atom, attachment, and motif layers; between adjacent layers, messages are propagated to the corresponding nodes; the outputs are atom vectors, attachment vectors, and motif vectors]
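The bottom-up flow between the three layers can be sketched with a toy message passing routine. This is an illustration under strong simplifications, not the encoder from the talk: node states are single floats instead of vectors, aggregation is a plain sum with no learned weights, each atom belongs to exactly one motif, and all function and parameter names are hypothetical. It assumes the order atom layer, then attachment layer, then motif layer, as the "bottom up" title suggests.

```python
def message_passing(features, edges, rounds=2):
    """Toy sum-aggregation: h_v <- f_v + sum of neighbor states, repeated."""
    neighbors = {v: [] for v in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    h = dict(features)
    for _ in range(rounds):
        h = {v: features[v] + sum(h[u] for u in neighbors[v]) for v in features}
    return h

def hierarchical_encode(atom_feats, atom_edges, motif_atoms,
                        attach_embed, motif_embed, motif_edges):
    # 1. Atom layer: message passing over the molecular graph.
    h_atom = message_passing(atom_feats, atom_edges)
    # 2. Attachment layer: each node's input combines its own embedding
    #    with the atom vectors of the atoms it contains.
    attach_in = {m: attach_embed[m] + sum(h_atom[a] for a in atoms)
                 for m, atoms in motif_atoms.items()}
    h_attach = message_passing(attach_in, motif_edges)
    # 3. Motif layer: each node's input combines its motif embedding
    #    with the corresponding attachment vector.
    motif_in = {m: motif_embed[m] + h_attach[m] for m in motif_atoms}
    h_motif = message_passing(motif_in, motif_edges)
    return h_atom, h_attach, h_motif

# Toy molecule: three atoms in a chain, grouped into two motifs.
h_atom, h_attach, h_motif = hierarchical_encode(
    atom_feats={0: 1.0, 1: 1.0, 2: 1.0},
    atom_edges=[(0, 1), (1, 2)],
    motif_atoms={"A": [0, 1], "B": [2]},
    attach_embed={"A": 0.5, "B": 0.25},
    motif_embed={"A": 1.0, "B": 2.0},
    motif_edges=[("A", "B")],
)
print(h_atom, h_attach, h_motif)
```

The point of the sketch is the data flow: each layer's inputs depend on the outputs of the layer below, so the motif vectors summarize attachment and atom information.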

SLIDE 31

Hierarchical Graph Decoder (top down)

  • Motif Prediction
  • Classification: predict the right motif in the vocabulary

[Figure: motif, attachment, and atom vectors, with step 1 marked]

SLIDE 32

Hierarchical Graph Decoder (top down)

  • Motif Prediction
  • Classification: predict the right motif in the vocabulary
  • Attachment Prediction
  • Classification: predict the right attachment in the vocabulary

[Figure: motif, attachment, and atom vectors, with steps 1 and 2 marked]

SLIDE 33

Hierarchical Graph Decoder (top down)

  • Motif Prediction
  • Classification: predict the right motif in the vocabulary
  • Attachment Prediction
  • Classification: predict the right attachment in the vocabulary
  • Graph Prediction
  • Classification: predict the corresponding matching atoms

[Figure: motif, attachment, and atom vectors, with steps 1, 2, and 3 marked]
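Since all three decoder predictions are framed as classification, a softmax with a cross-entropy loss is one natural way to picture them. The sketch below is an assumption-laden illustration: the scores are made up, the labels are placeholders, and summing the three losses into one per-step loss is a common choice that the slides do not spell out.

```python
import math

def softmax(scores):
    """Turn raw scores over a vocabulary into probabilities."""
    top = max(scores.values())                  # subtract the max for numerical stability
    exps = {k: math.exp(s - top) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

def cross_entropy(scores, target):
    """Negative log-probability assigned to the correct entry."""
    return -math.log(softmax(scores)[target])

# Hypothetical decoder scores for one generation step.
motif_scores = {"ring_A": 2.0, "ring_B": 0.5, "CF3": -1.0}
attach_scores = {"A:1": 1.2, "A:1,4": 0.3}
atom_match_scores = {"atom_0": 0.1, "atom_3": 1.5}

# Assumed training signal: one cross-entropy term per prediction.
step_loss = (cross_entropy(motif_scores, "ring_A")
             + cross_entropy(attach_scores, "A:1")
             + cross_entropy(atom_match_scores, "atom_3"))

# At generation time, the highest-scoring entry can be taken (or sampled).
predicted_motif = max(motif_scores, key=motif_scores.get)
print(predicted_motif, round(step_loss, 3))
```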

SLIDE 34

Experiment 1: Polymer Generation

Dataset [1]: 86K polymers (76K training, 5K validation, 5K testing)

Evaluation Metrics: Sample 5000 molecules from models

  • Reconstruction accuracy
  • Validity
  • Uniqueness
  • Diversity
  • Property statistics: Fréchet distance between property distributions of molecules in the generated set and test set (logP, QED, SA, molecular weight).
  • Structural statistics:
  • Nearest neighbor similarity (SNN)
  • Fragment similarity (Frag)
  • Scaffold similarity (Scaf)

[1] St. John et al., “Message-passing neural networks for high-throughput polymer screening.” The Journal of Chemical Physics, 150(23):234111, 2019
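Three of the sample-based metrics can be sketched in a few lines. The definitions below are common conventions and are assumptions here, since the slides do not define them: validity as the fraction of samples that pass a validity check, uniqueness as the fraction of distinct molecules among the valid ones, and diversity as one minus the mean pairwise Tanimoto similarity. The `is_valid` check and the integer-set "fingerprints" are toy stand-ins for a chemistry toolkit.

```python
from itertools import combinations

def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def validity(samples, is_valid) -> float:
    return sum(1 for s in samples if is_valid(s)) / len(samples)

def uniqueness(valid_samples) -> float:
    return len(set(valid_samples)) / len(valid_samples)

def diversity(fingerprints) -> float:
    pairs = list(combinations(fingerprints, 2))
    return 1.0 - sum(tanimoto(a, b) for a, b in pairs) / len(pairs)

# Toy data: strings stand in for generated molecules.
samples = ["CCO", "CCO", "c1ccccc1", "C(("]
is_valid = lambda s: s.count("(") == s.count(")")   # toy check, not real chemistry
valid = [s for s in samples if is_valid(s)]
fingerprints = [{1, 2, 3}, {2, 3, 4}, {7, 8}]

print(validity(samples, is_valid), uniqueness(valid), diversity(fingerprints))
```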
SLIDE 35

Experiment 1: Polymer Generation

[Figure: example polymer structures]
SLIDE 36

Experiment 1: Polymer Generation

[Figure: reconstruction accuracy w.r.t. molecule size for motif based (ours), substructure based, and atom based generation]

[Figure: training speed (mol/sec) for CG-VAE (atoms), JT-VAE (bonds & rings), and HierVAE (motifs)]

[Figure: motif size vs. frequency]

SLIDE 37

Experiment 2: Lead optimization

  • Goal: We aim to transform given molecules into molecules that satisfy given design specifications (first introduced in Jin et al., 2019)

Learning Multimodal Graph-to-Graph Translation for Molecular Optimization, W. Jin, R. Barzilay, T. Jaakkola, ICLR 2019

SLIDE 38

Experiment 2: Lead optimization

  • Goal: We aim to transform given molecules into molecules that satisfy given design specifications (first introduced in Jin et al., 2019)
  • Similar but …
  • Better drug-likeness

SLIDE 39

Experiment 2: Lead optimization

  • Goal: We aim to transform given molecules into molecules that satisfy given design specifications (first introduced in Jin et al., 2019)
  • Similar but …
  • Better drug-likeness
  • Similar but …
  • Better solubility

SLIDE 40

Experiment 2: Lead optimization

  • Goal: We aim to transform given molecules into molecules that satisfy given design specifications (first introduced in Jin et al., 2019)
  • Similar but …
  • Better drug-likeness
  • Similar but …
  • Better solubility
  • Need to learn a molecule-to-molecule mapping (i.e., graph-to-graph)

SLIDE 41

Lead optimization as Graph Translation

  • Goal: We aim to transform given molecules into molecules that satisfy given design specifications (first introduced in Jin et al., 2019)

[Figure: the source molecule X is encoded and then decoded into the target molecule Y]

SLIDE 42

Lead optimization as Graph Translation

  • Goal: We aim to transform given molecules into molecules that satisfy given design specifications (first introduced in Jin et al., 2019)
  • The training set consists of (source, target) molecular pairs.

[Figure: example (source, target) pairs; the source molecule X is encoded and then decoded into the target molecule Y]

SLIDE 43

Lead optimization as Graph Translation

  • Goal: We aim to transform given molecules into molecules that satisfy given design specifications
  • The training set consists of (source, target) molecular pairs.
  • It is easy to modify HierVAE into a translation model (just add attention layers)

[Figure: example (source, target) pairs; the source molecule X is encoded and then decoded into the target molecule Y]

SLIDE 44

DRD2 Optimization

  • Single property optimization: DRD2 success % (from inactive to active)
  • We use a property predictor [1] to evaluate DRD2 activity of generated compounds

Constraints: Similarity(X, Y) > 0.4, DRD2(Y) > 0.5, DRD2(X) < 0.05

[Figure: DRD2 success % by method: MMPA 46.4, Seq2Seq 75.9, JT-G2G 77.8, AtomG2G 75.8, HierG2G 85.9]

[1] Olivecrona et al., Molecular de-novo design through deep reinforcement learning, Journal of Cheminformatics, 2017
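The success percentage can be read as the share of source molecules whose translation meets both thresholds. The sketch below is a simplified, hypothetical version: it assumes one translated candidate per source molecule, takes precomputed similarity and property scores as plain numbers, and uses the strict inequalities shown on the slide. The function name and the example scores are made up.

```python
def success_rate(pairs, sim_threshold=0.4, property_threshold=0.5):
    """Percentage of translations that stay similar and reach the property target.

    pairs: list of (similarity(X, Y), property(Y)) tuples, one per source molecule X.
    """
    hits = sum(
        1 for sim, prop in pairs
        if sim > sim_threshold and prop > property_threshold
    )
    return 100.0 * hits / len(pairs)

# Made-up scores for four translated molecules.
pairs = [
    (0.50, 0.90),   # similar enough and active: success
    (0.30, 0.90),   # active but too dissimilar
    (0.60, 0.20),   # similar but still inactive
    (0.45, 0.51),   # success
]
drd2_success = success_rate(pairs)
print(drd2_success)
```

The same function covers the QED task by passing `property_threshold=0.9`, matching the QED(Y) > 0.9 condition on the following slide.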

SLIDE 45

QED Optimization

  • Single property optimization: drug-likeness (QED) success %
  • QED is computed by RDKit

Constraints: Similarity(X, Y) > 0.4, QED(Y) > 0.9, QED(X) < 0.8

[Figure: QED success % by method: MMPA 32.9, Seq2Seq 58.5, JT-G2G 59.9, AtomG2G 73.6, HierG2G 76.9]
SLIDE 46

Summary

  • Molecular graph generation is an important problem for ML and drug discovery
  • In this paper, we proposed HierVAE to generate molecules motif by motif.
  • HierVAE works better than previous methods on both large molecules (polymers) and small molecules (graph translation).
  • Since motif structures are flexible, how should we construct a good motif vocabulary?
  • Jin et al., Multi-objective molecule generation using interpretable substructures. ICML 2020
  • Use interpretability techniques to construct a motif vocabulary relevant for the downstream task (poster ID 2748)