An Efficient Parallel Algorithm for Accelerating Computational Protein Design


An Efficient Parallel Algorithm for Accelerating Computational Protein Design 1 Institute for Interdisciplinary Information Sciences Tsinghua University 2 Department of Computer Science, Department of Biochemistry Duke University Jul, 2014


slide-1
SLIDE 1

An Efficient Parallel Algorithm for Accelerating Computational Protein Design

Yichao Zhou¹, Wei Xu¹, Bruce R. Donald², Jianyang Zeng¹,*

1Institute for Interdisciplinary Information Sciences

Tsinghua University

2Department of Computer Science, Department of Biochemistry

Duke University

Jul, 2014

Yichao Zhou , Wei Xu , Bruce R. Donald , Jianyang Zeng An Efficient Parallel Algorithm for Computational Protein Design

slide-2
SLIDE 2

Why Do We Need Protein Design?

Applications

  • New drug discovery
  • Enzyme optimization
  • Drug resistance prediction
  • New biosensor design

slide-3
SLIDE 3

What is Structure-Based Protein Design?

[Figure: wild-type protein's structure → design algorithm → 1D amino acid sequence. Toolbox: rotamer library, energy functions, backbone template.]

Example of a 1D amino acid sequence:

Ile Pro His · · · Gly Gly Pro Glu Val Gly Ser Asp Pro Ala Ile Trp · · · Ile Ser Ile

slide-4
SLIDE 4

Protein Design as an Optimization Problem

[Figure: a backbone with two rotamers $i_r$ and $j_s$, annotated with the self energy $E_1(i_r)$ and the pairwise energy $E_2(i_r, j_s)$.]

NP-hard!

Energy function (optimization target):

$$E_T = \sum_{i_r} E_1(i_r) + \sum_{i_r} \sum_{j_s,\, i<j} E_2(i_r, j_s)$$

  • $E_1(i_r)$: self energy of rotamer $i_r$
  • $E_2(i_r, j_s)$: pairwise energy between rotamers $i_r$ and $j_s$
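As a rough illustration of the energy function on this slide, the sketch below evaluates $E_T$ for a full rotamer assignment and finds the lowest-energy assignment by brute force. The energy values, the table layout (`E1`, `E2`) and the function names are hypothetical; they are not taken from the talk or from OSPREY. Brute force is only feasible here because the toy problem has four assignments, which is the point of the "NP-hard" remark.

```python
from itertools import combinations, product

# Hypothetical toy energies: 2 residue positions, 2 rotamers each.
# E1[(i, r)] is the self energy of rotamer r at position i.
E1 = {(0, 0): -1.0, (0, 1): -2.0, (1, 0): -0.5, (1, 1): -1.5}
# E2[(i, r, j, s)] with i < j is the pairwise energy between rotamers i_r and j_s.
E2 = {
    (0, 0, 1, 0): -0.25, (0, 0, 1, 1): 0.5,
    (0, 1, 1, 0): 1.0, (0, 1, 1, 1): -0.75,
}

def total_energy(assignment):
    """E_T for a full assignment; assignment[i] is the rotamer chosen at position i."""
    self_term = sum(E1[(i, r)] for i, r in enumerate(assignment))
    pair_term = sum(
        E2[(i, assignment[i], j, assignment[j])]
        for i, j in combinations(range(len(assignment)), 2)  # every pair with i < j
    )
    return self_term + pair_term

def brute_force_minimum(n_positions, n_rotamers):
    """Enumerate every assignment; exponential in n_positions."""
    return min(product(range(n_rotamers), repeat=n_positions), key=total_energy)

best = brute_force_minimum(2, 2)
print(best, total_energy(best))
```

With real rotamer libraries the number of assignments grows exponentially with the number of mutable residues, so exhaustive enumeration like this stops being an option very quickly.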


slide-6
SLIDE 6

Search Algorithm

Exact algorithms

  • DEE/A* [Leach98]
  • Branch and Bound [Hong09]
  • Integer Linear Program [Kingsford05]
  • Tree Decomposition [Xu06]
  • …

Approximate algorithms

  • Monte Carlo [Voigt00]
  • Simulated Annealing
  • Belief Propagation [Yanover02]
  • …


slide-8
SLIDE 8

Dead End Elimination/A* Algorithm


slide-9
SLIDE 9

Heuristic Function in A* Search

[Figure: example A* search tree with one level per residue (Residue 1, Residue 2, Residue 3); each node is labeled with its f value, ranging from −37 to −17.]

Admissible heuristic:

$$f(x) = g(x) + h(x)$$

$$g(x) = \sum_{i_r \in D(x)} E_1(i_r) + \sum_{i_r \in D(x)} \sum_{j_s \in D(x),\, i<j} E_2(i_r, j_s)$$

$$h(x) = \sum_{i \in U(x)} \min_r \Big[ E_1(i_r) + \sum_{j_s \in D(x)} E_2(i_r, j_s) + \sum_{k \in U(x)} \min_u E_2(i_r, k_u) \Big]$$
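The following is a minimal sketch of how $g$, $h$ and $f$ could be computed. It reads $D(x)$ as the residues already assigned at node $x$ and $U(x)$ as the residues still unassigned, which the slide does not state explicitly. It also restricts the innermost sum to $k > i$ so that each unassigned pair is counted once; that restriction is an assumption of this sketch. The toy energies and all names (`E1`, `PAIR`, `best_completion`) are hypothetical.

```python
from itertools import product

N, R = 3, 2  # hypothetical toy size: 3 residues, 2 rotamers each
E1 = [[-1.0, -2.0], [-0.5, -1.5], [-1.0, -0.5]]  # E1[i][r]: self energy
PAIR = {  # PAIR[(i, j)][r][s]: pairwise energy for i < j
    (0, 1): [[-0.25, 0.5], [1.0, -0.75]],
    (0, 2): [[0.0, -0.5], [0.25, 0.0]],
    (1, 2): [[-1.0, 0.5], [0.0, -0.5]],
}

def E2(i, r, j, s):
    return PAIR[(i, j)][r][s]  # always called with i < j

def g(x):
    """Exact energy of the assigned part; x[i] is the rotamer at residue i."""
    d = len(x)
    self_term = sum(E1[i][x[i]] for i in range(d))
    pair_term = sum(E2(i, x[i], j, x[j]) for j in range(d) for i in range(j))
    return self_term + pair_term

def h(x):
    """Lower bound on the energy contributed by the unassigned residues."""
    d = len(x)
    bound = 0.0
    for i in range(d, N):  # i ranges over U(x)
        bound += min(
            E1[i][r]
            + sum(E2(j, x[j], i, r) for j in range(d))  # against D(x)
            + sum(min(E2(i, r, k, u) for u in range(R)) for k in range(i + 1, N))
            for r in range(R)
        )
    return bound

def f(x):
    return g(x) + h(x)

def best_completion(x):
    """True minimum energy over all full assignments that extend x (brute force)."""
    return min(g(x + tail) for tail in product(range(R), repeat=N - len(x)))
```

Admissibility here means that `f(x)` never exceeds `best_completion(x)`, which can be checked exhaustively on a problem this small.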


slide-11
SLIDE 11

A* Search

[Animation, slides 11 to 22: A* search stepping through the example tree, whose nodes carry f values from −37 to −17. Legend: In Queue, Visiting, Visited, DEE Pruned. The priority queue is shown first with x1, x2, x3, where f(x1) = −37 and f(x3) = −34, and later with x1, x2, where f(x1) = −37 and f(x2) = −30.]
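The walk-through on these slides can be approximated by the sequential sketch below: a single priority queue ordered by $f$, from which the most promising node is repeatedly popped and expanded. It is a generic A* loop over hypothetical toy energies, not the OSPREY implementation, and it omits the DEE pruning step shown in the legend. The lower bound inside `f` sums pairwise minima over later residues only, which is an assumption of this sketch.

```python
import heapq

N, R = 3, 2  # hypothetical toy size: 3 residues, 2 rotamers each
E1 = [[-1.0, -2.0], [-0.5, -1.5], [-1.0, -0.5]]
PAIR = {  # PAIR[(i, j)][r][s] for i < j
    (0, 1): [[-0.25, 0.5], [1.0, -0.75]],
    (0, 2): [[0.0, -0.5], [0.25, 0.0]],
    (1, 2): [[-1.0, 0.5], [0.0, -0.5]],
}

def f(x):
    """g(x) + h(x): exact energy of the assigned prefix plus an admissible bound."""
    d = len(x)
    val = sum(E1[i][x[i]] for i in range(d))
    val += sum(PAIR[(i, j)][x[i]][x[j]] for j in range(d) for i in range(j))
    for i in range(d, N):
        val += min(
            E1[i][r]
            + sum(PAIR[(j, i)][x[j]][r] for j in range(d))
            + sum(min(PAIR[(i, k)][r]) for k in range(i + 1, N))
            for r in range(R)
        )
    return val

def a_star():
    queue = [(f(()), ())]  # root node: nothing assigned yet
    expanded = 0
    while queue:
        fx, x = heapq.heappop(queue)  # node with the lowest f ("Visiting")
        if len(x) == N:
            return x, fx, expanded  # first full assignment popped is optimal
        expanded += 1
        for r in range(R):  # children enter the queue ("In Queue")
            child = x + (r,)
            heapq.heappush(queue, (f(child), child))
    return None

conformation, energy, expanded = a_star()
print(conformation, energy, expanded)
```

Because `f` never overestimates the best completion of a node, the first fully assigned node that reaches the head of the queue has the minimum energy.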

slide-23
SLIDE 23

Our Contribution

[Diagram: Dead-End Elimination followed by the A* search algorithm; our contribution is to the A* search stage.]

Our contribution

  • Massively parallel on a GPU
  • Faster heuristic
  • Memory bounded

20000x speedup!


slide-25
SLIDE 25

A Brief Introduction to GPUs

CPU (several cores)

  • ☺ Decent single-thread performance
  • ☹ Limited parallelism

GPU (thousands of cores)

  • ☺ More computational units
  • ☺ Energy efficient
  • ☹ Needs massive parallelism

slide-26
SLIDE 26

First Level of Parallelism

[Figure: a priority queue, a node $x$ taken from it, and the children $x_1, x_2, \dots, x_{k-1}, x_k$ of $x$ with their values $f(x_1), f(x_2), \dots, f(x_{k-1}), f(x_k)$.]

  • Problem: how to parallelize the A* algorithm?
  • Computation of the heuristic functions can be done in parallel.
  • Problem: k ≪ number of cores. How to further increase parallelism?
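A small sketch of this first level of parallelism: the $k$ children of one node are scored independently, so the scoring calls can be mapped over a worker pool. The scoring table `SCORE` and the function `f` below are hypothetical stand-ins for the real $g + h$ computation. A thread pool is used only to show that the calls are independent; CPython threads will not speed up CPU-bound work, and the talk uses GPU threads instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for f(x) = g(x) + h(x): SCORE[i][r] is the cost of
# choosing rotamer r at residue i. A real scorer would use the energy tables.
SCORE = [[-3.0, -1.0, -2.0], [-0.5, -4.0, -1.5]]

def f(node):
    return sum(SCORE[i][r] for i, r in enumerate(node))

def expand_parallel(x, n_rotamers, pool):
    """Generate the k children of x and score them concurrently."""
    children = [x + (r,) for r in range(n_rotamers)]
    values = list(pool.map(f, children))  # k independent evaluations, order preserved
    return list(zip(values, children))

with ThreadPoolExecutor(max_workers=4) as pool:
    scored = expand_parallel((0,), 3, pool)

print(scored)
```

With only `k` children per expansion (here 3), most workers stay idle, which mirrors the "k ≪ number of cores" problem raised on the slide.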

slide-29
SLIDE 29

One Possible Solution

[Figure: a single priority queue with nodes $x^{(1)}, x^{(2)}, x^{(3)}, \dots$; node $x^{(i)}$ has children $x^{(i)}_1, \dots, x^{(i)}_{k_i}$ with values $f(x^{(i)}_1), \dots, f(x^{(i)}_{k_i})$.]

  • Multiple nodes can be extracted from the priority queue at the beginning.
  • Problem: all visits to the priority queue are sequential.

slide-32
SLIDE 32

Full Parallelization of A* Search Algorithm

[Animation, slides 32 to 38: three priority queues (Priority Queue 1, 2 and 3) with nodes $x^{(1)}, x^{(2)}, x^{(3)}, \dots$; node $x^{(i)}$ has children $x^{(i)}_1, \dots, x^{(i)}_{k_i}$ with values $f(x^{(i)}_1), \dots, f(x^{(i)}_{k_i})$.]

Solution: multiple priority queues.
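One way to picture the multiple-queue idea is the sequential simulation below. In each round, the head of every non-empty queue is popped (on a GPU, one group of threads per queue would do this concurrently), the popped nodes are expanded, and the children are spread back over the queues. The round-robin distribution rule and the termination test are assumptions of this sketch and may differ from what gOSPREY does; the toy energies are hypothetical.

```python
import heapq

N, R = 3, 2  # hypothetical toy size: 3 residues, 2 rotamers each
E1 = [[-1.0, -2.0], [-0.5, -1.5], [-1.0, -0.5]]
PAIR = {  # PAIR[(i, j)][r][s] for i < j
    (0, 1): [[-0.25, 0.5], [1.0, -0.75]],
    (0, 2): [[0.0, -0.5], [0.25, 0.0]],
    (1, 2): [[-1.0, 0.5], [0.0, -0.5]],
}

def f(x):
    """g(x) + h(x) with an admissible lower bound for the unassigned residues."""
    d = len(x)
    val = sum(E1[i][x[i]] for i in range(d))
    val += sum(PAIR[(i, j)][x[i]][x[j]] for j in range(d) for i in range(j))
    for i in range(d, N):
        val += min(
            E1[i][r]
            + sum(PAIR[(j, i)][x[j]][r] for j in range(d))
            + sum(min(PAIR[(i, k)][r]) for k in range(i + 1, N))
            for r in range(R)
        )
    return val

def multi_queue_a_star(n_queues):
    queues = [[] for _ in range(n_queues)]
    heapq.heappush(queues[0], (f(()), ()))
    best = None  # best full assignment found so far, as (energy, assignment)
    pushed = 0
    while any(queues):
        # Stop once no queued node can still beat the incumbent.
        if best is not None and all(not q or q[0][0] >= best[0] for q in queues):
            break
        # One round: pop the head of every non-empty queue (parallel on a GPU).
        batch = [heapq.heappop(q) for q in queues if q]
        for fx, x in batch:
            if len(x) == N:
                if best is None or fx < best[0]:
                    best = (fx, x)
                continue
            for r in range(R):
                child = x + (r,)
                heapq.heappush(queues[pushed % n_queues], (f(child), child))
                pushed += 1  # round-robin placement of children
    return best

print(multi_queue_a_star(3))
```

Unlike sequential A*, the first full assignment popped is not necessarily optimal, because other queues may still hold better nodes. That is why the sketch keeps an incumbent and stops only when every queue head is no better than it.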

slide-39
SLIDE 39

Parallel Memory Bounded A* Algorithm

Memory exhaustion problem

  • Limited memory prevents us from solving larger problems!
  • Basic idea: throw away unpromising nodes
      • Do a global sort according to the nodes' heuristic values
      • Only keep nodes whose ranks exceed the threshold

Pros and cons

  • ☺ Continues to execute after running out of memory
  • ☹ Cannot fully guarantee the GMEC
  • 😑 Compared to traditional heuristic algorithms:
      • ☺ Guarantees the GMEC solution when memory is large enough
      • ☺ Possible to know whether we got the GMEC solution
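The pruning step could look roughly like the sketch below: all queued nodes are sorted globally by their $f$ value, only the best `node_limit` of them are kept, and the lowest $f$ among the discarded nodes is remembered. Interpreting "ranks exceed the threshold" as "among the most promising nodes" is an assumption. So is the check in `gmec_assured`, which treats the result as guaranteed when the returned energy is no worse than every discarded node's lower bound. The function names and the `(f, node)` tuple layout are hypothetical.

```python
def prune(queues, node_limit):
    """Keep the node_limit nodes with the lowest f across all queues."""
    nodes = sorted(node for q in queues for node in q)  # global sort by f
    kept, dropped = nodes[:node_limit], nodes[node_limit:]
    new_queues = [[] for _ in queues]
    for idx, node in enumerate(kept):
        # Round-robin refill; each refilled list is sorted, hence a valid min-heap.
        new_queues[idx % len(queues)].append(node)
    lowest_dropped_f = dropped[0][0] if dropped else float("inf")
    return new_queues, lowest_dropped_f

def gmec_assured(best_f, lowest_dropped_f):
    """True if no discarded node could have led to a lower energy."""
    return best_f <= lowest_dropped_f

# Usage with hypothetical (f, node) entries spread over two queues.
queues = [[(-5.0, "a"), (-1.0, "b")], [(-4.0, "d"), (-3.0, "c")]]
queues, lowest_dropped_f = prune(queues, node_limit=3)
print(queues, lowest_dropped_f)
```

Tracking the lowest discarded bound is what would let a run report both whether a solution was found and whether it is provably the GMEC.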


slide-41
SLIDE 41

Environment

Experiment

  • On the OSPREY platform from the Donald Lab (Duke University)
  • Native sequence recovery (core redesign)
  • iMinDEE to consider side-chain flexibility
  • To measure:
      • speed and memory consumption of GPU-based A*
      • correctness of parallel A* under bounded memory
  • Availability: http://github.com/zhou13/gOSPREY

Environment information

  • CPU: Intel Xeon™ E5-1620 3.6GHz
  • GPU: NVIDIA Tesla K20C (2496 logic cores)


slide-43
SLIDE 43

Performance of GPU-Based A*

  • GA*768, GA*4992: GPU-based A* with the faster heuristic and 768/4992 priority queues
  • A*1: CPU-based A* with the faster heuristic
  • OSPREY: traditional A* from OSPREY

PDB    Space     OSPREY     A*1     GA*768   GA*4992
2QCP   2·10^17   21551916   51091   3075     1146
1XMK   2·10^14   247585     2990    296      121
1X6I   7·10^13   96990      1406    138      73
1UCS   6·10^12   88135      1771    182      79
1CC8   3·10^14   77614      1078    99       53
2CS7   8·10^12   64187      1154    149      57
…      …         …          …       …        …

Time is measured in milliseconds.

20000x speedup
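The speedups implied by the table can be recomputed directly from the run times. The short sketch below does the arithmetic; the dictionary layout and the function name are only illustrative. The headline figure appears to correspond to the 2QCP row, where 21551916 ms / 1146 ms is roughly 18,800.

```python
# Run times in milliseconds, copied from the performance table on this slide.
times_ms = {
    "2QCP": {"OSPREY": 21551916, "A*1": 51091, "GA*768": 3075, "GA*4992": 1146},
    "1XMK": {"OSPREY": 247585, "A*1": 2990, "GA*768": 296, "GA*4992": 121},
    "1X6I": {"OSPREY": 96990, "A*1": 1406, "GA*768": 138, "GA*4992": 73},
    "1UCS": {"OSPREY": 88135, "A*1": 1771, "GA*768": 182, "GA*4992": 79},
    "1CC8": {"OSPREY": 77614, "A*1": 1078, "GA*768": 99, "GA*4992": 53},
    "2CS7": {"OSPREY": 64187, "A*1": 1154, "GA*768": 149, "GA*4992": 57},
}

def speedup(pdb, baseline="OSPREY", method="GA*4992"):
    """How many times faster `method` is than `baseline` on one design problem."""
    return times_ms[pdb][baseline] / times_ms[pdb][method]

for pdb in times_ms:
    total = speedup(pdb)  # full pipeline gain
    from_gpu = speedup(pdb, baseline="A*1")  # gain from parallelization alone
    print(f"{pdb}: {total:8.0f}x overall, {from_gpu:5.1f}x over CPU A*")
```

Separating the two ratios shows how much of the gain comes from the faster heuristic (OSPREY versus A*1) and how much from the GPU queues (A*1 versus GA*4992).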

slide-46
SLIDE 46

Speedup from Parallelization

[Figure: speedup from parallelization (y-axis, ticks 10 to 40) versus size of problem (x-axis, 10^4 to 10^7), with one curve for 768 priority queues and one for 4992 priority queues.]

slide-47
SLIDE 47

Memory Overhead

[Figure: memory overhead of GPU-based A* (y-axis, ticks 1 to 6) versus size of the problem (x-axis, 10^4 to 10^7), with one curve for 768 priority queues and one for 4992 priority queues.]

slide-48
SLIDE 48

Memory Bounded Parallel A* Search

Native sequence recovery experiment

PDB                     1OAI      1U2H      1ZZK      2CS7      2DSX      3D3B
# of Mutable Residues   16        18        14        15        15        15
Conformation Space      2·10^22   2·10^20   2·10^15   2·10^23   3·10^20   6·18^18 [sic]
A* Search Space         4·10^7    8·10^6    8·10^6    4·10^7    4·10^7    3·10^7

3 × 10^6 node limit
GMEC Gotten             YES       YES       YES       YES       YES       YES
GMEC Assured            YES       YES       YES       YES       YES       YES
Recovery Ratio          74%       75%       87%       46%       48%       53%

3 × 10^5 node limit
GMEC Gotten             YES       YES       YES       YES       YES       YES
GMEC Assured            NO        YES       YES       NO        NO        NO
Recovery Ratio          74%       75%       87%       46%       48%       54%

3 × 10^4 node limit
GMEC Gotten             NO        YES       YES       NO        YES       NO
GMEC Assured            NO        NO        NO        NO        NO        NO
Recovery Ratio          62%       75%       85%       48%       46%       48%


slide-52
SLIDE 52

Conclusion and Future Work

Our Contribution

  • Optimized computation of the heuristic function
  • Massively parallel GPU-based A*
  • Memory-bounded search algorithm

Results

  • A 20000x speedup in native sequence recovery
  • 1/10 of the memory while guaranteeing the GMEC

Future Work

  • Testing on more affordable GPUs
  • Porting to large CPU/GPU clusters
  • Parallelizing other parts of the framework

slide-53
SLIDE 53

Acknowledgement

Duke University

  • Mr. Kyle Roberts
  • Mr. Mark Hallen
  • Mr. Pablo Gainza

Tsinghua University

  • Mr. Pufan He
  • Mr. Qiwei Feng

Funding

  • National Basic Research Program of China
  • National Natural Science Foundation of China
  • National Institutes of Health (B.R.D.)

slide-54
SLIDE 54

Question and Answer

Thank you!

slide-55
SLIDE 55

Rotamer Library

Harder, Tim, et al. (2010) and Gainza, P., Roberts, K. E., & Donald, B. R. (2012).


slide-56
SLIDE 56

Price Table (2014)

Model                     Price   Release Year   SP GFLOPS
Intel Xeon E5-1620        $300    2012           488
NVIDIA Tesla K20C         $3000   2012           3520
NVIDIA GeForce GTX 680    $300    2012           3090
NVIDIA GeForce GTX 770    $400    2013           3213
NVIDIA GeForce GTX 780    $600    2013           3977

slide-57
SLIDE 57

Full A* Search

[Animation, backup slides 57 to 78: a step-by-step walk-through of the full A* search on the example tree, whose nodes carry f values from −37 to −17. Legend: In the Queue, Visiting, Visited, DEE Pruned. The priority queue is shown first with x1, x2, x3 and values f(x1), f(x2), f(x3), and later with x1, x2 and values f(x1), f(x2).]