Accelerated Photodynamic Cancer Therapy Planning with FullMonte



SLIDE 1

Accelerated Photodynamic Cancer Therapy Planning with FullMonte Jeffrey Cassidy, PhD Candidate University of Toronto (Canada)

#OpenPOWERSummit

SLIDE 2

Join the conversation at #OpenPOWERSummit

Photodynamic Therapy (PDT) for Cancer

▪ Photosensitizer (drug): multiple FDA-approved drugs; topical, IV, or oral
▪ Light exposure (fluence, J/cm²): surface illumination, implanted fibre, or intraoperative
▪ Tissue oxygen: normally present in tissue

Cells killed

SLIDE 3

Photodynamic Therapy (PDT) for Cancer

▪ Benefits
  ▪ Very low systemic toxicity
  ▪ Repeatable
  ▪ Highly targeted
  ▪ Simple, inexpensive delivery
  ▪ Single-shot, possibly outpatient basis
▪ Challenges
  ▪ Variable outcomes
  ▪ Compute-intensive to model

SLIDE 4

Photodynamic Therapy (PDT) for Cancer

▪ Current use
  ▪ Skin (basal-cell carcinoma, actinic keratosis)
  ▪ Superficial oral cavity
▪ Trials
  ▪ Prostate
  ▪ Bladder
  ▪ Head & neck
  ▪ Brain

[Figure: bladder model]

SLIDE 5

PDT Treatment planning

Workflow (from the slide's flowchart): Image planning volume -> Delineate organs -> Define dose parameters -> Propose plan -> Simulate -> Approve plan

▪ Many sims required
  ▪ Many free parameters
  ▪ Properties variable
▪ Minutes per sim on CPU

Image courtesy Robert Weersink, Princess Margaret Cancer Centre

SLIDE 6

PDT Treatment planning

SLIDE 7

FullMonte simulation kernel

Monte Carlo simulation traces photons through tetrahedral mesh

Flow stages: Launch, Draw step, Region lookup, Hop, Interface, Drop, Exit, Spin, Dead

“Digimouse” open-source mouse atlas (standard pre-clinical model)
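
The stage names on this slide (Launch, Draw step, Hop, Drop, Spin, Dead) match the usual hop-drop-spin structure of Monte Carlo photon transport. The sketch below illustrates that loop only. It is not FullMonte's code: it assumes a homogeneous infinite medium, so the mesh-dependent stages (Region lookup, Interface, Exit) are left out. The function names, the optical parameters `mu_a`, `mu_s`, `g`, and the roulette constants are all illustrative assumptions.

```python
import math
import random

def spin(d, g, rng):
    """Scatter unit direction d by a Henyey-Greenstein deflection (anisotropy g)."""
    xi = rng.random()
    if abs(g) < 1e-6:
        cos_t = 2.0 * xi - 1.0                       # isotropic limit
    else:
        frac = (1.0 - g * g) / (1.0 - g + 2.0 * g * xi)
        cos_t = (1.0 + g * g - frac * frac) / (2.0 * g)
    cos_t = max(-1.0, min(1.0, cos_t))               # guard against round-off
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = 2.0 * math.pi * rng.random()               # uniform azimuth
    cp, sp = math.cos(phi), math.sin(phi)
    ux, uy, uz = d
    if abs(uz) > 0.99999:                            # heading (almost) along z
        return (sin_t * cp, sin_t * sp, cos_t if uz >= 0 else -cos_t)
    t = math.sqrt(1.0 - uz * uz)
    return (sin_t * (ux * uz * cp - uy * sp) / t + ux * cos_t,
            sin_t * (uy * uz * cp + ux * sp) / t + uy * cos_t,
            -sin_t * cp * t + uz * cos_t)

def simulate_photon(mu_a, mu_s, g, rng, w_min=1e-4, roulette_m=10):
    """Trace one packet; return a list of (position, absorbed weight) deposits."""
    mu_t = mu_a + mu_s                               # total interaction coefficient
    pos, d, w = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1.0   # Launch
    deposits = []
    while True:
        s = -math.log(1.0 - rng.random()) / mu_t     # Draw step (exponential)
        pos = tuple(p + s * u for p, u in zip(pos, d))   # Hop
        dw = w * mu_a / mu_t                         # Drop: absorb a fraction
        deposits.append((pos, dw))
        w -= dw
        if w < w_min:                                # Dead? (Russian roulette)
            if rng.random() < 1.0 / roulette_m:
                w *= roulette_m                      # survivor carries more weight
            else:
                return deposits
        d = spin(d, g, rng)                          # Spin

deposits = simulate_photon(mu_a=0.1, mu_s=10.0, g=0.9, rng=random.Random(0))
print(len(deposits), sum(dw for _, dw in deposits))
```

In a mesh-based kernel each hop would additionally be clipped at tetrahedron faces and the deposit scored to the enclosing element. That bookkeeping is what the omitted stages would provide.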

SLIDE 8

FullMonte simulation kernel

Altera Stratix V FPGA

Flow stages <-> hardware modules

Queues at join points

1 step/clk @ 250 MHz

Flow stages: Launch, Draw step, Region lookup, Hop, Interface, Drop, Exit, Spin, Dead

SLIDE 9

Preliminary Results

Performance Metric       FPGA*, 1 instance (<25% area)    4 FPGA + server @ 4 inst/FPGA (projected)
Throughput / node        4x                               64x
Throughput / $capital    0.95x                            3.6x
Throughput / W           67x                              41x

CPU baseline: Intel Sandy Bridge (32nm) i7-2600K, 3.6 GHz, 4-core HT; gcc -O3, multithreaded, hand-tuned SSE4; price $1200 (excl. GPU & monitor)

FPGA: Altera Stratix V (28nm) at Fmax=280MHz; Quartus II PowerPlay power estimation, 4.5W single instance; price $5000 (Nallatech 385 list price)

*Prototype x86-hosted system, limited (48k) mesh size
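
The throughput-per-capital figures can be roughly reproduced from the prices quoted on this slide. The sketch below is a plausibility check rather than the author's calculation. It assumes the single-instance case is costed as one $5000 FPGA card, and the projected case as four cards plus one $1200 host. The slide does not state either assumption; they were chosen because they land close to the quoted 0.95x and 3.6x.

```python
CPU_PRICE = 1200.0        # i7-2600K system price from the slide
FPGA_CARD_PRICE = 5000.0  # Nallatech 385 list price from the slide

def throughput_per_dollar_ratio(speedup, system_price, baseline_price=CPU_PRICE):
    """Throughput per dollar relative to the CPU baseline (1.0 = parity)."""
    return speedup * baseline_price / system_price

# One FPGA instance at 4x CPU throughput, costed as a single card (assumption).
single = throughput_per_dollar_ratio(4.0, FPGA_CARD_PRICE)

# Projected 64x system, costed as four cards plus one host (assumption).
projected = throughput_per_dollar_ratio(64.0, 4 * FPGA_CARD_PRICE + CPU_PRICE)

print(f"single instance: {single:.2f}x, projected: {projected:.2f}x")
```

Under these assumptions the ratios come out at about 0.96x and 3.62x. The small gap to the slide's 0.95x would be consistent with a measured speedup slightly under 4x.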

SLIDE 10

OpenPOWER/CAPI Scale-up

CAPI platform:

▪ Enables large meshes (no more 64k limit)
  ▪ On-chip cache for most frequently accessed elements
  ▪ Host serves long tail via CAPI
▪ Maintains power & performance advantage
▪ Provides support code (host & FPGA side)
▪ Supports fast host-accelerator communication

SLIDE 11

OpenPOWER/CAPI Scale-up

[Plot: Digimouse (lung tumour). Miss Rate (up to 1) vs. Cache Size (elements, 1 to 64k) for Static, LRU, Hybrid, and Oracle policies. Annotations: FPGA memory capacity; misses served by CAPI.]
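
The "Oracle" curve in this plot is a perfect-knowledge lower bound. One standard way to compute such a bound for a recorded access trace is Belady's MIN policy: on a miss with a full cache, evict the entry whose next use lies farthest in the future. The sketch below assumes that definition and a hypothetical `trace` of element IDs. The slides do not say how the oracle curve was actually produced.

```python
def oracle_misses(trace, capacity):
    """Count misses under Belady's MIN policy (requires capacity >= 1)."""
    inf = float("inf")
    # next_use[i] = index of the next access to trace[i], or infinity if none.
    next_use = [inf] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], inf)
        last_seen[trace[i]] = i

    cache = {}            # key -> index of that key's next access
    misses = 0
    for i, key in enumerate(trace):
        if key not in cache:
            misses += 1
            if len(cache) >= capacity:
                # Evict the cached key that is needed farthest in the future.
                victim = max(cache, key=cache.get)
                del cache[victim]
        cache[key] = next_use[i]   # refresh on hit, insert on miss
    return misses

print(oracle_misses([1, 2, 3, 1, 2, 3], capacity=2))
```

Dividing the miss count of a realizable policy by this number gives a relative miss rate with the oracle at 1.0.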

SLIDE 12

OpenPOWER/CAPI Scale-up

[Plot: Digimouse (lung tumour). Miss Rate (relative to Oracle=1.0) vs. Cache Size (elements, 1 to 64k) for Static, LRU, and Hybrid policies.]

Simple hybrid cache

▪ L1 4-el. LRU
▪ L2 N-el. static
  ▪ Most frequent
  ▪ Banked
  ▪ Host-managed

32% misses vs. pure LRU, and simpler to implement

+60% vs. clairvoyant (perfect prediction)
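
The hybrid scheme on this slide pairs a tiny LRU level with a larger, fixed set of the most frequently used elements. It can be compared with a pure LRU cache in a few lines of simulation. The sketch below makes several assumptions the slide does not state: the static set is chosen by profiling the same trace, an access counts as a miss only when neither level holds the element, and the trace is synthetic with skewed popularity rather than a real Digimouse access trace.

```python
import random
from collections import Counter, OrderedDict

def lru_misses(trace, capacity):
    """Misses for a plain LRU cache holding `capacity` elements."""
    cache, misses = OrderedDict(), 0
    for key in trace:
        if key in cache:
            cache.move_to_end(key)            # mark as most recently used
        else:
            misses += 1
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)     # evict least recently used
    return misses

def most_frequent(trace, n):
    """Static set: the n most frequently accessed keys in the trace."""
    return {key for key, _ in Counter(trace).most_common(n)}

def hybrid_misses(trace, l1_size, static_keys):
    """Misses for a small LRU L1 backed by a fixed (static) L2 set."""
    l1, misses = OrderedDict(), 0
    for key in trace:
        if key in l1:
            l1.move_to_end(key)
            continue
        if key not in static_keys:
            misses += 1                       # would be fetched from the host
        l1[key] = True                        # fill L1 either way
        if len(l1) > l1_size:
            l1.popitem(last=False)
    return misses

if __name__ == "__main__":
    rng = random.Random(0)
    keys = list(range(1000))
    weights = [1.0 / (rank + 1) for rank in keys]   # skewed popularity
    trace = rng.choices(keys, weights=weights, k=20000)
    static_keys = most_frequent(trace, 64)
    print("LRU, 68 el.:    ", lru_misses(trace, 68) / len(trace))
    print("hybrid, 4 + 64: ", hybrid_misses(trace, 4, static_keys) / len(trace))
```

The static level needs no replacement logic on the accelerator, only a lookup. That fits the slide's note that the hybrid is simpler to implement than a large LRU.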

SLIDE 13

Bluespec & BlueLink

▪ Designed with Bluespec SystemVerilog (BSV)
  ▪ Atomic rules for complex concurrency
  ▪ High performance
  ▪ Strong typing, good IP library, fast sim
▪ Created open-source BSV library BlueLink
  ▪ Interface to IBM CAPI hardware & sim env.
  ▪ Simplified module interface
  ▪ IP for host <-> FPGA xfer to reg/MLAB/BRAM
  ▪ Examples

github.com/jeffreycassidy/bluelink (work in progress)

SLIDE 14

Summary

MC photon transport simulation on tetrahedral mesh

▪ Performance per Watt -> FPGA
  ▪ Fixed-point (18b) arithmetic
  ▪ Spatial dataflow pipeline
  ▪ Inexpensive, effective custom caching
▪ Tight host-FPGA coupling -> OpenPOWER CAPI
  ▪ Large meshes (>> FPGA on-chip mem.)
  ▪ Host mem. serves infrequent items
  ▪ Host-managed static cache set

>40x more performance/W vs CPU
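
To make the "fixed-point (18b) arithmetic" point concrete, the sketch below quantizes values in [-1, 1) to a signed 18-bit word and multiplies them with rounding and saturation. The slide gives only the word width. The split into 1 sign bit and 17 fractional bits, the rounding mode, and the helper names are assumptions for illustration, not FullMonte's actual number format.

```python
WORD_BITS = 18                       # word width from the slide
FRAC_BITS = 17                       # assumed: 1 sign bit + 17 fractional bits
SCALE = 1 << FRAC_BITS
Q_MIN = -(1 << (WORD_BITS - 1))      # -131072, represents -1.0
Q_MAX = (1 << (WORD_BITS - 1)) - 1   #  131071, just below +1.0

def saturate(q):
    """Clamp an integer to the signed 18-bit range."""
    return max(Q_MIN, min(Q_MAX, q))

def to_fixed(x):
    """Quantize a float to the nearest representable 18-bit value."""
    return saturate(round(x * SCALE))

def to_float(q):
    """Convert an 18-bit fixed-point integer back to a float."""
    return q / SCALE

def fixed_mul(a, b):
    """Multiply two fixed-point values; round the wide product back to 18 bits."""
    product = a * b                                  # 34 fractional bits
    rounded = (product + (1 << (FRAC_BITS - 1))) >> FRAC_BITS
    return saturate(rounded)

x, y = to_fixed(0.6), to_fixed(-0.8)
print(to_float(fixed_mul(x, y)))                     # close to -0.48
```

With 17 fractional bits the quantization step is 2^-17, so each stored value is within 2^-18 of the real number it represents, provided it does not saturate.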

SLIDE 15

Acknowledgements

PhD Supervisors
  • Prof. Vaughn Betz, Univ. Toronto ECE
  • Prof. Lothar Lilge, Princess Margaret Cancer Centre

Funding
  • IBM, Altera, CIHR, NSERC

In-Kind Support
  • IBM, SOSCIP, Altera, Bluespec

Discussion
  • Robert Weersink, PMCC
  • Henry Wong, U Toronto