Accelerating MI-Based B-Spline Registration Using CUDA Enabled GPUs

  1. Massachusetts General Hospital, Radiation Oncology. Accelerating MI-Based B-Spline Registration Using CUDA Enabled GPUs. James Shackleford (1), Nagarajan Kandasamy (2), Gregory C. Sharp (1). (1) Massachusetts General Hospital, Radiation Oncology; (2) Drexel University, Electrical and Computer Engineering.

  2. Slide 2 of 33. Introduction: What Is Deformable Registration? [Figure: fixed image and moving image]

  3. Slide 3 of 33. Introduction: What Is Deformable Registration? [Figure: fixed image and moving image]

  4. Slide 4 of 33. Introduction: What Is Deformable Registration? [Figure: fixed image, moving image, and deformation vector field]

  5. Slide 5 of 33. B-Spline Grid Parameterization Method. [Figure labels: parameter coefficients $P_X$, $P_Y$; parameter weights $\beta_X$, $\beta_Y$; regional influence]

  6. Slide 6 of 33. $v_X = (\beta_X \beta_Y)\, P_X$ and $v_Y = (\beta_X \beta_Y)\, P_Y$. [Figure labels: vector components $v_X$, $v_Y$; parameter coefficients $P_X$, $P_Y$; parameter weights $\beta_X$, $\beta_Y$; regional influence]

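  7. Slide 7 of 33. 16 contributions: $v_X = \sum_{j=1}^{4} \sum_{i=1}^{4} (\beta_{X,i}\, \beta_{Y,j})\, P_{X,i,j}$ and $v_Y = \sum_{j=1}^{4} \sum_{i=1}^{4} (\beta_{X,i}\, \beta_{Y,j})\, P_{Y,i,j}$. [Figure labels: parameter coefficients $P_X$, $P_Y$; parameter weights $\beta_X$, $\beta_Y$; regional influence]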
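The 16-term sum on slide 7 can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: it assumes the weights are the standard uniform cubic B-spline basis functions (the slides only show four weights per axis), and the names `bspline_basis` and `displacement` are hypothetical.

```python
def bspline_basis(u):
    """Four uniform cubic B-spline weights for a normalized offset u in [0, 1).

    Assumption: the slides' beta weights are these standard basis functions.
    """
    return [
        (1 - u) ** 3 / 6,
        (3 * u**3 - 6 * u**2 + 4) / 6,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6,
        u**3 / 6,
    ]


def displacement(coeff_x, coeff_y, u, v):
    """Evaluate (v_X, v_Y) at normalized tile position (u, v).

    coeff_x and coeff_y are 4x4 nested lists indexed [j][i] holding the
    coefficients P_X and P_Y of the 16 control points that influence the tile.
    """
    bx = bspline_basis(u)  # beta_{X,i}
    by = bspline_basis(v)  # beta_{Y,j}
    vx = vy = 0.0
    for j in range(4):
        for i in range(4):
            weight = bx[i] * by[j]
            vx += weight * coeff_x[j][i]
            vy += weight * coeff_y[j][i]
    return vx, vy
```

Because the four weights sum to 1 on each axis, a tile whose 16 coefficients all equal the same value produces exactly that displacement everywhere in the tile.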

  8. Slide 8 of 33. [Flowchart of the registration loop. Stages: decompress vector field ($v$); correspondence and cost ($F$, $M$, giving $C$); Δ cost w.r.t. vectors ($\partial C / \partial v$); Δ cost w.r.t. coefficients ($\partial C / \partial P$); quasi-Newtonian optimizer (new $P$)]

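  9. Slide 9 of 33. Mutual information cost: $C = H(F) + H(M) - H(F,M)$. In terms of histograms: $C = \frac{1}{N} \sum_{i=1}^{B_F} \sum_{j=1}^{B_M} h_j(i,j) \ln \frac{N \times h_j(i,j)}{h_M(j) \times h_F(i)}$. [Figure: joint histogram with axes fixed image value and moving image value; diagram relating $H(F)$, $H(M)$, $H(F \mid M)$, $H(M \mid F)$, and $H(F,M)$]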
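As a rough illustration of the histogram form of the cost on slide 9, the sketch below evaluates $C$ from a joint histogram and the two marginal histograms. The function name `mutual_information` and the nested-list layout `joint[i][j]` (fixed bin `i`, moving bin `j`) are assumptions made for this example; empty bins are skipped, following the usual convention that $0 \ln 0 = 0$.

```python
import math


def mutual_information(joint, h_fixed, h_moving, num_voxels):
    """Mutual information from histogram counts.

    joint[i][j] is h_j(i, j), h_fixed[i] is h_F(i), h_moving[j] is h_M(j),
    and num_voxels is N. The layout is an assumption for this sketch.
    """
    cost = 0.0
    for i, row in enumerate(joint):
        for j, h in enumerate(row):
            if h > 0:  # skip empty bins (0 * ln 0 is taken as 0)
                cost += h * math.log(num_voxels * h / (h_moving[j] * h_fixed[i]))
    return cost / num_voxels
```

For two images whose intensities are independent, every log term is zero and the cost vanishes; for two bins that always co-occur with equal frequency, the cost is $\ln 2$.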

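  10. Slide 10 of 33. Chain rule: $\frac{\partial C}{\partial P} = \frac{\partial C}{\partial v} \times \frac{\partial v}{\partial P}$ and $\frac{\partial C}{\partial v} = \frac{\partial C}{\partial h} \times \frac{\partial h}{\partial v}$. Per voxel: $\frac{\partial C}{\partial v} = \sum_{n=1}^{4} \left.\frac{\partial C}{\partial h}\right|_{x_n} \times \frac{\partial w_n}{\partial v}$, with $\left.\frac{\partial C}{\partial h}\right|_{x_n} = \ln \frac{N \times h_j(i_n, j_n)}{h_M(j_n) \times h_F(i_n)} - C$. [Figure: static image $F$ and moving image $M$; histogram of # of voxels vs. intensity; nearest neighbors 1 to 4 and partial volumes A to D]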
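The nearest-neighbor and partial-volume picture on slide 10 can be sketched as follows for the 2D case drawn there. This is a hedged illustration: it assumes the weight $w_n$ of each of the four neighbors is the usual bilinear partial-volume fraction, and the names `partial_volume_weights`, `accumulate`, and `moving_bins` are hypothetical.

```python
import math


def partial_volume_weights(x, y):
    """Return [((ix, iy), weight), ...] for the 4 pixels surrounding (x, y).

    Assumption: each neighbor's weight is the area of the partial volume
    diagonally opposite to it, i.e. the bilinear interpolation weight.
    """
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    return [
        ((x0, y0), (1 - fx) * (1 - fy)),
        ((x0 + 1, y0), fx * (1 - fy)),
        ((x0, y0 + 1), (1 - fx) * fy),
        ((x0 + 1, y0 + 1), fx * fy),
    ]


def accumulate(joint, h_moving, fixed_bin, moving_bins, x, y):
    """Spread one fixed-image voxel over the moving and joint histograms.

    moving_bins[iy][ix] is the histogram bin of each moving-image pixel;
    (x, y) is where the deformation vector lands in the moving image.
    """
    for (ix, iy), weight in partial_volume_weights(x, y):
        j = moving_bins[iy][ix]
        h_moving[j] += weight
        joint[fixed_bin][j] += weight
```

The four weights always sum to 1, so each fixed-image voxel contributes one full count to the histograms, split fractionally across bins.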

  11. Slide 11 of 33. Serial Implementation: Following a Single Thread

  12. Slide 12 of 33. Generate Histograms: compute vector for each voxel; get corresponding voxels in moving image (nearest neighbors 1 to 4, partial volumes A to D); compute partial volumes; use partial volumes for moving & joint histograms. [Figure: joint histogram with axes fixed image intensity and moving image intensity]

  13. Slide 13 of 33. Generate Histograms: compute vector for each voxel; get corresponding voxels in moving image (nearest neighbors 1 to 4, partial volumes A to D); compute partial volumes; use partial volumes for moving & joint histograms. Compute Score: simply cycle through the histograms, $C = \frac{1}{N} \sum_{i=1}^{B_F} \sum_{j=1}^{B_M} h_j(i,j) \ln \frac{N \times h_j(i,j)}{h_M(j) \times h_F(i)}$; the traditional serial CPU implementation is very fast (time required is negligible). [Figure: joint histogram with axes fixed image intensity and moving image intensity]

  14. Slide 14 of 33. Generate Histograms: compute vector for each voxel; get corresponding voxels in moving image (nearest neighbors 1 to 4, partial volumes A to D); compute partial volumes; use partial volumes for moving & joint histograms. Compute Score: simply cycle through the histograms, $C = \frac{1}{N} \sum_{i=1}^{B_F} \sum_{j=1}^{B_M} h_j(i,j) \ln \frac{N \times h_j(i,j)}{h_M(j) \times h_F(i)}$; the traditional serial CPU implementation is very fast (time required is negligible). Compute Gradient (change in cost as the vector changes): compute vector for each voxel; get corresponding voxels in moving image; get partial volume derivatives; $\frac{\partial C}{\partial v} = \sum_{n=1}^{4} \left.\frac{\partial C}{\partial h}\right|_{x_n} \times \frac{\partial w_n}{\partial v}$, with $\left.\frac{\partial C}{\partial h}\right|_{x_n} = \ln \frac{N \times h_j(i_n, j_n)}{h_M(j_n) \times h_F(i_n)} - C$. Next: $\frac{\partial C}{\partial P} = \frac{\partial C}{\partial v} \times \frac{\partial v}{\partial P}$. [Figure: joint histogram with axes fixed image intensity and moving image intensity]

  15. Slide 15 of 33. [Flowchart of the registration loop. Stages: decompress vector field ($v$); correspondence and cost ($F$, $M$, giving $C$); Δ cost w.r.t. vectors ($\partial C / \partial v$); Δ cost w.r.t. coefficients ($\partial C / \partial P$); quasi-Newtonian optimizer (new $P$)]

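  16. Slide 16 of 33. Change in Cost w.r.t. Coefficients. $v_X = \sum_{j=1}^{4} \sum_{i=1}^{4} (\beta_{X,i}\, \beta_{Y,j})\, P_{X,i,j}$; $\frac{\partial C}{\partial P} = \sum \frac{\partial C}{\partial v} \frac{\partial v}{\partial P} = \sum \frac{\partial C}{\partial v} \sum_{j=1}^{4} \sum_{i=1}^{4} \beta_{X,i}\, \beta_{Y,j}$. [Figure: grid cells numbered 1 to 16 with basis functions $\beta_X$ and $\beta_Y$ over the fixed image $F$]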

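  17. Slide 17 of 33. Change in Cost w.r.t. Coefficients. $v_X = \sum_{j=1}^{4} \sum_{i=1}^{4} (\beta_{X,i}\, \beta_{Y,j})\, P_{X,i,j}$; $\frac{\partial C}{\partial P} = \sum \frac{\partial C}{\partial v} \frac{\partial v}{\partial P} = \sum \frac{\partial C}{\partial v} \sum_{j=1}^{4} \sum_{i=1}^{4} \beta_{X,i}\, \beta_{Y,j}$. [Figure: grid cells numbered 1 to 16 with basis functions $\beta_X$ and $\beta_Y$ over the fixed image $F$]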
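One way to read the chain rule on slides 16 and 17 is that every pixel in a tile pushes its $\partial C / \partial v$ onto the 16 control points that influence that tile, weighted by $\beta_{X,i}\, \beta_{Y,j}$. The sketch below does this for a single 2D tile. It is an illustration under assumptions rather than the presenters' implementation: uniform cubic B-spline weights are assumed, and `bspline_basis` and `tile_coefficient_gradient` are hypothetical names.

```python
def bspline_basis(u):
    """Four uniform cubic B-spline weights for u in [0, 1) (assumed basis)."""
    return [
        (1 - u) ** 3 / 6,
        (3 * u**3 - 6 * u**2 + 4) / 6,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6,
        u**3 / 6,
    ]


def tile_coefficient_gradient(dc_dv, tile_size):
    """Accumulate one tile's contributions to dC/dP_X.

    dc_dv[y][x] is dC/dv_X at each pixel of a tile_size x tile_size tile.
    Returns a 4x4 nested list indexed [j][i], one entry per control point
    that influences the tile.
    """
    grad = [[0.0] * 4 for _ in range(4)]
    for y in range(tile_size):
        by = bspline_basis(y / tile_size)  # beta_{Y,j} for this row
        for x in range(tile_size):
            bx = bspline_basis(x / tile_size)  # beta_{X,i} for this column
            for j in range(4):
                for i in range(4):
                    grad[j][i] += bx[i] * by[j] * dc_dv[y][x]
    return grad
```

Since the 16 weights of each pixel sum to 1, the entries of the returned grid add up to the sum of `dc_dv` over the tile, which is a convenient sanity check. The full gradient for a control point would add such contributions from every tile it influences.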

  18. Slide 18 of 33. Parallelization: Leveraging GPUs, OpenMP, etc.

  19. Slide 19 of 33. What do we parallelize? [Flowchart of the registration loop annotated with ✓ and ✗ marks. Stages: decompress vector field ($v$); correspondence and cost ($F$, $M$, giving $C$); Δ cost w.r.t. vectors ($\partial C / \partial v$); Δ cost w.r.t. coefficients ($\partial C / \partial P$); quasi-Newtonian optimizer (new $P$)]

  20. Slide 20 of 33. [Flowchart: compute vector from coeff; compute histograms ($F$, $M$); cycle hist: cost (MI); compute change in cost w.r.t. vector ($\partial C / \partial v$)]

  21. Slide 21 of 33. Generate Histograms: compute vector for each voxel; get corresponding voxels in moving image (nearest neighbors 1 to 4, partial volumes A to D); compute partial volumes; use partial volumes for moving & joint histograms. Compute Score: simply cycle through the histograms, $C = \frac{1}{N} \sum_{i=1}^{B_F} \sum_{j=1}^{B_M} h_j(i,j) \ln \frac{N \times h_j(i,j)}{h_M(j) \times h_F(i)}$; the traditional serial CPU implementation is very fast (time required is negligible). Compute Gradient (change in cost as the vector changes): compute vector for each voxel; get corresponding voxels in moving image; get partial volume derivatives; $\frac{\partial C}{\partial v} = \sum_{n=1}^{4} \left.\frac{\partial C}{\partial h}\right|_{x_n} \times \frac{\partial w_n}{\partial v}$, with $\left.\frac{\partial C}{\partial h}\right|_{x_n} = \ln \frac{N \times h_j(i_n, j_n)}{h_M(j_n) \times h_F(i_n)} - C$. Next: $\frac{\partial C}{\partial P} = \frac{\partial C}{\partial v} \times \frac{\partial v}{\partial P}$. [Figure: joint histogram with axes fixed image intensity and moving image intensity]

  22. Slide 22 of 33. Change in Cost w.r.t. Coefficients. $v_X = \sum_{j=1}^{4} \sum_{i=1}^{4} (\beta_{X,i}\, \beta_{Y,j})\, P_{X,i,j}$; $\frac{\partial C}{\partial P} = \sum \frac{\partial C}{\partial v} \frac{\partial v}{\partial P} = \sum \frac{\partial C}{\partial v} \sum_{j=1}^{4} \sum_{i=1}^{4} \beta_{X,i}\, \beta_{Y,j}$. [Figure: grids of cells numbered 1 to 16 with basis functions $\beta_X$ and $\beta_Y$ over the fixed image $F$; CPU 1 produces a row of 16 contributions (1, 2, 3, 4, 5, ..., 16)]

  23. Slide 23 of 33. Change in Cost w.r.t. Coefficients. $v_X = \sum_{j=1}^{4} \sum_{i=1}^{4} (\beta_{X,i}\, \beta_{Y,j})\, P_{X,i,j}$; $\frac{\partial C}{\partial P} = \sum \frac{\partial C}{\partial v} \frac{\partial v}{\partial P} = \sum \frac{\partial C}{\partial v} \sum_{j=1}^{4} \sum_{i=1}^{4} \beta_{X,i}\, \beta_{Y,j}$. [Figure: grids of cells numbered 1 to 16 with basis functions $\beta_X$ and $\beta_Y$ over the fixed image $F$; CPU 1 and CPU 2 each produce rows of 16 contributions (1, 2, 3, 4, 5, ..., 16)]

  24. Slide 24 of 33. Constant Control Point Spacing: 15 x 15 x 15. 16x speedup (30 min → 1.8 min). References: J. Shackleford, N. Kandasamy, and G. Sharp, "Deformable Volumetric Registration using B-splines," GPU Computing Gems: Emerald Edition, Morgan Kaufmann Pub, 2011. J. Shackleford, N. Kandasamy, and G. Sharp, "On developing B-spline registration algorithms for multi-core processors," Physics in Medicine and Biology, vol. 55, p. 6329, 2010.

  25. Slide 25 of 33. Constant Volume Size: 256 x 256 x 256. References: J. Shackleford, N. Kandasamy, and G. Sharp, "Deformable Volumetric Registration using B-splines," GPU Computing Gems: Emerald Edition, Morgan Kaufmann Pub, 2011. J. Shackleford, N. Kandasamy, and G. Sharp, "On developing B-spline registration algorithms for multi-core processors," Physics in Medicine and Biology, vol. 55, p. 6329, 2010.

  26. Slide 26 of 33. Histogram Computation: Leveraging GPUs, OpenMP, etc. [Diagram comparing OpenMP and CUDA. CUDA: thread-level histograms (shared memory) + block-level histograms (global memory) → complete histogram (global memory)]

  27. Slide 27 of 33. Histogram Computation: Leveraging GPUs, OpenMP, etc. [Diagram comparing OpenMP and CUDA. CUDA block: thread-level histograms (shared memory) + block-level histograms (global memory) → complete histogram (global memory)]
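The pattern behind slides 26 and 27, private partial histograms that are merged into one complete histogram, can be sketched on the CPU as follows. This is only a structural analogy, not the CUDA or OpenMP code from the talk: Python threads stand in for CUDA blocks, plain lists stand in for shared and global memory, and `partial_histogram` and `parallel_histogram` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_histogram(bin_indices, num_bins):
    """Private histogram for one chunk of voxels (stand-in for one block)."""
    hist = [0] * num_bins
    for b in bin_indices:
        hist[b] += 1
    return hist


def parallel_histogram(bin_indices, num_bins, num_workers=4):
    """Build per-chunk histograms concurrently, then merge them.

    Each worker writes only to its own histogram, so no locking is needed
    until the final merge step.
    """
    chunk = max(1, -(-len(bin_indices) // num_workers))  # ceiling division
    chunks = [bin_indices[k:k + chunk] for k in range(0, len(bin_indices), chunk)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(lambda c: partial_histogram(c, num_bins), chunks))
    if not partials:
        return [0] * num_bins
    # Merge step: sum the partial histograms bin by bin.
    return [sum(column) for column in zip(*partials)]
```

Keeping the partial histograms private avoids write conflicts on shared bins, at the cost of one extra reduction pass; the threads here illustrate the structure only and are not expected to give a speedup in Python.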
