Similarity Metrics and Efficient Optimization for Simultaneous Registration
Supplementary Material - CVPR 2009

Christian Wachinger and Nassir Navab
Computer Aided Medical Procedures (CAMP), Technische Universität München, Germany
wachinge@in.tum.de, navab@in.tum.de

1 Multivariate Similarity Measures

Given $n$ images $I = \{I_1, \ldots, I_n\}$ and corresponding transformations $x_i$, we set up a maximum likelihood framework to describe the registration process mathematically. The joint density function of interest is $p(I_1, I_2, \ldots, I_n)$. Due to its high dimensionality, direct calculation is prohibitive, which makes an approximation necessary. We consider two recently introduced approximations: the first accumulates pair-wise estimates (APE); the second is based on the congealing framework.

1.1 APE

The pair-wise approximation is derived as follows. First, we derive a pair-wise approximation with respect to image $I_n$ using the product rule and conditional independence

$p(I_1, \ldots, I_n) \overset{\text{Prod.Rule}}{=} p(I_1, \ldots, I_{n-1} \mid I_n) \cdot p(I_n)$   (1)

$\overset{\text{Cond.Indep.}}{=} \prod_{i=1}^{n-1} p(I_i \mid I_n) \cdot p(I_n).$   (2)

Second, we take the $n$-th power of the joint density function and apply the above derivation once for each of the images, leading to

$p(I_1, \ldots, I_n)^n \overset{(2)}{=} \left[ p(I_1) \cdot \prod_{i=2}^{n} p(I_i \mid I_1) \right] \cdots \left[ p(I_n) \cdot \prod_{i=1}^{n-1} p(I_i \mid I_n) \right]$   (3)

$= \prod_{i=1}^{n} p(I_i) \cdot \prod_{i=1}^{n} \prod_{j \neq i} p(I_j \mid I_i).$   (4)

Third, the logarithm is applied

$\log p(I_1, \ldots, I_n)^n = n \cdot \log p(I_1, \ldots, I_n)$   (5)

$= \sum_{i=1}^{n} \log p(I_i) + \sum_{i=1}^{n} \sum_{j \neq i} \log p(I_j \mid I_i).$   (6)

And after transforming the equation

$\log p(I_1, \ldots, I_n) = \frac{1}{n} \sum_{i=1}^{n} \log p(I_i) + \frac{1}{n} \sum_{i=1}^{n} \sum_{j \neq i} \log p(I_j \mid I_i)$   (7)

$\approx \sum_{i=1}^{n} \sum_{j \neq i} \log p(I_j \mid I_i).$   (8)

Following the works of Viola [1] and Roche et al. [2], it is possible to derive SSD, NCC, CR, and MI from $\log p(I_j \mid I_i)$.
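To make Eq. (8) concrete, the following sketch accumulates the pair-wise terms over an image stack, substituting a negative SSD for each $\log p(I_j \mid I_i)$, which corresponds to the Gaussian-noise derivation of SSD (up to constants). The function name and toy data are illustrative, not from the paper.

```python
import numpy as np

def ape_objective(images):
    """Accumulated pair-wise estimates (APE), Eq. (8):
    sum_i sum_{j != i} log p(I_j | I_i).
    Each pair-wise log-likelihood is replaced by a negative SSD,
    which is log p(I_j | I_i) under a Gaussian noise model with
    the identity as intensity mapping, up to additive constants."""
    n = len(images)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if j != i:
                total += -np.sum((images[j] - images[i]) ** 2)
    return total

# toy stack of three images: the objective is maximal when the images agree
rng = np.random.default_rng(0)
base = rng.random((8, 8))
aligned = [base, base.copy(), base.copy()]
shifted = [base, base + 0.5, base - 0.5]
print(ape_objective(aligned) > ape_objective(shifted))  # True
```

In a registration loop, the transformations $x_i$ would be applied to the images before each evaluation of this objective.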


1.2 Congealing

In the congealing framework [3], the coordinate samples are assumed to be independently but not identically distributed

$p(I_1, \ldots, I_n) = \prod_{s_k \in \Omega} p_k(I_1(s_k), \ldots, I_n(s_k))$   (9)

and, further assuming i.i.d. images $I$,

$p(I_1, \ldots, I_n) = \prod_{s_k \in \Omega} \prod_{i=1}^{n} p_k(I_i(s_k)).$   (10)

Applying the logarithm, we get $\log p(I_1, \ldots, I_n) \approx -\sum_{s_k \in \Omega} H(I(s_k))$, with $H$ the sample entropy. In our extension, we do not consider the images to be independent, but assume each image $I_i$ to depend on a neighborhood of images $N_i$. This leads to

$p(I_1, \ldots, I_n) = \prod_{s_k \in \Omega} \prod_{i=1}^{n} p_k(I_i(s_k) \mid I_{N_i}(s_k)).$   (11)

The size of the neighborhood depends on the structure of the image stack. If there is no further information about the images, considering a total neighborhood seems reasonable. If there is, however, a certain ordering in the stack (camera parameters, ...), a smaller neighborhood may lead to better estimates.

1.2.1 Voxel-wise SSD

Since the approach taken in the congealing framework focuses on specific pixel or voxel locations, it is also referred to as voxel-wise estimation [4]. In [5], a similarity measure called voxel-wise SSD was proposed, which performs a voxel-wise estimation and combines it with the assumptions of SSD. The extension introduced above allows us to derive this measure. We assume, as for SSD, Gaussian distributed intensity values and the identity as intensity mapping. Further, we incorporate the neighborhood information by estimating the mean $\mu_k$ for each voxel location $s_k$ with

$\mu_k = \frac{1}{n} \sum_{l=1}^{n} I_l(s_k).$   (12)

With respect to the neighborhood $N_i$, the calculation of the mean should not include the image $I_i$ itself. However, this would lead to a much higher computational cost, because a different mean would have to be calculated for each image and each voxel location. We therefore proceed with the approximation, leading to

$p(I_1, \ldots, I_n) = \prod_{s_k \in \Omega} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( \frac{-(I_i(s_k) - \mu_k)^2}{2\sigma^2} \right)$   (13)

with Gaussian variance $\sigma^2$. This leads to the formula for voxel-wise SSD

$\log p(I_1, \ldots, I_n) \approx -\sum_{s_k \in \Omega} \sum_{i=1}^{n} (I_i(s_k) - \mu_k)^2.$   (14)

Looking at the formula, we see that voxel-wise SSD calculates the variance at each location and subsequently accumulates the values [6]. The variance is one of the measures expressing the statistical dispersion of a random variable. In contrast to entropy, which measures the structuredness of a variable, it can only deal with mono-modal matchings.


1.3 Connection between APE and congealing

We show the connection between the two approximations by starting with the extension of congealing, Equation (11), and deriving the formula of APE from it:

$p(I_1, \ldots, I_n) = \prod_{s_k \in \Omega} \prod_{i=1}^{n} p_k(I_i(s_k) \mid I_{N_i}(s_k))$   (15)

$\overset{\text{Bayes}}{=} \prod_{s_k \in \Omega} \prod_{i=1}^{n} p_k(I_{N_i}(s_k) \mid I_i(s_k)) \, \frac{p_k(I_i(s_k))}{p_k(I_{N_i}(s_k))}$   (16)

$\overset{\text{Cond.Indep.}}{=} \prod_{s_k \in \Omega} \prod_{i=1}^{n} \left[ \prod_{j \in N_i} p_k(I_j(s_k) \mid I_i(s_k)) \right] \frac{p_k(I_i(s_k))}{p_k(I_{N_i}(s_k))}$   (17)

$\overset{\text{Indep.}}{=} \prod_{s_k \in \Omega} \prod_{i=1}^{n} \left[ \prod_{j \in N_i} p_k(I_j(s_k) \mid I_i(s_k)) \right] \frac{p_k(I_i(s_k))}{\prod_{j \in N_i} p_k(I_j(s_k))}$   (18)

applying the logarithm

$\log p(I_1, \ldots, I_n) = \sum_{s_k \in \Omega} \sum_{i=1}^{n} \sum_{j \in N_i} \left[ \log p_k(I_j(s_k) \mid I_i(s_k)) - \log p_k(I_j(s_k)) \right] + \sum_{s_k \in \Omega} \sum_{i=1}^{n} \log p_k(I_i(s_k))$   (19)

and assuming further a total neighborhood

$\log p(I_1, \ldots, I_n) = \sum_{s_k \in \Omega} \sum_{i=1}^{n} \sum_{j \neq i} \left[ \log p_k(I_j(s_k) \mid I_i(s_k)) - \log p_k(I_j(s_k)) \right] + \sum_{s_k \in \Omega} \sum_{i=1}^{n} \log p_k(I_i(s_k)).$   (20)

An assumption that differs, by design, between the pair-wise and the voxel-wise (congealing) approach is that the voxel-wise coordinate samples are not identically distributed. To relate the two approaches, we set the distribution of the coordinate samples identical

$\log p(I_1, \ldots, I_n) = \sum_{i=1}^{n} \sum_{j \neq i} \left( \log p(I_j \mid I_i) - \log p(I_j) \right) + \sum_{i=1}^{n} \log p(I_i)$   (21)

$= \sum_{i=1}^{n} \sum_{j \neq i} \log p(I_j \mid I_i) + \sum_{i=1}^{n} \log p(I_i) - \sum_{i=1}^{n} \sum_{j \neq i} \log p(I_j)$   (22)

$= \sum_{i=1}^{n} \sum_{j \neq i} \log p(I_j \mid I_i) + \sum_{i=1}^{n} \log p(I_i) - (n-1) \sum_{j=1}^{n} \log p(I_j).$   (23)

For comparison, the derived formula for the pair-wise approach is (cf. Equation (7))

$\log p_{pw}(I_1, \ldots, I_n) = \sum_{i=1}^{n} \sum_{j \neq i} \log p(I_j \mid I_i) + \sum_{i=1}^{n} \log p(I_i).$   (24)

So the pair-wise and the congealing approximation are, under the assumptions of a total neighborhood, conditionally independent images, and an identical distribution of coordinate samples, equal up to the term $-(n-1)\sum_{j=1}^{n} \log p(I_j)$. This
term can be neglected because it does not change during the optimization process.
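The step from Eq. (22) to Eq. (23), where the double sum over $\log p(I_j)$ collapses to a constant multiple of the single sum, can be checked numerically; the values below are arbitrary stand-ins for the marginal log-densities.

```python
import numpy as np

# Verify the step from Eq. (22) to Eq. (23): summing log p(I_j) over all
# pairs (i, j) with j != i counts each j exactly (n - 1) times, giving
# (n - 1) * sum_j log p(I_j), a term constant w.r.t. the transformations.
rng = np.random.default_rng(2)
n = 4
log_p = rng.normal(size=n)  # stand-ins for log p(I_j)

double_sum = sum(log_p[j] for i in range(n) for j in range(n) if j != i)
collapsed = (n - 1) * log_p.sum()
print(np.isclose(double_sum, collapsed))  # True
```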


2 Gradients of Similarity Measures

In the following, we state the gradients of the similarity measures - mutual information, correlation ratio, and correlation coefficient - for which there was no space in the main article. This completes the article and gives the reader all the information necessary for implementing an efficient gradient-based optimization of the multivariate cost function. We want to state, however, that there is no contribution in the calculation of these gradients, since they are standard and can, e.g., be found in the work of Hermosillo et al. [7]. The gradient shown in the following was denoted in the article by

$\left. \frac{\partial SM(I_i, I)}{\partial I} \right|_{I = I_j^{\downarrow}} = \nabla SM(I_i, I_j^{\downarrow})$   (25)

with $I_i$ the fixed and $I_j^{\downarrow}$ the moving image. We define the following auxiliary variables (mean, variance), with $i_1$ an intensity in $I_i$ and $i_2$ an intensity in $I_j^{\downarrow}$:

$\mu_1 = \int i_1 \, p(i_1) \, di_1$   (26)

$\mu_2 = \int i_2 \, p(i_2) \, di_2$   (27)

$\mu_{2|1} = \int i_2 \, \frac{p(i_1, i_2)}{p(i_1)} \, di_2$   (28)

$v_1 = \int i_1^2 \, p(i_1) \, di_1 - \mu_1^2$   (29)

$v_2 = \int i_2^2 \, p(i_2) \, di_2 - \mu_2^2$   (30)

$v_{1,2} = \int i_1 i_2 \, p(i_1, i_2) \, d(i_1, i_2) - \mu_1 \cdot \mu_2$   (31)

with $p(i_1)$ the probability of intensity $i_1$ in $I_i$ and $p(i_1, i_2)$ the joint probability.
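For a discrete joint histogram, the integrals of Eqs. (26)-(31) become sums over bins. The sketch below computes the auxiliary variables this way; it is an illustrative implementation with assumed names, not the authors' code, and it assumes all marginal bins of $p(i_1)$ are non-empty for the conditional mean.

```python
import numpy as np

def joint_moments(joint):
    """Auxiliary variables of Eqs. (26)-(31) for a discrete joint
    distribution p(i1, i2), given as a 2-D array summing to one.
    Intensities are taken to be the bin indices."""
    i1 = np.arange(joint.shape[0], dtype=float)
    i2 = np.arange(joint.shape[1], dtype=float)
    p1 = joint.sum(axis=1)                      # marginal p(i1)
    p2 = joint.sum(axis=0)                      # marginal p(i2)
    mu1 = np.sum(i1 * p1)                       # Eq. (26)
    mu2 = np.sum(i2 * p2)                       # Eq. (27)
    mu2_1 = (joint @ i2) / p1                   # Eq. (28), one value per i1
    v1 = np.sum(i1 ** 2 * p1) - mu1 ** 2        # Eq. (29)
    v2 = np.sum(i2 ** 2 * p2) - mu2 ** 2        # Eq. (30)
    v12 = np.sum(np.outer(i1, i2) * joint) - mu1 * mu2  # Eq. (31)
    return mu1, mu2, mu2_1, v1, v2, v12

# independent uniform marginals: the covariance v12 vanishes and the
# conditional mean equals the unconditional one
joint = np.full((4, 4), 1.0 / 16.0)
mu1, mu2, mu2_1, v1, v2, v12 = joint_moments(joint)
print(np.isclose(v12, 0.0))  # True
```

In practice, the joint histogram would be estimated from the intensity pairs of $I_i$ and $I_j^{\downarrow}$, typically with Parzen windowing.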

2.1 Mutual Information

The formula for mutual information is

$MI(I_i, I_j^{\downarrow}) = H(I_i) + H(I_j^{\downarrow}) - H(I_i, I_j^{\downarrow})$   (32)

$= \int_{\mathbb{R}^2} p(I_i, I_j^{\downarrow}) \log \frac{p(I_i, I_j^{\downarrow})}{p(I_i) \cdot p(I_j^{\downarrow})}$   (33)

with $H$ the entropy. The gradient is

$\nabla MI(I_i, I_j^{\downarrow}) = G_\Psi * \frac{1}{|\Omega|} \left[ \frac{\frac{\partial}{\partial I_j} p(I_i, I_j^{\downarrow})}{p(I_i, I_j^{\downarrow})} - \frac{\frac{\partial}{\partial I_j} p(I_j^{\downarrow})}{p(I_j^{\downarrow})} \right]$   (34)

with the Gaussian $G_\Psi$ and $|\Omega|$ the size of the image grid.
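The equality of the two forms of MI in Eqs. (32) and (33) can be verified on a discrete joint distribution. The sketch below is illustrative, with assumed function names.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (zero bins ignored)."""
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def mutual_information(joint):
    """MI of a discrete joint distribution, in the two equivalent
    forms of Eqs. (32) and (33)."""
    p1 = joint.sum(axis=1)
    p2 = joint.sum(axis=0)
    mi_entropy = entropy(p1) + entropy(p2) - entropy(joint.ravel())  # Eq. (32)
    mask = joint > 0
    mi_kl = np.sum(joint[mask] *
                   np.log(joint[mask] / np.outer(p1, p2)[mask]))     # Eq. (33)
    return mi_entropy, mi_kl

# perfectly dependent intensities: MI equals the marginal entropy log(4)
joint = np.diag(np.full(4, 0.25))
a, b = mutual_information(joint)
print(np.isclose(a, b), np.isclose(a, np.log(4)))  # True True
```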

2.2 Correlation Ratio

The formula for the correlation ratio is

$CR(I_i, I_j^{\downarrow}) = 1 - \frac{E(\mathrm{Var}(I_j^{\downarrow} \mid I_i))}{\mathrm{Var}(I_j^{\downarrow})}.$   (35)

The gradient is

$\nabla CR(I_i, I_j^{\downarrow}) = G_\Psi * \frac{\mu_2 - \mu_{2|1} + CR(I_i, I_j^{\downarrow}) \cdot (i_2 - \mu_2)}{\frac{1}{2} \cdot v_2 \cdot |\Omega|}.$   (36)
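Equation (35) can be evaluated directly when the fixed image is treated as a discrete label map: the conditional variance is computed per intensity class and averaged by class frequency. A minimal sketch with illustrative names:

```python
import numpy as np

def correlation_ratio(fixed, moving):
    """Correlation ratio of Eq. (35):
    1 - E(Var(moving | fixed)) / Var(moving).
    The fixed image is treated as a discrete label map; the expected
    conditional variance is the frequency-weighted per-class variance."""
    fixed = np.asarray(fixed).ravel()
    moving = np.asarray(moving).ravel().astype(float)
    total_var = moving.var()
    expected_cond_var = 0.0
    for value in np.unique(fixed):
        members = moving[fixed == value]
        expected_cond_var += (members.size / moving.size) * members.var()
    return 1.0 - expected_cond_var / total_var

# a deterministic intensity mapping gives CR = 1
fixed = np.array([0, 0, 1, 1, 2, 2])
moving = np.array([5, 5, 7, 7, 9, 9])  # moving is a function of fixed
print(np.isclose(correlation_ratio(fixed, moving), 1.0))  # True
```

This reflects the asymmetry of CR: it measures how well $I_j^{\downarrow}$ is explained as a function of $I_i$, not the reverse.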


2.3 Correlation Coefficient

The formula for the correlation coefficient is

$CC(x) = \frac{(I_i - \mu_1)(I_j^{\downarrow} - \mu_2)}{v_1 \cdot v_2}$   (37)

and its gradient

$\nabla CC(I_i, I_j^{\downarrow}) = -\frac{2}{|\Omega|} \left[ \frac{v_{1,2}}{v_2} \left( \frac{i_1 - \mu_1}{v_1} \right) + CC(I_i, I_j^{\downarrow}) \left( \frac{i_2 - \mu_2}{v_2} \right) \right].$   (38)
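A minimal sketch of a correlation coefficient built from the auxiliary variables of Eqs. (26)-(31). The squared normalization $v_{1,2}^2 / (v_1 v_2)$ used here is an assumption (consistent with the factor $2$ appearing in the gradient of Eq. (38)); the paper's exact normalization may differ.

```python
import numpy as np

def correlation_coefficient(img1, img2):
    """Squared correlation coefficient from the auxiliary variables:
    v12^2 / (v1 * v2). NOTE: the squared form is an assumption here;
    it is invariant to the sign of the linear intensity relation."""
    a = np.asarray(img1, dtype=float).ravel()
    b = np.asarray(img2, dtype=float).ravel()
    v1 = a.var()                                # Eq. (29)
    v2 = b.var()                                # Eq. (30)
    v12 = np.mean(a * b) - a.mean() * b.mean()  # Eq. (31)
    return v12 ** 2 / (v1 * v2)

# an affine intensity relation yields a perfect (squared) correlation of 1
x = np.arange(10.0)
print(np.isclose(correlation_coefficient(x, 3 * x + 2), 1.0))  # True
```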

References

[1] Viola, P.A.: Alignment by Maximization of Mutual Information. Ph.D. thesis, Massachusetts Institute of Technology (1995)

[2] Roche, A., Malandain, G., Ayache, N.: Unifying maximum likelihood approaches in medical image registration. International Journal of Imaging Systems and Technology 11(1) (2000) 71-80

[3] Learned-Miller, E.G.: Data driven image models through continuous joint alignment. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(2) (2006) 236-250

[4] Zöllei, L., Learned-Miller, E., Grimson, E., Wells, W.: Efficient Population Registration of 3D Data. In: Computer Vision for Biomedical Image Applications, ICCV (2005)

[5] Wachinger, C., Wein, W., Navab, N.: Three-dimensional ultrasound mosaicing. In: MICCAI, Brisbane, Australia (2007)

[6] Wachinger, C.: Three-dimensional ultrasound mosaicing. Master's thesis, Technische Universität München (2007)

[7] Hermosillo, G., Chefd'Hotel, C., Faugeras, O.: Variational Methods for Multimodal Image Matching. International Journal of Computer Vision 50(3) (2002) 329-343