SLIDE 6 2.3 Monge’s Formulation
Monge problem. Monge problem (1) is extended to the setting of two arbitrary probability measures $(\alpha, \beta)$ on two spaces $(X, Y)$ as finding a map $T : X \to Y$ that minimizes
$$\inf_{T} \left\{ \int_{X} c(x, T(x)) \, d\alpha(x) \;;\; T_\sharp \alpha = \beta \right\}. \qquad (6)$$
The constraint $T_\sharp \alpha = \beta$ means that $T$ pushes forward the mass of $\alpha$ to $\beta$, and makes use of the push-forward operator.
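For discrete measures, the push-forward constraint can be checked by hand. The following is a minimal sketch, assuming a measure is stored as a list of atoms and a list of weights; the helper name `push_forward` and the example map are illustrative and not part of the notes.

```python
from collections import defaultdict


def push_forward(atoms, weights, T):
    """Return T#alpha for alpha = sum_i weights[i] * delta_{atoms[i]}.

    The result is a dict mapping each image point T(x) to its total mass.
    """
    image = defaultdict(float)
    for x, a in zip(atoms, weights):
        image[T(x)] += a  # the mass sitting at x is moved to T(x)
    return dict(image)


# alpha: uniform measure on four points of the real line.
alpha_atoms = [0.0, 1.0, 2.0, 3.0]
alpha_weights = [0.25, 0.25, 0.25, 0.25]

# A map may merge atoms: here 0 and 1 go to 0, while 2 and 3 go to 1.
beta = push_forward(alpha_atoms, alpha_weights, lambda x: x // 2)

# A map can never split an atom: the image of a Dirac is a single Dirac.
dirac_image = push_forward([0.0], [1.0], lambda x: 5.0)
```

Here `beta` carries mass 0.5 at each of the two points 0.0 and 1.0, while `dirac_image` has a single atom, whatever map is chosen.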
For empirical measures with the same number $n = m$ of points, one retrieves the optimal matching problem. Indeed, this corresponds to the setting of empirical measures $\alpha = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$ and $\beta = \frac{1}{n} \sum_{i=1}^{n} \delta_{y_i}$. In this case, $T_\sharp \alpha = \beta$ necessarily implies that $T$ is one-to-one, $T : x_i \mapsto y_{\sigma(i)}$ for some permutation $\sigma$, so that
$$\int_{X} c(x, T(x)) \, d\alpha(x) = \frac{1}{n} \sum_{i=1}^{n} c(x_i, y_{\sigma(i)}).$$
In general, an optimal map $T$ solving (6) might fail to exist. In fact, the constraint set $\{T \;;\; T_\sharp \alpha = \beta\}$ might be empty, which is the case for instance if $\alpha = \delta_x$ and $\beta$ is not a single Dirac. Even if the constraint set is not empty, the infimum might not be reached, the most celebrated example being the case of $\alpha$ being distributed uniformly on a single segment and $\beta$ being distributed on two segments on the two sides.
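The reduction to optimal matching can be illustrated on a tiny example. This is a sketch only, assuming uniform empirical measures with distinct points and a brute-force search over all permutations (practical only for very small $n$); the function name `monge_matching` and the sample points are hypothetical.

```python
from itertools import permutations


def monge_matching(xs, ys, cost):
    """Brute-force the permutation sigma minimizing (1/n) sum_i cost(x_i, y_sigma(i)).

    Assumes uniform weights 1/n and distinct points, so that admissible
    Monge maps are exactly the bijections x_i -> y_sigma(i).
    """
    n = len(xs)
    if len(ys) != n:
        raise ValueError("both point clouds must have the same size")
    best_sigma, best_value = None, float("inf")
    for sigma in permutations(range(n)):
        value = sum(cost(xs[i], ys[sigma[i]]) for i in range(n)) / n
        if value < best_value:
            best_sigma, best_value = sigma, value
    return best_sigma, best_value


xs = [0.0, 1.0, 3.0]
ys = [2.5, 0.5, 1.5]
# Squared-distance cost on the real line.
sigma, value = monge_matching(xs, ys, lambda x, y: (x - y) ** 2)
```

On this example the search returns `sigma == (1, 2, 0)`, which matches the points in increasing order, with cost 0.25. The factorial number of permutations is the reason dedicated assignment solvers are used in practice.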
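Monge distance. In the special case $c(x, y) = d(x, y)^p$ where $d$ is a distance, we denote
$$\tilde{W}_p^p(\alpha, \beta) \overset{\text{def.}}{=} \inf_{T} \left\{ E_\alpha(T) \overset{\text{def.}}{=} \int_{X} d(x, T(x))^p \, d\alpha(x) \;;\; T_\sharp \alpha = \beta \right\}. \qquad (7)$$
If the constraint set is empty, then we set $\tilde{W}_p^p(\alpha, \beta) = +\infty$. The following proposition shows that this quantity defines a distance.

Proposition 1. $\tilde{W}_p$ is a distance.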
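Proof. If $\tilde{W}_p^p(\alpha, \beta) = 0$ then necessarily the optimal map is $\mathrm{Id}$ on the support of $\alpha$ and $\beta = \alpha$. Let us prove that
$$\tilde{W}_p(\alpha, \beta) \leq \tilde{W}_p(\alpha, \gamma) + \tilde{W}_p(\gamma, \beta).$$
If $\tilde{W}_p(\alpha, \beta) = +\infty$, then either $\tilde{W}_p(\alpha, \gamma) = +\infty$ or $\tilde{W}_p(\gamma, \beta) = +\infty$, because otherwise we consider two maps $(S, T)$ with finite costs such that $S_\sharp \alpha = \gamma$ and $T_\sharp \gamma = \beta$, and then $(T \circ S)_\sharp \alpha = \beta$ so that $\tilde{W}_p^p(\alpha, \beta) \leq E_\alpha(T \circ S) < +\infty$. We can therefore assume $\tilde{W}_p(\alpha, \beta) < +\infty$, and restrict our attention to the case where $\tilde{W}_p(\alpha, \gamma) < +\infty$ and $\tilde{W}_p(\gamma, \beta) < +\infty$, because otherwise the inequality is trivial. For any $\varepsilon > 0$,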
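we consider $\varepsilon$-minimizers $S$ and $T$, with $S_\sharp \alpha = \gamma$ and $T_\sharp \gamma = \beta$, such that
$$E_\alpha(S)^{\frac{1}{p}} \leq \tilde{W}_p(\alpha, \gamma) + \varepsilon \quad \text{and} \quad E_\gamma(T)^{\frac{1}{p}} \leq \tilde{W}_p(\gamma, \beta) + \varepsilon.$$
Now we have that $(T \circ S)_\sharp \alpha = \beta$, so that one has, using sub-optimality of this map and the triangular inequality,
$$\tilde{W}_p(\alpha, \beta) \leq \left( \int_{X} d(x, T(S(x)))^p \, d\alpha(x) \right)^{\frac{1}{p}} \leq \left( \int_{X} \big( d(x, S(x)) + d(S(x), T(S(x))) \big)^p \, d\alpha(x) \right)^{\frac{1}{p}}.$$
Then, using Minkowski inequality,
$$\tilde{W}_p(\alpha, \beta) \leq \left( \int_{X} d(x, S(x))^p \, d\alpha(x) \right)^{\frac{1}{p}} + \left( \int_{X} d(S(x), T(S(x)))^p \, d\alpha(x) \right)^{\frac{1}{p}} \leq \tilde{W}_p(\alpha, \gamma) + \tilde{W}_p(\gamma, \beta) + 2\varepsilon,$$
where the second integral equals $E_\gamma(T)$ by the change of variables $y = S(x)$, since $S_\sharp \alpha = \gamma$. Letting $\varepsilon \to 0$ gives the result.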
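The triangle inequality can be checked numerically on small empirical measures. The sketch below assumes uniform measures on $n$ distinct points of the real line with $d(x, y) = |x - y|$, so that admissible maps are permutations and the infimum in (7) can be found by brute force; the function name `monge_distance` and the sample points are hypothetical.

```python
from itertools import permutations


def monge_distance(xs, ys, p=2):
    """Brute-force Monge distance between uniform empirical measures on the line.

    Assumes len(xs) == len(ys) and distinct points in each cloud, so that
    admissible maps are the bijections x_i -> y_sigma(i).
    """
    n = len(xs)
    if len(ys) != n:
        raise ValueError("both point clouds must have the same size")
    best = min(
        sum(abs(xs[i] - ys[sigma[i]]) ** p for i in range(n)) / n
        for sigma in permutations(range(n))
    )
    return best ** (1.0 / p)


alpha = [0.0, 1.0, 3.0]
gamma = [0.5, 2.0, 2.5]
beta = [4.0, 1.5, -1.0]

d_ab = monge_distance(alpha, beta)
d_ag = monge_distance(alpha, gamma)
d_gb = monge_distance(gamma, beta)

# Triangle inequality, up to floating-point rounding.
triangle_holds = d_ab <= d_ag + d_gb + 1e-12
```

This only tests one instance and proves nothing, but it mirrors the argument above: composing the optimal matchings from `alpha` to `gamma` and from `gamma` to `beta` gives an admissible matching from `alpha` to `beta`.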