On the Limitations of Representing Functions on Sets
Edward Wagstaff*, Fabian Fuchs*, Martin Engelcke*, Ingmar Posner, Michael Osborne
*Equal contribution
Machine Learning Research Group
Examples of Permutation Invariant Problems:
Example: predicting face attributes such as "Smiling" or "Blond Hair" from sets of images (CelebA dataset, Liu et al.).
[Figure: schematic of a set model, building up from Input through Latent A and Latent B to the Output.]
[Figure: the sum-decomposition architecture. The input set X = {x1, …, xM} ⊂ ℝ (a point in ℝ^M up to permutation) is mapped element-wise by ϕ to latents ϕ(x1), …, ϕ(xM) (jointly in ℝ^(N×M)); the latents are summed into ℝ^N and passed through ρ to produce the output f(x1, …, xM) ∈ ℝ.]

f(x1, …, xM) = ρ(ϕ(x1) + … + ϕ(xM))
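The diagram translates directly into code. Below is a minimal PyTorch sketch (layer widths and the SumDecomposition name are illustrative choices, not taken from the paper) showing that summing the per-element latents makes the model permutation invariant:

```python
import torch
import torch.nn as nn

class SumDecomposition(nn.Module):
    """Models f(x1, ..., xM) = rho(phi(x1) + ... + phi(xM))."""
    def __init__(self, latent_dim_N: int):
        super().__init__()
        # phi: R -> R^N, applied independently to every set element
        self.phi = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, latent_dim_N))
        # rho: R^N -> R, applied to the pooled latent
        self.rho = nn.Sequential(nn.Linear(latent_dim_N, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):             # x: (batch, M, 1), each row a set of M scalars
        latents = self.phi(x)         # (batch, M, N)
        pooled = latents.sum(dim=1)   # (batch, N); summation discards element order
        return self.rho(pooled)       # (batch, 1)

model = SumDecomposition(latent_dim_N=8)
X = torch.randn(4, 5, 1)                      # 4 sets of M = 5 scalars
Xp = X[:, torch.randperm(5), :]               # permute each set
print(torch.allclose(model(X), model(Xp)))    # True: output is permutation invariant
```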
Theorem 1 (Zaheer et al.): This architecture can successfully model any permutation invariant function, even for latent dimension N=1.
Proof
Assume that the neural networks ϕ and ρ are universal function approximators.
Find a ϕ such that the mapping Φ(X) = ϕ(x1) + … + ϕ(xM) from the input set X to the latent representation Y is injective.
Since ρ is also universal, it can model f ∘ Φ⁻¹ on the injective latent representation: everything can be modelled.
define a bijection c : ℚ → ℕ,
then define ϕ(x) = 2^(-c(x)). The sum Φ(X) = ∑_{x∈X} 2^(-c(x)) acts as a binary expansion whose digits mark which elements are in X, so Φ is injective on sets of rationals.
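As a sanity check, the construction can be tested directly. The sketch below (a toy slice of an enumeration c over a small grid of rationals; any bijection c : ℚ → ℕ would do) uses exact rational arithmetic to confirm that Φ gives every set a distinct code:

```python
from fractions import Fraction
from itertools import combinations

# A finite slice of an enumeration c: Q -> N (the ordering is arbitrary;
# a full bijection would enumerate all of Q).
grid = sorted({Fraction(n, d) for n in range(-2, 3) for d in (1, 2, 3)})
c = {q: i for i, q in enumerate(grid)}

def Phi(X):
    # Phi(X) = sum over x in X of 2^(-c(x)), computed exactly with Fractions
    return sum((Fraction(1, 2 ** c[x]) for x in X), Fraction(0))

# Phi(X) is a binary expansion whose digits are the indicator bits of X,
# so distinct subsets must receive distinct codes.
codes = {}
for r in range(len(grid) + 1):
    for S in combinations(grid, r):
        v = Phi(S)
        assert v not in codes, "collision: Phi is not injective"
        codes[v] = S
print(len(codes), "subsets checked, all codes distinct")
```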
But this construction only covers countable inputs. We need to take real numbers into account!
Theorem 2: If we want to model all permutation invariant functions, it is sufficient and necessary that the latent dimension N is at least as large as the maximum input set size M.
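For intuition on why N = M suffices, one standard construction (sketched here with the power-sum embedding ϕ(x) = (x, x², …, x^M); an illustrative choice, not the paper's formal proof) makes Φ injective: the M power sums determine the set, so any permutation invariant function, including max, can be recovered from the latent:

```python
import numpy as np

def encode(xs):
    # Phi(X) = sum_m phi(x_m) with phi(x) = (x, x^2, ..., x^M): the M power sums
    M = len(xs)
    return sum(np.array([x ** k for k in range(1, M + 1)]) for x in xs)

def decode(p):
    # Newton's identities: recover elementary symmetric polynomials e_k
    # from power sums p_k, then read the set off as polynomial roots.
    M = len(p)
    e = [1.0]
    for k in range(1, M + 1):
        e.append(sum((-1) ** (i - 1) * e[k - i] * p[i - 1] for i in range(1, k + 1)) / k)
    coeffs = [(-1) ** k * e[k] for k in range(M + 1)]  # x^M - e1 x^(M-1) + ...
    return np.sort(np.roots(coeffs).real)

xs = [0.3, -1.2, 2.5]
recovered = decode(encode(xs))
print(recovered)        # ~ [-1.2, 0.3, 2.5]: the latent determines the set
print(recovered[-1])    # ~ 2.5 = max(X), computable from the N = M latent
```

With N < M the power sums no longer pin down the set, which is what the necessity argument below makes precise.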
Sketch of Proof for Necessity
To prove necessity, we exhibit a permutation invariant function which can’t be decomposed with N < M. We pick max(X).
We show that, in order to represent max(X), the mapping Φ(X) = ∑_{x∈X} ϕ(x) needs to be injective.
This is not possible with N < M.
[Figure: experimental results. Left: RMSE (log scale, 10⁻² to 10⁰) against latent dimension N (log scale, 10⁰ to 10³), with one curve per input set size M ∈ {15, 30, 60, 100, 200, 300, 400, 500}. Right: critical latent dimension Nc (20 to 100) against input set size M (100 to 600).]
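The experiment behind these plots can be reproduced in spirit with a short script. The sketch below (hypothetical hyperparameters and training details; the paper's exact setup is not reproduced here) trains a sum-decomposition network to regress max(X) and sweeps the latent dimension N:

```python
import torch
import torch.nn as nn

def train_rmse(M=30, N=8, steps=2000, batch=128):
    # rho(sum_m phi(x_m)) with element-wise phi and decoder rho
    phi = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, N))
    rho = nn.Sequential(nn.Linear(N, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam([*phi.parameters(), *rho.parameters()], lr=1e-3)
    for _ in range(steps):
        X = torch.rand(batch, M, 1)          # random sets of M scalars
        target = X.max(dim=1).values         # ground truth max(X)
        pred = rho(phi(X).sum(dim=1))
        loss = ((pred - target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return loss.sqrt().item()                # RMSE on the final training batch

for N in (1, 4, 16, 64):
    print(f"N={N:3d}  RMSE={train_rmse(N=N):.4f}")  # error drops once N is large enough
```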