Invariant-equivariant representation learning for multi-class data
Ilya Feige, Faculty
Invariant-equivariant representation learning
High-level introduction
Separating content from style
What? Want to represent the class and the data instance separately.
This work is about disentangling representations. We present a novel approach to an old problem.
Figure labels: datapoint, (r, v); class, r
Why?
What else? This is not a new topic…
Inferring the class latent using strategic data routing
Invariant (class) latent is deterministically calculated from “complementary” same-class examples:
$\{x_1, x_2, \ldots, x_m\} \longrightarrow r_y$
Equivariant (instance) latent is stochastically inferred from both class and datapoint information.
Inspired by Generative Query Networks (GQNs; Eslami et al., 2018)
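The data routing described above can be sketched in a few lines. This is a minimal illustration, not the presented implementation: the encoder `embed`, its weights `W`, the helper `class_latent`, and the toy data are all hypothetical, and two choices are assumptions of the sketch: the datapoint itself is excluded from its complementary set, and the embeddings are aggregated by a plain mean.

```python
import numpy as np

def embed(x, W):
    """Hypothetical stand-in for the invariant encoder: one linear layer + tanh."""
    return np.tanh(x @ W)

def class_latent(X, y, index, W, m, rng):
    """Compute r_y for datapoint `index` from up to m other examples of its class."""
    same_class = np.flatnonzero(y == y[index])
    # Assumption: the datapoint itself is not part of its complementary set.
    candidates = same_class[same_class != index]
    chosen = rng.choice(candidates, size=min(m, candidates.size), replace=False)
    # A mean makes r_y independent of the order of the chosen examples.
    return embed(X[chosen], W).mean(axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))      # 12 toy datapoints with 5 features
y = np.array([0, 1, 2] * 4)       # 3 classes, 4 examples each
W = rng.normal(size=(5, 2))       # toy encoder weights, latent size 2
r_y = class_latent(X, y, index=0, W=W, m=3, rng=rng)
```

Because r_y is built only from other members of the class, it cannot carry instance-specific information about the datapoint being encoded, which is what pushes class content into r and leaves style to v.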
Generative model with 2 latent variables
$p(\{x_n\}_{n=1}^{N} \mid \{y_n\}_{n=1}^{N}) = \prod_{n=1}^{N} \int dv_n \, dr_n \; p_\theta(x_n \mid r_n, v_n) \, \delta(r_n - r_{y_n}) \, p(v_n)$
Invariant latent: deterministic, from same-class complementary data
$r_{y_n} = \frac{1}{m} \sum_{i=1}^{m} f_{\theta_{\mathrm{inv}}}(x_i)$
Inference model: standard VAE
$q_\phi(v \mid r_y, x) = \mathcal{N}\big(\mu_\phi(r_y, x), \, \sigma^2_\phi(r_y, x) I\big)$
Objective: standard ELBOs
$\mathcal{L}_{\mathrm{lab}} = \mathbb{E}_{q(v \mid r_y, x)} \log p(x \mid r_y, v) - D_{\mathrm{KL}}\big[ q(v \mid r_y, x) \,\|\, p(v) \big] + \log p(y)$
$\mathcal{L}_{\mathrm{unlab}} = \mathbb{E}_{q(y \mid x)} \Big[ \mathbb{E}_{q(v \mid r_y, x)} \log p(x \mid r_y, v) - D_{\mathrm{KL}}\big[ q(v \mid r_y, x) \,\|\, p(v) \big] \Big] - D_{\mathrm{KL}}\big[ q(y \mid x) \,\|\, p(y) \big]$
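As a rough illustration of the labelled objective (reconstruction term, minus the KL on v, plus log p(y)), the sketch below computes a single-sample estimate for one datapoint. Several pieces are assumptions of the sketch rather than details from the slides: the prior p(v) is taken to be a standard normal, the likelihood is a unit-variance Gaussian, and `decode`, `labelled_elbo`, and the toy inputs are hypothetical names.

```python
import numpy as np

def gaussian_kl_to_standard_normal(mu, log_var):
    """KL[N(mu, diag(exp(log_var))) || N(0, I)] in closed form."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var)

def labelled_elbo(x, r_y, mu, log_var, decode, log_prior_y, rng):
    """Single-sample estimate of the labelled objective for one datapoint."""
    eps = rng.standard_normal(mu.shape)
    v = mu + np.exp(0.5 * log_var) * eps      # reparameterised sample of v
    x_hat = decode(r_y, v)
    # Assumption: unit-variance Gaussian likelihood for p(x | r_y, v).
    log_lik = -0.5 * np.sum((x - x_hat) ** 2 + np.log(2.0 * np.pi))
    return log_lik - gaussian_kl_to_standard_normal(mu, log_var) + log_prior_y

def toy_decode(r, v):
    """Hypothetical decoder: just concatenates the two latents."""
    return np.concatenate([r, v])

rng = np.random.default_rng(0)
x = np.array([0.5, -0.2, 0.1, 0.3])
elbo = labelled_elbo(
    x,
    r_y=np.array([0.4, -0.1]),
    mu=np.zeros(2),
    log_var=np.zeros(2),
    decode=toy_decode,
    log_prior_y=np.log(0.1),
    rng=rng,
)
```

In a real model `mu` and `log_var` would come from the inference network applied to (r_y, x), and the estimate would be averaged over a minibatch and maximised with a gradient-based optimiser.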
Latent spaces disentangle
The invariant latent learns to separate the classes.
The equivariant latent learns to ignore class information.
Samples from each class
Equivariant interpolations, for multiple invariant latents
Invariant steps, for multiple equivariant latents
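One plausible way to produce an equivariant interpolation is to hold the invariant latent r fixed and move v along a straight line between two values, decoding at each step. The sketch below assumes linear interpolation and uses a hypothetical `toy_decode` in place of the trained decoder; neither is confirmed by the slides.

```python
import numpy as np

def interpolate_equivariant(r, v_start, v_end, decode, steps=5):
    """Decode a straight-line path in v while keeping the class latent r fixed."""
    ts = np.linspace(0.0, 1.0, steps)
    return np.stack([decode(r, (1.0 - t) * v_start + t * v_end) for t in ts])

def toy_decode(r, v):
    """Hypothetical decoder: just concatenates the two latents."""
    return np.concatenate([r, v])

r = np.array([1.0, -1.0])
v_a = np.array([0.0, 0.0])
v_b = np.array([2.0, 4.0])
frames = interpolate_equivariant(r, v_a, v_b, toy_decode, steps=5)
```

Swapping the roles of the two latents (fixing v and stepping r between class latents) gives the corresponding sketch for invariant steps.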
Semi-supervised (using label-inference distribution)
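For an unlabelled datapoint, the label-inference distribution q(y|x) weights a per-class bound and is regularised towards the label prior p(y). The sketch below evaluates that expression for one datapoint, taking the per-class bounds as given numbers; `unlabelled_objective` and the toy values are hypothetical.

```python
import numpy as np

def unlabelled_objective(q_y, per_class_bound, p_y):
    """sum_y q(y|x) * bound_y - KL[q(y|x) || p(y)] for one unlabelled datapoint."""
    q_y = np.asarray(q_y, dtype=float)
    p_y = np.asarray(p_y, dtype=float)
    per_class_bound = np.asarray(per_class_bound, dtype=float)
    expected_bound = np.sum(q_y * per_class_bound)
    # Classes with q(y|x) = 0 contribute nothing to the KL term.
    support = q_y > 0
    kl = np.sum(q_y[support] * (np.log(q_y[support]) - np.log(p_y[support])))
    return expected_bound - kl

p_y = np.full(3, 1.0 / 3.0)                 # uniform label prior over 3 classes
bounds = np.array([-10.0, -12.0, -20.0])    # toy per-class bounds
value = unlabelled_objective([0.7, 0.2, 0.1], bounds, p_y)
predicted_class = int(np.argmax([0.7, 0.2, 0.1]))  # classify with the mode of q(y|x)
```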
Fully supervised (using 0-parameter distance to nearest $r_y$)
Benchmark is always the equivalent network, +2 dropout layers, trained to classify the labelled data
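Classifying by distance to the nearest class latent needs no extra trainable parameters. A minimal sketch follows, assuming Euclidean distance and assuming the query has already been embedded into the same space as the class latents; `nearest_class` and the toy values are hypothetical.

```python
import numpy as np

def nearest_class(query_embedding, class_latents):
    """Return the index of the class whose latent r_y is closest to the query."""
    distances = np.linalg.norm(class_latents - query_embedding, axis=1)
    return int(np.argmin(distances))

# One toy r_y per class, stacked row-wise.
class_latents = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
prediction = nearest_class(np.array([0.9, 1.2]), class_latents)
```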
Advantages
C. Reasonably intuitive
D. Performs similarly to comparable approaches
At Faculty, we are implementing this technique into our generic
Disadvantages
If you’re interested in finding out more about this work, or Faculty in general, please get in touch!
Ilya Feige, Director of AI: ilya@faculty.ai
Faculty: info@faculty.ai, +44 20 3637 9415
54 Welbeck St, Marylebone, London W1G 9XS, UK