SLIDE 5 Contributions
Novel Search Strategy for One-shot NAS
- 1. arbitrary search space (categorical + ordinal)
- 2. robust against its inputs (hyper-param. and search space)
Our approach
\[
\max_{w,c} f(w, c) \;\Rightarrow\; \max_{w,\theta} J(w, \theta) := \int f(w, c)\, p(c \mid \theta)\, dc
\]
p(c | θ): exponential family; J(w, θ): differentiable w.r.t. w and θ
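The relaxation above can be illustrated with a minimal sketch. It assumes a single categorical choice c with three options, drops the weights w for brevity, and uses made-up scores `f_values`; none of these names or numbers come from the slide. With a categorical p(c | θ) the integral becomes a sum, and it can also be estimated by sampling.

```python
import random

def expected_objective(f_values, theta):
    """Exact J(theta) = sum_c f(c) * p(c | theta) for a categorical p."""
    return sum(f * p for f, p in zip(f_values, theta))

def monte_carlo_objective(f_values, theta, n_samples, rng):
    """Estimate J(theta) by sampling c ~ p(c | theta) and averaging f(c)."""
    categories = list(range(len(theta)))
    samples = rng.choices(categories, weights=theta, k=n_samples)
    return sum(f_values[c] for c in samples) / n_samples

# Hypothetical scores f(c) for three architecture choices (illustration only).
f_values = [1.0, 3.0, 2.0]

uniform = [1 / 3, 1 / 3, 1 / 3]   # no preference among the choices
peaked = [0.05, 0.90, 0.05]       # mass concentrated on the best choice

j_uniform = expected_objective(f_values, uniform)
j_peaked = expected_objective(f_values, peaked)
j_estimate = monte_carlo_objective(f_values, peaked, 20000, random.Random(0))
```

Moving θ toward the best category raises J, so maximizing J over θ stands in for the original discrete search over c, while J itself is a smooth function of θ.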
- 2. Stochastic Natural Gradient + Adaptive Step-Size
\[
w^{t+1} = w^t + \epsilon_w^t \, \widehat{\nabla_w J}(w^t, \theta^t)
\]
\[
\theta^{t+1} = \theta^t + \epsilon_\theta^t \, F(\theta^t)^{-1} \, \widehat{\nabla_\theta J}(w^{t+1}, \theta^t)
\]
Natural Gradient
\[
J(w^t, \theta^t) < J(w^{t+1}, \theta^t) < J(w^{t+1}, \theta^{t+1})
\]
Monotone Improvement
Under appropriate step-size
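The θ-update can be sketched for one categorical variable, under several assumptions that are not on the slide: the weights w are held fixed, the step-size is a constant rather than adapted, raw scores minus a sample-mean baseline are used as weights, and the `floor` clipping is a hypothetical safeguard. It relies on the standard fact that, for a categorical distribution in its expectation parameters, the natural gradient of log p(c | θ) is onehot(c) − θ.

```python
import random

def expected_objective(f_values, theta):
    """Exact J(theta) = sum_c f(c) * theta_c, used only for monitoring."""
    return sum(f * p for f, p in zip(f_values, theta))

def natural_gradient_estimate(f_values, theta, n_samples, rng):
    """Monte Carlo estimate of F(theta)^{-1} grad_theta J from sampled c."""
    k = len(theta)
    samples = rng.choices(list(range(k)), weights=theta, k=n_samples)
    scores = [f_values[c] for c in samples]
    baseline = sum(scores) / n_samples  # variance-reducing baseline (assumption)
    grad = [0.0] * k
    for c, s in zip(samples, scores):
        for j in range(k):
            onehot = 1.0 if j == c else 0.0
            grad[j] += (s - baseline) * (onehot - theta[j]) / n_samples
    return grad

def update_theta(theta, grad, step_size, floor=1e-3):
    """Take one step, then clip and renormalize to stay inside the simplex."""
    stepped = [max(t + step_size * g, floor) for t, g in zip(theta, grad)]
    total = sum(stepped)
    return [t / total for t in stepped]

rng = random.Random(0)
f_values = [1.0, 3.0, 2.0]        # hypothetical scores, illustration only
theta = [1 / 3, 1 / 3, 1 / 3]
j_start = expected_objective(f_values, theta)
for _ in range(200):
    grad = natural_gradient_estimate(f_values, theta, 16, rng)
    theta = update_theta(theta, grad, step_size=0.1)  # fixed, not adaptive
j_end = expected_objective(f_values, theta)
```

In this toy run the probability mass drifts toward the highest-scoring category and J ends higher than it started. Individual noisy steps are not guaranteed to improve J here; the slide's improvement claim depends on choosing the step-size appropriately, which this fixed-step sketch does not attempt.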