
Safe Exploration for Interactive Machine Learning
Matteo Turchetta, Felix Berkenkamp, Andreas Krause


  1. Safe Exploration for Interactive Machine Learning Matteo Turchetta, Felix Berkenkamp, Andreas Krause

  2. Interactive Machine Learning
     • Agent can query noisy values of an unknown function (diagram: query x, observation f(x) + w)
     • Use data to make informed queries
     • Available queries may depend on previous ones: model the dependency with a directed graph
     • Includes: Bayesian optimization, active learning, and exploration of deterministic Markov decision processes
     Icon made by Freepik, Good Ware from www.flaticon.com
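The setting on this slide can be sketched as a small query loop. This is only an illustration under stated assumptions: the function `f`, the noise level, the 7-node graph, and the "least-sampled neighbour" rule are all hypothetical choices, not taken from the presentation.

```python
import random

def f(x):
    """Hypothetical unknown function (hidden from the agent in practice)."""
    return -(x - 3) ** 2

def noisy_query(x, rng, noise_std=0.1):
    """Return f(x) + w, where w is zero-mean Gaussian noise (an assumption)."""
    return f(x) + rng.gauss(0.0, noise_std)

# Directed graph: an edge x -> y means y may be queried right after x.
GRAPH = {x: [y for y in (x - 1, x + 1) if 0 <= y <= 6] for x in range(7)}

def explore(start, n_steps, seed=0):
    """Collect noisy observations while respecting the query graph."""
    rng = random.Random(seed)
    data = {}  # query -> list of noisy observations
    x = start
    for _ in range(n_steps):
        data.setdefault(x, []).append(noisy_query(x, rng))
        # Crude stand-in for an informed choice: among the queries
        # reachable from x, move to the one sampled least often so far.
        x = min(GRAPH[x], key=lambda y: (len(data.get(y, [])), y))
    return data
```

Calling `explore(0, 20)` returns the observations grouped by query. A real method would replace the least-sampled rule with a model-based criterion.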

  3. Safety constrained interactive machine learning
     • Unknown safety constraint q(x) ≥ 0 that must be satisfied at all times (diagram: queries x with q(x) ≥ 0 are safe, queries with q(x) < 0 are unsafe; observations are f(x) + w)
     • Encompasses many problems: therapy design, Mars exploration, model-free RL
     [Sui et al. 2015], [Turchetta et al. 2016], [Berkenkamp et al. 2016], [Sui et al. 2018], [Wachi et al. 2018]
     Icon made by Smashicons from www.flaticon.com

  4. Existing approaches
     • Build a conservative estimate S_t of the decisions that are safe to evaluate
     • Uniformly reduce uncertainty on the boundary G_t of this region
     • Treating the expansion of the safe set as a proxy objective can be wasteful
     Example: 1D optimization task with StageOPT [Sui et al. 2018], where f(s) ≡ q(s) and the constraint is q(s) ≥ 0. Many unnecessary samples are taken when the optimum has already been found.
     (Plot: # evaluations over the domain D.)
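The two steps on this slide (a conservative safe set, then uncertainty reduction on its boundary) can be sketched as follows. This is not the algorithm of the cited papers: it swaps in simple Lipschitz bounds and noise-free observations, and the function `q`, the constant `LIPSCHITZ`, the integer domain, and the tie-breaking rule are hypothetical.

```python
LIPSCHITZ = 0.5            # assumed known bound on the slope of q
DOMAIN = list(range(11))   # hypothetical 1D domain D

def q(s):
    """Hypothetical safety function, hidden from the agent; safe on [2, 8]."""
    return 1.0 - abs(s - 5) / 3.0

def lower(s, obs):
    """Pessimistic bound on q(s) implied by the observations."""
    return max(y - LIPSCHITZ * abs(s - x) for x, y in obs.items())

def upper(s, obs):
    """Optimistic bound on q(s) implied by the observations."""
    return min(y + LIPSCHITZ * abs(s - x) for x, y in obs.items())

def safe_set(obs):
    """Conservative estimate S_t: points whose pessimistic bound is >= 0."""
    return {s for s in DOMAIN if lower(s, obs) >= 0}

def expanders(obs, safe):
    """G_t: unevaluated safe points that might certify a point outside S_t."""
    outside = [s for s in DOMAIN if s not in safe]
    return {x for x in safe if x not in obs and any(
        upper(x, obs) - LIPSCHITZ * abs(s - x) >= 0 for s in outside)}

def expand_safe_set(seed):
    """Grow the safe set from a seed that is assumed to be known safe."""
    obs = {seed: q(seed)}
    order = [seed]
    while True:
        safe = safe_set(obs)
        candidates = expanders(obs, safe)
        if not candidates:
            return safe, order
        # Evaluate the expander with the widest confidence interval
        # (ties go to the smaller point).
        x = max(candidates, key=lambda c: (upper(c, obs) - lower(c, obs), -c))
        obs[x] = q(x)       # noise-free observation, for simplicity
        order.append(x)
```

With these illustrative values, `expand_safe_set(5)` evaluates s = 5, 3, 7 and certifies {3, ..., 7}, a subset of the true safe region [2, 8]. Note that the loop only tries to enlarge the safe set and never looks at an objective, which is the proxy behaviour the slide describes as potentially wasteful.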
