Type-Driven Automated Learning with Lale


  1. Type-Driven Automated Learning with Lale. Martin Hirzel, Kiran Kate, Avi Shinnar, Pari Ram, and Guillaume Baudart. Tuesday 4 November 2019. https://github.com/ibm/lale

  2. Value Proposition: Augment, but don’t replace, the data scientist.
     • Automation: Easy search and tuning of pipelines
     • Interop: Python building blocks & beyond
     • Usability: Like scikit-learn plus types

  3. Categorical + Continuous Dataset: https://nbviewer.jupyter.org/github/IBM/lale/blob/master/examples/talk_2019-1105-lale.ipynb

  4. Manual Pipeline

  5. Pipeline Combinators
     • >> or make_pipeline (name: pipe): feed to next. Scikit-learn feature: make_pipeline
     • & or make_union (name: and): run both. Scikit-learn features: make_union or ColumnTransformer
     • | or make_choice (name: or): choose one. Scikit-learn feature: N/A (specific to given Auto-ML tool)
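
The three combinators can be mimicked with Python operator overloading. The sketch below is a toy illustration only, not Lale's implementation: the `Node` class, the `op` helper, and the `show` method are hypothetical names introduced here to show how `>>`, `&`, and `|` can build a pipeline tree.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """Toy pipeline node (hypothetical; NOT Lale's operator class)."""
    kind: str                          # "op", "pipe", "and", or "or"
    name: str = ""                     # only used when kind == "op"
    children: Tuple["Node", ...] = ()  # sub-pipelines for combinators

    def __rshift__(self, other: "Node") -> "Node":
        # a >> b: feed the output of a into b
        return Node("pipe", children=(self, other))

    def __and__(self, other: "Node") -> "Node":
        # a & b: run both a and b
        return Node("and", children=(self, other))

    def __or__(self, other: "Node") -> "Node":
        # a | b: choose one of a or b
        return Node("or", children=(self, other))

    def show(self) -> str:
        """Render the tree with explicit parentheses."""
        if self.kind == "op":
            return self.name
        symbol = {"pipe": " >> ", "and": " & ", "or": " | "}[self.kind]
        return "(" + symbol.join(child.show() for child in self.children) + ")"


def op(name: str) -> Node:
    """Create a leaf operator with the given name."""
    return Node("op", name=name)


Norm, OneHot, Concat = op("Norm"), op("OneHot"), op("Concat")
LR, XGBoost = op("LR"), op("XGBoost")

pipeline = (Norm & OneHot) >> Concat >> (LR | XGBoost)
print(pipeline.show())
```

Python's own precedence rules make `>>` bind tighter than `&`, which in turn binds tighter than `|`, so the parentheses around the choice are required.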

  6. Automated Pipeline

  7. Displaying Automation Results

  8. Bindings as Lifecycle: Venn Diagram
     • Meta-model. Individual operator: schemas, priors. Pipeline: steps, grammar.
     • Planned (after arrange). Pipeline: graph topology.
     • Trainable (after init). Individual operator: hyperparameters. Pipeline: operator choices.
     • Trained (after fit). Individual operator: learned coefficients.
     • compose ( >> , & , | )
     “Type-Driven Automated Learning with Lale”, https://arxiv.org/pdf/1906.03957.pdf
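
The lifecycle on this slide can be read as a small state machine in which `init` binds hyperparameters and `fit` binds learned coefficients. The following is a minimal sketch under that reading, not Lale's API: `ToyOperator`, `State`, and the "learn the mean" training step are hypothetical stand-ins.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional, Sequence


class State(Enum):
    PLANNED = "planned"
    TRAINABLE = "trainable"
    TRAINED = "trained"


@dataclass(frozen=True)
class ToyOperator:
    """Hypothetical operator that records which bindings it has so far."""
    name: str
    state: State = State.PLANNED
    hyperparams: Optional[dict] = None  # bound by init
    coef: Optional[float] = None        # bound by fit

    def init(self, **hyperparams) -> "ToyOperator":
        # planned -> trainable: bind hyperparameters
        if self.state is not State.PLANNED:
            raise ValueError("init expects a planned operator")
        return replace(self, state=State.TRAINABLE, hyperparams=hyperparams)

    def fit(self, y: Sequence[float]) -> "ToyOperator":
        # trainable -> trained: bind a learned coefficient (here, the mean of y)
        if self.state is not State.TRAINABLE:
            raise ValueError("fit expects a trainable operator")
        return replace(self, state=State.TRAINED, coef=sum(y) / len(y))


planned = ToyOperator("MeanPredictor")
trainable = planned.init(strategy="mean")
trained = trainable.fit([1.0, 2.0, 3.0])
print(planned.state, trainable.state, trained.state)
```

Each step returns a new object instead of mutating the old one, so an earlier-state operator stays usable after a later state has been derived from it.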

  9. Semi-Automated Data Science
     Manual control over automation, with examples:
     • Restrict available operator choices: interpretable; based on licenses; based on GPU requirements
     • Tweak graph topology: custom preprocessing; multi-modal data; fairness mitigation
     • Tweak hyperparameter schemas: adjust range for continuous; restrict choices for categorical
     • Expand available operator choices: wrap existing library; write your own operators
     Data scientist actions: arrange, init, freeze, search, fit, score, pretty-print, visualize
     pipeline = (Project(columns={'type': 'number'}) >> Norm & Project(columns={'type': 'string'}) >> OneHot) >> Concat >> (LR | XGBoost | LinearSVC)

  10. Constraints in Scikit-learn

  11. Type-Driven Manual Learning in Lale
      • The data scientist supplies hyperparameters, which are validated against schemas.
      • Trainable pipeline: (Project >> Norm & Project >> OneHot) >> Concat >> XGBoost
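
Validating hyperparameters against a schema can be sketched with plain dictionaries. This is a toy validator under stated assumptions, not Lale's validation code: `LR_SCHEMA` and `validate` are hypothetical, the keywords `enum` and `minimum` are borrowed from JSON Schema, and the `C` entry is an illustrative hyperparameter that does not appear in the slides.

```python
from typing import Any, Dict, List

# Hypothetical schema for a logistic-regression-like operator.
LR_SCHEMA: Dict[str, Dict[str, Any]] = {
    "solver": {"enum": ["linear", "sag", "lbfgs"]},
    "penalty": {"enum": ["l1", "l2"]},
    "C": {"minimum": 0.0},  # assumption: illustrative continuous hyperparameter
}


def validate(hyperparams: Dict[str, Any], schema: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of error messages; an empty list means the values are valid."""
    errors = []
    for name, value in hyperparams.items():
        spec = schema.get(name)
        if spec is None:
            errors.append(f"unknown hyperparameter {name!r}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}={value!r} is not one of {spec['enum']}")
        if "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{name}={value!r} is below minimum {spec['minimum']}")
    return errors


print(validate({"solver": "sag", "penalty": "l2", "C": 1.0}, LR_SCHEMA))
print(validate({"solver": "adam", "C": -1.0}, LR_SCHEMA))
```

Reporting all violations at once, before any training starts, is the point of the check: a mistyped value is caught at construction time instead of after a long fit.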

  12. Constraints in Lale

  13. Types as Documentation

  14. Constraints in Auto-ML
      Problem: Some automated trials raise exceptions
      Solution 1: Unconstrained search space
      • {solver: [linear, sag, lbfgs], penalty: [l1, l2]}
      • Catch exception (after some time)
      • Return a made-up loss such as np.finfo(float).max
      Solution 2: Constrained search space
      • {solver: [linear, sag, lbfgs], penalty: [l1, l2]} and (if solver: [sag, lbfgs] then penalty: [l2])
      • No exceptions (no time wasted)
      • No made-up loss
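
The difference between the two solutions can be made concrete by enumerating the grid from the slide. The sketch below uses only the standard library; `satisfies_constraint` is a hypothetical helper that encodes the slide's rule "if solver is sag or lbfgs then penalty is l2".

```python
from itertools import product

# Search space as written on the slide.
space = {"solver": ["linear", "sag", "lbfgs"], "penalty": ["l1", "l2"]}


def satisfies_constraint(point: dict) -> bool:
    """If solver is sag or lbfgs, then penalty must be l2."""
    return point["solver"] not in ("sag", "lbfgs") or point["penalty"] == "l2"


# Solution 1: every combination, including ones that would raise at fit time.
unconstrained = [dict(zip(space, values)) for values in product(*space.values())]

# Solution 2: only combinations that satisfy the constraint.
constrained = [point for point in unconstrained if satisfies_constraint(point)]

print(len(unconstrained), len(constrained))
for point in unconstrained:
    if not satisfies_constraint(point):
        print("would fail:", point)
```

In this small grid, two of the six combinations are invalid, so a third of the trials in the unconstrained search would be spent on configurations that can only fail.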

  15. Types as Search Spaces
      • Planned pipeline: (Project >> Norm & Project >> OneHot) >> Concat >> (LR | XGBoost | LinearSVC)
      • generate (planned pipeline to search space): Lale can generate search spaces for various Auto-ML tools including hyperopt, GridSearchCV, and SMAC
      • Search point: sample from search space, encoded by given Auto-ML tool
      • decode (search point to trainable pipeline): (Project >> Norm & Project >> OneHot) >> Concat >> XGBoost
      • Other diagram labels: Schemas; Data Scientist; Hyperparameters; acquire
      “Type-Driven Automated Learning with Lale”, https://arxiv.org/pdf/1906.03957.pdf
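
The generate and decode steps can be sketched for the final operator choice alone. This is a toy grid enumeration, not Lale's search-space compiler: `SCHEMAS`, `generate_search_space`, and `decode` are hypothetical, and the hyperparameter values listed are illustrative assumptions.

```python
from itertools import product
from typing import Dict, List

# Hypothetical per-operator schemas: hyperparameter name -> allowed values.
SCHEMAS: Dict[str, Dict[str, list]] = {
    "LR": {"penalty": ["l1", "l2"]},
    "XGBoost": {"max_depth": [3, 6]},
    "LinearSVC": {"loss": ["hinge", "squared_hinge"]},
}


def generate_search_space(choices: List[str]) -> List[dict]:
    """Enumerate one search point per operator choice and hyperparameter setting."""
    points = []
    for operator in choices:
        names = list(SCHEMAS[operator])
        for values in product(*SCHEMAS[operator].values()):
            points.append({"operator": operator, **dict(zip(names, values))})
    return points


def decode(point: dict) -> str:
    """Turn a search point back into a description of a trainable operator."""
    args = ", ".join(f"{k}={v!r}" for k, v in point.items() if k != "operator")
    return f"{point['operator']}({args})"


search_space = generate_search_space(["LR", "XGBoost", "LinearSVC"])
print(len(search_space))
print(decode({"operator": "XGBoost", "max_depth": 3}))
```

Because the search space is derived from the same schemas that describe the operators, adding a hyperparameter to a schema changes the space without any separate search-space definition.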

  16. Types as Single Source of Truth
      • Planned pipeline: (Project >> Norm & Project >> OneHot) >> Concat >> (LR | XGBoost | LinearSVC)
      • generate (planned pipeline to search space): Lale can generate search spaces for various Auto-ML tools including hyperopt, GridSearchCV, and SMAC
      • Search point: sample from search space, encoded by given Auto-ML tool
      • decode (search point to trainable pipeline): (Project >> Norm & Project >> OneHot) >> Concat >> XGBoost
      • Other diagram labels: Schemas; Data Scientist; Hyperparameters; validate; acquire
      “Type-Driven Automated Learning with Lale”, https://arxiv.org/pdf/1906.03957.pdf

  17. Customizing Types

  18. Scikit-learn Compatible Interoperability
      Modality, dataset, and pipeline (the best found choice was shown in bold on the slide):
      • Text. Movie reviews (sentiment analysis): (BERT | TFIDF) >> (LR | MLP | KNN | SVC | PAC)
      • Table. Car (structured with categorical features): J48 | ArulesCBA | LR | KNN
      • Images. CIFAR-10 (image classification): ResNet50
      • Time-series. Epilepsy (seizure classification): WindowTransformer >> (KNN | XGBoost | LR) >> Voting

  19. Ongoing Work
      • General improvements: more operators, more Auto-ML tools, more robustness
      • Resource usage: memory, compute
      • Expressiveness: grammars, ensembles
      We welcome your suggestions and contributions!

  20. Conclusion (github.com/ibm/lale)
      • Automation: Easy search and tuning of pipelines
      • Interop: Python building blocks & beyond
      • Usability: Like scikit-learn plus types
      • Scikit-learn compatible interop
