SLIDE 1

EXPLOITING STRUCTURE FOR META-LEARNING

NeurIPS Metalearning Workshop | December 8, 2018

Lise Getoor | UC Santa Cruz | @lgetoor
SLIDE 2

STRUCTURE STRUCTURE IN INPUTS STRUCTURE IN OUTPUTS STRUCTURE IN META-LEARNING MODEL

SLIDE 3

THIS TALK

Structure & Meta-learning

SLIDE 4

STATISTICAL RELATIONAL LEARNING

Make use of logical structure Handle uncertainty Perform collective inference

[GETOOR & TASKAR ’07]

1 2 3

SLIDE 5

PROBABILISTIC SOFT LOGIC (PSL)

A probabilistic programming language for collective inference problems

  • Predicate = relationship or property
  • Ground Atom = (continuous) random variable
  • Weighted Rules = capture dependency or constraint

PSL Program = Rules + Input DB

psl.linqs.org

KEY REFERENCE: Hinge-Loss Markov Random Fields and Probabilistic Soft Logic, Stephen Bach, Matthias Broecheler, Bert Huang, Lise Getoor, JMLR 2017
SLIDE 6

COLLECTIVE

Reasoning

  • Outputs depend on each other
SLIDE 7

COLLECTIVE

Classification Pattern

local-predictor(x,l) → label(x,l)
label(x,l) & link(x,y) → label(y,l)

SLIDE 8

COLLECTIVE

Classification Pattern

local-predictor(x,l) → label(x,l)
label(x,l) & link(x,y) → label(y,l)
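
The pattern can be read as a simple propagation scheme: trust a local predictor, then let linked neighbors pull each other's labels together. A minimal illustrative sketch in Python (the names, data, and blending rule are toy assumptions, not PSL's actual inference engine):

```python
# Toy collective classification: mix each node's local prediction
# with the average label score of its linked neighbors, iterated
# to a fixed point. Illustrative only -- not the PSL solver.

def collective_classify(local_scores, links, alpha=0.5, iters=50):
    """local_scores: {node: {label: score in [0, 1]}};
    links: undirected edges as (x, y) pairs."""
    neighbors = {n: [] for n in local_scores}
    for x, y in links:
        neighbors[x].append(y)
        neighbors[y].append(x)
    labels = {n: dict(s) for n, s in local_scores.items()}
    for _ in range(iters):
        new = {}
        for n in labels:
            new[n] = {}
            for l in labels[n]:
                nbrs = neighbors[n]
                if nbrs:
                    nb_avg = sum(labels[m][l] for m in nbrs) / len(nbrs)
                    # local-predictor rule blended with relational rule
                    new[n][l] = alpha * local_scores[n][l] + (1 - alpha) * nb_avg
                else:
                    new[n][l] = local_scores[n][l]  # no links: keep local score
        labels = new
    return labels

local = {"ann": {"dem": 0.9, "rep": 0.1},
         "bob": {"dem": 0.5, "rep": 0.5}}
out = collective_classify(local, [("ann", "bob")])
# "bob" is pulled toward "ann"'s confident "dem" prediction
```

The fixed point balances local evidence against neighborhood agreement, which is the qualitative behavior the two rules above encode.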

SLIDE 9

COLLECTIVE CLASSIFICATION

[Figure: social network with SPOUSE, COLLEAGUE, and FRIEND links; QUESTION: one node’s label is unknown]
SLIDE 10

COLLECTIVE CLASSIFICATION

[Figure: social network with SPOUSE, COLLEAGUE, and FRIEND links; QUESTION: one node’s label is unknown]
SLIDE 11

COLLECTIVE CLASSIFICATION

[Figure: social network with SPOUSE, COLLEAGUE, and FRIEND links; QUESTION: several nodes’ labels are unknown]
SLIDE 12

COLLECTIVE CLASSIFICATION

[Figure: social network with SPOUSE, COLLEAGUE, and FRIEND links]

Local rules:
  • “If X donates to party P, X votes for P”
  • “If X tweets party P slogans, X votes for P”
Relational rules:
  • “If X is linked to Y, and X votes for P, Y votes for P”

SLIDE 13

COLLECTIVE CLASSIFICATION

[Figure: social network with SPOUSE, COLLEAGUE, and FRIEND links]

Donates(X,P) → Votes(X,P)

Local rules:
  • “If X donates to party P, X votes for P”
  • “If X tweets party P slogans, X votes for P”
Relational rules:
  • “If X is linked to Y, and X votes for P, Y votes for P”

SLIDE 14

Local rules:
  • “If X donates to party P, X votes for P”
  • “If X tweets party P slogans, X votes for P”
Relational rules:
  • “If X is linked to Y, and X votes for P, Y votes for P”

COLLECTIVE CLASSIFICATION

[Figure: social network with SPOUSE, COLLEAGUE, and FRIEND links]

Tweets(X,“Affordable Health”) → Votes(X,“Democrat”)

SLIDE 15

COLLECTIVE CLASSIFICATION

[Figure: social network with SPOUSE, COLLEAGUE, and FRIEND links]

Votes(X,P) & Friends(X,Y) → Votes(Y,P)
Votes(X,P) & Spouse(X,Y) → Votes(Y,P)

Local rules:
  • “If X donates to party P, X votes for P”
  • “If X tweets party P slogans, X votes for P”
Relational rules:
  • “If X is linked to Y, and X votes for P, Y votes for P”
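In PSL these rules are soft: ground atoms take truth values in [0, 1], and each ground rule contributes a hinge penalty measuring how far it is from being satisfied under the Łukasiewicz relaxation. A small sketch of that penalty (the specific truth values below are illustrative, not from the talk):

```python
# Lukasiewicz relaxation used by hinge-loss MRFs: the conjunction
# A1 & ... & Ak relaxes to max(0, sum(a_i) - (k - 1)), and a rule
# Body -> Head is penalized by its "distance to satisfaction".

def distance_to_satisfaction(body_atoms, head):
    """Penalty max(0, body - head) for one ground rule."""
    k = len(body_atoms)
    body = max(0.0, sum(body_atoms) - (k - 1))
    return max(0.0, body - head)

# Ground instance of  w : Votes(X,P) & Friends(X,Y) -> Votes(Y,P)
violated = distance_to_satisfaction([0.9, 1.0], head=0.3)    # 0.6
satisfied = distance_to_satisfaction([0.9, 1.0], head=0.95)  # 0.0
```

Inference then minimizes the weighted sum of these penalties over all ground rules at once, which is what makes the reasoning collective.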

SLIDE 16

COLLECTIVE

Activity Recognition

Inferring activities in video sequences

SLIDE 17

ACTIVITY RECOGNITION

crossing, waiting, queueing, walking, talking, dancing, jogging
SLIDE 18

COLLECTIVE

Pattern

local-predictor(x,l,f) → activity(x,l,f)
activity(x,l,f) & same-frame(x,y,f) → activity(y,l,f)
activity(x,l,f) & next-frame(f,f’) → activity(x,l,f’)

SLIDE 19

EMPIRICAL HIGHLIGHTS

Improved activity recognition in video:

                5 Activities        6 Activities
  HOG           47.4%  (.481 F1)    59.6%  (.582 F1)
  HOG + PSL     59.8%  (.603 F1)    79.3%  (.789 F1)
  ACD           67.5%  (.678 F1)    83.5%  (.835 F1)
  ACD + PSL     69.2%  (.693 F1)    86.0%  (.860 F1)

London et al., Collective Activity Detection using Hinge-loss Markov Random Fields, CVPR Workshops 2013
SLIDE 20

COLLECTIVE

Stance Prediction

Inferring users’ stance in online debates
SLIDE 21

DEBATE STANCE CLASSIFICATION

TASK:

Jointly infer users’ attitude on topics and interaction polarity

[Figure: debate network for TOPIC: Climate Change, with Pro/Anti users joined by Agree/Disagree links]

Sridhar, Foulds, Huang, Getoor & Walker, Joint Models of Disagreement and Stance, ACL 2015
SLIDE 22

PSL FOR STANCE CLASSIFICATION

// local text classifiers
w1: LocalPro(U,T) -> Pro(U,T)
w1: LocalDisagree(U1,U2) -> Disagrees(U1,U2)
// Rules for stance
w2: Pro(U1,T) & Disagrees(U1,U2) -> !Pro(U2,T)
w2: Pro(U1,T) & !Disagrees(U1,U2) -> Pro(U2,T)
// Rules for disagreement
w3: Pro(U1,T) & Pro(U2,T) -> !Disagrees(U1,U2)
w3: !Pro(U1,T) & Pro(U2,T) -> Disagrees(U1,U2)

bitbucket.org/linqs/psl-joint-stance
SLIDE 23

PREDICTING STANCE IN ONLINE FORUMS

Task: Predict post and user stance from two online debate forums
  • 4Forums.com: ~300 users, ~6000 posts
  • CreateDebate.org: ~300 users, ~1200 posts

  4FORUMS.COM           ACCURACY
  Text-only Baseline    69.0
  PSL                   80.3

  CREATEDEBATE.ORG      ACCURACY
  Text-only Baseline    62.7
  PSL                   72.7

Sridhar, Foulds, Huang, Getoor & Walker, Joint Models of Disagreement and Stance, ACL 2015
SLIDE 24

LINK

Prediction Pattern

link(x,y) & similar(y,z) → link(x,z)

SLIDE 25

CLUSTERING

Pattern

link(x,y) & link(y,z) → link(x,z)

SLIDE 26

MATCHING

Pattern

link(x,y) & !same(y,z) → !link(x,z)

SLIDE 27

THIS TALK

Structure & Meta-learning

SLIDE 28

SRL Concepts

Templated Models Weight Learning Structure Learning Latent Variables Logical rules

Meta-learning Concepts

Tied Hyperparameters Hyperparameter Optimization Feature & Algorithm Selection Landmarks Few/Zero-shot learning

SRL <-> META-LEARN

SLIDE 29

Probabilistic programming language for defining distributions

TEMPLATING

[Figure: rule template + input data = ground model]

/* Local rules */
wd: Donates(A, P) -> Votes(A, P)
wt: Mentions(A, “Affordable Health”) -> Votes(A, “Democrat”)
wt: Mentions(A, “Tax Cuts”) -> Votes(A, “Republican”)
/* Relational rules */
ws: Votes(A,P) & Spouse(B,A) -> Votes(B,P)
wf: Votes(A,P) & Friend(B,A) -> Votes(B,P)
wc: Votes(A,P) & Colleague(B,A) -> Votes(B,P)
/* Range constraint */
Votes(A, “Republican”) + Votes(A, “Democrat”) = 1.0 .
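Templating means each weighted rule is instantiated once per matching tuple in the input DB, so a handful of rules can define a very large ground model. A toy grounding sketch of the spouse rule (function and data names are hypothetical):

```python
# Hypothetical grounding of  ws: Votes(A,P) & Spouse(B,A) -> Votes(B,P)
# one ground rule per (spouse link, party) combination in the DB.
from itertools import product

def ground_spouse_rule(parties, spouse_links):
    """Return (body atom 1, body atom 2, head atom) per grounding."""
    ground = []
    for (b, a), p in product(spouse_links, parties):
        ground.append((f"Votes({a},{p})", f"Spouse({b},{a})",
                       f"Votes({b},{p})"))
    return ground

rules = ground_spouse_rule(["Dem", "Rep"], [("bob", "ann")])
# 1 spouse link x 2 parties -> 2 ground rules
```

The same tied weight ws is shared by every grounding, which is the structural pattern the next slide generalizes.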
SLIDE 30

LEARN

when structural patterns hold across many instantiations

SLIDE 31

STRUCTURE LEARNING

  • Large subfield of statistical relational learning
  • Friedman et al. IJCAI 99, Getoor et al. JMLR 02, Kok & Domingos ICML 05, Mihalkova & Mooney ICML 07, De Raedt et al. MLJ 08, Khosravi et al. AAAI 10, Khot et al. ICDM 11, Van Haaren et al. MLJ 15, among others

  • NIPS Relational Representation Learning Workshop
  • Basic Idea
  • Search model space
  • Model space is very rich
  • Optimize parameters
  • Information theoretic criteria, likelihood-based, and Bayesian approaches
SLIDE 32

META LEARN

when structural patterns hold across many learning tasks

SLIDE 33

META LEARNING

[Figure: graph linking Configurations to Tasks via Works edges]
SLIDE 34

META LEARNING

[Figure: Works edges between configurations and tasks, plus Similar edges; some Works values unknown]

Rules express:
  • “If configuration C works well for task T1, and task T2 is similar to T1, C will work well for T2”
  • “If configuration C1 works well for task T, and configuration C2 is similar to C1, C2 will work well for T”

SLIDE 35

META LEARNING

[Figure: Works edges between configurations and tasks, plus Similar edges; some Works values unknown]

Rules express:
  • “If configuration C works well for task T1, and task T2 is similar to T1, C will work well for T2”
  • “If configuration C1 works well for task T, and configuration C2 is similar to C1, C2 will work well for T”

Works(C,T1) & SimilarTask(T1,T2) → Works(C,T2)

SLIDE 36

META LEARNING

[Figure: Works edges between configurations and tasks, plus Similar edges; some Works values unknown]

Rules express:
  • “If configuration C works well for task T1, and task T2 is similar to T1, C will work well for T2”
  • “If configuration C1 works well for task T, and configuration C2 is similar to C1, C2 will work well for T”

Works(C1,T) & SimilarConfig(C1,C2) → Works(C2,T)
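
These rules behave like collaborative filtering over a Works(Config, Task) matrix: known performance scores propagate along similarity edges. A toy sketch of that propagation (all names, scores, and similarities are illustrative assumptions):

```python
# Estimate Works(config, task) from configs already tried on
# similar tasks, following:
#   Works(C,T1) & SimilarTask(T1,T2) -> Works(C,T2)

def predict_works(works, task_sim, config, task):
    """Similarity-weighted average over observed Works scores."""
    num = den = 0.0
    for (c, t1), score in works.items():
        sim = task_sim.get((t1, task), 0.0)
        if c == config and sim > 0:
            num += sim * score
            den += sim
    return num / den if den else 0.0

works = {("cfgA", "task1"): 0.9, ("cfgA", "task3"): 0.2}
task_sim = {("task1", "task2"): 0.8, ("task3", "task2"): 0.1}
score = predict_works(works, task_sim, "cfgA", "task2")
# dominated by the very similar task1, so the estimate is high
```

In the PSL formulation the two rules are solved jointly rather than in one pass, so task similarity and configuration similarity can reinforce each other collectively.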

SLIDE 37

META-LEARNING

  • Challenge: defining similarity
  • Advantages:
  • can make use of multiple similarity measures
  • can use domain knowledge for defining task and

configuration similarity

  • Research questions:
  • Are there benefits from using this approach?
  • What are opportunities for collective reasoning?
SLIDE 38

LANDMARKING

  • Can be described using latent variables
  • E.g., Task-Area and Learner-Expertise as latent variables
  • Research questions:
  • Are there benefits from using SRL approach?
  • What are opportunities for collective reasoning?
SLIDE 39

ALGORITHM & MODEL SELECTION

  • Can be described using (probabilistic/soft) logical rules
  • Research questions:
  • Are there benefits from using SRL approach?
  • What are opportunities for collective reasoning?
SLIDE 40

PIPELINE CONSTRUCTION

  • Can be described using logical rules and constraints
  • Research questions:
  • Are there benefits from using SRL approach?
  • What are opportunities for collective reasoning?
SLIDE 41

CLOSING

SLIDE 42

STRUCTURE AND META-LEARNING CLOSING THE LOOP

SLIDE 43

CLOSING COMMENTS

Provided some examples of structure and collective reasoning

Opportunity for Meta-Learning methods that can mix:
  • probabilistic & logical inference
  • data-driven & knowledge-driven modeling
  • Meta-modeling for meta-modeling
Compelling applications abound!

OPPORTUNITY!

SLIDE 44

PROBABILISTIC SOFT LOGIC

THANK YOU!

Contact information: getoor@ucsc.edu

psl.linqs.org

@lgetoor
SLIDE 45

PSL SUMMARY IN A SLIDE

  • MAP inference in PSL translates into a convex optimization problem → inference is really fast
  • Inference further enhanced with state-of-the-art optimization and distributed graph processing paradigms → inference even faster
  • Learning methods for rule weights & latent variables
  • PSL is open-source; code, data, and tutorials available online

psl.linqs.org
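
To see why the convexity claim matters, here is a hedged sketch: MAP inference minimizes a weighted sum of hinge penalties over [0, 1] variables, so even plain projected gradient descent reaches the optimum. The two-rule model, its evidence values, and the squared-hinge choice below are toy assumptions, not the PSL solver:

```python
# Toy hinge-loss MRF with squared potentials (smooth and convex):
#   rule 1: Donates(ann) -> Votes(ann), with Donates(ann) = 0.9
#   rule 2: Votes(ann) & Spouse(bob,ann) -> Votes(bob), Spouse = 1.0
# Energy = max(0, 0.9 - ann)^2 + max(0, ann - bob)^2, over [0, 1].

def map_inference(steps=5000, lr=0.05):
    ann, bob = 0.5, 0.5  # soft truth values to optimize
    for _ in range(steps):
        # gradients of the two squared hinge penalties
        g_ann = -2 * max(0.0, 0.9 - ann) + 2 * max(0.0, ann - bob)
        g_bob = -2 * max(0.0, ann - bob)
        ann = min(1.0, max(0.0, ann - lr * g_ann))  # project to [0, 1]
        bob = min(1.0, max(0.0, bob - lr * g_bob))
    return ann, bob

ann, bob = map_inference()
# local evidence drives ann up, and the spouse rule drags bob along
```

Because every ground rule contributes a convex term, the same argument scales to the full ground model, which is why PSL inference stays fast even when grounding produces millions of rules.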