A unifying computational framework for teaching and active learning
Scott Cheng-Hsin Yang, Wai Keen Vong, Yue Yu & Patrick Shafto
[Diagram: three learning settings]
Active learning: World, Learner
Teaching: Teacher, World, Learner
Self-teaching: Self as teacher, World, Learner
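Active learning
World: h*; Learner
- Intervene: the learner chooses x using the active learning strategy $P_L(x)$.
- Observe the consequence: y.
- Update belief: the learner computes $P_L(h \mid x, y)$.
[Figure: the learner's belief over h1, h2, h3, h4 at step 0, step 1, and step 2]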
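The intervene, observe, update loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the four threshold hypotheses, the array layout `likelihood[h, x, y]`, and the helper name `update_belief` are assumptions made for the example. The update itself is the standard Bayesian one, P_L(h | x, y) proportional to P(y | x, h) P_L(h).

```python
import numpy as np

def update_belief(prior, likelihood, x, y):
    """Bayesian update: P_L(h | x, y) is proportional to P(y | x, h) * P_L(h).

    prior:      shape (H,), current belief P_L(h)
    likelihood: shape (H, X, Y), likelihood[h, x, y] = P(y | x, h)
    """
    unnormalized = likelihood[:, x, y] * prior
    return unnormalized / unnormalized.sum()

# Hypothetical toy world (an assumption, not taken from the slides):
# hypothesis h_k predicts y = 1 exactly when x >= k, for k = 0..3.
n_h, n_x = 4, 3
likelihood = np.zeros((n_h, n_x, 2))
for h in range(n_h):
    for x in range(n_x):
        likelihood[h, x, int(x >= h)] = 1.0

step0 = np.full(n_h, 1.0 / n_h)                       # uniform prior over h
step1 = update_belief(step0, likelihood, x=1, y=1)    # intervene at x=1, observe y=1
step2 = update_belief(step1, likelihood, x=0, y=0)    # intervene at x=0, observe y=0
```

In this toy world the belief narrows from four hypotheses to two after the first observation and to a single hypothesis after the second.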
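Teaching (Shafto et al. 2008, 2014)
World: h*; Teacher; Learner
Teacher knows y and h*; learner does not.
- Show: the teacher selects (x, y) using the teaching strategy $P_T(x, y \mid h^*)$.
- Update belief: the learner computes $P_L(h \mid x, y)$.
Teaching strategy: $P_T(x, y \mid h^*)$; active learning strategy: $P_L(x)$.
Learner's inference: $P_L(h \mid x, y) \propto P_T(x, y \mid h)\, P_L(h)$
Teacher's selection: $P_T(x, y \mid h) \propto P_L(h \mid x, y)\, P_T(x, y)$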
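The learner's inference and the teacher's selection each depend on the other, so one common way to evaluate them is to alternate the two normalizations. The sketch below does that under assumptions that are not from the slides: a small consistency matrix stands in for the data, the teacher starts by choosing uniformly among consistent data, and the loop runs for a fixed number of rounds rather than testing convergence. The function name `cooperative_inference` is hypothetical.

```python
import numpy as np

def cooperative_inference(consistency, prior_h, prior_d, n_iter=100):
    """Alternate the learner's and teacher's equations for n_iter rounds.

    consistency: shape (D, H); entry [d, h] > 0 if data point d = (x, y) is
                 consistent with hypothesis h (every row and every column
                 must contain a positive entry)
    prior_h:     shape (H,), learner's prior P_L(h)
    prior_d:     shape (D,), teacher's prior over data P_T(x, y)
    """
    # Assumed initialization: uniform choice among consistent data points.
    teach = consistency / consistency.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        # Learner's inference: P_L(h | d) proportional to P_T(d | h) P_L(h)
        learn = teach * prior_h[None, :]
        learn = learn / learn.sum(axis=1, keepdims=True)
        # Teacher's selection: P_T(d | h) proportional to P_L(h | d) P_T(d)
        teach = learn * prior_d[:, None]
        teach = teach / teach.sum(axis=0, keepdims=True)
    return learn, teach

# Hypothetical example: data point 0 fits both hypotheses, data point 1 fits only h = 1.
consistency = np.array([[1.0, 1.0],
                        [0.0, 1.0]])
uniform = np.array([0.5, 0.5])
learn, teach = cooperative_inference(consistency, uniform, uniform)
```

Here the learner comes to read the ambiguous data point 0 as evidence for hypothesis 0, because a teacher who meant hypothesis 1 would have shown data point 1 instead.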
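Teaching (marginalize out y) (Yang & Shafto 2017)
World: h*; Teacher; Learner
- Show: the teacher selects x using the teaching strategy with y marginalized, $P_T(x \mid h^*)$.
- Observe the consequence: y.
- Update belief: the learner computes $P_L(h \mid x, y)$.
Teaching strategy (y marginalized): $P_T(x \mid h^*)$; active learning strategy: $P_L(x)$.
Learner's inference: $P_L(h \mid x, y) \propto P(y \mid x, h)\, P_T(x \mid h)\, P_L(h)$
Teacher's selection: $P_T(x \mid h) = \sum_{y \in Y} P_T(x, y \mid h)$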
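As a small worked instance of these two equations, the sketch below sums a joint teaching distribution over y and then applies the learner's inference. All numbers are hypothetical, as are the array layouts `likelihood[h, x, y]` and `teach_xy[h, x, y]` and the function name `learner_posterior`.

```python
import numpy as np

def learner_posterior(likelihood, teach_xy, prior_h, x, y):
    """P_L(h | x, y) proportional to P(y | x, h) * P_T(x | h) * P_L(h)."""
    # Teacher's selection with y marginalized: P_T(x | h) = sum_y P_T(x, y | h)
    teach_x = teach_xy.sum(axis=2)                      # shape (H, X)
    unnormalized = likelihood[:, x, y] * teach_x[:, x] * prior_h
    return unnormalized / unnormalized.sum()

# Hypothetical numbers: 2 hypotheses, 2 interventions, binary outcome.
likelihood = np.array([[[0.1, 0.9], [0.5, 0.5]],       # P(y | x, h=0)
                       [[0.9, 0.1], [0.5, 0.5]]])      # P(y | x, h=1)
teach_xy = np.array([[[0.1, 0.7], [0.1, 0.1]],         # P_T(x, y | h=0), sums to 1
                     [[0.5, 0.1], [0.2, 0.2]]])        # P_T(x, y | h=1), sums to 1
prior_h = np.array([0.5, 0.5])

posterior = learner_posterior(likelihood, teach_xy, prior_h, x=0, y=1)
```

With these numbers, P_T(x=0 | h) is 0.8 for h=0 and 0.6 for h=1, so the teacher's choice of x adds to the evidence that the outcome y already provides for h=0.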
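Knowledgeability (marginalize out "h") (Shafto, Eaves, et al. 2012)
World: h*; Teacher; Learner
$P_T(x \mid h) = \sum_{g \in H} P_T(x \mid g)\, \delta(g \mid h)$
Knowledgeable teacher, $\delta(g \mid h)$ (columns: truth h; rows: teacher's belief g):
      h1    h2    h3    h4
g1    1     0     0     0
g2    0     1     0     0
g3    0     0     1     0
g4    0     0     0     1
This recovers the teaching strategy with y marginalized, $P_T(x \mid h^*)$.
Self-teacher, $\delta_{ST}(g \mid h) = P_L(g)$ (columns: truth h; rows: learner's belief g):
      h1    h2    h3    h4
g1    1/4   1/4   1/4   1/4
g2    1/4   1/4   1/4   1/4
g3    1/4   1/4   1/4   1/4
g4    1/4   1/4   1/4   1/4
$P_T(x) = \sum_{g \in H} P_T(x \mid g)\, P_L(g)$
This is the teaching strategy with y and h* marginalized. It equals the active learning strategy: $P_T(x) = P_L(x)$.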
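Self-teaching
World: h*; Learner
The learner intervenes with x chosen by self-teaching, $P_T(x) = P_L(x)$, observes y, and updates to $P_L(h \mid x, y)$.
Learner's inference: $P_L(h \mid x, y) = \dfrac{P(y \mid x, h)\, P_T(x)\, P_L(h)}{\sum_{h' \in H} P(y \mid x, h')\, P_T(x)\, P_L(h')}$
Self-teacher's selection: $P_T(x) = \sum_{g \in H} P_T(x \mid g)\, P_L(g)$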
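The two self-teaching equations can be checked numerically with a short sketch. The values of P_T(x | g) and P(y | x, h) below are hypothetical, and the function names are illustrative only. The final lines show a property that follows directly from the learner's inference equation: P_T(x) does not depend on h, so it cancels and the update reduces to the ordinary Bayesian one.

```python
import numpy as np

def self_teaching_strategy(teach_x_given_g, belief):
    """Self-teacher's selection: P_T(x) = sum_g P_T(x | g) P_L(g)."""
    return belief @ teach_x_given_g

def learner_inference(likelihood, p_x, belief, x, y):
    """Learner's inference with the P_T(x) factor kept explicit."""
    numerator = likelihood[:, x, y] * p_x[x] * belief
    return numerator / numerator.sum()

# Hypothetical inputs: 2 hypotheses, 2 interventions, binary outcome.
teach_x_given_g = np.array([[0.8, 0.2],                # P_T(x | g=0)
                            [0.6, 0.4]])               # P_T(x | g=1)
likelihood = np.array([[[0.1, 0.9], [0.5, 0.5]],       # P(y | x, h=0)
                       [[0.9, 0.1], [0.5, 0.5]]])      # P(y | x, h=1)
belief = np.array([0.5, 0.5])

p_x = self_teaching_strategy(teach_x_given_g, belief)
posterior = learner_inference(likelihood, p_x, belief, x=0, y=1)

# Ordinary Bayesian update without the P_T(x) factor, for comparison.
plain = likelihood[:, 0, 1] * belief
plain = plain / plain.sum()
```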
How does the Self-Teaching model differ from the most common active learning objective, optimizing for expected information gain? Does the Self-Teaching model capture human active learning behavior?
Self-Teaching
$P_T(x) = \sum_{g \in H} \sum_{y \in Y} \dfrac{P_L(g \mid x, y)\, P_T(x, y)}{Z(g)}\, P_L(g)$
- Meta-reasons about oneself as the teacher
- Uses only the rules of probability
- Hypothesis testing for distinctive hypotheses
Expected information gain
$\mathrm{EIG}(x) = H(h) - \sum_{y \in Y} P_L(y \mid x)\, H(h \mid x, y)$
- Reasons about the world
- Also uses entropy and subtraction
- Overall uncertainty reduction
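For comparison, the expected information gain formula on this slide can be computed directly. The sketch below is a minimal illustration under assumed inputs: a hypothetical toy world with four threshold hypotheses, entropy measured in bits, and an array layout `likelihood[h, x, y]` chosen for the example.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def expected_information_gain(likelihood, belief):
    """EIG(x) = H(h) - sum_y P_L(y | x) H(h | x, y), for every x."""
    n_h, n_x, n_y = likelihood.shape
    eig = np.zeros(n_x)
    for x in range(n_x):
        expected_posterior_entropy = 0.0
        for y in range(n_y):
            joint = likelihood[:, x, y] * belief       # P(y | x, h) P_L(h)
            p_y = joint.sum()                          # P_L(y | x)
            if p_y > 0:
                expected_posterior_entropy += p_y * entropy(joint / p_y)
        eig[x] = entropy(belief) - expected_posterior_entropy
    return eig

# Hypothetical toy world: hypothesis h_k predicts y = 1 exactly when x >= k.
n_h, n_x = 4, 3
likelihood = np.zeros((n_h, n_x, 2))
for h in range(n_h):
    for x in range(n_x):
        likelihood[h, x, int(x >= h)] = 1.0

eig = expected_information_gain(likelihood, np.full(n_h, 0.25))
```

In this toy world the middle intervention, x = 1, splits the four hypotheses evenly and yields exactly 1 bit, while the two outer interventions yield less.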
Self-teaching: confirming distinctive h
A distinctive hypothesis is one that is on average less likely to be inferred if all interventions and observations are equally likely to occur.
$Z(g) = \sum_{y \in Y} \sum_{x \in X} P_L(g \mid x, y)\, P_T(x, y)$
[Figure: panels "Distinctiveness", "Learner's posterior" (h1 to h4 against x1 y0, x1 y1, x2 y0, x2 y1, x3 y0, x3 y1), and "Self-teaching probability" (x1, x2, x3)]
$P_T(x) = \sum_{g \in H} P_T(x \mid g)\, P_L(g) = \sum_{g \in H} \sum_{y \in Y} P_L(g \mid x, y)\, P_T(x, y)\, P_L(g)\, Z(g)^{-1}$
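To make Z(g) and the self-teaching probability concrete, the sketch below evaluates both formulas once with a uniform P_T(x, y), matching the "equally likely" condition in the definition of distinctiveness. It is a single pass, not an iterated fixed point, and the toy world, array layout, and function name `self_teaching_one_pass` are assumptions for illustration. The belief is assumed strictly positive so that every Z(g) is nonzero.

```python
import numpy as np

def self_teaching_one_pass(likelihood, belief):
    """Return Z(g) and P_T(x), using a uniform P_T(x, y) (an assumption)."""
    n_h, n_x, n_y = likelihood.shape
    # Learner's posterior P_L(g | x, y) for every possible (x, y).
    joint = likelihood * belief[:, None, None]
    evidence = joint.sum(axis=0, keepdims=True)
    posterior = np.divide(joint, evidence,
                          out=np.zeros_like(joint), where=evidence > 0)
    prior_xy = np.full((n_x, n_y), 1.0 / (n_x * n_y))  # uniform P_T(x, y)
    weighted = posterior * prior_xy[None, :, :]
    # Z(g) = sum_y sum_x P_L(g | x, y) P_T(x, y); lower Z means more distinctive.
    Z = weighted.sum(axis=(1, 2))
    # P_T(x | g) = sum_y P_L(g | x, y) P_T(x, y) / Z(g)
    teach_x_given_g = weighted.sum(axis=2) / Z[:, None]
    # P_T(x) = sum_g P_T(x | g) P_L(g)
    p_x = belief @ teach_x_given_g
    return Z, p_x

# Hypothetical toy world: hypothesis h_k predicts y = 1 exactly when x >= k.
n_h, n_x = 4, 3
likelihood = np.zeros((n_h, n_x, 2))
for h in range(n_h):
    for x in range(n_x):
        likelihood[h, x, int(x >= h)] = 1.0

Z, p_x = self_teaching_one_pass(likelihood, np.full(n_h, 0.25))
```

In this toy world the two middle hypotheses have the smallest Z(g), so they are the distinctive ones, and the intervention x = 1 that separates them receives the highest self-teaching probability.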
How does the Self-Teaching model differ from the most common active learning objective, optimizing for expected information gain? Does the Self-Teaching model capture human active learning behavior?
Boundary game
[Figure: task]
Causal graph learning
[Figure: task, from Coenen et al. 2015]
Coenen, Rehder, & Gureckis (2015). Strategies to intervene on causal systems are adaptively selected. Cognitive Psychology, 79, 102-133.
[Figure: panels labeled "Human choices", "Expected information gain", and "Self-Teaching model"]
Collaborators
Wai Keen Vong, Yue Yu, Patrick Shafto
Yang, Vong, Yu, & Shafto (2019). A unifying computational framework for teaching and active learning. Topics in Cognitive Science, 11(2), 316-337.
Conclusions
- We derived a Self-Teaching model, a novel form of active learning.
- It depends only on the rules of probability (may have implications for active machine learning).
- It unifies teaching and active learning under a single learning mechanism.
- It matches human active learning behavior in many cases.