SLIDE 1

A Semantic Loss Function for Deep Learning with Symbolic Knowledge

Jingyi Xu, Zilu Zhang, Tal Friedman, Yitao Liang, Guy Van den Broeck

SLIDE 2

Goal: Constrain neural network outputs using logic

SLIDE 3

Multiclass Classification

[Figure: network outputs 0.3, 0.8, 0.9]

SLIDE 4

Multiclass Classification

Want exactly one class:

[Figure: network outputs 0.3, 0.8, 0.9]

$(y_2 \land \lnot y_3 \land \lnot y_4) \lor (\lnot y_2 \land y_3 \land \lnot y_4) \lor (\lnot y_2 \land \lnot y_3 \land y_4)$

SLIDE 5

Multiclass Classification

Want exactly one class: with hard true/false outputs, no information is gained!

[Figure: network outputs 0.3, 0.8, 0.9 with hard assignments T, T, F]

$(y_2 \land \lnot y_3 \land \lnot y_4) \lor (\lnot y_2 \land y_3 \land \lnot y_4) \lor (\lnot y_2 \land \lnot y_3 \land y_4)$

SLIDE 6

Why is mixing so difficult?

Deep Learning

  • Continuous
  • Smooth
  • Differentiable

Logic

  • Discrete
  • Symbolic
  • Strong semantics

SLIDE 7

Multiclass Classification

Want exactly one class: compute the probability that the constraint is satisfied

[Figure: network outputs 0.3, 0.8, 0.9]

$(y_2 \land \lnot y_3 \land \lnot y_4) \lor (\lnot y_2 \land y_3 \land \lnot y_4) \lor (\lnot y_2 \land \lnot y_3 \land y_4)$

SLIDE 8

Use a probabilistic interpretation!

SLIDE 9

Multiclass Classification

Want exactly one class: the probability that the constraint is satisfied is
$y_2(1-y_3)(1-y_4) + (1-y_2)y_3(1-y_4) + (1-y_2)(1-y_3)y_4 = 0.3 \cdot 0.2 \cdot 0.1 + 0.7 \cdot 0.8 \cdot 0.1 + 0.7 \cdot 0.2 \cdot 0.9 = 0.188$

[Figure: network outputs 0.3, 0.8, 0.9]

$(y_2 \land \lnot y_3 \land \lnot y_4) \lor (\lnot y_2 \land y_3 \land \lnot y_4) \lor (\lnot y_2 \land \lnot y_3 \land y_4)$
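The arithmetic on this slide is easy to check with a short script; a minimal sketch (variable names are my own), treating the three outputs as independent probabilities:

```python
# Probability that exactly one of the three outputs is true, treating
# the network outputs as independent Bernoulli probabilities.
p = [0.3, 0.8, 0.9]

# One term per satisfying assignment of the exactly-one constraint.
prob = sum(
    p[i] * (1 - p[j]) * (1 - p[k])
    for i, j, k in [(0, 1, 2), (1, 0, 2), (2, 0, 1)]
)
print(round(prob, 3))  # 0.188
```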

SLIDE 10

Semantic Loss

  • Continuous, smooth, easily differentiable function
  • Represents how close outputs are to satisfying the constraint
  • Axiomatically respects semantics of logic, maintains precise meaning
    – independent of syntax
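As a concrete sketch of such a function (my own illustrative code, not the authors' implementation), the semantic loss for the exactly-one constraint is a smooth function of the predicted probabilities:

```python
import math

def exactly_one_semantic_loss(p):
    """-log P(exactly one output is true), with outputs treated as
    independent probabilities; smooth wherever that probability is positive."""
    satisfied = sum(
        p[i] * math.prod(1 - p[j] for j in range(len(p)) if j != i)
        for i in range(len(p))
    )
    return -math.log(satisfied)

loss = exactly_one_semantic_loss([0.3, 0.8, 0.9])
```

A confident one-hot prediction such as `[1.0, 0.0, 0.0]` satisfies the constraint with probability 1 and drives the loss to zero, while hedged predictions are penalized.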

SLIDE 11

How do we compute semantic loss?

SLIDE 12

Logical Circuits

  • Computing semantic loss exactly is #P-hard in general
  • But linear in the size of a compiled logical circuit

$L(\alpha, p) = -\log \sum_{\mathbf{x} \models \alpha} \; \prod_{i:\, \mathbf{x} \models X_i} p_i \prod_{i:\, \mathbf{x} \models \lnot X_i} (1 - p_i)$
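To make the definition concrete, here is a brute-force version (my own sketch) that enumerates all assignments; the point of the circuit compilation above is precisely to avoid this exponential enumeration:

```python
import itertools
import math

def semantic_loss(constraint, probs):
    """Semantic loss as -log of the weighted model count: sum, over all
    assignments satisfying the constraint, of the product of per-variable
    probabilities. Exponential in the number of variables; compiling the
    constraint into a logical circuit makes evaluation linear in circuit size."""
    wmc = 0.0
    for assignment in itertools.product([False, True], repeat=len(probs)):
        if constraint(assignment):
            weight = 1.0
            for value, p in zip(assignment, probs):
                weight *= p if value else 1 - p
            wmc += weight
    return -math.log(wmc)

# Exactly-one constraint from the earlier slides.
exactly_one = lambda xs: sum(xs) == 1
loss = semantic_loss(exactly_one, [0.3, 0.8, 0.9])
```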

SLIDE 13

Supervised Learning

  • Predict shortest paths in a graph
  • Add a semantic loss whose constraint says the output encodes a valid path

Evaluation: Is the output a path? Does the output use the true edges? Is the output the true shortest path?

SLIDE 14

Semi-Supervised Learning

  • Unlabeled data must have some label
SLIDE 15

Semi-Supervised Learning

  • Unlabeled data must have some label
  • Exactly-one constraint increases confidence
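A hedged sketch of how the two losses combine in the semi-supervised setting (the weight `w` and all names are my own; the relative weight is a hyperparameter):

```python
import math

def cross_entropy(probs, label):
    """Standard supervised loss on a labeled example."""
    return -math.log(probs[label])

def exactly_one_semantic_loss(probs):
    """Semantic loss for the exactly-one constraint."""
    satisfied = sum(
        probs[i] * math.prod(1 - probs[j] for j in range(len(probs)) if j != i)
        for i in range(len(probs))
    )
    return -math.log(satisfied)

def total_loss(labeled, unlabeled, w):
    """Labeled examples get cross-entropy; unlabeled examples still
    contribute through the constraint on the network outputs."""
    loss = sum(cross_entropy(p, y) for p, y in labeled)
    loss += w * sum(exactly_one_semantic_loss(p) for p in unlabeled)
    return loss

loss = total_loss([([0.7, 0.2, 0.1], 0)], [[0.3, 0.8, 0.9]], w=1.0)
```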
SLIDE 16

SLIDE 17

Main Takeaway

  • Deep learning and logic can be combined by using a probabilistic approach
  • Maintain precise meaning while fitting into the deep learning framework

SLIDE 18

Thanks!