A Semantic Loss Function for Deep Learning with Symbolic Knowledge


  1. A Semantic Loss Function for Deep Learning with Symbolic Knowledge. Jingyi Xu, Zilu Zhang, Tal Friedman, Yitao Liang, Guy Van den Broeck

  2. Goal: Constrain neural network outputs using logic

  3. Multiclass Classification. [Figure: a neural network with three outputs, predicting probabilities 0.8, 0.3, and 0.9.]

  4. Multiclass Classification. Want exactly one class: $(y_1 \land \lnot y_2 \land \lnot y_3) \lor (\lnot y_1 \land y_2 \land \lnot y_3) \lor (\lnot y_1 \land \lnot y_2 \land y_3)$, with network outputs 0.8, 0.3, 0.9.

  5. Multiclass Classification. Want exactly one class: $(y_1 \land \lnot y_2 \land \lnot y_3) \lor (\lnot y_1 \land y_2 \land \lnot y_3) \lor (\lnot y_1 \land \lnot y_2 \land y_3)$. Thresholding the outputs 0.8, 0.3, 0.9 into Booleans gives the hard assignment (T, F, T), which the constraint simply rejects: no information gained!

  6. Why is mixing so difficult? Deep Learning: continuous, smooth, differentiable. Logic: discrete, symbolic, strong semantics.

  7. Multiclass Classification. Want exactly one class: $(y_1 \land \lnot y_2 \land \lnot y_3) \lor (\lnot y_1 \land y_2 \land \lnot y_3) \lor (\lnot y_1 \land \lnot y_2 \land y_3)$. Instead of thresholding, ask for the probability that the constraint is satisfied.

  8. Use a probabilistic interpretation!

  9. Multiclass Classification. For outputs 0.8, 0.3, 0.9, the probability that the exactly-one constraint is satisfied is $y_1(1-y_2)(1-y_3) + (1-y_1)\,y_2(1-y_3) + (1-y_1)(1-y_2)\,y_3 = 0.188$ (verified in the sketch below).
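
A quick numeric check of the slide's computation (plain Python, not from the talk; variable names are mine):

```python
# Probability that exactly one of three Bernoulli outputs is true,
# using the network's predicted probabilities from the slide.
y1, y2, y3 = 0.8, 0.3, 0.9
p_exactly_one = (
    y1 * (1 - y2) * (1 - y3)      # only class 1 is true
    + (1 - y1) * y2 * (1 - y3)    # only class 2 is true
    + (1 - y1) * (1 - y2) * y3    # only class 3 is true
)
print(p_exactly_one)  # ~0.188
```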

  10. Semantic Loss • A continuous, smooth, easily differentiable function • Represents how close the outputs are to satisfying the constraint • Axiomatically respects the semantics of logic, maintaining a precise meaning independent of syntax (see the definition below)
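
Concretely, the paper defines the semantic loss of a sentence $\alpha$ over Boolean variables $X_1, \dots, X_n$, given predicted probabilities $p$, as (up to a proportionality constant):

$$
L^{s}(\alpha, p) \;\propto\; -\log \sum_{\mathbf{x} \models \alpha} \; \prod_{i:\,\mathbf{x} \models X_i} p_i \prod_{i:\,\mathbf{x} \models \lnot X_i} (1 - p_i)
$$

That is, the negative log-probability of sampling a state that satisfies the constraint, matching the $-\log(0.188)$ computation for the example above.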

  11. How do we compute semantic loss?

  12. Logical Circuits • Computing semantic loss is #P-hard in general • But it is linear in the size of a compiled logical circuit: $L(\alpha, p) = L(\text{circuit}, p) = -\log(\text{weighted model count})$ (a sketch of the bottom-up evaluation follows)
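
A minimal sketch of that circuit evaluation (a toy representation of my own, not the authors' code): assuming the compiled circuit is smooth and deterministic-decomposable, OR nodes sum over mutually exclusive branches and AND nodes multiply over independent sub-circuits, so a single bottom-up pass yields the probability of satisfying the constraint.

```python
import math

def eval_circuit(node, p):
    """Bottom-up weighted model count of a logical circuit.
    Nodes: ("lit", i, polarity), ("and", children), ("or", children)."""
    kind = node[0]
    if kind == "lit":
        _, i, positive = node
        return p[i] if positive else 1.0 - p[i]
    if kind == "and":  # decomposable: children over disjoint variables
        prob = 1.0
        for child in node[1]:
            prob *= eval_circuit(child, p)
        return prob
    if kind == "or":   # deterministic: children are mutually exclusive
        return sum(eval_circuit(child, p) for child in node[1])
    raise ValueError(kind)

def semantic_loss(circuit, p):
    return -math.log(eval_circuit(circuit, p))

# Exactly-one over three variables, as a (trivially compiled) circuit.
exactly_one = ("or", [
    ("and", [("lit", 0, True),  ("lit", 1, False), ("lit", 2, False)]),
    ("and", [("lit", 0, False), ("lit", 1, True),  ("lit", 2, False)]),
    ("and", [("lit", 0, False), ("lit", 1, False), ("lit", 2, True)]),
])

print(semantic_loss(exactly_one, [0.8, 0.3, 0.9]))  # -log(0.188) ~ 1.671
```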

  13. Supervised Learning • Predict shortest paths • Add a semantic loss representing valid paths • Evaluate: Is the output a path? Is the output the true shortest path? Does the output have the true edges? (a toy check for the first question is sketched below)
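
For illustration, a hypothetical checker for the first evaluation question (the representation and names are my assumptions, not the authors' code; a prediction is taken to be a set of undirected grid edges):

```python
from collections import defaultdict, deque

def is_simple_path(edges, source, target):
    """Check that a set of undirected edges forms a single simple path
    from source to target: endpoints have degree 1, interior vertices
    have degree 2, and the edge-induced graph is connected."""
    if not edges or source == target:
        return False
    degree = defaultdict(int)
    adj = defaultdict(list)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adj[u].append(v)
        adj[v].append(u)
    if degree[source] != 1 or degree[target] != 1:
        return False
    if any(d > 2 for d in degree.values()):
        return False
    # Connectivity: every edge must be reachable from the source.
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return all(u in seen and v in seen for u, v in edges)

print(is_simple_path({(0, 1), (1, 2)}, 0, 2))          # True
print(is_simple_path({(0, 1), (2, 3)}, 0, 3))          # False: disconnected
```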

  14. Semi-Supervised Learning • Unlabeled data must have some label

  15. Semi-Supervised Learning • Unlabeled data must have some label • The exactly-one constraint increases confidence (see the training-loss sketch below)
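
A hypothetical training objective in this spirit (a PyTorch sketch under my own assumptions, not the authors' code): cross-entropy on labeled examples plus a weighted exactly-one semantic loss on unlabeled examples.

```python
import torch

def exactly_one_semantic_loss(probs):
    """probs: (batch, k) class probabilities. Probability that exactly
    one class is true: sum_j p_j * prod_{i != j} (1 - p_i)."""
    batch, k = probs.shape
    sat = torch.zeros(batch, device=probs.device)
    for j in range(k):
        term = probs[:, j].clone()
        for i in range(k):
            if i != j:
                term = term * (1.0 - probs[:, i])
        sat = sat + term
    return -torch.log(sat.clamp_min(1e-12)).mean()

def total_loss(model, x_labeled, y_labeled, x_unlabeled, w=0.05):
    # Standard supervised loss on the labeled batch.
    ce = torch.nn.functional.cross_entropy(model(x_labeled), y_labeled)
    # Semantic loss on the unlabeled batch; w is an illustrative weight.
    probs = torch.softmax(model(x_unlabeled), dim=1)
    return ce + w * exactly_one_semantic_loss(probs)
```

With softmax outputs, minimizing this exactly-one semantic loss pushes the predicted distribution on unlabeled points toward a confident, near one-hot prediction, which is the "increases confidence" effect the slide refers to.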

  16. [Figure-only slide.]

  17. Main Takeaway • Deep learning and logic can be combined by using a probabilistic approach • Maintain logic's precise meaning while fitting into the deep learning framework

  18. Thanks!
