
Learning Context-dependent Label Permutations for Multi-label - PowerPoint PPT Presentation



  1. Learning Context-dependent Label Permutations for Multi-label Classification Jinseok Nam Amazon Alexa AI Joint work with Young-Bum Kim, Eneldo Loza Mencía, Sunghyun Park, Ruhi Sarikaya and Johannes Fürnkranz

  2. Multi-label Classification (MLC)
     • Goal: learn a function f that maps instances to a subset of labels, f : x → subset of {Sea, Desert, Building, Sky, Cloud, Mountain}
     • It is important to take into account label dependencies.
     • Joint probability of labels: $P(y_1, y_2, \cdots, y_L \mid x) = \prod_{i=1}^{L} P(y_i \mid y_{<i}, x)$
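The chain-rule factorization on this slide can be checked numerically. The following is a minimal sketch, not taken from the talk: the three labels, the fixed instance x, and every conditional probability in `conditionals` are made-up values chosen only for illustration.

```python
from itertools import product


def joint_probability(labels, conditionals):
    """Return P(y_1..y_L | x) as the product of P(y_i | y_<i, x)."""
    prob = 1.0
    for i, y_i in enumerate(labels):
        # conditionals[i] returns P(y_i = 1 | y_<i, x) for the given prefix.
        p_one = conditionals[i](labels[:i])
        prob *= p_one if y_i == 1 else 1.0 - p_one
    return prob


# Hypothetical conditionals for (Sea, Sky, Desert) and one fixed instance x.
conditionals = [
    lambda prev: 0.8,                            # P(Sea = 1 | x)
    lambda prev: 0.9 if prev[0] == 1 else 0.6,   # P(Sky = 1 | Sea, x)
    lambda prev: 0.05 if prev[0] == 1 else 0.5,  # P(Desert = 1 | Sea, Sky, x)
]

# Probability of the label vector Sea = 1, Sky = 1, Desert = 0.
p = joint_probability((1, 1, 0), conditionals)

# Summing over all 2^L label vectors should give 1 for a valid factorization.
total = sum(joint_probability(y, conditionals) for y in product((0, 1), repeat=3))
print(p, total)
```

Because each factor conditions on the labels before it, the value of the product for a given label vector depends on how the conditionals were built for that particular label order.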

  3. Maximization of the joint probability
     • Traditional approaches for minimizing subset 0/1 loss:
     • (Probabilistic) classifier chain (Dembczyński et al., ICML 2010; Read et al., MLJ 2011)
     Y = {Sea, Desert, Building, Sky, Cloud, Mountain}
     1. Create a chain of L labels: Desert → Sea → Cloud → Mountain → Sky → Building, with classifiers f_1, ..., f_6
     2. Train L independent classifiers, each given the input and the partial label vector as additional input features
     [Figure: f_2 receives Desert = 0; f_3 adds Sea = 1; f_4 adds Cloud = 0; f_5 adds Mountain = 1; f_6 adds Sky = 1]

  4. Maximization of the joint probability
     • Traditional approaches for minimizing subset 0/1 loss:
     • (Probabilistic) classifier chain (Dembczyński et al., ICML 2010; Read et al., MLJ 2011)
     Y = {Sea, Desert, Building, Sky, Cloud, Mountain}
     1. Create a chain of L labels: Desert → Sea → Cloud → Mountain → Sky → Building, with classifiers f_1, ..., f_6
     2. Train L independent classifiers, each given the input and the partial label vector as additional input features
     [Figure: f_2 receives Desert = 0; f_3 adds Sea = 1; f_4 adds Cloud = 0; f_5 adds Mountain = 1; f_6 adds Sky = 1]
     Limitations
     • Error propagation at test time
     • Effect of label orders in the chain

  5. Maximization of the joint probability
     • Traditional approaches for minimizing subset 0/1 loss:
     • (Probabilistic) classifier chain (Dembczyński et al., ICML 2010; Read et al., MLJ 2011)
     Y = {Sea, Desert, Building, Sky, Cloud, Mountain}
     1. Create a chain of L labels: Desert → Sea → Cloud → Mountain → Sky → Building, with classifiers f_1, ..., f_6
     2. Train L independent classifiers, each given the input and the partial label vector as additional input features
     [Figure: f_2 receives Desert = 0; f_3 adds Sea = 1; f_4 adds Cloud = 0; f_5 adds Mountain = 1; f_6 adds Sky = 1. On this slide Sea = 0 is shown alongside Sea = 1 in the inputs to f_3 through f_6]
     Limitations
     • Error propagation at test time
     • Effect of label orders in the chain
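Error propagation in a classifier chain can be illustrated with a small sketch of greedy test-time inference. This is an assumption-laden toy, not the implementation from the cited papers: `make_classifiers` is a hypothetical helper, and its probabilities are invented so that Sky and Building depend on the earlier Sea prediction.

```python
def greedy_chain_predict(chain, classifiers, threshold=0.5):
    """Predict labels in chain order; each prediction is fed to later classifiers."""
    predicted = {}
    for label in chain:
        # Each classifier sees the (possibly wrong) predictions made so far.
        p = classifiers[label](predicted)
        predicted[label] = 1 if p >= threshold else 0
    return predicted


def make_classifiers(p_sea):
    """Hypothetical per-label classifiers for one fixed instance x."""
    return {
        "Desert": lambda prev: 0.1,
        "Sea": lambda prev: p_sea,
        "Cloud": lambda prev: 0.2,
        "Mountain": lambda prev: 0.7,
        # Sky and Building rely on the Sea prediction as an input feature.
        "Sky": lambda prev: 0.9 if prev["Sea"] == 1 else 0.4,
        "Building": lambda prev: 0.8 if prev["Sea"] == 1 else 0.3,
    }


chain = ["Desert", "Sea", "Cloud", "Mountain", "Sky", "Building"]

# A confident Sea classifier versus one that falls just below the threshold.
good = greedy_chain_predict(chain, make_classifiers(p_sea=0.9))
bad = greedy_chain_predict(chain, make_classifiers(p_sea=0.45))
print(good)
print(bad)
```

In this toy setup a single early mistake on Sea also flips Sky and Building, which is the test-time error propagation listed under the limitations. Placing Sea later in the chain would change which classifiers are exposed to that mistake, which is the label-order effect.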

  6. Recurrent Neural Networks for MLC
     • Learning from a set of relevant labels in a sequential manner (Nam et al., NIPS 2017)
     • The number of relevant labels is much smaller than the total number of labels
     [Figure: RNN with hidden states h_0, ..., h_5 and outputs o_1, ..., o_5 emitting Sea, Building, Sky, Mountain, END; the labels Sea, Building, Sky, Mountain also appear as inputs]

  7. Recurrent Neural Networks for MLC
     • Learning from a set of relevant labels in a sequential manner (Nam et al., NIPS 2017)
     • The number of relevant labels is much smaller than the total number of labels
     [Figure: RNN with hidden states h_0, ..., h_5 and outputs o_1, ..., o_5 emitting Sea, Building, Sky, Mountain, END; the labels Sea, Building, Sky, Mountain also appear as inputs]
     • Question: The effect of label permutation remains! How to determine the target label permutation?

  8. Target label permutations for RNN training
     • Static label permutation for all instances
       • Arbitrary label sequence randomly chosen at the beginning
       • Label frequency distribution: freq2rare, rare2freq
       • Label structures (e.g., pairwise label dependencies)
       ➜ Suboptimal choice; learn from only one permutation
     • Different label permutations for individual instances
       • Choosing randomly every time
       • Learning from all possible label permutations
       ➜ More robust to the effect of label permutation; computational complexity
     We need MLC algorithms that learn context-dependent label permutations efficiently!
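The permutation strategies listed on this slide can be sketched as simple orderings of a relevant label set. The helper names, the toy training label sets, and the alphabetical tie-breaking below are assumptions made for illustration; the slides do not specify them.

```python
import math
import random
from collections import Counter


def label_frequencies(label_sets):
    """Count how often each label occurs in the training label sets."""
    return Counter(label for labels in label_sets for label in labels)


def freq2rare(labels, freq):
    """Static order: most frequent label first (ties broken alphabetically)."""
    return sorted(labels, key=lambda label: (-freq[label], label))


def rare2freq(labels, freq):
    """Static order: rarest label first (ties broken alphabetically)."""
    return sorted(labels, key=lambda label: (freq[label], label))


def random_order(labels, rng):
    """Per-instance order: a fresh random permutation each time it is called."""
    shuffled = sorted(labels)
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical training label sets.
train = [
    {"Sea", "Sky"},
    {"Sky", "Mountain", "Sea"},
    {"Sky", "Building"},
    {"Sea", "Building", "Sky", "Mountain"},
]
freq = label_frequencies(train)
target = {"Sea", "Building", "Sky", "Mountain"}

# Each target sequence ends with the END symbol used by the RNN decoder.
print(freq2rare(target, freq) + ["END"])
print(rare2freq(target, freq) + ["END"])
print(random_order(target, random.Random(0)) + ["END"])

# Learning from all permutations scales factorially in the label set size.
n_permutations = math.factorial(len(target))
print(n_permutations)
```

Even for four relevant labels there are 24 candidate target sequences, which gives a sense of why training on every permutation becomes expensive as label sets grow.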


  11. Model-based label permutation
     [Figure, two panels ⑴ and ⑵: label sequence sampling; computing errors & updating parameters. Predictions are marked as false positive, true positive, or false negative. True target label set: 1 2 3 4 5; sampled target label permutation: 2 1 4 3 5]

  12. Policy gradient
     $\nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim P_\theta}\left[ \sum_{i=0}^{T-1} \nabla_\theta \log P_\theta(a_i \mid s_i)\,\bigl(R_i - b(s_i)\bigr) \right]$
     [Figure: label policy distribution; model prediction evaluation; model parameter updates. Generated label permutation: 2 1; true target label set: 1 2 3 4 5]
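The policy-gradient expression on this slide can be illustrated with a single-sample REINFORCE estimate. This is a toy sketch under several assumptions that are not in the talk: the policy is a state-independent softmax over five labels rather than an RNN, R_i is taken to be an immediate reward of 1 for emitting a relevant label, the baseline b is the mean reward of the episode, and labels may be sampled repeatedly.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def sample_episode(theta, num_steps, rng):
    """Sample actions a_0..a_{T-1} (label indices) from the softmax policy."""
    probs = softmax(theta)
    return [rng.choices(range(len(theta)), weights=probs)[0] for _ in range(num_steps)]


def policy_gradient(theta, actions, rewards, baseline):
    """Single-sample estimate of sum_i grad log P_theta(a_i) * (R_i - b)."""
    probs = softmax(theta)
    grad = [0.0] * len(theta)
    for a, r in zip(actions, rewards):
        advantage = r - baseline
        for k in range(len(theta)):
            # d/d theta_k of log softmax(theta)[a] is 1[k == a] - probs[k].
            grad[k] += ((1.0 if k == a else 0.0) - probs[k]) * advantage
    return grad


rng = random.Random(0)
relevant = {1, 3}   # hypothetical true label set (indices into theta)
theta = [0.0] * 5   # one logit per label, uniform policy at the start

for _ in range(200):
    actions = sample_episode(theta, num_steps=3, rng=rng)
    rewards = [1.0 if a in relevant else 0.0 for a in actions]
    baseline = sum(rewards) / len(rewards)  # assumed baseline: episode mean
    grad = policy_gradient(theta, actions, rewards, baseline)
    theta = [t + 0.1 * g for t, g in zip(theta, grad)]  # gradient ascent step

probs = softmax(theta)
mass_on_relevant = sum(probs[k] for k in relevant)
print(mass_on_relevant)
```

Subtracting the baseline leaves the expected gradient unchanged while reducing its variance; in this toy version it also means that episodes whose rewards are all equal produce no update.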
