GEEM: An algorithm for Active Learning on Attributed Graphs




GEEM : An algorithm for Active Learning on Attributed Graphs

Florence Regol*, Soumyasundar Pal*, Yingxue Zhang**, Mark Coates*

* McGill University, Compnet Lab; ** Huawei Noah’s Ark Lab, Montreal Research Center

July 14th 2020

1 / 22

Active Learning - Problem Setting

What is active learning?

  • Access to unlabelled data.
  • Query an oracle for labels/targets. → An expensive process.

[Figure: feature space with labelled (target) and unlabelled (?) points]

Goal: Choose optimal queries to maximize performance.

2 / 22

Active Learning for Node Classification

[Figure: graph with labelled and unlabelled (?) nodes]

3 / 22

Active Learning Process

Pool-based active learning algorithm steps:

1. PREDICT: Infer Ŷ = f_t(X), where f_t is trained on (X, Y_{L_t}) for the current labelled set L_t.
2. QUERY: Select q_t from the unlabelled set U_t. Update L_{t+1} = L_t ∪ {q_t} and U_{t+1} = U_t \ {q_t}.

Repeat until the query budget B has been reached.

4 / 22
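The two-step loop above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: `fit`, `select`, and `oracle` are hypothetical callables standing in for the trained model f_t, the query strategy, and the (expensive) labelling oracle.

```python
import numpy as np

def pool_based_active_learning(X, oracle, L0, budget, fit, select):
    """Generic pool-based active learning loop (sketch).

    X      : (n, d) feature matrix for the full pool.
    oracle : callable mapping a node index to its true label (the costly query).
    L0     : initial list of labelled indices.
    budget : number of queries B.
    fit    : callable (X, labelled_idx, labels) -> trained model f_t.
    select : callable (model, X, unlabelled_idx) -> index q to query next.
    """
    labelled = list(L0)
    labels = {i: oracle(i) for i in labelled}      # labels of the seed set L_0
    unlabelled = [i for i in range(len(X)) if i not in labels]
    for _ in range(budget):
        model = fit(X, labelled, [labels[i] for i in labelled])  # PREDICT step
        q = select(model, X, unlabelled)                         # QUERY step
        labels[q] = oracle(q)                                    # pay the oracle
        labelled.append(q)                                       # L_{t+1} = L_t ∪ {q_t}
        unlabelled.remove(q)                                     # U_{t+1} = U_t \ {q_t}
    return fit(X, labelled, [labels[i] for i in labelled]), labelled
```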

Active Learning on Graphs - Existing work

GCN-based models: SOTA active learning strategies are based on the GCN output (AGE [1] and ANRMAB [2]).

1. PREDICT: Infer Ŷ = f_t(X). → Run one epoch of GCN. → Save the node embeddings output from the GCN.
2. QUERY: Select q ∈ U_t. → Select q based on metrics derived from the GCN output.

[1] Cai et al. ”Active learning for graph embedding” arXiv 2017 [2] Gao et al. ”Active discriminative network representation learning” IJCAI 2018

5 / 22
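As one illustration of such a metric, AGE's uncertainty component scores each node by the entropy of the GCN's softmax output (AGE additionally combines this with density and centrality scores). A minimal sketch with hypothetical names, assuming `probs` holds the softmax probabilities:

```python
import numpy as np

def entropy_scores(probs, eps=1e-12):
    """Shannon entropy of each node's predicted class distribution.

    probs : (n, K) array of softmax outputs from the GCN.
    Higher entropy means the model is less certain about that node.
    """
    p = np.clip(probs, eps, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def pick_query(probs, unlabelled):
    """Query the unlabelled node the GCN is most uncertain about."""
    scores = entropy_scores(probs)
    return max(unlabelled, key=lambda i: scores[i])
```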

Existing work - Results

[Figure: GCN-based algorithms on Cora. Accuracy vs. number of nodes in the labeled set (20–60) for AGE and ANRMAB, with a reference line showing the accuracy of GCN without active learning at x = 120 labeled nodes.]

6 / 22

Limitation: Deep learning models generally rely on a sizable validation set for hyperparameter tuning. Results with non-optimized GCN hyperparameters highlight this dependence.

7 / 22

Existing work - Non-optimized model

[Figure: Cora with a non-optimized version of AGE. Accuracy vs. number of nodes in the labeled set (20–60) for AGE, non-optimized AGE, and ANRMAB.]

8 / 22

Existing work - Unseen dataset

[Figure: Amazon-photo, with hyperparameters not fine-tuned to the dataset. Accuracy vs. number of nodes in the labeled set (10–50) for AGE, non-optimized AGE, and ANRMAB, with a reference line showing the accuracy of GCN without active learning at x = 160 labeled nodes.]

9 / 22

Proposed Algorithm: Graph Expected Error Minimization (GEEM)

10 / 22

Proposed algorithm - GEEM

Expected Error Minimization (EEM)

Risk of q: the expected 0/1 error once q is added to L_t, denoted R_{+q|Y_{L_t}}.

EEM selects the query q that minimizes this risk:

q^* = \arg\min_{q \in U_t} R_{+q|Y_{L_t}}

R_{+q|Y_{L_t}} = \sum_{k \in K} \frac{1}{|U_t^{-q}|} \sum_{i \in U_t^{-q}} \left( 1 - \max_{k' \in K} p(y_i = k' \mid Y_{L_t}, y_q = k) \right) p(y_q = k \mid Y_{L_t})

14 / 22
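The risk above can be computed directly from its definition. In this sketch, `predict_proba` is a hypothetical stand-in for retraining the model and returning p(y_i | Y_L) for every node given a labelled set; it is not the paper's code.

```python
import numpy as np

def eem_risk(q, unlabelled, predict_proba, labelled_y):
    """Expected 0/1 error after querying node q (the EEM risk), a sketch.

    predict_proba : callable mapping a labelled set (dict {node: label})
                    to an (n, K) array of posterior class probabilities
                    p(y_i = k | Y_L) for every node.
    """
    probs_now = predict_proba(labelled_y)          # gives p(y_q = k | Y_{L_t})
    n_classes = probs_now.shape[1]
    rest = [i for i in unlabelled if i != q]       # U_t^{-q}
    risk = 0.0
    for k in range(n_classes):
        # Retrain with the hypothesised label y_q = k added to the labelled set.
        probs_k = predict_proba({**labelled_y, q: k})
        err = np.mean([1.0 - probs_k[i].max() for i in rest])  # expected 0/1 error
        risk += err * probs_now[q, k]              # weight by p(y_q = k | Y_{L_t})
    return risk

def eem_select(unlabelled, predict_proba, labelled_y):
    """q* = argmin over the unlabelled pool of the EEM risk."""
    return min(unlabelled,
               key=lambda q: eem_risk(q, unlabelled, predict_proba, labelled_y))
```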

Proposed Algorithm - p(y|·)

All that remains is to define p(y|·).

Simplified GCN [3]: removes the non-linearities of GCNs to obtain a linearized logistic regression model. Set

p(y_j = k \mid Y_L) = \sigma(\tilde{x}_j W_{Y_L})^{(k)}

GEEM:

R_{+q|Y_{L_t}} = \sum_{k \in K} \frac{1}{|U_t^{-q}|} \sum_{i \in U_t^{-q}} \left( 1 - \max_{k' \in K} \sigma(\tilde{x}_i W_{L_t,+q,y_k})^{(k')} \right) \sigma(\tilde{x}_q W_{Y_{L_t}})^{(k)}

where W_{L_t,+q,y_k} denotes the weights retrained with node q assigned label k.

17 / 22
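A minimal sketch of this choice of p(y|·), under the usual Simplified GCN formulation: propagate features with the symmetrically normalised adjacency, then fit a multinomial logistic regression on the propagated features x̃. Function names and the plain gradient-descent fit are illustrative assumptions, not the paper's training setup.

```python
import numpy as np

def sgc_features(X, A, hops=2):
    """Simplified GCN propagation: x̃ = S^K x with
    S = D^{-1/2} (A + I) D^{-1/2} (symmetrically normalised adjacency)."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    S = A_hat / np.sqrt(np.outer(d, d))            # D^{-1/2} (A+I) D^{-1/2}
    X_tilde = X.copy()
    for _ in range(hops):
        X_tilde = S @ X_tilde                      # K-hop feature smoothing
    return X_tilde

def fit_logreg(X, y, n_classes, steps=500, lr=0.5):
    """Multinomial logistic regression W fitted by plain gradient descent."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    Y = np.eye(n_classes)[y]                       # one-hot targets
    for _ in range(steps):
        Z = X @ W
        P = np.exp(Z - Z.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)          # softmax σ(x̃ W)
        W -= lr * X.T @ (P - Y) / n                # cross-entropy gradient step
    return W

def predict_proba(X_tilde, W):
    """p(y_j = k | Y_L) = σ(x̃_j W)^(k) for every node j."""
    Z = X_tilde @ W
    P = np.exp(Z - Z.max(axis=1, keepdims=True))
    return P / P.sum(axis=1, keepdims=True)
```

Retraining this linear model for each candidate (q, k) pair is what makes evaluating the GEEM risk tractable compared to retraining a full GCN.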


Results

18 / 22

Results - GEEM

Cora: GEEM outperforms GCN-based methods even when the GCN hyperparameters are fine-tuned.

[Figure: Accuracy vs. number of nodes in the labeled set (20–60) on Cora for AGE, non-optimized AGE, ANRMAB, GEEM*, and Random.]

19 / 22

Results - GEEM

Amazon-photo: GEEM significantly outperforms GCN-based methods.

[Figure: Accuracy vs. number of nodes in the labeled set (10–50) on Amazon-photo.]

20 / 22

Conclusion

The proposed GEEM algorithm:

  • Offers SOTA performance.
  • Does not rely on a validation set → a more realistic scenario.

Additional contributions:

  • Combined GEEM: a hybrid with label propagation (LP) covers more cases.
  • Preemptive GEEM (PreGEEM): takes advantage of oracle delay with approximations. → We provide bounds on the approximation error.

21 / 22


References

[1] H. Cai, V. W. Zheng, and K. C. Chang, “Active learning for graph embedding,” arXiv preprint arXiv:1705.05085, 2017.
[2] L. Gao, H. Yang, C. Zhou, J. Wu, S. Pan, and Y. Hu, “Active discriminative network representation learning,” in Proc. Int. Joint Conf. Artificial Intell., 2018, pp. 2142–2148.
[3] F. Wu, A. Souza, T. Zhang, C. Fifty, T. Yu, and K. Weinberger, “Simplifying graph convolutional networks,” in Proc. Int. Conf. Machine Learning, Long Beach, California, USA, Jun. 2019, pp. 6861–6871.

22 / 22