Bias/Variance Analysis for Network Data


  1. Bias/Variance Analysis for Network Data
     Jennifer Neville and David Jensen
     Knowledge Discovery Laboratory, University of Massachusetts Amherst

  2. Collective inference
     − Apply models to collectively infer class labels throughout network
     − Exploit autocorrelation to improve model performance
     − Collective SRL models
       − Probabilistic relational models (e.g., RBNs, RDNs, RMNs)
       − Probabilistic logic models (e.g., BLPs, MLNs)
       − Ad hoc collective models (e.g., pRNs, LBC)
     [Figure: network whose nodes carry + and – class labels]
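The autocorrelation that collective models exploit can be quantified in several ways. The sketch below uses one common choice, the Pearson correlation of class labels across the two endpoints of every link. The function name `label_autocorrelation`, the +1/-1 label encoding, and the toy graph are illustrative assumptions, not something taken from the slides.

```python
from statistics import mean

def label_autocorrelation(edges, labels):
    """Pearson correlation of class labels (+1/-1) across linked node pairs."""
    # Count every undirected link in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [labels[u], labels[v]]
        ys += [labels[v], labels[u]]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    # Undefined (division by zero) if every linked node has the same label.
    return cov / (var_x * var_y) ** 0.5

# Toy graph (hypothetical): two triangles with opposite labels, joined by one link.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = {0: 1, 1: 1, 2: 1, 3: -1, 4: -1, 5: -1}
print(label_autocorrelation(edges, labels))  # 10/14, about 0.714
```

Values near 1 mean that linked nodes almost always share a label, which is the situation in which a neighbor's class label is a strong predictor.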

  4. Comparing collective models
     [Figure panels: Latent group models; Relational dependency networks]
     Why do RDNs perform poorly when few instances are labeled in the test set?

  5. Understanding RDN performance
     − Hypothesis
       − High autocorrelation → feature selection chooses class label rather than observed attributes
       − Few labeled test set instances → identifiability problem
       − Gibbs sampling → increased variance
     − How to evaluate hypothesis?
       − Variance is due to collective inference procedure
       − Need an analysis framework that can differentiate model errors due to learning and inference
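The hypothesized link between few labeled instances, Gibbs sampling, and variance can be explored with a toy sampler. The sketch below is not an RDN: the conditional model (a sigmoid of the summed neighbor labels), the ring network, and the names `gibbs_marginals` and `inference_variance` are all assumptions made for illustration. It estimates each node's marginal by Gibbs sampling and then measures how much that estimate varies across independent inference runs.

```python
import math
import random
from statistics import pvariance

def gibbs_marginals(neighbors, known, strength=1.0, sweeps=200, burn_in=50, rng=None):
    """Estimate P(label = +1) for every node with a simple Gibbs sampler.

    neighbors: dict node -> list of neighboring nodes
    known:     dict node -> fixed label (+1 or -1) for labeled test instances
    Assumed conditional: P(y_i = +1 | neighbors) = sigmoid(2 * strength * sum of
    neighbor labels), a hypothetical autocorrelation-only model.
    """
    rng = rng or random.Random()
    # Labeled nodes keep their label; unlabeled nodes start at random.
    state = {n: known.get(n, rng.choice([-1, 1])) for n in neighbors}
    unknown = [n for n in neighbors if n not in known]
    positive_counts = {n: 0 for n in unknown}
    for sweep in range(sweeps):
        for n in unknown:
            field = strength * sum(state[m] for m in neighbors[n])
            p_pos = 1.0 / (1.0 + math.exp(-2.0 * field))
            state[n] = 1 if rng.random() < p_pos else -1
        if sweep >= burn_in:  # discard early sweeps before counting samples
            for n in unknown:
                positive_counts[n] += state[n] == 1
    kept = sweeps - burn_in
    marginals = {n: positive_counts[n] / kept for n in unknown}
    marginals.update({n: 1.0 if y == 1 else 0.0 for n, y in known.items()})
    return marginals

def inference_variance(neighbors, known, node, runs=20, seed=0):
    """Variance of one node's estimated marginal across independent Gibbs runs."""
    rng = random.Random(seed)
    estimates = [gibbs_marginals(neighbors, known, rng=rng)[node] for _ in range(runs)]
    return pvariance(estimates)

# Hypothetical 12-node ring, with one versus five labeled test instances.
ring = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
few_seeds = {0: 1}
many_seeds = {0: 1, 2: 1, 4: 1, 8: 1, 10: 1}
print("variance, 1 labeled node: ", inference_variance(ring, few_seeds, node=6))
print("variance, 5 labeled nodes:", inference_variance(ring, many_seeds, node=6))
```

Comparing the two printed values gives a rough feel for how the amount of labeled information seeding the sampler affects run-to-run variation. The actual numbers depend on the assumed model and settings.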

  6. Bias/variance analysis
     − Conventional bias/variance analysis
       − Decomposes errors due to learning alone
       − Assumes no variation due to inference
     − Relational bias/variance analysis
       − Collective inference introduces new source of error
       − SRL models exhibit different types of errors
       − Network characteristics affect performance
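For reference, the conventional decomposition can be written out as follows. This sketch assumes squared loss, which the slides do not state explicitly. Here t is the observed target for one test instance, y is the prediction of a model learned from training sample D, Y* = E_t[t] is the optimal prediction, and Ȳ = E_D[y] is the mean prediction over training samples.

```latex
\begin{align*}
E_{D,t}\big[(t - y)^2\big]
  &= \underbrace{E_t\big[(t - Y^*)^2\big]}_{\text{noise}}
   + \underbrace{(Y^* - \bar{Y})^2}_{\text{bias}^2}
   + \underbrace{E_D\big[(y - \bar{Y})^2\big]}_{\text{variance}}
\end{align*}
```

Every expectation on the right-hand side is taken over training samples or targets only, so variation introduced at inference time has no term of its own.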

  8. Conventional bias/variance framework
     − Expected error per instance
     − Decompose into model bias/variance
     [Figure: training set samples are used to learn models M1, M2, M3, which are applied to the test set; the distribution of model predictions is annotated with bias (between Y* and the mean prediction Ȳ) and variance]
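The procedure in this figure can be sketched in a few lines: learn one model per training-set sample, apply each to the same test set, and summarize the predictions per instance. The function name `bias_variance`, the squared-loss definitions, and the example numbers are assumptions for illustration only.

```python
from statistics import mean

def bias_variance(predictions, optimal):
    """Per-instance bias and variance under squared loss.

    predictions[m][i]: predicted probability from model m (one model per
                       training-set sample) for test instance i
    optimal[i]:        optimal prediction Y* for test instance i
    Returns two lists (bias, variance) with one entry per test instance.
    """
    bias, variance = [], []
    for i in range(len(optimal)):
        column = [model[i] for model in predictions]
        y_bar = mean(column)  # mean prediction across models
        bias.append(optimal[i] - y_bar)
        variance.append(mean([(y - y_bar) ** 2 for y in column]))
    return bias, variance

# Hypothetical predictions from three models on two test instances.
preds = [[0.9, 0.2],
         [0.7, 0.4],
         [0.8, 0.3]]
bias, variance = bias_variance(preds, optimal=[1.0, 0.0])
print(bias)      # roughly [0.2, -0.3]
print(variance)  # roughly [0.0067, 0.0067]
```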

  10. Bias/variance framework for relational data
      − Measure learning bias and variance with full labeling
      [Figure: training set samples are used to learn models M1, M2, M3, which are applied to a fully labeled test set; the model predictions are annotated with learning bias (between Y* and the mean prediction Ȳ_L) and learning variance]

  13. Bias/variance framework for relational data
      − Measure learning bias and variance with full labeling
      − Measure total bias and variance
        − Expectation over training and test sets
      − Difference: inference bias and variance
      [Figure: training set samples are used to learn models M1, M2, M3; each model is applied to the test set over several inference runs; the model predictions are annotated with learning bias (Y* to Ȳ_L), inference bias (Ȳ_L to Ȳ), total bias (Y* to Ȳ), and total variance]
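The three measurements on this slide can be sketched for a single test instance as follows. The sketch assumes squared loss and takes the inference components as simple differences (total minus learning), as the slide describes. The function name `relational_bias_variance`, the argument layout, and the example numbers are hypothetical.

```python
from statistics import mean

def relational_bias_variance(full_label_preds, collective_preds, optimal):
    """Split bias and variance for one test instance into learning and inference parts.

    full_label_preds[m]:    prediction of model m when the test set is fully
                            labeled, so no collective inference is needed
    collective_preds[m][r]: prediction of model m on inference run r, with
                            labels inferred collectively
    optimal:                optimal prediction Y* for the instance
    """
    # Learning component: variation across models only.
    y_bar_learning = mean(full_label_preds)
    learning_bias = optimal - y_bar_learning
    learning_var = mean([(y - y_bar_learning) ** 2 for y in full_label_preds])

    # Total component: variation across models and inference runs.
    all_runs = [y for runs in collective_preds for y in runs]
    y_bar_total = mean(all_runs)
    total_bias = optimal - y_bar_total
    total_var = mean([(y - y_bar_total) ** 2 for y in all_runs])

    # Inference component: the difference between total and learning.
    return {
        "learning_bias": learning_bias,
        "inference_bias": total_bias - learning_bias,
        "total_bias": total_bias,
        "learning_variance": learning_var,
        "inference_variance": total_var - learning_var,
        "total_variance": total_var,
    }

# Hypothetical numbers: two models, two inference runs each.
result = relational_bias_variance(
    full_label_preds=[0.8, 0.6],
    collective_preds=[[0.9, 0.5], [0.7, 0.3]],
    optimal=1.0,
)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these numbers the learning bias is 0.3 and the total bias is 0.4, so 0.1 of the bias is attributed to inference. Likewise 0.04 of the total variance of 0.05 is attributed to inference.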

  14. Synthetic data experiments
      − Vary group size, linkage, autocorrelation
      − Compare LGMs, RDNs, RMNs
      − Preliminary findings
        − LGMs: high learning bias when the algorithm cannot identify the underlying group structure
        − RDNs: high inference variance when there is little information to seed the inference process
        − RMNs: high inference bias when the network is densely connected or tightly clustered
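The slides do not describe how the synthetic data were generated. As a rough illustration of the three knobs listed above, the sketch below builds a network from latent groups. The generator `synthetic_network` and all of its parameters are hypothetical and are not the authors' procedure.

```python
import random

def synthetic_network(n_groups, group_size, p_within, p_between,
                      label_agreement, seed=0):
    """Generate a toy network with latent group structure.

    group_size:      number of nodes per group
    p_within:        link probability for two nodes in the same group
    p_between:       link probability for two nodes in different groups
    label_agreement: probability that a node takes its group's label, which
                     controls how strongly labels correlate across links
    Returns (edges, labels, group_of).
    """
    rng = random.Random(seed)
    group_of, labels = {}, {}
    for g in range(n_groups):
        group_label = rng.choice([-1, 1])
        for k in range(group_size):
            node = g * group_size + k
            group_of[node] = g
            agrees = rng.random() < label_agreement
            labels[node] = group_label if agrees else -group_label
    nodes = sorted(group_of)
    edges = []
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            p = p_within if group_of[u] == group_of[v] else p_between
            if rng.random() < p:
                edges.append((u, v))
    return edges, labels, group_of

# Hypothetical setting: 5 groups of 10 nodes, dense within groups, sparse between.
edges, labels, group_of = synthetic_network(5, 10, p_within=0.4, p_between=0.01,
                                            label_agreement=0.9)
same_label = sum(labels[u] == labels[v] for u, v in edges)
print(len(edges), "links,", same_label, "join nodes with the same label")
```

Raising `label_agreement` and the ratio of `p_within` to `p_between` makes linked nodes more likely to share a label, while `group_size` controls how many neighbors carry that shared signal.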

  16. Feature selection increases RDN inference variance
      [Figure: inference variance]

  17. Modified inference decreases variance

  18. Improved performance on real data

  19. Conclusions
      − Framework can be used to explain mechanisms behind SRL model performance
        − Improves understanding of model behavior
        − Suggests algorithmic modifications to increase performance
      − Future work
        − Extend framework (e.g., loss functions, joint estimation)
        − Investigate interaction effects between learning and inference errors
        − Real data experiments to evaluate design choices

  20. Further information: jneville@cs.umass.edu, kdl.cs.umass.edu
