COMP 762: Paper Presentation (Technical Details)

  1. COMP 762: Paper Presentation (Technical Details). Alexander Nicholson. Paper: "Are Deep Neural Networks the Best Choice for Modeling Source Code?" Authors: Vincent J. Hellendoorn, Premkumar Devanbu

  2. Overview: ▪ Big question in the title ▪ Creates a robust baseline model (Important!) ▪ Optimizations for a specific, practical task

  3. Scope: Language-specific scoping? Static vs. dynamic

  4. Why online training/dynamism with an (R)NN is hard: e.g., 50,000 * 512 = 25,600,000 parameters between the first and second layers. Image From: Towards Deep Learning Software Repositories, White et al.
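
The arithmetic on this slide can be reproduced with a short sketch. The interpretation of the two numbers (a 50,000-token vocabulary feeding a 512-unit layer) and the helper name `dense_weight_count` are assumptions made for illustration; the slide gives only the product.

```python
# Assumed reading of the slide: 50,000 inputs (vocabulary size)
# fully connected to 512 units in the next layer.
vocab_size = 50_000
hidden_units = 512


def dense_weight_count(n_in, n_out):
    """Number of weights in a fully connected layer, ignoring bias terms."""
    return n_in * n_out


weights = dense_weight_count(vocab_size, hidden_units)
print(f"{weights:,}")  # 25,600,000

# Under this reading, each token added to the vocabulary adds one more
# row of weights that has to be trained.
extra_per_new_token = dense_weight_count(1, hidden_units)
print(extra_per_new_token)  # 512
```

Under this reading, a vocabulary that keeps growing as new code is seen would mean resizing and retraining this weight matrix, which is one way to see why the slide calls dynamic updates hard.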

  5. Smoothing (Discounting/Correction and Interpolation). Discounting/Correction: subtract a number from, or add a number to, the counts. Interpolation: “Add information from known distributions”; weight the additional distributions using λ. Resources: https://www.youtube.com/watch?v=FUS7XkhYBLo&list=PLBv09BD7ez_7Ke6U7yGBvfP4_Hau3ZGj2&index=5 https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf

  6. Laplace/Lidstone Correction: add and re-normalize. Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
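
"Add and re-normalize" can be sketched for unigram counts as follows. The function name `lidstone_probs`, the parameter `alpha`, and the toy tokens are hypothetical; `alpha=1.0` corresponds to Laplace (add-one) correction, and other positive values give the Lidstone variant.

```python
from collections import Counter


def lidstone_probs(tokens, vocab, alpha=1.0):
    """Add `alpha` to every count in `vocab`, then re-normalize."""
    counts = Counter(tokens)
    # Denominator: observed mass over the vocabulary plus the added mass.
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


# Toy data (assumed): "y" is in the vocabulary but never observed.
tokens = ["i", "i", "j", "x"]
vocab = {"i", "j", "x", "y"}
probs = lidstone_probs(tokens, vocab)
print(probs["i"], probs["y"])  # 0.375 0.125
```

The unseen token `y` receives a non-zero probability, and the distribution still sums to one because the added mass is included in the denominator.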

  7. Absolute Discounting: subtract and re-normalize. The paper’s modification uses three discount values. Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
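
"Subtract and re-normalize" can be illustrated with a minimal bigram sketch. It assumes a single fixed discount `d` and a maximum-likelihood unigram distribution as the lower-order model; it is not the paper's variant, which the slide says uses three discount values. All names and the toy tokens are hypothetical.

```python
from collections import Counter, defaultdict


def absolute_discount_bigram(tokens, d=0.75):
    """Return prob(cur, prev) using interpolated absolute discounting."""
    unigrams = Counter(tokens)
    total = len(tokens)
    followers = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        followers[prev][cur] += 1

    def prob(cur, prev):
        p_lower = unigrams[cur] / total  # lower-order (unigram) estimate
        ctx = followers.get(prev)
        if not ctx:
            return p_lower  # unseen context: fall back entirely
        ctx_total = sum(ctx.values())
        # Subtract d from every seen count...
        discounted = max(ctx[cur] - d, 0.0) / ctx_total
        # ...and hand the removed mass to the lower-order model.
        reserved = d * len(ctx) / ctx_total
        return discounted + reserved * p_lower

    return prob


tokens = "a b a c a b".split()
prob = absolute_discount_bigram(tokens)
print(prob("b", "a"), prob("a", "a"))
```

Here `a a` never occurs in the toy data, yet `prob("a", "a")` is non-zero because the discounted mass is redistributed through the unigram model.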

  8. Kneser-Ney Smoothing: the paper’s modification uses three discount values. Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
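
One ingredient commonly associated with Kneser-Ney smoothing is the continuation probability: a token is scored by how many distinct contexts it follows rather than by its raw frequency. The sketch below shows only that ingredient, not the full method and not the paper's modification; the function name and the toy sentence are assumptions.

```python
from collections import defaultdict


def continuation_probs(tokens):
    """Score each token by the number of distinct tokens that precede it."""
    preceders = defaultdict(set)
    for prev, cur in zip(tokens, tokens[1:]):
        preceders[cur].add(prev)
    # Total number of distinct bigram types seen in the data.
    bigram_types = sum(len(s) for s in preceders.values())
    return {w: len(s) / bigram_types for w, s in preceders.items()}


# Toy data (assumed): "francisco" is frequent but only ever follows "san".
tokens = ("san francisco san francisco san francisco "
          "new glasses old glasses").split()
p_cont = continuation_probs(tokens)
print(p_cont["francisco"], p_cont["glasses"])
```

Although `francisco` occurs more often than `glasses`, it gets the lower continuation score because it appears after only one distinct token.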

  9. Jelinek-Mercer Smoothing: the general interpolation is λ P(X) + (1 - λ) P(Z); in J-M, λ is a constant.
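
The interpolation on this slide is a weighted average of two probability estimates. A minimal sketch, assuming the weight is a single constant between 0 and 1 (called `lam` here, a hypothetical name) and using made-up probabilities:

```python
def interpolate(p_x, p_z, lam=0.5):
    """Mix two probability estimates with a constant weight `lam`."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_x + (1.0 - lam) * p_z


# Assumed example: a higher-order model has never seen the event (0.0),
# while a lower-order model gives it some probability (0.2).
print(interpolate(0.0, 0.2, lam=0.5))  # 0.1
```

Because the weight is constant, every context trusts the higher-order estimate to the same degree, regardless of how much data that context has.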

  10. Witten-Bell Smoothing Image From: Stanford Smoothing Tutorial - https://nlp.stanford.edu/~wcmac/papers/20050421-smoothing-tutorial.pdf
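
Witten-Bell smoothing is usually described as interpolation in which the weight depends on the context: the more distinct tokens have followed a context, the more weight goes to the lower-order model. The sketch below uses the common textbook formulation for bigrams, with a maximum-likelihood unigram model as the lower order; names and toy tokens are hypothetical.

```python
from collections import Counter, defaultdict


def witten_bell_bigram(tokens):
    """Return prob(cur, prev) with a context-dependent interpolation weight."""
    unigrams = Counter(tokens)
    total = len(tokens)
    followers = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        followers[prev][cur] += 1

    def prob(cur, prev):
        p_lower = unigrams[cur] / total
        ctx = followers.get(prev)
        if not ctx:
            return p_lower  # unseen context: use the lower-order model
        n = sum(ctx.values())   # tokens observed after `prev`
        t = len(ctx)            # distinct token types observed after `prev`
        lam = n / (n + t)       # weight on the bigram estimate
        return lam * ctx[cur] / n + (1.0 - lam) * p_lower

    return prob


tokens = "a b a c a b".split()
prob = witten_bell_bigram(tokens)
print(prob("b", "a"))
```

In the toy data the context `a` is followed by three tokens of two distinct types, so the bigram estimate gets weight 3 / (3 + 2) = 0.6.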

  11. Trie Data Structure. From Wikipedia: the children of a node share a common prefix.

  12. Trie Data Structure: each scope has its own trie.
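
A counting trie over token sequences, with one trie per scope, might look like the sketch below. The class names, the scope labels `"global"` and `"file"`, and the toy n-grams are assumptions for illustration, not the paper's implementation.

```python
class TrieNode:
    def __init__(self):
        self.count = 0       # how many stored sequences pass through here
        self.children = {}   # token -> TrieNode


class NGramTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, ngram):
        """Insert a token sequence, incrementing counts along its path."""
        node = self.root
        for token in ngram:
            node = node.children.setdefault(token, TrieNode())
            node.count += 1

    def count(self, prefix):
        """Number of stored sequences that start with `prefix`."""
        node = self.root
        for token in prefix:
            node = node.children.get(token)
            if node is None:
                return 0
        return node.count


# One trie per scope (scope names are hypothetical).
tries = {"global": NGramTrie(), "file": NGramTrie()}
tries["global"].add(("for", "i", "in"))
tries["global"].add(("for", "j", "in"))
tries["file"].add(("for", "i", "in"))

print(tries["global"].count(("for",)))      # 2
print(tries["global"].count(("for", "i")))  # 1
print(tries["file"].count(("for", "j")))    # 0
```

Sequences that share a prefix share the nodes for that prefix, so the counts for every prefix length are available from a single walk down the trie.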

  13. Zipf’s Law Image From: https://phys.org/news/2017-08-unzipping-zipf-law-solution-century-old.html

  14. Memoization: an optimization technique commonly used in dynamic programming; a cache of sorts that avoids recomputing the same result. http://cs.mcgill.ca/~jcheung/teaching/fall-2017/comp550/index.html
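
A small sketch of memoization, using Python's standard `functools.lru_cache` on a Fibonacci function. The example function and the `calls` counter are chosen purely for illustration and are unrelated to the paper's code.

```python
from functools import lru_cache

calls = 0  # counts how often the function body actually runs


@lru_cache(maxsize=None)
def fib(n):
    """Fibonacci number; each distinct n is computed only once."""
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))  # 832040
print(calls)    # 31: one evaluation per value of n from 0 to 30
```

Without the cache, the same recursion would evaluate the function body well over a million times for `n = 30`; with it, repeated subproblems are looked up instead of recomputed.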

  15. Dependency Models (Trees) https://en.wikibooks.org/wiki/LaTeX/Linguistics

  16. Dropout http://cs.mcgill.ca/~hvanho2/comp551/
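
Dropout randomly zeroes activations during training so that a network does not rely on any single unit. The sketch below shows the common "inverted" formulation on a plain list of floats; the function name, parameters, and toy values are assumptions, and real frameworks apply this to tensors.

```python
import random


def dropout(activations, p_drop=0.5, training=True, rng=random):
    """Zero each activation with probability p_drop and rescale the rest."""
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must lie in [0, 1)")
    if not training or p_drop == 0.0:
        return list(activations)  # no dropout at inference time
    keep = 1.0 - p_drop
    # Dividing kept values by `keep` preserves the expected activation.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded generator for repeatable runs
print(dropout([1.0, 2.0, 3.0, 4.0], p_drop=0.5, rng=rng))
print(dropout([1.0, 2.0, 3.0, 4.0], training=False))
```

At inference time the input is returned unchanged, which is why the kept values are rescaled during training.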

  17. Evaluation Terms. Mean Reciprocal Rank: explained well in Sec. 4.1. Two-tailed t-test: a statistical significance test; tests if the target is higher OR lower than the reference. Cohen’s d: effect size; used with the t-test.
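
Two of these terms can be sketched with the standard library alone. The code assumes the usual definitions: mean reciprocal rank averages 1/rank of the correct item in each ranked suggestion list (0 when it is absent), and Cohen's d divides the difference in means by the pooled standard deviation. Function names and sample data are hypothetical.

```python
import math
import statistics


def mean_reciprocal_rank(ranked_suggestions, targets):
    """Average of 1/rank of each target in its suggestion list (0 if absent)."""
    total = 0.0
    for suggestions, target in zip(ranked_suggestions, targets):
        if target in suggestions:
            total += 1.0 / (suggestions.index(target) + 1)
    return total / len(targets)


def cohens_d(sample_a, sample_b):
    """Difference in means divided by the pooled sample standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd


# Targets found at rank 1, at rank 2, and not at all.
mrr = mean_reciprocal_rank([["a", "b"], ["x", "y"], ["p", "q"]],
                           ["a", "y", "z"])
print(mrr)                              # 0.5
print(cohens_d([2, 4, 6], [1, 3, 5]))   # 0.5
```

The t-test reports whether a difference is statistically significant, while Cohen's d indicates how large that difference is relative to the spread of the data.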

  18. Thanks!
